Affiliation networks can provide useful insights for strategic positioning of an organization, whether it is a university, an NGO or a corporate body. In this case, we have analysed the WHO affiliation network for MDR-TB research as an example to demonstrate the potential of affiliation analysis.
Using the same set of data originally used for topic modelling, an affiliation network of all institutions in the downloaded database was constructed. We then focused the analysis on the network connected to the World Health Organization (WHO).
Looking at the WHO network, there appeared to be two overlapping but distinct networks associated with the organization. One is centred on the node called ‘WHO’, which we will call WHO1 for this analysis (Figure 1), while the other is centred on the node called ‘World Hlth Org’, which we will call WHO2 (Figure 2). These organizational names are provided by authors during publication. The metadata for both nodes give the location as Geneva, Switzerland, which confirmed that the two nodes refer to the same organization.
At first glance it was assumed that there was simply an error in standardizing organizational nomenclature when publishing articles. We have noticed similar issues with MSF/Medecins Sans Frontiers, FIND/Found Innovat New Diagnost and Int Union/Int Union TB and Lung Dis. One option would be to merge these nodes belonging to the same organization together. However, it is also possible that within large organizations, there are distinct groups that work on different topics or with a different focus, and these groups may subsequently create separate collaboration networks. Thus, we decided to maintain the separation between the two different WHO nodes.
The first main difference that was noted was that the two different WHO groups seem to be focused on slightly different topics, as determined via topic modelling. In the visual below, the nodes are split into two halves to depict two properties (Figure 3). The left colour describes the main topic associated with the node, and the right colour describes the topic most active since 2015. The larger node, ‘WHO’ (WHO1) appears blue in colour throughout, which suggests that it has consistently been focused on the blue topic, which in this case is Operational research/Public health. The smaller node ‘World Hlth Org’ (WHO2) is split. This group was predominantly focused on the red topic over the years, which happens to be the topic of Treatment optimization. However, since 2015 up to the end of 2017 when the database was assembled, this group has been more active in the blue topic, Operational/Public health research, suggesting a change in emphasis.
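As an illustration of how the two halves of a node could be derived, the sketch below picks the most frequent topic over all years and the most frequent topic from 2015 onwards. It is a minimal sketch under stated assumptions, not the tool used for this analysis: the function name `dominant_topics`, the (year, topic) record format and the toy records are all hypothetical, and "since 2015" is assumed to include 2015 itself.

```python
from collections import Counter


def dominant_topics(records, since_year=2015):
    """Return (main topic over all years, most frequent topic from since_year on).

    `records` is a list of (publication_year, topic_label) pairs for one node.
    Assumption: each article carries exactly one topic label.
    """
    overall = Counter(topic for _, topic in records)
    recent = Counter(topic for year, topic in records if year >= since_year)
    main = overall.most_common(1)[0][0] if overall else None
    latest = recent.most_common(1)[0][0] if recent else None
    return main, latest


# Toy records invented purely for illustration; they are NOT the WHO data.
toy_records = [
    (2009, "Treatment optimization"),
    (2011, "Treatment optimization"),
    (2012, "Treatment optimization"),
    (2013, "Treatment optimization"),
    (2015, "Public health"),
    (2016, "Public health"),
    (2017, "Treatment optimization"),
    (2017, "Public health"),
]

main_topic, recent_topic = dominant_topics(toy_records)
print(main_topic, "|", recent_topic)
```

A node whose two values differ, as in this toy case, would be drawn with two different colours, which is the pattern described above for WHO2.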
To find out whether these two WHO groups are really different, additional investigations were made. The first question one could ask is whether the two groups overlap significantly in their collaborations.
Of the 140 nodes in total across the two WHO groups, 23 are shared. The institutions that both groups collaborate with can be seen in the visualization below. In this case the nodes are coloured according to the main topic they have focused on over the years. We can see that there are overlaps with major organizations such as the Centers for Disease Control and Prevention (CDC) as well as Harvard University.
Besides these overlaps, have these two WHO groups developed their own distinct collaboration networks? To view this, the overlapping nodes are removed from the network, and all that remains are the exclusive nodes for either WHO1 (Figure 5) or WHO2 (Figure 6).
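The overlap and the exclusive networks can be thought of as simple set operations on the collaborators of each node. The sketch below assumes that each WHO network consists of the direct collaborators of its central node, which may not be exactly how the networks in the figures were built; the function name `split_collaborators` and the toy edge list are hypothetical.

```python
def split_collaborators(edges, hub_a, hub_b):
    """Split the direct collaborators of two hubs into shared and exclusive sets.

    `edges` is an iterable of (institution, institution) pairs from an
    undirected affiliation network.
    Returns (shared, only_a, only_b).
    """
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    # The hubs themselves are left out of each other's collaborator sets.
    neighbours_a = adjacency.get(hub_a, set()) - {hub_b}
    neighbours_b = adjacency.get(hub_b, set()) - {hub_a}
    shared = neighbours_a & neighbours_b
    return shared, neighbours_a - shared, neighbours_b - shared


# Toy edge list with placeholder institution names (not the real data).
toy_edges = [
    ("WHO1", "Shared A"), ("WHO2", "Shared A"),
    ("WHO1", "Shared B"), ("WHO2", "Shared B"),
    ("WHO1", "Only1 A"),
    ("WHO2", "Only2 A"),
    ("Only1 A", "Shared A"),
]

shared, only_who1, only_who2 = split_collaborators(toy_edges, "WHO1", "WHO2")
print(sorted(shared), sorted(only_who1), sorted(only_who2))
```

Removing the `shared` set from each network leaves the exclusive collaborators of either hub.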
In the WHO1 network, we can now observe connections to other, smaller WHO nodes. A closer look at these other WHO nodes shows that they are based around the world, in cities such as Cairo and Copenhagen. Interestingly, these other nodes also call themselves ‘WHO’ and not ‘World Hlth Org’, as WHO2 does.
The visuals clearly show that WHO1 and WHO2 have formed distinctly different collaborations. This can also be visualized in the map view below (Figures 7 & 8); note that the overlapping institutions described earlier, such as Harvard and the CDC, have been removed.
There are potentially two main groups within the WHO that work on MDR-TB, with an emphasis on different topics. WHO1 has consistently focused on publications relating to Public health, whereas WHO2 has mainly focused on Treatment optimization, although since 2015 it has also published articles relating to Public health.
The algorithms also identified that the two WHO groups have slightly different publication histories. The larger group, WHO1, has published a total of 109 articles, with the first in the database appearing in 2007 and the last in 2017 (note that 2007 is the earliest date available from Web of Science, and therefore does not signify the first article ever published in this field by the WHO). The group WHO2 has published 43 articles, with the first in 2008 and the last in 2017.
Both groups actively collaborate with some of the same main players in the field, such as Harvard University, the CDC, Imperial College, Chinese CDC, McGill University, London School of Hygiene and Tropical Medicine and the International Union against TB and Lung Disease.
Additionally, WHO1 and WHO2 have formed distinct networks that cover different geographical regions. For example, WHO1 collaborates with groups in different parts of South Africa, such as Cape Town and KwaZulu-Natal, whereas WHO2 collaborates with a group in Johannesburg. WHO1’s reach extends all the way to Australia, whereas WHO2 has formed collaborations in Peru instead.
Together, these two WHO groups form a central focus within the MDR-TB research network as can be seen in the merged network of the two groups below. Just by looking at the colours, we can see that much of the WHO collaboration network is red (topic Treatment optimization), with the remaining nodes spread between blue (Public health/Operational research), green (Drug-related research) and yellow (Diagnostics).
In this analysis, the nodes are sized according to the PageRank algorithm, where the size of a node depends on the number of incoming links as well as on the importance of the nodes those links come from. We can see that Harvard and the CDC have the highest PageRank values (in the whole MDR-TB network), suggesting that this part of the network is highly influential (Figure 9). If we look through a list of the institutions with the highest PageRank values in the whole network, we find that WHO is connected to most of them, suggesting that the institutions WHO connects to are important centres for MDR-TB research, even if WHO itself may not be the most influential in terms of research output.
Because WHO is mainly involved in policy-making, it is likely to rank lower on PageRank than institutions whose main output is research, which the algorithm tends to favour.
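To make the idea behind PageRank more concrete, the sketch below computes it by simple power iteration on a toy network. It is a minimal illustration under assumptions, not the implementation used for the figures: the damping factor of 0.85 is the commonly used default rather than a value taken from this analysis, each undirected collaboration is assumed to count as a link in both directions, and the node names are placeholders.

```python
def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a dict {node: [nodes it links to]}.

    A node's score grows with the number of incoming links and with the
    scores of the nodes those links come from.
    """
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Score held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[u] for u in nodes if not out_links.get(u))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for u in nodes:
            targets = out_links.get(u, [])
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
        change = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank


# Toy undirected network, written as links in both directions:
# H collaborates with A, B and C; A and B also collaborate with each other.
toy_links = {
    "H": ["A", "B", "C"],
    "A": ["H", "B"],
    "B": ["H", "A"],
    "C": ["H"],
}

scores = pagerank(toy_links)
top_node = max(scores, key=scores.get)
print(top_node, round(sum(scores.values()), 6))
```

In this toy case the best-connected node H receives the highest score, while C, whose only link is to H, still scores above the minimum because that single link comes from an important node.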
Another algorithm, based on a different set of measurements, can then be used to determine the influence of a node over a network. This algorithm measures Betweenness centrality, which indicates whether a node connects diverse parts of a network by counting the number of times it lies on the shortest path between two other nodes. Nodes with a high Betweenness value are therefore influential in terms of their role in connecting different players in the network.
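The sketch below shows one standard way of computing Betweenness centrality for an unweighted, undirected network, following Brandes' algorithm. It is an illustrative sketch only: whether the analysis used weighted links or normalized values is not stated here, so the function returns raw, unnormalized counts, and the toy network with its node names is hypothetical.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Unnormalized betweenness for an unweighted, undirected graph.

    `adjacency` maps every node to the set of its neighbours; every
    neighbour must itself appear as a key.
    """
    scores = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search counting shortest paths from `source`.
        order = []
        predecessors = {v: [] for v in adjacency}
        n_paths = {v: 0 for v in adjacency}
        distance = {v: -1 for v in adjacency}
        n_paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        # Accumulate path dependencies, farthest nodes first.
        dependency = {v: 0.0 for v in adjacency}
        while order:
            w = order.pop()
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1.0 + dependency[w])
            if w != source:
                scores[w] += dependency[w]
    # Each undirected pair was counted once from either end.
    return {v: s / 2.0 for v, s in scores.items()}


# Toy network: two triangles (A, B, C) and (E, F, G) joined only through D.
toy_graph = {
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D", "F", "G"}, "F": {"E", "G"}, "G": {"E", "F"},
}

centrality = betweenness_centrality(toy_graph)
print(sorted(centrality.items(), key=lambda item: -item[1])[:3])
```

Here D has only two links, yet it lies on every shortest path between the two triangles, so it receives the highest Betweenness value. This is the sense in which a node can be central to connecting a network without having many connections of its own.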
Using this algorithm, we see a slightly different picture emerging. We can observe that the International Union against TB and Lung Disease is now the most prominent node (Figure 10). WHO1 also appears to be a central player, together with UCL and the CDC. However, Harvard University is no longer as prominent as before, suggesting that the university is not central in connecting institutions within the network.
Identifying other influential nodes in the WHO network is not the only way to analyse this data. Another question one may ask is where the WHO network does not reach.
If we look at the network above in Figure 10, we can see that there are many nodes in grey that are not covered by the WHO network. In fact, the WHO1 and WHO2 networks combined form only 140 nodes in the network, out of a total of 1305 nodes. To visualize the remaining network, we can create an inverse visualization, where we now see the remaining 1165 nodes (Figure 11).
Of these, the top institutions based on PageRank are Stellenbosch University, Johns Hopkins University, Fudan University, Karolinska Institute and Capital Medical University. Four of these five institutions are coloured green and are mainly focused on the topic Drug-related research; the exception is Karolinska Institute, which is coloured yellow (Diagnostics). If we include the whole network of 1305 nodes, we find that Stellenbosch University and Johns Hopkins University are ranked 3rd and 4th, after the CDC and Harvard University. This means that these two universities are key players in the field but are not connected directly to the WHO network.
Perhaps it is not too surprising that there is a lack of collaboration between WHO and these universities, given the focus of WHO on the topics of Public health and Treatment optimization rather than on Drug-related research.
Nonetheless, these visualizations allow us to see and analyse clearly how organizations like WHO collaborate on a given subject such as MDR-TB, and to identify gaps that may signal potential areas for improvement, depending on the strategic needs of the organization.