To label the topics obtained from topic modelling, we visualized the top words in each topic with a word cloud (Figure 1), compared words across topics with a termite plot (Figure 2), and used pyLDAvis plots (Figure 3) to gauge how unique each word was within the corpus. We then read the abstracts of the top 10 articles in each topic (up to 50 where needed) to clarify the topic's content. Together, these approaches gave a coherent picture of each topic, which we used for the final labels (see the post Insights from topic modelling of MDR-TB).
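As a rough illustration of the two lookups behind this workflow (top words per topic, and the top-ranked articles per topic), here is a minimal sketch in plain Python. The matrices, vocabulary, and function names are hypothetical toy values, not the data or code used in the study; in practice the weights would come from the fitted topic model.

```python
def top_words(topic_word, vocab, n=10):
    """Return the n highest-weighted words for each topic."""
    result = []
    for weights in topic_word:
        # Rank vocabulary indices by this topic's word weight, highest first.
        ranked = sorted(range(len(vocab)), key=lambda i: weights[i], reverse=True)
        result.append([vocab[i] for i in ranked[:n]])
    return result


def top_documents(doc_topic, topic, n=10):
    """Return indices of the n documents with the highest weight on one topic."""
    ranked = sorted(range(len(doc_topic)), key=lambda d: doc_topic[d][topic], reverse=True)
    return ranked[:n]


# Hypothetical toy data: 2 topics, 4 words, 3 documents.
vocab = ["resistance", "treatment", "diagnosis", "cost"]
topic_word = [
    [0.6, 0.2, 0.1, 0.1],  # topic 0
    [0.1, 0.5, 0.1, 0.3],  # topic 1
]
doc_topic = [
    [0.9, 0.1],  # document 0
    [0.2, 0.8],  # document 1
    [0.6, 0.4],  # document 2
]

words = top_words(topic_word, vocab, n=2)
docs = top_documents(doc_topic, topic=0, n=2)
print(words)
print(docs)
```

Raising `n` in `top_documents` from 10 to 50 corresponds to reading further down the list of abstracts when the first 10 do not make the topic clear.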

Figure 1: Word cloud of top words within a topic
Figure 2: Visualization of top words across topics using a termite plot
Figure 3: pyLDAvis plots used to determine the uniqueness of words within the corpus. Circle numbers denote the ranking of topics by size, not the topic numbers themselves. For example, circle 1 refers to topic 5, the largest topic in the study.
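
The mapping between circle numbers and topic numbers can be recovered by ranking topics by size. The sketch below assumes that a topic's size is simply the sum of its weights across documents; pyLDAvis may weight this differently (for example, by document length), so treat it as an approximation. The data and the `circle_to_topic` helper are hypothetical, and topic IDs here are zero-based.

```python
def circle_to_topic(doc_topic):
    """Map circle number (1 = largest) to topic ID, ranking topics by total weight."""
    n_topics = len(doc_topic[0])
    # Assumed size measure: sum of each topic's weight over all documents.
    sizes = [sum(row[t] for row in doc_topic) for t in range(n_topics)]
    order = sorted(range(n_topics), key=lambda t: sizes[t], reverse=True)
    return {rank + 1: topic for rank, topic in enumerate(order)}


# Hypothetical document-topic weights: 3 documents, 3 topics.
doc_topic = [
    [0.1, 0.2, 0.7],
    [0.2, 0.2, 0.6],
    [0.5, 0.1, 0.4],
]

mapping = circle_to_topic(doc_topic)
print(mapping)
```

In this toy example the topic with ID 2 has the largest total weight, so it would appear as circle 1.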