Comparing keywords associated with obesity before and after the COVID-19 pandemic using topic modeling: Analyzing the news big data of South Korea
1  Dept. of Statistics, Inje University Graduate School
2  Dept. of Medical Big Data, College of AI Convergence, Inje University
Academic Editor: David Nieman


The World Health Organization (WHO) declared COVID-19 a pandemic on March 11, 2020. As the disease spread, lockdowns were declared all over the world, including the United States, Europe, Asia, Africa, and South Korea, and daily life changed rapidly through measures such as "social distancing". From the perspective of nutritional science, a representative change is the rapid increase in food delivery after the outbreak of COVID-19. The "COVID-19 Impact Report in KOREA", published in April 2020, reported that food delivery accounted for 52% of all meals after the onset of the pandemic, up from 32% before it. Changes in nutritional intake, decreased physical activity, and increased depression and stress associated with the pandemic and the spread of food delivery culture have contributed substantially to weight gain. In particular, the Korea National Health and Nutrition Examination Survey (2020) confirmed that the prevalence of obesity among adults (≥19 years old) was 31.4% in 2011, 33.8% in 2019, and 38.3% in 2020, indicating a rapid increase after the outbreak of COVID-19. Identifying differences in the potential factors for obesity before and after the pandemic is a critical issue in health science, particularly because many previous studies have shown that obesity increases the risk of COVID-19 infection and that, once infected, people with obesity experience greater severity and higher mortality than people of normal weight or underweight. This study crawled South Korean news media using "obesity" as the search keyword and used topic modeling to analyze obesity-related keywords before the COVID-19 pandemic (February 28, 2019 to March 10, 2020) and after its declaration (March 11, 2020 to December 31, 2021). The procedure of this study is as follows. First, main topics were extracted using Latent Dirichlet Allocation (LDA). Then, word networks among the topic words were identified through network analysis based on co-occurring words and high-frequency words. Finally, the ratio of positive to negative words was determined using sentiment analysis. The text-mining results will serve as fundamental data for effectively managing the problems associated with obesity and nutritional imbalance in the post-COVID-19 era.
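As a rough illustration of the second and third steps of the procedure (the LDA step is not shown), the sketch below counts document-level word co-occurrences, which could serve as edge weights in a word network, and tallies positive and negative words against a small lexicon. It is a minimal sketch under stated assumptions and not the study's implementation: the toy documents, the `POSITIVE` and `NEGATIVE` word sets, and the function names `cooccurrence_edges` and `sentiment_counts` are all hypothetical, and the abstract does not say which tools or lexicon were actually used.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, already-tokenized toy documents (not data from the study).
docs = [
    ["obesity", "delivery", "food", "stress"],
    ["obesity", "exercise", "healthy", "diet"],
    ["delivery", "food", "obesity", "risk"],
]

# Hypothetical sentiment lexicon, for illustration only.
POSITIVE = {"healthy", "exercise"}
NEGATIVE = {"stress", "risk"}


def cooccurrence_edges(documents):
    """Count, for each unordered word pair, the number of documents containing both."""
    edges = Counter()
    for tokens in documents:
        # Deduplicate and sort so each pair has one canonical (a, b) form.
        for a, b in combinations(sorted(set(tokens)), 2):
            edges[(a, b)] += 1
    return edges


def sentiment_counts(documents, positive, negative):
    """Return (positive word count, negative word count) over all tokens."""
    tokens = [t for doc in documents for t in doc]
    pos = sum(t in positive for t in tokens)
    neg = sum(t in negative for t in tokens)
    return pos, neg


edges = cooccurrence_edges(docs)
pos, neg = sentiment_counts(docs, POSITIVE, NEGATIVE)

# Strongest pairs would become the heaviest edges of the word network.
print(edges.most_common(3))
print("positive:", pos, "negative:", neg)
```

Running the same two functions separately on the pre-pandemic and post-pandemic corpora would give comparable edge weights and positive-to-negative ratios for the two periods.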

Keywords: COVID-19 pandemic; obesity; topic modeling; text mining