Distributional asymmetries in online nutrition discourse across social determinants of health domains on Reddit
1  Computer Simulation, Genomics and Data Analysis Laboratory, Department of Food Science and Nutrition, University of the Aegean, Metropolite Ioakeim 2, 81400 Myrina, Lemnos, Greece
Academic Editor: Pierre Desrochers

Abstract:

Introduction: Online discussions about nutrition unfold across heterogeneous digital communities that extend beyond explicitly health-focused forums and spaces. Yet, it remains unclear how nutrition-related discourse is distributed across broader social domains and whether its thematic embedding reflects the Social Determinants of Health (SDH) framework. To address this question, we systematically mapped Reddit communities using a taxonomy derived from Wikipedia’s curated content categories.

Methods: Approximately 60,000 subreddits were classified into thematic domains derived from Wikipedia’s curated content taxonomy, using large language models (LLMs) with enforced structured outputs. Nutrition-related subreddits were then identified and aligned with SDH domains using an LLM-assisted approach followed by manual verification. To examine how nutrition discourse is distributed across social contexts at the community level, we analyzed the data from two complementary perspectives: an unweighted approach capturing community diversity and an activity-weighted approach capturing discourse intensity.
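The difference between the two perspectives can be illustrated with a minimal sketch. It assumes each community carries a single SDH domain label and a numeric activity count. The abstract does not specify the activity measure, so the counts, the subreddit names, and the function name `domain_shares` below are hypothetical and serve only as illustration.

```python
from collections import defaultdict

# Hypothetical toy records: (subreddit, SDH domain, activity count).
# These are NOT data from the study; the activity measure is assumed.
communities = [
    ("r/example_a", "health care access and quality", 10),
    ("r/example_b", "health care access and quality", 10),
    ("r/example_c", "health care access and quality", 10),
    ("r/example_d", "lifestyle and individual behavior", 70),
]


def domain_shares(records, weighted=False):
    """Return each domain's share of communities (unweighted)
    or of total activity (weighted)."""
    totals = defaultdict(float)
    for _name, domain, activity in records:
        # Unweighted: every community counts once.
        # Weighted: every community counts by its activity.
        totals[domain] += activity if weighted else 1
    grand_total = sum(totals.values())
    return {domain: value / grand_total for domain, value in totals.items()}


unweighted = domain_shares(communities)
weighted = domain_shares(communities, weighted=True)
```

In this toy example the health care domain holds most communities, while the single lifestyle community holds most of the activity, so the two perspectives rank the domains differently.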

Results: In the unweighted analysis, 65% of nutrition-related subreddits were classified under health care access and quality, while 35% were distributed across non-health SDH domains, including lifestyle and individual behavior (24%), economic stability (6%), social and community context (3%), and education access and quality (1.5%). However, the activity-weighted analysis revealed a clear distributional asymmetry: although health-oriented communities are more numerous, discourse intensity is disproportionately concentrated in lifestyle-centered spaces. Specifically, while lifestyle- and individual behavior-focused communities represent only 24% of nutrition-related subreddits, they generate 40% of total nutrition-related activity, compared with 47% generated by health care-focused communities.
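One way to make the asymmetry concrete is to divide each domain's share of activity by its share of communities. The ratio below is an illustrative summary computed from the percentages reported above, not a statistic reported by the study; the symbol R is introduced here only for illustration.

```latex
R_{\text{lifestyle}} = \frac{0.40}{0.24} \approx 1.67,
\qquad
R_{\text{health care}} = \frac{0.47}{0.65} \approx 0.72
```

A value above 1 indicates that a domain generates more activity than its share of communities would suggest, and a value below 1 indicates less.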

Conclusions: This study reveals a clear distributional asymmetry in online nutrition discourse. Our findings demonstrate that digital nutrition communication is not confined to formal health contexts but is embedded within broader social domains central to the Social Determinants of Health framework. Methodologically, the study illustrates the value of LLM-assisted taxonomy mapping for large-scale, domain-level analysis in computational social science.

Keywords: nutrition discourse, Social Determinants of Health (SDH), large language models (LLMs), Reddit, computational social science, thematic classification