Published

This submission belongs to the session S1. Technology Enhanced Education of the event The 1st International Online Conference on Education Sciences

Published date

10 Jun, 2026

Academic Editor

Mike Joy

Citation

Elmira Almukhambetova, Murat Almukhambetov, ChatGPT as a Tool for Generating Clinical Simulation Cases in Emergency Medicine Education: A Validity-Based Comparative Study, in Proceedings of The 1st International Online Conference on Education Sciences, 15 June–17 June 2026, MDPI: Basel, Switzerland

ChatGPT as a Tool for Generating Clinical Simulation Cases in Emergency Medicine Education: A Validity-Based Comparative Study

Elmira Almukhambetova ¹

Murat Almukhambetov ¹

1. Department of Emergency and Urgent Medical Care, S.D. Asfendiyarov Kazakh National Medical University, Almaty, Kazakhstan

Abstract

Artificial intelligence (AI) is increasingly integrated into higher education, offering scalable tools for developing interactive learning resources. In medical education, large language models such as ChatGPT show potential for generating clinical simulation cases; however, their educational validity and clinical reliability remain insufficiently examined. Ensuring the accuracy and pedagogical quality of AI‑generated materials is essential before integrating them into formal training programs.

Methods

The study involved one prompt author, two authors of the expert (human‑written) clinical scenarios, and five independent expert reviewers. A comparative validity study was conducted using 50 multiple‑choice clinical scenarios (MCQs) in emergency medicine: 25 developed by experienced instructors and 25 generated using ChatGPT (GPT‑5 mini). The scenarios covered five core emergency topics: cardiac arrest, shock, trauma and accidents, acute coronary syndrome, and acute respiratory failure. The five reviewers evaluated all cases against ten predefined content validity criteria: clinical accuracy, completeness, structural clarity, realism, educational value, error‑free presentation, applicability, coherence between scenario and question, uniqueness, and distractor homogeneity. Quantitative assessment included the Item Content Validity Index (I‑CVI) and Aiken’s V coefficient.
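For readers unfamiliar with these indices, the following is a minimal sketch of how I‑CVI, its scale-level average (S‑CVI/Ave), and Aiken’s V are conventionally computed from expert ratings. It is not the authors' analysis code. The 4‑point rating scale, the "3 or higher counts as relevant" cut-off, the function names, and the example ratings are all assumptions made for illustration; the abstract does not state the scale that was used.

```python
from statistics import mean

# Assumption: each expert rates an item on a 1-4 scale (not stated in the abstract).

def i_cvi(ratings, relevant_min=3):
    """I-CVI: share of experts who rated the item at or above relevant_min."""
    return sum(r >= relevant_min for r in ratings) / len(ratings)

def s_cvi_ave(items, relevant_min=3):
    """S-CVI/Ave: mean of the I-CVI values across all items."""
    return mean(i_cvi(ratings, relevant_min) for ratings in items)

def aikens_v(ratings, lo=1, hi=4):
    """Aiken's V: summed distance above the lowest category, scaled to [0, 1]."""
    n = len(ratings)
    return sum(r - lo for r in ratings) / (n * (hi - lo))

# Hypothetical ratings from five reviewers for two items (illustrative only).
example_items = [
    [4, 4, 4, 3, 4],
    [3, 2, 4, 2, 3],
]

for ratings in example_items:
    print(f"{i_cvi(ratings):.3f} {aikens_v(ratings):.3f}")
# prints "1.000 0.933" then "0.600 0.600"

print(f"{s_cvi_ave(example_items):.3f}")  # prints "0.800"
```

Under these assumptions, I‑CVI only counts how many reviewers cleared the relevance threshold, while Aiken’s V also reflects how far each rating sits above the lowest category, so the two indices can diverge for the same item.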

Results

Instructor‑developed cases demonstrated significantly higher overall quality than AI‑generated cases (3.8 ± 0.13 vs. 3.0 ± 0.59; p < 0.001). Instructor‑developed cases showed excellent content validity (I‑CVI = 0.984; S‑CVI/Ave = 0.99), whereas AI‑generated cases demonstrated substantially lower validity (I‑CVI = 0.496; S‑CVI/Ave = 0.50). Aiken’s V indicated very high expert agreement for instructor‑developed cases (0.936) and moderate agreement for AI‑generated cases (0.671). Common issues in AI‑generated cases included insufficient clinical detail, heterogeneous distractors, and occasional logical inconsistencies.

Conclusion

ChatGPT can serve as an efficient support tool for the rapid generation of emergency medicine simulation cases. However, expert review and pedagogical refinement remain essential to ensure clinical accuracy and educational quality. AI‑generated content should complement, rather than replace, expert‑designed instructional materials.

Keywords

artificial intelligence

ChatGPT

emergency medicine

medical education

clinical simulation

content validity

assessment design

Poster

постер тесты.pdf
