Automatic Short-Answer Grading (ASAG) has become an increasingly relevant research area within technology-enhanced STEM education, where short open-ended responses are frequently used to evaluate students’ conceptual understanding. However, authentic classroom datasets often present low-resource characteristics, including small sample sizes, lexical sparsity, and class imbalance, which pose significant challenges for reliable model evaluation, didactic usability, and reproducibility.
This work presents a reproducible and didactically oriented machine learning pipeline designed to support the evaluation of short open-ended student responses under low-resource educational conditions. Rather than proposing novel algorithms, the study emphasizes methodological transparency by integrating three established linear classifiers (Logistic Regression, Multinomial Naïve Bayes, and Linear Support Vector Machines) within a unified and interpretable evaluation framework. Textual responses are represented using TF–IDF features, and model performance is assessed through adaptive stratified cross-validation to obtain robust accuracy estimates and limit information leakage between training and test folds.
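The evaluation loop described above can be illustrated with a minimal scikit-learn sketch. It is not the authors' implementation. It assumes that "adaptive" means capping the number of folds at the size of the smallest class, and that leakage is limited by fitting TF–IDF inside each training fold through a pipeline. The helper names (`adaptive_n_splits`, `evaluate`), the default hyperparameters, and the toy responses are all hypothetical.

```python
from collections import Counter

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def adaptive_n_splits(labels, max_splits=5):
    """Assumed rule: use at most `max_splits` folds, but never more
    folds than there are samples in the smallest class."""
    smallest = min(Counter(labels).values())
    if smallest < 2:
        raise ValueError("each class needs at least 2 samples for stratified CV")
    return min(max_splits, smallest)


def evaluate(texts, labels, max_splits=5, seed=0):
    """Return the chosen fold count and the mean CV accuracy per classifier."""
    n_splits = adaptive_n_splits(labels, max_splits)
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "multinomial_nb": MultinomialNB(),
        "linear_svm": LinearSVC(),
    }
    scores = {}
    for name, clf in models.items():
        # The vectorizer sits inside the pipeline, so it is re-fitted on the
        # training part of every fold and never sees the held-out responses.
        pipe = make_pipeline(TfidfVectorizer(), clf)
        fold_scores = cross_val_score(pipe, texts, labels, cv=cv, scoring="accuracy")
        scores[name] = float(fold_scores.mean())
    return n_splits, scores


# Hypothetical toy responses (5 "correct", 3 "incorrect"), for illustration only.
texts = [
    "the slope is the rate of change",
    "slope measures the rate of change between two points",
    "rate of change of y with respect to x",
    "the slope tells how fast y changes when x changes",
    "change in y divided by change in x",
    "the slope is the height of the line",
    "slope is where the line crosses the axis",
    "it is the highest point of the graph",
]
labels = ["correct"] * 5 + ["incorrect"] * 3

n_splits, scores = evaluate(texts, labels)
```

With this toy set the smallest class has three responses, so the sketch falls back from five folds to three. The accuracies themselves carry no meaning on eight sentences; the point is only the structure of the evaluation.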
The pipeline is evaluated across multiple concept-specific datasets derived from undergraduate teacher education contexts in mathematics and science. Results demonstrate stable performance across classifiers and conceptual domains, supporting the viability of interpretable linear models for small-scale classroom datasets. Additionally, the framework enables token-level inspection of discriminative lexical features, facilitating formative didactic feedback and supporting educators in monitoring students’ conceptual development.
By prioritizing reproducibility, interpretability, and didactic applicability, the proposed framework provides a transparent methodological reference for applied ASAG research. Furthermore, the pipeline establishes a foundation for future studies exploring automated feedback mechanisms and classroom-oriented learning analytics in STEM education.
