As demand for English-language publication grows across STEM fields, researchers who use English as a second language face persistent writing challenges: limited mastery of domain-specific terminology, a tendency toward literal translation, and problems with sentence length and errors that stem from the compact rhetorical habits of their native language. Recent advances in artificial intelligence and data-driven methods offer new opportunities for language instruction and writing support. This study reviews traditional corpus-building approaches and recent data-driven AI techniques, and then presents a focused case discussion in marine engineering conducted in collaboration with domain researchers. By comparing corpus resources, feature engineering strategies, and modeling approaches, we evaluate how well data-driven instructional models transfer to the task of improving publication readiness among non-native engineering researchers. The discussion highlights evidence-based revision strategies, targeted pathways for disciplinary vocabulary acquisition, and practical ways to combine automated feedback with expert human guidance to improve manuscript quality. Our review indicates that data-driven AI tools hold promise for supporting terminology acquisition, identifying common error patterns, and generating actionable revision suggestions, but that their effectiveness depends on high-quality, domain-specific corpora and close collaboration with subject-matter experts. We conclude with recommendations for future research and pedagogical practice to scale evidence-based English for Research Publication Purposes (ERPP) support across STEM disciplines.
Teaching English for Research Publication Purposes (ERPP) based on discipline-specific corpus and Artificial Intelligent Method: A Case Study of Ocean Engineering
Published: 10 June 2026 by MDPI in The 1st International Online Conference on Education Sciences, session STEM Education
Keywords: STEM; English as a second language; Data-driven AI techniques; ERPP (English for Research Publication Purposes); Corpus-building approaches
