Purpose: This paper evaluates the effectiveness of the M-score and F-score in detecting financial statement fraud in the Chinese market and develops machine learning models tailored to detecting such fraud.
Design/Methodology/Approach: We draw fraud cases from the CSMAR database for the period 2010-2019 and match fraudulent enterprises with non-fraudulent enterprises by random sampling within industry. Based on this sample, we first test the effectiveness of the M-score and F-score in detecting financial fraud among Chinese listed companies. Next, we construct four machine learning models (Random Forest, Gradient Boosting Decision Tree (GBDT), K-Nearest Neighbor (KNN) and Support Vector Machine (SVM)) using the constituent variables of the F-score and M-score, along with an additional loss indicator. We then compare the performance of these models in detecting financial fraud.
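As an illustration of the industry-based random matching described above, the following is a minimal sketch rather than the procedure actually used in this study. It assumes one-to-one matching without replacement and matching on industry alone; the function name `match_by_industry`, the firm identifiers and the industry labels are all hypothetical.

```python
import random
from collections import defaultdict

def match_by_industry(fraud_firms, clean_firms, seed=0):
    """Pair each fraudulent firm with a randomly drawn non-fraudulent
    firm from the same industry (assumed: 1:1, without replacement)."""
    rng = random.Random(seed)

    # Group the non-fraudulent candidates by industry.
    pool = defaultdict(list)
    for firm_id, industry in clean_firms:
        pool[industry].append(firm_id)

    # Shuffle each industry pool so that popping yields a random draw.
    for candidates in pool.values():
        rng.shuffle(candidates)

    pairs = []
    for firm_id, industry in fraud_firms:
        if pool[industry]:  # skip fraud firms with no available match
            pairs.append((firm_id, pool[industry].pop()))
    return pairs

# Hypothetical toy data: (firm identifier, industry label).
fraud = [("F1", "Manufacturing"), ("F2", "Retail"), ("F3", "Mining")]
clean = [("C1", "Manufacturing"), ("C2", "Manufacturing"), ("C3", "Retail")]

pairs = match_by_industry(fraud, clean)
```

In this toy example, F3 is left unmatched because no non-fraudulent Mining firm is available. How such cases are handled, and the matching ratio itself, are design choices that the sketch does not take from the paper.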
Findings: The results show that the M-score and F-score are, to varying degrees, ineffective at identifying fraudulent companies in the Chinese market. In contrast, the machine learning models show satisfactory performance, each exhibiting distinct advantages in reducing false negative and false positive rates.
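For reference, the false negative and false positive rates mentioned above can be computed from a model's predictions as in the sketch below. The label convention (1 = fraudulent, 0 = non-fraudulent), the function name `error_rates` and the toy labels are assumptions made for illustration and are not results from this study.

```python
def error_rates(y_true, y_pred):
    """Return (false negative rate, false positive rate).

    Assumed convention: 1 = fraudulent, 0 = non-fraudulent.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1  # fraud case the model missed
        elif truth == 0 and pred == 1:
            fp += 1  # non-fraudulent firm wrongly flagged
        else:
            tn += 1

    # Guard against empty classes to avoid division by zero.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return fnr, fpr

# Hypothetical labels: four fraudulent and four non-fraudulent firms.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]

fnr, fpr = error_rates(y_true, y_pred)  # (0.25, 0.5)
```

A model with a low false negative rate misses few fraud cases, while one with a low false positive rate rarely flags non-fraudulent firms, so a model may do well on one measure and less well on the other.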
Practical Implications: This research presents effective machine learning models for detecting and predicting financial statement fraud in the Chinese context, helping investors mitigate the risks of investing in the Chinese stock market.