Minh Nguyen Hoang is a Master's student in Computer Science at Ho Chi Minh University of Education, Viet Nam. He enjoys using his skills to contribute to exciting technological advances. His research interests focus mainly on Machine Learning, Deep Learning, and Data Science. He is currently working on a project on machine learning models for detecting accounting fraud. In the future, he aims to broaden his knowledge by researching further fields, e.g., detecting human emotion using Machine Learning. He also plans to pursue a Ph.D. degree in Computer Science.
This work presents a machine learning model that predicts signs of financial statement fraud by combining domain knowledge from machine learning and accounting. The input to the model is a published dataset of financial statements, and the output is a conclusion on whether each financial statement shows signs of fraud. XGBoost is currently recognized as one of the most popular classification methods, offering fast performance, flexibility, and scalability. However, its default settings are not well suited to fraud detection on imbalanced datasets. To overcome this drawback, this research introduces a new machine learning model based on the XGBoost technique, called f(raud)-XGBoost. The proposed model not only inherits the advantages of XGBoost but is also able to detect financial statement fraud. We use the Area Under the Receiver Operating Characteristic Curve (AUC) and NDCG@k for evaluation. The experimental results show that the new model performs slightly better than three existing models: a logistic regression model based on financial ratios, a support vector machine model, and a RUSBoost model.
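For readers unfamiliar with the two evaluation measures named above, the following is a minimal sketch of how they can be computed for a binary fraud label. It is an illustration under stated assumptions, not the evaluation code used in this work: the function names `ndcg_at_k` and `auc`, the binary relevance (1 = fraud, 0 = non-fraud), the log2 rank discount, and the toy labels and scores are all hypothetical choices made for the example.

```python
import math


def ndcg_at_k(labels, scores, k):
    """NDCG@k for binary labels (assumed: 1 = fraud, 0 = non-fraud)."""
    # Rank statements from most to least suspicious according to the model.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = [labels[i] for i in order[:k]]
    # Discounted cumulative gain: a hit at 0-based rank r is worth 1 / log2(r + 2).
    dcg = sum(rel / math.log2(rank + 2) for rank, rel in enumerate(top))
    # Ideal DCG: all fraud cases placed at the very top of the list.
    ideal = sorted(labels, reverse=True)[:k]
    idcg = sum(rel / math.log2(rank + 2) for rank, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


def auc(labels, scores):
    """AUC as the fraction of (fraud, non-fraud) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    # A tie between a fraud and a non-fraud score counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only: 2 fraud cases among 8 statements.
labels = [0, 1, 0, 0, 1, 0, 0, 0]
scores = [0.10, 0.90, 0.80, 0.20, 0.30, 0.05, 0.40, 0.15]

print("NDCG@3:", round(ndcg_at_k(labels, scores, 3), 4))
print("AUC:   ", round(auc(labels, scores), 4))
```

AUC summarizes ranking quality over all fraud and non-fraud pairs, while NDCG@k rewards only fraud cases placed among the k most suspicious statements. The second view is useful when only a limited number of statements can be examined.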