Anomaly Detection in Accounting Entries Using Deep Learning with Autoencoder Neural Networks

Document Type: Research Paper

Author

Accounting, Allameh Tabatabae'i University

Abstract

Detecting anomalies and fraud has long been one of the primary challenges facing auditors and examiners. This study proposes a novel framework for applying deep learning to financial auditing, using Autoencoder Neural Networks (AE-NN) to detect anomalies in accounting information systems. The theoretical foundations of anomalies in bookkeeping and in the recording of financial documents are reviewed first. Two real-world datasets are then analyzed: 36,538 accounting entries from the Rahkaran system and 30,000 entries from the Sepidar system, both covering fiscal years 2021–2023. To evaluate the detection power of the proposed method, several artificial anomalies were injected into the datasets. The empirical results indicate that the AE-NN identifies anomalous accounting entries more effectively than the alternative machine learning techniques tested: Decision Trees, Extra Trees Regressor, Random Forest, AdaBoost Classifier, and Quadratic Discriminant Analysis. Increasing the network depth further improved anomaly detection accuracy. Reconstruction error also varied across the subsystems that generate the documents, which indicates that the originating subsystem should be taken into account when performing anomaly detection. The most influential features for detecting anomalies were the counterparty account (detailed ledger), the subsidiary account, the cost center, and the document's last modification date. Overall, the findings suggest that AE-NN offers an effective and practical approach for enhancing the reliability of accounting information systems and supporting auditors in fraud detection.
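The abstract does not describe the network architecture, the loss function, or the feature encoding, so the following is only a minimal sketch of the general idea of scoring entries by autoencoder reconstruction error. Everything in it is an assumption made for illustration: the synthetic two-field entries (account and cost center), the one-hot encoding, the 10-3-10 layer sizes, the binary cross-entropy loss, the training settings, and all names. It is not the study's model or data.

```python
import numpy as np

rng = np.random.default_rng(0)
N_ACCOUNTS, N_CENTERS = 5, 5  # hypothetical category counts


def one_hot_entries(accounts, centers):
    """Encode (account, cost center) index pairs as concatenated one-hot rows."""
    rows = np.arange(len(accounts))
    x = np.zeros((len(accounts), N_ACCOUNTS + N_CENTERS))
    x[rows, accounts] = 1.0
    x[rows, N_ACCOUNTS + centers] = 1.0
    return x


# Regular entries (synthetic): each account always posts to its "own" cost center.
acc = rng.integers(0, N_ACCOUNTS, size=1000)
regular = one_hot_entries(acc, acc)

# Injected anomalies: valid values, but in combinations regular entries never use.
acc_bad = rng.integers(0, N_ACCOUNTS, size=10)
shift = rng.integers(1, N_CENTERS, size=10)  # 1..4, so the center never matches
injected = one_hot_entries(acc_bad, (acc_bad + shift) % N_CENTERS)

X = np.vstack([regular, injected])
is_injected = np.arange(len(X)) >= len(regular)

# Autoencoder 10 -> 3 -> 10: tanh bottleneck, sigmoid output (assumed sizes).
HIDDEN = 3
W1 = rng.normal(0.0, 0.5, size=(X.shape[1], HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(0.0, 0.5, size=(HIDDEN, X.shape[1]))
b2 = np.zeros(X.shape[1])


def forward(x):
    """Return the bottleneck activations and the reconstruction of x."""
    h = np.tanh(x @ W1 + b1)
    y = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
    return h, y


# Full-batch gradient descent on the mean binary cross-entropy.
LEARNING_RATE, EPOCHS = 0.5, 3000
for _ in range(EPOCHS):
    H, Y = forward(X)
    dZ2 = (Y - X) / len(X)               # gradient w.r.t. the output logits
    dZ1 = (dZ2 @ W2.T) * (1.0 - H ** 2)  # backpropagated through tanh
    W2 -= LEARNING_RATE * (H.T @ dZ2)
    b2 -= LEARNING_RATE * dZ2.sum(axis=0)
    W1 -= LEARNING_RATE * (X.T @ dZ1)
    b1 -= LEARNING_RATE * dZ1.sum(axis=0)

# Anomaly score = per-entry reconstruction error (binary cross-entropy).
_, Y = forward(X)
Y = np.clip(Y, 1e-7, 1.0 - 1e-7)
scores = -(X * np.log(Y) + (1.0 - X) * np.log(1.0 - Y)).sum(axis=1)

print("mean score, regular entries :", scores[~is_injected].mean())
print("mean score, injected entries:", scores[is_injected].mean())
```

In a setting like the one the abstract describes, such scores would typically be ranked or compared against a threshold to select entries for review. Because the study reports that reconstruction error varies across the subsystems generating the documents, one option would be to choose that threshold separately for each originating subsystem.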
