Machine learning techniques in healthcare provider fraud detection and analysis: A systematic literature review
Abstract
Healthcare fraud is a growing concern, resulting in substantial financial losses and threatening the quality and trustworthiness of healthcare delivery. According to the National Health Care Anti-Fraud Association (NHCAA), healthcare fraud costs the economy tens of billions of dollars annually. Fraudulent activities, including upcoding, billing for services that were never provided, and illegal kickbacks, contribute to rising healthcare costs, increased insurance premiums, and reduced quality of patient care. Combating healthcare fraud requires advanced detection systems, strict regulatory enforcement, and greater awareness among providers and patients. Machine learning (ML), a field within artificial intelligence, has emerged as a critical tool in healthcare fraud detection. This literature review examines the most recent scholarly articles on ML applications in fraud analytics, with a focus on (1) identifying and categorizing ML models used for provider fraud detection, (2) evaluating the effectiveness and challenges of ML-based approaches, and (3) exploring emerging trends and future advancements in fraud analytics. The findings reveal that supervised learning models, such as logistic regression, decision trees, and deep neural networks, and unsupervised techniques, such as anomaly detection and clustering, are widely used to identify fraudulent patterns. Hybrid approaches that combine multiple ML models have demonstrated improved detection accuracy. Blockchain technology, an advanced database mechanism, can be used alongside ML to improve the security, efficiency, and interoperability of healthcare data management and fraud detection. Nonetheless, challenges remain, including poor data quality and standardization, data imbalance, evolving fraud tactics, and privacy concerns.
This review aims to assist researchers, professionals, and policymakers in implementing and managing ML models for fraud detection by providing insights into the key factors that influence these models. Understanding these factors can improve decision-making in research projects and organizational operations, ultimately contributing to more effective fraud mitigation in healthcare using state-of-the-art ML techniques.
Copyright (c) 2025 Author(s)

This work is licensed under a Creative Commons Attribution 4.0 International License.