Financial Fraud Identification and Interpretability Study for Listed Companies Based on Convolutional Neural Network
- URL: http://arxiv.org/abs/2512.06648v2
- Date: Wed, 10 Dec 2025 02:58:32 GMT
- Title: Financial Fraud Identification and Interpretability Study for Listed Companies Based on Convolutional Neural Network
- Authors: Xiao Li,
- Abstract summary: This paper proposes a financial fraud detection framework for Chinese A-share listed companies based on convolutional neural networks (CNNs)<n> Experiments show that the CNN outperforms logistic regression and LightGBM in accuracy, robustness, and early-warning performance.<n>We find that solvency, ratio structure, governance structure, and internal control are general predictors of fraud, while environmental indicators matter mainly in high-pollution industries.
- Score: 4.504327589607446
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Since the emergence of joint-stock companies, financial fraud by listed firms has repeatedly undermined capital markets. Fraud is difficult to detect because of covert tactics and the high labor and time costs of audits. Traditional statistical models are interpretable but struggle with nonlinear feature interactions, while machine learning models are powerful but often opaque. In addition, most existing methods judge fraud only for the current year based on current year data, limiting timeliness. This paper proposes a financial fraud detection framework for Chinese A-share listed companies based on convolutional neural networks (CNNs). We design a feature engineering scheme that transforms firm-year panel data into image like representations, enabling the CNN to capture cross-sectional and temporal patterns and to predict fraud in advance. Experiments show that the CNN outperforms logistic regression and LightGBM in accuracy, robustness, and early-warning performance, and that proper tuning of the classification threshold is crucial in high-risk settings. To address interpretability, we analyze the model along the dimensions of entity, feature, and time using local explanation techniques. We find that solvency, ratio structure, governance structure, and internal control are general predictors of fraud, while environmental indicators matter mainly in high-pollution industries. Non-fraud firms share stable feature patterns, whereas fraud firms exhibit heterogeneous patterns concentrated in short time windows. A case study of Guanong Shares in 2022 shows that cash flow analysis, social responsibility, governance structure, and per-share indicators are the main drivers of the model's fraud prediction, consistent with the company's documented misconduct.
Related papers
- Towards Explainable and Reliable AI in Finance [2.666791490663749]
We present several approaches to explainable and reliable AI in finance.<n>By integrating predictive performance with reliability estimation and rule-based reasoning, our framework advances transparent and auditable financial AI systems.
arXiv Detail & Related papers (2025-10-30T11:05:15Z) - Revisiting Multivariate Time Series Forecasting with Missing Values [65.30332997607141]
Missing values are common in real-world time series.<n>Current approaches have developed an imputation-then-prediction framework that uses imputation modules to fill in missing values, followed by forecasting on the imputed data.<n>This framework overlooks a critical issue: there is no ground truth for the missing values, making the imputation process susceptible to errors that can degrade prediction accuracy.<n>We introduce Consistency-Regularized Information Bottleneck (CRIB), a novel framework built on the Information Bottleneck principle.
arXiv Detail & Related papers (2025-09-27T20:57:48Z) - Your AI, Not Your View: The Bias of LLMs in Investment Analysis [62.388554963415906]
In finance, Large Language Models (LLMs) face frequent knowledge conflicts arising from discrepancies between their pre-trained parametric knowledge and real-time market data.<n>These conflicts are especially problematic in real-world investment services, where a model's inherent biases can misalign with institutional objectives.<n>We propose an experimental framework to investigate emergent behaviors in such conflict scenarios, offering a quantitative analysis of bias in investment analysis.
arXiv Detail & Related papers (2025-07-28T16:09:38Z) - Machine Learning based Enterprise Financial Audit Framework and High Risk Identification [6.433444278723668]
This study proposes an AI-driven framework for enterprise financial audits and high-risk identification.<n>Using a dataset from the Big Four accounting firms (EY, PwC, Deloitte, KPMG) from 2020 to 2025, the research examines trends in risk assessment, compliance violations, and fraud detection.<n>The study recommends adopting Random Forest as a core model, enhancing features via engineering, and implementing real-time risk monitoring.
arXiv Detail & Related papers (2025-07-08T00:22:49Z) - On the Potential of Network-Based Features for Fraud Detection [3.0846824529023382]
This article explores using the personalised PageRank (PPR) algorithm to capture the social dynamics of fraud.
The primary objective is to compare the performance of traditional features with the addition of PPR in fraud detection models.
Results indicate that integrating PPR enhances the model's predictive power, surpassing the baseline model.
arXiv Detail & Related papers (2024-02-14T13:20:09Z) - Explainable Fraud Detection with Deep Symbolic Classification [4.1205832766381985]
We present Deep Classification, an extension of the Deep Symbolic Regression framework to classification problems.
Because the functions are mathematical expressions that are in closed-form and concise, the model is inherently explainable both at the level of a single classification decision and the model's decision process.
An evaluation on the PaySim data set demonstrates competitive predictive performance with state-of-the-art models, while surpassing them in terms of explainability.
arXiv Detail & Related papers (2023-12-01T13:50:55Z) - Designing an attack-defense game: how to increase robustness of
financial transaction models via a competition [69.08339915577206]
Given the escalating risks of malicious attacks in the finance sector, understanding adversarial strategies and robust defense mechanisms for machine learning models is critical.
We aim to investigate the current state and dynamics of adversarial attacks and defenses for neural network models that use sequential financial data as the input.
We have designed a competition that allows realistic and detailed investigation of problems in modern financial transaction data.
The participants compete directly against each other, so possible attacks and defenses are examined in close-to-real-life conditions.
arXiv Detail & Related papers (2023-08-22T12:53:09Z) - Learning for Counterfactual Fairness from Observational Data [62.43249746968616]
Fairness-aware machine learning aims to eliminate biases of learning models against certain subgroups described by certain protected (sensitive) attributes such as race, gender, and age.
A prerequisite for existing methods to achieve counterfactual fairness is the prior human knowledge of the causal model for the data.
In this work, we address the problem of counterfactually fair prediction from observational data without given causal models by proposing a novel framework CLAIRE.
arXiv Detail & Related papers (2023-07-17T04:08:29Z) - Transaction Fraud Detection via an Adaptive Graph Neural Network [64.9428588496749]
We propose an Adaptive Sampling and Aggregation-based Graph Neural Network (ASA-GNN) that learns discriminative representations to improve the performance of transaction fraud detection.
A neighbor sampling strategy is performed to filter noisy nodes and supplement information for fraudulent nodes.
Experiments on three real financial datasets demonstrate that the proposed method ASA-GNN outperforms state-of-the-art ones.
arXiv Detail & Related papers (2023-07-11T07:48:39Z) - Bayesian Bilinear Neural Network for Predicting the Mid-price Dynamics
in Limit-Order Book Markets [84.90242084523565]
Traditional time-series econometric methods often appear incapable of capturing the true complexity of the multi-level interactions driving the price dynamics.
By adopting a state-of-the-art second-order optimization algorithm, we train a Bayesian bilinear neural network with temporal attention.
By addressing the use of predictive distributions to analyze errors and uncertainties associated with the estimated parameters and model forecasts, we thoroughly compare our Bayesian model with traditional ML alternatives.
arXiv Detail & Related papers (2022-03-07T18:59:54Z) - Relational Graph Neural Networks for Fraud Detection in a Super-App
environment [53.561797148529664]
We propose a framework of relational graph convolutional networks methods for fraudulent behaviour prevention in the financial services of a Super-App.
We use an interpretability algorithm for graph neural networks to determine the most important relations to the classification task of the users.
Our results show that there is an added value when considering models that take advantage of the alternative data of the Super-App and the interactions found in their high connectivity.
arXiv Detail & Related papers (2021-07-29T00:02:06Z) - Every Corporation Owns Its Image: Corporate Credit Ratings via
Convolutional Neural Networks [2.867517731896504]
We analyze the performance of traditional machine learning models in predicting corporate credit rating.
We propose a novel end-to-end method, Corporate Credit Ratings via Convolutional Neural Networks, CCR-CNN.
CCR-CNN outperforms the state-of-the-art methods consistently.
arXiv Detail & Related papers (2020-12-03T01:01:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.