Related papers: Adversarial Semi-supervised Learning for Corporate Credit Ratings

URL: http://arxiv.org/abs/2104.02479v1
Date: Sun, 4 Apr 2021 09:05:53 GMT
Title: Adversarial Semi-supervised Learning for Corporate Credit Ratings
Authors: Bojing Feng, Wenfang Xue
Abstract summary: In this work, we consider the problem of adversarial semi-supervised learning for corporate credit rating. In the first phase, we train a normal rating system via a normal machine-learning algorithm to assign pseudo rating levels to unlabeled data. In the second phase, adversarial semi-supervised learning is applied to the union of labeled and pseudo-labeled data.
Score: 1.90365714903665
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Corporate credit rating is an analysis of the credit risks within a corporation and plays a vital role in financial risk management. Traditionally, the rating assessment process, which is based on the historical profile of a corporation, is expensive and complicated and often takes months. As a result, most corporations, lacking the money and time, cannot obtain a credit rating of their own. However, we believe that although these corporations have no credit rating levels (unlabeled data), this large body of data contains useful knowledge for improving the credit rating system. The major challenge in this work lies in how to learn effectively from unlabeled data and use that knowledge to improve the performance of the credit rating system. Specifically, we consider the problem of adversarial semi-supervised learning (ASSL) for corporate credit rating, which has rarely been researched before. To address these problems, we propose a novel two-phase framework, adversarial semi-supervised learning for corporate credit rating (ASSL4CCR). In the first phase, we train a normal rating system via a normal machine-learning algorithm to assign pseudo rating levels to unlabeled data. Then, in the second phase, adversarial semi-supervised learning is applied to the union of labeled and pseudo-labeled data. To demonstrate the effectiveness of the proposed ASSL4CCR, we conduct extensive experiments on the Chinese public-listed corporate rating dataset, which show that ASSL4CCR consistently outperforms the state-of-the-art methods.
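The two-phase procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' ASSL4CCR implementation: the abstract does not name the base learner or the adversarial formulation, so softmax regression stands in for the "normal machine-learning algorithm" and an FGSM-style input perturbation stands in for the adversarial step. All function names, the toy data, and the parameters (`eps`, `lr`, `epochs`) are hypothetical.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_softmax(X, y, n_classes, epochs=300, lr=0.1, eps=0.0):
    """Softmax regression by full-batch gradient descent.

    If eps > 0, each epoch also trains on FGSM-style perturbed copies
    of the inputs (an assumed stand-in for the adversarial phase).
    """
    W = np.zeros((X.shape[1], n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot targets
    for _ in range(epochs):
        X_batch, Y_batch = X, Y
        if eps > 0:
            # Gradient of the cross-entropy loss w.r.t. the inputs.
            grad_x = (softmax(X @ W + b) - Y) @ W.T
            X_adv = X + eps * np.sign(grad_x)
            X_batch = np.vstack([X, X_adv])
            Y_batch = np.vstack([Y, Y])
        G = (softmax(X_batch @ W + b) - Y_batch) / len(X_batch)
        W -= lr * (X_batch.T @ G)
        b -= lr * G.sum(axis=0)
    return W, b

def predict(X, W, b):
    return (X @ W + b).argmax(axis=1)

def two_phase_fit(X_lab, y_lab, X_unlab, n_classes, eps=0.1):
    # Phase 1: train on labeled data only, then pseudo-label the rest.
    W0, b0 = train_softmax(X_lab, y_lab, n_classes)
    pseudo = predict(X_unlab, W0, b0)
    # Phase 2: adversarial training on labeled + pseudo-labeled data.
    X_all = np.vstack([X_lab, X_unlab])
    y_all = np.concatenate([y_lab, pseudo])
    W, b = train_softmax(X_all, y_all, n_classes, eps=eps)
    return W, b, pseudo

def make_toy_data(rng, n_per_class):
    # Hypothetical toy data: three Gaussian blobs standing in for rating levels.
    centers = np.array([[-4.0, 0.0], [4.0, 0.0], [0.0, 5.0]])
    X = np.vstack([c + rng.normal(size=(n_per_class, 2)) for c in centers])
    y = np.repeat(np.arange(len(centers)), n_per_class)
    return X, y

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_lab, y_lab = make_toy_data(rng, 10)    # few labeled corporations
    X_unlab, _ = make_toy_data(rng, 200)     # many unlabeled corporations
    W, b, pseudo = two_phase_fit(X_lab, y_lab, X_unlab, n_classes=3)
    X_test, y_test = make_toy_data(rng, 100)
    print("test accuracy:", (predict(X_test, W, b) == y_test).mean())
```

The sketch keeps the pseudo-labels fixed after phase 1. Whether the original method refines them during phase 2 is not stated in the abstract.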

Related papers

Accurate Forgetting for Heterogeneous Federated Continual Learning [89.08735771893608]
We propose a new concept, accurate forgetting (AF), and develop a novel generative-replay method which selectively utilizes previous knowledge in federated networks. We employ a probabilistic framework based on a normalizing flow model to quantify the credibility of previous knowledge.
arXiv Detail & Related papers (2025-02-20T02:35:17Z)
DeCrisisMB: Debiased Semi-Supervised Learning for Crisis Tweet Classification via Memory Bank [52.20298962359658]
In crisis events, people often use social media platforms such as Twitter to disseminate information about the situation, warnings, advice, and support. Fully-supervised approaches require annotating vast amounts of data and are impractical due to limited response time. Semi-supervised models can be biased, performing moderately well for certain classes while performing extremely poorly for others. We propose a simple but effective debiasing method, DeCrisisMB, that utilizes a Memory Bank to store generated pseudo-labels and sample them equally from each class at each training iteration.
arXiv Detail & Related papers (2023-10-23T05:25:51Z)
Empowering Many, Biasing a Few: Generalist Credit Scoring through Large Language Models [53.620827459684094]
Large Language Models (LLMs) have great potential for credit scoring tasks, with strong generalization ability across multiple tasks. We propose the first open-source comprehensive framework for exploring LLMs for credit scoring. We then propose the first Credit and Risk Assessment Large Language Model (CALM) by instruction tuning, tailored to the nuanced demands of various financial risk assessment tasks.
arXiv Detail & Related papers (2023-10-01T03:50:34Z)
KoLA: Carefully Benchmarking World Knowledge of Large Language Models [87.96683299084788]
We construct a Knowledge-oriented LLM Assessment benchmark (KoLA). We mimic human cognition to form a four-level taxonomy of knowledge-related abilities, covering 19 tasks. We use both Wikipedia, a corpus on which LLMs are commonly pre-trained, and continuously collected emerging corpora to evaluate the capacity to handle unseen data and evolving knowledge.
arXiv Detail & Related papers (2023-06-15T17:20:46Z)
On the dynamics of credit history and social interaction features, and their impact on creditworthiness assessment performance [3.6748639131154315]
This study aims to understand the dynamics of creditworthiness assessment performance and how they are influenced by credit history, repayment behavior, and social network features. Our research shows that borrowers' history increases performance at a decreasing rate during the first six months and then stabilizes. The most notable effect of social network features on performance occurs at loan application.
arXiv Detail & Related papers (2022-04-13T00:42:27Z)
On the combination of graph data for assessing thin-file borrowers' creditworthiness [0.0]
We introduce a framework to improve credit scoring models by blending several Graph Representation Learning methods. We validated this framework using a unique dataset that characterizes the relationships and credit history for the entire population of a Latin American country. In corporate lending, where the gain is much higher, it confirms that evaluating an unbanked company cannot rely solely on its features.
arXiv Detail & Related papers (2021-11-26T18:45:23Z)
Feature-Level Fusion of Super-App and Telecommunication Alternative Data Sources for Credit Card Fraud Detection [106.33204064461802]
We review the effectiveness of a feature-level fusion of super-app customer information, mobile phone line data, and traditional credit risk variables for the early detection of identity theft credit card fraud. We evaluate our approach over approximately 90,000 users from a credit lender's digital platform database.
arXiv Detail & Related papers (2021-11-05T19:10:35Z)
Bagging Supervised Autoencoder Classifier for Credit Scoring [3.5977219275318166]
The imbalanced nature of credit scoring datasets and the heterogeneous nature of their features pose difficulties in developing and implementing effective credit scoring models. We propose the Bagging Supervised Autoencoder Classifier (BSAC), which mainly leverages the superior performance of the Supervised Autoencoder. BSAC also addresses the data imbalance problem by employing a variant of the Bagging process based on undersampling of the majority class.
arXiv Detail & Related papers (2021-08-12T17:49:08Z)
Contrastive Pre-training for Imbalanced Corporate Credit Ratings [1.90365714903665]
We propose Contrastive Pre-training for Corporate Credit Rating (CP4CCR), which utilizes self-supervision to overcome class imbalance. Experiments conducted on the Chinese public-listed corporate rating dataset show that CP4CCR can improve the performance of standard corporate credit rating models.
arXiv Detail & Related papers (2021-02-18T08:14:46Z)
Explanations of Machine Learning predictions: a mandatory step for its application to Operational Processes [61.20223338508952]
Credit Risk Modelling plays a paramount role. Recent machine and deep learning techniques have been applied to the task. We suggest using the LIME technique to tackle the explainability problem in this field.
arXiv Detail & Related papers (2020-12-30T10:27:59Z)
PCAL: A Privacy-preserving Intelligent Credit Risk Modeling Framework Based on Adversarial Learning [111.19576084222345]
This paper proposes a framework of Privacy-preserving Credit risk modeling based on Adversarial Learning (PCAL). PCAL aims to mask the private information inside the original dataset while maintaining the utility information that is important for performance on the target prediction task. Results indicate that PCAL can learn an effective, privacy-free representation from user data, providing a solid foundation towards privacy-preserving machine learning for credit risk analysis.
arXiv Detail & Related papers (2020-10-06T07:04:59Z)

This list is automatically generated from the titles and abstracts of the papers on this site.