Contrastive Pre-training for Imbalanced Corporate Credit Ratings
- URL: http://arxiv.org/abs/2102.12580v1
- Date: Thu, 18 Feb 2021 08:14:46 GMT
- Title: Contrastive Pre-training for Imbalanced Corporate Credit Ratings
- Authors: Bojing Feng, Wenfang Xue
- Abstract summary: We propose Contrastive Pre-training for Corporate Credit Rating (CP4CCR), which utilizes self-supervision to overcome class imbalance.
Experiments conducted on a Chinese public-listed corporate rating dataset show that CP4CCR can improve the performance of standard corporate credit rating models.
- Score: 1.90365714903665
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Corporate credit rating reflects the level of corporate credit and plays a
crucial role in modern financial risk control. However, real-world credit
rating data usually show long-tail distributions, and the resulting heavy
class imbalance greatly challenges corporate credit rating systems. To tackle
this, inspired by recent advances in pre-training techniques for
self-supervised representation learning, we propose a novel framework named
Contrastive Pre-training for Corporate Credit Rating (CP4CCR), which utilizes
self-supervision to overcome class imbalance. Specifically, in the first
phase we perform contrastive self-supervised pre-training without label
information, aiming to learn a better class-agnostic initialization. During
this phase, two self-supervised tasks are developed within CP4CCR: (i)
Feature Masking (FM) and (ii) Feature Swapping (FS). In the second phase, any
standard corporate credit rating model can be trained, initialized from the
pre-trained network. Extensive experiments conducted on a Chinese
public-listed corporate rating dataset show that CP4CCR can improve the
performance of standard corporate credit rating models, especially for
classes with few samples.
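The abstract names two self-supervised tasks, Feature Masking (FM) and Feature Swapping (FS), but does not spell out their details. The sketch below shows one plausible formulation for tabular credit features; the function names, masking rate, and swapping scheme are assumptions for illustration, not the paper's exact method:

```python
import numpy as np

def feature_masking(x, mask_rate=0.3, rng=None):
    """FM sketch: randomly zero out a fraction of feature entries."""
    rng = rng or np.random.default_rng(0)
    keep = rng.random(x.shape) >= mask_rate  # keep ~(1 - mask_rate) of entries
    return x * keep

def feature_swapping(x, swap_rate=0.3, rng=None):
    """FS sketch: for a fraction of entries, replace a sample's feature value
    with the same feature taken from another randomly chosen sample."""
    rng = rng or np.random.default_rng(0)
    x_aug = x.copy()
    n, d = x.shape
    for j in range(d):
        rows = rng.random(n) < swap_rate          # which samples to perturb
        donors = rng.integers(0, n, size=rows.sum())  # donor samples, same column
        x_aug[rows, j] = x[donors, j]
    return x_aug

# Two augmented "views" of the same batch, as used in contrastive pre-training
X = np.arange(12, dtype=float).reshape(4, 3)
view_a, view_b = feature_masking(X), feature_swapping(X)
```

In a contrastive setup, the two views of each sample would form a positive pair, with other samples in the batch serving as negatives.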
Related papers
- Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study [61.65123150513683]
Multimodal foundation models, such as CLIP, produce state-of-the-art zero-shot results.
It is reported that these models close the robustness gap by matching the performance of supervised models trained on ImageNet.
We show that CLIP leads to a significant robustness drop compared to supervised ImageNet models on our benchmark.
arXiv Detail & Related papers (2024-03-15T17:33:49Z) - The Effects of Data Imbalance Under a Federated Learning Approach for
Credit Risk Forecasting [0.0]
Credit risk forecasting plays a crucial role for commercial banks and other financial institutions in granting loans to customers.
Traditional machine learning methods require the sharing of sensitive client information with an external server to build a global model.
A newly developed privacy-preserving distributed machine learning technique known as Federated Learning (FL) allows the training of a global model without the necessity of accessing private local data directly.
arXiv Detail & Related papers (2024-01-14T09:15:10Z) - Empowering Many, Biasing a Few: Generalist Credit Scoring through Large
Language Models [53.620827459684094]
Large Language Models (LLMs) have great potential for credit scoring tasks, with strong generalization ability across multiple tasks.
We propose the first open-source comprehensive framework for exploring LLMs for credit scoring.
We then propose the first Credit and Risk Assessment Large Language Model (CALM) by instruction tuning, tailored to the nuanced demands of various financial risk assessment tasks.
arXiv Detail & Related papers (2023-10-01T03:50:34Z) - Client-side Gradient Inversion Against Federated Learning from Poisoning [59.74484221875662]
Federated Learning (FL) enables distributed participants to train a global model without sharing data directly with a central server.
Recent studies have revealed that FL is vulnerable to gradient inversion attack (GIA), which aims to reconstruct the original training samples.
We propose Client-side poisoning Gradient Inversion (CGI), a novel attack method that can be launched from clients.
arXiv Detail & Related papers (2023-09-14T03:48:27Z) - Stabilizing and Improving Federated Learning with Non-IID Data and
Client Dropout [15.569507252445144]
Data heterogeneity induced by label distribution skew has been shown to be a significant obstacle that limits model performance in federated learning.
We propose a simple yet effective framework by introducing a prior-calibrated softmax function for computing the cross-entropy loss.
The improved model performance over existing baselines in the presence of non-IID data and client dropout is demonstrated.
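The prior-calibrated softmax mentioned above resembles logit adjustment for class imbalance. A minimal sketch, assuming the calibration simply adds log class priors to the logits before the softmax (the paper's exact formulation may differ):

```python
import numpy as np

def prior_calibrated_log_softmax(logits, class_priors):
    """Adjust logits by log class priors, then take a numerically
    stable log-softmax (a logit-adjustment-style sketch)."""
    adjusted = logits + np.log(class_priors)
    adjusted = adjusted - adjusted.max(axis=-1, keepdims=True)
    return adjusted - np.log(np.exp(adjusted).sum(axis=-1, keepdims=True))

def calibrated_cross_entropy(logits, labels, class_priors):
    """Cross-entropy loss computed on the prior-calibrated log-probabilities."""
    log_probs = prior_calibrated_log_softmax(logits, class_priors)
    return -log_probs[np.arange(len(labels)), labels].mean()

# With uniform priors this reduces to the standard cross-entropy
logits = np.array([[0.0, 0.0]])
loss = calibrated_cross_entropy(logits, np.array([0]), np.array([0.5, 0.5]))
```

Skewed priors shift the calibrated probabilities toward the majority class during training, which in turn discourages the model from over-predicting it at test time.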
arXiv Detail & Related papers (2023-03-11T05:17:59Z) - FedABC: Targeting Fair Competition in Personalized Federated Learning [76.9646903596757]
Federated learning aims to collaboratively train models without accessing clients' local private data.
We propose a novel and generic PFL framework termed Federated Averaging via Binary Classification, dubbed FedABC.
In particular, we adopt the "one-vs-all" training strategy in each client to alleviate the unfair competition between classes.
arXiv Detail & Related papers (2023-02-15T03:42:59Z) - Multi-task Envisioning Transformer-based Autoencoder for Corporate
Credit Rating Migration Early Prediction [18.374597213278626]
Being able to predict rating changes will greatly benefit both investors and regulators alike.
In this paper, we consider the corporate credit rating migration early prediction problem.
We propose a new Multi-task Envisioning Transformer-based Autoencoder model to tackle this problem.
arXiv Detail & Related papers (2022-07-10T21:12:04Z) - Cooperative Multi-Agent Actor-Critic for Privacy-Preserving Load
Scheduling in a Residential Microgrid [71.17179010567123]
We propose a privacy-preserving multi-agent actor-critic framework where the decentralized actors are trained with distributed critics.
The proposed framework can preserve the privacy of the households while simultaneously learning the multi-agent credit assignment mechanism implicitly.
arXiv Detail & Related papers (2021-10-06T14:05:26Z) - Bagging Supervised Autoencoder Classifier for Credit Scoring [3.5977219275318166]
The imbalanced nature of credit scoring datasets, as well as the heterogeneous nature of features in credit scoring datasets, pose difficulties in developing and implementing effective credit scoring models.
We propose the Bagging Supervised Autoencoder (BSAC) that mainly leverages the superior performance of the Supervised Autoencoder.
BSAC also addresses the data imbalance problem by employing a variant of the Bagging process based on the undersampling of the majority class.
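The undersampling-based Bagging variant described for BSAC can be sketched generically as follows; this is an illustration of the idea, not the paper's exact procedure, and the helper names are assumptions:

```python
import numpy as np

def undersample_majority(X, y, rng=None):
    """Balance a dataset by undersampling every class down to the
    size of the smallest class (sketch)."""
    rng = rng or np.random.default_rng(0)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    idx = np.concatenate([
        rng.choice(np.where(y == c)[0], size=n_min, replace=False)
        for c in classes
    ])
    return X[idx], y[idx]

def bagged_undersampled_sets(X, y, n_bags=5):
    """Each bag is an independently undersampled, balanced training set,
    one per base learner in the ensemble."""
    return [undersample_majority(X, y, np.random.default_rng(s))
            for s in range(n_bags)]

# Toy imbalanced dataset: 8 majority-class samples, 2 minority-class samples
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.array([0] * 8 + [1] * 2)
bags = bagged_undersampled_sets(X, y, n_bags=3)
```

Because each bag draws a different majority-class subset, the ensemble sees most of the majority data across bags while each base learner trains on a balanced set.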
arXiv Detail & Related papers (2021-08-12T17:49:08Z) - Adversarial Semi-supervised Learning for Corporate Credit Ratings [1.90365714903665]
In this work, we consider the problem of adversarial semi-supervised learning for corporate credit rating.
In the first phase, we train a normal rating system via a standard machine-learning algorithm to assign pseudo rating labels to unlabeled data.
In the second phase, adversarial semi-supervised learning is applied, uniting labeled and pseudo-labeled data.
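The first-phase pseudo-labeling step can be illustrated with a confidence-thresholded sketch; the threshold and the filtering are assumptions for illustration, as the paper may keep all pseudo-labels:

```python
import numpy as np

def pseudo_label(predict_proba, X_unlabeled, threshold=0.9):
    """Keep unlabeled samples whose top predicted rating probability
    meets a confidence threshold; use the argmax class as a pseudo label."""
    probs = predict_proba(X_unlabeled)
    confident = probs.max(axis=1) >= threshold
    return X_unlabeled[confident], probs.argmax(axis=1)[confident]

# Toy rating model: fixed class probabilities for two unlabeled firms
toy_proba = lambda X: np.array([[0.95, 0.05], [0.60, 0.40]])
X_u = np.array([[1.0], [2.0]])
X_pseudo, y_pseudo = pseudo_label(toy_proba, X_u)
```

The pseudo-labeled pairs would then be merged with the labeled data for the second, adversarial training phase.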
arXiv Detail & Related papers (2021-04-04T09:05:53Z) - Explanations of Machine Learning predictions: a mandatory step for its
application to Operational Processes [61.20223338508952]
Credit risk modelling plays a paramount role in operational processes.
Recent machine and deep learning techniques have been applied to the task.
We suggest using the LIME technique to tackle the explainability problem in this field.
arXiv Detail & Related papers (2020-12-30T10:27:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.