LendNova: Towards Automated Credit Risk Assessment with Language Models
- URL: http://arxiv.org/abs/2601.02573v1
- Date: Mon, 05 Jan 2026 21:53:36 GMT
- Title: LendNova: Towards Automated Credit Risk Assessment with Language Models
- Authors: Kiarash Shamsi, Danijel Novokmet, Joshua Peters, Mao Lin Liu, Paul K Edwards, Vahab Khoshdel,
- Abstract summary: This paper introduces LendNova, the first practical automated end-to-end pipeline for credit risk assessment. It is designed to utilize all available information in raw credit records by leveraging advanced NLP techniques and language models.
- Score: 1.22891098793232
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Credit risk assessment is essential in the financial sector, but has traditionally depended on costly feature-based models that often fail to utilize all available information in raw credit records. This paper introduces LendNova, the first practical automated end-to-end pipeline for credit risk assessment, designed to utilize all available information in raw credit records by leveraging advanced NLP techniques and language models. LendNova transforms risk modeling by operating directly on raw, jargon-heavy credit bureau text using a language model that learns task-relevant representations without manual feature engineering. By automatically capturing patterns and risk signals embedded in the text, it replaces manual preprocessing steps, reducing costs and improving scalability. Evaluation on real-world data further demonstrates its strong potential in accurate and efficient risk assessment. LendNova establishes a baseline for intelligent credit risk agents, demonstrating the feasibility of language models in this domain. It lays the groundwork for future research toward foundation systems that enable more accurate, adaptable, and automated financial decision-making.
Related papers
- Leveraging LLM Parametric Knowledge for Fact Checking without Retrieval [60.25608870901428]
Trustworthiness is a core research challenge for agentic AI systems built on Large Language Models (LLMs). We propose the task of fact-checking without retrieval, focusing on the verification of arbitrary natural language claims, independent of their source robustness.
arXiv Detail & Related papers (2026-03-05T18:42:51Z)
- Self-Improvement in Language Models: The Sharpening Mechanism [70.9248553790022]
We offer a new perspective on the capabilities of self-improvement through a lens we refer to as sharpening. Motivated by the observation that language models are often better at verifying response quality than they are at generating correct responses, we formalize self-improvement as using the model itself as a verifier during post-training. We analyze two natural families of self-improvement algorithms based on SFT and RLHF.
arXiv Detail & Related papers (2024-12-02T20:24:17Z)
- A Generative Approach to Credit Prediction with Learnable Prompts for Multi-scale Temporal Representation Learning [22.566512469446753]
FinLangNet is a novel framework that reformulates credit scoring as a multi-scale sequential learning problem. In extensive evaluations, FinLangNet significantly outperforms a production XGBoost system.
arXiv Detail & Related papers (2024-04-19T17:01:46Z)
- Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning [61.2224355547598]
Open-sourcing of large language models (LLMs) accelerates application development, innovation, and scientific progress.
Our investigation exposes a critical oversight in this belief.
By deploying carefully designed demonstrations, our research demonstrates that base LLMs could effectively interpret and execute malicious instructions.
arXiv Detail & Related papers (2024-04-16T13:22:54Z)
- Empowering Many, Biasing a Few: Generalist Credit Scoring through Large Language Models [53.620827459684094]
Large Language Models (LLMs) have great potential for credit scoring tasks, with strong generalization ability across multiple tasks.
We propose the first open-source comprehensive framework for exploring LLMs for credit scoring.
We then propose the first Credit and Risk Assessment Large Language Model (CALM) by instruction tuning, tailored to the nuanced demands of various financial risk assessment tasks.
arXiv Detail & Related papers (2023-10-01T03:50:34Z)
- Enabling Machine Learning Algorithms for Credit Scoring -- Explainable Artificial Intelligence (XAI) methods for clear understanding complex predictive models [2.1723750239223034]
This paper compares various predictive models (logistic regression, logistic regression with weight of evidence transformations and modern artificial intelligence algorithms) and shows that advanced tree-based models give the best results in predicting client default.
We also show how to boost advanced models using techniques that make them interpretable and more accessible to credit risk practitioners.
arXiv Detail & Related papers (2021-04-14T09:44:04Z)
- Explaining Credit Risk Scoring through Feature Contribution Alignment with Expert Risk Analysts [1.7778609937758323]
We focus on company credit scoring and benchmark different machine learning models.
The aim is to build a model to predict whether a company will experience financial problems in a given time horizon.
We bring light by providing an expert-aligned feature relevance score highlighting the disagreement between a credit risk expert and a model feature attribution explanation.
arXiv Detail & Related papers (2021-03-15T12:59:15Z)
- Sequential Deep Learning for Credit Risk Monitoring with Tabular Financial Data [0.901219858596044]
We present our attempts to create a novel approach to assessing credit risk using deep learning.
We propose a new credit card transaction sampling technique to use with deep recurrent and causal convolution-based neural networks.
We show that our sequential deep learning approach using a temporal convolutional network outperformed the benchmark non-sequential tree-based model.
arXiv Detail & Related papers (2020-12-30T21:29:48Z)
- Explanations of Machine Learning predictions: a mandatory step for its application to Operational Processes [61.20223338508952]
Credit Risk Modelling plays a paramount role.
Recent machine and deep learning techniques have been applied to the task.
We suggest using the LIME technique to tackle the explainability problem in this field.
arXiv Detail & Related papers (2020-12-30T10:27:59Z)
- Super-App Behavioral Patterns in Credit Risk Models: Financial, Statistical and Regulatory Implications [110.54266632357673]
We present the impact of alternative data that originates from an app-based marketplace, in contrast to traditional bureau data, upon credit scoring models.
Our results, validated across two countries, show that these new sources of data are particularly useful for predicting financial behavior in low-wealth and young individuals.
arXiv Detail & Related papers (2020-05-09T01:32:03Z)
- The value of text for small business default prediction: A deep learning approach [9.023847175654602]
It is standard policy for a loan officer to provide a textual loan assessment to mitigate limited data availability.
We exploit recent advances from the field of Deep Learning and Natural Language Processing to extract information from 60 000 textual assessments provided by a lender.
We find that the text alone is surprisingly effective for predicting default, but when combined with traditional data, it yields no additional predictive capability.
Our proposed deep learning model does, however, appear to be robust to the quality of the text and therefore suitable for partly automating the mSME lending process.
arXiv Detail & Related papers (2020-03-19T18:15:05Z)