Uncovering Representation Bias for Investment Decisions in Open-Source Large Language Models
- URL: http://arxiv.org/abs/2510.05702v2
- Date: Mon, 03 Nov 2025 01:00:40 GMT
- Title: Uncovering Representation Bias for Investment Decisions in Open-Source Large Language Models
- Authors: Fabrizio Dimino, Krati Saxena, Bhaskarjit Sarmah, Stefano Pasquali
- Abstract summary: This paper focuses on representation bias in open-source Qwen models. Using statistical tests and variance analysis, we find that firm size and valuation consistently increase model confidence. When models are prompted for specific financial categories, their confidence rankings best align with fundamental data, moderately with technical signals, and least with growth indicators.
- Score: 0.06749750044497731
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large Language Models are increasingly adopted in financial applications to support investment workflows. However, prior studies have seldom examined how these models reflect biases related to firm size, sector, or financial characteristics, which can significantly impact decision-making. This paper addresses this gap by focusing on representation bias in open-source Qwen models. We propose a balanced round-robin prompting method over approximately 150 U.S. equities, applying constrained decoding and token-logit aggregation to derive firm-level confidence scores across financial contexts. Using statistical tests and variance analysis, we find that firm size and valuation consistently increase model confidence, while risk factors tend to decrease it. Confidence varies significantly across sectors, with the Technology sector showing the greatest variability. When models are prompted for specific financial categories, their confidence rankings best align with fundamental data, moderately with technical signals, and least with growth indicators. These results highlight representation bias in Qwen models and motivate sector-aware calibration and category-conditioned evaluation protocols for safe and fair financial LLM deployment.
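The abstract mentions balanced round-robin prompting, constrained decoding, and token-logit aggregation without giving details. The sketch below is one plausible reading, not the authors' implementation: every ordered pair of tickers is scored once, the softmax is restricted to the two candidate tickers, and each firm's probability is averaged over all its matchups. The names `round_robin_confidence`, `toy_logit_fn`, and `TOY_PRIOR` are hypothetical, and the toy logit function stands in for a real model call.

```python
import math
from collections import defaultdict
from itertools import permutations


def constrained_probs(logits):
    """Softmax restricted to the allowed answer tokens (here, two tickers)."""
    peak = max(logits.values())
    exps = {k: math.exp(v - peak) for k, v in logits.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}


def round_robin_confidence(tickers, logit_fn):
    """Average each ticker's constrained probability over all ordered pairings."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    # permutations(..., 2) yields both (a, b) and (b, a), so every firm appears
    # equally often in the first and second prompt position.
    for first, second in permutations(tickers, 2):
        probs = constrained_probs(logit_fn(first, second))
        for ticker in (first, second):
            totals[ticker] += probs[ticker]
            counts[ticker] += 1
    return {t: totals[t] / counts[t] for t in tickers}


# Hypothetical stand-in for a model call: one logit per candidate ticker.
# The +0.5 on the first slot mimics a position bias that balancing averages out.
TOY_PRIOR = {"AAA": 2.0, "BBB": 1.0, "CCC": 0.0}


def toy_logit_fn(first, second):
    return {first: TOY_PRIOR[first] + 0.5, second: TOY_PRIOR[second]}


scores = round_robin_confidence(list(TOY_PRIOR), toy_logit_fn)
for ticker, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{ticker}: {score:.3f}")
```

Scoring both orderings of each pair is what makes the design balanced in this sketch: a bias toward whichever ticker is listed first contributes equally to both firms in a pair and so does not change their relative ranking.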
Related papers
- Robust Reinforcement Learning in Finance: Modeling Market Impact with Elliptic Uncertainty Sets [57.179679246370114]
In financial applications, reinforcement learning (RL) agents are commonly trained on historical data, where their actions do not influence prices. During deployment, these agents trade in live markets where their own transactions can shift asset prices, a phenomenon known as market impact. Traditional robust RL approaches address this model misspecification by optimizing the worst-case performance over a set of uncertainties. We develop a novel class of elliptic uncertainty sets, enabling efficient and tractable robust policy evaluation.
arXiv Detail & Related papers (2025-10-22T18:22:25Z)
- Your AI, Not Your View: The Bias of LLMs in Investment Analysis [55.328782443604986]
Large Language Models (LLMs) face frequent knowledge conflicts due to discrepancies between pre-trained parametric knowledge and real-time market data. This paper offers the first quantitative analysis of confirmation bias in LLM-based investment analysis. We observe a consistent preference for large-cap stocks and contrarian strategies across most models.
arXiv Detail & Related papers (2025-07-28T16:09:38Z)
- Multi-Channel Graph Neural Network for Financial Risk Prediction of NEEQ Enterprises [0.0]
We propose a multi-channel deep learning framework that integrates structured financial indicators, textual disclosures, and enterprise relationship data for comprehensive financial risk prediction. We show that our model significantly outperforms traditional machine learning methods and single-modality baselines in terms of AUC, Precision, Recall, and F1 Score. This work provides theoretical and practical insights into risk modeling for SMEs and offers a data-driven tool to support financial regulators and investors.
arXiv Detail & Related papers (2025-07-17T04:57:51Z)
- FinTSB: A Comprehensive and Practical Benchmark for Financial Time Series Forecasting [58.70072722290475]
Financial time series (FinTS) record the behavior of human-brain-augmented decision-making. FinTSB is a comprehensive and practical benchmark for financial time series forecasting.
arXiv Detail & Related papers (2025-02-26T05:19:16Z)
- STORM: A Spatio-Temporal Factor Model Based on Dual Vector Quantized Variational Autoencoders for Financial Trading [55.02735046724146]
In financial trading, factor models are widely used to price assets and capture excess returns from mispricing. We propose a Spatio-Temporal factOR Model based on dual vector quantized variational autoencoders, named STORM. STORM extracts features of stocks from temporal and spatial perspectives, then fuses and aligns these features at the fine-grained and semantic level, and represents the factors as multi-dimensional embeddings.
arXiv Detail & Related papers (2024-12-12T17:15:49Z)
- Harnessing Earnings Reports for Stock Predictions: A QLoRA-Enhanced LLM Approach [6.112119533910774]
This paper introduces an advanced approach by employing Large Language Models (LLMs) instruction fine-tuned with a novel combination of instruction-based techniques and quantized low-rank adaptation (QLoRA) compression.
Our methodology integrates 'base factors', such as financial metric growth and earnings transcripts, with 'external factors', including recent market indices performances and analyst grades, to create a rich, supervised dataset.
This study not only demonstrates the power of integrating cutting-edge AI with fine-tuned financial data but also paves the way for future research in enhancing AI-driven financial analysis tools.
arXiv Detail & Related papers (2024-08-13T04:53:31Z)
- Are LLMs Rational Investors? A Study on Detecting and Reducing the Financial Bias in LLMs [44.53203911878139]
Large Language Models (LLMs) are increasingly adopted in financial analysis for interpreting complex market data and trends.
Financial Bias Indicators (FBI) is a framework with components like Bias Unveiler, Bias Detective, Bias Tracker, and Bias Antidote.
We evaluate 23 leading LLMs and propose a de-biasing method based on financial causal knowledge.
arXiv Detail & Related papers (2024-02-20T04:26:08Z)
- Financial Distress Prediction For Small And Medium Enterprises Using Machine Learning Techniques [5.301137510638804]
Financial Distress Prediction plays a crucial role in the economy by accurately forecasting the number and probability of failing structures.
However, predicting financial distress for Small and Medium Enterprises is challenging due to their inherent ambiguity.
We propose a corporate FCP model that better aligns with industry practice and incorporates the gathering of thin-head component analysis of financial data, corporate governance qualities, and market exchange data with a Relevant Vector Machine.
arXiv Detail & Related papers (2023-02-23T15:58:30Z)
- Factor Investing with a Deep Multi-Factor Model [123.52358449455231]
We develop a novel deep multi-factor model that adopts industry neutralization and market neutralization modules with clear financial insights.
Tests on real-world stock market data demonstrate the effectiveness of our deep multi-factor model.
arXiv Detail & Related papers (2022-10-22T14:47:11Z)
- Gaussian process imputation of multiple financial series [71.08576457371433]
Multiple time series such as financial indicators, stock prices and exchange rates are strongly coupled due to their dependence on the latent state of the market.
We focus on learning the relationships among financial time series by modelling them through a multi-output Gaussian process.
arXiv Detail & Related papers (2020-02-11T19:18:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.