Related papers: MASFIN: A Multi-Agent System for Decomposed Financial Reasoning and Forecasting

URL: http://arxiv.org/abs/2512.21878v1
Date: Fri, 26 Dec 2025 06:01:55 GMT
Title: MASFIN: A Multi-Agent System for Decomposed Financial Reasoning and Forecasting
Authors: Marc S. Montalvo, Hamed Yaghoobian
Abstract summary: We introduce MASFIN, a modular multi-agent framework that integrates structured financial metrics and unstructured news. In an eight-week evaluation, MASFIN delivered a 7.33% cumulative return, outperforming the S&P 500, NASDAQ-100, and Dow Jones benchmarks in six of eight weeks. These findings demonstrate the promise of bias-aware, generative AI frameworks for financial forecasting.
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advances in large language models (LLMs) are transforming data-intensive domains, with finance representing a high-stakes environment where transparent and reproducible analysis of heterogeneous signals is essential. Traditional quantitative methods remain vulnerable to survivorship bias, while many AI-driven approaches struggle with signal integration, reproducibility, and computational efficiency. We introduce MASFIN, a modular multi-agent framework that integrates LLMs with structured financial metrics and unstructured news, while embedding explicit bias-mitigation protocols. The system leverages GPT-4.1-nano for reproducibility and cost-efficient inference and generates weekly portfolios of 15-30 equities with allocation weights optimized for short-term performance. In an eight-week evaluation, MASFIN delivered a 7.33% cumulative return, outperforming the S&P 500, NASDAQ-100, and Dow Jones benchmarks in six of eight weeks, albeit with higher volatility. These findings demonstrate the promise of bias-aware, generative AI frameworks for financial forecasting and highlight opportunities for modular multi-agent design to advance practical, transparent, and reproducible approaches in quantitative finance.

Related papers

$φ$-DPO: Fairness Direct Preference Optimization Approach to Continual Learning in Large Multimodal Models [58.217707070069885]
This paper presents a novel Fairness Direct Preference Optimization (FaiDPO or $φ$-DPO) framework for continual learning in LMMs. We first propose a new continual learning paradigm based on Direct Preference Optimization (DPO) to mitigate catastrophic forgetting by aligning learning with pairwise preference signals. Extensive experiments and ablation studies show the proposed $φ$-DPO achieves state-of-the-art performance across multiple benchmarks.
arXiv Detail & Related papers (2026-02-26T04:14:33Z)
UniFinEval: Towards Unified Evaluation of Financial Multimodal Models across Text, Images and Videos [22.530796761115766]
We propose UniFinEval, the first unified multimodal benchmark for high-information-density financial environments. UniFinEval systematically constructs five core financial scenarios grounded in real-world financial systems. Gemini-3-pro-preview achieves the best overall performance, yet still exhibits a substantial gap compared to financial experts.
arXiv Detail & Related papers (2026-01-09T10:15:32Z)
Uni-FinLLM: A Unified Multimodal Large Language Model with Modular Task Heads for Micro-Level Stock Prediction and Macro-Level Systemic Risk Assessment [6.015507338546882]
Financial institutions and regulators require systems that integrate heterogeneous data to assess risks from stock fluctuations to systemic vulnerabilities. We propose Uni-FinLLM, a unified multimodal large language model that uses a shared Transformer backbone and modular task heads to jointly process financial text, numerical time series, fundamentals, and visual data.
arXiv Detail & Related papers (2026-01-06T03:22:51Z)
LAET: A Layer-wise Adaptive Ensemble Tuning Framework for Pretrained Language Models [7.216206616406649]
Large language models (LLMs) like BloombergGPT and FinMA have set new benchmarks across various financial NLP tasks. We propose Layer-wise Adaptive Ensemble Tuning (LAET), a novel strategy that selectively fine-tunes the most effective layers of pre-trained LLMs. Our approach shows strong results in financial NLP tasks, outperforming existing benchmarks and state-of-the-art LLMs.
arXiv Detail & Related papers (2025-11-14T13:57:46Z)
Trade in Minutes! Rationality-Driven Agentic System for Quantitative Financial Trading [57.28635022507172]
TiMi is a rationality-driven multi-agent system that architecturally decouples strategy development from minute-level deployment. We propose a two-tier analytical paradigm from macro patterns to micro customization, layered programming design for trading bot implementation, and closed-loop optimization driven by mathematical reflection.
arXiv Detail & Related papers (2025-10-06T13:08:55Z)
Enhancing Financial RAG with Agentic AI and Multi-HyDE: A Novel Approach to Knowledge Retrieval and Hallucination Reduction [0.5814806132299305]
We introduce a framework for financial Retrieval Augmented Generation (RAG). RAG generates multiple, nonequivalent queries to boost the effectiveness and coverage of retrieval from large, structured financial corpora. Our pipeline is optimized for token efficiency and multi-step financial reasoning.
arXiv Detail & Related papers (2025-09-19T19:24:30Z)
Uncertainty-Aware Collaborative System of Large and Small Models for Multimodal Sentiment Analysis [17.98292973608615]
We propose a novel Uncertainty-Aware Collaborative System (U-ACS) that orchestrates a powerful MLLM and a lightweight baseline model for multimodal sentiment analysis. Our proposed method achieves state-of-the-art performance, while requiring only a fraction of the computational resources compared to using a standalone MLLM.
arXiv Detail & Related papers (2025-08-27T16:01:58Z)
FinDPO: Financial Sentiment Analysis for Algorithmic Trading through Preference Optimization of LLMs [2.06242362470764]
We introduce FinDPO, the first finance-specific sentiment analysis framework based on post-training human preference alignment. The proposed FinDPO achieves state-of-the-art performance on standard sentiment classification benchmarks. We show that FinDPO is the first sentiment-based approach to maintain substantial positive returns of 67% annually and strong risk-adjusted performance.
arXiv Detail & Related papers (2025-07-24T13:57:05Z)
Cross-Modal Temporal Fusion for Financial Market Forecasting [3.0756278306759635]
We introduce a transformer-based deep learning framework, Cross-Modal Temporal Fusion (CMTF), that fuses structured and unstructured financial data for improved market prediction. Experimental results using FTSE 100 stock data demonstrate that CMTF achieves superior performance in price direction classification compared to classical and deep learning baselines.
arXiv Detail & Related papers (2025-04-18T07:20:18Z)
FinTSB: A Comprehensive and Practical Benchmark for Financial Time Series Forecasting [58.70072722290475]
Financial time series (FinTS) record the behavior of human-brain-augmented decision-making. FinTSB is a comprehensive and practical benchmark for financial time series forecasting.
arXiv Detail & Related papers (2025-02-26T05:19:16Z)
BreakGPT: Leveraging Large Language Models for Predicting Asset Price Surges [55.2480439325792]
This paper introduces BreakGPT, a novel large language model (LLM) architecture adapted specifically for time series forecasting and the prediction of sharp upward movements in asset prices. We showcase BreakGPT as a promising solution for financial forecasting with minimal training and as a strong competitor for capturing both local and global temporal dependencies.
arXiv Detail & Related papers (2024-11-09T05:40:32Z)
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications [88.96861155804935]
We introduce Open-FinLLMs, the first open-source multimodal financial LLMs. FinLLaMA is pre-trained on a comprehensive 52-billion-token corpus; FinLLaMA-Instruct, fine-tuned with 573K financial instructions; and FinLLaVA, enhanced with 1.43M multimodal tuning pairs. We evaluate Open-FinLLMs across 14 financial tasks, 30 datasets, and 4 multimodal tasks in zero-shot, few-shot, and supervised fine-tuning settings.
arXiv Detail & Related papers (2024-08-20T16:15:28Z)
FinBen: A Holistic Financial Benchmark for Large Language Models [75.09474986283394]
FinBen is the first extensive open-source evaluation benchmark, including 36 datasets spanning 24 financial tasks. FinBen offers several key innovations: a broader range of tasks and datasets, the first evaluation of stock trading, novel agent and Retrieval-Augmented Generation (RAG) evaluation, and three novel open-source evaluation datasets for text summarization, question answering, and stock trading.
arXiv Detail & Related papers (2024-02-20T02:16:16Z)
Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models [48.87381259980254]
We document the capability of large language models (LLMs) like ChatGPT to predict stock market reactions from news headlines without direct financial training. Using post-knowledge-cutoff headlines, GPT-4 captures initial market responses, achieving approximately 90% portfolio-day hit rates for the non-tradable initial reaction.
arXiv Detail & Related papers (2023-04-15T19:22:37Z)
Gaussian process imputation of multiple financial series [71.08576457371433]
Multiple time series such as financial indicators, stock prices and exchange rates are strongly coupled due to their dependence on the latent state of the market. We focus on learning the relationships among financial time series by modelling them through a multi-output Gaussian process.
arXiv Detail & Related papers (2020-02-11T19:18:18Z)