Hallucination-minimized Data-to-answer Framework for Financial
Decision-makers
- URL: http://arxiv.org/abs/2311.07592v1
- Date: Thu, 9 Nov 2023 22:53:52 GMT
- Title: Hallucination-minimized Data-to-answer Framework for Financial
Decision-makers
- Authors: Sohini Roychowdhury, Andres Alvarez, Brian Moore, Marko Krema, Maria
Paz Gelpi, Federico Martin Rodriguez, Angel Rodriguez, Jose Ramon Cabrejas,
Pablo Martinez Serrano, Punit Agrawal, Arijit Mukherjee
- Abstract summary: Large Language Models (LLMs) have been applied to build several automation and personalized question-answering prototypes so far.
We present a novel Langchain-based framework that transforms data tables into hierarchical textual data chunks to enable a wide variety of actionable question answering.
- Score: 1.3781777926017094
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large Language Models (LLMs) have been applied to build several automation
and personalized question-answering prototypes so far. However, scaling such
prototypes to robust products with minimized hallucinations or fake responses
still remains an open challenge, especially in niche data-table heavy domains
such as financial decision making. In this work, we present a novel
Langchain-based framework that transforms data tables into hierarchical textual
data chunks to enable a wide variety of actionable question answering. First,
the user-queries are classified by intention followed by automated retrieval of
the most relevant data chunks to generate customized LLM prompts per query.
Next, the custom prompts and their responses undergo multi-metric scoring to
assess for hallucinations and response confidence. The proposed system is
optimized with user-query intention classification, advanced prompting, and
data scaling capabilities, and it achieves confidence scores of over 90% for
responses to a variety of user-query types, including {What, Where, Why, How,
predict, trend, anomalies, exceptions}, that are crucial for financial
decision-making applications. The proposed data-to-answers framework can be
extended to other analytical domains such as sales and payroll to ensure
optimal hallucination-control guardrails.
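The pipeline described in the abstract (table rows turned into hierarchical text chunks, query intent classification, chunk retrieval, prompt construction, response scoring) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' Langchain implementation: the sample rows, the keyword-based intent classifier, the token-overlap retriever, and the single numeric grounding check (standing in for the paper's multi-metric scoring) are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical table rows; the paper's actual data is not given in the abstract.
ROWS = [
    {"region": "EMEA", "quarter": "Q1", "revenue": 120.0},
    {"region": "EMEA", "quarter": "Q2", "revenue": 90.0},
    {"region": "APAC", "quarter": "Q1", "revenue": 200.0},
]

def table_to_chunks(rows):
    """Build two levels of text: one line per row, then one summary per region."""
    chunks, totals = [], {}
    for r in rows:
        totals[r["region"]] = totals.get(r["region"], 0.0) + r["revenue"]
        chunks.append(f"{r['region']} {r['quarter']} revenue was {r['revenue']:.1f}")
    for region, total in totals.items():
        chunks.append(f"{region} total revenue was {total:.1f}")
    return chunks

# Assumed keyword lists; a real system would likely use a trained classifier.
INTENT_KEYWORDS = {"trend": ("trend",), "why": ("why",), "what": ("what",)}

def classify_intent(query):
    """Return the first intent whose keyword occurs in the query."""
    q = query.lower()
    for intent, keys in INTENT_KEYWORDS.items():
        if any(k in q for k in keys):
            return intent
    return "what"

def tokens(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [t.strip(".,?").lower() for t in text.split()]

def retrieve(query, chunks, k=2):
    """Rank chunks by token overlap with the query and keep the top k."""
    q = Counter(tokens(query))
    ranked = sorted(chunks, key=lambda c: -sum((Counter(tokens(c)) & q).values()))
    return ranked[:k]

def build_prompt(query, chunks):
    """Assemble a per-query prompt from the intent and the retrieved chunks."""
    context = "\n".join(retrieve(query, chunks))
    return (f"Intent: {classify_intent(query)}\nContext:\n{context}\n"
            f"Question: {query}\nAnswer using only the context.")

def grounding_score(response, context):
    """Fraction of numbers in the response that also appear in the context."""
    nums = [t for t in tokens(response) if t.replace(".", "", 1).isdigit()]
    if not nums:
        return 1.0
    ctx = set(tokens(context))
    return sum(n in ctx for n in nums) / len(nums)
```

A response whose figures do not appear in the retrieved context gets a low grounding score, which is one simple way a hallucinated number could be flagged before the answer reaches the user.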
Related papers
- InfoQuest: Evaluating Multi-Turn Dialogue Agents for Open-Ended Conversations with Hidden Context [4.262907114077643]
  We introduce InfoQuest, a benchmark designed to evaluate how dialogue agents handle hidden context in open-ended user requests.
  Our evaluation reveals that while proprietary models generally perform better, all current assistants struggle with effectively gathering critical information.
  (2025-02-17T19:01:10Z)
- Fast or Better? Balancing Accuracy and Cost in Retrieval-Augmented Generation with Flexible User Control [52.405085773954596]
  Retrieval-Augmented Generation (RAG) has emerged as a powerful approach to mitigate large language model hallucinations.
  Existing RAG frameworks often apply retrieval indiscriminately, leading to inefficiencies such as over-retrieving.
  We introduce a novel user-controllable RAG framework that enables dynamic adjustment of the accuracy-cost trade-off.
  (2025-02-17T18:56:20Z)
- Unsupervised Query Routing for Retrieval Augmented Generation [64.47987041500966]
  We introduce a novel unsupervised method that constructs the "upper-bound" response to evaluate the quality of retrieval-augmented responses.
  This evaluation enables the decision of the most suitable search engine for a given query.
  By eliminating manual annotations, our approach can automatically process large-scale real user queries and create training data.
  (2025-01-14T02:27:06Z)
- MAG-V: A Multi-Agent Framework for Synthetic Data Generation and Verification [5.666070277424383]
  MAG-V is a framework to generate a dataset of questions that mimic customer queries.
  Our synthetic data can improve agent performance on actual customer queries.
  (2024-11-28T19:36:11Z)
- A Flexible Large Language Models Guardrail Development Methodology Applied to Off-Topic Prompt Detection [0.0]
  Large Language Models are prone to off-topic misuse, where users may prompt these models to perform tasks beyond their intended scope.
  Current guardrails suffer from high false-positive rates, limited adaptability, and the impracticality of requiring real-world data that is not available in pre-production.
  This paper introduces a flexible, data-free guardrail development methodology that addresses these challenges.
  (2024-11-20T00:31:23Z)
- IDEAL: Leveraging Infinite and Dynamic Characterizations of Large Language Models for Query-focused Summarization [59.06663981902496]
  Query-focused summarization (QFS) aims to produce summaries that answer particular questions of interest, enabling greater user control and personalization.
  We investigate two indispensable characteristics that LLM-based QFS models should harness: Lengthy Document Summarization and Efficiently Fine-grained Query-LLM Alignment.
  These innovations pave the way for broader application and accessibility in the field of QFS technology.
  (2024-07-15T07:14:56Z)
- Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs [49.57641083688934]
  We introduce a novel approach to anomaly detection in financial data using Large Language Models (LLMs) embeddings.
  Our experiments demonstrate that LLMs contribute valuable information to anomaly detection as our models outperform the baselines.
  (2024-06-05T20:19:09Z)
- ERATTA: Extreme RAG for Table To Answers with Large Language Models [1.3318204310917532]
  Large language models (LLMs) with retrieval-augmented generation (RAG) have been the optimal choice for scalable generative AI solutions.
  We propose a unique LLM-based system where multiple LLMs can be invoked to enable data authentication, user-query routing, data-retrieval and custom prompting for question-answering capabilities from Enterprise-data tables.
  Our proposed system and scoring metrics achieve >90% confidence scores across hundreds of user queries in the sustainability, financial health and social media domains.
  (2024-05-07T02:49:59Z)
- CLARINET: Augmenting Language Models to Ask Clarification Questions for Retrieval [52.134133938779776]
  We present CLARINET, a system that asks informative clarification questions by choosing questions whose answers would maximize certainty in the correct candidate.
  Our approach works by augmenting a large language model (LLM) to condition on a retrieval distribution, finetuning end-to-end to generate the question that would have maximized the rank of the true candidate at each turn.
  (2024-04-28T18:21:31Z)
- Persona-DB: Efficient Large Language Model Personalization for Response Prediction with Collaborative Data Refinement [79.2400720115588]
  We introduce Persona-DB, a simple yet effective framework consisting of a hierarchical construction process to improve generalization across task contexts.
  In the evaluation of response prediction, Persona-DB demonstrates superior context efficiency in maintaining accuracy with a significantly reduced retrieval size.
  Our experiments also indicate a marked improvement of over 10% under cold-start scenarios, when users have extremely sparse data.
  (2024-02-16T20:20:43Z)
- Conversational Factor Information Retrieval Model (ConFIRM) [2.855224352436985]
  Conversational Factor Information Retrieval Method (ConFIRM) is a novel approach to fine-tuning large language models (LLMs) for domain-specific retrieval tasks.
  We demonstrate ConFIRM's effectiveness through a case study in the finance sector, fine-tuning a Llama-2-7b model using personality-aligned data.
  The resulting model achieved 91% accuracy in classifying financial queries, with an average inference time of 0.61 seconds on an NVIDIA A100 GPU.
  (2023-10-06T12:31:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.