Related papers: LLM Augmentations to support Analytical Reasoning over Multiple Documents

LLM Augmentations to support Analytical Reasoning over Multiple Documents

URL: http://arxiv.org/abs/2411.16116v1
Date: Mon, 25 Nov 2024 06:00:42 GMT
Title: LLM Augmentations to support Analytical Reasoning over Multiple Documents
Authors: Raquib Bin Yousuf, Nicholas Defelice, Mandar Sharma, Shengzhe Xu, Naren Ramakrishnan,
Abstract summary: We investigate the application of large language models (LLMs) to enhance in-depth analytical reasoning within the context of intelligence analysis. We develop an architecture to augment the capabilities of an LLM with a memory module called dynamic evidence trees (DETs) to develop and track multiple investigation threads.
Score: 8.99490805653946
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Building on their demonstrated ability to perform a variety of tasks, we investigate the application of large language models (LLMs) to enhance in-depth analytical reasoning within the context of intelligence analysis. Intelligence analysts typically work with massive dossiers to draw connections between seemingly unrelated entities, and uncover adversaries' plans and motives. We explore if and how LLMs can be helpful to analysts for this task and develop an architecture to augment the capabilities of an LLM with a memory module called dynamic evidence trees (DETs) to develop and track multiple investigation threads. Through extensive experiments on multiple datasets, we highlight how LLMs, as-is, are still inadequate to support intelligence analysts and offer recommendations to improve LLMs for such intricate reasoning applications.

Related papers

Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers [74.17516978246152]
Large language models (LLMs) have been widely integrated into information retrieval to advance traditional techniques.<n>We propose EXSEARCH, an agentic search framework, where the LLM learns to retrieve useful information as the reasoning unfolds.<n>Experiments on four knowledge-intensive benchmarks show that EXSEARCH substantially outperforms baselines.
arXiv Detail & Related papers (2025-05-26T15:27:55Z)
IDA-Bench: Evaluating LLMs on Interactive Guided Data Analysis [60.32962597618861]
IDA-Bench is a novel benchmark evaluating large language models in multi-round interactive scenarios.<n>Agent performance is judged by comparing its final numerical output to the human-derived baseline.<n>Even state-of-the-art coding agents (like Claude-3.7-thinking) succeed on 50% of the tasks, highlighting limitations not evident in single-turn tests.
arXiv Detail & Related papers (2025-05-23T09:37:52Z)
How do Large Language Models Understand Relevance? A Mechanistic Interpretability Perspective [64.00022624183781]
Large language models (LLMs) can assess relevance and support information retrieval (IR) tasks. We investigate how different LLM modules contribute to relevance judgment through the lens of mechanistic interpretability.
arXiv Detail & Related papers (2025-04-10T16:14:55Z)
Agent-Enhanced Large Language Models for Researching Political Institutions [0.0]
This paper demonstrates how Large Language Models (LLMs) can serve as dynamic agents capable of streamlining tasks. Central to this approach is agentic retrieval-augmented generation (Agentic RAG) To demonstrate the potential of this approach, we introduce CongressRA, an LLM agent designed to support scholars studying the U.S. Congress.
arXiv Detail & Related papers (2025-03-14T22:04:40Z)
A Survey on Large Language Models with some Insights on their Capabilities and Limitations [0.3222802562733786]
Large Language Models (LLMs) exhibit remarkable performance across various language-related tasks. LLMs have demonstrated emergent abilities extending beyond their core functions. This paper explores the foundational components, scaling mechanisms, and architectural strategies that drive these capabilities.
arXiv Detail & Related papers (2025-01-03T21:04:49Z)
The LLM Effect: Are Humans Truly Using LLMs, or Are They Being Influenced By Them Instead? [60.01746782465275]
Large Language Models (LLMs) have shown capabilities close to human performance in various analytical tasks. This paper investigates the efficiency and accuracy of LLMs in specialized tasks through a structured user study focusing on Human-LLM partnership.
arXiv Detail & Related papers (2024-10-07T02:30:18Z)
Enhancing Temporal Understanding in LLMs for Semi-structured Tables [50.59009084277447]
We conduct a comprehensive analysis of temporal datasets to pinpoint the specific limitations of large language models (LLMs) Our investigation leads to enhancements in TempTabQA, a dataset specifically designed for temporal temporal question answering. We introduce a novel approach, C.L.E.A.R. to strengthen LLM capabilities in this domain.
arXiv Detail & Related papers (2024-07-22T20:13:10Z)
Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning [53.6472920229013]
Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks. LLMs are prone to produce errors, hallucinations and inconsistent statements when performing multi-step reasoning. We introduce Q*, a framework for guiding LLMs decoding process with deliberative planning.
arXiv Detail & Related papers (2024-06-20T13:08:09Z)
Towards Modeling Learner Performance with Large Language Models [7.002923425715133]
This paper investigates whether the pattern recognition and sequence modeling capabilities of LLMs can be extended to the domain of knowledge tracing. We compare two approaches to using LLMs for this task, zero-shot prompting and model fine-tuning, with existing, non-LLM approaches to knowledge tracing. While LLM-based approaches do not achieve state-of-the-art performance, fine-tuned LLMs surpass the performance of naive baseline models and perform on par with standard Bayesian Knowledge Tracing approaches.
arXiv Detail & Related papers (2024-02-29T14:06:34Z)
LLM Inference Unveiled: Survey and Roofline Model Insights [62.92811060490876]
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges. Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on roofline model. This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems.
arXiv Detail & Related papers (2024-02-26T07:33:05Z)
When Large Language Models Meet Vector Databases: A Survey [0.0]
VecDBs offer efficient means to store, retrieve, and manage the high-dimensional vector representations intrinsic to LLM operations. VecDBs emerge as a compelling solution to these issues by offering an efficient means to store, retrieve, and manage the high-dimensional vector representations intrinsic to LLM operations. This survey aims to catalyze further research into optimizing the confluence of LLMs and VecDBs for advanced data handling and knowledge extraction capabilities.
arXiv Detail & Related papers (2024-01-30T23:35:28Z)
Conversational AI Threads for Visualizing Multidimensional Datasets [10.533569558002798]
Generative Large Language Models (LLMs) show potential in data analysis, yet their full capabilities remain uncharted. Our work explores the capabilities of LLMs for creating and refining visualizations via conversational interfaces.
arXiv Detail & Related papers (2023-11-09T18:47:46Z)
On the Performance of Multimodal Language Models [4.677125897916577]
This study conducts a comparative analysis of different multimodal instruction tuning approaches. We reveal key insights for guiding architectural choices when incorporating multimodal capabilities into large language models.
arXiv Detail & Related papers (2023-10-04T23:33:36Z)
Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation [109.8527403904657]
We show that large language models (LLMs) possess unwavering confidence in their knowledge and cannot handle the conflict between internal and external knowledge well. Retrieval augmentation proves to be an effective approach in enhancing LLMs' awareness of knowledge boundaries. We propose a simple method to dynamically utilize supporting documents with our judgement strategy.
arXiv Detail & Related papers (2023-07-20T16:46:10Z)
Sentiment Analysis in the Era of Large Language Models: A Reality Check [69.97942065617664]
This paper investigates the capabilities of large language models (LLMs) in performing various sentiment analysis tasks. We evaluate performance across 13 tasks on 26 datasets and compare the results against small language models (SLMs) trained on domain-specific datasets.
arXiv Detail & Related papers (2023-05-24T10:45:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.