TRACE: TRansformer-based Attribution using Contrastive Embeddings in LLMs
- URL: http://arxiv.org/abs/2407.04981v1
- Date: Sat, 6 Jul 2024 07:19:30 GMT
- Title: TRACE: TRansformer-based Attribution using Contrastive Embeddings in LLMs
- Authors: Cheng Wang, Xinyang Lu, See-Kiong Ng, Bryan Kian Hsiang Low
- Abstract summary: We propose a novel TRansformer-based Attribution framework using Contrastive Embeddings called TRACE.
We show that TRACE significantly improves the ability to attribute sources accurately, making it a valuable tool for enhancing the reliability and trustworthiness of large language models.
- Score: 50.259001311894295
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid evolution of large language models (LLMs) represents a substantial leap forward in natural language understanding and generation. However, alongside these advancements come significant challenges related to the accountability and transparency of LLM responses. Reliable source attribution is essential to adhering to stringent legal and regulatory standards, including those set forth by the General Data Protection Regulation. Despite the well-established methods in source attribution within the computer vision domain, the application of robust attribution frameworks to natural language processing remains underexplored. To bridge this gap, we propose a novel and versatile TRansformer-based Attribution framework using Contrastive Embeddings called TRACE that, in particular, exploits contrastive learning for source attribution. We perform an extensive empirical evaluation to demonstrate the performance and efficiency of TRACE in various settings and show that TRACE significantly improves the ability to attribute sources accurately, making it a valuable tool for enhancing the reliability and trustworthiness of LLMs.
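The abstract describes using contrastive learning so that responses from the same source cluster together in embedding space. As a rough illustration only (the function names, the NT-Xent loss choice, and the nearest-centroid attribution step are our assumptions, not TRACE's actual design), a contrastive-embedding attribution pipeline can be sketched as:

```python
# Hypothetical sketch of contrastive-embedding source attribution.
# Texts sharing a source form positive pairs; an NT-Xent-style loss pulls
# their embeddings together, and a query is attributed to the nearest
# source centroid in the learned space.
import numpy as np

def nt_xent_loss(z, labels, temperature=0.1):
    """InfoNCE/NT-Xent loss over embeddings z of shape (n, d)."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # L2-normalize rows
    sim = z @ z.T / temperature                        # cosine similarities
    np.fill_diagonal(sim, -np.inf)                     # exclude self-pairs
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    pos = labels[:, None] == labels[None, :]           # same-source mask
    np.fill_diagonal(pos, False)
    return -log_prob[pos].mean()                       # mean over positives

def attribute(query, centroids):
    """Attribute a query embedding to the nearest source centroid."""
    q = query / np.linalg.norm(query)
    c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    return int(np.argmax(c @ q))                       # predicted source index
```

In a sketch like this, training minimizes the loss over labeled (text, source) pairs; at attribution time, each candidate source is summarized by the centroid of its training embeddings.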
Related papers
- Out-Of-Context Prompting Boosts Fairness and Robustness in Large Language Model Predictions [17.758735680493917]
We develop test-time strategies to improve Frontier Large Language Models' trustworthiness.
We leverage causality as a tool to formally encode two aspects of trustworthiness in LLMs: fairness and robustness.
We show that out-of-context prompting consistently improves the fairness and robustness of frontier LLMs.
arXiv Detail & Related papers (2024-06-11T20:05:15Z)
- From Robustness to Improved Generalization and Calibration in Pre-trained Language Models [0.0]
We investigate the role of representation smoothness, achieved via Jacobian and Hessian regularization, in enhancing pre-trained language models (PLMs) performance.
We introduce a novel two-phase regularization approach, JacHess, which minimizes the norms of the Jacobian and Hessian matrices within PLM intermediate representations.
Our evaluation using the GLUE benchmark demonstrates that JacHess significantly improves in-domain generalization and calibration in PLMs.
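The JacHess idea above penalizes the norms of the Jacobian (and Hessian) of intermediate representations. A minimal sketch of the Jacobian part, using a random finite-difference estimator rather than the paper's exact autodiff-based regularizer (the estimator and all names here are our illustration, not the authors' code):

```python
# Hypothetical sketch of representation-smoothness regularization:
# estimate ||J_f(x)||_F^2, the squared Frobenius norm of the Jacobian of a
# representation f at input x, via random directional finite differences.
# For v ~ N(0, I):  E_v ||(f(x + eps*v) - f(x)) / eps||^2  ≈  ||J||_F^2.
import numpy as np

def jacobian_norm_penalty(f, x, n_probes=8, eps=1e-3, rng=None):
    """Monte-Carlo estimate of the squared Jacobian Frobenius norm of f at x."""
    if rng is None:
        rng = np.random.default_rng(0)
    fx = f(x)
    est = 0.0
    for _ in range(n_probes):
        v = rng.standard_normal(x.shape)               # random probe direction
        est += np.sum(((f(x + eps * v) - fx) / eps) ** 2)
    return est / n_probes
```

Adding such a penalty to the training loss discourages sharp changes in the representation around training inputs, which is the smoothness property the entry credits for better generalization and calibration.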
arXiv Detail & Related papers (2024-03-31T18:08:37Z)
- Can LLMs Produce Faithful Explanations For Fact-checking? Towards Faithful Explainable Fact-Checking via Multi-Agent Debate [75.10515686215177]
Large Language Models (LLMs) excel in text generation, but their capability for producing faithful explanations in fact-checking remains underexamined.
We propose the Multi-Agent Debate Refinement (MADR) framework, leveraging multiple LLMs as agents with diverse roles.
MADR ensures that the final explanation undergoes rigorous validation, significantly reducing the likelihood of unfaithful elements and aligning closely with the provided evidence.
arXiv Detail & Related papers (2024-02-12T04:32:33Z)
- Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs as it: 1) generalizes out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z)
- EpiK-Eval: Evaluation for Language Models as Epistemic Models [16.485951373967502]
We introduce EpiK-Eval, a novel question-answering benchmark tailored to evaluate LLMs' proficiency in formulating a coherent and consistent knowledge representation from segmented narratives.
We argue that these shortcomings stem from the intrinsic nature of prevailing training objectives.
The findings from this study offer insights for developing more robust and reliable LLMs.
arXiv Detail & Related papers (2023-10-23T21:15:54Z)
- Assessing the Reliability of Large Language Model Knowledge [78.38870272050106]
Large language models (LLMs) have been treated as knowledge bases due to their strong performance in knowledge probing tasks.
How do we evaluate the capabilities of LLMs to consistently produce factually correct answers?
We propose MOdel kNowledge relIabiliTy scORe (MONITOR), a novel metric designed to directly measure LLMs' factual reliability.
arXiv Detail & Related papers (2023-10-15T12:40:30Z)
- Improving Open Information Extraction with Large Language Models: A Study on Demonstration Uncertainty [52.72790059506241]
The Open Information Extraction (OIE) task aims to extract structured facts from unstructured text.
Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks.
arXiv Detail & Related papers (2023-09-07T01:35:24Z)
- User-Controlled Knowledge Fusion in Large Language Models: Balancing Creativity and Hallucination [5.046007553593371]
Large Language Models (LLMs) generate diverse, relevant, and creative responses.
Striking a balance between the LLM's imaginative capabilities and its adherence to factual information is a key challenge.
This paper presents an innovative user-controllable mechanism that modulates the balance between an LLM's imaginative capabilities and its adherence to factual information.
arXiv Detail & Related papers (2023-07-30T06:06:35Z)
- Balancing Discriminability and Transferability for Source-Free Domain Adaptation [55.143687986324935]
Conventional domain adaptation (DA) techniques aim to improve domain transferability by learning domain-invariant representations.
The requirement of simultaneous access to labeled source data and unlabeled target data renders them unsuitable for the challenging source-free DA setting.
We derive novel insights to show that a mixup between original and corresponding translated generic samples enhances the discriminability-transferability trade-off.
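The mixup described above interpolates between an original sample and its translated generic counterpart. As a bare sketch of the interpolation step only (the function and variable names are hypothetical, and the paper's actual translation pipeline is not shown):

```python
# Hypothetical sketch of mixup between an original sample and its
# "translated" generic counterpart:
#   x_mix = lam * x_orig + (1 - lam) * x_translated,  lam ~ Beta(alpha, alpha)
import numpy as np

def mixup_pair(x_orig, x_translated, alpha=0.4, rng=None):
    """Return a convex combination of the two samples and the mixing weight."""
    if rng is None:
        rng = np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)
    return lam * x_orig + (1.0 - lam) * x_translated, lam
```

The convex combination keeps the mixed sample between its two endpoints, which is what lets it trade off the discriminative cues of the original against the transferable cues of the generic translation.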
arXiv Detail & Related papers (2022-06-16T09:06:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.