Related papers: Can Structured Data Reduce Epistemic Uncertainty?

Related papers

Detecting Hallucinations in Retrieval-Augmented Generation via Semantic-level Internal Reasoning Graph [12.233570103035312]
We propose a semantic-level internal reasoning graph-based method for detecting faithfulness hallucinations.<n>Our method achieves better overall performance compared to state-of-the-art baselines on RAGTruth and Dolly-15k.
arXiv Detail & Related papers (2026-01-06T14:35:20Z)
Mitigating Prompt-Induced Hallucinations in Large Language Models via Structured Reasoning [11.278137554160383]
We introduce a code module to guide knowledge-graph exploration and incorporate code as part of the chain-of-thought prompt.<n>We empirically evaluate the proposed approach using GPT-4 and LLaMA-3.3 on multiple public datasets.
arXiv Detail & Related papers (2026-01-06T06:02:45Z)
Did Models Sufficient Learn? Attribution-Guided Training via Subset-Selected Counterfactual Augmentation [61.248535801314375]
Subset-Selected Counterfactual Augmentation (SS-CA)<n>We develop Counterfactual LIMA to identify minimal spatial region sets whose removal can selectively alter model predictions.<n>Experiments show that SS-CA improves generalization on in-distribution (ID) test data and achieves superior performance on out-of-distribution (OOD) benchmarks.
arXiv Detail & Related papers (2025-11-15T08:39:22Z)
Studying the Role of Input-Neighbor Overlap in Retrieval-Augmented Language Models Training Efficiency [3.5634988336513587]
We investigate how varying levels of query-context overlap affect model performance during both training and inference.<n>Our experiments reveal that increased overlap initially has minimal effect, but substantially improves test-time perplexity and model accelerates learning above a critical threshold.
arXiv Detail & Related papers (2025-05-20T12:58:07Z)
Exploring Training and Inference Scaling Laws in Generative Retrieval [50.82554729023865]
Generative retrieval reformulates retrieval as an autoregressive generation task, where large language models generate target documents directly from a query.<n>We systematically investigate training and inference scaling laws in generative retrieval, exploring how model size, training data scale, and inference-time compute jointly influence performance.
arXiv Detail & Related papers (2025-03-24T17:59:03Z)
Enhancing Knowledge Graph Construction: Evaluating with Emphasis on Hallucination, Omission, and Graph Similarity Metrics [0.9208007322096533]
This paper builds upon previous work, which evaluated various models using metrics like precision, recall, F1 score, triple matching, and graph matching. We propose an enhanced evaluation framework incorporating BERTScore for graph similarity, setting a practical threshold of 95% for graph matching.
arXiv Detail & Related papers (2025-02-07T11:19:01Z)
Exploring Foundation Models Fine-Tuning for Cytology Classification [0.10555513406636088]
We show how existing foundation models can be applied to cytological classification. We evaluate five foundation models across four cytological classification datasets. Our results demonstrate that fine-tuning the pre-trained backbones with LoRA significantly improves model performance.
arXiv Detail & Related papers (2024-11-22T14:34:04Z)
Pruning Literals for Highly Efficient Explainability at Word Level [13.249876381579158]
Tsetlin Machine(TM) is promising because of its capability of providing word-level explanation using proposition logic. In this paper, we design a post-hoc pruning of clauses that eliminate the randomly placed literals in the clause. Experiments on the publicly available YELP-HAT dataset demonstrate that the proposed pruned TM's attention map aligns more with the human attention map than the vanilla TM's attention map.
arXiv Detail & Related papers (2024-11-07T09:28:38Z)
Surgical Feature-Space Decomposition of LLMs: Why, When and How? [8.826164604720738]
We empirically study the efficacy of weight and feature space decomposition in transformer-based language models. We show that surgical decomposition provides critical insights into the trade-off between compression and language modelling performance. We extend our investigation to the implications of low-rank approximations on model bias.
arXiv Detail & Related papers (2024-05-17T07:34:03Z)
Silkie: Preference Distillation for Large Visual Language Models [56.10697821410489]
This paper explores preference distillation for large vision language models (LVLMs) We first build a vision-language feedback dataset utilizing AI annotation. We adopt GPT-4V to assess the generated outputs regarding helpfulness, visual faithfulness, and ethical considerations. The resulting model Silkie, achieves 6.9% and 9.5% relative improvement on the MME benchmark regarding the perception and cognition capabilities.
arXiv Detail & Related papers (2023-12-17T09:44:27Z)
Learning to Jump: Thinning and Thickening Latent Counts for Generative Modeling [69.60713300418467]
Learning to jump is a general recipe for generative modeling of various types of data. We demonstrate when learning to jump is expected to perform comparably to learning to denoise, and when it is expected to perform better.
arXiv Detail & Related papers (2023-05-28T05:38:28Z)
Fairness-guided Few-shot Prompting for Large Language Models [93.05624064699965]
In-context learning can suffer from high instability due to variations in training examples, example order, and prompt formats. We introduce a metric to evaluate the predictive bias of a fixed prompt against labels or a given attributes. We propose a novel search strategy based on the greedy search to identify the near-optimal prompt for improving the performance of in-context learning.
arXiv Detail & Related papers (2023-03-23T12:28:25Z)
Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning [104.58874584354787]
In recent years, pre-trained large language models (LLMs) have demonstrated remarkable efficiency in achieving an inference-time few-shot learning capability known as in-context learning. This study aims to examine the in-context learning phenomenon through a Bayesian lens, viewing real-world LLMs as latent variable models.
arXiv Detail & Related papers (2023-01-27T18:59:01Z)
Training Strategies for Improved Lip-reading [61.661446956793604]
We investigate the performance of state-of-the-art data augmentation approaches, temporal models and other training strategies. A combination of all the methods results in a classification accuracy of 93.4%, which is an absolute improvement of 4.6% over the current state-of-the-art performance. An error analysis of the various training strategies reveals that the performance improves by increasing the classification accuracy of hard-to-recognise words.
arXiv Detail & Related papers (2022-09-03T09:38:11Z)
Investigating classification learning curves for automatically generated and labelled plant images [0.1338174941551702]
We present a dataset of plant images with representatives of crops and weeds common to the Manitoba prairies at different growth stages. We determine the learning curve for a classification task on this data with the ResNet architecture. We investigate how label noise and the reduction of trainable parameters impacts the learning curve on this dataset.
arXiv Detail & Related papers (2022-05-22T23:28:42Z)
Layer-wise Analysis of a Self-supervised Speech Representation Model [26.727775920272205]
Self-supervised learning approaches have been successful for pre-training speech representation models. Not much has been studied about the type or extent of information encoded in the pre-trained representations themselves.
arXiv Detail & Related papers (2021-07-10T02:13:25Z)
A Multi-Level Attention Model for Evidence-Based Fact Checking [58.95413968110558]
We present a simple model that can be trained on sequence structures. Results on a large-scale dataset for Fact Extraction and VERification show that our model outperforms the graph-based approaches.
arXiv Detail & Related papers (2021-06-02T05:40:12Z)

This list is automatically generated from the titles and abstracts of the papers in this site.