Related papers: Document-Level Tabular Numerical Cross-Checking: A Coarse-to-Fine Approach

Document-Level Tabular Numerical Cross-Checking: A Coarse-to-Fine Approach

URL: http://arxiv.org/abs/2506.13328v1
Date: Mon, 16 Jun 2025 10:17:21 GMT
Title: Document-Level Tabular Numerical Cross-Checking: A Coarse-to-Fine Approach
Authors: Chaoxu Pang, Yixuan Cao, Ganbin Zhou, Hongwei Li, Ping Luo,
Abstract summary: numerical consistency across tables in disclosure documents is critical for ensuring accuracy, maintaining credibility, and reputational and economic risks.<n>This paper introduces CoFiTCheck, a novel framework that addresses these challenges through two sequential stages: embedding-based filtering and discriminative classification.<n>CoFiTCheck significantly outperforms previous methods while maintaining practical efficiency.
Score: 27.581678327762003
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Numerical consistency across tables in disclosure documents is critical for ensuring accuracy, maintaining credibility, and avoiding reputational and economic risks. Automated tabular numerical cross-checking presents two significant challenges: (C1) managing the combinatorial explosion of candidate instances at the document level and (C2) comprehending multi-faceted numerical semantics. Previous research typically depends on heuristic-based filtering or simplified context extraction, often struggling to balance performance and efficiency. Recently, large language models (LLMs) have demonstrated remarkable contextual understanding capabilities that helps address C2 at the instance level, yet they remain hampered by computational inefficiency (C1) and limited domain expertise. This paper introduces CoFiTCheck, a novel LLM-based coarse-to-fine framework that addresses these challenges through two sequential stages: embedding-based filtering and discriminative classification. The embedding-based filtering stage introduces an instructional parallel encoding method to efficiently represent all numerical mentions in a table with LLMs, as well as a decoupled InfoNCE objective to mitigate the isolated mention problem. The discriminative classification stage employs a specialized LLM for fine-grained analysis of the remaining candidate pairs. This stage is further enhanced by our crosstable numerical alignment pretraining paradigm, which leverages weak supervision from cross-table numerical equality relationships to enrich task-specific priors without requiring manual annotation. Comprehensive evaluation across three types of real-world disclosure documents demonstrates that CoFiTCheck significantly outperforms previous methods while maintaining practical efficiency.

Related papers

TalentMine: LLM-Based Extraction and Question-Answering from Multimodal Talent Tables [5.365164774382722]
We introduce TalentMine, a novel framework that transforms extracted tables into semantically enriched representations.<n> TalentMine achieves 100% accuracy in query answering tasks compared to 0% for standard AWS Textract extraction.<n>Our comparative analysis also reveals that the Claude v3 Haiku model achieves optimal performance for talent management applications.
arXiv Detail & Related papers (2025-06-22T22:17:42Z)
SPARE: Single-Pass Annotation with Reference-Guided Evaluation for Automatic Process Supervision and Reward Modelling [70.01883340129204]
Single-Pass.<n>with Reference-Guided Evaluation (SPARE)<n>Novel structured framework that enables single-pass, per-step annotation by aligning each solution step to one or multiple steps in a reference solution, accompanied by explicit reasoning for evaluation.<n>SPARE achieves competitive performance on challenging mathematical datasets while offering 2.6 times greater efficiency, requiring only 38% of the runtime.
arXiv Detail & Related papers (2025-06-18T14:37:59Z)
Label-shift robust federated feature screening for high-dimensional classification [14.252760098879186]
This paper introduces a general framework that unifies existing screening methods and proposes a novel utility, label-shift robust federated feature screening (LR-FFS)<n>Building upon this framework, LR-FFS leverages conditional distribution functions and expectations to address label shift without adding computational burdens.<n> Experimental results and theoretical analyses demonstrate LR-FFS's superior performance across diverse client environments.
arXiv Detail & Related papers (2025-05-31T04:14:49Z)
GUM-SAGE: A Novel Dataset and Approach for Graded Entity Salience Prediction [12.172254885579706]
Graded entity salience assigns entities scores that reflect their relative importance in a text.<n>We introduce a novel approach for graded entity salience that combines the strengths of both approaches.<n>Our approach shows stronger correlation with scores based on human summaries and alignments, and outperforms existing techniques.
arXiv Detail & Related papers (2025-04-15T01:26:14Z)
Can LLMs Generate Tabular Summaries of Science Papers? Rethinking the Evaluation Protocol [83.90769864167301]
Literature review tables are essential for summarizing and comparing collections of scientific papers.<n>We explore the task of generating tables that best fulfill a user's informational needs given a collection of scientific papers.<n>Our contributions focus on three key challenges encountered in real-world use: (i) User prompts are often under-specified; (ii) Retrieved candidate papers frequently contain irrelevant content; and (iii) Task evaluation should move beyond shallow text similarity techniques.
arXiv Detail & Related papers (2025-04-14T14:52:28Z)
Order-agnostic Identifier for Large Language Model-based Generative Recommendation [94.37662915542603]
Items are assigned identifiers for Large Language Models (LLMs) to encode user history and generate the next item.<n>Existing approaches leverage either token-sequence identifiers, representing items as discrete token sequences, or single-token identifiers, using ID or semantic embeddings.<n>We propose SETRec, which leverages semantic tokenizers to obtain order-agnostic multi-dimensional tokens.
arXiv Detail & Related papers (2025-02-15T15:25:38Z)
Self-Calibrated Listwise Reranking with Large Language Models [137.6557607279876]
Large language models (LLMs) have been employed in reranking tasks through a sequence-to-sequence approach. This reranking paradigm requires a sliding window strategy to iteratively handle larger candidate sets. We propose a novel self-calibrated listwise reranking method, which aims to leverage LLMs to produce global relevance scores for ranking.
arXiv Detail & Related papers (2024-11-07T10:31:31Z)
Large Language Models for Anomaly Detection in Computational Workflows: from Supervised Fine-Tuning to In-Context Learning [9.601067780210006]
This paper leverages large language models (LLMs) for workflow anomaly detection by exploiting their ability to learn complex data patterns. Two approaches are investigated: 1) supervised fine-tuning (SFT), where pre-trained LLMs are fine-tuned on labeled data for sentence classification to identify anomalies, and 2) in-context learning (ICL) where prompts containing task descriptions and examples guide LLMs in few-shot anomaly detection without fine-tuning.
arXiv Detail & Related papers (2024-07-24T16:33:04Z)
H-STAR: LLM-driven Hybrid SQL-Text Adaptive Reasoning on Tables [56.73919743039263]
This paper introduces a novel algorithm that integrates both symbolic and semantic (textual) approaches in a two-stage process to address limitations.<n>Our experiments demonstrate that H-STAR significantly outperforms state-of-the-art methods across three question-answering (QA) and fact-verification datasets.
arXiv Detail & Related papers (2024-06-29T21:24:19Z)
Mitigating Boundary Ambiguity and Inherent Bias for Text Classification in the Era of Large Language Models [24.085614720512744]
This study shows that large language models (LLMs) are vulnerable to changes in the number and arrangement of options in text classification. Key bottleneck arises from ambiguous decision boundaries and inherent biases towards specific tokens and positions. Our approach is grounded in the empirical observation that pairwise comparisons can effectively alleviate boundary ambiguity and inherent bias.
arXiv Detail & Related papers (2024-06-11T06:53:19Z)
Learning from Litigation: Graphs and LLMs for Retrieval and Reasoning in eDiscovery [6.037276428689637]
We introduce DISCOvery Graph (DISCOG), an emerging system that integrates knowledge graphs for enhanced document ranking and classification.<n> DISCOG outperforms strong baselines in F1-score, precision, and recall across both balanced and imbalanced datasets.<n>In real-world deployments, it has reduced litigation-related document review costs by approximately 98%, demonstrating significant business impact.
arXiv Detail & Related papers (2024-05-29T15:08:55Z)
Evaluating Generative Language Models in Information Extraction as Subjective Question Correction [49.729908337372436]
We propose a new evaluation method, SQC-Score. Inspired by the principles in subjective question correction, we propose a new evaluation method, SQC-Score. Results on three information extraction tasks show that SQC-Score is more preferred by human annotators than the baseline metrics.
arXiv Detail & Related papers (2024-04-04T15:36:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.