Exploring Performance Contrasts in TableQA: Step-by-Step Reasoning Boosts Bigger Language Models, Limits Smaller Language Models
- URL: http://arxiv.org/abs/2411.16002v1
- Date: Sun, 24 Nov 2024 22:48:44 GMT
- Title: Exploring Performance Contrasts in TableQA: Step-by-Step Reasoning Boosts Bigger Language Models, Limits Smaller Language Models
- Authors: Haoyan Yang, Yixuan Wang, Keyue Tong, Hongjin Zhu, Yuanxin Zhang,
- Abstract summary: This paper proposes a detailed prompting flow, termed Table-Logic, to investigate the performance contrasts between bigger and smaller language models (LMs)
By deploying this method, we observe a 7.8% accuracy improvement in bigger LMs like Llama-3-70B compared to the vanilla on HybridQA.
Our findings highlight the limitations of the step-by-step reasoning method in small models and provide potential insights for making improvements.
- Score: 6.083393426133172
- License:
- Abstract: This paper proposes a detailed prompting flow, termed Table-Logic, to investigate the performance contrasts between bigger and smaller language models (LMs) utilizing step-by-step reasoning methods in the TableQA task. The method processes tasks by sequentially identifying critical columns and rows given question and table with its structure, determining necessary aggregations, calculations, or comparisons, and finally inferring the results to generate a precise prediction. By deploying this method, we observe a 7.8% accuracy improvement in bigger LMs like Llama-3-70B compared to the vanilla on HybridQA, while smaller LMs like Llama-2-7B shows an 11% performance decline. We empirically investigate the potential causes of performance contrasts by exploring the capabilities of bigger and smaller LMs from various dimensions in TableQA task. Our findings highlight the limitations of the step-by-step reasoning method in small models and provide potential insights for making improvements.
Related papers
- Adaptive Pruning for Large Language Models with Structural Importance Awareness [66.2690963378878]
Large language models (LLMs) have significantly improved language understanding and generation capabilities.
LLMs are difficult to deploy on resource-constrained edge devices due to their high computational and storage resource demands.
We propose structurally-aware adaptive pruning (SAAP) to significantly reduce the computational and memory costs while maintaining model performance.
arXiv Detail & Related papers (2024-12-19T18:08:04Z) - Interpretable LLM-based Table Question Answering [5.940265173828534]
Plan-of-inputs ( or POS) is an interpretable, effective, and efficient approach to Table QA.
We show that POS is most preferred among explanation methods, helps human users understand model decision boundaries, and facilitates model success and error identification.
arXiv Detail & Related papers (2024-12-16T22:44:31Z) - MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale [66.73529246309033]
multimodal large language models (MLLMs) have shown significant potential in a broad range of multimodal tasks.
Existing instruction-tuning datasets only provide phrase-level answers without any intermediate rationales.
We introduce a scalable and cost-effective method to construct a large-scale multimodal instruction-tuning dataset with rich intermediate rationales.
arXiv Detail & Related papers (2024-12-06T18:14:24Z) - TableRAG: Million-Token Table Understanding with Language Models [53.039560091592215]
TableRAG is a Retrieval-Augmented Generation (RAG) framework specifically designed for LM-based table understanding.
TableRAG leverages query expansion combined with schema and cell retrieval to pinpoint crucial information before providing it to the LMs.
Our results demonstrate that TableRAG achieves the highest retrieval quality, leading to the new state-of-the-art performance on large-scale table understanding.
arXiv Detail & Related papers (2024-10-07T04:15:02Z) - TAP4LLM: Table Provider on Sampling, Augmenting, and Packing Semi-structured Data for Large Language Model Reasoning [55.33939289989238]
We propose TAP4LLM as a versatile pre-processor suite for leveraging large language models (LLMs) in table-based tasks effectively.
It covers several distinct components: (1) table sampling to decompose large tables into manageable sub-tables based on query semantics, (2) table augmentation to enhance tables with additional knowledge from external sources or models, and (3) table packing & serialization to convert tables into various formats suitable for LLMs' understanding.
arXiv Detail & Related papers (2023-12-14T15:37:04Z) - Which Examples to Annotate for In-Context Learning? Towards Effective
and Efficient Selection [35.924633625147365]
Large Language Models (LLMs) can adapt to new tasks via in-context learning (ICL)
In this work, we investigate an active learning approach for ICL, where there is a limited budget for annotating examples.
We propose a model-adaptive optimization-free algorithm, termed AdaICL, which identifies examples that the model is uncertain about.
arXiv Detail & Related papers (2023-10-30T22:03:55Z) - Guiding Language Model Reasoning with Planning Tokens [122.43639723387516]
Large language models (LLMs) have recently attracted considerable interest for their ability to perform complex reasoning tasks.
We propose a hierarchical generation scheme to encourage a more structural generation of chain-of-thought steps.
Our approach requires a negligible increase in trainable parameters (0.001%) and can be applied through either full fine-tuning or a more parameter-efficient scheme.
arXiv Detail & Related papers (2023-10-09T13:29:37Z) - SatLM: Satisfiability-Aided Language Models Using Declarative Prompting [68.40726892904286]
We propose a new satisfiability-aided language modeling (SatLM) approach for improving the reasoning capabilities of large language models (LLMs)
We use an LLM to generate a declarative task specification rather than an imperative program and leverage an off-the-shelf automated theorem prover to derive the final answer.
We evaluate SATLM on 8 different datasets and show that it consistently outperforms program-aided LMs in the imperative paradigm.
arXiv Detail & Related papers (2023-05-16T17:55:51Z) - DIN-SQL: Decomposed In-Context Learning of Text-to-SQL with
Self-Correction [7.388002745070808]
We study how breaking down the generation problem into sub-problems and feeding the solutions of those sub-problems into Large Language Models can be effective.
Our approach with in-context learning beats many heavily fine-tuned models by at least 5%.
arXiv Detail & Related papers (2023-04-21T15:02:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.