T-SciQ: Teaching Multimodal Chain-of-Thought Reasoning via Mixed Large
Language Model Signals for Science Question Answering
- URL: http://arxiv.org/abs/2305.03453v4
- Date: Mon, 18 Dec 2023 04:49:01 GMT
- Title: T-SciQ: Teaching Multimodal Chain-of-Thought Reasoning via Mixed Large
Language Model Signals for Science Question Answering
- Authors: Lei Wang, Yi Hu, Jiabang He, Xing Xu, Ning Liu, Hui Liu, Heng Tao Shen
- Abstract summary: Large Language Models (LLMs) have demonstrated exceptional performance in various Natural Language Processing (NLP) tasks.
We propose a novel method termed T-SciQ that aims at teaching science question answering with LLM signals.
Our approach achieves a new state-of-the-art performance on the ScienceQA benchmark, with an accuracy of 96.18%.
- Score: 59.63860993280275
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have recently demonstrated exceptional
performance in various Natural Language Processing (NLP) tasks. They have also
shown the ability to perform chain-of-thought (CoT) reasoning to solve complex
problems. Recent studies have explored CoT reasoning in complex multimodal
scenarios, such as the science question answering task, by fine-tuning
multimodal models with high-quality human-annotated CoT rationales. However,
collecting high-quality COT rationales is usually time-consuming and costly.
Besides, the annotated rationales are hardly accurate due to the external
essential information missed. To address these issues, we propose a novel
method termed T-SciQ that aims at teaching science question answering with LLM
signals. The T-SciQ approach generates high-quality CoT rationales as teaching
signals and is advanced to train much smaller models to perform CoT reasoning
in complex modalities. Additionally, we introduce a novel data mixing strategy
to produce more effective teaching data samples for simple and complex science
question answer problems. Extensive experimental results show that our T-SciQ
method achieves a new state-of-the-art performance on the ScienceQA benchmark,
with an accuracy of 96.18%. Moreover, our approach outperforms the most
powerful fine-tuned baseline by 4.5%. The code is publicly available at
https://github.com/T-SciQ/T-SciQ.
Related papers
- AtomR: Atomic Operator-Empowered Large Language Models for Heterogeneous Knowledge Reasoning [38.736190591684]
AtomR is a novel heterogeneous knowledge reasoning framework.
It decomposes complex questions into combinations of three atomic knowledge operators.
AtomR significantly outperforms state-of-the-art baselines across three single-source and two multi-source reasoning benchmarks.
arXiv Detail & Related papers (2024-11-25T15:35:51Z) - Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis [82.51626700527837]
Chain-of-shift (CoT) is an efficient method that enables the reasoning ability of large language models by augmenting the query using examples with multiple intermediate steps.
We show that despite the theoretical success of CoT, it fails to provide an accurate generalization when CoT does.
arXiv Detail & Related papers (2024-10-03T03:12:51Z) - ChainLM: Empowering Large Language Models with Improved Chain-of-Thought Prompting [124.69672273754144]
Chain-of-Thought (CoT) prompting can enhance the reasoning capabilities of large language models (LLMs)
Existing CoT approaches usually focus on simpler reasoning tasks and thus result in low-quality and inconsistent CoT prompts.
We introduce CoTGenius, a novel framework designed for the automatic generation of superior CoT prompts.
arXiv Detail & Related papers (2024-03-21T11:34:26Z) - Prompting Large Language Models with Chain-of-Thought for Few-Shot
Knowledge Base Question Generation [19.327008532572645]
Question Generation over Knowledge Bases (KBQG) aims to convert a logical form into a natural language question.
We propose Chain-of-Thought prompting, which is an in-context learning strategy for reasoning.
We conduct extensive experiments over three public KBQG datasets.
arXiv Detail & Related papers (2023-10-12T15:08:14Z) - Sci-CoT: Leveraging Large Language Models for Enhanced Knowledge
Distillation in Small Models for Scientific QA [5.117094291273979]
Large Language Models (LLMs) have shown outstanding performance across wide range of downstream tasks.
We propose Sci-CoT, a two-stage framework that separates the processes of generating rationales and inferring answers.
Our 80-million parameter model is able to exceed the performance of BLOOM-176B in the ARC-Easy dataset under the few shot setting.
arXiv Detail & Related papers (2023-08-09T03:18:07Z) - Multimodal Chain-of-Thought Reasoning in Language Models [94.70184390935661]
We propose Multimodal-CoT that incorporates language (text) and vision (images) modalities into a two-stage framework.
Experimental results on ScienceQA and A-OKVQA benchmark datasets show the effectiveness of our proposed approach.
arXiv Detail & Related papers (2023-02-02T07:51:19Z) - Learn to Explain: Multimodal Reasoning via Thought Chains for Science
Question Answering [124.16250115608604]
We present Science Question Answering (SQA), a new benchmark that consists of 21k multimodal multiple choice questions with a diverse set of science topics and annotations of their answers with corresponding lectures and explanations.
We show that SQA improves the question answering performance by 1.20% in few-shot GPT-3 and 3.99% in fine-tuned UnifiedQA.
Our analysis further shows that language models, similar to humans, benefit from explanations to learn from fewer data and achieve the same performance with just 40% of the data.
arXiv Detail & Related papers (2022-09-20T07:04:24Z) - Logic-Guided Data Augmentation and Regularization for Consistent
Question Answering [55.05667583529711]
This paper addresses the problem of improving the accuracy and consistency of responses to comparison questions.
Our method leverages logical and linguistic knowledge to augment labeled training data and then uses a consistency-based regularizer to train the model.
arXiv Detail & Related papers (2020-04-21T17:03:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.