Investigating Answerability of LLMs for Long-Form Question Answering
- URL: http://arxiv.org/abs/2309.08210v1
- Date: Fri, 15 Sep 2023 07:22:56 GMT
- Title: Investigating Answerability of LLMs for Long-Form Question Answering
- Authors: Meghana Moorthy Bhat, Rui Meng, Ye Liu, Yingbo Zhou and Semih Yavuz
- Abstract summary: We focus on long-form question answering (LFQA) because it has several practical and impactful applications.
We propose a question-generation method from abstractive summaries and show that generating follow-up questions from summaries of long documents can create a challenging setting.
- Score: 35.41413072729483
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As we embark on a new era of LLMs, it becomes increasingly crucial to
understand their capabilities, limitations, and differences. Toward making
further progress in this direction, we strive to build a deeper understanding
of the gaps between massive LLMs (e.g., ChatGPT) and smaller yet effective
open-source LLMs and their distilled counterparts. To this end, we specifically
focus on long-form question answering (LFQA) because it has several practical
and impactful applications (e.g., troubleshooting, customer service, etc.) yet
is still understudied and challenging for LLMs. We propose a
question-generation method from abstractive summaries and show that generating
follow-up questions from summaries of long documents can create a challenging
setting for LLMs to reason and infer from long contexts. Our experimental
results confirm that: (1) our proposed method of generating questions from
abstractive summaries pose a challenging setup for LLMs and shows performance
gaps between LLMs like ChatGPT and open-source LLMs (Alpaca, Llama) (2)
open-source LLMs exhibit decreased reliance on context for generated questions
from the original document, but their generation capabilities drop
significantly on generated questions from summaries -- especially for longer
contexts (>1024 tokens)
Related papers
- ALR$^2$: A Retrieve-then-Reason Framework for Long-context Question Answering [42.146660039671076]
We develop a retrieve-then-reason framework for large language models (LLMs)
We find that modern LLMs struggle to accurately retrieve relevant facts and instead, often hallucinate "retrieved facts"
We introduce ALR$2$, a method that augments the long-context reasoning capability of LLMs via an explicit two-stage procedure.
arXiv Detail & Related papers (2024-10-04T08:29:12Z) - Are LLMs Aware that Some Questions are not Open-ended? [58.93124686141781]
We study whether Large Language Models are aware that some questions have limited answers and need to respond more deterministically.
The lack of question awareness in LLMs leads to two phenomena: (1) too casual to answer non-open-ended questions or (2) too boring to answer open-ended questions.
arXiv Detail & Related papers (2024-10-01T06:07:00Z) - Prompting Large Language Models with Knowledge Graphs for Question Answering Involving Long-tail Facts [50.06633829833144]
Large Language Models (LLMs) are effective in performing various NLP tasks, but struggle to handle tasks that require extensive, real-world knowledge.
We propose a benchmark that requires knowledge of long-tail facts for answering the involved questions.
Our experiments show that LLMs alone struggle with answering these questions, especially when the long-tail level is high or rich knowledge is required.
arXiv Detail & Related papers (2024-05-10T15:10:20Z) - Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs [60.40396361115776]
This paper introduces a novel collaborative approach, namely SlimPLM, that detects missing knowledge in large language models (LLMs) with a slim proxy model.
We employ a proxy model which has far fewer parameters, and take its answers as answers.
Heuristic answers are then utilized to predict the knowledge required to answer the user question, as well as the known and unknown knowledge within the LLM.
arXiv Detail & Related papers (2024-02-19T11:11:08Z) - Rethinking Interpretability in the Era of Large Language Models [76.1947554386879]
Large language models (LLMs) have demonstrated remarkable capabilities across a wide array of tasks.
The capability to explain in natural language allows LLMs to expand the scale and complexity of patterns that can be given to a human.
These new capabilities raise new challenges, such as hallucinated explanations and immense computational costs.
arXiv Detail & Related papers (2024-01-30T17:38:54Z) - Blinded by Generated Contexts: How Language Models Merge Generated and Retrieved Contexts When Knowledge Conflicts? [45.233517779029334]
We identify whether responses are attributed to generated or retrieved contexts.
Experiments reveal a significant bias in several LLMs to favor generated contexts, even when they provide incorrect information.
arXiv Detail & Related papers (2024-01-22T12:54:04Z) - Knowing What LLMs DO NOT Know: A Simple Yet Effective Self-Detection Method [36.24876571343749]
Large Language Models (LLMs) have shown great potential in Natural Language Processing (NLP) tasks.
Recent literature reveals that LLMs generate nonfactual responses intermittently.
We propose a novel self-detection method to detect which questions that a LLM does not know that are prone to generate nonfactual results.
arXiv Detail & Related papers (2023-10-27T06:22:14Z) - Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation [109.8527403904657]
We show that large language models (LLMs) possess unwavering confidence in their knowledge and cannot handle the conflict between internal and external knowledge well.
Retrieval augmentation proves to be an effective approach in enhancing LLMs' awareness of knowledge boundaries.
We propose a simple method to dynamically utilize supporting documents with our judgement strategy.
arXiv Detail & Related papers (2023-07-20T16:46:10Z) - Check Your Facts and Try Again: Improving Large Language Models with
External Knowledge and Automated Feedback [127.75419038610455]
Large language models (LLMs) are able to generate human-like, fluent responses for many downstream tasks.
This paper proposes a LLM-Augmenter system, which augments a black-box LLM with a set of plug-and-play modules.
arXiv Detail & Related papers (2023-02-24T18:48:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.