Self-QA: Unsupervised Knowledge Guided Language Model Alignment
- URL: http://arxiv.org/abs/2305.11952v1
- Date: Fri, 19 May 2023 18:26:26 GMT
- Title: Self-QA: Unsupervised Knowledge Guided Language Model Alignment
- Authors: Xuanyu Zhang and Qing Yang
- Abstract summary: We introduce Self-QA, which replaces the traditional practice of human-written instruction seeds with a vast amount of unsupervised knowledge.
The effectiveness of our proposed method is demonstrated through experiments conducted on unsupervised corpora from various domains.
- Score: 17.436587487811387
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large-scale language models like ChatGPT and GPT-4 have gained attention for
their impressive conversational and generative capabilities. However, the
creation of supervised paired question-answering data for instruction tuning
presents formidable challenges. This endeavor necessitates substantial human
effort for data annotation and wrestles with issues concerning data quality,
diversity, accuracy, and other related factors. To overcome these obstacles, we
introduce an innovative framework named Self-QA, which replaces the traditional
practice of human-written instruction seeds with a vast amount of unsupervised
knowledge, enabling the model to generate a larger quantity of correct and
domain-specific instruction data. The effectiveness of our proposed method is
demonstrated through experiments conducted on unsupervised corpora from various
domains.
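The core loop the abstract describes can be sketched in a few lines: feed chunks of an unlabeled domain corpus to a language model and ask it to write a question-answer pair grounded in each chunk, instead of starting from human-written instruction seeds. This is a minimal illustrative sketch, not the paper's implementation; `ask_model` is a hypothetical stand-in for a real LLM call and returns a canned response so the example runs.

```python
# Sketch of knowledge-guided instruction generation (Self-QA style):
# unlabeled passages -> model-written QA pairs usable as instruction data.

def ask_model(prompt: str) -> str:
    # Placeholder for a real LLM call. A real implementation would send
    # `prompt` to an instruction-tuned model; here we return a fixed
    # "Q: ... / A: ..." string so the sketch is self-contained.
    return ("Q: What does the passage describe?\n"
            "A: It describes a fact stated in the passage.")

def knowledge_to_qa(passage: str) -> dict:
    """Turn one unsupervised knowledge passage into a QA instruction pair."""
    prompt = (
        "Read the passage below and write one question it answers, then the "
        "answer, using only information from the passage.\n\n"
        f"Passage: {passage}\n"
    )
    raw = ask_model(prompt)
    question, _, answer = raw.partition("\nA: ")
    return {
        "instruction": question.removeprefix("Q: ").strip(),
        "output": answer.strip(),
        "source": passage,  # keep provenance for later quality filtering
    }

# Two toy domain passages standing in for an unsupervised corpus.
corpus = [
    "The heart pumps oxygenated blood through the aorta to the body.",
    "Basel III requires banks to hold a minimum tier 1 capital ratio.",
]
dataset = [knowledge_to_qa(p) for p in corpus]
```

In practice the generation step would be followed by the filtering the abstract alludes to (checking that answers are supported by the source passage) before the pairs are used for instruction tuning.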
Related papers
- The Superalignment of Superhuman Intelligence with Large Language Models [63.96120398355404]
We discuss the concept of superalignment from a learning perspective.
We highlight some key research problems in superalignment, namely, weak-to-strong generalization, scalable oversight, and evaluation.
We present a conceptual framework for superalignment consisting of three modules: an attacker, which generates adversarial queries that try to expose weaknesses of a learner model; a learner, which refines itself from scalable feedback produced by a critic model together with minimal human expert input; and a critic, which generates critiques or explanations for a given query-response pair with the goal of improving the learner.
arXiv Detail & Related papers (2024-12-15T10:34:06Z)
- KBAlign: Efficient Self Adaptation on Specific Knowledge Bases [73.34893326181046]
Large language models (LLMs) usually rely on retrieval-augmented generation to exploit knowledge materials in an instant manner.
We propose KBAlign, an approach designed for efficient adaptation to downstream tasks involving knowledge bases.
Our method utilizes iterative training with self-annotated data such as Q&A pairs and revision suggestions, enabling the model to grasp the knowledge content efficiently.
arXiv Detail & Related papers (2024-11-22T08:21:03Z)
- SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models [54.78329741186446]
We propose a novel paradigm that uses a code-based critic model to guide steps including question-code data construction, quality control, and complementary evaluation.
Experiments across both in-domain and out-of-domain benchmarks in English and Chinese demonstrate the effectiveness of the proposed paradigm.
arXiv Detail & Related papers (2024-08-28T06:33:03Z)
- CAUS: A Dataset for Question Generation based on Human Cognition Leveraging Large Language Models [4.962252439662465]
We introduce the Curious About Uncertain Scene dataset to enable Large Language Models to emulate human cognitive processes for resolving uncertainties.
Our approach involves providing scene descriptions embedded with uncertainties to stimulate the generation of reasoning and queries.
Our results demonstrate that GPT-4 can effectively generate pertinent questions and grasp their nuances, particularly when given appropriate context and instructions.
arXiv Detail & Related papers (2024-04-18T01:31:19Z)
- Towards Model-Based Data Acquisition for Subjective Multi-Task NLP Problems [12.38430125789305]
We propose a new model-based approach that allows the selection of tasks annotated individually for each text in a multi-task scenario.
Experiments carried out on three datasets, dozens of NLP tasks, and thousands of annotations show that our method allows up to 40% reduction in the number of annotations with negligible loss of knowledge.
arXiv Detail & Related papers (2023-12-13T15:03:27Z)
- Can Foundation Models Watch, Talk and Guide You Step by Step to Make a Cake? [62.59699229202307]
Despite advances in AI, it remains a significant challenge to develop interactive task guidance systems.
We created a new multimodal benchmark dataset, Watch, Talk and Guide (WTaG) based on natural interaction between a human user and a human instructor.
We leveraged several foundation models to study to what extent these models can be quickly adapted to perceptually enabled task guidance.
arXiv Detail & Related papers (2023-11-01T15:13:49Z)
- Offline Diversity Maximization Under Imitation Constraints [23.761620064055897]
We propose a principled offline algorithm for unsupervised skill discovery.
Our main analytical contribution is to connect Fenchel duality, reinforcement learning, and unsupervised skill discovery.
We demonstrate the effectiveness of our method on the standard offline benchmark D4RL.
arXiv Detail & Related papers (2023-07-21T06:12:39Z)
- Few-shot Named Entity Recognition with Cloze Questions [3.561183926088611]
We propose a simple and intuitive adaptation of Pattern-Exploiting Training (PET), a recent approach which combines the cloze-questions mechanism and fine-tuning for few-shot learning.
Our approach achieves considerably better performance than standard fine-tuning and comparable or improved results with respect to other few-shot baselines.
arXiv Detail & Related papers (2021-11-24T11:08:59Z)
- Knowledge-driven Data Construction for Zero-shot Evaluation in Commonsense Question Answering [80.60605604261416]
We propose a novel neuro-symbolic framework for zero-shot question answering across commonsense tasks.
We vary the set of language models, training regimes, knowledge sources, and data generation strategies, and measure their impact across tasks.
We show that, while an individual knowledge graph is better suited for specific tasks, a global knowledge graph brings consistent gains across different tasks.
arXiv Detail & Related papers (2020-11-07T22:52:21Z)
- InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective [84.78604733927887]
Large-scale language models such as BERT have achieved state-of-the-art performance across a wide range of NLP tasks.
Recent studies show that such BERT-based models are vulnerable to textual adversarial attacks.
We propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models.
arXiv Detail & Related papers (2020-10-05T20:49:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.