Self-QA: Unsupervised Knowledge Guided Language Model Alignment
- URL: http://arxiv.org/abs/2305.11952v1
- Date: Fri, 19 May 2023 18:26:26 GMT
- Title: Self-QA: Unsupervised Knowledge Guided Language Model Alignment
- Authors: Xuanyu Zhang and Qing Yang
- Abstract summary: We introduce Self-QA, which replaces the traditional practice of human-written instruction seeds with a vast amount of unsupervised knowledge.
The effectiveness of our proposed method is demonstrated through experiments conducted on unsupervised corpora from various domains.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large-scale language models like ChatGPT and GPT-4 have gained attention for
their impressive conversational and generative capabilities. However, the
creation of supervised paired question-answering data for instruction tuning
presents formidable challenges: it demands substantial human effort for data
annotation and raises concerns about data quality, diversity, and accuracy. To
overcome these obstacles, we
introduce an innovative framework named Self-QA, which replaces the traditional
practice of human-written instruction seeds with a vast amount of unsupervised
knowledge, enabling the model to generate a larger quantity of correct and
domain-specific instruction data. The effectiveness of our proposed method is
demonstrated through experiments conducted on unsupervised corpora from various
domains.
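Below is a minimal sketch of the core Self-QA idea as described in the abstract: chunks of an unsupervised domain corpus are fed to an instruction-following model, which is prompted to write question-answer pairs grounded in each chunk. The `generate` callable, the prompt template, the chunking, and the output format are illustrative assumptions, not the paper's exact design:

```python
from typing import Callable, List, Tuple

# Hypothetical prompt template; the actual Self-QA prompts are not given in the abstract.
QA_PROMPT = (
    "Read the following passage and write one question it answers, then the "
    "answer, as 'Question: ...' and 'Answer: ...'.\n\nPassage:\n{chunk}\n"
)

def chunk_corpus(text: str, max_chars: int = 1500) -> List[str]:
    """Naively split an unsupervised corpus into model-sized chunks."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def self_qa(corpus: str, generate: Callable[[str], str]) -> List[Tuple[str, str]]:
    """Turn raw domain text into (question, answer) instruction-tuning pairs.

    `generate` stands in for any instruction-following LLM (prompt -> completion).
    """
    pairs: List[Tuple[str, str]] = []
    for chunk in chunk_corpus(corpus):
        completion = generate(QA_PROMPT.format(chunk=chunk))
        # Keep only completions that match the assumed output format.
        if "Question:" in completion and "\nAnswer:" in completion:
            q_part, a_part = completion.split("\nAnswer:", 1)
            if "Question:" in q_part:
                question = q_part.split("Question:", 1)[1].strip()
                pairs.append((question, a_part.strip()))
    return pairs
```

The resulting pairs can then be filtered and used as supervised instruction-tuning data in place of human-written seeds.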
Related papers
- SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models [54.78329741186446]
We propose a novel paradigm that uses a code-based critic model to guide steps including question-code data construction, quality control, and complementary evaluation.
Experiments across both in-domain and out-of-domain benchmarks in English and Chinese demonstrate the effectiveness of the proposed paradigm.
arXiv Detail & Related papers (2024-08-28T06:33:03Z)
- Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction [54.23208041792073]
Aspect Sentiment Quad Prediction (ASQP) aims to predict all quads (aspect term, aspect category, opinion term, sentiment polarity) for a given review.
A key challenge in the ASQP task is the scarcity of labeled data, which limits the performance of existing methods.
We propose a self-training framework with a pseudo-label scorer, wherein a scorer assesses the match between reviews and their pseudo-labels (a generic version of this loop is sketched after this list).
arXiv Detail & Related papers (2024-06-26T05:30:21Z)
- CAUS: A Dataset for Question Generation based on Human Cognition Leveraging Large Language Models [4.962252439662465]
We introduce the Curious About Uncertain Scene dataset to enable Large Language Models to emulate human cognitive processes for resolving uncertainties.
Our approach involves providing scene descriptions embedded with uncertainties to stimulate the generation of reasoning and queries.
Our results demonstrate that GPT-4 can effectively generate pertinent questions and grasp their nuances, particularly when given appropriate context and instructions.
arXiv Detail & Related papers (2024-04-18T01:31:19Z)
- Towards Model-Based Data Acquisition for Subjective Multi-Task NLP Problems [12.38430125789305]
We propose a new model-based approach that allows the selection of tasks annotated individually for each text in a multi-task scenario.
Experiments carried out on three datasets, dozens of NLP tasks, and thousands of annotations show that our method allows up to 40% reduction in the number of annotations with negligible loss of knowledge.
arXiv Detail & Related papers (2023-12-13T15:03:27Z)
- Can Foundation Models Watch, Talk and Guide You Step by Step to Make a Cake? [62.59699229202307]
Despite advances in AI, it remains a significant challenge to develop interactive task guidance systems.
We created a new multimodal benchmark dataset, Watch, Talk and Guide (WTaG) based on natural interaction between a human user and a human instructor.
We leveraged several foundation models to study the extent to which they can be quickly adapted to perceptually enabled task guidance.
arXiv Detail & Related papers (2023-11-01T15:13:49Z)
- Offline Diversity Maximization Under Imitation Constraints [23.761620064055897]
We propose a principled offline algorithm for unsupervised skill discovery.
Our main analytical contribution is to connect Fenchel duality, reinforcement learning, and unsupervised skill discovery.
We demonstrate the effectiveness of our method on the standard offline benchmark D4RL.
arXiv Detail & Related papers (2023-07-21T06:12:39Z)
- Few-shot Named Entity Recognition with Cloze Questions [3.561183926088611]
We propose a simple and intuitive adaptation of Pattern-Exploiting Training (PET), a recent approach which combines the cloze-questions mechanism and fine-tuning for few-shot learning.
Our approach achieves considerably better performance than standard fine-tuning and comparable or better results than other few-shot baselines.
arXiv Detail & Related papers (2021-11-24T11:08:59Z)
- Knowledge-driven Data Construction for Zero-shot Evaluation in Commonsense Question Answering [80.60605604261416]
We propose a novel neuro-symbolic framework for zero-shot question answering across commonsense tasks.
We vary the set of language models, training regimes, knowledge sources, and data generation strategies, and measure their impact across tasks.
We show that, while an individual knowledge graph is better suited for specific tasks, a global knowledge graph brings consistent gains across different tasks.
arXiv Detail & Related papers (2020-11-07T22:52:21Z)
- InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective [84.78604733927887]
Large-scale language models such as BERT have achieved state-of-the-art performance across a wide range of NLP tasks.
Recent studies show that such BERT-based models are vulnerable to textual adversarial attacks.
We propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models.
arXiv Detail & Related papers (2020-10-05T20:49:26Z)
- Enhancing Dialogue Generation via Multi-Level Contrastive Learning [57.005432249952406]
We propose a multi-level contrastive learning paradigm to model the fine-grained quality of the responses with respect to the query.
A Rank-aware Calibration (RC) network is designed to construct the multi-level contrastive optimization objectives.
We build a Knowledge Inference (KI) component to capture the keyword knowledge from the reference during training and exploit such information to encourage the generation of informative words.
arXiv Detail & Related papers (2020-09-19T02:41:04Z)
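As a companion to the Self-Training with Pseudo-Label Scorer entry above, here is a generic sketch of a scored self-training loop; the interfaces are hypothetical placeholders, not the paper's code:

```python
from typing import Callable, List, Tuple

Predictor = Callable[[str], str]  # maps a text to a (pseudo-)label

def self_train_with_scorer(
    labeled: List[Tuple[str, str]],      # gold (text, label) pairs
    unlabeled: List[str],                # raw texts without labels
    fit: Callable[[List[Tuple[str, str]]], Predictor],  # trains and returns a predictor
    score: Callable[[str, str], float],  # rates how well a pseudo-label matches its text
    threshold: float = 0.8,
    rounds: int = 3,
) -> Predictor:
    """Pseudo-label the unlabeled pool, keep only pairs the scorer trusts, retrain."""
    data = list(labeled)
    for _ in range(rounds):
        predict = fit(data)
        pseudo = [(text, predict(text)) for text in unlabeled]
        # Filter by the scorer so low-quality pseudo-labels do not pollute training.
        kept = [(t, y) for t, y in pseudo if score(t, y) >= threshold]
        data = list(labeled) + kept
    return fit(data)
```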