Distilling System 2 into System 1
- URL: http://arxiv.org/abs/2407.06023v3
- Date: Wed, 24 Jul 2024 18:40:36 GMT
- Title: Distilling System 2 into System 1
- Authors: Ping Yu, Jing Xu, Jason Weston, Ilia Kulikov
- Abstract summary: Large language models (LLMs) can spend extra compute during inference to generate intermediate thoughts.
We show that several such techniques can be successfully distilled, resulting in improved results compared to the original System 1 performance.
- Score: 35.194258450176534
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large language models (LLMs) can spend extra compute during inference to generate intermediate thoughts, which helps to produce better final responses. Since Chain-of-Thought (Wei et al., 2022), many such System 2 techniques have been proposed such as Rephrase and Respond (Deng et al., 2023a), System 2 Attention (Weston and Sukhbaatar, 2023) and Branch-Solve-Merge (Saha et al., 2023). In this work we investigate self-supervised methods to "compile" (distill) higher quality outputs from System 2 techniques back into LLM generations without intermediate reasoning token sequences, as this reasoning has been distilled into System 1. We show that several such techniques can be successfully distilled, resulting in improved results compared to the original System 1 performance, and with less inference cost than System 2. We posit that such System 2 distillation will be an important feature of future continually learning AI systems, enabling them to focus System 2 capabilities on the reasoning tasks that they cannot yet do well.
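The recipe implied by the abstract can be pictured as a small loop: sample a System 2 pipeline (e.g. Rephrase and Respond) several times per unlabeled prompt, keep only prompts where the sampled final answers largely agree (one plausible self-supervised filter), and collect (input, final answer) pairs with no intermediate reasoning tokens for fine-tuning the System 1 student. The sketch below is a minimal, hypothetical illustration of that idea; the `system2` callable, the helper names, and the agreement threshold are assumptions, not the paper's code.

```python
from collections import Counter
from typing import Callable, Iterable, Optional

def self_consistency_target(
    prompt: str,
    system2: Callable[[str], str],  # a System 2 pipeline, e.g. an LLM wrapped with Rephrase-and-Respond
    n_samples: int = 8,
    min_agreement: float = 0.75,
) -> Optional[str]:
    """Sample the System 2 pipeline several times and keep the majority final answer
    only if agreement is high enough; otherwise discard the prompt. This is one
    plausible unsupervised curation step consistent with the abstract's
    self-supervised framing, not necessarily the paper's exact criterion."""
    answers = [system2(prompt) for _ in range(n_samples)]
    answer, count = Counter(answers).most_common(1)[0]
    return answer if count / n_samples >= min_agreement else None

def build_distillation_pairs(
    prompts: Iterable[str],
    system2: Callable[[str], str],
) -> list[tuple[str, str]]:
    """Collect (input, final answer) pairs with no intermediate reasoning tokens;
    these become the training targets for the System 1 student."""
    pairs: list[tuple[str, str]] = []
    for prompt in prompts:
        target = self_consistency_target(prompt, system2)
        if target is not None:
            pairs.append((prompt, target))
    return pairs
```

Fine-tuning the base model on the curated pairs is then ordinary supervised fine-tuning; at inference time the distilled model answers directly, at System 1 cost, without emitting intermediate reasoning tokens.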
Related papers
- System-1.x: Learning to Balance Fast and Slow Planning with Language Models [68.77277620915143]
Language models can be used to solve long-horizon planning problems in two distinct modes.
A fast 'System-1' mode directly generates plans without any explicit search or backtracking, while a slow 'System-2' mode plans step-by-step.
We propose the System-1.x Planner, a controllable planning framework with LLMs.
arXiv Detail & Related papers (2024-07-19T15:40:59Z)
- Clarifying System 1 & 2 through the Common Model of Cognition [0.0]
We use the Common Model of Cognition to ground System-1 and System-2.
We aim to clarify their underlying mechanisms, persisting misconceptions, and implications for metacognition.
arXiv Detail & Related papers (2023-05-18T02:25:03Z)
- AAAI 2022 Fall Symposium: System-1 and System-2 realized within the Common Model of Cognition [0.0]
We situate System-1 and System-2 within the Common Model of Cognition.
Results show that what are thought to be distinctive characteristics of System-1 and 2 instead form a spectrum of cognitive properties.
arXiv Detail & Related papers (2023-05-16T01:28:06Z)
- Learning Physical Concepts in Cyber-Physical Systems: A Case Study [72.74318982275052]
We provide an overview of the current state of research regarding methods for learning physical concepts in time series data.
We also analyze the most important methods from the current state of the art using the example of a three-tank system.
arXiv Detail & Related papers (2021-11-28T14:24:52Z)
- STC speaker recognition systems for the NIST SRE 2021 [56.05258832139496]
This paper presents a description of STC Ltd. systems submitted to the NIST 2021 Speaker Recognition Evaluation.
These systems consist of a number of diverse subsystems that use deep neural networks as feature extractors.
For the video modality, we developed our best solution with the RetinaFace face detector and a deep ResNet face embedding extractor trained on large face image datasets.
arXiv Detail & Related papers (2021-11-03T15:31:01Z)
- Determining Sentencing Recommendations and Patentability Using a Machine Learning Trained Expert System [0.0]
This paper presents two studies that use a machine learning expert system (MLES).
One study focuses on a system to advise U.S. federal judges regarding consistent federal criminal sentencing.
The other study aims to develop a system that could help the U.S. Patent and Trademark Office automate its patentability assessment process.
arXiv Detail & Related papers (2021-08-05T16:21:29Z)
- The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap [67.395341302752]
This paper provides a detailed description of the Hitachi-JHU system that was submitted to the Third DIHARD Speech Diarization Challenge.
The system outputs the ensemble results of the five subsystems: two x-vector-based subsystems, two end-to-end neural diarization-based subsystems, and one hybrid subsystem.
arXiv Detail & Related papers (2021-02-02T07:30:44Z)
- Interleaving Fast and Slow Decision Making [7.41244589428771]
Kahneman proposes that we use two different styles of thinking -- a fast and intuitive System 1 for certain tasks, along with a slower but more analytical System 2 for others.
We propose a novel and general framework which includes a new System 0 to oversee Systems 1 and 2.
We evaluate such a framework on a modified version of the classic Pac-Man game, with an already-trained RL algorithm for System 1, a Monte-Carlo tree search for System 2, and several different possible strategies for System 0; a minimal sketch of such a System 0 arbiter appears after this list.
arXiv Detail & Related papers (2020-10-30T13:16:10Z)
- COVCOR20 at WNUT-2020 Task 2: An Attempt to Combine Deep Learning and Expert rules [0.0]
In the scope of WNUT-2020 Task 2, we developed several text classification systems: two using deep learning models and one using linguistically informed rules.
While both deep learning systems outperformed the system based on linguistically informed rules, we found that integrating the outputs of the three systems achieved better performance.
arXiv Detail & Related papers (2020-09-07T15:54:23Z)
- Exploration in two-stage recommender systems [79.50534282841618]
Two-stage recommender systems are widely adopted in industry due to their scalability and maintainability.
A key challenge of this setup is that optimal performance of each stage in isolation does not imply optimal global performance.
We propose a method of synchronising the exploration strategies between the ranker and the nominators.
arXiv Detail & Related papers (2020-09-01T16:52:51Z)
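Relating to the "Interleaving Fast and Slow Decision Making" entry above: one simple way to picture a System 0 arbiter is a confidence gate between a fast learned policy (System 1) and a slower search procedure (System 2). The sketch below is a hypothetical illustration under assumed interfaces; the paper evaluates several System 0 strategies on Pac-Man, and nothing here is taken from its implementation.

```python
from typing import Any, Callable

State = Any
Action = Any

def make_system0_controller(
    fast_policy: Callable[[State], tuple[Action, float]],  # System 1: returns (action, confidence)
    slow_search: Callable[[State], Action],                 # System 2: e.g. a Monte-Carlo tree search
    confidence_threshold: float = 0.8,
) -> Callable[[State], Action]:
    """Return a System 0 arbiter: act with the fast policy when it is confident,
    otherwise pay the extra compute for the slow search. This confidence gate is
    only one of many possible System 0 strategies (hypothetical example)."""
    def system0(state: State) -> Action:
        action, confidence = fast_policy(state)
        if confidence >= confidence_threshold:
            return action          # easy case: System 1 decides cheaply
        return slow_search(state)  # hard case: defer to System 2
    return system0
```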