Open-Book Neural Algorithmic Reasoning
- URL: http://arxiv.org/abs/2501.00072v1
- Date: Mon, 30 Dec 2024 02:14:58 GMT
- Title: Open-Book Neural Algorithmic Reasoning
- Authors: Hefei Li, Chao Peng, Chenyang Xu, Zhengfeng Yang
- Abstract summary: We propose a novel open-book learning framework for neural networks.
In this framework, the network can access and utilize all instances in the training dataset when reasoning for a given instance.
We show that this open-book attention mechanism offers insights into the inherent relationships among various tasks in the benchmark.
- Score: 5.057669848157507
- Abstract: Neural algorithmic reasoning is an emerging area of machine learning that focuses on building neural networks capable of solving complex algorithmic tasks. Recent advancements predominantly follow the standard supervised learning paradigm -- feeding an individual problem instance into the network each time and training it to approximate the execution steps of a classical algorithm. We challenge this mode and propose a novel open-book learning framework. In this framework, whether during training or testing, the network can access and utilize all instances in the training dataset when reasoning for a given instance. Empirical evaluation is conducted on the challenging CLRS Algorithmic Reasoning Benchmark, which consists of 30 diverse algorithmic tasks. Our open-book learning framework exhibits a significant enhancement in neural reasoning capabilities. Further, we notice that there is recent literature suggesting that multi-task training on CLRS can improve the reasoning accuracy of certain tasks, implying intrinsic connections between different algorithmic tasks. We delve into this direction via the open-book framework. When the network reasons for a specific task, we enable it to aggregate information from training instances of other tasks in an attention-based manner. We show that this open-book attention mechanism offers insights into the inherent relationships among various tasks in the benchmark and provides a robust tool for interpretable multi-task training.
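The attention-based aggregation described in the abstract can be illustrated with a small sketch. This is not the authors' architecture: it assumes each instance is already encoded as a fixed-size vector, uses plain scaled dot-product attention without learned projections, and the names `open_book_attention`, `query`, and `book` are hypothetical. It shows only the general idea of letting the current instance attend over stored training instances and inspecting the resulting weights.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - x.max()
    exp = np.exp(shifted)
    return exp / exp.sum()


def open_book_attention(query, book):
    """Aggregate training-instance embeddings for one query instance.

    query: (d,) embedding of the instance being reasoned about (assumed given).
    book:  (n, d) embeddings of n training instances (the "open book").
    Returns the augmented embedding (d,) and the attention weights (n,).
    """
    d = query.shape[-1]
    scores = book @ query / np.sqrt(d)   # similarity to each training instance
    weights = softmax(scores)            # non-negative, sums to 1
    retrieved = weights @ book           # weighted mix of training embeddings
    return query + retrieved, weights    # residual combination (an assumption)


# Toy data: 5 stored training instances with 8-dimensional embeddings.
rng = np.random.default_rng(0)
book = rng.normal(size=(5, 8))
query = rng.normal(size=8)

augmented, weights = open_book_attention(query, book)

# With identical rows in the book, every instance gets the same weight.
_, uniform_weights = open_book_attention(query, np.ones((4, 8)))
```

In a multi-task setting, the rows of `book` could come from training instances of other tasks, and the `weights` vector is the kind of quantity one might inspect to see which tasks a given task draws on.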
Related papers
- A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time".
It is based on differential equations that: (i) can be integrated without the need of external software solvers; (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks; (iii) open to novel perspectives.
arXiv Detail & Related papers (2024-09-18T14:57:13Z)
- Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning [12.608461657195367]
We study the multi-task structured bandit problem, where the goal is to learn a near-optimal algorithm that minimizes cumulative regret.
We use a transformer as a decision-making algorithm to learn this shared structure so as to generalize to the test task.
We show that our algorithm, without the knowledge of the underlying problem structure, can learn a near-optimal policy in-context.
arXiv Detail & Related papers (2024-06-07T16:34:31Z)
- Reasoning Algorithmically in Graph Neural Networks [1.8130068086063336]
We aim to integrate the structured and rule-based reasoning of algorithms with adaptive learning capabilities of neural networks.
This dissertation provides theoretical and practical contributions to this area of research.
arXiv Detail & Related papers (2024-02-21T12:16:51Z)
- The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks [59.26515696183751]
We show that algorithm discovery in neural networks is sometimes more complex.
We show that even simple learning problems can admit a surprising diversity of solutions.
arXiv Detail & Related papers (2023-06-30T17:59:13Z)
- Algorithm Design for Online Meta-Learning with Task Boundary Detection [63.284263611646]
We propose a novel algorithm for task-agnostic online meta-learning in non-stationary environments.
We first propose two simple but effective detection mechanisms of task switches and distribution shift.
We show that a sublinear task-averaged regret can be achieved for our algorithm under mild conditions.
arXiv Detail & Related papers (2023-02-02T04:02:49Z)
- Learning Good Features to Transfer Across Tasks and Domains [16.05821129333396]
We first show that such knowledge can be shared across tasks by learning a mapping between task-specific deep features in a given domain.
Then, we show that this mapping function, implemented by a neural network, is able to generalize to novel unseen domains.
arXiv Detail & Related papers (2023-01-26T18:49:39Z)
- Hierarchically Structured Task-Agnostic Continual Learning [0.0]
We take a task-agnostic view of continual learning and develop a hierarchical information-theoretic optimality principle.
We propose a neural network layer, called the Mixture-of-Variational-Experts layer, that alleviates forgetting by creating a set of information processing paths.
Our approach can operate in a task-agnostic way, i.e., it does not require task-specific knowledge, as is the case with many existing continual learning algorithms.
arXiv Detail & Related papers (2022-11-14T19:53:15Z)
- Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
- A Framework for Verifiable and Auditable Federated Anomaly Detection [3.639790324866155]
Federated Learning is an emerging approach to manage cooperation between a group of agents for the solution of Machine Learning tasks.
We present a novel algorithmic architecture that tackles this problem in the particular case of Anomaly Detection.
arXiv Detail & Related papers (2022-03-15T11:34:02Z)
- Online Learning Probabilistic Event Calculus Theories in Answer Set Programming [70.06301658267125]
Complex Event Recognition (CER) systems detect occurrences in streaming time-stamped datasets using predefined event patterns.
We present a system based on Answer Set Programming (ASP), capable of probabilistic reasoning with complex event patterns in the form of rules weighted in the Event Calculus.
Our results demonstrate the superiority of our novel approach, in terms of both efficiency and predictive performance.
arXiv Detail & Related papers (2021-03-31T23:16:29Z)
- Subset Sampling For Progressive Neural Network Learning [106.12874293597754]
Progressive Neural Network Learning is a class of algorithms that incrementally construct the network's topology and optimize its parameters based on the training data.
We propose to speed up this process by exploiting subsets of training data at each incremental training step.
Experimental results in object, scene and face recognition problems demonstrate that the proposed approach speeds up the optimization procedure considerably.
arXiv Detail & Related papers (2020-02-17T18:57:33Z)