Graph-based Heuristic Search for Module Selection Procedure in Neural
Module Network
- URL: http://arxiv.org/abs/2009.14759v1
- Date: Wed, 30 Sep 2020 15:55:44 GMT
- Title: Graph-based Heuristic Search for Module Selection Procedure in Neural
Module Network
- Authors: Yuxuan Wu and Hideki Nakayama
- Abstract summary: Graph-based Heuristic Search is our proposed algorithm for discovering the optimal program through a heuristic search on a data structure named the Program Graph.
Our experiments on the FigureQA and CLEVR datasets show that our method can train NMN without ground-truth programs.
- Score: 25.418899358703378
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural Module Network (NMN) is a machine learning model for solving
visual question answering tasks. NMN uses programs to encode module
structures, and its modularized architecture enables it to solve logical
problems more reasonably. However, because the module selection procedure is
non-differentiable, NMN is hard to train end-to-end. To overcome this
problem, existing work either includes ground-truth programs in the training data
or applies reinforcement learning to explore programs. However, both of
these methods still have weaknesses. With this in mind, we propose a
new learning framework for NMN. Graph-based Heuristic Search is the algorithm
we propose to discover the optimal program through a heuristic search on the
data structure named Program Graph. Our experiments on the FigureQA and CLEVR
datasets show that our method can train NMN without
ground-truth programs and explore programs more efficiently than existing
reinforcement learning methods.
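To make the idea of searching for a module layout concrete, below is a minimal Python sketch of a best-first search over candidate programs. It is an illustrative assumption, not the paper's implementation: the module vocabulary, the expansion rule, and the scoring function (standing in for evaluating an NMN with a candidate program on a validation batch) are all hypothetical, and the paper's Program Graph and heuristic may differ.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical module vocabulary; the real NMN modules depend on the dataset.
MODULES = ["find", "filter_color", "count", "compare", "answer"]

@dataclass(order=True)
class Node:
    """A node in the program graph: a (possibly partial) program."""
    priority: float
    program: Tuple[str, ...] = field(compare=False)

def expand(program: Tuple[str, ...], max_len: int) -> List[Tuple[str, ...]]:
    """Children of a node: the program extended by one module (an assumed expansion rule)."""
    if len(program) >= max_len:
        return []
    return [program + (m,) for m in MODULES]

def graph_heuristic_search(
    score: Callable[[Tuple[str, ...]], float],
    max_len: int = 4,
    budget: int = 200,
) -> Tuple[Tuple[str, ...], float]:
    """Best-first search over programs.

    `score` is assumed to evaluate a candidate program, e.g. the answer
    accuracy of an NMN executed with that module layout on a validation batch.
    Higher is better; priorities are negated because heapq is a min-heap.
    """
    start: Tuple[str, ...] = ()
    frontier: List[Node] = [Node(priority=0.0, program=start)]
    visited = set()
    best_program, best_score = start, float("-inf")

    for _ in range(budget):
        if not frontier:
            break
        node = heapq.heappop(frontier)
        if node.program in visited:
            continue
        visited.add(node.program)

        s = score(node.program)
        if s > best_score:
            best_program, best_score = node.program, s

        # Expand promising nodes first: children inherit the parent's score as priority.
        for child in expand(node.program, max_len):
            if child not in visited:
                heapq.heappush(frontier, Node(priority=-s, program=child))

    return best_program, best_score

if __name__ == "__main__":
    # Toy scoring function standing in for NMN evaluation on a batch.
    target = ("find", "filter_color", "count", "answer")
    def toy_score(p):
        return sum(a == b for a, b in zip(p, target)) / len(target)
    print(graph_heuristic_search(toy_score, budget=1000))
```

In this sketch the search budget bounds how many candidate programs are evaluated, which is where the claimed efficiency advantage over reinforcement-learning-based exploration would show up in practice.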
Related papers
- Searching Latent Program Spaces [0.0]
We propose an algorithm for program induction that learns a distribution over latent programs in a continuous space, enabling efficient search and test-time adaptation.
We show that it can generalize beyond its training distribution and adapt to unseen tasks by utilizing test-time adaptation mechanisms.
arXiv Detail & Related papers (2024-11-13T15:50:32Z)
- A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time".
It is based on differential equations that: (i) can be integrated without the need for external software solvers; (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks; and (iii) open up novel perspectives.
arXiv Detail & Related papers (2024-09-18T14:57:13Z)
- Mechanistic Neural Networks for Scientific Machine Learning [58.99592521721158]
We present Mechanistic Neural Networks, a neural network design for machine learning applications in the sciences.
It incorporates a new Mechanistic Block in standard architectures to explicitly learn governing differential equations as representations.
Central to our approach is a novel Relaxed Linear Programming solver (NeuRLP) inspired by a technique that reduces solving linear ODEs to solving linear programs.
arXiv Detail & Related papers (2024-02-20T15:23:24Z)
- Multimodal Representations for Teacher-Guided Compositional Visual Reasoning [0.0]
NMNs provide enhanced explainability compared to integrated models.
We propose to exploit features obtained by a large-scale cross-modal encoder.
We introduce an NMN learning strategy involving scheduled teacher guidance.
arXiv Detail & Related papers (2023-10-24T07:51:08Z)
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
- Decouple Graph Neural Networks: Train Multiple Simple GNNs Simultaneously Instead of One [60.5818387068983]
Graph neural networks (GNNs) suffer from severe inefficiency.
We propose to decouple a multi-layer GNN as multiple simple modules for more efficient training.
We show that the proposed framework is highly efficient with reasonable performance.
arXiv Detail & Related papers (2023-04-20T07:21:32Z)
- A Differentiable Approach to Combinatorial Optimization using Dataless Neural Networks [20.170140039052455]
We propose a radically different approach in that no data is required for training the neural networks that produce the solution.
In particular, we reduce the optimization problem to a neural network and employ a dataless training scheme to refine the parameters of the network such that those parameters yield the structure of interest.
arXiv Detail & Related papers (2022-03-15T19:21:31Z)
- Self Semi Supervised Neural Architecture Search for Semantic Segmentation [6.488575826304023]
We propose a Neural Architecture Search strategy based on self supervision and semi-supervised learning for the task of semantic segmentation.
Our approach builds an optimized neural network model for this task.
Experiments on the Cityscapes and PASCAL VOC 2012 datasets demonstrate that the discovered neural network is more efficient than a state-of-the-art hand-crafted NN model.
arXiv Detail & Related papers (2022-01-29T19:49:44Z)
- Learning to Execute Programs with Instruction Pointer Attention Graph Neural Networks [55.98291376393561]
Graph neural networks (GNNs) have emerged as a powerful tool for learning software engineering tasks.
Recurrent neural networks (RNNs) are well-suited to long sequential chains of reasoning, but they do not naturally incorporate program structure.
We introduce a novel GNN architecture, the Instruction Pointer Attention Graph Neural Networks (IPA-GNN), which improves systematic generalization on the task of learning to execute programs.
arXiv Detail & Related papers (2020-10-23T19:12:30Z)
- Strong Generalization and Efficiency in Neural Programs [69.18742158883869]
We study the problem of learning efficient algorithms that strongly generalize in the framework of neural program induction.
By carefully designing the input / output interfaces of the neural model and through imitation, we are able to learn models that produce correct results for arbitrary input sizes.
arXiv Detail & Related papers (2020-07-07T17:03:02Z)