Towards Better Out-of-Distribution Generalization of Neural Algorithmic
Reasoning Tasks
- URL: http://arxiv.org/abs/2211.00692v2
- Date: Sat, 18 Mar 2023 08:23:33 GMT
- Title: Towards Better Out-of-Distribution Generalization of Neural Algorithmic
Reasoning Tasks
- Authors: Sadegh Mahdavi, Kevin Swersky, Thomas Kipf, Milad Hashemi, Christos
Thrampoulidis, Renjie Liao
- Abstract summary: We study the OOD generalization of neural algorithmic reasoning tasks.
The goal is to learn an algorithm from input-output pairs using deep neural networks.
- Score: 51.8723187709964
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we study the OOD generalization of neural algorithmic
reasoning tasks, where the goal is to learn an algorithm (e.g., sorting,
breadth-first search, and depth-first search) from input-output pairs using
deep neural networks. First, we argue that OOD generalization in this setting
is significantly different from common OOD settings. For example, some
phenomena observed in OOD generalization for image classification, such as
\emph{accuracy on the line}, do not appear here, and data augmentation does
not help because the assumptions underlying many augmentation techniques are
often violated. Second, we analyze the main challenges (e.g., input
distribution shift, non-representative data generation, and uninformative
validation metrics) of the current leading benchmark, i.e., CLRS
\citep{deepmind2021clrs}, which contains 30 algorithmic reasoning tasks. We
propose several solutions, including a simple-yet-effective fix to the input
distribution shift and improved data generation. Finally, we propose an
attention-based 2WL-graph neural network (GNN) processor which complements
message-passing GNNs so their combination outperforms the state-of-the-art
model by a 3% margin averaged over all algorithms. Our code is available at:
\url{https://github.com/smahdavi4/clrs}.
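The abstract contrasts a message-passing GNN processor with an attention-based one and combines the two. A toy, hypothetical sketch of that idea (one message-passing round plus a neighbor-masked attention step on a small graph; the shapes and update rules are illustrative and not the paper's actual 2WL processor):

```python
import numpy as np

rng = np.random.default_rng(0)

n, d = 4, 8                      # 4 nodes, 8-dim features
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 1],
                [0, 1, 1, 0]], dtype=float)
x = rng.normal(size=(n, d))

# --- Message passing: each node sums transformed neighbor features ---
W_msg = rng.normal(size=(d, d)) / np.sqrt(d)
messages = adj @ (x @ W_msg)           # (n, d) aggregated messages
h_mp = np.maximum(x + messages, 0.0)   # residual update + ReLU

# --- Attention: scores over neighbors, masked by adjacency ---
W_q = rng.normal(size=(d, d)) / np.sqrt(d)
W_k = rng.normal(size=(d, d)) / np.sqrt(d)
scores = (x @ W_q) @ (x @ W_k).T / np.sqrt(d)   # (n, n)
scores = np.where(adj > 0, scores, -np.inf)     # attend only to neighbors
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
weights = weights / weights.sum(axis=1, keepdims=True)
h_att = weights @ x                    # attention-weighted neighbor mix

# --- Combine the two processors' outputs ---
h = np.concatenate([h_mp, h_att], axis=1)   # (n, 2d)
print(h.shape)
```

The combination here is a plain concatenation; the paper's processor is more involved, but the complementarity of the two aggregation schemes is the point.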
Related papers
- Algorithm-Informed Graph Neural Networks for Leakage Detection and Localization in Water Distribution Networks [6.675805308519987]
Leakages are a significant challenge for the efficient and sustainable management of water distribution networks.
Recent approaches have used graph-based data-driven methods.
We propose an algorithm-informed graph neural network (AIGNN) to detect and localize leaks.
arXiv Detail & Related papers (2024-08-05T19:25:05Z)
- Ensemble Quadratic Assignment Network for Graph Matching [52.20001802006391]
Graph matching is a commonly used technique in computer vision and pattern recognition.
Recent data-driven approaches have improved the graph matching accuracy remarkably.
We propose a graph neural network (GNN) based approach to combine the advantages of data-driven and traditional methods.
arXiv Detail & Related papers (2024-03-11T06:34:05Z)
- Triplet Edge Attention for Algorithmic Reasoning [16.130097693973845]
We introduce a new graph neural network layer called Triplet Edge Attention (TEA), an edge-aware graph attention layer.
Our algorithm works by precisely computing edge latents and aggregating multiple triplet messages using edge-based attention.
arXiv Detail & Related papers (2023-12-09T16:46:28Z)
- Latent Space Representations of Neural Algorithmic Reasoners [15.920449080528536]
We perform a detailed analysis of the structure of the latent space induced by the GNN when executing algorithms.
We identify two possible failure modes: (i) loss of resolution, making it hard to distinguish similar values; (ii) inability to deal with values outside the range observed during training.
We show that these changes lead to improvements on the majority of algorithms in the standard CLRS-30 benchmark when using the state-of-the-art Triplet-GMPNN processor.
arXiv Detail & Related papers (2023-07-17T22:09:12Z)
- Neural Algorithmic Reasoning with Causal Regularisation [18.299363749150093]
We make an important observation: there are many different inputs for which an algorithm will perform certain intermediate computations identically.
This insight allows us to develop data augmentation procedures that, given an algorithm's intermediate trajectory, produce inputs for which the target algorithm would have exactly the same next trajectory step.
We prove that the resulting method, which we call Hint-ReLIC, improves the OOD generalisation capabilities of the reasoner.
arXiv Detail & Related papers (2023-02-20T19:41:15Z)
- Unsupervised Learning of Initialization in Deep Neural Networks via Maximum Mean Discrepancy [74.34895342081407]
We propose an unsupervised algorithm to find good initialization for input data.
We first notice that each parameter configuration in the parameter space corresponds to one particular downstream task of d-way classification.
We then conjecture that the success of learning is directly related to how diverse downstream tasks are in the vicinity of the initial parameters.
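The distance named in this paper's title, Maximum Mean Discrepancy (MMD), measures how far apart two sample distributions are. A minimal, illustrative sketch with a Gaussian (RBF) kernel; it does not reproduce the paper's initialization procedure, only the criterion itself:

```python
import numpy as np

def mmd2(x, y, gamma=1.0):
    """Biased estimate of squared MMD between samples x (n, d) and y (m, d)."""
    def rbf(a, b):
        sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq)
    # ||mean embedding of x - mean embedding of y||^2 in the RKHS
    return rbf(x, x).mean() + rbf(y, y).mean() - 2 * rbf(x, y).mean()

rng = np.random.default_rng(0)
same = mmd2(rng.normal(size=(200, 2)), rng.normal(size=(200, 2)))
shifted = mmd2(rng.normal(size=(200, 2)), rng.normal(size=(200, 2)) + 3.0)
print(same < shifted)  # distributions that differ score higher
```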
arXiv Detail & Related papers (2023-02-08T23:23:28Z)
- Invertible Neural Networks for Graph Prediction [22.140275054568985]
In this work, we address conditional generation using deep invertible neural networks.
We adopt an end-to-end training approach since our objective is to address prediction and generation in the forward and backward processes at once.
arXiv Detail & Related papers (2022-06-02T17:28:33Z)
- Robustification of Online Graph Exploration Methods [59.50307752165016]
We study a learning-augmented variant of the classical, notoriously hard online graph exploration problem.
We propose an algorithm that naturally integrates predictions into the well-known Nearest Neighbor (NN) algorithm.
arXiv Detail & Related papers (2021-12-10T10:02:31Z)
- Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case [93.37576644429578]
Graph neural networks (GNNs) have made great progress recently on learning from graph-structured data in practice.
We provide a theoretically-grounded generalizability analysis of GNNs with one hidden layer for both regression and binary classification problems.
arXiv Detail & Related papers (2020-06-25T00:45:52Z)
- Learning to Hash with Graph Neural Networks for Recommender Systems [103.82479899868191]
Graph representation learning has attracted much attention in supporting high quality candidate search at scale.
Despite its effectiveness in learning embedding vectors for objects in the user-item interaction network, the computational costs to infer users' preferences in continuous embedding space are tremendous.
We propose a simple yet effective discrete representation learning framework to jointly learn continuous and discrete codes.
arXiv Detail & Related papers (2020-03-04T06:59:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.