Learning to Solve Constraint Satisfaction Problems with Recurrent Transformer
- URL: http://arxiv.org/abs/2307.04895v1
- Date: Mon, 10 Jul 2023 20:35:22 GMT
- Title: Learning to Solve Constraint Satisfaction Problems with Recurrent Transformer
- Authors: Zhun Yang, Adam Ishay, Joohyung Lee
- Abstract summary: We show that Transformer extended with recurrence is a viable approach to learning to solve constraint satisfaction problems.
With the ability of Transformer to handle visual input, the proposed Recurrent Transformer can straightforwardly be applied to visual constraint reasoning problems.
We also show how to leverage deductive knowledge of discrete constraints in the Transformer's inductive learning to achieve sample-efficient learning.
- Score: 5.532477732693001
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Constraint satisfaction problems (CSPs) are about finding values of variables
that satisfy the given constraints. We show that Transformer extended with
recurrence is a viable approach to learning to solve CSPs in an end-to-end
manner, having clear advantages over state-of-the-art methods such as Graph
Neural Networks, SATNet, and some neuro-symbolic models. With the ability of
Transformer to handle visual input, the proposed Recurrent Transformer can
straightforwardly be applied to visual constraint reasoning problems while
successfully addressing the symbol grounding problem. We also show how to
leverage deductive knowledge of discrete constraints in the Transformer's
inductive learning to achieve sample-efficient learning and semi-supervised
learning for CSPs.
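The abstract names two technical ideas: applying a Transformer recurrently so it refines its CSP predictions over multiple steps, and injecting deductive knowledge of discrete constraints into the inductive training signal. The PyTorch sketch below is a hedged illustration of both ideas on a Sudoku-style CSP; the module structure, layer sizes, feedback scheme, and the row-constraint penalty are assumptions for illustration, not the authors' implementation.

# A minimal sketch (not the paper's code): a Transformer encoder applied
# recurrently to refine per-cell predictions for a Sudoku-style CSP.
import torch
import torch.nn as nn

class RecurrentTransformerCSP(nn.Module):
    def __init__(self, num_cells=81, num_values=9, d_model=128,
                 nhead=4, num_layers=2, num_steps=16):
        super().__init__()
        self.num_steps = num_steps                        # recurrent refinement steps
        self.value_embed = nn.Linear(num_values + 1, d_model)  # +1 "unknown" channel
        self.pos_embed = nn.Parameter(torch.randn(num_cells, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.readout = nn.Linear(d_model, num_values)     # per-cell value logits

    def forward(self, board):
        # board: (batch, num_cells, num_values + 1); clue cells are one-hot,
        # empty cells are marked in the extra "unknown" channel.
        state, logits_per_step = board, []
        for _ in range(self.num_steps):
            h = self.encoder(self.value_embed(state) + self.pos_embed)
            logits = self.readout(h)
            logits_per_step.append(logits)
            # Feed the current soft prediction back as the next input state.
            state = torch.cat([logits.softmax(-1), board[..., -1:]], dim=-1)
        return logits_per_step                            # supervise every step

def row_constraint_penalty(logits):
    # Illustration of injecting discrete-constraint knowledge as a differentiable
    # penalty: each Sudoku row should use every value exactly once, so the
    # per-row expected count of each value should be 1.
    probs = logits.softmax(-1).view(-1, 9, 9, 9)          # (batch, row, col, value)
    return ((probs.sum(dim=2) - 1.0) ** 2).mean()

if __name__ == "__main__":
    model = RecurrentTransformerCSP()
    board = torch.zeros(2, 81, 10)
    board[..., -1] = 1.0                                  # all cells unknown
    final_logits = model(board)[-1]
    print(final_logits.shape, row_constraint_penalty(final_logits).item())

In this sketch, a constraint penalty like row_constraint_penalty would be added to the supervised cross-entropy loss so that unlabeled or partially labeled boards still provide a training signal, which is one plausible reading of how discrete constraints can support sample-efficient and semi-supervised learning.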
Related papers
- Self-Supervised Transformers as Iterative Solution Improvers for Constraint Satisfaction [44.67050228129961]
We present a Transformer-based framework for Constraint Satisfaction Problems (CSPs).
CSPs find use in many applications and thus accelerating their solution with machine learning is of wide interest.
We propose ConsFormer, a self-supervised framework that leverages a Transformer as a solution refiner.
arXiv Detail & Related papers (2025-02-18T16:51:01Z)
- Learning Linear Attention in Polynomial Time [115.68795790532289]
We provide the first results on learnability of single-layer Transformers with linear attention.
We show that linear attention may be viewed as a linear predictor in a suitably defined RKHS.
We show how to efficiently identify training datasets for which every empirical risk minimizer is equivalent to the linear Transformer.
arXiv Detail & Related papers (2024-10-14T02:41:01Z)
- Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning? [69.4145579827826]
We show fast convergence of gradient flow on the regression loss despite the non-convexity of the loss landscape.
This is the first theoretical analysis for multi-layer Transformer in this setting.
arXiv Detail & Related papers (2024-10-10T18:29:05Z)
- PseudoNeg-MAE: Self-Supervised Point Cloud Learning using Conditional Pseudo-Negative Embeddings [55.55445978692678]
PseudoNeg-MAE enhances global feature representation of point cloud masked autoencoders by making them both discriminative and sensitive to transformations.
We propose a novel loss that explicitly penalizes invariant collapse, enabling the network to capture richer transformation cues while preserving discriminative representations.
arXiv Detail & Related papers (2024-09-24T07:57:21Z)
- Transformers are Provably Optimal In-context Estimators for Wireless Communications [12.756143424752363]
We show that multi-layer transformers can efficiently solve in-context estimation problems.
We also prove that the optimal configuration of such transformers is indeed the minimizer of the corresponding training loss.
arXiv Detail & Related papers (2023-11-01T02:16:24Z)
- Resilient Constrained Learning [94.27081585149836]
This paper presents a constrained learning approach that adapts the requirements while simultaneously solving the learning task.
We call this approach resilient constrained learning after the term used to describe ecological systems that adapt to disruptions by modifying their operation.
arXiv Detail & Related papers (2023-06-04T18:14:18Z)
- GSB: Group Superposition Binarization for Vision Transformer with Limited Training Samples [46.025105938192624]
Vision Transformer (ViT) has performed remarkably in various computer vision tasks.
ViT usually suffers from serious overfitting problems with a relatively limited number of training samples.
We propose a novel model binarization technique, called Group Superposition Binarization (GSB).
arXiv Detail & Related papers (2023-05-13T14:48:09Z)
- Scaling Laws Beyond Backpropagation [64.0476282000118]
We study the ability of Direct Feedback Alignment to train causal decoder-only Transformers efficiently.
We find that DFA fails to offer more efficient scaling than backpropagation.
arXiv Detail & Related papers (2022-10-26T10:09:14Z)
- Vision Transformer Equipped with Neural Resizer on Facial Expression Recognition Task [1.3048920509133808]
We propose a novel training framework, Neural Resizer, to support Transformer by compensating information and downscaling in a data-driven manner.
Experiments show our Neural Resizer with F-PDLS loss function improves the performance with Transformer variants in general.
arXiv Detail & Related papers (2022-04-05T13:04:04Z)
- AdaViT: Adaptive Tokens for Efficient Vision Transformer [91.88404546243113]
We introduce AdaViT, a method that adaptively adjusts the inference cost of vision transformer (ViT) for images of different complexity.
AdaViT achieves this by automatically reducing the number of tokens in vision transformers that are processed in the network as inference proceeds.
arXiv Detail & Related papers (2021-12-14T18:56:07Z)
- Characterizing the loss landscape of variational quantum circuits [77.34726150561087]
We introduce a way to compute the Hessian of the loss function of VQCs.
We show how this information can be interpreted and compared to classical neural networks.
arXiv Detail & Related papers (2020-08-06T17:48:12Z)
- Sparse aNETT for Solving Inverse Problems with Deep Learning [2.5234156040689237]
We propose a sparse reconstruction framework (aNETT) for solving inverse problems.
We train an autoencoder network $D \circ E$ with $E$ acting as a nonlinear sparsifying transform.
Numerical results are presented for sparse view CT.
arXiv Detail & Related papers (2020-04-20T18:43:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.