Join-Chain Network: A Logical Reasoning View of the Multi-head Attention in Transformer
- URL: http://arxiv.org/abs/2210.02729v2
- Date: Fri, 7 Oct 2022 04:18:47 GMT
- Title: Join-Chain Network: A Logical Reasoning View of the Multi-head Attention in Transformer
- Authors: Jianyi Zhang, Yiran Chen, Jianshu Chen
- Abstract summary: We propose a symbolic reasoning architecture that chains many join operators together to model output logical expressions.
In particular, we demonstrate that such an ensemble of join-chains can express a broad subset of "tree-structured" first-order logical expressions, named FOET.
We find that the widely used multi-head self-attention module in transformer can be understood as a special neural operator that implements the union bound of the join operator in probabilistic predicate space.
- Score: 59.73454783958702
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Developing neural architectures that are capable of logical reasoning has
become increasingly important for a wide range of applications (e.g., natural
language processing). Towards this grand objective, we propose a symbolic
reasoning architecture that chains many join operators together to model output
logical expressions. In particular, we demonstrate that such an ensemble of
join-chains can express a broad subset of "tree-structured" first-order
logical expressions, named FOET, which is particularly useful for modeling
natural languages. To endow it with differentiable learning capability, we
closely examine various neural operators for approximating the symbolic
join-chains. Interestingly, we find that the widely used multi-head
self-attention module in transformer can be understood as a special neural
operator that implements the union bound of the join operator in probabilistic
predicate space. Our analysis not only provides a new perspective on the
mechanism of the pretrained models such as BERT for natural language
understanding but also suggests several important future improvement
directions.
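To make the claimed correspondence concrete, below is a minimal NumPy sketch of a single probabilistic join step and its attention-style relaxation. It assumes a soft-logic reading in which a binary predicate p(x, y) over token pairs and a unary predicate q(y) over tokens take values in [0, 1]; the function names (symbolic_join, attention_join) and the max-min fuzzy relaxation are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

# Illustrative sketch of one "join" over probabilistic predicates.
# Given a binary predicate p(x, y) in [0, 1] over token pairs and a
# unary predicate q(y) in [0, 1] over tokens, the symbolic join is
#   r(x) = exists y . p(x, y) AND q(y).
# A union-bound style relaxation replaces the existential with a sum,
#   r(x) ~ sum_y p(x, y) * q(y),
# which has the same form as one attention head: weights times values.

def symbolic_join(p, q):
    """Fuzzy-logic join: max over y of min(p(x, y), q(y))."""
    return np.max(np.minimum(p, q[None, :]), axis=1)

def attention_join(scores, q):
    """Attention-style join: the row-softmax of the pair scores plays the
    role of p(x, y); the weighted sum realizes the union bound."""
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights = weights / weights.sum(axis=1, keepdims=True)  # row softmax
    return weights @ q  # sum_y p(x, y) * q(y)

# Toy example with 4 tokens.
rng = np.random.default_rng(0)
scores = rng.normal(size=(4, 4))                 # unnormalized pair scores
p = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
q = rng.uniform(size=4)                          # unary predicate values

print(symbolic_join(p, q))        # hard, fuzzy-logic join
print(attention_join(scores, q))  # soft, attention-style approximation
```

In this reading, one attention head corresponds to one relaxed join, and the multi-head module combines several such joins in parallel; the sketch shows only a single head.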
Related papers
- LOGICSEG: Parsing Visual Semantics with Neural Logic Learning and Reasoning [73.98142349171552]
LOGICSEG is a holistic visual semantic parser that integrates neural inductive learning and logic reasoning with both rich data and symbolic knowledge.
During fuzzy logic-based continuous relaxation, logical formulae are grounded onto data and neural computational graphs, hence enabling logic-induced network training.
These designs together make LOGICSEG a general and compact neural-logic machine that is readily integrated into existing segmentation models.
arXiv Detail & Related papers (2023-09-24T05:43:19Z)
- Modeling Hierarchical Reasoning Chains by Linking Discourse Units and Key Phrases for Reading Comprehension [80.99865844249106]
We propose a holistic graph network (HGN) which deals with context at both the discourse level and the word level as the basis for logical reasoning.
Specifically, node-level and type-level relations, which can be interpreted as bridges in the reasoning process, are modeled by a hierarchical interaction mechanism.
arXiv Detail & Related papers (2023-06-21T07:34:27Z)
- SNeL: A Structured Neuro-Symbolic Language for Entity-Based Multimodal Scene Understanding [0.0]
We introduce SNeL (Structured Neuro-symbolic Language), a versatile query language designed to facilitate nuanced interactions with neural networks processing multimodal data.
SNeL's expressive interface enables the construction of intricate queries, supporting logical and arithmetic operators, comparators, nesting, and more.
Our evaluations demonstrate SNeL's potential to reshape the way we interact with complex neural networks.
arXiv Detail & Related papers (2023-06-09T17:01:51Z)
- Interpretable Multimodal Misinformation Detection with Logic Reasoning [40.851213962307206]
We propose a novel logic-based neural model for multimodal misinformation detection.
We parameterize symbolic logical elements using neural representations, which facilitate the automatic generation and evaluation of meaningful logic clauses.
Results on three public datasets demonstrate the feasibility and versatility of our model.
arXiv Detail & Related papers (2023-05-10T08:16:36Z)
- Learning Language Representations with Logical Inductive Bias [19.842271716111153]
We explore a new logical inductive bias for better language representation learning.
We develop a novel neural architecture named FOLNet to encode this new inductive bias.
We find that the self-attention module in transformers can be composed by two of our neural logic operators.
arXiv Detail & Related papers (2023-02-19T02:21:32Z)
- LogiGAN: Learning Logical Reasoning via Adversarial Pre-training [58.11043285534766]
We present LogiGAN, an unsupervised adversarial pre-training framework for improving logical reasoning abilities of language models.
Inspired by the facilitation effect of reflective thinking in human learning, we simulate the learning-thinking process with an adversarial Generator-Verifier architecture.
Both base- and large-size language models pre-trained with LogiGAN demonstrate clear performance improvements on 12 datasets.
arXiv Detail & Related papers (2022-05-18T08:46:49Z)
- Emergence of Machine Language: Towards Symbolic Intelligence with Neural Networks [73.94290462239061]
We propose to combine symbolism and connectionism principles by using neural networks to derive a discrete representation.
By designing an interactive environment and task, we demonstrated that machines could generate a spontaneous, flexible, and semantic language.
arXiv Detail & Related papers (2022-01-14T14:54:58Z)
- Discrete-Valued Neural Communication [85.3675647398994]
We show that restricting the transmitted information among components to discrete representations is a beneficial bottleneck.
Even though individuals have different understandings of what a "cat" is based on their specific experiences, the shared discrete token makes it possible for communication among individuals to be unimpeded by individual differences in internal representation.
We extend the quantization mechanism from the Vector-Quantized Variational Autoencoder to multi-headed discretization with shared codebooks and use it for discrete-valued neural communication.
arXiv Detail & Related papers (2021-07-06T03:09:25Z)
- Logic Tensor Networks [9.004005678155023]
We present Logic Tensor Networks (LTN), a neurosymbolic formalism and computational model that supports learning and reasoning.
We show that LTN provides a uniform language for the specification and the computation of several AI tasks.
arXiv Detail & Related papers (2020-12-25T22:30:18Z)