Join-Chain Network: A Logical Reasoning View of the Multi-head Attention in Transformer
- URL: http://arxiv.org/abs/2210.02729v2
- Date: Fri, 7 Oct 2022 04:18:47 GMT
- Title: Join-Chain Network: A Logical Reasoning View of the Multi-head Attention in Transformer
- Authors: Jianyi Zhang, Yiran Chen, Jianshu Chen
- Abstract summary: We propose a symbolic reasoning architecture that chains many join operators together to model output logical expressions.
In particular, we demonstrate that such an ensemble of join-chains can express a broad subset of "tree-structured" first-order logical expressions, named FOET.
We find that the widely used multi-head self-attention module in transformer can be understood as a special neural operator that implements the union bound of the join operator in probabilistic predicate space.
- Score: 59.73454783958702
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Developing neural architectures that are capable of logical reasoning has
become increasingly important for a wide range of applications (e.g., natural
language processing). Towards this grand objective, we propose a symbolic
reasoning architecture that chains many join operators together to model output
logical expressions. In particular, we demonstrate that such an ensemble of
join-chains can express a broad subset of "tree-structured" first-order
logical expressions, named FOET, which is particularly useful for modeling
natural languages. To endow it with differentiable learning capability, we
closely examine various neural operators for approximating the symbolic
join-chains. Interestingly, we find that the widely used multi-head
self-attention module in transformer can be understood as a special neural
operator that implements the union bound of the join operator in probabilistic
predicate space. Our analysis not only provides a new perspective on the
mechanism of the pretrained models such as BERT for natural language
understanding but also suggests several important future improvement
directions.
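To make the claimed correspondence concrete, below is a minimal NumPy sketch of a single probabilistic join step and its attention-style relaxation. It assumes a soft-logic reading in which a binary predicate p(x, y) over token pairs and a unary predicate q(y) over tokens take values in [0, 1]; the function names (symbolic_join, attention_join) and the max-min fuzzy relaxation are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

# Illustrative sketch of one "join" over probabilistic predicates.
# Given a binary predicate p(x, y) in [0, 1] over token pairs and a
# unary predicate q(y) in [0, 1] over tokens, the symbolic join is
#   r(x) = exists y . p(x, y) AND q(y).
# A union-bound style relaxation replaces the existential with a sum,
#   r(x) ~ sum_y p(x, y) * q(y),
# which has the same form as one attention head: weights times values.

def symbolic_join(p, q):
    """Fuzzy-logic join: max over y of min(p(x, y), q(y))."""
    return np.max(np.minimum(p, q[None, :]), axis=1)

def attention_join(scores, q):
    """Attention-style join: the row-softmax of the pair scores plays the
    role of p(x, y); the weighted sum realizes the union bound."""
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights = weights / weights.sum(axis=1, keepdims=True)  # row softmax
    return weights @ q  # sum_y p(x, y) * q(y)

# Toy example with 4 tokens.
rng = np.random.default_rng(0)
scores = rng.normal(size=(4, 4))                 # unnormalized pair scores
p = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
q = rng.uniform(size=4)                          # unary predicate values

print(symbolic_join(p, q))        # hard, fuzzy-logic join
print(attention_join(scores, q))  # soft, attention-style approximation
```

In this reading, one attention head corresponds to one relaxed join, and the multi-head module combines several such joins in parallel; the sketch shows only a single head.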
Related papers
- LOGICSEG: Parsing Visual Semantics with Neural Logic Learning and Reasoning [73.98142349171552]
LOGICSEG is a holistic visual semantic parser that integrates neural inductive learning and logic reasoning with both rich data and symbolic knowledge.
During fuzzy logic-based continuous relaxation, logical formulae are grounded onto data and neural computational graphs, hence enabling logic-induced network training.
These designs together make LOGICSEG a general and compact neural-logic machine that is readily integrated into existing segmentation models.
arXiv Detail & Related papers (2023-09-24T05:43:19Z)
- Modeling Hierarchical Reasoning Chains by Linking Discourse Units and Key Phrases for Reading Comprehension [80.99865844249106]
We propose a holistic graph network (HGN) which deals with context at both the discourse level and the word level as the basis for logical reasoning.
Specifically, node-level and type-level relations, which can be interpreted as bridges in the reasoning process, are modeled by a hierarchical interaction mechanism.
arXiv Detail & Related papers (2023-06-21T07:34:27Z)
- SNeL: A Structured Neuro-Symbolic Language for Entity-Based Multimodal Scene Understanding [0.0]
We introduce SNeL (Structured Neuro-symbolic Language), a versatile query language designed to facilitate nuanced interactions with neural networks processing multimodal data.
SNeL's expressive interface enables the construction of intricate queries, supporting logical and arithmetic operators, comparators, nesting, and more.
Our evaluations demonstrate SNeL's potential to reshape the way we interact with complex neural networks.
arXiv Detail & Related papers (2023-06-09T17:01:51Z)
- Interpretable Multimodal Misinformation Detection with Logic Reasoning [40.851213962307206]
We propose a novel logic-based neural model for multimodal misinformation detection.
We parameterize symbolic logical elements using neural representations, which facilitate the automatic generation and evaluation of meaningful logic clauses.
Results on three public datasets demonstrate the feasibility and versatility of our model.
arXiv Detail & Related papers (2023-05-10T08:16:36Z)
- Learning Language Representations with Logical Inductive Bias [19.842271716111153]
We explore a new logical inductive bias for better language representation learning.
We develop a novel neural architecture named FOLNet to encode this new inductive bias.
We find that the self-attention module in transformers can be composed by two of our neural logic operators.
arXiv Detail & Related papers (2023-02-19T02:21:32Z)
- LogiGAN: Learning Logical Reasoning via Adversarial Pre-training [58.11043285534766]
We present LogiGAN, an unsupervised adversarial pre-training framework for improving logical reasoning abilities of language models.
Inspired by the facilitation effect of reflective thinking in human learning, we simulate the learning-thinking process with an adversarial Generator-Verifier architecture.
Both base- and large-size language models pre-trained with LogiGAN demonstrate clear performance improvements on 12 datasets.
arXiv Detail & Related papers (2022-05-18T08:46:49Z)
- Emergence of Machine Language: Towards Symbolic Intelligence with Neural Networks [73.94290462239061]
We propose to combine symbolism and connectionism principles by using neural networks to derive a discrete representation.
By designing an interactive environment and task, we demonstrated that machines could generate a spontaneous, flexible, and semantic language.
arXiv Detail & Related papers (2022-01-14T14:54:58Z)
- Discrete-Valued Neural Communication [85.3675647398994]
We show that restricting the transmitted information among components to discrete representations is a beneficial bottleneck.
Even though individuals have different understandings of what a "cat" is based on their specific experiences, the shared discrete token makes it possible for communication among individuals to be unimpeded by individual differences in internal representation.
We extend the quantization mechanism from the Vector-Quantized Variational Autoencoder to multi-headed discretization with shared codebooks and use it for discrete-valued neural communication.
arXiv Detail & Related papers (2021-07-06T03:09:25Z)
- Logic Tensor Networks [9.004005678155023]
We present Logic Tensor Networks (LTN), a neurosymbolic formalism and computational model that supports learning and reasoning.
We show that LTN provides a uniform language for the specification and the computation of several AI tasks.
arXiv Detail & Related papers (2020-12-25T22:30:18Z)