Zero-shot Compositional Action Recognition with Neural Logic Constraints
- URL: http://arxiv.org/abs/2508.02320v1
- Date: Mon, 04 Aug 2025 11:40:42 GMT
- Title: Zero-shot Compositional Action Recognition with Neural Logic Constraints
- Authors: Gefan Ye, Lin Li, Kexin Li, Jun Xiao, Long Chen
- Abstract summary: ZS-CAR aims to identify unseen verb-object compositions in videos by exploiting the learned knowledge of verb and object primitives during training. Despite compositional learning's progress, two critical challenges persist: 1) a missing compositional structure constraint, leading to spurious correlations between primitives; 2) a neglected semantic hierarchy constraint, leading to semantic ambiguity and impairing the training process. We argue that human-like symbolic reasoning offers a principled solution to these challenges by explicitly modeling compositional and hierarchical structured abstraction.
- Score: 15.451848952659343
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-shot compositional action recognition (ZS-CAR) aims to identify unseen verb-object compositions in videos by exploiting the learned knowledge of verb and object primitives during training. Despite compositional learning's progress in ZS-CAR, two critical challenges persist: 1) a missing compositional structure constraint, leading to spurious correlations between primitives; 2) a neglected semantic hierarchy constraint, leading to semantic ambiguity that impairs the training process. In this paper, we argue that human-like symbolic reasoning offers a principled solution to these challenges by explicitly modeling compositional and hierarchical structured abstraction. To this end, we propose a logic-driven ZS-CAR framework, LogicCAR, that integrates dual symbolic constraints: Explicit Compositional Logic and Hierarchical Primitive Logic. Specifically, the former models the restrictions within compositions, enhancing the compositional reasoning ability of our model. The latter captures the semantic dependencies among different primitives, equipping the model with fine-to-coarse reasoning capacity. By formalizing these constraints in first-order logic and embedding them into neural network architectures, LogicCAR systematically bridges the gap between symbolic abstraction and existing models. Extensive experiments on the Sth-com dataset demonstrate that LogicCAR outperforms existing baseline methods, confirming the effectiveness of our logic-driven constraints.
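The abstract describes, but does not spell out, how a first-order rule is compiled into network training. Below is a minimal, hypothetical sketch of that general recipe (the tensor names, shapes, and `comp_index` mapping are assumptions for illustration, not the authors' API): a compositional rule such as comp(v, o) ⇒ verb(v) ∧ object(o) is grounded with a product t-norm and its violation added as a differentiable penalty.

```python
import torch
import torch.nn.functional as F

def compositional_logic_penalty(comp_logits, verb_logits, obj_logits, comp_index):
    """Soft penalty for the rule comp(v, o) => verb(v) AND object(o).

    comp_logits: (B, C) scores over verb-object compositions
    verb_logits: (B, V) scores over verb primitives
    obj_logits:  (B, O) scores over object primitives
    comp_index:  (C, 2) long tensor mapping composition c to its (verb, object) pair
    """
    p_comp = comp_logits.softmax(dim=-1)  # (B, C)
    p_verb = verb_logits.softmax(dim=-1)  # (B, V)
    p_obj = obj_logits.softmax(dim=-1)    # (B, O)

    # Product t-norm grounding of the rule body:
    # truth(verb(v) AND object(o)) ~= p_verb[v] * p_obj[o]
    body = p_verb[:, comp_index[:, 0]] * p_obj[:, comp_index[:, 1]]  # (B, C)

    # A fuzzy implication a => b is violated to the degree max(0, a - b), so
    # predicting a composition whose primitives are unlikely incurs a penalty.
    violation = F.relu(p_comp - body)
    return violation.sum(dim=-1).mean()

# Example: 2 videos, 6 compositions built from 3 verbs and 2 objects.
comp_index = torch.tensor([[v, o] for v in range(3) for o in range(2)])
penalty = compositional_logic_penalty(
    torch.randn(2, 6), torch.randn(2, 3), torch.randn(2, 2), comp_index
)
```

In practice such a penalty would be weighted and added to the usual cross-entropy terms, e.g. `loss = ce_loss + lambda_logic * penalty`.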
Related papers
- Improving Chain-of-Thought Reasoning via Quasi-Symbolic Abstractions [45.950841507164064]
Chain-of-Thought (CoT) represents a common strategy for reasoning in Large Language Models. We present QuaSAR, a variation of CoT that guides LLMs to operate at a higher level of abstraction via quasi-symbolic explanations. Our experiments show that quasi-symbolic abstractions can improve CoT-based methods by up to 8% accuracy.
arXiv Detail & Related papers (2025-02-18T07:58:48Z) - Improving Complex Reasoning over Knowledge Graph with Logic-Aware Curriculum Tuning [89.89857766491475]
We propose a curriculum-based logic-aware instruction tuning framework, named LACT. Specifically, we augment arbitrary first-order logical queries via binary tree decomposition. Experiments across widely used datasets demonstrate that LACT brings substantial improvements (an average +5.5% MRR score) over advanced methods, achieving a new state-of-the-art.
arXiv Detail & Related papers (2024-05-02T18:12:08Z) - Learning with Logical Constraints but without Shortcut Satisfaction [23.219364371311084]
We present a new framework for learning with logical constraints.
Specifically, we address the shortcut satisfaction issue by introducing dual variables for logical connectives.
We propose a variational framework where the encoded logical constraint is expressed as a distributional loss.
arXiv Detail & Related papers (2024-03-01T07:17:20Z) - LOGICSEG: Parsing Visual Semantics with Neural Logic Learning and Reasoning [73.98142349171552]
LOGICSEG is a holistic visual semantic parser that integrates neural inductive learning and logic reasoning with both rich data and symbolic knowledge.
During fuzzy logic-based continuous relaxation, logical formulae are grounded onto data and neural computational graphs, enabling logic-induced network training (a minimal sketch of this kind of relaxation appears after this list).
These designs together make LOGICSEG a general and compact neural-logic machine that is readily integrated into existing segmentation models.
arXiv Detail & Related papers (2023-09-24T05:43:19Z) - Modeling Hierarchical Reasoning Chains by Linking Discourse Units and Key Phrases for Reading Comprehension [80.99865844249106]
We propose a holistic graph network (HGN) that handles context at both the discourse level and the word level as the basis for logical reasoning.
Specifically, node-level and type-level relations, which can be interpreted as bridges in the reasoning process, are modeled by a hierarchical interaction mechanism.
arXiv Detail & Related papers (2023-06-21T07:34:27Z) - Query Structure Modeling for Inductive Logical Reasoning Over Knowledge Graphs [67.043747188954]
We propose a structure-modeled textual encoding framework for inductive logical reasoning over KGs.
It encodes linearized query structures and entities using pre-trained language models to find answers.
We conduct experiments on two inductive logical reasoning datasets and three transductive datasets.
arXiv Detail & Related papers (2023-05-23T01:25:29Z) - Do Deep Neural Networks Capture Compositionality in Arithmetic Reasoning? [31.692400722222278]
We introduce a skill tree on compositionality in arithmetic symbolic reasoning that defines the hierarchical levels of complexity along with three compositionality dimensions: systematicity, productivity, and substitutivity.
Our experiments revealed that among the three types of composition, the models struggled most with systematicity, performing poorly even with relatively simple compositions.
arXiv Detail & Related papers (2023-02-15T18:59:04Z) - Discourse-Aware Graph Networks for Textual Logical Reasoning [142.0097357999134]
Passage-level logical relations represent entailment or contradiction between propositional units (e.g., a concluding sentence).
We propose logic structural-constraint modeling to solve logical reasoning QA and introduce discourse-aware graph networks (DAGNs).
The networks first construct logic graphs leveraging in-line discourse connectives and generic logic theories, then learn logic representations by end-to-end evolving the logic relations with an edge-reasoning mechanism and updating the graph features.
arXiv Detail & Related papers (2022-07-04T14:38:49Z) - MERIt: Meta-Path Guided Contrastive Learning for Logical Reasoning [63.50909998372667]
We propose MERIt, a MEta-path guided contrastive learning method for logical ReasonIng of text.
Two novel strategies serve as indispensable components of our method.
arXiv Detail & Related papers (2022-03-01T11:13:00Z)
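To make the fuzzy-logic relaxation mentioned in the LOGICSEG entry concrete, here is a rough, hypothetical sketch (not the paper's actual implementation; the class layout and `child_to_parent` mapping are illustrative assumptions) of grounding a hierarchy rule such as ∀x: dog(x) ⇒ animal(x) as a differentiable loss over per-pixel class scores:

```python
import torch

def hierarchy_rule_loss(pixel_logits, child_to_parent):
    """Fuzzy grounding of rules of the form child(x) => parent(x).

    pixel_logits:    (B, K, H, W) per-pixel, per-class scores
    child_to_parent: dict mapping a child class index to its parent class index
    """
    # Per-class sigmoids (rather than a softmax) so a pixel can satisfy both
    # a class and its ancestor, as a label hierarchy requires.
    probs = pixel_logits.sigmoid()
    loss = pixel_logits.new_zeros(())
    for child, parent in child_to_parent.items():
        # Lukasiewicz implication a => b relaxes to min(1, 1 - a + b); its
        # violation degree is max(0, a - b), differentiable almost everywhere.
        violation = (probs[:, child] - probs[:, parent]).clamp(min=0.0)
        loss = loss + violation.mean()
    return loss / max(len(child_to_parent), 1)

# Example: classes 0 = animal, 1 = dog, 2 = cat; dog => animal, cat => animal.
loss = hierarchy_rule_loss(torch.randn(2, 3, 8, 8), {1: 0, 2: 0})
```

Because the violation term is differentiable, gradients flow from the rule back into the network, which is, in spirit, what "logic-induced network training" refers to.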