A Communication Framework for Compositional Generation
- URL: http://arxiv.org/abs/2501.19182v2
- Date: Thu, 13 Feb 2025 21:04:44 GMT
- Title: A Communication Framework for Compositional Generation
- Authors: Rafael Elberg, Mircea Petrache, Denis Parra
- Abstract summary: We present a self-supervised generative communication game-based framework for creating compositional encodings. Our framework is based on rigorous justifications and proofs of defining and balancing the concepts of Efficiency, Unambiguity and Non-Holisticity in encoding.
- Score: 0.7578439720012189
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Compositionality and compositional generalization--the ability to understand novel combinations of known concepts--are central characteristics of human language and are hypothesized to be essential for human cognition. In machine learning, the emergence of this property has been studied in a communication game setting, where independent agents (a sender and a receiver) converge to a shared encoding policy from a set of states to a space of discrete messages, where the receiver can correctly reconstruct the states observed by the sender using only the sender's messages. The use of communication games in generation tasks is still largely unexplored, with recent methods for compositional generation focusing mainly on the use of supervised guidance (either through class labels or text). In this work, we take the first steps to fill this gap, and we present a self-supervised generative communication game-based framework for creating compositional encodings in learned representations from pre-trained encoder-decoder models. In an Iterated Learning (IL) protocol involving a sender and a receiver, we apply alternating pressures for compression and diversity of encoded discrete messages, so that the protocol converges to an efficient but unambiguous encoding. Approximate message entropy regularization is used to favor compositional encodings. Our framework is based on rigorous justifications and proofs of defining and balancing the concepts of Efficiency, Unambiguity and Non-Holisticity in encoding. We test our method on the compositional image dataset Shapes3D, demonstrating robust performance in both reconstruction and compositionality metrics, surpassing other tested discrete message frameworks.
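The abstract describes the protocol only at a high level, so the following is a minimal, self-contained sketch of one way to instantiate a sender/receiver reconstruction game over discrete messages with an approximate entropy regularizer and alternating compression/diversity pressures. All module names, toy sizes, the Gumbel-softmax channel, and the phase schedule are assumptions made for illustration, not the authors' implementation.
```python
# Minimal sketch of a sender/receiver reconstruction game with an entropy
# regularizer, in the spirit of the framework described above. Sizes, modules,
# and the alternating-phase schedule are illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM, VOCAB, MSG_LEN, HID = 16, 8, 4, 64  # toy sizes (assumed)

class Sender(nn.Module):
    """Maps a (pre-encoded) state to logits over a discrete message."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM, HID), nn.ReLU(),
                                 nn.Linear(HID, MSG_LEN * VOCAB))
    def forward(self, state):
        return self.net(state).view(-1, MSG_LEN, VOCAB)  # (B, L, V)

class Receiver(nn.Module):
    """Reconstructs the state from the (relaxed) discrete message."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(MSG_LEN * VOCAB, HID), nn.ReLU(),
                                 nn.Linear(HID, STATE_DIM))
    def forward(self, message):
        return self.net(message.flatten(1))

def message_entropy(logits):
    """Approximate entropy of the per-position message distribution,
    averaged over batch and message positions (a stand-in regularizer)."""
    p = logits.softmax(-1)
    return -(p * p.clamp_min(1e-9).log()).sum(-1).mean()

sender, receiver = Sender(), Receiver()
opt = torch.optim.Adam(list(sender.parameters()) + list(receiver.parameters()), lr=1e-3)

for step in range(200):
    states = torch.randn(32, STATE_DIM)                  # stand-in for encoder latents
    logits = sender(states)
    msg = F.gumbel_softmax(logits, tau=1.0, hard=True)   # discrete message, straight-through
    recon = receiver(msg)
    # Alternate compression (penalize entropy) and diversity (reward entropy)
    # phases, a crude stand-in for the iterated-learning pressures.
    coef = 0.1 if (step // 50) % 2 == 0 else -0.1
    loss = F.mse_loss(recon, states) + coef * message_entropy(logits)
    opt.zero_grad()
    loss.backward()
    opt.step()
```
In the paper's setting the states would come from a pre-trained encoder-decoder rather than random vectors, and the Iterated Learning protocol alternates training phases between the agents rather than merely flipping the sign of a regularization term, as done in this toy loop.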
Related papers
- Boosting Neural Language Inference via Cascaded Interactive Reasoning [38.125341836302525]
Natural Language Inference (NLI) focuses on ascertaining the logical relationship between a given premise and hypothesis. This task presents significant challenges due to inherent linguistic features such as diverse phrasing, semantic complexity, and contextual nuances. We introduce the Cascaded Interactive Reasoning Network (CIRN), a novel architecture designed for deeper semantic comprehension in NLI.
arXiv Detail & Related papers (2025-05-10T11:37:15Z) - Universal Item Tokenization for Transferable Generative Recommendation [89.42584009980676]
We propose UTGRec, a universal item tokenization approach for transferable Generative Recommendation.
By devising tree-structured codebooks, we discretize content representations into corresponding codes for item tokenization.
For raw content reconstruction, we employ dual lightweight decoders to reconstruct item text and images from discrete representations.
For collaborative knowledge integration, we assume that co-occurring items are similar and integrate collaborative signals through co-occurrence alignment and reconstruction.
arXiv Detail & Related papers (2025-04-06T08:07:49Z) - Bridging Textual-Collaborative Gap through Semantic Codes for Sequential Recommendation [91.13055384151897]
CoCoRec is a novel Code-based textual and Collaborative semantic fusion method for sequential Recommendation.
We generate fine-grained semantic codes from multi-view text embeddings through vector quantization techniques.
In order to further enhance the fusion of textual and collaborative semantics, we introduce an optimization strategy.
arXiv Detail & Related papers (2025-03-15T15:54:44Z) - Hierarchical Banzhaf Interaction for General Video-Language Representation Learning [60.44337740854767]
Multimodal representation learning plays an important role in the artificial intelligence domain. We introduce a new approach that models video-text as game players using multivariate cooperative game theory. We extend our original structure into a flexible encoder-decoder framework, enabling the model to adapt to various downstream tasks.
arXiv Detail & Related papers (2024-12-30T14:09:15Z) - CoRe: Context-Regularized Text Embedding Learning for Text-to-Image Personalization [14.01847471143144]
We introduce Context Regularization (CoRe), which enhances the learning of the new concept's text embedding by regularizing its context tokens in the prompt.
CoRe can be applied to arbitrary prompts without requiring the generation of corresponding images.
Comprehensive experiments demonstrate that our method outperforms several baseline methods in both identity preservation and text alignment.
arXiv Detail & Related papers (2024-08-28T16:27:58Z) - Finding structure in logographic writing with library learning [55.63800121311418]
We develop a computational framework for discovering structure in a writing system.
Our framework discovers known linguistic structures in the Chinese writing system.
We demonstrate how a library learning approach may help reveal the fundamental computational principles that underlie the creation of structures in human cognition.
arXiv Detail & Related papers (2024-05-11T04:23:53Z) - Concept-Best-Matching: Evaluating Compositionality in Emergent Communication [44.995111025271086]
We propose a procedure to assess the compositionality of emergent communication by finding the best-match between emerged words and natural language concepts.
To the best of our knowledge, it is the first time that such direct and interpretable mapping between emergent words and human concepts is provided.
arXiv Detail & Related papers (2024-03-17T12:47:02Z) - Prompt-based Logical Semantics Enhancement for Implicit Discourse Relation Recognition [4.7938839332508945]
We propose a Prompt-based Logical Semantics Enhancement (PLSE) method for Implicit Discourse Relation Recognition (IDRR).
Our method seamlessly injects knowledge relevant to discourse relation into pre-trained language models through prompt-based connective prediction.
Experimental results on PDTB 2.0 and CoNLL16 datasets demonstrate that our method achieves outstanding and consistent performance against the current state-of-the-art models.
arXiv Detail & Related papers (2023-11-01T08:38:08Z) - Language-Oriented Communication with Semantic Coding and Knowledge Distillation for Text-to-Image Generation [53.97155730116369]
We put forward a novel framework of language-oriented semantic communication (LSC).
In LSC, machines communicate using human language messages that can be interpreted and manipulated via natural language processing (NLP) techniques for semantic communication (SC) efficiency.
We introduce three innovative algorithms: 1) semantic source coding (SSC), which compresses a text prompt into its key head words capturing the prompt's syntactic essence; 2) semantic channel coding (SCC), which improves robustness against errors by substituting head words with their lengthier synonyms; and 3) semantic knowledge distillation (SKD), which produces listener-customized prompts via in-context learning of the listener's language style.
arXiv Detail & Related papers (2023-09-20T08:19:05Z) - Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z) - Cognitive Semantic Communication Systems Driven by Knowledge Graph: Principle, Implementation, and Performance Evaluation [74.38561925376996]
Two cognitive semantic communication frameworks are proposed for the single-user and multiple-user communication scenarios.
An effective semantic correction algorithm is proposed by mining the inference rule from the knowledge graph.
For the multi-user cognitive semantic communication system, a message recovery algorithm is proposed to distinguish messages of different users.
arXiv Detail & Related papers (2023-03-15T12:01:43Z) - Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding [143.5927158318524]
Temporal grounding is the task of locating a specific segment from an untrimmed video according to a query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We argue that the inherent structured semantics inside the videos and language is the crucial factor to achieve compositional generalization.
arXiv Detail & Related papers (2023-01-22T08:02:23Z) - Semantic-Native Communication: A Simplicial Complex Perspective [50.099494681671224]
We study semantic communication from a topological space perspective.
A transmitter first maps its data into a $k$-order simplicial complex and then learns its high-order correlations.
The receiver decodes the structure and infers the missing or distorted data.
arXiv Detail & Related papers (2022-10-30T22:33:44Z) - Learning Compositional Representations for Effective Low-Shot Generalization [45.952867474500145]
We propose Recognition as Part Composition (RPC), an image encoding approach inspired by human cognition.
RPC encodes images by first decomposing them into salient parts, and then encoding each part as a mixture of a small number of prototypes.
We find that this type of learning can overcome hurdles faced by deep convolutional networks in low-shot generalization tasks.
arXiv Detail & Related papers (2022-04-17T21:31:11Z) - Cognitive Semantic Communication Systems Driven by Knowledge Graph [33.29303908864777]
A cognitive semantic communication framework is proposed by exploiting a knowledge graph.
A simple, general and interpretable solution for semantic information detection is developed.
Our proposed system is superior to other benchmark systems in terms of the data compression rate and the reliability of communication.
arXiv Detail & Related papers (2022-02-24T08:26:18Z) - Disentangled Sequence to Sequence Learning for Compositional Generalization [62.954842223732435]
We propose an extension to sequence-to-sequence models which allows us to learn disentangled representations by adaptively re-encoding the source input.
Experimental results on semantic parsing and machine translation empirically show that our proposal yields more disentangled representations and better generalization.
arXiv Detail & Related papers (2021-10-09T22:27:19Z) - Visually Grounded Concept Composition [31.981204314287282]
We learn the grounding of both primitive and all composed concepts by aligning them to images.
We show that learning to compose leads to more robust grounding results, measured in text-to-image matching accuracy.
arXiv Detail & Related papers (2021-09-29T00:38:58Z) - Hierarchical Poset Decoding for Compositional Generalization in Language [52.13611501363484]
We formalize human language understanding as a structured prediction task where the output is a partially ordered set (poset).
Current encoder-decoder architectures do not take the poset structure of semantics into account properly.
We propose a novel hierarchical poset decoding paradigm for compositional generalization in language.
arXiv Detail & Related papers (2020-10-15T14:34:26Z) - Structure-Augmented Text Representation Learning for Efficient Knowledge Graph Completion [53.31911669146451]
Human-curated knowledge graphs provide critical supportive information to various natural language processing tasks.
These graphs are usually incomplete, which motivates their automatic completion.
Graph embedding approaches, e.g., TransE, learn structured knowledge by representing graph elements as dense embeddings.
Textual encoding approaches, e.g., KG-BERT, resort to graph triples' text and triple-level contextualized representations.
arXiv Detail & Related papers (2020-04-30T13:50:34Z) - Structural Inductive Biases in Emergent Communication [36.26083882473554]
We investigate the impact of representation learning in artificial agents by developing graph referential games.
We show that agents parametrized by graph neural networks develop a more compositional language compared to bag-of-words and sequence models.
arXiv Detail & Related papers (2020-02-04T14:59:08Z) - Towards Graph Representation Learning in Emergent Communication [37.8523331078468]
We use graph convolutional networks to support the evolution of language and cooperation in multi-agent systems.
Motivated by an image-based referential game, we propose a graph referential game with varying degrees of complexity.
We show that the emerged communication protocol is robust, that the agents uncover the true factors of variation in the game, and that they learn to generalize beyond the samples encountered during training.
arXiv Detail & Related papers (2020-01-24T15:55:59Z)
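The emergent-communication entries above (graph referential games, structural inductive biases) build on the standard referential game: a sender describes a target, and a receiver must identify it among distractors from the message alone. Below is a minimal non-graph sketch of that game, with plain vector observations in place of graphs; all sizes and modules are chosen purely for illustration and do not reproduce any of the cited papers' code.
```python
# Minimal referential-game sketch: the sender emits a discrete message about a
# target observation; the receiver must pick the target out of N_CAND candidates.
# Sizes, modules, and the Gumbel-softmax channel are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

OBS_DIM, VOCAB, MSG_LEN, HID, N_CAND = 12, 10, 3, 32, 5  # toy sizes (assumed)

sender = nn.Sequential(nn.Linear(OBS_DIM, HID), nn.ReLU(),
                       nn.Linear(HID, MSG_LEN * VOCAB))
msg_encoder = nn.Sequential(nn.Linear(MSG_LEN * VOCAB, HID), nn.ReLU(),
                            nn.Linear(HID, HID))
cand_encoder = nn.Linear(OBS_DIM, HID)
params = (list(sender.parameters()) + list(msg_encoder.parameters())
          + list(cand_encoder.parameters()))
opt = torch.optim.Adam(params, lr=1e-3)

for step in range(200):
    candidates = torch.randn(32, N_CAND, OBS_DIM)         # batch of candidate sets
    target_idx = torch.randint(N_CAND, (32,))             # which candidate the sender sees
    target = candidates[torch.arange(32), target_idx]
    logits = sender(target).view(32, MSG_LEN, VOCAB)
    msg = F.gumbel_softmax(logits, tau=1.0, hard=True)    # discrete message, straight-through
    m = msg_encoder(msg.flatten(1))                        # (32, HID) message embedding
    c = cand_encoder(candidates)                           # (32, N_CAND, HID) candidate embeddings
    scores = torch.einsum('bh,bnh->bn', m, c)              # receiver's preference over candidates
    loss = F.cross_entropy(scores, target_idx)             # reward picking the true target
    opt.zero_grad()
    loss.backward()
    opt.step()
```
The graph variants cited above would replace the plain linear encoders with graph neural networks over structured observations; the game loop itself stays the same.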