A Communication Framework for Compositional Generation
- URL: http://arxiv.org/abs/2501.19182v2
- Date: Thu, 13 Feb 2025 21:04:44 GMT
- Title: A Communication Framework for Compositional Generation
- Authors: Rafael Elberg, Mircea Petrache, Denis Parra
- Abstract summary: We present a self-supervised generative communication game-based framework for creating compositional encodings.
Our framework is based on rigorous justifications and proofs of defining and balancing the concepts of Efficiency, Unambiguity and Non-Holisticity in encoding.
- Score: 0.7578439720012189
- Abstract: Compositionality and compositional generalization--the ability to understand novel combinations of known concepts--are central characteristics of human language and are hypothesized to be essential for human cognition. In machine learning, the emergence of this property has been studied in a communication game setting, where independent agents (a sender and a receiver) converge to a shared encoding policy from a set of states to a space of discrete messages, where the receiver can correctly reconstruct the states observed by the sender using only the sender's messages. The use of communication games in generation tasks is still largely unexplored, with recent methods for compositional generation focusing mainly on the use of supervised guidance (either through class labels or text). In this work, we take the first steps to fill this gap, and we present a self-supervised generative communication game-based framework for creating compositional encodings in learned representations from pre-trained encoder-decoder models. In an Iterated Learning (IL) protocol involving a sender and a receiver, we apply alternating pressures for compression and diversity of encoded discrete messages, so that the protocol converges to an efficient but unambiguous encoding. Approximate message entropy regularization is used to favor compositional encodings. Our framework is based on rigorous justifications and proofs of defining and balancing the concepts of Efficiency, Unambiguity and Non-Holisticity in encoding. We test our method on the compositional image dataset Shapes3D, demonstrating robust performance in both reconstruction and compositionality metrics, surpassing other tested discrete message frameworks.
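The trade-off the abstract describes, between compressing messages and keeping them unambiguous, can be illustrated with a toy example. The sketch below is not the paper's method or loss: the two-attribute state space, the hand-written codes, and the helper names `entropy_bits` and `is_unambiguous` are all assumptions made for illustration. It only shows how one might measure the empirical entropy of discrete messages and check whether a receiver could, in principle, recover each state from its message.

```python
import math
from collections import Counter
from itertools import product


def entropy_bits(symbols):
    """Empirical Shannon entropy, in bits, of an iterable of hashable symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def is_unambiguous(encoding):
    """True when no two states share a message, so a receiver can invert the code."""
    return len(set(encoding.values())) == len(encoding)


# Hypothetical toy state space (not the Shapes3D factors): shape x color.
states = list(product(["cube", "sphere"], ["red", "blue"]))

# Compositional code: one symbol per attribute, reused across states.
compositional = {(shape, color): (shape[0], color[0]) for shape, color in states}

# Over-compressed code: drops the color, so distinct states collide.
lossy = {(shape, color): (shape[0],) for shape, color in states}

for name, code in [("compositional", compositional), ("lossy", lossy)]:
    print(name, is_unambiguous(code), entropy_bits(code.values()))
```

In this toy setting the compositional code uses 2 bits of message entropy and stays invertible, while the over-compressed code drops to 1 bit and fails the unambiguity check. A learned sender would have to balance these pressures rather than have them set by hand.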
Related papers
- CoRe: Context-Regularized Text Embedding Learning for Text-to-Image Personalization [14.01847471143144]
We introduce Context Regularization (CoRe), which enhances the learning of the new concept's text embedding by regularizing its context tokens in the prompt.
CoRe can be applied to arbitrary prompts without requiring the generation of corresponding images.
Comprehensive experiments demonstrate that our method outperforms several baseline methods in both identity preservation and text alignment.
arXiv Detail & Related papers (2024-08-28T16:27:58Z)
- Concept-Best-Matching: Evaluating Compositionality in Emergent Communication [44.995111025271086]
We propose a procedure to assess the compositionality of emergent communication by finding the best-match between emerged words and natural language concepts.
To the best of our knowledge, it is the first time that such direct and interpretable mapping between emergent words and human concepts is provided.
arXiv Detail & Related papers (2024-03-17T12:47:02Z)
- Language-Oriented Communication with Semantic Coding and Knowledge Distillation for Text-to-Image Generation [53.97155730116369]
We put forward a novel framework of language-oriented semantic communication (LSC).
In LSC, machines communicate using human language messages that can be interpreted and manipulated via natural language processing (NLP) techniques for SC efficiency.
We introduce three innovative algorithms: 1) semantic source coding (SSC), which compresses a text prompt into its key head words capturing the prompt's syntactic essence; 2) semantic channel coding (SCC), which improves robustness against errors by substituting head words with their lengthier synonyms; and 3) semantic knowledge distillation (SKD), which produces listener-customized prompts via in-context learning the listener's
arXiv Detail & Related papers (2023-09-20T08:19:05Z)
- Cognitive Semantic Communication Systems Driven by Knowledge Graph: Principle, Implementation, and Performance Evaluation [74.38561925376996]
Two cognitive semantic communication frameworks are proposed for the single-user and multiple-user communication scenarios.
An effective semantic correction algorithm is proposed by mining the inference rule from the knowledge graph.
For the multi-user cognitive semantic communication system, a message recovery algorithm is proposed to distinguish messages of different users.
arXiv Detail & Related papers (2023-03-15T12:01:43Z)
- Semantic-Native Communication: A Simplicial Complex Perspective [50.099494681671224]
We study semantic communication from a topological space perspective.
A transmitter first maps its data into a $k$-order simplicial complex and then learns its high-order correlations.
The receiver decodes the structure and infers the missing or distorted data.
arXiv Detail & Related papers (2022-10-30T22:33:44Z)
- Learning Compositional Representations for Effective Low-Shot Generalization [45.952867474500145]
We propose Recognition as Part Composition (RPC), an image encoding approach inspired by human cognition.
RPC encodes images by first decomposing them into salient parts, and then encoding each part as a mixture of a small number of prototypes.
We find that this type of learning can overcome hurdles faced by deep convolutional networks in low-shot generalization tasks.
arXiv Detail & Related papers (2022-04-17T21:31:11Z)
- Cognitive Semantic Communication Systems Driven by Knowledge Graph [33.29303908864777]
A cognitive semantic communication framework is proposed by exploiting the knowledge graph.
A simple, general and interpretable solution for semantic information detection is developed.
Our proposed system is superior to other benchmark systems in terms of the data compression rate and the reliability of communication.
arXiv Detail & Related papers (2022-02-24T08:26:18Z)
- CRIS: CLIP-Driven Referring Image Segmentation [71.56466057776086]
We propose an end-to-end CLIP-Driven Referring Image Segmentation framework (CRIS).
CRIS resorts to vision-language decoding and contrastive learning for achieving the text-to-pixel alignment.
Our proposed framework significantly outperforms the state-of-the-art performance without any post-processing.
arXiv Detail & Related papers (2021-11-30T07:29:08Z)
- Hierarchical Poset Decoding for Compositional Generalization in Language [52.13611501363484]
We formalize human language understanding as a structured prediction task where the output is a partially ordered set (poset).
Current encoder-decoder architectures do not take the poset structure of semantics into account properly.
We propose a novel hierarchical poset decoding paradigm for compositional generalization in language.
arXiv Detail & Related papers (2020-10-15T14:34:26Z)
- Structure-Augmented Text Representation Learning for Efficient Knowledge Graph Completion [53.31911669146451]
Human-curated knowledge graphs provide critical supportive information to various natural language processing tasks.
These graphs are usually incomplete, urging auto-completion of them.
Graph embedding approaches, e.g., TransE, learn structured knowledge by representing graph elements as dense embeddings.
Textual encoding approaches, e.g., KG-BERT, resort to the graph triple's text and triple-level contextualized representations.
arXiv Detail & Related papers (2020-04-30T13:50:34Z)