Latent Communication in Artificial Neural Networks
- URL: http://arxiv.org/abs/2406.11014v1
- Date: Sun, 16 Jun 2024 17:13:58 GMT
- Title: Latent Communication in Artificial Neural Networks
- Authors: Luca Moschella
- Abstract summary: This dissertation focuses on the universality and reusability of neural representations.
A salient observation from our research is the emergence of similarities in latent representations.
- Score: 2.5947832846531886
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As NNs permeate various scientific and industrial domains, understanding the universality and reusability of their representations becomes crucial. At their core, these networks create intermediate neural representations, referred to as latent spaces, of the input data and subsequently leverage them to perform specific downstream tasks. This dissertation focuses on the universality and reusability of neural representations. Do the latent representations crafted by an NN remain exclusive to a particular trained instance, or can they generalize across models, adapting to factors such as randomness during training, model architecture, or even data domain? This adaptive quality introduces the notion of Latent Communication -- a phenomenon that describes when representations can be unified or reused across neural spaces. A salient observation from our research is the emergence of similarities in latent representations, even when these originate from distinct or seemingly unrelated NNs. By exploiting a partial correspondence between the two data distributions that establishes a semantic link, we found that these representations can either be projected into a universal representation, coined Relative Representation, or be directly translated from one space to another. Latent Communication builds a bridge between independently trained NNs, irrespective of their training regimen, architecture, or the data modality they were trained on -- as long as the semantic content of the data stays the same (e.g., images and their captions). This holds true for generation, classification, and retrieval downstream tasks; in supervised, weakly supervised, and unsupervised settings; and spans various data modalities including images, text, audio, and graphs -- showcasing the universality of the Latent Communication phenomenon. [...]
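As a concrete illustration of the Relative Representation idea described in the abstract, the sketch below re-encodes each sample as its cosine similarities to a shared set of anchor samples, which makes latent spaces that differ by an angle-preserving transformation directly comparable. This is a minimal sketch, not the dissertation's implementation: the function name, the toy data, and the random rotation standing in for two independently trained encoders are all illustrative.

```python
import numpy as np

def relative_representation(embeddings: np.ndarray, anchors: np.ndarray) -> np.ndarray:
    """Re-encode each embedding as its cosine similarities to a set of anchors.

    embeddings: (n_samples, d) latent vectors from one encoder
    anchors:    (n_anchors, d) latent vectors of the shared anchor samples,
                produced by the same encoder
    returns:    (n_samples, n_anchors) relative representations
    """
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    anc = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return emb @ anc.T  # cosine similarity of every sample to every anchor

# Toy illustration: two "encoders" that differ by a random rotation produce
# different absolute latents but near-identical relative representations.
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 16))            # stand-in latent space of encoder A
rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))
data_rotated = data @ rotation               # stand-in latent space of encoder B
anchor_idx = rng.choice(100, size=10, replace=False)

rel_a = relative_representation(data, data[anchor_idx])
rel_b = relative_representation(data_rotated, data_rotated[anchor_idx])
print(np.allclose(rel_a, rel_b))             # True: the two spaces now "communicate"
```

The same anchor correspondence can instead be used to fit a direct mapping from one latent space to the other, which is the translation route mentioned in the abstract.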
Related papers
- Disentangling Representations through Multi-task Learning [0.0]
We provide experimental and theoretical results guaranteeing the emergence of disentangled representations in agents that optimally solve classification tasks.
We experimentally validate these predictions in RNNs trained on multi-task classification.
We find that transformers are particularly suited for disentangling representations, which might explain their unique world understanding abilities.
arXiv Detail & Related papers (2024-07-15T21:32:58Z)
- Evaluating alignment between humans and neural network representations in image-based learning tasks [5.657101730705275]
We tested how well the representations of 86 pretrained neural network models mapped to human learning trajectories.
We found that while training dataset size was a core determinant of alignment with human choices, contrastive training with multi-modal data (text and imagery) was a common feature of currently publicly available models that predicted human generalisation.
In conclusion, pretrained neural networks can serve to extract representations for cognitive models, as they appear to capture some fundamental aspects of cognition that are transferable across tasks.
arXiv Detail & Related papers (2023-06-15T08:18:29Z)
- Less Data, More Knowledge: Building Next Generation Semantic Communication Networks [180.82142885410238]
We present the first rigorous vision of a scalable end-to-end semantic communication network.
We first discuss how the design of semantic communication networks requires a move from data-driven networks towards knowledge-driven ones.
By using semantic representation and languages, we show that the traditional transmitter and receiver now become a teacher and apprentice.
arXiv Detail & Related papers (2022-11-25T19:03:25Z)
- Interpretable part-whole hierarchies and conceptual-semantic relationships in neural networks [4.153804257347222]
We present Agglomerator, a framework capable of providing a representation of part-whole hierarchies from visual cues.
We evaluate our method on common datasets, such as SmallNORB, MNIST, FashionMNIST, CIFAR-10, and CIFAR-100.
arXiv Detail & Related papers (2022-03-07T10:56:13Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- Semantic Representation and Inference for NLP [2.969705152497174]
This thesis investigates the use of deep learning for novel semantic representation and inference.
We contribute the largest publicly available dataset of real-life factual claims for the purpose of automatic claim verification.
We operationalize the compositionality of a phrase contextually by enriching the phrase representation with external word embeddings and knowledge graphs.
arXiv Detail & Related papers (2021-06-15T13:22:48Z)
- Learning Deep Interleaved Networks with Asymmetric Co-Attention for Image Restoration [65.11022516031463]
We present a deep interleaved network (DIN) that learns how information at different states should be combined for high-quality (HQ) image reconstruction.
In this paper, we propose asymmetric co-attention (AsyCA) which is attached at each interleaved node to model the feature dependencies.
Our presented DIN can be trained end-to-end and applied to various image restoration tasks.
arXiv Detail & Related papers (2020-10-29T15:32:00Z)
- Domain Generalization for Medical Imaging Classification with Linear-Dependency Regularization [59.5104563755095]
We introduce a simple but effective approach to improve the generalization capability of deep neural networks in the field of medical imaging classification.
Motivated by the observation that the domain variability of the medical images is to some extent compact, we propose to learn a representative feature space through variational encoding.
arXiv Detail & Related papers (2020-09-27T12:30:30Z)
- Neural Additive Models: Interpretable Machine Learning with Neural Nets [77.66871378302774]
Deep neural networks (DNNs) are powerful black-box predictors that have achieved impressive performance on a wide variety of tasks.
We propose Neural Additive Models (NAMs) which combine some of the expressivity of DNNs with the inherent intelligibility of generalized additive models.
NAMs learn a linear combination of neural networks that each attend to a single input feature (a minimal sketch of this idea follows this entry).
arXiv Detail & Related papers (2020-04-29T01:28:32Z)
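The Neural Additive Models entry above describes a concrete architecture: a sum of small per-feature networks whose individual contributions can be inspected. The sketch below is a hypothetical minimal PyTorch version of that idea; class names, layer sizes, and the plain-MLP feature nets are illustrative, and the paper's actual models include additional components not shown here.

```python
import torch
import torch.nn as nn

class FeatureNet(nn.Module):
    """Small MLP applied to a single scalar input feature."""
    def __init__(self, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):  # x: (batch, 1)
        return self.net(x)

class NeuralAdditiveModel(nn.Module):
    """Bias plus a sum of per-feature networks, so each feature's
    contribution f_i(x_i) can be inspected on its own."""
    def __init__(self, n_features: int, hidden: int = 32):
        super().__init__()
        self.feature_nets = nn.ModuleList(
            [FeatureNet(hidden) for _ in range(n_features)]
        )
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):  # x: (batch, n_features)
        contributions = [net(x[:, i:i + 1]) for i, net in enumerate(self.feature_nets)]
        return self.bias + torch.cat(contributions, dim=1).sum(dim=1, keepdim=True)

# Usage: predictions are additive in the features.
model = NeuralAdditiveModel(n_features=4)
y_hat = model(torch.randn(8, 4))  # (8, 1) predictions
```

Because the prediction is a sum of per-feature terms, each feature network can be plotted as a one-dimensional shape function, which is where the model's interpretability comes from.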
This list is automatically generated from the titles and abstracts of the papers on this site.