Related papers: Vector Symbolic Open Source Information Discovery

URL: http://arxiv.org/abs/2408.10734v1
Date: Tue, 20 Aug 2024 11:05:56 GMT
Title: Vector Symbolic Open Source Information Discovery
Authors: Cai Davies, Sam Meek, Philip Hawkins, Benomy Tutcher, Graham Bent, Alun Preece
Abstract summary: Large language models (transformers) facilitate semantic data and metadata alignment but are inefficient in CJIIM settings. We demonstrate a novel integration of transformer models with VSA, combining the power of the former for semantic matching with the compactness and representational structure of the latter. This work was carried out as a bridge between previous low technology readiness level (TRL) research and future higher-TRL technology demonstration and deployment.
Score: 0.41468088383213214
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Combined, joint, intra-governmental, inter-agency and multinational (CJIIM) operations require rapid data sharing without the bottlenecks of metadata curation and alignment. Curation and alignment is particularly infeasible for external open source information (OSINF), e.g., social media, which has become increasingly valuable in understanding unfolding situations. Large language models (transformers) facilitate semantic data and metadata alignment but are inefficient in CJIIM settings characterised as denied, degraded, intermittent and low bandwidth (DDIL). Vector symbolic architectures (VSA) support semantic information processing using highly compact binary vectors, typically 1-10k bits, suitable in a DDIL setting. We demonstrate a novel integration of transformer models with VSA, combining the power of the former for semantic matching with the compactness and representational structure of the latter. The approach is illustrated via a proof-of-concept OSINF data discovery portal that allows partners in a CJIIM operation to share data sources with minimal metadata curation and low communications bandwidth. This work was carried out as a bridge between previous low technology readiness level (TRL) research and future higher-TRL technology demonstration and deployment.
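
The abstract describes VSA as operating on compact binary vectors of roughly 1-10k bits. As a minimal sketch of what that can look like in general, the Python below implements three common binary VSA operations (XOR binding, majority-vote bundling, Hamming similarity) on 4096-bit vectors. It is not the authors' implementation and does not cover the transformer integration; the dimension, the role and filler names, and every function name are assumptions chosen for illustration.

```python
import random

DIM = 4096  # assumed size; inside the 1-10k bit range quoted in the abstract


def random_hv(rng, dim=DIM):
    """Return a random binary hypervector packed into a Python int."""
    return rng.getrandbits(dim)


def bind(a, b):
    """Bind two hypervectors with bitwise XOR (XOR is its own inverse)."""
    return a ^ b


def bundle(vectors, dim=DIM):
    """Superpose hypervectors by per-bit majority vote (use an odd count)."""
    threshold = len(vectors) / 2
    out = 0
    for i in range(dim):
        ones = sum((v >> i) & 1 for v in vectors)
        if ones > threshold:
            out |= 1 << i
    return out


def similarity(a, b, dim=DIM):
    """1 minus normalised Hamming distance: 1.0 if identical, near 0.5 if unrelated."""
    return 1.0 - bin(a ^ b).count("1") / dim


rng = random.Random(0)

# Hypothetical roles and fillers describing one shared data source.
role_topic, role_source, role_lang = (random_hv(rng) for _ in range(3))
flood, social_media, english = (random_hv(rng) for _ in range(3))
other = random_hv(rng)  # an unrelated filler used as a control

# One record: a bundle of three role-filler bindings, still only DIM bits long.
record = bundle([
    bind(role_topic, flood),
    bind(role_source, social_media),
    bind(role_lang, english),
])

# Unbinding with a role yields a noisy copy of that role's filler.
recovered = bind(record, role_topic)
print("payload bytes:", DIM // 8)
print("similarity to stored filler:", round(similarity(recovered, flood), 3))
print("similarity to unrelated filler:", round(similarity(recovered, other), 3))
```

With three bundled bindings, the recovered vector is expected to agree with the stored filler on roughly three quarters of its bits and with an unrelated vector on roughly half, so a nearest-neighbour lookup can separate them. The whole record fits in 512 bytes, which illustrates why such vectors are attractive on low-bandwidth links.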

Related papers

InfoCom: Kilobyte-Scale Communication-Efficient Collaborative Perception with Information Bottleneck [14.691852809650323]
InfoCom is an information-aware framework establishing the pioneering theoretical foundation for communication-efficient collaborative perception. Experiments across multiple datasets demonstrate that InfoCom achieves near-lossless perception while reducing communication overhead from megabyte to kilobyte-scale, representing 440-fold and 90-fold reductions per agent compared to Where2comm and ERMVP, respectively.
arXiv Detail & Related papers (2025-12-11T05:51:02Z)
Generative MIMO Beam Map Construction for Location Recovery and Beam Tracking [67.65578956523403]
This paper proposes a generative framework to recover location labels directly from sparse channel state information (CSI) measurements. Instead of directly storing raw CSI, we learn a compact low-dimensional radio map embedding and leverage a generative model to reconstruct the high-dimensional CSI. Numerical experiments demonstrate that the proposed model can improve localization accuracy by over 30% and achieve a 20% capacity gain in non-line-of-sight (NLOS) scenarios.
arXiv Detail & Related papers (2025-11-21T07:25:49Z)
Reconstruction-Driven Multimodal Representation Learning for Automated Media Understanding [0.1411701037241356]
We propose a Multimodal Autoencoder that learns unified representations across text, audio, and visual data. We demonstrate significant improvements in clustering and alignment metrics compared to linear baselines. Results highlight the potential of reconstruction-driven multimodal learning to enhance automation, searchability, and content management efficiency in modern broadcast.
arXiv Detail & Related papers (2025-11-17T19:13:51Z)
Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents [85.02904078131682]
We introduce the agent data protocol (ADP), a light-weight representation language that serves as an "interlingua" between agent datasets. ADP is expressive enough to capture a large variety of tasks, including API/tool use, browsing, coding, software engineering, and general agentic. All code and data are released publicly, in the hope that ADP could help lower the barrier to standardized, scalable, and reproducible agent training.
arXiv Detail & Related papers (2025-10-28T17:53:13Z)
DART: Dual Adaptive Refinement Transfer for Open-Vocabulary Multi-Label Recognition [59.203152078315235]
Open-Vocabulary Multi-Label Recognition (OV-MLR) aims to identify multiple seen and unseen object categories within an image. Vision-Language Pre-training models offer a strong open-vocabulary foundation, but struggle with fine-grained localization under weak supervision. We propose the Dual Adaptive Refinement Transfer (DART) framework to overcome these limitations.
arXiv Detail & Related papers (2025-08-07T17:22:33Z)
MAGE: Multimodal Alignment and Generation Enhancement via Bridging Visual and Semantic Spaces [23.447713697204225]
MAGE is a novel framework that bridges the semantic spaces of vision and text through an innovative alignment mechanism. We employ a training strategy that combines cross-entropy and mean squared error, significantly enhancing the alignment effect. Our proposed multimodal large model architecture, MAGE, achieved significantly better performance compared to similar works across various evaluation benchmarks.
arXiv Detail & Related papers (2025-07-29T12:17:46Z)
Multimodal-Aware Fusion Network for Referring Remote Sensing Image Segmentation [7.992331117310217]
Referring remote sensing image segmentation (RRSIS) is a novel visual task in remote sensing image segmentation. We design a multimodal-aware fusion network (MAFN) to achieve fine-grained alignment and fusion between the two modalities.
arXiv Detail & Related papers (2025-03-14T08:31:21Z)
MIETT: Multi-Instance Encrypted Traffic Transformer for Encrypted Traffic Classification [59.96233305733875]
Classifying traffic is essential for detecting security threats and optimizing network management. We propose a Multi-Instance Encrypted Traffic Transformer (MIETT) to capture both token-level and packet-level relationships. MIETT achieves results across five datasets, demonstrating its effectiveness in classifying encrypted traffic and understanding complex network behaviors.
arXiv Detail & Related papers (2024-12-19T12:52:53Z)
Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data. We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation. Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z)
Tackling Distribution Shifts in Task-Oriented Communication with Information Bottleneck [28.661084093544684]
We propose a novel approach based on the information bottleneck (IB) principle and invariant risk minimization (IRM) framework. The proposed method aims to extract compact and informative features that possess high capability for effective domain-shift generalization. We show that the proposed scheme outperforms state-of-the-art approaches and achieves a better rate-distortion tradeoff.
arXiv Detail & Related papers (2024-05-15T17:07:55Z)
Multimodal Informative ViT: Information Aggregation and Distribution for Hyperspectral and LiDAR Classification [25.254816993934746]
Multimodal Informative ViT (MIVit) is a system with an innovative information aggregate-distributing mechanism. MIVit reduces redundancy in the empirical distribution of each modality's separate and fused features. Our results show that MIVit's bidirectional aggregate-distributing mechanism is highly effective.
arXiv Detail & Related papers (2024-01-06T09:53:33Z)
Zero-shot Composed Text-Image Retrieval [72.43790281036584]
We consider the problem of composed image retrieval (CIR). It aims to train a model that can fuse multi-modal information, e.g., text and images, to accurately retrieve images that match the query, extending the user's expression ability.
arXiv Detail & Related papers (2023-06-12T17:56:01Z)
Streamlining Multimodal Data Fusion in Wireless Communication and Sensor Networks [4.132799233018846]
This paper presents a novel approach for multimodal data fusion based on the Vector-Quantized Variational Autoencoder (VQVAE) architecture. The proposed method is simple yet effective in achieving excellent reconstruction performance on paired MNIST-SVHN data and WiFi spectrogram data.
arXiv Detail & Related papers (2023-02-24T13:55:33Z)
Correlation Information Bottleneck: Towards Adapting Pretrained Multimodal Models for Robust Visual Question Answering [63.87200781247364]
Correlation Information Bottleneck (CIB) seeks a tradeoff between compression and redundancy in representations. We derive a tight theoretical upper bound for the mutual information between multimodal inputs and representations.
arXiv Detail & Related papers (2022-09-14T22:04:10Z)
SIM-Trans: Structure Information Modeling Transformer for Fine-grained Visual Categorization [59.732036564862796]
We propose the Structure Information Modeling Transformer (SIM-Trans) to incorporate object structure information into transformer for enhancing discriminative representation learning. The proposed two modules are light-weighted and can be plugged into any transformer network and trained end-to-end easily. Experiments and analyses demonstrate that the proposed SIM-Trans achieves state-of-the-art performance on fine-grained visual categorization benchmarks.
arXiv Detail & Related papers (2022-08-31T03:00:07Z)
Multi-agent Communication with Graph Information Bottleneck under Limited Bandwidth (a position paper) [92.11330289225981]
In many real-world scenarios, communication can be expensive and the bandwidth of the multi-agent system is subject to certain constraints. Redundant messages that occupy the communication resources can block the transmission of informative messages and thus jeopardize the performance. We propose a novel multi-agent communication module, CommGIB, which effectively compresses the structure information and node information in the communication graph to deal with bandwidth-constrained settings.
arXiv Detail & Related papers (2021-12-20T07:53:44Z)
MD-CSDNetwork: Multi-Domain Cross Stitched Network for Deepfake Detection [80.83725644958633]
Current deepfake generation methods leave discriminative artifacts in the frequency spectrum of fake images and videos. We present a novel approach, termed as MD-CSDNetwork, for combining the features in the spatial and frequency domains to mine a shared discriminative representation.
arXiv Detail & Related papers (2021-09-15T14:11:53Z)
Task-Oriented Communication for Multi-Device Cooperative Edge Inference [14.249444124834719]
Cooperative edge inference can overcome the limited sensing capability of a single device, but it substantially increases the communication overhead and may incur excessive latency. We propose a learning-based communication scheme that optimizes local feature extraction and distributed feature encoding in a task-oriented manner.
arXiv Detail & Related papers (2021-09-01T03:56:20Z)
Learning Task-Oriented Communication for Edge Inference: An Information Bottleneck Approach [3.983055670167878]
A low-end edge device transmits the extracted feature vector of a local data sample to a powerful edge server for processing. It is critical to encode the data into an informative and compact representation for low-latency inference given the limited bandwidth. We propose a learning-based communication scheme that jointly optimizes feature extraction, source coding, and channel coding.
arXiv Detail & Related papers (2021-02-08T12:53:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.