SEEC: Semantic Vector Federation across Edge Computing Environments
- URL: http://arxiv.org/abs/2008.13298v1
- Date: Sun, 30 Aug 2020 23:51:41 GMT
- Title: SEEC: Semantic Vector Federation across Edge Computing Environments
- Authors: Shalisha Witherspoon, Dean Steuer, Graham Bent, Nirmit Desai
- Abstract summary: State-of-the-art embedding approaches assume all data is available on a single site.
In many business settings, data is distributed across multiple edge locations and cannot be aggregated.
This paper proposes novel unsupervised algorithms called SEEC for learning and applying semantic vector embedding in a variety of distributed settings.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic vector embedding techniques have proven useful in learning semantic
representations of data across multiple domains. A key application enabled by
such techniques is the ability to measure semantic similarity between given
data samples and find data most similar to a given sample. State-of-the-art
embedding approaches assume all data is available on a single site. However, in
many business settings, data is distributed across multiple edge locations and
cannot be aggregated due to a variety of constraints. Hence, the applicability
of state-of-the-art embedding approaches is limited to freely shared datasets,
leaving out applications with sensitive or mission-critical data. This paper
addresses this gap by proposing novel unsupervised algorithms called
SEEC for learning and applying semantic vector embedding in a variety of
distributed settings. Specifically, for scenarios where multiple edge locations
can engage in joint learning, we adapt the recently proposed federated learning
techniques for semantic vector embedding. Where joint learning is not possible,
we propose novel semantic vector translation algorithms to enable semantic
query across multiple edge locations, each with its own semantic vector-space.
Experimental results on natural language as well as graph datasets show that
this may be a promising new direction.
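The two settings described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`federated_average`, `fit_translation`, `cosine_top_k`) are hypothetical, the joint-learning case is shown as a plain size-weighted parameter average in the style of federated averaging, and the translation case assumes that two sites share a small set of anchor items and that a linear least-squares map between their vector spaces is adequate. The abstract does not state either assumption.

```python
import numpy as np

def federated_average(site_weights, site_sizes):
    """Size-weighted average of per-site parameter arrays (FedAvg-style)."""
    sizes = np.asarray(site_sizes, dtype=float)
    stacked = np.stack(site_weights)              # shape: (n_sites, ...)
    return np.tensordot(sizes / sizes.sum(), stacked, axes=1)

def fit_translation(src_anchors, dst_anchors):
    """Least-squares linear map W with src_anchors @ W ~= dst_anchors.

    Assumption: both sites can embed the same anchor items.
    """
    W, *_ = np.linalg.lstsq(src_anchors, dst_anchors, rcond=None)
    return W

def cosine_top_k(query, candidates, k=1):
    """Indices of the k candidates most cosine-similar to the query."""
    q = query / np.linalg.norm(query)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return np.argsort(-(c @ q))[:k]

# Joint learning: two sites contribute parameters, weighted by data size.
global_weights = federated_average(
    [np.ones((2, 2)), 3.0 * np.ones((2, 2))], site_sizes=[1, 3]
)

# No joint learning: site B's space is a rotated copy of site A's space
# (synthetic data, purely for illustration).
rng = np.random.default_rng(0)
site_a = rng.normal(size=(50, 8))
rotation, _ = np.linalg.qr(rng.normal(size=(8, 8)))
site_b = site_a @ rotation

# Fit the translation on 20 shared anchors, then query site B with an
# item that was embedded only at site A.
W = fit_translation(site_a[:20], site_b[:20])
translated_query = site_a[30] @ W
best_match = cosine_top_k(translated_query, site_b, k=1)
```

In this synthetic setup the two spaces differ by an exact rotation, so a linear map recovers the correspondence; real independently trained embeddings would generally be aligned only approximately.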
Related papers
- A Mathematical Perspective On Contrastive Learning [5.66952471288857]
Multimodal contrastive learning is a methodology for linking different data modalities.
We focus on the bimodal setting and interpret contrastive learning as the optimization of encoders that define conditional probability distributions.
arXiv Detail & Related papers (2025-05-30T02:09:37Z)
- infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information [68.76707843019886]
infoVerse is a universal framework for dataset characterization.
infoVerse captures multidimensional characteristics of datasets by incorporating various model-driven meta-information.
In three real-world applications (data pruning, active learning, and data annotation), the samples chosen on infoVerse space consistently outperform strong baselines.
arXiv Detail & Related papers (2023-05-30T18:12:48Z)
- Self-Supervised Representation Learning With MUlti-Segmental Informational Coding (MUSIC) [6.693379403133435]
Self-supervised representation learning maps high-dimensional data into a meaningful embedding space.
We propose MUlti-Segmental Informational Coding (MUSIC) for self-supervised representation learning.
arXiv Detail & Related papers (2022-06-13T20:37:48Z)
- Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding [137.3719377780593]
A new design (named Detection Hub) is dataset-aware and category-aligned.
It mitigates the dataset inconsistency and provides coherent guidance for the detector to learn across multiple datasets.
The categories across datasets are semantically aligned into a unified space by replacing one-hot category representations with word embedding.
arXiv Detail & Related papers (2022-06-07T17:59:44Z)
- Leveraging Ensembles and Self-Supervised Learning for Fully-Unsupervised Person Re-Identification and Text Authorship Attribution [77.85461690214551]
Learning from fully-unlabeled data is challenging in Multimedia Forensics problems, such as Person Re-Identification and Text Authorship Attribution.
Recent self-supervised learning methods have been shown to be effective when dealing with fully-unlabeled data in cases where the underlying classes have significant semantic differences.
We propose a strategy to tackle Person Re-Identification and Text Authorship Attribution by enabling learning from unlabeled data even when samples from different classes are not prominently diverse.
arXiv Detail & Related papers (2022-02-07T13:08:11Z)
- Dominant Set-based Active Learning for Text Classification and its Application to Online Social Media [0.0]
We present a novel pool-based active learning method for training on a large unlabeled corpus with minimum annotation cost.
Our proposed method does not have any parameters to be tuned, making it dataset-independent.
Our method achieves a higher performance in comparison to the state-of-the-art active learning strategies.
arXiv Detail & Related papers (2022-01-28T19:19:03Z)
- Seeking Similarities over Differences: Similarity-based Domain Alignment for Adaptive Object Detection [86.98573522894961]
We propose a framework that generalizes the components commonly used by Unsupervised Domain Adaptation (UDA) algorithms for detection.
Specifically, we propose a novel UDA algorithm, ViSGA, that leverages the best design choices and introduces a simple but effective method to aggregate features at instance-level.
We show that both similarity-based grouping and adversarial training allow our model to focus on coarsely aligning feature groups, without being forced to match all instances across loosely aligned domains.
arXiv Detail & Related papers (2021-10-04T13:09:56Z)
- Improving Deep Metric Learning by Divide and Conquer [11.380358587116683]
Deep metric learning (DML) is a cornerstone of many computer vision applications.
It aims at learning a mapping from the input domain to an embedding space, where semantically similar objects are located nearby and dissimilar objects far from one another.
We propose to build a more expressive representation by splitting the embedding space and the data hierarchically into smaller sub-parts.
arXiv Detail & Related papers (2021-09-09T02:57:34Z)
- Flexible deep transfer learning by separate feature embeddings and manifold alignment [0.0]
Object recognition is a key enabler across industry and defense.
Unfortunately, algorithms trained on existing labeled datasets do not directly generalize to new data because the data distributions do not match.
We propose a novel deep learning framework that overcomes this limitation by learning separate feature extractions for each domain.
arXiv Detail & Related papers (2020-12-22T19:24:44Z)
- Fewer is More: A Deep Graph Metric Learning Perspective Using Fewer Proxies [65.92826041406802]
We propose a Proxy-based deep Graph Metric Learning approach from the perspective of graph classification.
Multiple global proxies are leveraged to collectively approximate the original data points for each class.
We design a novel reverse label propagation algorithm, by which the neighbor relationships are adjusted according to ground-truth labels.
arXiv Detail & Related papers (2020-10-26T14:52:42Z)
- Contextual Diversity for Active Learning [9.546771465714876]
Large datasets restrict the use of deep convolutional neural networks (CNNs) for many practical applications.
We introduce the notion of contextual diversity that captures the confusion associated with spatially co-occurring classes.
Our studies show clear advantages of using contextual diversity for active learning.
arXiv Detail & Related papers (2020-08-13T07:04:15Z)
- Campus3D: A Photogrammetry Point Cloud Benchmark for Hierarchical Understanding of Outdoor Scene [76.4183572058063]
We present a richly-annotated 3D point cloud dataset for multiple outdoor scene understanding tasks.
The dataset has been annotated point-wise with both hierarchical and instance-based labels.
We formulate a hierarchical learning problem for 3D point cloud segmentation and propose a measurement evaluating consistency across various hierarchies.
arXiv Detail & Related papers (2020-08-11T19:10:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.