Related papers: From Isolated Islands to Pangea: Unifying Semantic Space for Human Action Understanding

From Isolated Islands to Pangea: Unifying Semantic Space for Human Action Understanding

URL: http://arxiv.org/abs/2304.00553v4
Date: Wed, 3 Apr 2024 10:36:31 GMT
Title: From Isolated Islands to Pangea: Unifying Semantic Space for Human Action Understanding
Authors: Yong-Lu Li, Xiaoqian Wu, Xinpeng Liu, Zehao Wang, Yiming Dou, Yikun Ji, Junyi Zhang, Yixing Li, Jingru Tan, Xudong Lu, Cewu Lu,
Abstract summary: Action understanding can be formed as the mapping from the physical space to the semantic space. We propose a novel model mapping from the physical space to semantic space to fully use Pangea.
Score: 50.412121156940294
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Action understanding has attracted long-term attention. It can be formed as the mapping from the physical space to the semantic space. Typically, researchers built datasets according to idiosyncratic choices to define classes and push the envelope of benchmarks respectively. Datasets are incompatible with each other like "Isolated Islands" due to semantic gaps and various class granularities, e.g., do housework in dataset A and wash plate in dataset B. We argue that we need a more principled semantic space to concentrate the community efforts and use all datasets together to pursue generalizable action learning. To this end, we design a structured action semantic space given verb taxonomy hierarchy and covering massive actions. By aligning the classes of previous datasets to our semantic space, we gather (image/video/skeleton/MoCap) datasets into a unified database in a unified label system, i.e., bridging "isolated islands" into a "Pangea". Accordingly, we propose a novel model mapping from the physical space to semantic space to fully use Pangea. In extensive experiments, our new system shows significant superiority, especially in transfer learning. Our code and data will be made public at https://mvig-rhos.com/pangea.

Related papers

Knowledge Distillation for Semantic Segmentation: A Label Space Unification Approach [0.0]
We propose a knowledge distillation approach that also serves as a label space unification method for semantic segmentation. A teacher model is trained on a source dataset with a given taxonomy, then used to pseudo-label additional data for which ground truth labels of a related label space exist. Our ground truth-corrected pseudo-labels span over 12 and 7 public datasets with 388.230 and 18.558 images for the urban and off-road domains, respectively.
arXiv Detail & Related papers (2025-02-26T14:33:33Z)
Exploiting the Semantic Knowledge of Pre-trained Text-Encoders for Continual Learning [70.64617500380287]
Continual learning allows models to learn from new data while retaining previously learned knowledge. The semantic knowledge available in the label information of the images, offers important semantic information that can be related with previously acquired knowledge of semantic classes. We propose integrating semantic guidance within and across tasks by capturing semantic similarity using text embeddings.
arXiv Detail & Related papers (2024-08-02T07:51:44Z)
SEER-ZSL: Semantic Encoder-Enhanced Representations for Generalized Zero-Shot Learning [0.6792605600335813]
Zero-Shot Learning (ZSL) presents the challenge of identifying categories not seen during training. We introduce a Semantic-Enhanced Representations for Zero-Shot Learning (SEER-ZSL) First, we aim to distill meaningful semantic information using a probabilistic encoder, enhancing the semantic consistency and robustness. Second, we distill the visual space by exploiting the learned data distribution through an adversarially trained generator. Third, we align the distilled information, enabling a mapping of unseen categories onto the true data manifold.
arXiv Detail & Related papers (2023-12-20T15:18:51Z)
Open-Vocabulary Camouflaged Object Segmentation [66.94945066779988]
We introduce a new task, open-vocabulary camouflaged object segmentation (OVCOS) We construct a large-scale complex scene dataset (textbfOVCamo) containing 11,483 hand-selected images with fine annotations and corresponding object classes. By integrating the guidance of class semantic knowledge and the supplement of visual structure cues from the edge and depth information, the proposed method can efficiently capture camouflaged objects.
arXiv Detail & Related papers (2023-11-19T06:00:39Z)
Rethinking Pre-Training in Tabular Data: A Neighborhood Embedding Perspective [71.45945607871715]
We propose Tabular data Pre-Training via Meta-representation (TabPTM) The core idea is to embed data instances into a shared feature space, where each instance is represented by its distance to a fixed number of nearest neighbors and their labels. Extensive experiments on 101 datasets confirm TabPTM's effectiveness in both classification and regression tasks, with and without fine-tuning.
arXiv Detail & Related papers (2023-10-31T18:03:54Z)
Label Name is Mantra: Unifying Point Cloud Segmentation across Heterogeneous Datasets [17.503843467554592]
We propose a principled approach that supports learning from heterogeneous datasets with different label sets. Our idea is to utilize a pre-trained language model to embed discrete labels to a continuous latent space with the help of their label names. Our model outperforms the state-of-the-art by a large margin.
arXiv Detail & Related papers (2023-03-19T06:14:22Z)
Navya3DSeg -- Navya 3D Semantic Segmentation Dataset & split generation for autonomous vehicles [63.20765930558542]
3D semantic data are useful for core perception tasks such as obstacle detection and ego-vehicle localization. We propose a new dataset, Navya 3D (Navya3DSeg), with a diverse label space corresponding to a large scale production grade operational domain. It contains 23 labeled sequences and 25 supplementary sequences without labels, designed to explore self-supervised and semi-supervised semantic segmentation benchmarks on point clouds.
arXiv Detail & Related papers (2023-02-16T13:41:19Z)
Regional Semantic Contrast and Aggregation for Weakly Supervised Semantic Segmentation [25.231470587575238]
We propose regional semantic contrast and aggregation (RCA) for learning semantic segmentation. RCA is equipped with a regional memory bank to store massive, diverse object patterns appearing in training data. RCA earns a strong capability of fine-grained semantic understanding, and eventually establishes new state-of-the-art results on two popular benchmarks.
arXiv Detail & Related papers (2022-03-17T23:29:03Z)
AutoGeoLabel: Automated Label Generation for Geospatial Machine Learning [69.47585818994959]
We evaluate a big data processing pipeline to auto-generate labels for remote sensing data. We utilize the big geo-data platform IBM PAIRS to dynamically generate such labels in dense urban areas.
arXiv Detail & Related papers (2022-01-31T20:02:22Z)
Improving Deep Metric Learning by Divide and Conquer [11.380358587116683]
Deep metric learning (DML) is a cornerstone of many computer vision applications. It aims at learning a mapping from the input domain to an embedding space, where semantically similar objects are located nearby and dissimilar objects far from another. We propose to build a more expressive representation by splitting the embedding space and the data hierarchically into smaller sub-parts.
arXiv Detail & Related papers (2021-09-09T02:57:34Z)
Joining datasets via data augmentation in the label space for neural networks [6.036150783745836]
We propose a new technique leveraging artificially created knowledge graph, recurrent neural networks and policy gradient that successfully achieve the dataset joining in the label space. Empirical results on both image and text classification justify the validity of our approach.
arXiv Detail & Related papers (2021-06-17T06:08:11Z)
Simple multi-dataset detection [83.9604523643406]
We present a simple method for training a unified detector on multiple large-scale datasets. We show how to automatically integrate dataset-specific outputs into a common semantic taxonomy. Our approach does not require manual taxonomy reconciliation.
arXiv Detail & Related papers (2021-02-25T18:55:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.