Human Correspondence Consensus for 3D Object Semantic Understanding
- URL: http://arxiv.org/abs/1912.12577v2
- Date: Thu, 26 Nov 2020 05:24:05 GMT
- Title: Human Correspondence Consensus for 3D Object Semantic Understanding
- Authors: Yujing Lou, Yang You, Chengkun Li, Zhoujun Cheng, Liangwei Li,
Lizhuang Ma, Weiming Wang, Cewu Lu
- Abstract summary: In this paper, we introduce a new dataset named CorresPondenceNet.
Based on this dataset, we are able to learn dense semantic embeddings with a novel geodesic consistency loss.
We show that CorresPondenceNet could boost not only fine-grained understanding of heterogeneous objects but also cross-object registration and partial object matching.
- Score: 56.34297279246823
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic understanding of 3D objects is crucial in many applications such as
object manipulation. However, it is hard to give a universal definition of
point-level semantics that everyone would agree on. We observe that people have
a consensus on semantic correspondences between two areas from different
objects, but are less certain about the exact semantic meaning of each area.
Therefore, we argue that by providing human labeled correspondences between
different objects from the same category instead of explicit semantic labels,
one can recover rich semantic information of an object. In this paper, we
introduce a new dataset named CorresPondenceNet. Based on this dataset, we are
able to learn dense semantic embeddings with a novel geodesic consistency loss.
Accordingly, several state-of-the-art networks are evaluated on this
correspondence benchmark. We further show that CorresPondenceNet could boost not
only fine-grained understanding of heterogeneous objects but also cross-object
registration and partial object matching.
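The abstract names a geodesic consistency loss over dense embeddings but does not state its formula. As a purely illustrative reading, assuming the loss encourages an anchor point's embedding to match only points that are geodesically close to it on the surface, a minimal NumPy sketch might look like this (the function name and exact form are assumptions, not the paper's definition):

```python
import numpy as np

def geodesic_consistency_loss(embeddings, geodesic_dist, anchor_idx):
    """Illustrative sketch (assumed form): softly match the anchor's embedding
    against all points, then penalize the expected geodesic distance to the
    anchor under that soft assignment. A smooth embedding field that respects
    surface geometry yields a low value."""
    anchor_emb = embeddings[anchor_idx]            # (d,)
    sims = embeddings @ anchor_emb                 # (n,) similarity to anchor
    weights = np.exp(sims - sims.max())            # stable softmax
    weights /= weights.sum()
    # expected geodesic distance to the anchor under the soft assignment
    return float(weights @ geodesic_dist[anchor_idx])
```

Under this sketch, embeddings that sharply identify the annotated point drive the loss toward zero, while uninformative (uniform) embeddings pay the mean geodesic distance to the anchor.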
Related papers
- 3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding [58.924180772480504]
3D visual grounding aims to localize the target object in a 3D point cloud by a free-form language description.
We propose a relation-aware one-stage framework, named 3D Relative Position-aware Network (3DRP-Net).
arXiv Detail & Related papers (2023-07-25T09:33:25Z)
- Edge Guided GANs with Multi-Scale Contrastive Learning for Semantic Image Synthesis [139.2216271759332]
We propose a novel ECGAN for the challenging semantic image synthesis task.
The semantic labels do not provide detailed structural information, making it challenging to synthesize local details and structures.
The widely adopted CNN operations such as convolution, down-sampling, and normalization usually cause spatial resolution loss.
We propose a novel contrastive learning method, which aims to enforce pixel embeddings belonging to the same semantic class to generate more similar image content.
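The blurb above describes pulling together pixel embeddings of the same semantic class, but gives no formula. A generic supervised-contrastive (InfoNCE-style) sketch of that idea in NumPy follows; the function name, temperature, and exact form are assumptions, not ECGAN's actual loss:

```python
import numpy as np

def class_contrastive_loss(embs, labels, tau=0.1):
    """Generic sketch: for each pixel embedding, treat same-class pixels as
    positives and all other pixels as the contrastive denominator, so that
    same-class embeddings are pushed to be more similar."""
    embs = embs / np.linalg.norm(embs, axis=1, keepdims=True)  # unit-normalize
    sim = embs @ embs.T / tau                                  # scaled cosine sims
    n = len(labels)
    loss = 0.0
    for i in range(n):
        pos = (labels == labels[i]) & (np.arange(n) != i)      # same-class mask
        if not pos.any():
            continue
        logits = np.delete(sim[i], i)                          # exclude self
        log_denom = np.log(np.exp(logits - logits.max()).sum()) + logits.max()
        loss += -(sim[i][pos] - log_denom).mean()              # -log p(positive)
    return loss / n
```

With well-separated same-class embeddings the loss is near zero; mixing classes among the positives drives it up, which is the behavior the blurb's "more similar image content" objective relies on.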
arXiv Detail & Related papers (2023-07-22T14:17:19Z)
- Disentangling Learnable and Memorizable Data via Contrastive Learning for Semantic Communications [81.10703519117465]
A novel machine reasoning framework is proposed to disentangle source data so as to make it semantic-ready.
In particular, a novel contrastive learning framework is proposed, whereby instance and cluster discrimination are performed on the data.
Deep semantic clusters of highest confidence are considered learnable, semantic-rich data.
Our simulation results showcase the superiority of our contrastive learning approach in terms of semantic impact and minimalism.
arXiv Detail & Related papers (2022-12-18T12:00:12Z)
- EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding [4.447173454116189]
3D visual grounding aims to localize, within a point cloud, the object mentioned by a free-form natural language description with rich semantic cues.
We present EDA that Explicitly Decouples the textual attributes in a sentence.
We further introduce a new visual grounding task, locating objects without object names, which can thoroughly evaluate the model's dense alignment capacity.
arXiv Detail & Related papers (2022-09-29T17:00:22Z)
- Object-Compositional Neural Implicit Surfaces [45.274466719163925]
The neural implicit representation has shown its effectiveness in novel view synthesis and high-quality 3D reconstruction from multi-view images.
This paper proposes a novel framework, ObjectSDF, to build an object-compositional neural implicit representation with high fidelity in 3D reconstruction and object representation.
arXiv Detail & Related papers (2022-07-20T06:38:04Z)
- Regional Semantic Contrast and Aggregation for Weakly Supervised Semantic Segmentation [25.231470587575238]
We propose regional semantic contrast and aggregation (RCA) for learning semantic segmentation.
RCA is equipped with a regional memory bank to store massive, diverse object patterns appearing in training data.
RCA acquires a strong capability for fine-grained semantic understanding and establishes new state-of-the-art results on two popular benchmarks.
arXiv Detail & Related papers (2022-03-17T23:29:03Z)
- Points2Vec: Unsupervised Object-level Feature Learning from Point Clouds [25.988556827312483]
Similar representation learning techniques have not yet become commonplace in the context of 3D vision.
We learn these vector representations by mining a dataset of scanned 3D spaces using an unsupervised algorithm.
We show that using our method to include context increases the ability of a clustering algorithm to distinguish different semantic classes from each other.
arXiv Detail & Related papers (2021-02-08T11:29:57Z)
- Continuous Surface Embeddings [76.86259029442624]
We focus on the task of learning and representing dense correspondences in deformable object categories.
We propose a new, learnable image-based representation of dense correspondences.
We demonstrate that the proposed approach performs on par or better than the state-of-the-art methods for dense pose estimation for humans.
arXiv Detail & Related papers (2020-11-24T22:52:15Z)
- Joint Semantic Analysis with Document-Level Cross-Task Coherence Rewards [13.753240692520098]
We present a neural network architecture for joint coreference resolution and semantic role labeling for English.
We use reinforcement learning to encourage global coherence over the document and between semantic annotations.
This leads to improvements on both tasks in multiple datasets from different domains.
arXiv Detail & Related papers (2020-10-12T09:36:24Z)
- Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation [128.03739769844736]
Two neural co-attentions are incorporated into the classifier to capture cross-image semantic similarities and differences.
In addition to boosting object pattern learning, the co-attention can leverage context from other related images to improve localization map inference.
Our algorithm sets new state-of-the-art results in all these settings, demonstrating its efficacy and generalizability.
arXiv Detail & Related papers (2020-07-03T21:53:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.