Blocks World Revisited: The Effect of Self-Occlusion on Classification
by Convolutional Neural Networks
- URL: http://arxiv.org/abs/2102.12911v1
- Date: Thu, 25 Feb 2021 15:02:47 GMT
- Authors: Markus D. Solbach, John K. Tsotsos
- Abstract summary: TEOS (The Effect of Self-Occlusion) is a 3D blocks world dataset that focuses on the geometric shape of 3D objects.
In the real world, self-occlusion of 3D objects still presents significant challenges for deep learning approaches.
We provide 738 uniformly sampled views of each object, their mask, object and camera position, orientation, amount of self-occlusion, as well as the CAD model of each object.
- Score: 17.58979205709865
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the recent successes in computer vision, there remain new avenues to
explore. In this work, we propose a new dataset to investigate the effect of
self-occlusion on deep neural networks. With TEOS (The Effect of
Self-Occlusion), we propose a 3D blocks world dataset that focuses on the
geometric shape of 3D objects and their omnipresent challenge of
self-occlusion. We designed TEOS to investigate the role of self-occlusion in
the context of object classification. Even though remarkable progress has been
seen in object classification, self-occlusion remains a challenge. In the
real world, self-occlusion of 3D objects still presents significant challenges
for deep learning approaches. However, humans deal with this by deploying
complex strategies, for instance, by changing the viewpoint or manipulating the
scene to gather necessary information. With TEOS, we present a dataset of two
difficulty levels (L1 and L2), containing 36 and 12 objects, respectively. We
provide 738 uniformly sampled views of each object, their mask, object and
camera position, orientation, amount of self-occlusion, as well as the CAD
model of each object. We present baseline evaluations with five well-known
classification deep neural networks and show that TEOS poses a significant
challenge for all of them. The dataset, as well as the pre-trained models, are
made publicly available for the scientific community under
https://nvision2.data.eecs.yorku.ca/TEOS.
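As a rough sanity check on the dataset size implied by the abstract, the sketch below multiplies the stated object counts (36 for L1, 12 for L2) by the stated 738 views per object. It assumes every object in both levels has exactly 738 views and that the two levels do not share objects. The names `VIEWS_PER_OBJECT`, `OBJECTS_PER_LEVEL`, and `views_per_level` are illustrative and not part of the TEOS release.

```python
# Figures taken from the abstract; the names are hypothetical, not TEOS identifiers.
VIEWS_PER_OBJECT = 738
OBJECTS_PER_LEVEL = {"L1": 36, "L2": 12}


def views_per_level(objects_per_level, views_per_object):
    """Return the number of views per difficulty level.

    Assumes each object contributes the same number of views.
    """
    return {
        level: n_objects * views_per_object
        for level, n_objects in objects_per_level.items()
    }


counts = views_per_level(OBJECTS_PER_LEVEL, VIEWS_PER_OBJECT)
total = sum(counts.values())
print(counts)  # {'L1': 26568, 'L2': 8856}
print(total)   # 35424
```

Under these assumptions the two levels together would hold 35,424 views; the actual release may differ if any object has missing or additional views.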
Related papers
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments robustly display our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- Unsupervised Discovery of Object-Centric Neural Fields [21.223170092979498]
We study inferring 3D object-centric scene representations from a single image.
We propose Unsupervised discovery of Object-Centric neural Fields (uOCF).
arXiv Detail & Related papers (2024-02-12T02:16:59Z)
- 4D Unsupervised Object Discovery [53.561750858325915]
We propose 4D unsupervised object discovery, jointly discovering objects from 4D data -- 3D point clouds and 2D RGB images with temporal information.
We present the first practical approach for this task by proposing a ClusterNet on 3D point clouds, which is jointly optimized with a 2D localization network.
arXiv Detail & Related papers (2022-10-10T16:05:53Z)
- Object Scene Representation Transformer [56.40544849442227]
We introduce Object Scene Representation Transformer (OSRT), a 3D-centric model in which individual object representations naturally emerge through novel view synthesis.
OSRT scales to significantly more complex scenes with larger diversity of objects and backgrounds than existing methods.
It is multiple orders of magnitude faster at compositional rendering thanks to its light field parametrization and the novel Slot Mixer decoder.
arXiv Detail & Related papers (2022-06-14T15:40:47Z)
- Revealing Occlusions with 4D Neural Fields [19.71277637485384]
For computer vision systems to operate in dynamic situations, they need to be able to represent and reason about object permanence.
We introduce a framework for learning to estimate 4D visual representations from monocular RGB-D video.
arXiv Detail & Related papers (2022-04-22T20:14:42Z)
- Unsupervised Learning of 3D Object Categories from Videos in the Wild [75.09720013151247]
We focus on learning a model from multiple views of a large collection of object instances.
We propose a new neural network design, called warp-conditioned ray embedding (WCR), which significantly improves reconstruction.
Our evaluation demonstrates performance improvements over several deep monocular reconstruction baselines on existing benchmarks.
arXiv Detail & Related papers (2021-03-30T17:57:01Z)
- Secrets of 3D Implicit Object Shape Reconstruction in the Wild [92.5554695397653]
Reconstructing high-fidelity 3D objects from sparse, partial observation is crucial for various applications in computer vision, robotics, and graphics.
Recent neural implicit modeling methods show promising results on synthetic or dense datasets.
However, they perform poorly on real-world data that is sparse and noisy.
This paper analyzes the root cause of such deficient performance of a popular neural implicit model.
arXiv Detail & Related papers (2021-01-18T03:24:48Z)
- Learning to Reconstruct and Segment 3D Objects [4.709764624933227]
We aim to understand scenes and the objects within them by learning general and robust representations using deep neural networks.
This thesis makes three core contributions from object-level 3D shape estimation from single or multiple views to scene-level semantic understanding.
arXiv Detail & Related papers (2020-10-19T15:09:04Z)
- SESS: Self-Ensembling Semi-Supervised 3D Object Detection [138.80825169240302]
We propose SESS, a self-ensembling semi-supervised 3D object detection framework. Specifically, we design a thorough perturbation scheme to enhance generalization of the network on unlabeled and new unseen data.
Our SESS achieves competitive performance compared to the state-of-the-art fully-supervised method by using only 50% labeled data.
arXiv Detail & Related papers (2019-12-26T08:48:04Z)