ObjectFolder: A Dataset of Objects with Implicit Visual, Auditory, and
Tactile Representations
- URL: http://arxiv.org/abs/2109.07991v2
- Date: Sat, 18 Sep 2021 17:38:18 GMT
- Title: ObjectFolder: A Dataset of Objects with Implicit Visual, Auditory, and
Tactile Representations
- Authors: Ruohan Gao, Yen-Yu Chang, Shivani Mall, Li Fei-Fei, Jiajun Wu
- Abstract summary: We present ObjectFolder, a dataset of 100 virtualized objects that addresses both challenges with two key innovations.
First, ObjectFolder encodes the visual, auditory, and tactile sensory data for all objects, enabling a number of multisensory object recognition tasks.
Second, ObjectFolder employs a uniform, object-centric, and implicit representation for each object's visual textures, acoustic simulations, and tactile readings, making the dataset flexible to use and easy to share.
- Score: 52.226947570070784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multisensory object-centric perception, reasoning, and interaction have been
a key research topic in recent years. However, the progress in these directions
is limited by the small set of objects available -- synthetic objects are not
realistic enough and are mostly centered around geometry, while real object
datasets such as YCB are often practically challenging and unstable to acquire
due to international shipping, inventory, and financial cost. We present
ObjectFolder, a dataset of 100 virtualized objects that addresses both
challenges with two key innovations. First, ObjectFolder encodes the visual,
auditory, and tactile sensory data for all objects, enabling a number of
multisensory object recognition tasks, beyond existing datasets that focus
purely on object geometry. Second, ObjectFolder employs a uniform,
object-centric, and implicit representation for each object's visual textures,
acoustic simulations, and tactile readings, making the dataset flexible to use
and easy to share. We demonstrate the usefulness of our dataset as a testbed
for multisensory perception and control by evaluating it on a variety of
benchmark tasks, including instance recognition, cross-sensory retrieval, 3D
reconstruction, and robotic grasping.
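
As a rough illustration of what a "uniform, object-centric, and implicit representation" can mean in practice, the sketch below stores each modality of one object as a small neural network that is queried at coordinates, rather than as stored images or recordings. It is not ObjectFolder's actual architecture or API: the network sizes, the input and output layouts, and the names `init_mlp`, `query`, and `object_networks` are all assumptions made for this example, and the weights are random (untrained), so the outputs carry no meaning beyond their shapes.

```python
import numpy as np


def init_mlp(layer_sizes, seed=0):
    """Create random (untrained) weights for a small fully connected network."""
    rng = np.random.default_rng(seed)
    params = []
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        w = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))
        b = np.zeros(fan_out)
        params.append((w, b))
    return params


def query(params, inputs):
    """Evaluate the network at a batch of query coordinates."""
    h = np.asarray(inputs, dtype=float)
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)  # ReLU hidden layers
    w, b = params[-1]
    return 1.0 / (1.0 + np.exp(-(h @ w + b)))  # sigmoid output layer


# Hypothetical per-object "folder": one network per modality, all queried the same way.
# Input/output layouts are placeholders chosen for this sketch.
object_networks = {
    "vision": init_mlp([5, 64, 64, 3], seed=0),  # (x, y, z, theta, phi) -> RGB
    "audio": init_mlp([3, 64, 64, 8], seed=1),   # impact point -> 8 placeholder gains
    "touch": init_mlp([3, 64, 64, 1], seed=2),   # contact point -> placeholder reading
}

points = np.array([[0.1, 0.2, 0.3], [0.0, -0.5, 0.4]])  # two surface points
view = np.array([[0.0, 1.0], [0.5, 0.5]])               # two viewing directions

rgb = query(object_networks["vision"], np.hstack([points, view]))
gains = query(object_networks["audio"], points)
reading = query(object_networks["touch"], points)

print(rgb.shape, gains.shape, reading.shape)
```

Under these assumptions, sharing an object amounts to sharing a few sets of network weights, and every modality is accessed through the same coordinate-query interface, which is one way to read the abstract's "flexible to use and easy to share".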
Related papers
- Chat-3D v2: Bridging 3D Scene and Large Language Models with Object
Identifiers [62.232809030044116]
We introduce the use of object identifiers to freely reference objects during a conversation.
We propose a two-stage alignment method, which involves learning an attribute-aware token and a relation-aware token for each object.
Experiments conducted on traditional datasets like ScanQA, ScanRefer, and Nr3D/Sr3D showcase the effectiveness of our proposed method.
arXiv Detail & Related papers (2023-12-13T14:27:45Z)
- The ObjectFolder Benchmark: Multisensory Learning with Neural and Real
Objects [51.22194706674366]
We introduce the ObjectFolder Benchmark, a benchmark suite of 10 tasks for multisensory object-centric learning.
We also introduce the ObjectFolder Real dataset, including the multisensory measurements for 100 real-world household objects.
arXiv Detail & Related papers (2023-06-01T17:51:22Z)
- Lifelong Ensemble Learning based on Multiple Representations for
Few-Shot Object Recognition [6.282068591820947]
We present a lifelong ensemble learning approach based on multiple representations to address the few-shot object recognition problem.
To facilitate lifelong learning, each approach is equipped with a memory unit for storing and retrieving object information instantly.
We have performed extensive sets of experiments to assess the performance of the proposed approach in offline and open-ended scenarios.
arXiv Detail & Related papers (2022-05-04T10:29:10Z)
- ObjectFolder 2.0: A Multisensory Object Dataset for Sim2Real Transfer [46.24535144252644]
We present ObjectFolder 2.0, a large-scale dataset of common household objects in the form of implicit neural representations.
Our dataset is 10 times larger in the number of objects and orders of magnitude faster in rendering time.
We show that models learned from virtual objects in our dataset successfully transfer to their real-world counterparts.
arXiv Detail & Related papers (2022-04-05T17:55:01Z)
- Complex-Valued Autoencoders for Object Discovery [62.26260974933819]
We propose a distributed approach to object-centric representations: the Complex AutoEncoder.
We show that this simple and efficient approach achieves better reconstruction performance than an equivalent real-valued autoencoder on simple multi-object datasets.
We also show that it achieves competitive unsupervised object discovery performance to a SlotAttention model on two datasets, and manages to disentangle objects in a third dataset where SlotAttention fails - all while being 7-70 times faster to train.
arXiv Detail & Related papers (2022-04-05T09:25:28Z)
- REGRAD: A Large-Scale Relational Grasp Dataset for Safe and
Object-Specific Robotic Grasping in Clutter [52.117388513480435]
We present a new dataset named REGRAD to sustain the modeling of relationships among objects and grasps.
Our dataset is collected in the form of both 2D images and 3D point clouds.
Users are free to import their own object models to generate as much data as they want.
arXiv Detail & Related papers (2021-04-29T05:31:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.