PokeFlex: A Real-World Dataset of Deformable Objects for Robotics
- URL: http://arxiv.org/abs/2410.07688v1
- Date: Thu, 10 Oct 2024 07:54:17 GMT
- Title: PokeFlex: A Real-World Dataset of Deformable Objects for Robotics
- Authors: Jan Obrist, Miguel Zamora, Hehui Zheng, Ronan Hinchet, Firat Ozdemir, Juan Zarate, Robert K. Katzschmann, Stelian Coros
- Abstract summary: PokeFlex is a dataset featuring real-world paired and annotated multimodal data that includes 3D textured meshes, point clouds, RGB images, and depth maps.
Such data can be leveraged for several downstream tasks; we demonstrate one such use case, online 3D mesh reconstruction, on the PokeFlex dataset.
- Score: 17.533143584534155
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data-driven methods have shown great potential in solving challenging manipulation tasks; however, their application in the domain of deformable objects has been constrained, in part, by the lack of data. To address this, we propose PokeFlex, a dataset featuring real-world paired and annotated multimodal data that includes 3D textured meshes, point clouds, RGB images, and depth maps. Such data can be leveraged for several downstream tasks, such as online 3D mesh reconstruction, and it can potentially enable underexplored applications such as the real-world deployment of traditional control methods based on mesh simulations. To deal with the challenges posed by real-world 3D mesh reconstruction, we leverage a professional volumetric capture system that allows complete 360° reconstruction. PokeFlex consists of 18 deformable objects with varying stiffness and shapes. Deformations are generated by dropping objects onto a flat surface or by poking the objects with a robot arm. Interaction forces and torques are also reported for the latter case. Using different data modalities, we demonstrate a use case for the PokeFlex dataset in online 3D mesh reconstruction. We refer the reader to our website ( https://pokeflex-dataset.github.io/ ) for demos and examples of our dataset.
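The abstract describes paired multimodal frames: a textured mesh, a point cloud, an RGB image, a depth map, and force/torque readings for the poking sequences. The sketch below shows how such paired data might be loaded; the directory layout, file names, and `PokeFlexFrame` fields are assumptions for illustration, not the dataset's documented structure, so consult the project website for the actual format.

```python
# Minimal sketch of loading one paired PokeFlex-style frame. The directory
# layout, file names, and field types below are assumptions for illustration;
# see https://pokeflex-dataset.github.io/ for the dataset's actual format.
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

import imageio.v3 as iio  # pip install imageio
import numpy as np
import trimesh            # pip install trimesh

@dataclass
class PokeFlexFrame:
    mesh: trimesh.Trimesh         # 3D textured mesh for this time step
    point_cloud: np.ndarray       # (N, 3) points
    rgb: np.ndarray               # (H, W, 3) uint8 color image
    depth: np.ndarray             # (H, W) depth map
    wrench: Optional[np.ndarray]  # (6,) force/torque, poking sequences only

def load_frame(root: Path, obj: str, seq: str, t: int) -> PokeFlexFrame:
    """Load the paired modalities for object `obj`, sequence `seq`, step `t`."""
    base = root / obj / seq
    mesh = trimesh.load(base / f"mesh_{t:05d}.obj", force="mesh")
    cloud = np.load(base / f"cloud_{t:05d}.npy")
    rgb = iio.imread(base / f"rgb_{t:05d}.png")
    depth = np.load(base / f"depth_{t:05d}.npy")
    # Interaction forces/torques are reported only for robot-poking sequences.
    wrench_path = base / f"wrench_{t:05d}.npy"
    wrench = np.load(wrench_path) if wrench_path.exists() else None
    return PokeFlexFrame(mesh, cloud, rgb, depth, wrench)
```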
Related papers
- Articulate AnyMesh: Open-Vocabulary 3D Articulated Objects Modeling [48.78204955169967]
Articulate Anymesh is an automated framework that converts a rigid 3D mesh into its articulated counterpart in an open-vocabulary manner.
Our experiments show that Articulate Anymesh can generate large-scale, high-quality 3D articulated objects, including tools, toys, mechanical devices, and vehicles.
arXiv Detail & Related papers (2025-02-04T18:59:55Z)
- CULTURE3D: Cultural Landmarks and Terrain Dataset for 3D Applications [11.486451047360248]
We present a large-scale fine-grained dataset using high-resolution images captured from locations worldwide.
Our dataset is built using drone-captured aerial imagery, which provides a more accurate perspective for capturing real-world site layouts and architectural structures.
The dataset enables seamless integration with multi-modal data, supporting a range of 3D applications, from architectural reconstruction to virtual tourism.
arXiv Detail & Related papers (2025-01-12T20:36:39Z)
- DOFS: A Real-world 3D Deformable Object Dataset with Full Spatial Information for Dynamics Model Learning [7.513355021861478]
This work proposes DOFS, a pilot dataset of 3D deformable objects (DOs), such as elasto-plastic objects, with full spatial information.
The dataset consists of active manipulation actions, multi-view RGB-D images, well-registered point clouds, 3D deformed meshes, and 3D occupancy with semantics.
In addition, we trained a neural network with the down-sampled 3D occupancy and action as input to model the dynamics of an elasto-plastic object.
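As a rough illustration of the dynamics-learning setup described above, the sketch below pairs a 3D convolutional encoder over a down-sampled occupancy grid with an action vector to predict the next occupancy. The grid resolution, action dimension, and architecture are assumptions; the DOFS paper's actual network may differ.

```python
# Hedged sketch of an occupancy-conditioned dynamics model in the spirit of
# the DOFS summary: (down-sampled 3D occupancy, action) -> next occupancy.
# Grid size, action dimension, and layer choices are assumptions.
import torch
import torch.nn as nn

class OccupancyDynamics(nn.Module):
    def __init__(self, grid: int = 32, action_dim: int = 7):
        super().__init__()
        self.grid = grid
        self.encoder = nn.Sequential(  # 3D conv encoder over the occupancy grid
            nn.Conv3d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        feat = 32 * (grid // 4) ** 3
        self.head = nn.Sequential(     # fuse the action, decode next-occupancy logits
            nn.Linear(feat + action_dim, 512), nn.ReLU(),
            nn.Linear(512, grid ** 3),
        )

    def forward(self, occ: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        # occ: (B, 1, G, G, G) in {0, 1}; action: (B, action_dim)
        z = self.encoder(occ)
        logits = self.head(torch.cat([z, action], dim=-1))
        return logits.view(-1, 1, self.grid, self.grid, self.grid)

model = OccupancyDynamics()
occ = torch.randint(0, 2, (4, 1, 32, 32, 32)).float()
act = torch.randn(4, 7)
next_occ_logits = model(occ, act)  # train with BCEWithLogitsLoss vs. next frame
```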
arXiv Detail & Related papers (2024-10-29T05:46:16Z)
- MegaScenes: Scene-Level View Synthesis at Scale [69.21293001231993]
Scene-level novel view synthesis (NVS) is fundamental to many vision and graphics applications.
We create a large-scale scene-level dataset from Internet photo collections, called MegaScenes, which contains over 100K structure from motion (SfM) reconstructions from around the world.
We analyze failure cases of state-of-the-art NVS methods and significantly improve generation consistency.
arXiv Detail & Related papers (2024-06-17T17:55:55Z)
- Total-Decom: Decomposed 3D Scene Reconstruction with Minimal Interaction [51.3632308129838]
We present Total-Decom, a novel method for decomposed 3D reconstruction with minimal human interaction.
Our approach seamlessly integrates the Segment Anything Model (SAM) with hybrid implicit-explicit neural surface representations and a mesh-based region-growing technique for accurate 3D object decomposition.
We extensively evaluate our method on benchmark datasets and demonstrate its potential for downstream applications, such as animation and scene editing.
arXiv Detail & Related papers (2024-03-28T11:12:33Z)
- UniG3D: A Unified 3D Object Generation Dataset [75.49544172927749]
UniG3D is a unified 3D object generation dataset constructed by employing a universal data transformation pipeline on ShapeNet datasets.
This pipeline converts each raw 3D model into a comprehensive multi-modal data representation.
The selection of data sources for our dataset is based on their scale and quality.
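A transformation pipeline of this kind typically derives several representations from one raw asset. The sketch below shows a minimal version of such a step using trimesh, producing a point cloud with normals alongside the mesh geometry; the function name, sampling count, and output schema are illustrative assumptions, not UniG3D's actual pipeline.

```python
# Hedged sketch of one step of a UniG3D-style transformation pipeline:
# derive a point cloud (with normals) from a raw mesh. Function name,
# sampling count, and output schema are illustrative assumptions.
import numpy as np
import trimesh  # pip install trimesh

def to_multimodal(mesh_path: str, n_points: int = 8192) -> dict:
    mesh = trimesh.load(mesh_path, force="mesh")
    # Sample points uniformly on the surface; keep per-point normals
    # taken from the faces the samples came from.
    points, face_idx = trimesh.sample.sample_surface(mesh, n_points)
    normals = mesh.face_normals[face_idx]
    return {
        "vertices": np.asarray(mesh.vertices),
        "faces": np.asarray(mesh.faces),
        "point_cloud": np.hstack([np.asarray(points), normals]),  # (N, 6)
    }
```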
arXiv Detail & Related papers (2023-06-19T07:03:45Z)
- Parcel3D: Shape Reconstruction from Single RGB Images for Applications in Transportation Logistics [62.997667081978825]
We focus on enabling damage and tampering detection in logistics and tackle the problem of 3D shape reconstruction of potentially damaged parcels.
We present a novel synthetic dataset, named Parcel3D, that is based on the Google Scanned Objects (GSO) dataset.
We present a novel architecture called CubeRefine R-CNN, which combines 3D bounding box estimation with iterative mesh refinement.
arXiv Detail & Related papers (2023-04-18T13:55:51Z)
- 3D Human Mesh Estimation from Virtual Markers [34.703241940871635]
We present an intermediate representation, named virtual markers, consisting of 64 landmark keypoints learned on the body surface.
Our approach outperforms the state-of-the-art methods on three datasets.
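The core idea of a virtual-marker representation can be illustrated with a few lines of linear algebra: dense mesh vertices are recovered as a learned linear combination of the sparse markers. In the sketch below, the interpolation matrix and the SMPL-style vertex count are assumptions for illustration; in practice the matrix is learned from data.

```python
# Sketch of the virtual-marker idea: dense mesh vertices are recovered as a
# linear combination of K = 64 sparse 3D markers. The interpolation matrix A
# is random here purely for illustration; in the paper it is learned from
# data. V = 6890 matches SMPL-style body meshes (an assumption).
import numpy as np

K, V = 64, 6890
A = np.random.rand(V, K)
A /= A.sum(axis=1, keepdims=True)  # rows act as convex-combination weights

markers = np.random.randn(K, 3)    # predicted 3D virtual markers
vertices = A @ markers             # (V, 3) reconstructed mesh vertices
```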
arXiv Detail & Related papers (2023-03-21T10:30:43Z)
- D3D-HOI: Dynamic 3D Human-Object Interactions from Videos [49.38319295373466]
We introduce D3D-HOI: a dataset of monocular videos with ground truth annotations of 3D object pose, shape and part motion during human-object interactions.
Our dataset consists of several common articulated objects captured from diverse real-world scenes and camera viewpoints.
We leverage the estimated 3D human pose for more accurate inference of the object spatial layout and dynamics.
arXiv Detail & Related papers (2021-08-19T00:49:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.