Ultron: Enabling Temporal Geometry Compression of 3D Mesh Sequences using Temporal Correspondence and Mesh Deformation
- URL: http://arxiv.org/abs/2409.05151v1
- Date: Sun, 8 Sep 2024 16:34:19 GMT
- Title: Ultron: Enabling Temporal Geometry Compression of 3D Mesh Sequences using Temporal Correspondence and Mesh Deformation
- Authors: Haichao Zhu,
- Abstract summary: Existing 3D model compression methods primarily focus on static models and do not consider inter-frame information.
This paper proposes a method to compress mesh sequences with arbitrary topology using temporal correspondence and mesh deformation.
- Score: 2.0914328542137346
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: With the advancement of computer vision, dynamic 3D reconstruction techniques have seen significant progress and found applications in various fields. However, these techniques generate large amounts of 3D data sequences, necessitating efficient storage and transmission methods. Existing 3D model compression methods primarily focus on static models and do not consider inter-frame information, limiting their ability to reduce data size. Temporal mesh compression, which has received less attention, often requires all input meshes to have the same topology, a condition rarely met in real-world applications. This research proposes a method to compress mesh sequences with arbitrary topology using temporal correspondence and mesh deformation. The method establishes temporal correspondence between consecutive frames, applies a deformation model to transform the mesh from one frame to subsequent frames, and replaces the original meshes with deformed ones if the quality meets a tolerance threshold. Extensive experiments demonstrate that this method can achieve state-of-the-art performance in terms of compression performance. The contributions of this paper include a geometry and motion-based model for establishing temporal correspondence between meshes, a mesh quality assessment for temporal mesh sequences, an entropy-based encoding and corner table-based method for compressing mesh sequences, and extensive experiments showing the effectiveness of the proposed method. All the code will be open-sourced at https://github.com/lszhuhaichao/ultron.
Related papers
- MeshXL: Neural Coordinate Field for Generative 3D Foundation Models [51.1972329762843]
We present a family of generative pre-trained auto-regressive models, which addresses the process of 3D mesh generation with modern large language model approaches.
MeshXL is able to generate high-quality 3D meshes, and can also serve as foundation models for various down-stream applications.
arXiv Detail & Related papers (2024-05-31T14:35:35Z) - Deep Spectral Meshes: Multi-Frequency Facial Mesh Processing with Graph
Neural Networks [1.170907599257096]
spectral meshes are introduced as a method to decompose mesh deformations into low- and high-frequency deformations.
A parametric model for 3D facial mesh synthesis is built upon the proposed framework.
Our model takes further advantage of spectral partitioning by representing different frequency levels with disparate, more suitable representations.
arXiv Detail & Related papers (2024-02-15T23:17:08Z) - SpATr: MoCap 3D Human Action Recognition based on Spiral Auto-encoder and Transformer Network [1.4732811715354455]
We introduce a novel approach for 3D human action recognition, denoted as SpATr (Spiral Auto-encoder and Transformer Network)
A lightweight auto-encoder, based on spiral convolutions, is employed to extract spatial geometrical features from each 3D mesh.
The proposed method is evaluated on three prominent 3D human action datasets: Babel, MoVi, and BMLrub.
arXiv Detail & Related papers (2023-06-30T11:49:00Z) - Gait Recognition in the Wild with Multi-hop Temporal Switch [81.35245014397759]
gait recognition in the wild is a more practical problem that has attracted the attention of the community of multimedia and computer vision.
This paper presents a novel multi-hop temporal switch method to achieve effective temporal modeling of gait patterns in real-world scenes.
arXiv Detail & Related papers (2022-09-01T10:46:09Z) - Geometry-Contrastive Transformer for Generalized 3D Pose Transfer [95.56457218144983]
The intuition of this work is to perceive the geometric inconsistency between the given meshes with the powerful self-attention mechanism.
We propose a novel geometry-contrastive Transformer that has an efficient 3D structured perceiving ability to the global geometric inconsistencies.
We present a latent isometric regularization module together with a novel semi-synthesized dataset for the cross-dataset 3D pose transfer task.
arXiv Detail & Related papers (2021-12-14T13:14:24Z) - Deep Deformation Detail Synthesis for Thin Shell Models [47.442883859643004]
In physics-based cloth animation, rich folds and detailed wrinkles are achieved at the cost of expensive computational resources and huge labor tuning.
We develop a temporally and spatially as-consistent-as-possible deformation representation (named TS-ACAP) and a DeformTransformer network to learn the mapping from low-resolution meshes to detailed ones.
Our method is able to produce reliable and realistic animations in various datasets at high frame rates: 10 35 times faster than physics-based simulation, with superior detail synthesis abilities than existing methods.
arXiv Detail & Related papers (2021-02-23T08:09:11Z) - Primal-Dual Mesh Convolutional Neural Networks [62.165239866312334]
We propose a primal-dual framework drawn from the graph-neural-network literature to triangle meshes.
Our method takes features for both edges and faces of a 3D mesh as input and dynamically aggregates them.
We provide theoretical insights of our approach using tools from the mesh-simplification literature.
arXiv Detail & Related papers (2020-10-23T14:49:02Z) - A Real-time Action Representation with Temporal Encoding and Deep
Compression [115.3739774920845]
We propose a new real-time convolutional architecture, called Temporal Convolutional 3D Network (T-C3D), for action representation.
T-C3D learns video action representations in a hierarchical multi-granularity manner while obtaining a high process speed.
Our method achieves clear improvements on UCF101 action recognition benchmark against state-of-the-art real-time methods by 5.4% in terms of accuracy and 2 times faster in terms of inference speed with a less than 5MB storage model.
arXiv Detail & Related papers (2020-06-17T06:30:43Z) - Dense Non-Rigid Structure from Motion: A Manifold Viewpoint [162.88686222340962]
Non-Rigid Structure-from-Motion (NRSfM) problem aims to recover 3D geometry of a deforming object from its 2D feature correspondences across multiple frames.
We show that our approach significantly improves accuracy, scalability, and robustness against noise.
arXiv Detail & Related papers (2020-06-15T09:15:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.