Related papers: MeshConv3D: Efficient convolution and pooling operators for triangular 3D meshes

MeshConv3D: Efficient convolution and pooling operators for triangular 3D meshes

URL: http://arxiv.org/abs/2501.03830v1
Date: Tue, 07 Jan 2025 14:41:26 GMT
Title: MeshConv3D: Efficient convolution and pooling operators for triangular 3D meshes
Authors: Germain Bregeon, Marius Preda, Radu Ispas, Titus Zaharia,
Abstract summary: MeshConv3D is a 3D mesh-dedicated methodology integrating specialized convolution and face collapse-based pooling operators.<n>The experimental results obtained on three distinct benchmark datasets show that the proposed approach makes it possible to achieve equivalent or superior classification results.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Convolutional neural networks (CNNs) have been pivotal in various 2D image analysis tasks, including computer vision, image indexing and retrieval or semantic classification. Extending CNNs to 3D data such as point clouds and 3D meshes raises significant challenges since the very basic convolution and pooling operators need to be completely re-visited and re-defined in an appropriate manner to tackle irregular connectivity issues. In this paper, we introduce MeshConv3D, a 3D mesh-dedicated methodology integrating specialized convolution and face collapse-based pooling operators. MeshConv3D operates directly on meshes of arbitrary topology, without any need of prior re-meshing/conversion techniques. In order to validate our approach, we have considered a semantic classification task. The experimental results obtained on three distinct benchmark datasets show that the proposed approach makes it possible to achieve equivalent or superior classification results, while minimizing the related memory footprint and computational load.

Related papers

Large Spatial Model: End-to-end Unposed Images to Semantic 3D [79.94479633598102]
Large Spatial Model (LSM) processes unposed RGB images directly into semantic radiance fields. LSM simultaneously estimates geometry, appearance, and semantics in a single feed-forward operation. It can generate versatile label maps by interacting with language at novel viewpoints.
arXiv Detail & Related papers (2024-10-24T17:54:42Z)
Bayesian Self-Training for Semi-Supervised 3D Segmentation [59.544558398992386]
3D segmentation is a core problem in computer vision. densely labeling 3D point clouds to employ fully-supervised training remains too labor intensive and expensive. Semi-supervised training provides a more practical alternative, where only a small set of labeled data is given, accompanied by a larger unlabeled set.
arXiv Detail & Related papers (2024-09-12T14:54:31Z)
Interpretable 2D Vision Models for 3D Medical Images [47.75089895500738]
This study proposes a simple approach of adapting 2D networks with an intermediate feature representation for processing 3D images. We show on all 3D MedMNIST datasets as benchmark and two real-world datasets consisting of several hundred high-resolution CT or MRI scans that our approach performs on par with existing methods.
arXiv Detail & Related papers (2023-07-13T08:27:09Z)
MeT: A Graph Transformer for Semantic Segmentation of 3D Meshes [10.667492516216887]
We propose a transformer-based method for semantic segmentation of 3D mesh. We perform positional encoding by means of the Laplacian eigenvectors of the adjacency matrix. We show how the proposed approach yields state-of-the-art performance on semantic segmentation of 3D meshes.
arXiv Detail & Related papers (2023-07-03T15:45:14Z)
SeMLaPS: Real-time Semantic Mapping with Latent Prior Networks and Quasi-Planar Segmentation [53.83313235792596]
We present a new methodology for real-time semantic mapping from RGB-D sequences. It combines a 2D neural network and a 3D network based on a SLAM system with 3D occupancy mapping. Our system achieves state-of-the-art semantic mapping quality within 2D-3D networks-based systems.
arXiv Detail & Related papers (2023-06-28T22:36:44Z)
NOVUM: Neural Object Volumes for Robust Object Classification [22.411611823528272]
We show that explicitly integrating 3D compositional object representations into deep networks for image classification leads to a largely enhanced generalization in out-of-distribution scenarios. In particular, we introduce a novel architecture, referred to as NOVUM, that consists of a feature extractor and a neural object volume for every target object class. Our experiments show that NOVUM offers intriguing advantages over standard architectures due to the 3D compositional structure of the object representation.
arXiv Detail & Related papers (2023-05-24T03:20:09Z)
Moving Frame Net: SE(3)-Equivariant Network for Volumes [0.0]
A rotation and translation equivariant neural network for image data was proposed based on the moving frames approach. We significantly improve that approach by reducing the computation of moving frames to only one, at the input stage. Our trained model overperforms the benchmarks in the medical volume classification of most of the tested datasets from MedMNIST3D.
arXiv Detail & Related papers (2022-11-07T10:25:38Z)
Multi-initialization Optimization Network for Accurate 3D Human Pose and Shape Estimation [75.44912541912252]
We propose a three-stage framework named Multi-Initialization Optimization Network (MION) In the first stage, we strategically select different coarse 3D reconstruction candidates which are compatible with the 2D keypoints of input sample. In the second stage, we design a mesh refinement transformer (MRT) to respectively refine each coarse reconstruction result via a self-attention mechanism. Finally, a Consistency Estimation Network (CEN) is proposed to find the best result from mutiple candidates by evaluating if the visual evidence in RGB image matches a given 3D reconstruction.
arXiv Detail & Related papers (2021-12-24T02:43:58Z)
Geometry-Contrastive Transformer for Generalized 3D Pose Transfer [95.56457218144983]
The intuition of this work is to perceive the geometric inconsistency between the given meshes with the powerful self-attention mechanism. We propose a novel geometry-contrastive Transformer that has an efficient 3D structured perceiving ability to the global geometric inconsistencies. We present a latent isometric regularization module together with a novel semi-synthesized dataset for the cross-dataset 3D pose transfer task.
arXiv Detail & Related papers (2021-12-14T13:14:24Z)
Primal-Dual Mesh Convolutional Neural Networks [62.165239866312334]
We propose a primal-dual framework drawn from the graph-neural-network literature to triangle meshes. Our method takes features for both edges and faces of a 3D mesh as input and dynamically aggregates them. We provide theoretical insights of our approach using tools from the mesh-simplification literature.
arXiv Detail & Related papers (2020-10-23T14:49:02Z)
Monocular 3D Detection with Geometric Constraints Embedding and Semi-supervised Training [3.8073142980733]
We propose a novel framework for monocular 3D objects detection using only RGB images, called KM3D-Net. We design a fully convolutional model to predict object keypoints, dimension, and orientation, and then combine these estimations with perspective geometry constraints to compute position attribute.
arXiv Detail & Related papers (2020-09-02T00:51:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.