ShapeEmbed: a self-supervised learning framework for 2D contour quantification
- URL: http://arxiv.org/abs/2507.01009v1
- Date: Tue, 01 Jul 2025 17:55:57 GMT
- Title: ShapeEmbed: a self-supervised learning framework for 2D contour quantification
- Authors: Anna Foix Romero, Craig Russell, Alexander Krull, Virginie Uhlmann
- Abstract summary: We introduce ShapeEmbed, a self-supervised representation learning framework designed to encode the contour of objects in 2D images. Our approach overcomes the limitations of traditional shape descriptors while improving upon existing state-of-the-art autoencoder-based approaches. We demonstrate that the descriptors learned by our framework outperform their competitors in shape classification tasks on natural and biological images.
- Score: 45.39160205677261
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The shape of objects is an important source of visual information in a wide range of applications. One of the core challenges of shape quantification is to ensure that the extracted measurements remain invariant to transformations that preserve an object's intrinsic geometry, such as changing its size, orientation, and position in the image. In this work, we introduce ShapeEmbed, a self-supervised representation learning framework designed to encode the contour of objects in 2D images, represented as a Euclidean distance matrix, into a shape descriptor that is invariant to translation, scaling, rotation, reflection, and point indexing. Our approach overcomes the limitations of traditional shape descriptors while improving upon existing state-of-the-art autoencoder-based approaches. We demonstrate that the descriptors learned by our framework outperform their competitors in shape classification tasks on natural and biological images. We envision our approach to be of particular relevance to biological imaging applications.
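The distance-matrix representation described in the abstract can be illustrated with a short sketch. This is a hypothetical example, not code from the ShapeEmbed paper: `distance_matrix` is an illustrative helper, and the square contour and transform parameters are made up. It shows why pairwise Euclidean distances between contour points are invariant to translation, rotation, and reflection, and why scale can be removed by a simple normalization.

```python
import numpy as np

def distance_matrix(contour):
    # Pairwise Euclidean distances between contour points (N x 2 -> N x N).
    diffs = contour[:, None, :] - contour[None, :, :]
    return np.linalg.norm(diffs, axis=-1)

# Hypothetical contour: a unit square sampled at its 4 corners.
contour = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])

# A rigid transform (rotation + translation) leaves the matrix unchanged.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
moved = contour @ R.T + np.array([3.0, -2.0])

D1 = distance_matrix(contour)
D2 = distance_matrix(moved)
assert np.allclose(D1, D2)  # translation- and rotation-invariant

# Uniform scaling multiplies every entry by the scale factor, so dividing
# by e.g. the mean distance removes scale as well.
D3 = distance_matrix(2.0 * contour)
assert np.allclose(D3, 2.0 * D1)
```

The remaining nuisance transformation mentioned in the abstract, point indexing (where along the contour the sampling starts), cyclically permutes the rows and columns of the matrix rather than leaving it fixed, which is why the paper learns that invariance rather than obtaining it for free from the representation.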
Related papers
- Rethinking Rotation-Invariant Recognition of Fine-grained Shapes from the Perspective of Contour Points [0.0]
We propose an anti-noise, rotation-invariant convolution module based on contour geometric awareness for fine-grained shape recognition. The results show that our method exhibits excellent performance in rotation-invariant recognition of fine-grained shapes.
arXiv Detail & Related papers (2025-03-14T01:34:20Z)
- Interior Object Geometry via Fitted Frames [18.564031163436553]
We describe a representation targeted for anatomic objects which is designed to enable strong locational correspondence within object populations.
The method generates fitted frames on the boundary and in the interior of objects and produces alignment-free geometric features from them.
arXiv Detail & Related papers (2024-07-19T14:38:47Z)
- Few-shot Shape Recognition by Learning Deep Shape-aware Features [39.66613852292122]
We propose a few-shot shape descriptor (FSSD) to recognize object shapes given only one or a few samples.
We employ an embedding module for FSSD to extract transformation-invariant shape features.
We propose a decoding module to include the supervision of shape masks and edges and to align the original and reconstructed shape features.
arXiv Detail & Related papers (2023-12-03T08:12:23Z)
- Geo-SIC: Learning Deformable Geometric Shapes in Deep Image Classifiers [8.781861951759948]
This paper presents Geo-SIC, the first deep learning model to learn deformable shapes in a deformation space for an improved performance of image classification.
We introduce a newly designed framework that simultaneously derives features from both image and latent shape spaces with large intra-class variations.
We develop a boosted classification network, equipped with an unsupervised learning of geometric shape representations.
arXiv Detail & Related papers (2022-10-25T01:55:17Z)
- Generative Category-Level Shape and Pose Estimation with Semantic Primitives [27.692997522812615]
We propose a novel framework for category-level object shape and pose estimation from a single RGB-D image.
To handle the intra-category variation, we adopt a semantic primitive representation that encodes diverse shapes into a unified latent space.
We show that the proposed method achieves SOTA pose estimation performance and better generalization in the real-world dataset.
arXiv Detail & Related papers (2022-10-03T17:51:54Z)
- Frame Averaging for Equivariant Shape Space Learning [85.42901997467754]
A natural way to incorporate symmetries in shape space learning is to ask that the mapping to the shape space (encoder) and mapping from the shape space (decoder) are equivariant to the relevant symmetries.
We present a framework for incorporating equivariance in encoders and decoders by introducing two contributions.
arXiv Detail & Related papers (2021-12-03T06:41:19Z)
- Learning Canonical 3D Object Representation for Fine-Grained Recognition [77.33501114409036]
We propose a novel framework for fine-grained object recognition that learns to recover object variation in 3D space from a single image.
We represent an object as a composition of 3D shape and its appearance, while eliminating the effect of camera viewpoint.
By incorporating 3D shape and appearance jointly in a deep representation, our method learns the discriminative representation of the object.
arXiv Detail & Related papers (2021-08-10T12:19:34Z)
- Disentangled Implicit Shape and Pose Learning for Scalable 6D Pose Estimation [44.8872454995923]
We present a novel approach for scalable 6D pose estimation, by self-supervised learning on synthetic data of multiple objects using a single autoencoder.
We test our method on two multi-object benchmarks with real data, T-LESS and NOCS REAL275, and show it outperforms existing RGB-based methods in terms of pose estimation accuracy and generalization.
arXiv Detail & Related papers (2021-07-27T01:55:30Z)
- DONet: Learning Category-Level 6D Object Pose and Size Estimation from Depth Observation [53.55300278592281]
We propose a method of Category-level 6D Object Pose and Size Estimation (COPSE) from a single depth image.
Our framework makes inferences based on the rich geometric information of the object in the depth channel alone.
Our framework competes with state-of-the-art approaches that require labeled real-world images.
arXiv Detail & Related papers (2021-06-27T10:41:50Z)
- Learning to Caricature via Semantic Shape Transform [95.25116681761142]
We propose an algorithm based on a semantic shape transform to produce shape exaggerations.
We show that the proposed framework is able to render visually pleasing shape exaggerations while maintaining their facial structures.
arXiv Detail & Related papers (2020-08-12T03:41:49Z)
- Fine-grained Image-to-Image Transformation towards Visual Recognition [102.51124181873101]
We aim at transforming an image with a fine-grained category to synthesize new images that preserve the identity of the input image.
We adopt a model based on generative adversarial networks to disentangle the identity related and unrelated factors of an image.
Experiments on the CompCars and Multi-PIE datasets demonstrate that our model preserves the identity of the generated images much better than the state-of-the-art image-to-image transformation models.
arXiv Detail & Related papers (2020-01-12T05:26:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.