Self-supervised Secondary Landmark Detection via 3D Representation Learning
- URL: http://arxiv.org/abs/2110.00543v1
- Date: Fri, 1 Oct 2021 17:15:47 GMT
- Title: Self-supervised Secondary Landmark Detection via 3D Representation Learning
- Authors: Praneet C. Bala, Jan Zimmermann, Hyun Soo Park, and Benjamin Y. Hayden
- Abstract summary: We present a method to learn the spatial relationship between primary and secondary landmarks in three-dimensional space.
This learning can be applied to various multiview settings across diverse organisms, including macaques, flies, and humans.
- Score: 13.157012771922801
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent technological developments have spurred great advances in the
computerized tracking of joints and other landmarks in moving animals,
including humans. Such tracking promises important advances in biology and
biomedicine. Modern tracking models depend critically on labor-intensive
annotated datasets of primary landmarks by non-expert humans. However, such
annotation approaches can be costly and impractical for secondary landmarks,
that is, ones that reflect fine-grained geometry of animals, and that are often
specific to customized behavioral tasks. Due to visual and geometric ambiguity,
nonexperts are often not qualified for secondary landmark annotation, which can
require anatomical and zoological knowledge. These barriers significantly
impede downstream behavioral studies because the learned tracking models
exhibit limited generalizability. We hypothesize that there exists a shared
representation between the primary and secondary landmarks because the range of
motion of the secondary landmarks can be approximately spanned by that of the
primary landmarks. We present a method to learn this spatial relationship of
the primary and secondary landmarks in three dimensional space, which can, in
turn, self-supervise the secondary landmark detector. This 3D representation
learning is generic, and can therefore be applied to various multiview settings
across diverse organisms, including macaques, flies, and humans.
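As a rough illustration of the hypothesis in the abstract (that secondary landmark positions are approximately spanned by the primary landmarks), the following minimal sketch fits an affine map from primary to secondary 3D landmarks on synthetic data and projects the predictions into one camera view as pseudo-labels. This is not the authors' method: the affine model, the synthetic data, the camera matrix, and all function names (`fit_affine_map`, `predict_secondary`, `project`) are assumptions made for illustration only.

```python
import numpy as np


def _design(primary):
    """Flatten (N, P, 3) primary landmarks to (N, 3P + 1) with a bias column."""
    n = primary.shape[0]
    return np.hstack([primary.reshape(n, -1), np.ones((n, 1))])


def fit_affine_map(primary, secondary):
    """Least-squares affine map from primary (N, P, 3) to secondary (N, S, 3)."""
    n = primary.shape[0]
    weights, *_ = np.linalg.lstsq(_design(primary), secondary.reshape(n, -1), rcond=None)
    return weights


def predict_secondary(weights, primary):
    """Predict secondary 3D landmarks, shape (N, S, 3), from primary ones."""
    n = primary.shape[0]
    return (_design(primary) @ weights).reshape(n, -1, 3)


def project(points3d, camera):
    """Project (N, S, 3) points to (N, S, 2) pixels with a 3x4 camera matrix."""
    ones = np.ones(points3d.shape[:-1] + (1,))
    uvw = np.concatenate([points3d, ones], axis=-1) @ camera.T
    return uvw[..., :2] / uvw[..., 2:3]


# Synthetic data (assumption): each secondary landmark is a fixed convex
# combination of the primary landmarks plus a small amount of noise.
rng = np.random.default_rng(0)
n_frames, n_primary, n_secondary = 200, 4, 2
primary = rng.uniform(-1.0, 1.0, size=(n_frames, n_primary, 3))
mix = rng.dirichlet(np.ones(n_primary), size=n_secondary)  # shape (S, P)
noise = 0.01 * rng.normal(size=(n_frames, n_secondary, 3))
secondary = np.einsum("sp,npc->nsc", mix, primary) + noise

# Fit on the first 150 frames, evaluate on the remaining 50.
weights = fit_affine_map(primary[:150], secondary[:150])
pred = predict_secondary(weights, primary[150:])
rmse = float(np.sqrt(np.mean((pred - secondary[150:]) ** 2)))

# Hypothetical camera placed 5 units in front of the scene; the projected
# predictions could serve as 2D pseudo-labels for a detector in that view.
camera = np.hstack([np.eye(3), np.array([[0.0], [0.0], [5.0]])])
pseudo_labels = project(pred, camera)
print(f"held-out RMSE: {rmse:.4f}, pseudo-label shape: {pseudo_labels.shape}")
```

In a multiview setting, the same projection step would be repeated with each calibrated camera; a learned, nonlinear representation would replace the affine map used here.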
Related papers
- Video Anomaly Detection with Contours - A Study [24.525564527855092]
We investigate the potential of learning recurrent motion patterns of normal human behavior using 2D contours.
Our results indicate that this novel perspective on Pose-based Video Anomaly Detection marks a promising direction for future research.
arXiv Detail & Related papers (2025-03-25T12:11:50Z)
- Does Spatial Cognition Emerge in Frontier Models? [56.47912101304053]
We present SPACE, a benchmark that systematically evaluates spatial cognition in frontier models.
Results suggest that contemporary frontier models fall short of the spatial intelligence of animals.
arXiv Detail & Related papers (2024-10-09T01:41:49Z)
- URLOST: Unsupervised Representation Learning without Stationarity or Topology [26.010647961403148]
We introduce a novel framework that learns from high-dimensional data without prior knowledge of stationarity and topology.
Our model, abbreviated as URLOST, combines a learnable self-organizing layer, spectral clustering, and a masked autoencoder.
We evaluate its effectiveness on three diverse data modalities including simulated biological vision data, neural recordings from the primary visual cortex, and gene expressions.
arXiv Detail & Related papers (2023-10-06T18:00:02Z)
- Unsupervised 3D Keypoint Discovery with Multi-View Geometry [104.76006413355485]
We propose an algorithm that learns to discover 3D keypoints on human bodies from multiple-view images without supervision or labels.
Our approach discovers more interpretable and accurate 3D keypoints compared to other state-of-the-art unsupervised approaches.
arXiv Detail & Related papers (2022-11-23T10:25:12Z)
- Stochastic Coherence Over Attention Trajectory For Continuous Learning In Video Streams [64.82800502603138]
This paper proposes a novel neural-network-based approach to progressively and autonomously develop pixel-wise representations in a video stream.
The proposed method is based on a human-like attention mechanism that allows the agent to learn by observing what is moving in the attended locations.
Our experiments leverage 3D virtual environments and they show that the proposed agents can learn to distinguish objects just by observing the video stream.
arXiv Detail & Related papers (2022-04-26T09:52:31Z)
- Overcoming the Domain Gap in Neural Action Representations [60.47807856873544]
3D pose data can now be reliably extracted from multi-view video sequences without manual intervention.
We propose to use it to guide the encoding of neural action representations together with a set of neural and behavioral augmentations.
To reduce the domain gap, during training, we swap neural and behavioral data across animals that seem to be performing similar actions.
arXiv Detail & Related papers (2021-12-02T12:45:46Z)
- Beyond Tracking: Using Deep Learning to Discover Novel Interactions in Biological Swarms [3.441021278275805]
We propose training deep network models to predict system-level states directly from generic graphical features from the entire view.
Because the resulting predictive models are not based on human-understood predictors, we use explanatory modules.
This represents an example of augmented intelligence in behavioral ecology -- knowledge co-creation in a human-AI team.
arXiv Detail & Related papers (2021-08-20T22:50:41Z)
- Pretrained equivariant features improve unsupervised landmark discovery [69.02115180674885]
We formulate a two-step unsupervised approach that overcomes this challenge by first learning powerful pixel-based features.
Our method produces state-of-the-art results in several challenging landmark detection datasets.
arXiv Detail & Related papers (2021-04-07T05:42:11Z)
- A primer on model-guided exploration of fitness landscapes for biological sequence design [0.0]
In this primer we highlight that algorithms for experimental design, what we call "exploration strategies", are a related, yet distinct problem from building good models of sequence-to-function maps.
This primer can serve as a starting point for researchers from different domains that are interested in the problem of searching a sequence space with a model.
arXiv Detail & Related papers (2020-10-04T21:32:07Z)
- Structured Landmark Detection via Topology-Adapting Deep Graph Learning [75.20602712947016]
We present a new topology-adapting deep graph learning approach for accurate anatomical facial and medical landmark detection.
The proposed method constructs graph signals leveraging both local image features and global shape features.
Experiments are conducted on three public facial image datasets (WFLW, 300W, and COFW-68) as well as three real-world X-ray medical datasets (Cephalometric (public), Hand, and Pelvis).
arXiv Detail & Related papers (2020-04-17T11:55:03Z)
- Transferring Dense Pose to Proximal Animal Classes [83.84439508978126]
We show that it is possible to transfer the knowledge existing in dense pose recognition for humans, as well as in more general object detectors and segmenters, to the problem of dense pose recognition in other classes.
We do this by establishing a DensePose model for the new animal which is also geometrically aligned to humans.
We also introduce two benchmark datasets labelled in the manner of DensePose for the class chimpanzee and use them to evaluate our approach.
arXiv Detail & Related papers (2020-02-28T21:43:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.