Skin feature point tracking using deep feature encodings
- URL: http://arxiv.org/abs/2112.14159v1
- Date: Tue, 28 Dec 2021 14:29:08 GMT
- Title: Skin feature point tracking using deep feature encodings
- Authors: Jose Ramon Chang and Torbjörn E.M. Nordling
- Abstract summary: We propose a feature-tracking pipeline that applies a convolutional stacked autoencoder to identify the crop in an image most similar to a reference crop containing the feature of interest.
We train the autoencoder on facial images and validate its ability to track skin features in general using manually labeled face and hand videos.
We conclude that our method creates better feature descriptors for feature tracking, feature matching, and image registration than the traditional algorithms.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Facial feature tracking is a key component of imaging ballistocardiography
(BCG) where accurate quantification of the displacement of facial keypoints is
needed for good heart rate estimation. Skin feature tracking enables
video-based quantification of motor degradation in Parkinson's disease.
Traditional computer vision algorithms include Scale Invariant Feature
Transform (SIFT), Speeded-Up Robust Features (SURF), and Lucas-Kanade method
(LK). These have long represented the state-of-the-art in efficiency and
accuracy but fail when common deformations, like affine local transformations
or illumination changes, are present.
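The Lucas-Kanade method mentioned above estimates displacement by linearizing the brightness-constancy equation and solving a small least-squares system over image gradients. A minimal single-window NumPy sketch (a toy illustration of the classic formulation, not the implementation benchmarked in the paper):

```python
import numpy as np

def lucas_kanade_step(I1, I2):
    """One Lucas-Kanade step: estimate the translation d = (dx, dy) of the
    image content between frames I1 and I2 from the normal equations
    A d = -[sum(Ix*It), sum(Iy*It)], with A built from spatial gradients."""
    Iy, Ix = np.gradient(I1)          # gradients along rows (y) and columns (x)
    It = I2 - I1                      # temporal difference
    A = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = -np.array([np.sum(Ix * It), np.sum(Iy * It)])
    return np.linalg.solve(A, b)

# Smooth synthetic frame and a copy whose content is shifted by a known
# sub-pixel amount (positive d = content moves right/down).
y, x = np.mgrid[0:64, 0:64].astype(float)
I1 = np.sin(0.2 * x) + np.cos(0.15 * y)
I2 = np.sin(0.2 * (x - 0.3)) + np.cos(0.15 * (y - 0.2))
est = lucas_kanade_step(I1, I2)       # should be close to (0.3, 0.2)
```

Because the method relies on this local linearization, it degrades under exactly the affine deformations and illumination changes noted above.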
Over the past five years, deep convolutional neural networks have
outperformed traditional methods for most computer vision tasks. We propose a
feature-tracking pipeline that applies a convolutional stacked autoencoder to
identify the crop in an image most similar to a reference crop containing the
feature of interest. The autoencoder learns to encode image crops into deep
feature encodings specific to the object category it is trained on.
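The crop-matching step described above can be sketched as a sliding-window search: every candidate crop in a search region is encoded, and the crop whose encoding lies nearest (e.g. in Euclidean distance) to the reference crop's encoding is taken as the feature's new position. A minimal NumPy sketch, with a hand-crafted `encode` as a stand-in for the paper's trained autoencoder bottleneck:

```python
import numpy as np

def encode(crop):
    """Stand-in encoder: flatten and L2-normalize the crop. The paper's
    pipeline would use the stacked autoencoder's learned encoding here."""
    v = crop.astype(float).ravel()
    return v / (np.linalg.norm(v) + 1e-8)

def track_feature(image, ref_crop, search_tl, search_br):
    """Slide a window over the search region [search_tl, search_br) and
    return the top-left corner (y, x) of the crop whose encoding is
    closest to the reference crop's encoding."""
    h, w = ref_crop.shape
    ref_code = encode(ref_crop)
    best, best_dist = None, np.inf
    for y in range(search_tl[0], search_br[0] - h + 1):
        for x in range(search_tl[1], search_br[1] - w + 1):
            code = encode(image[y:y + h, x:x + w])
            d = np.linalg.norm(code - ref_code)   # distance in encoding space
            if d < best_dist:
                best, best_dist = (y, x), d
    return best

# Toy example: a bright "mole" on a noisy dark background.
rng = np.random.default_rng(0)
img = rng.uniform(0, 0.1, size=(40, 40))
img[20:24, 25:29] = 1.0                  # distinctive feature at (20, 25)
ref = img[20:24, 25:29].copy()
pos = track_feature(img, ref, (10, 10), (35, 35))
```

The exhaustive search is quadratic in the search-region size; the point of the learned encoding is that nearest-in-feature-space remains a good match criterion even when the crop deforms, where raw pixel distance would not.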
We train the autoencoder on facial images and validate its ability to track
skin features in general using manually labeled face and hand videos. The
tracking errors of distinctive skin features (moles) are so small that, based
on a $\chi^2$-test, we cannot exclude that they stem from the manual labeling.
With a mean error of 0.6-4.2 pixels, our method outperformed the other methods
in all but one scenario. More importantly, our method was the only one that
did not diverge.
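The $\chi^2$-test referred to above asks whether the observed tracking errors are consistent with manual-labeling noise alone. The abstract does not give the exact test construction; one common form, sketched here under the assumption of zero-mean Gaussian labeling noise with a known standard deviation (`sigma_label` and the sample values are hypothetical, and critical values are obtained by Monte Carlo to stay NumPy-only):

```python
import numpy as np

def chi2_test_errors(errors, sigma_label, alpha=0.05, n_sim=100_000, seed=0):
    """Two-sided chi-square test: are the tracking errors consistent with
    zero-mean Gaussian labeling noise of std sigma_label? Under H0 the
    statistic sum((e_i / sigma)^2) follows a chi2(n) distribution, whose
    critical values are simulated here. Returns True if H0 is not rejected."""
    errors = np.asarray(errors, dtype=float)
    stat = np.sum((errors / sigma_label) ** 2)
    rng = np.random.default_rng(seed)
    sims = np.sum(rng.standard_normal((n_sim, errors.size)) ** 2, axis=1)
    lo, hi = np.quantile(sims, [alpha / 2, 1 - alpha / 2])
    return bool(lo <= stat <= hi)

# Hypothetical per-frame errors matching an assumed 1 px labeling noise.
errs = np.full(50, 1.0)
consistent = chi2_test_errors(errs, sigma_label=1.0)       # cannot reject H0
too_large = chi2_test_errors(errs * 5.0, sigma_label=1.0)  # clearly rejected
```

Failing to reject here is exactly the paper's claim: the errors are small enough that labeling noise alone could explain them.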
We conclude that our method creates better feature descriptors for feature
tracking, feature matching, and image registration than the traditional
algorithms.
Related papers
- Classification and regression of trajectories rendered as images via 2D Convolutional Neural Networks
Recent advances in computer vision have facilitated the processing of trajectories rendered as images via artificial neural networks with 2D convolutional layers (CNNs).
In this study, we investigate the effectiveness of CNNs for solving classification and regression problems from synthetic trajectories rendered as images using different modalities.
Results highlight the importance of choosing an appropriate image resolution according to model depth and motion history in applications where movement direction is critical.
arXiv Detail & Related papers (2024-09-27T15:27:04Z)
- Unsupervised Skin Feature Tracking with Deep Neural Networks
Deep convolutional neural networks have shown remarkable accuracy in tracking tasks.
Our pipeline employs a convolutional stacked autoencoder to match image crops with a reference crop containing the target feature.
Our unsupervised learning approach excels in tracking various skin features under significant motion conditions.
arXiv Detail & Related papers (2024-05-08T10:27:05Z)
- Learning Expressive And Generalizable Motion Features For Face Forgery Detection
We propose an effective sequence-based forgery detection framework based on an existing video classification method.
To make the motion features more expressive for manipulation detection, we propose an alternative motion consistency block.
We make a general video classification network achieve promising results on three popular face forgery datasets.
arXiv Detail & Related papers (2024-03-08T09:25:48Z)
- Attentive Symmetric Autoencoder for Brain MRI Segmentation
We propose a novel Attentive Symmetric Auto-encoder based on Vision Transformer (ViT) for 3D brain MRI segmentation tasks.
In the pre-training stage, the proposed auto-encoder pays more attention to reconstruct the informative patches according to the gradient metrics.
Experimental results show that our proposed attentive symmetric auto-encoder outperforms the state-of-the-art self-supervised learning methods and medical image segmentation models.
arXiv Detail & Related papers (2022-09-19T09:43:19Z)
- Deep Convolutional Pooling Transformer for Deepfake Detection
We propose a deep convolutional Transformer to incorporate decisive image features both locally and globally.
Specifically, we apply convolutional pooling and re-attention to enrich the extracted features and enhance efficacy.
The proposed solution consistently outperforms several state-of-the-art baselines on both within- and cross-dataset experiments.
arXiv Detail & Related papers (2022-09-12T15:05:41Z)
- Keypoint Message Passing for Video-based Person Re-Identification
Video-based person re-identification (re-ID) is an important technique in visual surveillance systems which aims to match video snippets of people captured by different cameras.
Existing methods are mostly based on convolutional neural networks (CNNs), whose building blocks either process local neighbor pixels at a time, or, when 3D convolutions are used to model temporal information, suffer from the misalignment problem caused by person movement.
In this paper, we propose to overcome the limitations of normal convolutions with a human-oriented graph method. Specifically, features located at person joint keypoints are extracted and connected as a spatial-temporal graph.
arXiv Detail & Related papers (2021-11-16T08:01:16Z)
- Pixel-Perfect Structure-from-Motion with Featuremetric Refinement
We refine two key steps of structure-from-motion by a direct alignment of low-level image information from multiple views.
This significantly improves the accuracy of camera poses and scene geometry for a wide range of keypoint detectors.
Our system easily scales to large image collections, enabling pixel-perfect crowd-sourced localization at scale.
arXiv Detail & Related papers (2021-08-18T17:58:55Z)
- Coarse-to-Fine Object Tracking Using Deep Features and Correlation Filters
This paper presents a novel deep learning tracking algorithm.
We exploit the generalization ability of deep features to coarsely estimate target translation.
Then, we capitalize on the discriminative power of correlation filters to precisely localize the tracked object.
arXiv Detail & Related papers (2020-12-23T16:43:21Z)
- Self-Supervised Linear Motion Deblurring
Deep convolutional neural networks are state-of-the-art for image deblurring.
We present a differentiable reblur model for self-supervised motion deblurring.
Our experiments demonstrate that self-supervised single image deblurring is feasible.
arXiv Detail & Related papers (2020-02-10T20:15:21Z)
- Deep Feature Consistent Variational Autoencoder
We present a novel method for constructing a Variational Autoencoder (VAE).
Instead of using pixel-by-pixel loss, we enforce deep feature consistency between the input and the output of a VAE.
We also show that our method can produce latent vectors that can capture the semantic information of face expressions.
arXiv Detail & Related papers (2016-10-02T15:48:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.