Relating Events and Frames Based on Self-Supervised Learning and
Uncorrelated Conditioning for Unsupervised Domain Adaptation
- URL: http://arxiv.org/abs/2401.01042v1
- Date: Tue, 2 Jan 2024 05:10:08 GMT
- Title: Relating Events and Frames Based on Self-Supervised Learning and
Uncorrelated Conditioning for Unsupervised Domain Adaptation
- Authors: Mohammad Rostami and Dayuan Jian
- Abstract summary: Event-based cameras provide accurate and high temporal resolution measurements for performing computer vision tasks.
Despite their advantages, utilizing deep learning for event-based vision encounters a significant obstacle due to the scarcity of annotated data.
We propose a new algorithm tailored for adapting a deep neural network trained on annotated frame-based data to generalize well on event-based unannotated data.
- Score: 23.871860648919593
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Event-based cameras provide accurate and high temporal resolution
measurements for performing computer vision tasks in challenging scenarios,
such as high-dynamic range environments and fast-motion maneuvers. Despite
their advantages, utilizing deep learning for event-based vision encounters a
significant obstacle due to the scarcity of annotated data caused by the
relatively recent emergence of event-based cameras. To overcome this
limitation, leveraging the knowledge available from annotated data obtained
with conventional frame-based cameras presents an effective solution based on
unsupervised domain adaptation. We propose a new algorithm tailored for
adapting a deep neural network trained on annotated frame-based data to
generalize well on event-based unannotated data. Our approach incorporates
uncorrelated conditioning and self-supervised learning in an adversarial
learning scheme to close the gap between the two source and target domains. By
applying self-supervised learning, the algorithm learns to align the
representations of event-based data with those from frame-based camera data,
thereby facilitating knowledge transfer.Furthermore, the inclusion of
uncorrelated conditioning ensures that the adapted model effectively
distinguishes between event-based and conventional data, enhancing its ability
to classify event-based images accurately.Through empirical experimentation and
evaluation, we demonstrate that our algorithm surpasses existing approaches
designed for the same purpose using two benchmarks. The superior performance of
our solution is attributed to its ability to effectively utilize annotated data
from frame-based cameras and transfer the acquired knowledge to the event-based
vision domain.
Related papers
- Evaluating Image-Based Face and Eye Tracking with Event Cameras [9.677797822200965]
Event Cameras, also known as Neuromorphic sensors, capture changes in local light intensity at the pixel level, producing asynchronously generated data termed events''
This data format mitigates common issues observed in conventional cameras, like under-sampling when capturing fast-moving objects.
We evaluate the viability of integrating conventional algorithms with event-based data, transformed into a frame format.
arXiv Detail & Related papers (2024-08-19T20:27:08Z) - VICAN: Very Efficient Calibration Algorithm for Large Camera Networks [49.17165360280794]
We introduce a novel methodology that extends Pose Graph Optimization techniques.
We consider the bipartite graph encompassing cameras, object poses evolving dynamically, and camera-object relative transformations at each time step.
Our framework retains compatibility with traditional PGO solvers, but its efficacy benefits from a custom-tailored optimization scheme.
arXiv Detail & Related papers (2024-03-25T17:47:03Z) - Cross-modal Place Recognition in Image Databases using Event-based
Sensors [28.124708490967713]
We present the first cross-modal visual place recognition framework that is capable of retrieving regular images from a database given an event query.
Our method demonstrates promising results with respect to the state-of-the-art frame-based and event-based methods on the Brisbane-Event-VPR dataset.
arXiv Detail & Related papers (2023-07-03T14:24:04Z) - Unsupervised Domain Adaptation for Training Event-Based Networks Using
Contrastive Learning and Uncorrelated Conditioning [12.013345715187285]
Deep learning in event-based vision faces the challenge of annotated data scarcity due to recency of event cameras.
We develop an unsupervised domain adaptation algorithm for training a deep network for event-based data image classification.
arXiv Detail & Related papers (2023-03-22T09:51:08Z) - Cluster-level pseudo-labelling for source-free cross-domain facial
expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER)
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z) - Adaptive Local-Component-aware Graph Convolutional Network for One-shot
Skeleton-based Action Recognition [54.23513799338309]
We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition.
Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art.
arXiv Detail & Related papers (2022-09-21T02:33:07Z) - Deep face recognition with clustering based domain adaptation [57.29464116557734]
We propose a new clustering-based domain adaptation method designed for face recognition task in which the source and target domain do not share any classes.
Our method effectively learns the discriminative target feature by aligning the feature domain globally, and, at the meantime, distinguishing the target clusters locally.
arXiv Detail & Related papers (2022-05-27T12:29:11Z) - Object Tracking by Jointly Exploiting Frame and Event Domain [31.534731963279274]
We propose a multi-modal based approach to fuse visual cues from the frame- and event-domain to enhance single object tracking performance.
The proposed approach can effectively and adaptively combine meaningful information from both domains.
We show that the proposed approach outperforms state-of-the-art frame-based tracking methods by at least 10.4% and 11.9% in terms of representative success rate and precision rate.
arXiv Detail & Related papers (2021-09-19T03:13:25Z) - Self-supervised Segmentation via Background Inpainting [96.10971980098196]
We introduce a self-supervised detection and segmentation approach that can work with single images captured by a potentially moving camera.
We exploit a self-supervised loss function that we exploit to train a proposal-based segmentation network.
We apply our method to human detection and segmentation in images that visually depart from those of standard benchmarks and outperform existing self-supervised methods.
arXiv Detail & Related papers (2020-11-11T08:34:40Z) - Unsupervised Feature Learning for Event Data: Direct vs Inverse Problem
Formulation [53.850686395708905]
Event-based cameras record an asynchronous stream of per-pixel brightness changes.
In this paper, we focus on single-layer architectures for representation learning from event data.
We show improvements of up to 9 % in the recognition accuracy compared to the state-of-the-art methods.
arXiv Detail & Related papers (2020-09-23T10:40:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.