Breaking Shortcut: Exploring Fully Convolutional Cycle-Consistency for
Video Correspondence Learning
- URL: http://arxiv.org/abs/2105.05838v1
- Date: Wed, 12 May 2021 17:52:45 GMT
- Title: Breaking Shortcut: Exploring Fully Convolutional Cycle-Consistency for
Video Correspondence Learning
- Authors: Yansong Tang, Zhenyu Jiang, Zhenda Xie, Yue Cao, Zheng Zhang, Philip
H. S. Torr, Han Hu
- Abstract summary: We present a fully convolutional method, which is simpler and more consistent with the inference process.
We study the underlying reason behind this collapse phenomenon, indicating that the absolute positions of pixels provide a shortcut to easily accomplish cycle-consistency.
- Score: 78.43196840793489
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Previous cycle-consistency correspondence learning methods usually leverage
image patches for training. In this paper, we present a fully convolutional
method, which is simpler and more consistent with the inference process. While
directly applying fully convolutional training results in model collapse, we
study the underlying reason behind this collapse phenomenon, indicating that the
absolute positions of pixels provide a shortcut to easily accomplish
cycle-consistency, which hinders the learning of meaningful visual
representations. To break this absolute position shortcut, we propose to apply
different crops for forward and backward frames, and adopt feature warping to
establish correspondence between two crops of the same frame. The former
technique enforces the corresponding pixels on the forward and backward tracks
to have different absolute positions, and the latter effectively blocks the
shortcuts between the forward and backward tracks. In three label propagation benchmarks
for pose tracking, face landmark tracking and video object segmentation, our
method largely improves the results of vanilla fully convolutional
cycle-consistency method, achieving very competitive performance compared with
the self-supervised state-of-the-art approaches.
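The crop-asymmetry trick described in the abstract can be illustrated with a minimal NumPy sketch: two random crops of the same frame give each shared pixel different crop-relative coordinates, so absolute position no longer predicts its match, while the crop offsets still let us warp coordinates from one crop to the other. All function names, shapes, and values below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def random_crop_box(h, w, ch, cw, rng):
    # Sample the top-left corner of a ch x cw crop inside an h x w frame.
    top = int(rng.integers(0, h - ch + 1))
    left = int(rng.integers(0, w - cw + 1))
    return top, left

def warp_coords(y, x, box_a, box_b):
    # Map pixel (y, x) in crop A to its coordinates in crop B via the
    # shared frame: same absolute pixel, different crop-relative position.
    ta, la = box_a
    tb, lb = box_b
    return y + ta - tb, x + la - lb

rng = np.random.default_rng(0)
H, W, CH, CW = 240, 320, 128, 128
frame = rng.standard_normal((H, W))

box_a = random_crop_box(H, W, CH, CW, rng)
box_b = random_crop_box(H, W, CH, CW, rng)
crop_a = frame[box_a[0]:box_a[0] + CH, box_a[1]:box_a[1] + CW]
crop_b = frame[box_b[0]:box_b[0] + CH, box_b[1]:box_b[1] + CW]

# A pixel visible in both crops has different crop-relative coordinates,
# yet the warp recovers the correspondence exactly.
ya, xa = 10, 20
yb, xb = warp_coords(ya, xa, box_a, box_b)
if 0 <= yb < CH and 0 <= xb < CW:
    assert crop_a[ya, xa] == crop_b[yb, xb]
```

In the paper the analogous warping is applied to dense feature maps rather than raw coordinates, but the principle is the same: correspondence between the two crops must be established by content, not by shared absolute position.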
Related papers
- Self-Supervised Any-Point Tracking by Contrastive Random Walks [17.50529887238381]
We train a global matching transformer to find cycle consistent tracks through video via contrastive random walks.
Our method achieves strong performance on the TapVid benchmarks, outperforming previous self-supervised tracking methods.
arXiv Detail & Related papers (2024-09-24T17:59:56Z) - Refining Pre-Trained Motion Models [56.18044168821188]
We take on the challenge of improving state-of-the-art supervised models with self-supervised training.
We focus on obtaining a "clean" training signal from real-world unlabelled video.
We show that our method yields reliable gains over fully-supervised methods in real videos.
arXiv Detail & Related papers (2024-01-01T18:59:33Z) - Q-REG: End-to-End Trainable Point Cloud Registration with Surface
Curvature [81.25511385257344]
We present a novel solution, Q-REG, which utilizes rich geometric information to estimate the rigid pose from a single correspondence.
Q-REG formalizes the robust estimation as an exhaustive search, enabling end-to-end training.
We demonstrate in the experiments that Q-REG is agnostic to the correspondence matching method and provides consistent improvement both when used only in inference and in end-to-end training.
arXiv Detail & Related papers (2023-09-27T20:58:53Z) - Meta Transferring for Deblurring [43.86235102507237]
We propose a reblur-deblur meta-transferring scheme to realize test-time adaptation without using ground truth for dynamic scene deblurring.
We leverage the blurred input video to find and use relatively sharp patches as the pseudo ground truth.
Our reblur-deblur meta-learning scheme can improve state-of-the-art deblurring models on the DVD, REDS, and RealBlur benchmark datasets.
arXiv Detail & Related papers (2022-10-14T18:06:33Z) - One Sketch for All: One-Shot Personalized Sketch Segmentation [84.45203849671003]
We present the first one-shot personalized sketch segmentation method.
We aim to segment all sketches belonging to the same category using a single exemplar sketch with part annotations.
Our method preserves the part semantics embedded in the exemplar and is robust to input style and abstraction.
arXiv Detail & Related papers (2021-12-20T20:10:44Z) - Semi-TCL: Semi-Supervised Track Contrastive Representation Learning [40.31083437957288]
We design a new instance-to-track matching objective to learn appearance embedding.
It compares a candidate detection to the embedding of the tracks persisted in the tracker.
We implement this learning objective in a unified form following the spirit of contrastive loss.
arXiv Detail & Related papers (2021-07-06T05:23:30Z) - Unsupervised Landmark Learning from Unpaired Data [117.81440795184587]
Recent attempts for unsupervised landmark learning leverage synthesized image pairs that are similar in appearance but different in poses.
We propose a cross-image cycle consistency framework which applies the swapping-reconstruction strategy twice to obtain the final supervision.
Our proposed framework is shown to outperform strong baselines by a large margin.
arXiv Detail & Related papers (2020-06-29T13:57:20Z) - Space-Time Correspondence as a Contrastive Random Walk [47.40711876423659]
We cast correspondence as prediction of links in a space-time graph constructed from video.
We learn a representation in which pairwise similarity defines transition probability of a random walk.
We demonstrate that a technique we call edge dropout, as well as self-supervised adaptation at test-time, further improve transfer for object-centric correspondence.
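The random-walk formulation above can be sketched in a few lines of NumPy: transition probabilities between nodes of consecutive frames are a softmax over pairwise embedding similarities, and walking forward then backward should return each node to itself. Names, dimensions, and the temperature value are illustrative assumptions, not the paper's code.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def transition(feat_a, feat_b, temperature=0.07):
    # Row-stochastic transition matrix: entry (i, j) is the probability of
    # walking from node i in frame A to node j in frame B, defined by a
    # softmax over cosine similarities of their embeddings.
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    return softmax(a @ b.T / temperature)

rng = np.random.default_rng(0)
f_t = rng.standard_normal((16, 64))   # 16 nodes (patches) in frame t
f_t1 = rng.standard_normal((16, 64))  # 16 nodes in frame t+1

# Palindrome walk: forward then backward. A good representation makes the
# round-trip matrix concentrate on the diagonal (each node returns home).
round_trip = transition(f_t, f_t1) @ transition(f_t1, f_t)
# Cycle-consistency loss: cross-entropy against the identity target.
loss = -np.log(np.diag(round_trip) + 1e-8).mean()
assert round_trip.shape == (16, 16)
assert np.allclose(round_trip.sum(axis=1), 1.0)
```

Minimizing this loss over many frame pairs is what pushes pairwise similarity toward true correspondence, since only content-matched transitions make the round trip land on the diagonal.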
arXiv Detail & Related papers (2020-06-25T17:56:05Z) - LT-Net: Label Transfer by Learning Reversible Voxel-wise Correspondence
for One-shot Medical Image Segmentation [52.2074595581139]
We introduce a one-shot segmentation method to alleviate the burden of manual annotation for medical images.
The main idea is to treat one-shot segmentation as a classical atlas-based segmentation problem, where voxel-wise correspondence from the atlas to the unlabelled data is learned.
We demonstrate the superiority of our method over both deep learning-based one-shot segmentation methods and a classical multi-atlas segmentation method via thorough experiments.
arXiv Detail & Related papers (2020-03-16T08:36:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.