CLIP-FLow: Contrastive Learning by semi-supervised Iterative Pseudo labeling for Optical Flow Estimation
- URL: http://arxiv.org/abs/2210.14383v1
- Date: Tue, 25 Oct 2022 23:22:25 GMT
- Title: CLIP-FLow: Contrastive Learning by semi-supervised Iterative Pseudo labeling for Optical Flow Estimation
- Authors: Zhiqi Zhang, Pan Ji, Nitin Bansal, Changjiang Cai, Qingan Yan, Xiangyu
Xu, Yi Xu
- Abstract summary: We propose a semi-supervised iterative pseudo-labeling framework to transfer the pretraining knowledge to the target real domain.
We leverage large-scale, unlabeled real data to facilitate transfer learning with the supervision of iteratively updated pseudo-ground truth labels.
Our framework can also be extended to other models, e.g. CRAFT, reducing the F1-all error from 4.79% to 4.66% on the KITTI 2015 benchmark.
- Score: 31.773232370688657
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Synthetic datasets are often used to pretrain end-to-end optical flow
networks, due to the lack of a large amount of labeled, real-scene data. But
major drops in accuracy occur when moving from synthetic to real scenes. How do
we better transfer the knowledge learned from synthetic to real domains? To
this end, we propose CLIP-FLow, a semi-supervised iterative pseudo-labeling
framework to transfer the pretraining knowledge to the target real domain. We
leverage large-scale, unlabeled real data to facilitate transfer learning with
the supervision of iteratively updated pseudo-ground truth labels, bridging the
domain gap between the synthetic and the real. In addition, we propose a
contrastive flow loss between the reference features and the target features warped by the
pseudo ground truth flows, to further strengthen accurate matches and suppress mismatches
caused by motion, occlusion, or noisy pseudo labels. We adopt RAFT as the backbone and obtain
an F1-all error of 4.11%, i.e. a 19% error reduction from RAFT (5.10%), ranking 2nd at the time
of submission on the KITTI 2015 benchmark. Our framework can also be extended to other models,
e.g. CRAFT, reducing the F1-all error from 4.79% to 4.66% on the KITTI 2015 benchmark.
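The contrastive flow loss is described above only at a high level; the sketch below shows one plausible instantiation: target-frame features are backward-warped to the reference frame with the pseudo ground truth flow, and an InfoNCE-style objective treats co-located warped features as positives and other sampled locations as negatives. The sampling scheme, temperature, and exact loss form are illustrative assumptions, not the paper's released implementation.

```python
import torch
import torch.nn.functional as F

def warp_features(feat2, flow):
    """Backward-warp target features feat2 (B,C,H,W) to the reference frame
    using a flow field (B,2,H,W) given in pixels."""
    B, C, H, W = feat2.shape
    ys, xs = torch.meshgrid(
        torch.arange(H, device=feat2.device, dtype=feat2.dtype),
        torch.arange(W, device=feat2.device, dtype=feat2.dtype),
        indexing="ij",
    )
    grid_x = xs.unsqueeze(0) + flow[:, 0]                      # x + u
    grid_y = ys.unsqueeze(0) + flow[:, 1]                      # y + v
    grid = torch.stack((2.0 * grid_x / (W - 1) - 1.0,          # normalize to [-1, 1]
                        2.0 * grid_y / (H - 1) - 1.0), dim=-1)
    return F.grid_sample(feat2, grid, align_corners=True)

def contrastive_flow_loss(feat1, feat2, pseudo_flow, temperature=0.07, num_samples=256):
    """InfoNCE-style loss: for each sampled reference pixel, the positive is the
    target feature warped to the same location by the pseudo ground truth flow;
    the other sampled locations act as negatives."""
    B, C, H, W = feat1.shape
    warped2 = warp_features(feat2, pseudo_flow)

    idx = torch.randint(0, H * W, (B, num_samples), device=feat1.device)
    gather_idx = idx.unsqueeze(1).expand(-1, C, -1)
    f1 = F.normalize(feat1.flatten(2).gather(2, gather_idx), dim=1)
    f2 = F.normalize(warped2.flatten(2).gather(2, gather_idx), dim=1)

    logits = torch.einsum("bcn,bcm->bnm", f1, f2) / temperature  # (B, N, N)
    targets = torch.arange(num_samples, device=feat1.device).repeat(B)
    return F.cross_entropy(logits.reshape(-1, num_samples), targets)
```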
Related papers
- Enhancing Zero-Shot Vision Models by Label-Free Prompt Distribution Learning and Bias Correcting [55.361337202198925]
Vision-language models, such as CLIP, have shown impressive generalization capacities when using appropriate text descriptions.
We propose a label-free prompt distribution learning and bias correction framework, dubbed **Frolic**, which boosts zero-shot performance without the need for labeled data.
arXiv Detail & Related papers (2024-10-25T04:00:45Z)
- AdaTriplet-RA: Domain Matching via Adaptive Triplet and Reinforced Attention for Unsupervised Domain Adaptation [15.905869933337101]
Unsupervised domain adaptation (UDA) is a transfer learning task in which the data and annotations of the source domain are available, but only unlabeled target data can be accessed during training.
We propose to improve the unsupervised domain adaptation task with an inter-domain sample matching scheme.
We apply the widely-used and robust Triplet loss to match the inter-domain samples.
To reduce the catastrophic effect of the inaccurate pseudo-labels generated during training, we propose a novel uncertainty measurement method to select reliable pseudo-labels automatically and progressively refine them.
arXiv Detail & Related papers (2022-11-16T13:04:24Z)
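As context for the uncertainty-guided selection and triplet matching described in this entry, here is a minimal sketch: pseudo-labels are kept only when the prediction entropy is low, and a margin-based triplet loss then pulls each reliable target feature toward same-class source features. The entropy threshold and the nearest-neighbour triplet mining are stand-ins for the paper's own uncertainty measure and reinforced attention.

```python
import torch
import torch.nn.functional as F

def select_reliable_pseudo_labels(target_logits, max_entropy=0.5):
    """Keep target samples whose prediction entropy is low. The paper defines
    its own uncertainty measure; this entropy threshold is only a stand-in."""
    probs = F.softmax(target_logits, dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)
    reliable = entropy < max_entropy
    return probs.argmax(dim=1), reliable

def inter_domain_triplet_loss(src_feat, src_labels, tgt_feat, tgt_pseudo, reliable, margin=1.0):
    """For each reliable target sample (anchor), pull the nearest same-class
    source feature closer than the nearest different-class one by a margin."""
    anchors, labels = tgt_feat[reliable], tgt_pseudo[reliable]
    if anchors.numel() == 0:
        return src_feat.new_zeros(())
    dists = torch.cdist(anchors, src_feat)                     # (n_anchor, n_src)
    same = labels.unsqueeze(1) == src_labels.unsqueeze(0)
    pos = dists.masked_fill(~same, float("inf")).min(dim=1).values
    neg = dists.masked_fill(same, float("inf")).min(dim=1).values
    valid = torch.isfinite(pos) & torch.isfinite(neg)
    if not valid.any():
        return src_feat.new_zeros(())
    return F.relu(pos[valid] - neg[valid] + margin).mean()
```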
arXiv Detail & Related papers (2022-11-16T13:04:24Z) - Federated Learning with Label Distribution Skew via Logits Calibration [26.98248192651355]
In this paper, we investigate the label distribution skew in FL, where the distribution of labels varies across clients.
We propose FedLC, which calibrates the logits before softmax cross-entropy according to the probability of occurrence of each class.
Experiments on federated datasets and real-world datasets demonstrate that FedLC leads to a more accurate global model.
arXiv Detail & Related papers (2022-09-01T02:56:39Z)
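A minimal sketch of per-client logit calibration in the spirit of FedLC: before the softmax cross-entropy, each class logit is shifted by a margin that shrinks with the class's local frequency, so locally rare classes are not drowned out. The specific margin tau * n_c^(-1/4) follows the common logit-calibration recipe and is an assumption here, not necessarily FedLC's exact formula.

```python
import torch
import torch.nn.functional as F

def calibrated_cross_entropy(logits, labels, class_counts, tau=1.0):
    """Cross-entropy on calibrated logits: each class logit is shifted by a
    margin that shrinks with how often that class occurs on the local client.
    The margin tau * n_c^(-1/4) is a common choice and stands in for FedLC's
    exact formulation."""
    counts = class_counts.float().clamp_min(1.0)
    margin = tau * counts.pow(-0.25)                  # larger margin for rarer classes
    return F.cross_entropy(logits - margin.unsqueeze(0), labels)

# usage on one client, with class counts taken from its local dataset
logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
class_counts = torch.bincount(labels, minlength=10)
loss = calibrated_cross_entropy(logits, labels, class_counts)
```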
- Cycle Label-Consistent Networks for Unsupervised Domain Adaptation [57.29464116557734]
Domain adaptation aims to leverage a labeled source domain to learn a classifier for the unlabeled target domain with a different distribution.
We propose a simple yet efficient domain adaptation method, i.e. the Cycle Label-Consistent Network (CLCN), which exploits the cycle consistency of classification labels.
We demonstrate the effectiveness of our approach on the MNIST-USPS-SVHN, Office-31, Office-Home and ImageCLEF-DA benchmarks.
arXiv Detail & Related papers (2022-05-27T13:09:08Z)
- Imposing Consistency for Optical Flow Estimation [73.53204596544472]
Imposing consistency through proxy tasks has been shown to enhance data-driven learning.
This paper introduces novel and effective consistency strategies for optical flow estimation.
arXiv Detail & Related papers (2022-04-14T22:58:30Z)
- Deformation and Correspondence Aware Unsupervised Synthetic-to-Real Scene Flow Estimation for Point Clouds [43.792032657561236]
We develop a point cloud collector and scene flow annotator for the GTA-V engine to automatically obtain diverse training samples without human intervention.
We propose a mean-teacher-based domain adaptation framework that leverages self-generated pseudo-labels of the target domain.
Our framework achieves superior adaptation performance on six source-target dataset pairs, remarkably closing the average domain gap by 60%.
arXiv Detail & Related papers (2022-03-31T09:03:23Z)
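To make the mean-teacher idea in this entry concrete, here is a minimal sketch of one adaptation step: the teacher (an exponential moving average of the student) predicts a pseudo scene flow for an unlabeled target pair, the student is supervised against it, and the teacher is then refreshed. The model and loss interfaces (`target_pair`, `flow_loss_fn`) are hypothetical placeholders, not the paper's API.

```python
import torch

@torch.no_grad()
def ema_update(teacher, student, momentum=0.999):
    """Mean-teacher update: teacher weights track an exponential moving average
    of the student's, so the teacher's pseudo-labels evolve smoothly."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)

def adaptation_step(student, teacher, optimizer, target_pair, flow_loss_fn):
    """One target-domain step: the teacher (typically a frozen deep copy of the
    pretrained student, updated only by EMA) predicts a pseudo scene flow for an
    unlabeled point-cloud pair, and the student is trained against it."""
    with torch.no_grad():
        pseudo_flow = teacher(*target_pair)      # pseudo-label from the teacher
    pred_flow = student(*target_pair)
    loss = flow_loss_fn(pred_flow, pseudo_flow)  # e.g. a robust per-point distance
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    ema_update(teacher, student)                 # refresh the teacher after each step
    return float(loss.detach())
```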
- Synergistic Network Learning and Label Correction for Noise-robust Image Classification [28.27739181560233]
Deep Neural Networks (DNNs) tend to overfit training label noise, resulting in poorer model performance in practice.
We propose a robust label correction framework combining the ideas of small loss selection and noise correction.
We demonstrate our method on both synthetic and real-world datasets with different noise types and rates.
arXiv Detail & Related papers (2022-02-27T23:06:31Z)
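A hedged sketch of the two ideas this entry combines: the small-loss trick flags the lowest-loss fraction of a batch as likely clean, and the remaining samples are relabeled with the network's prediction only when it is confident. The keep ratio, confidence threshold, and batch-level ranking are illustrative choices, not the paper's schedule.

```python
import torch
import torch.nn.functional as F

def small_loss_selection(logits, noisy_labels, keep_ratio=0.7):
    """Flag the keep_ratio fraction of samples with the smallest loss as clean;
    this is the generic small-loss trick, not the paper's exact schedule."""
    per_sample_loss = F.cross_entropy(logits, noisy_labels, reduction="none")
    num_keep = max(1, int(keep_ratio * noisy_labels.numel()))
    clean_idx = per_sample_loss.argsort()[:num_keep]
    clean_mask = torch.zeros_like(noisy_labels, dtype=torch.bool)
    clean_mask[clean_idx] = True
    return clean_mask

def correct_labels(logits, noisy_labels, clean_mask, confidence=0.9):
    """For samples flagged as noisy, replace the label with the model's own
    prediction when it is confident enough; otherwise keep the given label."""
    conf, pred = F.softmax(logits, dim=1).max(dim=1)
    corrected = noisy_labels.clone()
    replace = (~clean_mask) & (conf > confidence)
    corrected[replace] = pred[replace]
    return corrected
```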
- SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption [72.35532598131176]
We propose SCARF, a technique for contrastive learning, where views are formed by corrupting a random subset of features.
We show that SCARF complements existing strategies and outperforms alternatives like autoencoders.
arXiv Detail & Related papers (2021-06-29T08:08:33Z)
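The random feature corruption at the heart of SCARF can be sketched in a few lines: for each sample, a random subset of feature columns is replaced with values of the same column drawn from other rows, i.e. from the feature's empirical marginal; the original and corrupted views are then paired under a standard InfoNCE loss. The corruption rate and indexing details below are illustrative.

```python
import torch

def scarf_corrupt(x, corruption_rate=0.6):
    """SCARF-style view: for each row of a tabular batch x (n, d), a random
    subset of feature columns is overwritten with values of the same column
    taken from other rows, i.e. draws from the feature's empirical marginal."""
    n, d = x.shape
    corrupt_mask = torch.rand(n, d, device=x.device) < corruption_rate
    donor_rows = torch.randint(0, n, (n, d), device=x.device)   # random donor row per cell
    marginal_samples = x[donor_rows, torch.arange(d, device=x.device)]
    return torch.where(corrupt_mask, marginal_samples, x)

# the embeddings of x and of scarf_corrupt(x) are then pulled together
# (and pushed away from other rows in the batch) with an InfoNCE loss
```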
- Optical Flow Dataset Synthesis from Unpaired Images [36.158607790844705]
We introduce a novel method to build a training set of pseudo-real images that can be used to train optical flow in a supervised manner.
Our dataset uses two unpaired frames from real data and creates pairs of frames by simulating random warps.
We thus obtain the benefit of directly training on real data while having access to an exact ground truth.
arXiv Detail & Related papers (2021-04-02T22:19:47Z)
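A minimal sketch of the warping idea: starting from a single real frame, a small random affine warp renders a second frame, and the dense correspondence field induced by that warp serves as an exact label. The paper's pipeline (its warp family and its use of a second unpaired frame, e.g. for occluded regions) is richer; the function below only illustrates why the labels are exact by construction.

```python
import math
import torch
import torch.nn.functional as F

def synthesize_pair(frame, max_rot=0.05, max_scale=0.05, max_shift=0.05):
    """frame: (1, 3, H, W). Returns a warped second frame and the dense
    backward flow (in pixels) telling each frame-2 pixel where it came from
    in frame 1; exact by construction, since we generated the warp."""
    _, _, H, W = frame.shape
    angle = (torch.rand(1).item() - 0.5) * 2 * max_rot
    scale = 1.0 + (torch.rand(1).item() - 0.5) * 2 * max_scale
    tx = (torch.rand(1).item() - 0.5) * 2 * max_shift         # normalized [-1,1] coords
    ty = (torch.rand(1).item() - 0.5) * 2 * max_shift
    cos_a, sin_a = scale * math.cos(angle), scale * math.sin(angle)
    theta = torch.tensor([[cos_a, -sin_a, tx],
                          [sin_a,  cos_a, ty]], dtype=frame.dtype).unsqueeze(0)

    # sampling grid of the affine warp, plus the identity grid for comparison
    grid = F.affine_grid(theta, list(frame.shape), align_corners=True)       # (1,H,W,2)
    identity = F.affine_grid(torch.eye(2, 3, dtype=frame.dtype).unsqueeze(0),
                             list(frame.shape), align_corners=True)
    frame2 = F.grid_sample(frame, grid, align_corners=True)

    disp = grid - identity                                     # displacement in normalized coords
    flow = torch.stack((disp[..., 0] * (W - 1) / 2.0,
                        disp[..., 1] * (H - 1) / 2.0), dim=1)  # (1,2,H,W), in pixels
    return frame2, flow
```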
- A Free Lunch for Unsupervised Domain Adaptive Object Detection without Source Data [69.091485888121]
Unsupervised domain adaptation assumes that source and target domain data are freely available and usually trained together to reduce the domain gap.
We propose a source data-free domain adaptive object detection (SFOD) framework by modeling the task as a problem of learning with noisy labels.
arXiv Detail & Related papers (2020-12-10T01:42:35Z)
- RAFT: Recurrent All-Pairs Field Transforms for Optical Flow [78.92562539905951]
We introduce Recurrent All-Pairs Field Transforms (RAFT), a new deep network architecture for optical flow.
RAFT extracts per-pixel features, builds multi-scale 4D correlation volumes for all pairs of pixels, and iteratively updates a flow field.
RAFT achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-03-26T17:12:42Z)
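The all-pairs correlation volume that RAFT builds can be sketched directly from the description above: dot products between every pixel feature of frame 1 and every pixel feature of frame 2, with the frame-2 dimensions average-pooled to form a multi-scale pyramid. The lookup operator and the recurrent (GRU-based) update steps are omitted here.

```python
import torch
import torch.nn.functional as F

def all_pairs_correlation(feat1, feat2, num_levels=4):
    """RAFT-style correlation pyramid: correlate every pixel feature of frame 1
    with every pixel feature of frame 2, then average-pool over the frame-2
    dimensions to build coarser levels."""
    B, C, H, W = feat1.shape
    corr = torch.einsum("bchw,bcij->bhwij", feat1, feat2) / C ** 0.5   # (B,H,W,H,W)
    corr = corr.reshape(B * H * W, 1, H, W)        # frame-2 dims become a 2D map
    pyramid = [corr]
    for _ in range(num_levels - 1):
        corr = F.avg_pool2d(corr, kernel_size=2, stride=2)
        pyramid.append(corr)
    return pyramid                                 # level l: (B*H*W, 1, H/2**l, W/2**l)
```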