Meta-Optimization for Higher Model Generalizability in Single-Image Depth Prediction
- URL: http://arxiv.org/abs/2305.07269v2
- Date: Tue, 30 Jan 2024 08:49:14 GMT
- Title: Meta-Optimization for Higher Model Generalizability in Single-Image Depth Prediction
- Authors: Cho-Ying Wu, Yiqi Zhong, Junying Wang, Ulrich Neumann
- Abstract summary: We leverage gradient-based meta-learning for higher generalizability on zero-shot cross-dataset inference.
Unlike image classification, the most-studied setting in meta-learning, depth prediction involves pixel-level continuous range values.
We propose a fine-grained task formulation that treats each RGB-D pair as a task in our meta-optimization.
- Score: 19.469860191876876
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Model generalizability to unseen datasets, concerned with in-the-wild
robustness, is less studied for indoor single-image depth prediction. We
leverage gradient-based meta-learning for higher generalizability on zero-shot
cross-dataset inference. Unlike image classification, the most-studied setting
in meta-learning, depth prediction involves pixel-level continuous range
values, and the mapping from image to depth varies widely across environments,
so no explicit task boundaries exist. We instead propose a fine-grained task
formulation that treats each RGB-D pair as a task in our meta-optimization. We
first show that meta-learning on limited data induces a much better prior (up
to +29.4%). Using the meta-learned weights to initialize subsequent supervised
learning, without involving any extra data or information, consistently
outperforms baselines trained without them. Compared to most indoor-depth
methods that only train/test on a single dataset, we propose zero-shot
cross-dataset protocols, closely evaluate robustness, and show that our
meta-initialization yields consistently higher generalizability and accuracy.
This work at the intersection of depth prediction and meta-learning can push
both research streams closer to practical use.
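
The fine-grained meta-optimization above lends itself to a MAML-style sketch. The following is a minimal PyTorch illustration assuming a generic depth network; adapting and evaluating on the same pair, and all names (meta_step, inner_lr), are our simplifications rather than the authors' released code.

```python
# Minimal MAML-style sketch of fine-grained meta-optimization:
# each (rgb, depth) pair is treated as one task. Illustrative only.
import torch
import torch.nn.functional as F
from torch.func import functional_call

def meta_step(model, rgbd_batch, inner_lr=1e-3):
    """One outer-loop step over a batch of RGB-D pair 'tasks'."""
    params = dict(model.named_parameters())
    meta_loss = 0.0
    for rgb, depth in rgbd_batch:  # each pair is one task
        # Inner loop: a single adaptation step on this pair.
        loss = F.l1_loss(functional_call(model, params, (rgb,)), depth)
        grads = torch.autograd.grad(loss, list(params.values()),
                                    create_graph=True)
        adapted = {name: p - inner_lr * g
                   for (name, p), g in zip(params.items(), grads)}
        # Outer objective: loss of the adapted weights on the same pair.
        meta_loss = meta_loss + F.l1_loss(
            functional_call(model, adapted, (rgb,)), depth)
    return meta_loss / len(rgbd_batch)
```

Per the abstract, weights produced by such meta-optimization would then initialize ordinary supervised training; during the meta phase one would call `meta_step(model, batch).backward()` followed by an optimizer step.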
Related papers
- Boosting Generalizability towards Zero-Shot Cross-Dataset Single-Image Indoor Depth by Meta-Initialization [17.822554284161868]
We use gradient-based meta-learning to gain higher generalizability on zero-shot cross-dataset inference.
We propose zero-shot cross-dataset protocols and validate higher generalizability induced by our meta-initialization.
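
As a rough illustration of such a protocol, the sketch below scores a trained model on an unseen dataset with no adaptation; the delta1 metric and the loader interface are assumptions for illustration, not the paper's exact protocol.

```python
# Hedged sketch of zero-shot cross-dataset evaluation: the model is
# trained on one indoor dataset and scored on another, untouched.
import torch

@torch.no_grad()
def zero_shot_eval(model, target_loader, eps=1e-6):
    """Average delta1 (pixels within 25% of ground truth) per batch."""
    model.eval()
    total, n_batches = 0.0, 0
    for rgb, depth in target_loader:
        pred = model(rgb).clamp(min=eps)
        valid = depth > eps                      # ignore invalid pixels
        ratio = torch.max(pred[valid] / depth[valid],
                          depth[valid] / pred[valid])
        total += (ratio < 1.25).float().mean().item()
        n_batches += 1
    return total / n_batches
```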
arXiv Detail & Related papers (2024-09-04T07:25:50Z)
- Msmsfnet: a multi-stream and multi-scale fusion net for edge detection [6.1932429715357165]
Edge detection is a long-standing problem in computer vision.
Recent deep learning based algorithms achieve state-of-the-art performance in publicly available datasets.
However, their performance relies heavily on backbone weights pre-trained on the ImageNet dataset.
arXiv Detail & Related papers (2024-04-07T08:03:42Z)
- Data-Efficient Contrastive Language-Image Pretraining: Prioritizing Data Quality over Quantity [11.414069074535007]
Contrastive Language-Image Pre-training on large-scale image-caption datasets learns representations that can achieve remarkable zero-shot generalization.
Finding small subsets of the training data that provably generalize best has remained an open question.
We show that subsets that closely preserve the cross-covariance of the images and captions of the full data provably achieve a superior generalization performance.
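
A naive greedy heuristic conveys the flavor of that criterion. The sketch below is our illustration of matching the full-data cross-covariance, not the paper's actual selection algorithm.

```python
# Illustrative greedy subset selection: keep the subset's image-caption
# cross-covariance close to the full data's. O(n*k*d^2) -- sketch only.
import numpy as np

def select_subset(X, Y, k):
    """X, Y: (n, d) image / caption embeddings, assumed precomputed."""
    n, d = X.shape
    target = X.T @ Y / n                     # full-data cross-covariance
    chosen, current = [], np.zeros((d, d))
    for _ in range(k):
        best, best_err = None, np.inf
        for i in range(n):
            if i in chosen:
                continue
            cand = (current * len(chosen)
                    + np.outer(X[i], Y[i])) / (len(chosen) + 1)
            err = np.linalg.norm(cand - target)  # Frobenius distance
            if err < best_err:
                best, best_err = i, err
        chosen.append(best)
        current = (current * (len(chosen) - 1)
                   + np.outer(X[best], Y[best])) / len(chosen)
    return chosen
```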
arXiv Detail & Related papers (2024-03-18T21:32:58Z)
- Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery [78.43828998065071]
Recent advances in unsupervised learning have demonstrated the ability of large vision models to achieve promising results on downstream tasks.
Such pre-training techniques have also been explored recently in the remote sensing domain due to the availability of large amounts of unlabelled data.
In this paper, we re-visit transformers pre-training and leverage multi-scale information that is effectively utilized with multiple modalities.
arXiv Detail & Related papers (2024-03-08T16:18:04Z)
- Beyond Simple Meta-Learning: Multi-Purpose Models for Multi-Domain, Active and Continual Few-Shot Learning [41.07029317930986]
We propose a variance-sensitive class of models that operates in a low-label regime.
The first method, Simple CNAPS, employs a hierarchically regularized Mahalanobis-distance based classifier.
We further extend this approach to a transductive learning setting, proposing Transductive CNAPS.
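
A bare-bones version of a Mahalanobis-distance classifier in that spirit might look as follows; the shot-based blend weight and the ridge term stand in for the paper's hierarchical regularization and are simplifications.

```python
# Sketch of a hierarchically regularized Mahalanobis few-shot classifier:
# class covariances are blended with a task-level covariance.
import torch

def mahalanobis_logits(support, labels, query, n_classes, ridge=1.0):
    """support: (n, d); labels: (n,); query: (m, d) feature embeddings."""
    d = support.shape[1]
    eye = torch.eye(d)
    task_cov = torch.cov(support.T)
    logits = []
    for c in range(n_classes):
        xc = support[labels == c]
        mu = xc.mean(0)
        lam = xc.shape[0] / (xc.shape[0] + 1.0)   # shot-based blend weight
        cov_c = torch.cov(xc.T) if xc.shape[0] > 1 else torch.zeros(d, d)
        cov = lam * cov_c + (1 - lam) * task_cov + ridge * eye
        diff = query - mu
        dist = (diff @ torch.linalg.inv(cov) * diff).sum(-1)
        logits.append(-dist)                      # closer = higher logit
    return torch.stack(logits, dim=-1)            # (m, n_classes)
```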
arXiv Detail & Related papers (2022-01-13T18:59:02Z)
- Memory Efficient Meta-Learning with Large Images [62.70515410249566]
Meta-learning approaches to few-shot classification are computationally efficient at test time, requiring just a few optimization steps or a single forward pass to learn a new task, but they remain memory-intensive to train.
This limitation arises because a task's entire support set, which can contain up to 1000 images, must be processed before an optimization step can be taken.
We propose LITE, a general and memory efficient episodic training scheme that enables meta-training on large tasks composed of large images on a single GPU.
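
One way to realize this idea, sketched under our own assumptions rather than taken from the paper's code, is to forward the whole support set but keep the autograd graph for only a small random subset:

```python
# Sketch of LITE-style memory saving: embed every support image, but
# back-propagate through only `keep` randomly chosen ones.
import torch

def lite_embed(encoder, support, keep=16):
    perm = torch.randperm(len(support))
    grad_idx, frozen_idx = perm[:keep], perm[keep:]
    with torch.no_grad():                       # no graph: bulk of the set
        frozen = encoder(support[frozen_idx])
    live = encoder(support[grad_idx])           # graph kept: small subset
    feats = torch.cat([frozen, live], dim=0)
    order = torch.cat([frozen_idx, grad_idx])   # realign labels via `order`
    return feats, order
```

Downstream aggregation (e.g., class-prototype averaging) must index the labels with `order` so features and labels stay aligned.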
arXiv Detail & Related papers (2021-07-02T14:37:13Z)
- Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations [78.12377360145078]
Contrastive self-supervised learning has outperformed supervised pretraining on many downstream tasks like segmentation and object detection.
In this paper, we first study how biases in the dataset affect existing methods.
We show that current contrastive approaches work surprisingly well across: (i) object- versus scene-centric, (ii) uniform versus long-tailed and (iii) general versus domain-specific datasets.
arXiv Detail & Related papers (2021-06-10T17:59:13Z)
- Multi-dataset Pretraining: A Unified Model for Semantic Segmentation [97.61605021985062]
We propose a unified framework, termed Multi-Dataset Pretraining, to take full advantage of the fragmented annotations of different datasets.
This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets.
In order to better model the relationship among images and classes from different datasets, we extend the pixel level embeddings via cross dataset mixing.
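
A minimal pixel-to-prototype contrastive loss, with temperature and normalization choices assumed for illustration, could read:

```python
# Sketch: pull each pixel embedding toward its class prototype and push
# it from the others, InfoNCE-style over prototypes.
import torch
import torch.nn.functional as F

def pixel_to_prototype_loss(pix_emb, pix_labels, prototypes, tau=0.1):
    """pix_emb: (n, d); pix_labels: (n,); prototypes: (c, d)."""
    pix = F.normalize(pix_emb, dim=-1)
    protos = F.normalize(prototypes, dim=-1)
    logits = pix @ protos.T / tau          # (n, c) scaled cosine similarity
    return F.cross_entropy(logits, pix_labels)
```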
arXiv Detail & Related papers (2021-06-08T06:13:11Z)
- Learning to Generalize Unseen Domains via Memory-based Multi-Source Meta-Learning for Person Re-Identification [59.326456778057384]
We propose the Memory-based Multi-Source Meta-Learning framework to train a generalizable model for unseen domains.
We also present a meta batch normalization layer (MetaBN) to diversify meta-test features.
Experiments demonstrate that our M3L can effectively enhance the generalization ability of the model for unseen domains.
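
A rough sketch of the MetaBN idea, with a mixing rule we assume for illustration: meta-test features are normalized with statistics blended from their own batch and saved meta-train statistics.

```python
# Sketch of MetaBN-style feature diversification: mix meta-test batch
# statistics with stored meta-train statistics before normalizing.
import torch

def meta_bn(x, train_mean, train_var, eps=1e-5):
    """x: (n, c) meta-test features; train_mean/var: saved (c,) stats."""
    lam = torch.rand(())                    # random mix ratio per call
    mean = lam * x.mean(0) + (1 - lam) * train_mean
    var = lam * x.var(0, unbiased=False) + (1 - lam) * train_var
    return (x - mean) / torch.sqrt(var + eps)
```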
arXiv Detail & Related papers (2020-12-01T11:38:16Z)
- Incremental Meta-Learning via Indirect Discriminant Alignment [118.61152684795178]
We develop a notion of incremental learning during the meta-training phase of meta-learning.
Our approach performs favorably at test time as compared to training a model with the full meta-training set.
arXiv Detail & Related papers (2020-02-11T01:39:12Z)