Generalizing Interactive Backpropagating Refinement for Dense Prediction
- URL: http://arxiv.org/abs/2112.10969v2
- Date: Wed, 22 Dec 2021 11:07:46 GMT
- Title: Generalizing Interactive Backpropagating Refinement for Dense Prediction
- Authors: Fanqing Lin, Brian Price, Tony Martinez
- Abstract summary: We introduce a set of G-BRS layers that enable both global and localized refinement for a range of dense prediction tasks.
Our method can successfully generalize and significantly improve performance of existing pretrained state-of-the-art models with only a few clicks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As deep neural networks become the state-of-the-art approach in the field of
computer vision for dense prediction tasks, many methods have been developed
for automatic estimation of the target outputs given the visual inputs.
Although the estimation accuracy of the proposed automatic methods continues to
improve, interactive refinement is oftentimes necessary for further correction.
Recently, feature backpropagating refinement scheme (f-BRS) has been proposed
for the task of interactive segmentation, which enables efficient optimization
of a small set of auxiliary variables inserted into the pretrained network to
produce object segmentation that better aligns with user inputs. However, the
proposed auxiliary variables only contain channel-wise scale and bias, limiting
the optimization to global refinement only. In this work, in order to
generalize backpropagating refinement for a wide range of dense prediction
tasks, we introduce a set of G-BRS (Generalized Backpropagating Refinement
Scheme) layers that enable both global and localized refinement for the
following tasks: interactive segmentation, semantic segmentation, image matting
and monocular depth estimation. Experiments on SBD, Cityscapes, Mapillary
Vistas, Composition-1k and NYU-Depth-V2 show that our method can successfully
generalize and significantly improve performance of existing pretrained
state-of-the-art models with only a few clicks.
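The core idea above (backpropagating a loss defined by user clicks into a small set of auxiliary scale-and-bias variables while the pretrained network stays frozen) can be sketched as a toy NumPy example. This is an illustrative sketch with hand-derived gradients and hypothetical shapes, not the authors' implementation; `brs_refine`, the linear "head", and all hyperparameters are assumptions for the demo.

```python
import numpy as np

def brs_refine(feats, head_w, clicks, labels, steps=200, lr=0.5):
    """Toy f-BRS-style refinement (illustrative only).

    feats:  (C, N) frozen intermediate features at N pixels.
    head_w: (C,) frozen linear head mapping features to one logit per pixel.
    clicks: indices of user-clicked pixels; labels: desired logits there.

    Optimizes a channel-wise scale s and bias b, inserted between the
    features and the head, so the output at clicked pixels matches the
    user labels. The network weights themselves are never updated.
    """
    C, N = feats.shape
    s, b = np.ones(C), np.zeros(C)
    for _ in range(steps):
        refined = s[:, None] * feats + b[:, None]   # auxiliary scale/bias layer
        logits = head_w @ refined                   # frozen prediction head
        err = logits[clicks] - labels               # loss only at user clicks
        # hand-derived gradients of 0.5 * sum(err**2) w.r.t. s and b
        g_logit = np.zeros(N)
        g_logit[clicks] = err
        g_refined = head_w[:, None] * g_logit[None, :]   # (C, N)
        s -= lr * (g_refined * feats).sum(axis=1) / len(clicks)
        b -= lr * g_refined.sum(axis=1) / len(clicks)
    return s, b, head_w @ (s[:, None] * feats + b[:, None])
```

Because the auxiliary variables are only channel-wise here, the correction is global, which is exactly the limitation the G-BRS layers address by also allowing localized refinement.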
Related papers
- LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging [80.17238673443127]
LiNeS is a post-training editing technique designed to preserve pre-trained generalization while enhancing fine-tuned task performance.
LiNeS demonstrates significant improvements in both single-task and multi-task settings across various benchmarks in vision and natural language processing.
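The layer-scaling idea can be sketched as follows: scale each layer's fine-tuning update (fine-tuned weights minus pre-trained weights) by a factor that grows linearly with depth, so shallow, general-purpose layers stay close to the pre-trained model. The function name, `alpha`/`beta` endpoints, and list-of-arrays interface are assumptions for illustration, not the paper's API.

```python
import numpy as np

def linear_layer_scale(task_vector, alpha=0.0, beta=1.0):
    """Toy sketch of linear layer-wise scaling of a task vector.

    task_vector: list of per-layer parameter updates (fine-tuned minus
    pre-trained), ordered from shallowest to deepest layer. Shallow
    layers are scaled toward alpha (small, preserving pre-trained
    generalization); deep, task-specific layers are scaled toward beta.
    The rescaled updates would then be added back to the pre-trained
    weights."""
    L = len(task_vector)
    if L == 1:
        return [beta * task_vector[0]]
    return [
        (alpha + (beta - alpha) * l / (L - 1)) * d
        for l, d in enumerate(task_vector)
    ]
```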
arXiv Detail & Related papers (2024-10-22T16:26:05Z)
- Denoising Pre-Training and Customized Prompt Learning for Efficient Multi-Behavior Sequential Recommendation [69.60321475454843]
We propose DPCPL, the first pre-training and prompt-tuning paradigm tailored for Multi-Behavior Sequential Recommendation.
In the pre-training stage, we propose a novel Efficient Behavior Miner (EBM) to filter out the noise at multiple time scales.
Subsequently, we propose to tune the pre-trained model in a highly efficient manner with the proposed Customized Prompt Learning (CPL) module.
arXiv Detail & Related papers (2024-08-21T06:48:38Z)
- Beyond Single-Model Views for Deep Learning: Optimization versus Generalizability of Stochastic Optimization Algorithms [13.134564730161983]
This paper adopts a novel approach to deep learning optimization, focusing on stochastic gradient descent (SGD) and its variants.
We show that SGD and its variants perform on par with flat-minima optimizers such as SAM, while using half the gradient evaluations.
Our study uncovers several key findings regarding the relationship between training loss and hold-out accuracy, as well as the comparable performance of SGD and noise-enabled variants.
arXiv Detail & Related papers (2024-03-01T14:55:22Z)
- ConvPoseCNN2: Prediction and Refinement of Dense 6D Object Poses [23.348510362258402]
We propose a fully-convolutional extension of the PoseCNN method, which densely predicts object translations and orientations.
This has several advantages such as improving the spatial resolution of the orientation predictions.
We demonstrate that our method achieves the same accuracy as PoseCNN on the challenging YCB-Video dataset.
arXiv Detail & Related papers (2022-05-23T08:32:09Z)
- Dynamic Iterative Refinement for Efficient 3D Hand Pose Estimation [87.54604263202941]
We propose a tiny deep neural network whose intermediate layers are iteratively reused to refine its previous estimates.
We employ learned gating criteria to decide whether to exit from the weight-sharing loop, allowing per-sample adaptation in our model.
Our method consistently outperforms state-of-the-art 2D/3D hand pose estimation approaches in terms of both accuracy and efficiency for widely used benchmarks.
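The weight-sharing loop with a learned exit gate can be sketched abstractly; `step` and `should_exit` below are hypothetical stand-ins for the shared refinement block and the learned gating criterion, not the paper's architecture.

```python
def iterative_refine(x, step, should_exit, max_iters=5):
    """Toy sketch of gated, weight-sharing iterative refinement.

    step:        one shared refinement block, reused every iteration
                 (called as step(x, previous_estimate), with None first).
    should_exit: stand-in for the learned gating criterion that lets
                 easy samples leave the loop early, adapting the amount
                 of computation per sample.
    Returns the final estimate and the number of iterations used."""
    est, used = step(x, None), 1
    while used < max_iters and not should_exit(x, est):
        est, used = step(x, est), used + 1
    return est, used
```

A sample that the gate judges "good enough" after one pass pays for one iteration; a hard sample runs up to `max_iters` passes through the same weights.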
arXiv Detail & Related papers (2021-11-11T23:31:34Z)
- Predicting Deep Neural Network Generalization with Perturbation Response Curves [58.8755389068888]
We propose a new framework for evaluating the generalization capabilities of trained networks.
Specifically, we introduce two new measures for accurately predicting generalization gaps.
We attain better predictive scores than the current state-of-the-art measures on a majority of tasks in the Predicting Generalization in Deep Learning (PGDL) NeurIPS 2020 competition.
arXiv Detail & Related papers (2021-06-09T01:37:36Z)
- Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
- Unsupervised learning of disentangled representations in deep restricted kernel machines with orthogonality constraints [15.296955630621566]
Constr-DRKM is a deep kernel method for the unsupervised learning of disentangled data representations.
We quantitatively evaluate the proposed method's effectiveness in disentangled feature learning.
arXiv Detail & Related papers (2020-11-25T11:40:10Z)
- Regularizing Deep Networks with Semantic Data Augmentation [44.53483945155832]
We propose a novel semantic data augmentation algorithm to complement traditional approaches.
The proposed method is inspired by the intriguing property that deep networks are effective in learning linearized features.
We show that the proposed implicit semantic data augmentation (ISDA) algorithm amounts to minimizing a novel robust CE loss.
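The robust-loss idea, implicitly augmenting features with class-conditional Gaussian noise via a closed-form surrogate that inflates non-target logits, can be sketched for a single sample. This is an illustrative toy, not the authors' code; the function name and shapes are assumptions.

```python
import numpy as np

def isda_style_loss(f, y, W, b, covs, lam):
    """Toy single-sample sketch of an ISDA-style robust CE loss.

    f:    (d,) deep feature;  y: target class index.
    W:    (k, d) classifier weights;  b: (k,) biases.
    covs: (k, d, d) per-class feature covariance estimates (PSD).
    lam:  augmentation strength; lam=0 recovers plain cross-entropy.

    Each non-target logit j is inflated by a quadratic term in
    (w_j - w_y), the closed-form surrogate for sampling semantic
    augmentations from N(f, lam * covs[y]) instead of augmenting
    images explicitly."""
    z = W @ f + b
    dv = W - W[y]                                    # (k, d): w_j - w_y
    z = z + 0.5 * lam * np.einsum('kd,de,ke->k', dv, covs[y], dv)
    z = z - z.max()                                  # numerical stability
    return -z[y] + np.log(np.exp(z).sum())           # cross-entropy on z
```

Since the covariance is positive semidefinite, the added terms are nonnegative and vanish for the target class, so the surrogate upper-bounds the plain cross-entropy.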
arXiv Detail & Related papers (2020-07-21T00:32:44Z)
- Target-Embedding Autoencoders for Supervised Representation Learning [111.07204912245841]
This paper analyzes a framework for improving generalization in a purely supervised setting, where the target space is high-dimensional.
We motivate and formalize the general framework of target-embedding autoencoders (TEA) for supervised prediction, learning intermediate latent representations jointly optimized to be both predictable from features as well as predictive of targets.
arXiv Detail & Related papers (2020-01-23T02:37:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.