Doubly Deformable Aggregation of Covariance Matrices for Few-shot
Segmentation
- URL: http://arxiv.org/abs/2208.00306v1
- Date: Sat, 30 Jul 2022 20:41:38 GMT
- Title: Doubly Deformable Aggregation of Covariance Matrices for Few-shot
Segmentation
- Authors: Zhitong Xiong, Haopeng Li, and Xiao Xiang Zhu
- Abstract summary: Training semantic segmentation models with few annotated samples has great potential in various real-world applications.
For the few-shot segmentation task, the main challenge is how to accurately measure the semantic correspondence between the support and query samples.
We propose to aggregate the learnable covariance matrices with a deformable 4D Transformer to effectively predict the segmentation map.
- Score: 25.387090319723715
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training semantic segmentation models with few annotated samples has great
potential in various real-world applications. For the few-shot segmentation
task, the main challenge is how to accurately measure the semantic
correspondence between the support and query samples with limited training
data. To address this problem, we propose to aggregate the learnable covariance
matrices with a deformable 4D Transformer to effectively predict the
segmentation map. Specifically, in this work, we first devise a novel hard
example mining mechanism to learn covariance kernels for the Gaussian process.
The learned covariance kernel functions have great advantages over existing
cosine similarity-based methods in correspondence measurement. Based on the
learned covariance kernels, an efficient doubly deformable 4D Transformer
module is designed to adaptively aggregate feature similarity maps into
segmentation results. By combining these two designs, the proposed method can
not only set new state-of-the-art performance on public benchmarks, but also
converge much faster than existing methods. Experiments on three public
datasets have demonstrated the effectiveness of our method.
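
To make the contrast between cosine-similarity correspondence and kernel-based covariance concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the paper learns its covariance kernels with hard example mining, whereas this sketch substitutes a fixed RBF (squared-exponential) kernel as a stand-in. The function names, tensor shapes, and the `length_scale` and `variance` parameters are assumptions made for illustration only. Both functions return a 4D map over every (query location, support location) pair, which is the kind of tensor a 4D aggregation module would consume.

```python
import numpy as np


def cosine_correlation(query_feats, support_feats, eps=1e-8):
    """4D cosine-similarity map between all query and support locations.

    query_feats: (C, Hq, Wq), support_feats: (C, Hs, Ws)
    returns: (Hq, Wq, Hs, Ws)
    """
    c, hq, wq = query_feats.shape
    _, hs, ws = support_feats.shape
    q = query_feats.reshape(c, -1)    # (C, Hq*Wq)
    s = support_feats.reshape(c, -1)  # (C, Hs*Ws)
    # L2-normalize each spatial feature vector along the channel axis.
    q = q / (np.linalg.norm(q, axis=0, keepdims=True) + eps)
    s = s / (np.linalg.norm(s, axis=0, keepdims=True) + eps)
    return (q.T @ s).reshape(hq, wq, hs, ws)


def rbf_covariance(query_feats, support_feats, length_scale=1.0, variance=1.0):
    """4D covariance map from a fixed RBF kernel.

    Assumption: the fixed RBF kernel stands in for the learned kernels
    described in the abstract; it is NOT the paper's learned kernel.
    """
    c, hq, wq = query_feats.shape
    _, hs, ws = support_feats.shape
    q = query_feats.reshape(c, -1).T    # (Hq*Wq, C)
    s = support_feats.reshape(c, -1).T  # (Hs*Ws, C)
    # Pairwise squared Euclidean distances between feature vectors.
    sq_dist = (q ** 2).sum(1)[:, None] + (s ** 2).sum(1)[None, :] - 2.0 * (q @ s.T)
    sq_dist = np.maximum(sq_dist, 0.0)  # guard against tiny negative round-off
    k = variance * np.exp(-0.5 * sq_dist / length_scale ** 2)
    return k.reshape(hq, wq, hs, ws)


rng = np.random.default_rng(0)
query = rng.normal(size=(8, 4, 5))    # hypothetical query feature map
support = rng.normal(size=(8, 3, 6))  # hypothetical support feature map

cos_map = cosine_correlation(query, support)
cov_map = rbf_covariance(query, support, length_scale=2.0)
print(cos_map.shape, cov_map.shape)  # (4, 5, 3, 6) (4, 5, 3, 6)
```

Unlike cosine similarity, the kernel has parameters (here `length_scale` and `variance`) that control how quickly similarity decays with feature distance, which is what makes a learned variant possible in principle.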
Related papers
- Adaptive manifold for imbalanced transductive few-shot learning [16.627512688664513]
We propose a novel algorithm to address imbalanced transductive few-shot learning, named Adaptive Manifold.
Our method exploits the underlying manifold of the labeled support examples and unlabeled queries by using manifold similarity to predict the class probability distribution per query.
arXiv Detail & Related papers (2023-04-27T15:42:49Z)
- Scalable Randomized Kernel Methods for Multiview Data Integration and Prediction [4.801208484529834]
We develop scalable randomized kernel methods for jointly associating data from multiple sources and simultaneously predicting an outcome or classifying a unit into one of two or more classes.
The proposed methods model nonlinear relationships in multiview data together with predicting a clinical outcome and are capable of identifying variables or groups of variables that best contribute to the relationships among the views.
arXiv Detail & Related papers (2023-04-10T16:14:42Z)
- Learning Context-aware Classifier for Semantic Segmentation [88.88198210948426]
In this paper, contextual hints are exploited via learning a context-aware classifier.
Our method is model-agnostic and can be easily applied to generic segmentation models.
With only negligible additional parameters and +2% inference time, decent performance gain has been achieved on both small and large models.
arXiv Detail & Related papers (2023-03-21T07:00:35Z)
- Asymmetric Scalable Cross-modal Hashing [51.309905690367835]
Cross-modal hashing is a successful method for large-scale multimedia retrieval.
We propose a novel Asymmetric Scalable Cross-Modal Hashing (ASCMH) method to address these issues.
Our ASCMH outperforms the state-of-the-art cross-modal hashing methods in terms of accuracy and efficiency.
arXiv Detail & Related papers (2022-07-26T04:38:47Z)
- PointInst3D: Segmenting 3D Instances by Points [136.7261709896713]
We propose a fully-convolutional 3D point cloud instance segmentation method that works in a per-point prediction fashion.
We find the key to its success is assigning a suitable target to each sampled point.
Our approach achieves promising results on both ScanNet and S3DIS benchmarks.
arXiv Detail & Related papers (2022-04-25T02:41:46Z)
- Cost Aggregation Is All You Need for Few-Shot Segmentation [28.23753949369226]
We introduce Volumetric Aggregation with Transformers (VAT) to tackle the few-shot segmentation task.
VAT uses both convolutions and transformers to efficiently handle high dimensional correlation maps between query and support.
We find that the proposed method attains state-of-the-art performance even on the standard benchmarks for the semantic correspondence task.
arXiv Detail & Related papers (2021-12-22T06:18:51Z)
- Dense Unsupervised Learning for Video Segmentation [49.46930315961636]
We present a novel approach to unsupervised learning for video object segmentation (VOS).
Unlike previous work, our formulation allows learning dense feature representations directly in a fully convolutional regime.
Our approach exceeds the segmentation accuracy of previous work despite using significantly less training data and compute power.
arXiv Detail & Related papers (2021-11-11T15:15:11Z)
- Parameter Decoupling Strategy for Semi-supervised 3D Left Atrium Segmentation [0.0]
We present a novel semi-supervised segmentation model based on a parameter decoupling strategy to encourage consistent predictions from diverse views.
Our method achieves competitive results against state-of-the-art semi-supervised methods on the Atrial Challenge dataset.
arXiv Detail & Related papers (2021-09-20T14:51:42Z)
- EiGLasso for Scalable Sparse Kronecker-Sum Inverse Covariance Estimation [1.370633147306388]
We introduce EiGLasso, a highly scalable method for sparse Kronecker-sum inverse covariance estimation.
We show that EiGLasso achieves a speed-up of two to three orders of magnitude over existing methods.
arXiv Detail & Related papers (2021-05-20T16:22:50Z)
- Making Affine Correspondences Work in Camera Geometry Computation [62.7633180470428]
Local features provide region-to-region rather than point-to-point correspondences.
We propose guidelines for effective use of region-to-region matches in the course of a full model estimation pipeline.
Experiments show that affine solvers can achieve accuracy comparable to point-based solvers at faster run-times.
arXiv Detail & Related papers (2020-07-20T12:07:48Z)
- FDA: Fourier Domain Adaptation for Semantic Segmentation [82.4963423086097]
We describe a simple method for unsupervised domain adaptation, whereby the discrepancy between the source and target distributions is reduced by swapping the low-frequency spectrum of one with the other.
We illustrate the method in semantic segmentation, where densely annotated images are aplenty in one domain, but difficult to obtain in another.
Our results indicate that even simple procedures can discount nuisance variability in the data that more sophisticated methods struggle to learn away.
arXiv Detail & Related papers (2020-04-11T22:20:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.