Domain Siamese CNNs for Sparse Multispectral Disparity Estimation
- URL: http://arxiv.org/abs/2005.00088v1
- Date: Thu, 30 Apr 2020 20:29:59 GMT
- Title: Domain Siamese CNNs for Sparse Multispectral Disparity Estimation
- Authors: David-Alexandre Beaupre and Guillaume-Alexandre Bilodeau
- Abstract summary: We propose a new CNN architecture able to do disparity estimation between images from different spectra.
Our method was tested using the publicly available LITIV 2014 and LITIV 2018 datasets.
- Score: 15.065764374430783
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multispectral disparity estimation is a difficult task for many reasons: it
has all the same challenges as traditional visible-visible disparity estimation
(occlusions, repetitive patterns, textureless surfaces), in addition to having
very little visual information in common between images (e.g. color information vs.
thermal information). In this paper, we propose a new CNN architecture able to
do disparity estimation between images from different spectra, namely thermal
and visible in our case. Our proposed model takes two patches as input and
proceeds to do domain feature extraction for each of them. Features from both
domains are then merged with two fusion operations, namely correlation and
concatenation. These merged vectors are then forwarded to their respective
classification heads, which are responsible for classifying the inputs as being
the same or not. Using two merging operations gives more robustness to our feature
extraction process, which leads to more precise disparity estimation. Our
method was tested using the publicly available LITIV 2014 and LITIV 2018
datasets, and achieved the best results when compared to other state-of-the-art
methods.
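The pipeline described in the abstract (per-domain features, two fusion operations, one classification head per fusion) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the feature vectors stand in for the output of the CNN feature extractors, correlation is assumed here to be an element-wise product, the heads are hypothetical untrained linear classifiers, and averaging the two "same" probabilities to rank candidate disparities is an assumption made for illustration only.

```python
import math
import random


def correlation_fusion(f_visible, f_thermal):
    # Assumption: correlation is taken as an element-wise product.
    return [a * b for a, b in zip(f_visible, f_thermal)]


def concatenation_fusion(f_visible, f_thermal):
    # Concatenation keeps both feature vectors side by side.
    return list(f_visible) + list(f_thermal)


def make_head(in_dim, rng):
    # Hypothetical two-class linear head with random (untrained) weights.
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(2)]


def head_probs(head, x):
    # Softmax over two logits: [p_not_same, p_same].
    logits = [sum(w * v for w, v in zip(row, x)) for row in head]
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def match_score(f_visible, f_thermal, corr_head, concat_head):
    # Assumption: the two heads are combined by averaging their "same" probability.
    p_corr = head_probs(corr_head, correlation_fusion(f_visible, f_thermal))[1]
    p_cat = head_probs(concat_head, concatenation_fusion(f_visible, f_thermal))[1]
    return 0.5 * (p_corr + p_cat)


def best_disparity(f_visible, thermal_candidates, corr_head, concat_head):
    # Index of the candidate thermal patch that scores highest as a match.
    scores = [match_score(f_visible, c, corr_head, concat_head)
              for c in thermal_candidates]
    return max(range(len(scores)), key=scores.__getitem__)


rng = random.Random(0)
dim = 8  # hypothetical feature dimension
corr_head = make_head(dim, rng)
concat_head = make_head(2 * dim, rng)
f_visible = [rng.gauss(0.0, 1.0) for _ in range(dim)]
thermal_candidates = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(5)]
d = best_disparity(f_visible, thermal_candidates, corr_head, concat_head)
print("best candidate index:", d)
```

With trained extractors and heads, each candidate would correspond to a thermal patch at a different horizontal offset from the visible patch, so the selected index would be read as the estimated disparity at that point.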
Related papers
- Distractors-Immune Representation Learning with Cross-modal Contrastive Regularization for Change Captioning [71.14084801851381]
Change captioning aims to succinctly describe the semantic change between a pair of similar images.
Most existing methods directly capture the difference between them, which risks obtaining error-prone difference features.
We propose a distractors-immune representation learning network that correlates the corresponding channels of two image representations.
arXiv Detail & Related papers (2024-07-16T13:00:33Z)
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- DARC: Distribution-Aware Re-Coloring Model for Generalizable Nucleus Segmentation [68.43628183890007]
We argue that domain gaps can also be caused by different foreground (nucleus)-background ratios.
First, we introduce a re-coloring method that relieves dramatic image color variations between different domains.
Second, we propose a new instance normalization method that is robust to the variation in the foreground-background ratios.
arXiv Detail & Related papers (2023-09-01T01:01:13Z)
- Learning Partial Correlation based Deep Visual Representation for Image Classification [61.0532370259644]
We formulate sparse inverse covariance estimation (SICE) as a novel structured layer of CNN.
Our work obtains a partial correlation based deep visual representation and mitigates the small sample problem.
Experiments show the efficacy and superior classification performance of our model.
arXiv Detail & Related papers (2023-04-23T10:09:01Z)
- LILE: Look In-Depth before Looking Elsewhere -- A Dual Attention Network using Transformers for Cross-Modal Information Retrieval in Histopathology Archives [0.7614628596146599]
Cross-modality data retrieval has become a requirement for many domains and disciplines of research.
This study proposes a novel architecture with a new loss term to help represent images and texts in the joint latent space.
arXiv Detail & Related papers (2022-03-02T22:42:20Z)
- IMACS: Image Model Attribution Comparison Summaries [16.80986701058596]
We introduce IMACS, a method that combines gradient-based model attributions with aggregation and visualization techniques.
IMACS extracts salient input features from an evaluation dataset, clusters them based on similarity, then visualizes differences in model attributions for similar input features.
We show how our technique can uncover behavioral differences caused by domain shift between two models trained on satellite images.
arXiv Detail & Related papers (2022-01-26T21:35:14Z)
- MGA-VQA: Multi-Granularity Alignment for Visual Question Answering [75.55108621064726]
Learning to answer visual questions is a challenging task since the multi-modal inputs are within two feature spaces.
We propose a Multi-Granularity Alignment architecture for the Visual Question Answering task (MGA-VQA).
Our model splits alignment into different levels to learn better correlations without needing additional data and annotations.
arXiv Detail & Related papers (2022-01-25T22:30:54Z)
- Multi-Scale Feature Fusion: Learning Better Semantic Segmentation for Road Pothole Detection [9.356003255288417]
This paper presents a novel pothole detection approach based on single-modal semantic segmentation.
It first extracts visual features from input images using a convolutional neural network.
A channel attention module then reweighs the channel features to enhance the consistency of different feature maps.
arXiv Detail & Related papers (2021-12-24T15:07:47Z)
- Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations [78.12377360145078]
Contrastive self-supervised learning has outperformed supervised pretraining on many downstream tasks like segmentation and object detection.
In this paper, we first study how biases in the dataset affect existing methods.
We show that current contrastive approaches work surprisingly well across: (i) object- versus scene-centric, (ii) uniform versus long-tailed and (iii) general versus domain-specific datasets.
arXiv Detail & Related papers (2021-06-10T17:59:13Z)
- Multi-Perspective Anomaly Detection [3.3511723893430476]
We build upon the deep support vector data description algorithm and address multi-perspective anomaly detection.
We employ different augmentation techniques with a denoising process to deal with scarce one-class data.
We evaluate our approach on the new dices dataset using images from two different perspectives and also benchmark on the standard MNIST dataset.
arXiv Detail & Related papers (2021-05-20T17:07:36Z)
- On the Texture Bias for Few-Shot CNN Segmentation [21.349705243254423]
Convolutional Neural Networks (CNNs) are driven by shapes to perform visual recognition tasks.
Recent evidence suggests texture bias in CNNs provides higher performing models when learning on large labeled training datasets.
We propose a novel architecture that integrates a set of Difference of Gaussians (DoG) to attenuate high-frequency local components in the feature space.
arXiv Detail & Related papers (2020-03-09T11:55:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.