M2IOSR: Maximal Mutual Information Open Set Recognition
- URL: http://arxiv.org/abs/2108.02373v2
- Date: Fri, 6 Aug 2021 00:37:12 GMT
- Title: M2IOSR: Maximal Mutual Information Open Set Recognition
- Authors: Xin Sun, Henghui Ding, Chi Zhang, Guosheng Lin, Keck-Voon Ling
- Abstract summary: We propose a mutual information-based method with a streamlined architecture for open set recognition.
The proposed method significantly improves on baseline performance and consistently achieves new state-of-the-art results on several benchmarks.
- Score: 47.1393314282815
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we address the challenging task of open set
recognition (OSR). Many recent OSR methods rely on auto-encoders to extract
class-specific features via a reconstruction strategy, requiring the network
to restore the input image at the pixel level. This strategy is often
over-demanding for OSR, since class-specific features generally reside in
the target objects rather than in every pixel. To address this shortcoming,
we discard the pixel-level reconstruction strategy and instead focus on
improving the effectiveness of class-specific feature extraction. We propose
a mutual information-based method with a streamlined architecture, Maximal
Mutual Information Open Set Recognition (M2IOSR). M2IOSR uses only an
encoder, which extracts class-specific features by maximizing the mutual
information between the input and its latent features across multiple
scales. Meanwhile, to further reduce the open space risk, the latent
features are constrained to class-conditional Gaussian distributions by a
KL-divergence loss. In this way, the network learns a strong mapping that
prevents different observations from being encoded to similar latent
features, and extracts class-specific features with the desired statistical
characteristics. The proposed method significantly improves on baseline
performance and consistently achieves new state-of-the-art results on
several benchmarks.
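The abstract names two training signals: a multi-scale mutual information term between the input and its latent features, and a KL-divergence term pulling latents toward class-conditional Gaussians. The page gives no code, so the following is a minimal PyTorch sketch of those two losses, assuming a Deep InfoMax-style Jensen-Shannon MI estimator and unit-variance class-conditional priors; all module and variable names are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalMIDiscriminator(nn.Module):
    """Scores (feature map, latent) pairs; higher for matched pairs.

    Assumption: a Deep InfoMax-style local discriminator. The latent
    vector is broadcast over the spatial grid and concatenated with an
    intermediate feature map before scoring every location.
    """
    def __init__(self, feat_channels: int, latent_dim: int):
        super().__init__()
        self.score = nn.Sequential(
            nn.Conv2d(feat_channels + latent_dim, 256, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 1, kernel_size=1),
        )

    def forward(self, feat_map: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        b, _, h, w = feat_map.shape
        z_map = z.view(b, -1, 1, 1).expand(-1, -1, h, w)
        return self.score(torch.cat([feat_map, z_map], dim=1))

def jsd_mi_loss(disc: LocalMIDiscriminator,
                feat_map: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
    """Jensen-Shannon surrogate for MI maximization (Deep InfoMax style).

    Positives pair features and latents from the same image; negatives
    pair them with latents rolled by one position within the batch.
    """
    pos = disc(feat_map, z)
    neg = disc(feat_map, torch.roll(z, shifts=1, dims=0))
    # minimize E[softplus(-pos)] + E[softplus(neg)]
    return F.softplus(-pos).mean() + F.softplus(neg).mean()

def class_conditional_kl(z_mu: torch.Tensor, z_logvar: torch.Tensor,
                         labels: torch.Tensor,
                         class_means: torch.Tensor) -> torch.Tensor:
    """KL( N(z_mu, diag(exp(z_logvar))) || N(mu_y, I) ), batch-averaged.

    Assumption: unit-variance class-conditional priors whose per-class
    means live in `class_means` (num_classes x latent_dim).
    """
    prior_mu = class_means[labels]                      # (B, D)
    var = z_logvar.exp()
    kl = 0.5 * (var + (z_mu - prior_mu).pow(2) - 1.0 - z_logvar)
    return kl.sum(dim=1).mean()
```

A full training step would sum jsd_mi_loss over the encoder's intermediate scales (one discriminator per scale), add the KL term with a weighting coefficient, and combine both with a standard cross-entropy classification loss; the abstract does not specify those weightings.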
Related papers
- Disentangling CLIP Features for Enhanced Localized Understanding [58.73850193789384]
We propose Unmix-CLIP, a novel framework designed to reduce mutual feature information (MFI) and improve feature disentanglement.
For the COCO-14 dataset, Unmix-CLIP reduces feature similarity by 24.9%.
arXiv Detail & Related papers (2025-02-05T08:20:31Z)
- Pruning Deep Convolutional Neural Network Using Conditional Mutual Information [10.302118493842647]
Convolutional Neural Networks (CNNs) achieve high performance in image classification tasks but are challenging to deploy on resource-limited hardware.
We propose a structured filter-pruning approach for CNNs that identifies and selectively retains the most informative features in each layer (a simplified sketch of information-based filter ranking appears after this list).
arXiv Detail & Related papers (2024-11-27T18:23:59Z)
- Electromagnetic Scattering Kernel Guided Reciprocal Point Learning for SAR Open-Set Recognition [6.226365654670747]
Open Set Recognition (OSR) aims to categorize known classes while labeling unknown ones as "unknown".
To enhance open-set SAR classification, a method combining scattering kernels with a reciprocal point learning network is proposed.
Convolutional kernels are designed based on large-sized attribute scattering center models.
arXiv Detail & Related papers (2024-11-07T13:26:20Z)
- A Refreshed Similarity-based Upsampler for Direct High-Ratio Feature Upsampling [54.05517338122698]
A popular similarity-based feature upsampling pipeline has been proposed, which utilizes a high-resolution feature as guidance.
We propose an explicitly controllable query-key feature alignment from both semantic-aware and detail-aware perspectives.
We develop a fine-grained neighbor selection strategy on high-resolution (HR) features, which is simple yet effective for alleviating mosaic artifacts.
arXiv Detail & Related papers (2024-07-02T14:12:21Z)
- PARFormer: Transformer-based Multi-Task Network for Pedestrian Attribute Recognition [23.814762073093153]
We propose a pure transformer-based multi-task PAR network named PARFormer, which includes four modules.
In the feature extraction module, we build a strong baseline for feature extraction, which achieves competitive results on several PAR benchmarks.
In the viewpoint perception module, we explore the impact of viewpoints on pedestrian attributes, and propose a multi-view contrastive loss.
In the attribute recognition module, we alleviate the negative-positive imbalance problem when generating attribute predictions.
arXiv Detail & Related papers (2023-04-14T16:27:56Z)
- Specificity-preserving RGB-D Saliency Detection [103.3722116992476]
We propose a specificity-preserving network (SP-Net) for RGB-D saliency detection.
Two modality-specific networks and a shared learning network are adopted to generate individual and shared saliency maps.
Experiments on six benchmark datasets demonstrate that our SP-Net outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2021-08-18T14:14:22Z)
- Hierarchical Deep CNN Feature Set-Based Representation Learning for Robust Cross-Resolution Face Recognition [59.29808528182607]
Cross-resolution face recognition (CRFR) is important in intelligent surveillance and biometric forensics.
Existing shallow learning-based and deep learning-based methods focus on mapping high-resolution and low-resolution (HR-LR) face pairs into a joint feature space.
In this study, we aim to fully exploit the multi-level deep convolutional neural network (CNN) feature set for robust CRFR.
arXiv Detail & Related papers (2021-03-25T14:03:42Z)
- Sequential Hierarchical Learning with Distribution Transformation for Image Super-Resolution [83.70890515772456]
We build a sequential hierarchical learning super-resolution network (SHSR) for effective image SR.
We consider the inter-scale correlations of features, and devise a sequential multi-scale block (SMB) to progressively explore the hierarchical information.
Experiment results show SHSR achieves superior quantitative performance and visual quality to state-of-the-art methods.
arXiv Detail & Related papers (2020-07-19T01:35:53Z)
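As referenced in the pruning entry above, here is a rough NumPy sketch of information-based filter ranking in the spirit of the conditional-mutual-information pruning paper. It scores filters by a plain (unconditional) histogram MI estimate between quantized activations and class labels, a deliberate simplification of that paper's conditional-MI criterion; the function names and the quantile-binning scheme are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def discrete_mi(x: np.ndarray, y: np.ndarray) -> float:
    """Plug-in mutual information estimate for two integer-coded arrays."""
    joint = np.zeros((int(x.max()) + 1, int(y.max()) + 1))
    np.add.at(joint, (x, y), 1.0)                     # contingency counts
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def rank_filters_by_mi(activations: np.ndarray, labels: np.ndarray,
                       n_bins: int = 8) -> np.ndarray:
    """Rank conv filters by MI(quantized pooled activation; class label).

    activations: (N, C) per-filter pooled responses on a calibration set.
    labels:      (N,) integer class labels.
    Returns filter indices, most informative first; pruning keeps a prefix.
    """
    edges = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]   # interior quantile cuts
    scores = []
    for c in range(activations.shape[1]):
        a = activations[:, c]
        cuts = np.unique(np.quantile(a, edges))       # guard duplicate edges
        scores.append(discrete_mi(np.digitize(a, cuts), labels))
    return np.argsort(scores)[::-1]
```

Pruning would then rebuild each layer keeping only the top-ranked filters; the actual method conditions each filter's score on the filters already retained, which this unconditional sketch omits.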