Related papers: M2IOSR: Maximal Mutual Information Open Set Recognition

URL: http://arxiv.org/abs/2108.02373v2
Date: Fri, 6 Aug 2021 00:37:12 GMT
Title: M2IOSR: Maximal Mutual Information Open Set Recognition
Authors: Xin Sun, Henghui Ding, Chi Zhang, Guosheng Lin, Keck-Voon Ling
Abstract summary: We propose a mutual information-based method with a streamlined architecture for open set recognition. The proposed method significantly improves the performance of baselines and achieves new state-of-the-art results on several benchmarks consistently.
Score: 47.1393314282815
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this work, we aim to address the challenging task of open set recognition (OSR). Many recent OSR methods rely on auto-encoders to extract class-specific features by a reconstruction strategy, requiring the network to restore the input image at the pixel level. This strategy is commonly over-demanding for OSR since class-specific features are generally contained in target objects, not in all pixels. To address this shortcoming, here we discard the pixel-level reconstruction strategy and pay more attention to improving the effectiveness of class-specific feature extraction. We propose a mutual information-based method with a streamlined architecture, Maximal Mutual Information Open Set Recognition (M2IOSR). The proposed M2IOSR only uses an encoder to extract class-specific features by maximizing the mutual information between the given input and its latent features across multiple scales. Meanwhile, to further reduce the open space risk, latent features are constrained to class-conditional Gaussian distributions by a KL-divergence loss function. In this way, a strong function is learned to prevent the network from mapping different observations to similar latent features and help the network extract class-specific features with desired statistical characteristics. The proposed method significantly improves the performance of baselines and consistently achieves new state-of-the-art results on several benchmarks.
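
The KL-divergence constraint mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the loss, so the following are assumptions: the encoder outputs a diagonal Gaussian (mean `mu`, log-variance `log_var`) for each sample, and each known class has a Gaussian target with mean `class_mean` and identity covariance. The function name `kl_to_class_gaussian` is hypothetical and not taken from the paper.

```python
import math


def kl_to_class_gaussian(mu, log_var, class_mean):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(class_mean, I) ).

    Assumption (not from the paper): diagonal posterior covariance and an
    identity-covariance class-conditional target.
    """
    total = 0.0
    for m, lv, c in zip(mu, log_var, class_mean):
        # Per-dimension term: sigma^2 + (mu - c)^2 - 1 - log(sigma^2)
        total += math.exp(lv) + (m - c) ** 2 - 1.0 - lv
    return 0.5 * total


if __name__ == "__main__":
    # A latent code that matches its class target exactly incurs zero penalty.
    print(kl_to_class_gaussian([0.0, 0.0], [0.0, 0.0], [0.0, 0.0]))
    # A latent code whose mean is shifted away from the class mean is penalized.
    print(kl_to_class_gaussian([1.0, 0.0], [0.0, 0.0], [0.0, 0.0]))
```

Under these assumptions the penalty grows as a latent code drifts from the mean of its class, which is one way to read the abstract's claim that the constraint reduces open space risk.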

Related papers

Disentangling CLIP Features for Enhanced Localized Understanding [58.73850193789384]
We propose Unmix-CLIP, a novel framework designed to reduce mutual feature information (MFI) and improve feature disentanglement. For the COCO-14 dataset, Unmix-CLIP reduces feature similarity by 24.9%.
arXiv Detail & Related papers (2025-02-05T08:20:31Z)
Pruning Deep Convolutional Neural Network Using Conditional Mutual Information [10.302118493842647]
Convolutional Neural Networks (CNNs) achieve high performance in image classification tasks but are challenging to deploy on resource-limited hardware. We propose a structured filter-pruning approach for CNNs that identifies and selectively retains the most informative features in each layer.
arXiv Detail & Related papers (2024-11-27T18:23:59Z)
Reciprocal Point Learning Network with Large Electromagnetic Kernel for SAR Open-Set Recognition [6.226365654670747]
Open Set Recognition (OSR) aims to categorize known classes while denoting unknown ones as "unknown". To enhance open-set SAR classification, a method called scattering kernel with reciprocal learning network is proposed. Convolutional kernels are designed based on large-sized attribute scattering center models.
arXiv Detail & Related papers (2024-11-07T13:26:20Z)
A Refreshed Similarity-based Upsampler for Direct High-Ratio Feature Upsampling [54.05517338122698]
We propose an explicitly controllable query-key feature alignment from both semantic-aware and detail-aware perspectives. We also develop a fine-grained neighbor selection strategy on HR features, which is simple yet effective for alleviating mosaic artifacts. Our proposed ReSFU framework consistently achieves satisfactory performance on different segmentation applications.
arXiv Detail & Related papers (2024-07-02T14:12:21Z)
Mutual-Guided Dynamic Network for Image Fusion [51.615598671899335]
We propose a novel mutual-guided dynamic network (MGDN) for image fusion, which allows for effective information utilization across different locations and inputs. Experimental results on five benchmark datasets demonstrate that our proposed method outperforms existing methods on four image fusion tasks.
arXiv Detail & Related papers (2023-08-24T03:50:37Z)
PARFormer: Transformer-based Multi-Task Network for Pedestrian Attribute Recognition [23.814762073093153]
We propose a pure transformer-based multi-task PAR network named PARFormer, which includes four modules. In the feature extraction module, we build a strong baseline for feature extraction, which achieves competitive results on several PAR benchmarks. In the viewpoint perception module, we explore the impact of viewpoints on pedestrian attributes, and propose a multi-view contrastive loss. In the attribute recognition module, we alleviate the negative-positive imbalance problem to generate the attribute predictions.
arXiv Detail & Related papers (2023-04-14T16:27:56Z)
Specificity-preserving RGB-D Saliency Detection [103.3722116992476]
We propose a specificity-preserving network (SP-Net) for RGB-D saliency detection. Two modality-specific networks and a shared learning network are adopted to generate individual and shared saliency maps. Experiments on six benchmark datasets demonstrate that our SP-Net outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2021-08-18T14:14:22Z)
Hierarchical Deep CNN Feature Set-Based Representation Learning for Robust Cross-Resolution Face Recognition [59.29808528182607]
Cross-resolution face recognition (CRFR) is important in intelligent surveillance and biometric forensics. Existing shallow learning-based and deep learning-based methods focus on mapping the HR-LR face pairs into a joint feature space. In this study, we aim to fully exploit the multi-level deep convolutional neural network (CNN) feature set for robust CRFR.
arXiv Detail & Related papers (2021-03-25T14:03:42Z)
Sequential Hierarchical Learning with Distribution Transformation for Image Super-Resolution [83.70890515772456]
We build a sequential hierarchical learning super-resolution network (SHSR) for effective image SR. We consider the inter-scale correlations of features, and devise a sequential multi-scale block (SMB) to progressively explore the hierarchical information. Experimental results show SHSR achieves superior quantitative performance and visual quality to state-of-the-art methods.
arXiv Detail & Related papers (2020-07-19T01:35:53Z)
Hybrid Embedded Deep Stacked Sparse Autoencoder with w_LPPD SVM Ensemble [13.981652331491558]
This paper presents a novel deep autoencoder, the hybrid feature embedded stacked sparse autoencoder (HESSAE). It is capable of learning discriminant deep features by embedding original features to filter weak hidden-layer outputs during training. The experimental results demonstrate that the proposed feature learning method yields superior performance compared to other existing and state-of-the-art feature learning algorithms.
arXiv Detail & Related papers (2020-02-17T04:06:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.