Self Identity Mapping
- URL: http://arxiv.org/abs/2509.18165v1
- Date: Wed, 17 Sep 2025 07:52:20 GMT
- Title: Self Identity Mapping
- Authors: Xiuding Cai, Yaoyao Zhu, Linjie Fu, Dong Miao, Yu Yao
- Abstract summary: Self Identity Mapping (SIM) is a data-intrinsic regularization framework that leverages an inverse mapping mechanism to enhance representation learning. As a model-agnostic, task-agnostic regularizer, SIM can be seamlessly integrated as a plug-and-play module. We extensively evaluate $\rho\text{SIM}$ across three tasks: image classification, few-shot prompt learning, and domain generalization.
- Score: 5.887119990422003
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Regularization is essential in deep learning to enhance generalization and mitigate overfitting. However, conventional techniques often rely on heuristics, making them less reliable or effective across diverse settings. We propose Self Identity Mapping (SIM), a simple yet effective, data-intrinsic regularization framework that leverages an inverse mapping mechanism to enhance representation learning. By reconstructing the input from its transformed output, SIM reduces information loss during forward propagation and facilitates smoother gradient flow. To address computational inefficiencies, we instantiate SIM as $\rho\text{SIM}$ by incorporating patch-level feature sampling and a projection-based method to reconstruct latent features, effectively lowering complexity. As a model-agnostic, task-agnostic regularizer, SIM can be seamlessly integrated as a plug-and-play module, making it applicable to different network architectures and tasks. We extensively evaluate $\rho\text{SIM}$ across three tasks: image classification, few-shot prompt learning, and domain generalization. Experimental results show consistent improvements over baseline methods, highlighting $\rho\text{SIM}$'s ability to enhance representation learning across various tasks. We also demonstrate that $\rho\text{SIM}$ is orthogonal to existing regularization methods, boosting their effectiveness. Moreover, our results confirm that $\rho\text{SIM}$ effectively preserves semantic information and enhances performance in dense-to-dense tasks, such as semantic segmentation and image translation, as well as in non-visual domains including audio classification and time series anomaly detection. The code is publicly available at https://github.com/XiudingCai/SIM-pytorch.
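The abstract describes the mechanism only in words: sample a subset of patch-level features, project the transformed output back, and penalize the reconstruction error against the input. The pure-Python sketch below illustrates that idea under stated assumptions and is not the authors' implementation. The function names, the reading of rho as the fraction of patches kept, the linear projection, and the mean-squared-error penalty are all hypothetical choices made for illustration.

```python
import random


def sample_patch_indices(num_patches, rho, rng):
    """Pick a random subset of patch indices.

    Assumption: rho is the fraction of patches kept for reconstruction.
    """
    k = max(1, int(round(rho * num_patches)))
    return sorted(rng.sample(range(num_patches), k))


def project(features, weight):
    """Apply a linear projection to each row vector: x -> x @ weight."""
    cols = len(weight[0])
    return [
        [sum(x[i] * weight[i][j] for i in range(len(x))) for j in range(cols)]
        for x in features
    ]


def sim_regularizer(inputs, outputs, weight, rho, rng):
    """Mean squared error between sampled input patches and their
    reconstructions obtained by projecting the layer outputs back."""
    idx = sample_patch_indices(len(inputs), rho, rng)
    recon = project([outputs[i] for i in idx], weight)
    target = [inputs[i] for i in idx]
    sq = [(r - t) ** 2 for rv, tv in zip(recon, target) for r, t in zip(rv, tv)]
    return sum(sq) / len(sq)


# Toy setup: 8 patches with 3-dimensional features; the "layer" doubles each value.
data_rng = random.Random(0)
inputs = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(8)]
outputs = [[2.0 * v for v in row] for row in inputs]

# A projection that exactly inverts the toy layer, and one that does not.
perfect_inverse = [[0.5 if i == j else 0.0 for j in range(3)] for i in range(3)]
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

loss_good = sim_regularizer(inputs, outputs, perfect_inverse, 0.5, random.Random(1))
loss_bad = sim_regularizer(inputs, outputs, identity, 0.5, random.Random(1))
print(loss_good, loss_bad)
```

In a training loop, a penalty like this would typically be added to the task loss with a weighting coefficient, and the projection would be learned jointly with the network; both of those details are assumptions here, since the abstract does not specify them.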
Related papers
- Beyond Softmax: A Natural Parameterization for Categorical Random Variables [61.709831225296305]
We introduce the $\textit{catnat}$ function, a function composed of a sequence of hierarchical binary splits. A rich set of experiments show that the proposed function improves the learning efficiency and yields models characterized by consistently higher test performance.
arXiv Detail & Related papers (2025-09-29T12:55:50Z) - Toward Better SSIM Loss for Unsupervised Monocular Depth Estimation [14.89929051723735]
This work proposes a new form of the structural similarity index measure (SSIM). Compared with the original SSIM function, the proposed form uses addition rather than multiplication to combine the luminance, contrast, and structural similarity components of SSIM. The loss function constructed with this scheme yields smoother gradients and achieves higher performance on unsupervised depth estimation.
arXiv Detail & Related papers (2025-06-05T08:43:24Z) - Semantic-aware Representation Learning for Homography Estimation [28.70450397793246]
We propose SRMatcher, a detector-free feature matching method, which encourages the network to learn integrated semantic feature representation.
By reducing errors stemming from semantic inconsistencies in matching pairs, our proposed SRMatcher is able to deliver more accurate and realistic outcomes.
arXiv Detail & Related papers (2024-07-18T08:36:28Z) - Enhancing Vision-Language Few-Shot Adaptation with Negative Learning [11.545127156146368]
We propose a Simple yet effective Negative Learning approach, SimNL, to more efficiently exploit task-specific knowledge.
We also introduce a plug-and-play few-shot instance reweighting technique to mitigate noisy outliers.
Our extensive experimental results validate that the proposed SimNL outperforms existing state-of-the-art methods on both few-shot learning and domain generalization tasks.
arXiv Detail & Related papers (2024-03-19T17:59:39Z) - Semantics-Aware Dynamic Localization and Refinement for Referring Image Segmentation [102.25240608024063]
Referring image segmentation segments an image from a language expression.
We develop an algorithm that shifts from being localization-centric to segmentation-centric.
Compared to its counterparts, our method is more versatile yet effective.
arXiv Detail & Related papers (2023-03-11T08:42:40Z) - How Fine-Tuning Allows for Effective Meta-Learning [50.17896588738377]
We present a theoretical framework for analyzing representations derived from a MAML-like algorithm.
We provide risk bounds on the best predictor found by fine-tuning via gradient descent, demonstrating that the algorithm can provably leverage the shared structure.
This separation result underscores the benefit of fine-tuning-based methods, such as MAML, over methods with "frozen representation" objectives in few-shot learning.
arXiv Detail & Related papers (2021-05-05T17:56:00Z) - Graph Sampling Based Deep Metric Learning for Generalizable Person Re-Identification [114.56752624945142]
We argue that the most popular random sampling method, the well-known PK sampler, is neither informative nor efficient for deep metric learning.
We propose an efficient mini batch sampling method called Graph Sampling (GS) for large-scale metric learning.
arXiv Detail & Related papers (2021-04-04T06:44:15Z) - Region Similarity Representation Learning [94.88055458257081]
Region Similarity Representation Learning (ReSim) is a new approach to self-supervised representation learning for localization-based tasks.
ReSim learns both regional representations for localization as well as semantic image-level representations.
We show how ReSim learns representations which significantly improve the localization and classification performance compared to a competitive MoCo-v2 baseline.
arXiv Detail & Related papers (2021-03-24T00:42:37Z) - Fast Few-Shot Classification by Few-Iteration Meta-Learning [173.32497326674775]
We introduce a fast optimization-based meta-learning method for few-shot classification.
Our strategy enables important aspects of the base learner objective to be learned during meta-training.
We perform a comprehensive experimental analysis, demonstrating the speed and effectiveness of our approach.
arXiv Detail & Related papers (2020-10-01T15:59:31Z) - Predicting What You Already Know Helps: Provable Self-Supervised Learning [60.27658820909876]
Self-supervised representation learning solves auxiliary prediction tasks (known as pretext tasks) without requiring labeled data.
We show a mechanism that exploits the statistical connections between certain reconstruction-based pretext tasks and is guaranteed to learn a good representation.
We prove that the linear layer yields a small approximation error even for a complex ground-truth function class.
arXiv Detail & Related papers (2020-08-03T17:56:13Z) - Image Matching across Wide Baselines: From Paper to Practice [80.9424750998559]
We introduce a comprehensive benchmark for local features and robust estimation algorithms.
Our pipeline's modular structure allows easy integration, configuration, and combination of different methods.
We show that with proper settings, classical solutions may still outperform the perceived state of the art.
arXiv Detail & Related papers (2020-03-03T15:20:57Z)