Related papers: CRC-RL: A Novel Visual Feature Representation Architecture for Unsupervised Reinforcement Learning

CRC-RL: A Novel Visual Feature Representation Architecture for Unsupervised Reinforcement Learning

URL: http://arxiv.org/abs/2301.13473v1
Date: Tue, 31 Jan 2023 08:41:18 GMT
Title: CRC-RL: A Novel Visual Feature Representation Architecture for Unsupervised Reinforcement Learning
Authors: Darshita Jain, Anima Majumder, Samrat Dutta and Swagat Kumar
Abstract summary: A novel architecture is proposed that uses a heterogeneous loss function, called CRC loss, to learn improved visual features. The proposed architecture, called CRC-RL, is shown to outperform the existing state-of-the-art methods on the challenging Deep mind control suite environments.
Score: 7.4010632660248765
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper addresses the problem of visual feature representation learning with an aim to improve the performance of end-to-end reinforcement learning (RL) models. Specifically, a novel architecture is proposed that uses a heterogeneous loss function, called CRC loss, to learn improved visual features which can then be used for policy learning in RL. The CRC-loss function is a combination of three individual loss functions, namely, contrastive, reconstruction and consistency loss. The feature representation is learned in parallel to the policy learning while sharing the weight updates through a Siamese Twin encoder model. This encoder model is augmented with a decoder network and a feature projection network to facilitate computation of the above loss components. Through empirical analysis involving latent feature visualization, an attempt is made to provide an insight into the role played by this loss function in learning new action-dependent features and how they are linked to the complexity of the problems being solved. The proposed architecture, called CRC-RL, is shown to outperform the existing state-of-the-art methods on the challenging Deep mind control suite environments by a significant margin thereby creating a new benchmark in this field.

Related papers

Implicit Neural Representation-Based Continuous Single Image Super Resolution: An Empirical Study [50.15623093332659]
Implicit neural representation (INR) has become the standard approach for arbitrary-scale image super-resolution (ASSR)<n>We compare existing techniques across diverse settings and present aggregated performance results on multiple image quality metrics.<n>We examine a new loss function that penalizes intensity variations while preserving edges, textures, and finer details during training.
arXiv Detail & Related papers (2026-01-25T07:09:20Z)
Rethinking Hebbian Principle: Low-Dimensional Structural Projection for Unsupervised Learning [17.299267108673277]
Hebbian learning is a biological principle that intuitively describes how neurons adapt their connections through repeated stimuli.<n>We introduce the Structural Projection Hebbian Representation (SPHeRe), a novel unsupervised learning method.<n> Experimental results show that SPHeRe achieves SOTA performance among unsupervised synaptic plasticity approaches.
arXiv Detail & Related papers (2025-10-16T15:47:29Z)
Rethinking the Role of Dynamic Sparse Training for Scalable Deep Reinforcement Learning [58.533203990515034]
Scaling neural networks has driven breakthrough advances in machine learning, yet this paradigm fails in deep reinforcement learning (DRL)<n>We show that dynamic sparse training strategies provide module-specific benefits that complement the primary scalability foundation established by architectural improvements.<n>We finally distill these insights into Module-Specific Training (MST), a practical framework that exploits the benefits of architectural improvements and demonstrates substantial scalability gains across diverse RL algorithms without algorithmic modifications.
arXiv Detail & Related papers (2025-10-14T03:03:08Z)
Learning Distinguishable Representations in Deep Q-Networks for Linear Transfer [0.9558392439655014]
We propose a novel deep Q-learning approach that introduces a regularization term to reduce positive correlations between feature representation of states.<n>We demonstrate the efficacy of our approach in improving transfer learning performance and thereby reducing computational overhead.
arXiv Detail & Related papers (2025-09-29T15:44:35Z)
Foundations and Models in Modern Computer Vision: Key Building Blocks in Landmark Architectures [34.542592986038265]
This report analyzes the evolution of key design patterns in computer vision by examining six influential papers.<n>We review ResNet, which introduced residual connections to overcome the vanishing gradient problem.<n>We examine the Vision Transformer (ViT), which established a new paradigm by applying the Transformer architecture to sequences of image patches.
arXiv Detail & Related papers (2025-07-31T09:08:11Z)
C2D-ISR: Optimizing Attention-based Image Super-resolution from Continuous to Discrete Scales [6.700548615812325]
We propose a novel framework, textbfC2D-ISR, for optimizing attention-based image super-resolution models. Our approach is based on a two-stage training methodology and a hierarchical encoding mechanism. In addition, we generalize the hierarchical encoding mechanism with existing attention-based network structures.
arXiv Detail & Related papers (2025-03-17T21:52:18Z)
Soft Knowledge Distillation with Multi-Dimensional Cross-Net Attention for Image Restoration Models Compression [0.0]
Transformer-based encoder-decoder models have achieved remarkable success in image-to-image transfer tasks. However, their high computational complexity-manifested in elevated FLOPs and parameter counts-limits their application in real-world scenarios. We propose a Soft Knowledge Distillation (SKD) strategy that incorporates a Multi-dimensional Cross-net Attention (MCA) mechanism for compressing image restoration models.
arXiv Detail & Related papers (2025-01-16T06:25:56Z)
Fast and Efficient Local Search for Genetic Programming Based Loss Function Learning [12.581217671500887]
We propose a new meta-learning framework for task and model-agnostic loss function learning via a hybrid search approach. Results show that the learned loss functions bring improved convergence, sample efficiency, and inference performance on tabulated, computer vision, and natural language processing problems.
arXiv Detail & Related papers (2024-03-01T02:20:04Z)
Low-Resolution Self-Attention for Semantic Segmentation [93.30597515880079]
We introduce the Low-Resolution Self-Attention (LRSA) mechanism to capture global context at a significantly reduced computational cost.<n>Our approach involves computing self-attention in a fixed low-resolution space regardless of the input image's resolution.<n>We demonstrate the effectiveness of our LRSA approach by building the LRFormer, a vision transformer with an encoder-decoder structure.
arXiv Detail & Related papers (2023-10-08T06:10:09Z)
A Neuromorphic Architecture for Reinforcement Learning from Real-Valued Observations [0.34410212782758043]
Reinforcement Learning (RL) provides a powerful framework for decision-making in complex environments. This paper presents a novel Spiking Neural Network (SNN) architecture for solving RL problems with real-valued observations.
arXiv Detail & Related papers (2023-07-06T12:33:34Z)
Class Anchor Margin Loss for Content-Based Image Retrieval [97.81742911657497]
We propose a novel repeller-attractor loss that falls in the metric learning paradigm, yet directly optimize for the L2 metric without the need of generating pairs. We evaluate the proposed objective in the context of few-shot and full-set training on the CBIR task, by using both convolutional and transformer architectures.
arXiv Detail & Related papers (2023-06-01T12:53:10Z)
Bridging Component Learning with Degradation Modelling for Blind Image Super-Resolution [69.11604249813304]
We propose a components decomposition and co-optimization network (CDCN) for blind SR. CDCN decomposes the input LR image into structure and detail components in feature space. We present a degradation-driven learning strategy to jointly supervise the HR image detail and structure restoration process.
arXiv Detail & Related papers (2022-12-03T14:53:56Z)
Learning Detail-Structure Alternative Optimization for Blind Super-Resolution [69.11604249813304]
We propose an effective and kernel-free network, namely DSSR, which enables recurrent detail-structure alternative optimization without blur kernel prior incorporation for blind SR. In our DSSR, a detail-structure modulation module (DSMM) is built to exploit the interaction and collaboration of image details and structures. Our method achieves the state-of-the-art against existing methods.
arXiv Detail & Related papers (2022-12-03T14:44:17Z)
Learning Deep Representations via Contrastive Learning for Instance Retrieval [11.736450745549792]
This paper makes the first attempt that tackles the problem using instance-discrimination based contrastive learning (CL) In this work, we approach this problem by exploring the capability of deriving discriminative representations from pre-trained and fine-tuned CL models.
arXiv Detail & Related papers (2022-09-28T04:36:34Z)
SIRe-Networks: Skip Connections over Interlaced Multi-Task Learning and Residual Connections for Structure Preserving Object Classification [28.02302915971059]
In this paper, we introduce an interlaced multi-task learning strategy, defined SIRe, to reduce the vanishing gradient in relation to the object classification task. The presented methodology directly improves a convolutional neural network (CNN) by enforcing the input image structure preservation through auto-encoders. To validate the presented methodology, a simple CNN and various implementations of famous networks are extended via the SIRe strategy and extensively tested on the CIFAR100 dataset.
arXiv Detail & Related papers (2021-10-06T13:54:49Z)
Visual Alignment Constraint for Continuous Sign Language Recognition [74.26707067455837]
Vision-based Continuous Sign Language Recognition aims to recognize unsegmented gestures from image sequences. In this work, we revisit the overfitting problem in recent CTC-based CSLR works and attribute it to the insufficient training of the feature extractor. We propose a Visual Alignment Constraint (VAC) to enhance the feature extractor with more alignment supervision.
arXiv Detail & Related papers (2021-04-06T07:24:58Z)
Hierarchical Deep CNN Feature Set-Based Representation Learning for Robust Cross-Resolution Face Recognition [59.29808528182607]
Cross-resolution face recognition (CRFR) is important in intelligent surveillance and biometric forensics. Existing shallow learning-based and deep learning-based methods focus on mapping the HR-LR face pairs into a joint feature space. In this study, we desire to fully exploit the multi-level deep convolutional neural network (CNN) feature set for robust CRFR.
arXiv Detail & Related papers (2021-03-25T14:03:42Z)
Progressive Self-Guided Loss for Salient Object Detection [102.35488902433896]
We present a progressive self-guided loss function to facilitate deep learning-based salient object detection in images. Our framework takes advantage of adaptively aggregated multi-scale features to locate and detect salient objects effectively.
arXiv Detail & Related papers (2021-01-07T07:33:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.