TransCL: Transformer Makes Strong and Flexible Compressive Learning
- URL: http://arxiv.org/abs/2207.11972v1
- Date: Mon, 25 Jul 2022 08:21:48 GMT
- Title: TransCL: Transformer Makes Strong and Flexible Compressive Learning
- Authors: Chong Mou, Jian Zhang
- Abstract summary: Compressive learning (CL) is an emerging framework that integrates signal acquisition via compressed sensing (CS) and machine learning for inference tasks directly on a small number of measurements.
Previous attempts at CL are not only limited to a fixed CS ratio but are also restricted to MNIST/CIFAR-like datasets, and do not scale to complex real-world high-resolution (HR) data or vision tasks.
In this paper, a novel transformer-based compressive learning framework on large-scale images with arbitrary CS ratios, dubbed TransCL, is proposed.
- Score: 11.613886854794133
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Compressive learning (CL) is an emerging framework that integrates signal
acquisition via compressed sensing (CS) and machine learning for inference
tasks directly on a small number of measurements. It can be a promising
alternative to classical image-domain methods and enjoys great advantages in
memory saving and computational efficiency. However, previous attempts at CL
are not only limited to a fixed CS ratio, which lacks flexibility, but are also
restricted to MNIST/CIFAR-like datasets and do not scale to complex real-world
high-resolution (HR) data or vision tasks. In this paper, a novel
transformer-based compressive learning framework on large-scale images with
arbitrary CS ratios, dubbed TransCL, is proposed. Specifically, TransCL first
utilizes the strategy of learnable block-based compressed sensing and proposes
a flexible linear projection strategy to enable CL to be performed on
large-scale images in an efficient block-by-block manner with arbitrary CS
ratios. Then, regarding CS measurements from all blocks as a sequence, a pure
transformer-based backbone is deployed to perform vision tasks with various
task-oriented heads. Our in-depth analysis shows that TransCL exhibits
strong resistance to interference and robust adaptability to arbitrary CS
ratios. Extensive experiments on complex HR data demonstrate that the proposed
TransCL can achieve state-of-the-art performance in image classification and
semantic segmentation tasks. In particular, TransCL with a CS ratio of $10\%$
can obtain almost the same performance as when operating directly on the
original data and can still obtain satisfying performance even with an
extremely low CS ratio of $1\%$. The source code of our proposed TransCL is
available at \url{https://github.com/MC-E/TransCL/}.
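The block-by-block sampling described above can be illustrated with a minimal numpy sketch. This is an assumption-laden toy, not TransCL's learnable sampling matrix: each non-overlapping block is flattened and projected by a random matrix to a number of measurements set by the CS ratio, and the per-block measurements form the token sequence a transformer backbone would consume.

```python
import numpy as np

def block_cs_measure(image, block=32, ratio=0.10, seed=0):
    """Toy block-based compressed sensing: flatten each block x_i of
    block*block pixels and compute y_i = Phi @ x_i, where Phi is a
    random m x n sampling matrix with m = round(ratio * n).
    (TransCL learns Phi; a fixed Gaussian Phi is used here for illustration.)"""
    h, w = image.shape
    n = block * block
    m = max(1, int(round(ratio * n)))
    rng = np.random.default_rng(seed)
    phi = rng.standard_normal((m, n)) / np.sqrt(m)  # random sampling matrix
    tokens = []
    for r in range(0, h, block):
        for c in range(0, w, block):
            x = image[r:r + block, c:c + block].reshape(n)
            tokens.append(phi @ x)
    # The sequence of per-block measurements plays the role of input
    # tokens for a transformer backbone.
    return np.stack(tokens)  # shape: (num_blocks, m)

img = np.random.rand(224, 224)
seq = block_cs_measure(img, block=32, ratio=0.10)
print(seq.shape)  # (49, 102): 224/32 = 7 blocks per side, m = round(0.1 * 1024)
```

Because the number of measurements m scales with the ratio while the block grid adapts to the image size, the same scheme covers arbitrary CS ratios and large images without retiling the whole image into one giant vector.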
Related papers
- Variable-size Symmetry-based Graph Fourier Transforms for image compression [65.7352685872625]
We propose a new family of Symmetry-based Graph Fourier Transforms (SBGFTs) of variable sizes, integrated into a coding framework.
Our proposed algorithm generates symmetric graphs on the grid by adding specific symmetrical connections between nodes.
Experiments show that SBGFTs outperform the primary transforms integrated in the explicit Multiple Transform Selection.
arXiv Detail & Related papers (2024-11-24T13:00:44Z) - Simplifying CLIP: Unleashing the Power of Large-Scale Models on Consumer-level Computers [3.2492319522383717]
Contrastive Language-Image Pre-training (CLIP) has attracted a surge of attention for its superior zero-shot performance and excellent transferability to downstream tasks.
However, training such large-scale models usually requires substantial computation and storage, which poses barriers for general users with consumer-level computers.
arXiv Detail & Related papers (2024-11-22T08:17:46Z) - Task-Aware Dynamic Transformer for Efficient Arbitrary-Scale Image Super-Resolution [8.78015409192613]
Arbitrary-scale super-resolution (ASSR) aims to learn a single model for image super-resolution at arbitrary magnifying scales.
Existing ASSR networks typically comprise an off-the-shelf scale-agnostic feature extractor and an arbitrary scale upsampler.
We propose a Task-Aware Dynamic Transformer (TADT) as an input-adaptive feature extractor for efficient image ASSR.
arXiv Detail & Related papers (2024-08-16T13:35:52Z) - Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach [58.57026686186709]
We introduce the Convolutional Transformer layer (ConvFormer) and propose a ConvFormer-based Super-Resolution network (CFSR).
CFSR inherits the advantages of both convolution-based and transformer-based approaches.
Experiments demonstrate that CFSR strikes an optimal balance between computational cost and performance.
arXiv Detail & Related papers (2024-01-11T03:08:00Z) - Binarized Spectral Compressive Imaging [59.18636040850608]
Existing deep learning models for hyperspectral image (HSI) reconstruction achieve good performance but require powerful hardware with enormous memory and computational resources.
We propose a novel method, Binarized Spectral-Redistribution Network (BiSRNet).
BiSRNet is derived by using the proposed techniques to binarize the base model.
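The binarization step mentioned above can be sketched generically. The following is a common sign-and-scale scheme (in the spirit of XNOR-style binarization), not necessarily BiSRNet's exact spectral-redistribution technique: each weight tensor is replaced by its sign times a per-tensor scale, cutting storage roughly 32x relative to float32.

```python
import numpy as np

def binarize_weights(w):
    """Generic 1-bit weight binarization (illustrative, not BiSRNet's
    exact method): approximate w by alpha * sign(w), where the scale
    alpha = mean(|w|) minimizes the L2 error of the approximation."""
    alpha = np.abs(w).mean()
    return alpha * np.sign(w), alpha

w = np.array([[0.3, -0.7], [1.1, -0.1]])
wb, alpha = binarize_weights(w)
print(alpha)  # 0.55 = (0.3 + 0.7 + 1.1 + 0.1) / 4
print(wb)     # [[ 0.55 -0.55] [ 0.55 -0.55]]
```

Only the signs (1 bit each) and the single scale need to be stored, which is what makes such binarized reconstruction networks attractive on memory-constrained hardware.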
arXiv Detail & Related papers (2023-05-17T15:36:08Z) - Efficient Adversarial Contrastive Learning via Robustness-Aware Coreset
Selection [59.77647907277523]
Adversarial contrastive learning (ACL) does not require expensive data annotations but outputs a robust representation that withstands adversarial attacks.
ACL needs tremendous running time to generate the adversarial variants of all training data.
This paper proposes a robustness-aware coreset selection (RCS) method to speed up ACL.
arXiv Detail & Related papers (2023-02-08T03:20:14Z) - Effective Invertible Arbitrary Image Rescaling [77.46732646918936]
Invertible Neural Networks (INN) are able to increase upscaling accuracy significantly by optimizing the downscaling and upscaling cycle jointly.
A simple and effective invertible arbitrary rescaling network (IARN) is proposed to achieve arbitrary image rescaling by training only one model in this work.
It is shown to achieve a state-of-the-art (SOTA) performance in bidirectional arbitrary rescaling without compromising perceptual quality in LR outputs.
arXiv Detail & Related papers (2022-09-26T22:22:30Z) - Cross-receptive Focused Inference Network for Lightweight Image Super-Resolution [64.25751738088015]
Transformer-based methods have shown impressive performance in single image super-resolution (SISR) tasks.
However, the ability of Transformers to incorporate contextual information for dynamic feature extraction is often neglected.
We propose a lightweight Cross-receptive Focused Inference Network (CFIN) that consists of a cascade of CT Blocks mixed with CNN and Transformer.
arXiv Detail & Related papers (2022-07-06T16:32:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.