Rethinking Spatially-Adaptive Normalization
- URL: http://arxiv.org/abs/2004.02867v1
- Date: Mon, 6 Apr 2020 17:58:25 GMT
- Title: Rethinking Spatially-Adaptive Normalization
- Authors: Zhentao Tan, Dongdong Chen, Qi Chu, Menglei Chai, Jing Liao, Mingming
He, Lu Yuan, Nenghai Yu
- Abstract summary: Class-adaptive normalization (CLADE) is a lightweight variant that is not adaptive to spatial positions or layouts.
CLADE greatly reduces the computation cost while still preserving the semantic information during generation.
- Score: 111.13203525538496
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spatially-adaptive normalization has recently been remarkably successful in conditional semantic image synthesis: it modulates the normalized activations with spatially-varying transformations learned from semantic layouts, so that semantic information is not washed away. Despite its impressive performance, a more thorough understanding of the true advantages inside the box is still highly demanded, to help reduce the significant computation and parameter overhead introduced by these new structures. In this paper, from a return-on-investment point of view, we present a deep analysis of the effectiveness of SPADE and observe that its advantages actually come mainly from its semantic awareness rather than its spatial adaptiveness. Inspired by this observation, we propose class-adaptive normalization (CLADE), a lightweight variant that is not adaptive to spatial positions or layouts. Thanks to this design, CLADE greatly reduces the computation cost while still preserving semantic information during generation. Extensive experiments on multiple challenging datasets demonstrate that CLADE achieves fidelity on par with SPADE at a much lower overhead. Taking the generator for the ADE20k dataset as an example, the extra parameter and computation costs introduced by CLADE are only 4.57% and 0.07%, whereas those of SPADE are 39.21% and 234.73%, respectively.
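To make the contrast concrete, here is a minimal PyTorch sketch of the class-adaptive modulation idea: instead of predicting spatially-varying modulation maps from the layout with extra convolution layers (as SPADE does), each semantic class owns a single learnable (gamma, beta) pair that is scattered over the feature map via the label map. The class name, the choice of BatchNorm, and the argument names are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CLADELayer(nn.Module):
    """Minimal sketch of class-adaptive normalization (CLADE).

    One learnable (gamma, beta) pair per semantic class is gathered
    according to the label map, rather than being predicted from the
    layout by convolutions as in SPADE.
    """

    def __init__(self, num_features: int, num_classes: int):
        super().__init__()
        # Parameter-free normalization; the modulation below re-injects scale/shift.
        self.norm = nn.BatchNorm2d(num_features, affine=False)
        self.gamma = nn.Parameter(torch.ones(num_classes, num_features))
        self.beta = nn.Parameter(torch.zeros(num_classes, num_features))

    def forward(self, x: torch.Tensor, label_map: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W) activations; label_map: (N, H, W) integer class ids.
        x = self.norm(x)
        # Resize the label map to the feature resolution (nearest keeps hard labels).
        label_map = F.interpolate(
            label_map.unsqueeze(1).float(), size=x.shape[2:], mode="nearest"
        ).squeeze(1).long()
        # Gather per-class modulation and move the channel axis to dim 1.
        gamma = self.gamma[label_map].permute(0, 3, 1, 2)  # (N, C, H, W)
        beta = self.beta[label_map].permute(0, 3, 1, 2)
        return x * gamma + beta
```

Because the per-class table holds only 2 × num_classes × num_features scalars and the lookup is a plain gather, the extra parameters and FLOPs stay tiny compared with SPADE's convolutional modulation branches, which is consistent with the overhead figures quoted in the abstract.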
Related papers
- Gradient Multi-Normalization for Stateless and Scalable LLM Training [16.037614012166063]
Training large language models (LLMs) typically relies on adaptive optimizers such as Adam, which store additional state information to accelerate convergence but incur significant memory overhead.
Recent efforts, such as SWAN (Ma et al., 2024), address this by eliminating the need for optimizer state while achieving performance comparable to Adam via a multi-step preprocessing procedure applied to instantaneous gradients.
We introduce a novel framework for designing stateless optimizers that normalize gradients according to multiple norms (an illustrative sketch follows the related-papers list below). Experiments on pre-training LLaMA models with up to 1 billion parameters demonstrate a 3x speedup over Adam with significantly reduced memory requirements, outperforming other memory-efficient baselines.
arXiv Detail & Related papers (2025-02-10T18:09:53Z) - APAR: Modeling Irregular Target Functions in Tabular Regression via Arithmetic-Aware Pre-Training and Adaptive-Regularized Fine-Tuning [12.35924469567586]
We propose a novel Arithmetic-Aware Pre-training and Adaptive-Regularized Fine-tuning framework (APAR).
In the pre-training phase, APAR introduces an arithmetic-aware pretext objective to capture intricate sample-wise relationships from the perspective of continuous labels.
In the fine-tuning phase, a consistency-based adaptive regularization technique is proposed to self-learn appropriate data augmentation.
arXiv Detail & Related papers (2024-12-14T19:33:21Z) - ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts [71.91042186338163]
ALoRE is a novel PETL method that reuses the hypercomplex parameterized space constructed by Kronecker product to Aggregate Low Rank Experts.
Thanks to the artful design, ALoRE maintains negligible extra parameters and can be effortlessly merged into the frozen backbone.
arXiv Detail & Related papers (2024-12-11T12:31:30Z) - Sparse is Enough in Fine-tuning Pre-trained Large Language Models [98.46493578509039]
We propose a gradient-based sparse fine-tuning algorithm, named Sparse Increment Fine-Tuning (SIFT).
We validate its effectiveness on a range of tasks including the GLUE Benchmark and Instruction-tuning.
arXiv Detail & Related papers (2023-12-19T06:06:30Z) - Adaptive and Cascaded Compressive Sensing [10.162966219929887]
Scene-dependent adaptive compressive sensing (CS) has been a long-pursued goal with great potential to significantly improve the performance of CS.
We propose an error-clamping scheme based on the restricted isometry property (RIP) condition, which can directly predict the reconstruction error.
We also propose a cascaded feature fusion reconstruction network that efficiently utilizes the information derived from different adaptive sampling stages.
arXiv Detail & Related papers (2022-03-21T07:50:24Z) - Regularizing Variational Autoencoder with Diversity and Uncertainty
Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z) - Inception Convolution with Efficient Dilation Search [121.41030859447487]
Dilated convolution is a critical variant of the standard convolutional neural network for controlling effective receptive fields and handling large scale variance of objects.
We propose a new variant of dilated convolution, namely inception (dilated) convolution, where the convolutions have independent dilations across different axes, channels, and layers (an illustrative sketch follows the related-papers list below).
To fit the complex inception convolution to the data in practice, a simple yet effective dilation search algorithm (EDO) based on statistical optimization is developed.
arXiv Detail & Related papers (2020-12-25T14:58:35Z) - Efficient Semantic Image Synthesis via Class-Adaptive Normalization [116.63715955932174]
Class-adaptive normalization (CLADE) is a lightweight but equally-effective variant that is only adaptive to semantic class.
We introduce an intra-class positional map encoding, calculated from semantic layouts, to modulate the normalization parameters of CLADE.
The proposed CLADE can be generalized to different SPADE-based methods while achieving generation quality comparable to SPADE.
arXiv Detail & Related papers (2020-12-08T18:59:32Z) - SASL: Saliency-Adaptive Sparsity Learning for Neural Network
Acceleration [20.92912642901645]
We propose a Saliency-Adaptive Sparsity Learning (SASL) approach for further optimization.
Our method can reduce the FLOPs of ResNet-50 by 49.7% with negligible degradation of 0.39% in top-1 and 0.05% in top-5 accuracy.
arXiv Detail & Related papers (2020-03-12T16:49:37Z)
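As referenced in the Gradient Multi-Normalization entry above, the sketch below shows one way a stateless update could apply several successive normalizations to the instantaneous gradient before a plain SGD-style step. The specific operators (per-row RMS normalization followed by a global-norm rescale) and the function names are assumptions chosen purely for illustration, not the procedure from that paper.

```python
import torch

def multi_normalize(grad: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Illustrative multi-norm preprocessing of an instantaneous gradient."""
    if grad.ndim == 2:
        # (1) Normalize each row of a weight-matrix gradient to unit RMS.
        row_rms = grad.pow(2).mean(dim=1, keepdim=True).sqrt()
        grad = grad / (row_rms + eps)
    # (2) Rescale the whole tensor to unit Frobenius norm.
    return grad / (grad.norm() + eps)

@torch.no_grad()
def stateless_step(params, lr: float = 1e-3) -> None:
    """SGD-style update that keeps no optimizer state (no moments, no history)."""
    for p in params:
        if p.grad is not None:
            p.add_(multi_normalize(p.grad), alpha=-lr)
```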
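Similarly, for the inception (dilated) convolution entry, the following sketch shows how independent dilations can be assigned per spatial axis and per channel group by splitting the input channels and giving each branch its own (height, width) dilation pair. The fixed dilation pattern here is an assumption for illustration; in the paper the per-layer pattern is found by the EDO search rather than hand-picked.

```python
import torch
import torch.nn as nn

class InceptionDilatedConv2d(nn.Module):
    """Illustrative sketch: channel groups with independent (H, W) dilations."""

    def __init__(self, in_channels: int, out_channels: int,
                 dilations=((1, 1), (1, 2), (2, 1), (2, 2)), kernel_size: int = 3):
        super().__init__()
        assert in_channels % len(dilations) == 0 and out_channels % len(dilations) == 0
        cin = in_channels // len(dilations)
        cout = out_channels // len(dilations)
        self.branches = nn.ModuleList()
        for dh, dw in dilations:
            # Padding grows with the dilation on each axis so output size is preserved.
            pad = (dh * (kernel_size - 1) // 2, dw * (kernel_size - 1) // 2)
            self.branches.append(
                nn.Conv2d(cin, cout, kernel_size, padding=pad, dilation=(dh, dw))
            )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        chunks = torch.chunk(x, len(self.branches), dim=1)
        return torch.cat([conv(c) for conv, c in zip(self.branches, chunks)], dim=1)
```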
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.