Efficient Semantic Image Synthesis via Class-Adaptive Normalization
- URL: http://arxiv.org/abs/2012.04644v2
- Date: Tue, 4 May 2021 23:20:35 GMT
- Title: Efficient Semantic Image Synthesis via Class-Adaptive Normalization
- Authors: Zhentao Tan and Dongdong Chen and Qi Chu and Menglei Chai and Jing
Liao and Mingming He and Lu Yuan and Gang Hua and Nenghai Yu
- Abstract summary: Class-adaptive normalization (CLADE) is a lightweight but equally-effective variant that is only adaptive to semantic class.
We introduce intra-class positional map encoding calculated from semantic layouts to modulate the normalization parameters of CLADE.
The proposed CLADE generalizes to different SPADE-based methods while achieving generation quality comparable to SPADE.
- Score: 116.63715955932174
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spatially-adaptive normalization (SPADE) has recently proven remarkably successful in
conditional semantic image synthesis \cite{park2019semantic}, which modulates
the normalized activation with spatially-varying transformations learned from
semantic layouts, to prevent the semantic information from being washed away.
Despite its impressive performance, a more thorough understanding of what drives
its advantages is still needed, particularly to help reduce the significant
computation and parameter overhead introduced by this novel structure. In this
paper, from a return-on-investment point of view, we conduct
an in-depth analysis of the effectiveness of this spatially-adaptive
normalization and observe that its modulation parameters benefit more from
semantic-awareness rather than spatial-adaptiveness, especially for
high-resolution input masks. Inspired by this observation, we propose
class-adaptive normalization (CLADE), a lightweight but equally-effective
variant that is only adaptive to semantic class. In order to further improve
spatial-adaptiveness, we introduce intra-class positional map encoding
calculated from semantic layouts to modulate the normalization parameters of
CLADE and propose a truly spatially-adaptive variant of CLADE, namely
CLADE-ICPE. Through extensive experiments on multiple challenging datasets, we
demonstrate that the proposed CLADE can be generalized to different SPADE-based
methods while achieving generation quality comparable to SPADE, but it
is much more efficient with fewer extra parameters and lower computational
cost. The code and pretrained models are available at
\url{https://github.com/tzt101/CLADE.git}.
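As a rough illustration of the idea described in the abstract, the following is a minimal PyTorch sketch of class-adaptive modulation: SPADE's per-pixel, convolutionally generated scale and shift maps are replaced by a per-class lookup table indexed by the semantic layout. The class and argument names here are illustrative assumptions, not the exact interface of the released code linked above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CLADESketch(nn.Module):
    """Class-adaptive normalization sketch: per-class scale/shift instead of
    SPADE's spatially-varying modulation maps (illustrative, not the official API)."""

    def __init__(self, num_channels, num_classes):
        super().__init__()
        # Parameter-free normalization of the incoming activation.
        self.norm = nn.BatchNorm2d(num_channels, affine=False)
        # One (gamma, beta) pair per semantic class; a lookup table replaces
        # SPADE's two-layer convolutional modulation network.
        self.gamma = nn.Embedding(num_classes, num_channels)
        self.beta = nn.Embedding(num_classes, num_channels)
        nn.init.ones_(self.gamma.weight)
        nn.init.zeros_(self.beta.weight)

    def forward(self, x, segmap):
        # x:      (N, C, H, W) feature map
        # segmap: (N, H', W') integer class labels
        x = self.norm(x)
        # Resize the label map to the feature resolution; nearest keeps labels intact.
        segmap = F.interpolate(segmap.unsqueeze(1).float(), size=x.shape[2:],
                               mode='nearest').squeeze(1).long()
        # Broadcast the per-class parameters to every pixel of that class.
        gamma = self.gamma(segmap).permute(0, 3, 1, 2)  # (N, C, H, W)
        beta = self.beta(segmap).permute(0, 3, 1, 2)
        return x * gamma + beta
```

CLADE-ICPE additionally modulates these per-class parameters with an intra-class positional map computed from the semantic layout; that extension is omitted from this sketch.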
Related papers
- Regularizing Subspace Redundancy of Low-Rank Adaptation [54.473090597164834]
We propose ReSoRA, a method that explicitly models redundancy between mapping subspaces and adaptively regularizes the subspace redundancy of low-rank adaptation.
Our proposed method consistently facilitates existing state-of-the-art PETL methods across various backbones and datasets in vision-language retrieval and standard visual classification benchmarks.
As a training supervision, ReSoRA can be seamlessly integrated into existing approaches in a plug-and-play manner, with no additional inference costs.
arXiv Detail & Related papers (2025-07-28T11:52:56Z) - ESSA: Evolutionary Strategies for Scalable Alignment [2.589791058467358]
This paper introduces ESSA, a new framework that uses Evolutionary Strategies (ES) to efficiently align Large Language Models (LLMs).
ES is well-suited for LLM alignment due to its favorable properties, such as high parallelizability, memory efficiency, robustness to sparse rewards, and fewer data samples required for convergence.
Our findings establish ES as a promising and scalable alternative to gradient-based alignment, paving the way for efficient post-training of large language models.
arXiv Detail & Related papers (2025-07-06T16:23:07Z) - Perception-Oriented Latent Coding for High-Performance Compressed Domain Semantic Inference [30.78149130760627]
Perception-Oriented Latent Coding (POLC) is an approach that enriches the semantic content of latent features for high-performance semantic inference.
POLC requires only a plug-and-play adapter for fine-tuning, significantly reducing the parameter count compared to previous MSE-oriented methods.
arXiv Detail & Related papers (2025-07-02T11:21:38Z) - HAFLQ: Heterogeneous Adaptive Federated LoRA Fine-tuned LLM with Quantization [55.972018549438964]
Federated fine-tuning of pre-trained Large Language Models (LLMs) enables task-specific adaptation across diverse datasets while preserving privacy.
We propose HAFLQ (Heterogeneous Adaptive Federated Low-Rank Adaptation Fine-tuned LLM with Quantization), a novel framework for efficient and scalable fine-tuning of LLMs in heterogeneous environments.
Experimental results on the text classification task demonstrate that HAFLQ reduces memory usage by 31%, lowers communication cost by 49%, improves accuracy by 50%, and achieves faster convergence compared to the baseline method.
arXiv Detail & Related papers (2024-11-10T19:59:54Z) - Transducer Consistency Regularization for Speech to Text Applications [4.510630624936377]
We present Transducer Consistency Regularization (TCR), a consistency regularization method for transducer models.
We utilize occupational probabilities to assign different weights to transducer output distributions, so that only alignments close to oracle alignments contribute to model learning.
Our experiments show that the proposed method is superior to other consistency regularization implementations and can effectively reduce word error rate (WER) by 4.3% relative to a strong baseline on the LibriSpeech dataset.
arXiv Detail & Related papers (2024-10-09T23:53:13Z) - Conditional Deformable Image Registration with Spatially-Variant and
Adaptive Regularization [2.3419031955865517]
We propose a learning-based registration approach based on a novel conditional spatially adaptive instance normalization (CSAIN).
Experiments show that our proposed method outperforms the baseline approaches while achieving spatially-variant and adaptive regularization.
arXiv Detail & Related papers (2023-03-19T16:12:06Z) - Retrieval-based Spatially Adaptive Normalization for Semantic Image
Synthesis [68.1281982092765]
We propose a novel normalization module, termed REtrieval-based Spatially AdaptIve normaLization (RESAIL).
RESAIL provides pixel level fine-grained guidance to the normalization architecture.
Experiments on several challenging datasets show that our RESAIL performs favorably against state-of-the-arts in terms of quantitative metrics, visual quality, and subjective evaluation.
arXiv Detail & Related papers (2022-04-06T14:21:39Z) - Adaptive and Cascaded Compressive Sensing [10.162966219929887]
Scene-dependent adaptive compressive sensing (CS) has been a long-pursued goal with huge potential to significantly improve the performance of CS.
We propose a restricted isometry property (RIP) condition-based error clamping that can directly predict the reconstruction error.
We also propose a cascaded feature fusion reconstruction network that could efficiently utilize the information derived from different adaptive sampling stages.
arXiv Detail & Related papers (2022-03-21T07:50:24Z) - HSVA: Hierarchical Semantic-Visual Adaptation for Zero-Shot Learning [74.76431541169342]
Zero-shot learning (ZSL) tackles the unseen class recognition problem, transferring semantic knowledge from seen classes to unseen ones.
We propose a novel hierarchical semantic-visual adaptation (HSVA) framework to align semantic and visual domains.
Experiments on four benchmark datasets demonstrate that HSVA achieves superior performance on both conventional and generalized ZSL.
arXiv Detail & Related papers (2021-09-30T14:27:50Z) - Deep Contrastive Graph Representation via Adaptive Homotopy Learning [76.22904270821778]
The homotopy model is an excellent tool exploited by diverse research works in the field of machine learning.
We propose a novel adaptive homotopy framework (AH) in which the Maclaurin duality is employed.
AH can be widely utilized to enhance homotopy-based algorithms.
arXiv Detail & Related papers (2021-06-17T04:46:04Z) - SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients [99.13839450032408]
It is desirable to design a universal framework for adaptive algorithms to solve general problems.
In particular, our novel framework provides convergence support for adaptive methods under the nonconvex setting.
arXiv Detail & Related papers (2021-06-15T15:16:28Z) - Bayesian Sparse learning with preconditioned stochastic gradient MCMC
and its applications [5.660384137948734]
We show that the proposed algorithm asymptotically converges to the correct distribution with a controllable bias under mild conditions.
arXiv Detail & Related papers (2020-06-29T20:57:20Z) - Rethinking Spatially-Adaptive Normalization [111.13203525538496]
Class-adaptive normalization (CLADE) is a lightweight variant that is not adaptive to spatial positions or layouts.
CLADE greatly reduces the computation cost while still being able to preserve the semantic information during the generation.
arXiv Detail & Related papers (2020-04-06T17:58:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.