Eliminating Gradient Conflict in Reference-based Line-art Colorization
- URL: http://arxiv.org/abs/2207.06095v1
- Date: Wed, 13 Jul 2022 10:08:37 GMT
- Title: Eliminating Gradient Conflict in Reference-based Line-art Colorization
- Authors: Zekun Li, Zhengyang Geng, Zhao Kang, Wenyu Chen, Yibo Yang
- Abstract summary: Reference-based line-art colorization is a challenging task in computer vision.
We propose a novel attention mechanism, Stop-Gradient Attention (SGA).
Compared with state-of-the-art modules in line-art colorization, our approach demonstrates significant improvements.
- Score: 26.46476996150605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reference-based line-art colorization is a challenging task in computer
vision. The color, texture, and shading are rendered based on an abstract
sketch, which heavily relies on the precise long-range dependency modeling
between the sketch and reference. Popular techniques to bridge the cross-modal
information and model the long-range dependency employ the attention mechanism.
However, in the context of reference-based line-art colorization, several
techniques would intensify the existing training difficulty of attention, for
instance, self-supervised training protocol and GAN-based losses. To understand
the instability in training, we detect the gradient flow of attention and
observe gradient conflict among attention branches. This phenomenon motivates
us to alleviate the gradient issue by preserving the dominant gradient branch
while removing the conflict ones. We propose a novel attention mechanism using
this training strategy, Stop-Gradient Attention (SGA), outperforming the
attention baseline by a large margin with better training stability. Compared
with state-of-the-art modules in line-art colorization, our approach
demonstrates significant improvements in Fréchet Inception Distance (FID, up
to 27.21%) and structural similarity index measure (SSIM, up to 25.67%) on
several benchmarks. The code of SGA is available at
https://github.com/kunkun0w0/SGA .
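For intuition, below is a minimal, hypothetical PyTorch sketch of the stop-gradient idea applied to a cross-attention block: one branch of the attention computation is detached so that gradients no longer flow through it, leaving only the branch assumed to carry the dominant gradient. The module name, the choice of which branch to detach, and the tensor shapes are illustrative assumptions, not the authors' implementation; the actual SGA code is in the repository above.

```python
# Minimal sketch of a stop-gradient cross-attention block (illustrative only;
# not the authors' implementation -- see https://github.com/kunkun0w0/SGA).
import torch
import torch.nn as nn


class StopGradientCrossAttention(nn.Module):
    """Hypothetical example: line-art (sketch) tokens attend to reference tokens.
    The query branch of the attention map is detached, so gradients from the
    attention weights flow back only through the reference (key) branch."""

    def __init__(self, dim: int):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, sketch_feat: torch.Tensor, ref_feat: torch.Tensor) -> torch.Tensor:
        # sketch_feat: (B, N, C) line-art tokens; ref_feat: (B, M, C) reference tokens
        q = self.to_q(sketch_feat)
        k = self.to_k(ref_feat)
        v = self.to_v(ref_feat)

        # Stop-gradient: the attention map still uses q numerically, but .detach()
        # removes the gradient path through q, i.e. one of the two branches that
        # would otherwise receive (possibly conflicting) gradients from the map.
        attn = torch.softmax((q.detach() @ k.transpose(-2, -1)) * self.scale, dim=-1)
        return attn @ v


if __name__ == "__main__":
    block = StopGradientCrossAttention(dim=64)
    sketch = torch.randn(2, 256, 64, requires_grad=True)
    ref = torch.randn(2, 256, 64, requires_grad=True)
    out = block(sketch, ref)
    out.sum().backward()
    # ref.grad is populated (via k and v); sketch.grad stays None in this toy
    # example because its only path runs through the detached attention map.
    print(ref.grad is not None, sketch.grad is None)
```

In a full colorization network the sketch features would also feed other paths (e.g. skip connections), so they would still receive gradients elsewhere; the sketch above only illustrates how a stop-gradient prunes one branch of the attention gradient.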
Related papers
- Rethinking Dimensional Rationale in Graph Contrastive Learning from Causal Perspective [15.162584339143239]
Graph contrastive learning is a general learning paradigm excelling at capturing invariant information from diverse perturbations in graphs.
Recent works focus on exploring the structural rationale from graphs, thereby increasing the discriminability of the invariant information.
We propose the dimensional rationale-aware graph contrastive learning approach, which introduces a learnable dimensional rationale acquiring network and a redundancy reduction constraint.
arXiv Detail & Related papers (2023-12-16T10:05:18Z)
- 2D Feature Distillation for Weakly- and Semi-Supervised 3D Semantic Segmentation [92.17700318483745]
We propose an image-guidance network (IGNet) which builds upon the idea of distilling high level feature information from a domain adapted synthetically trained 2D semantic segmentation network.
IGNet achieves state-of-the-art results for weakly-supervised LiDAR semantic segmentation on ScribbleKITTI, boasting up to 98% relative performance to fully supervised training with only 8% labeled points.
arXiv Detail & Related papers (2023-11-27T07:57:29Z)
- Attention-Aware Anime Line Drawing Colorization [10.924683447616273]
We introduce an attention-based model for anime line drawing colorization, in which a channel-wise and spatial-wise Convolutional Attention module is used.
Our method outperforms other SOTA methods, with more accurate line structure and semantic color information.
arXiv Detail & Related papers (2022-12-21T12:50:31Z)
- Scaling Multimodal Pre-Training via Cross-Modality Gradient Harmonization [68.49738668084693]
Self-supervised pre-training recently demonstrates success on large-scale multimodal data.
Cross-modality alignment (CMA) provides only weak and noisy supervision.
CMA might cause conflicts and biases among modalities.
arXiv Detail & Related papers (2022-11-03T18:12:32Z)
- Adversarial Cross-View Disentangled Graph Contrastive Learning [30.97720522293301]
We introduce ACDGCL, which follows the information bottleneck principle to learn minimal yet sufficient representations from graph data.
We empirically demonstrate that our proposed model outperforms the state of the art on the graph classification task over multiple benchmark datasets.
arXiv Detail & Related papers (2022-09-16T03:48:39Z)
- Are Gradients on Graph Structure Reliable in Gray-box Attacks? [56.346504691615934]
Previous gray-box attackers employ gradients from the surrogate model to locate the vulnerable edges to perturb the graph structure.
In this paper, we discuss and analyze the errors caused by the unreliability of the structural gradients.
We propose a novel attack model with methods to reduce the errors inside the structural gradients.
arXiv Detail & Related papers (2022-08-07T06:43:32Z)
- GraphCoCo: Graph Complementary Contrastive Learning [65.89743197355722]
Graph Contrastive Learning (GCL) has shown promising performance in graph representation learning (GRL) without the supervision of manual annotations.
This paper proposes an effective graph complementary contrastive learning approach named GraphCoCo to tackle the above issue.
arXiv Detail & Related papers (2022-03-24T02:58:36Z)
- Attention-based Stylisation for Exemplar Image Colourisation [3.491870689686827]
This work reformulates the existing methodology, introducing a novel end-to-end colourisation network.
The proposed architecture integrates attention modules at different resolutions that learn how to perform the style transfer task.
Experimental validation demonstrates the efficiency of the proposed methodology, which generates high-quality and visually appealing colourisations.
arXiv Detail & Related papers (2021-05-04T18:56:26Z)
- Diversified Multiscale Graph Learning with Graph Self-Correction [55.43696999424127]
We propose a diversified multiscale graph learning model equipped with two core ingredients: a graph self-correction (GSC) mechanism that generates informative embedded graphs, and a diversity-boosting regularizer (DBR) that achieves a comprehensive characterization of the input graph.
Experiments on popular graph classification benchmarks show that the proposed GSC mechanism leads to significant improvements over state-of-the-art graph pooling methods.
arXiv Detail & Related papers (2021-03-17T16:22:24Z)
- SFANet: A Spectrum-aware Feature Augmentation Network for Visible-Infrared Person Re-Identification [12.566284647658053]
We propose a novel spectrum-aware feature augmentation network named SFANet for the cross-modality matching problem.
By learning with grayscale-spectrum images, our model markedly reduces the modality discrepancy and detects inner structural relations.
At the feature level, we improve the conventional two-stream network by balancing the number of modality-specific and shareable convolutional blocks.
arXiv Detail & Related papers (2021-02-24T08:57:32Z)
- Tensor Graph Convolutional Networks for Multi-relational and Robust Learning [74.05478502080658]
This paper introduces a tensor-graph convolutional network (TGCN) for scalable semi-supervised learning (SSL) from data associated with a collection of graphs, which are represented by a tensor.
The proposed architecture achieves markedly improved performance relative to standard GCNs, copes with state-of-the-art adversarial attacks, and leads to remarkable SSL performance over protein-to-protein interaction networks.
arXiv Detail & Related papers (2020-03-15T02:33:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.