ST-SACLF: Style Transfer Informed Self-Attention Classifier for Bias-Aware Painting Classification
- URL: http://arxiv.org/abs/2408.01827v1
- Date: Sat, 3 Aug 2024 17:31:58 GMT
- Title: ST-SACLF: Style Transfer Informed Self-Attention Classifier for Bias-Aware Painting Classification
- Authors: Mridula Vijendran, Frederick W. B. Li, Jingjing Deng, Hubert P. H. Shum
- Abstract summary: Painting classification plays a vital role in organizing, finding, and suggesting artwork for digital and classic art galleries.
Existing methods struggle with adapting knowledge from the real world to artistic images during training, leading to poor performance when dealing with different datasets.
We generate more data using Style Transfer with Adaptive Instance Normalization (AdaIN), bridging the gap between diverse styles.
We achieve an impressive 87.24% accuracy using the ResNet-50 backbone over 40 training epochs.
- Score: 9.534646914709018
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Painting classification plays a vital role in organizing, finding, and suggesting artwork for digital and classic art galleries. Existing methods struggle with adapting knowledge from the real world to artistic images during training, leading to poor performance when dealing with different datasets. Our innovation lies in addressing these challenges through a two-step process. First, we generate more data using Style Transfer with Adaptive Instance Normalization (AdaIN), bridging the gap between diverse styles. Then, our classifier gains a boost with feature-map adaptive spatial attention modules, improving its understanding of artistic details. Moreover, we tackle the problem of imbalanced class representation by dynamically adjusting augmented samples. Through a dual-stage process involving careful hyperparameter search and model fine-tuning, we achieve an impressive 87.24% accuracy using the ResNet-50 backbone over 40 training epochs. Our study explores quantitative analyses that compare different pretrained backbones, investigates model optimization through ablation studies, and examines how varying augmentation levels affect model performance. Complementing this, our qualitative experiments offer valuable insights into the model's decision-making process using spatial attention and its ability to differentiate between easy and challenging samples based on confidence ranking.
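For readers unfamiliar with the augmentation stage, the core AdaIN operation simply re-normalizes the per-channel statistics of a content feature map to match those of a style feature map. The following is a minimal PyTorch sketch of that operation under its standard definition; the function name, tensor shapes, and any surrounding encoder/decoder wiring are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of Adaptive Instance Normalization (AdaIN).
# Assumed: feature maps come from a pretrained encoder (e.g. VGG) with shape (N, C, H, W).
import torch


def adain(content_feat: torch.Tensor, style_feat: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Align the per-channel mean/std of content features to those of style features."""
    # Per-channel statistics computed over the spatial dimensions (H, W).
    c_mean = content_feat.mean(dim=(2, 3), keepdim=True)
    c_std = content_feat.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style_feat.mean(dim=(2, 3), keepdim=True)
    s_std = style_feat.std(dim=(2, 3), keepdim=True) + eps

    # Normalize away the content statistics, then re-impose the style statistics.
    return (content_feat - c_mean) / c_std * s_std + s_mean


# Toy usage: stylize one batch of content features with the statistics of a style batch.
content = torch.randn(4, 512, 32, 32)   # e.g. encoder features of source images
style = torch.randn(4, 512, 32, 32)     # e.g. encoder features of target-style paintings
stylized = adain(content, style)         # same shape, style-aligned statistics
```

The classification stage adds spatial attention over backbone feature maps. A common form of such a module is shown below (CBAM-style spatial attention, offered as an assumed stand-in for the paper's feature-map adaptive attention): it pools across channels and learns a per-pixel reweighting of the feature map.

```python
# Rough sketch of a spatial attention module applied to backbone feature maps.
# The kernel size and placement within ResNet-50 are illustrative choices, not
# the authors' configuration.
import torch
import torch.nn as nn


class SpatialAttention(nn.Module):
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        # Two pooled channel descriptors -> one attention value per spatial location.
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg_pool = x.mean(dim=1, keepdim=True)   # (N, 1, H, W)
        max_pool = x.amax(dim=1, keepdim=True)   # (N, 1, H, W)
        attn = torch.sigmoid(self.conv(torch.cat([avg_pool, max_pool], dim=1)))
        return x * attn                           # emphasize informative regions
```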
Related papers
- Preview-based Category Contrastive Learning for Knowledge Distillation [53.551002781828146]
We propose a novel preview-based category contrastive learning method for knowledge distillation (PCKD).
It first distills the structural knowledge of both instance-level feature correspondence and the relation between instance features and category centers.
It can explicitly optimize the category representation and explore the distinct correlation between representations of instances and categories.
arXiv Detail & Related papers (2024-10-18T03:31:00Z)
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z)
- Contrastive-Adversarial and Diffusion: Exploring pre-training and fine-tuning strategies for sulcal identification [3.0398616939692777]
Techniques like adversarial learning, contrastive learning, diffusion denoising learning, and ordinary reconstruction learning have become standard.
The study aims to elucidate the advantages of pre-training techniques and fine-tuning strategies to enhance the learning process of neural networks.
arXiv Detail & Related papers (2024-05-29T15:44:51Z)
- Multi-Modal Prompt Learning on Blind Image Quality Assessment [65.0676908930946]
Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly.
Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness.
Recent approaches have attempted to address this mismatch using prompt technology, but these solutions have shortcomings.
This paper introduces an innovative multi-modal prompt-based methodology for IQA.
arXiv Detail & Related papers (2024-04-23T11:45:32Z)
- Harnessing Diffusion Models for Visual Perception with Meta Prompts [68.78938846041767]
We propose a simple yet effective scheme to harness a diffusion model for visual perception tasks.
We introduce learnable embeddings (meta prompts) to the pre-trained diffusion models to extract proper features for perception.
Our approach achieves new performance records in depth estimation tasks on NYU depth V2 and KITTI, and in semantic segmentation task on CityScapes.
arXiv Detail & Related papers (2023-12-22T14:40:55Z)
- Bilevel Fast Scene Adaptation for Low-Light Image Enhancement [50.639332885989255]
Enhancing images in low-light scenes is a challenging but widely studied task in computer vision.
The main obstacle lies in modeling the distribution discrepancy across different scenes.
We introduce the bilevel paradigm to model the above latent correspondence.
A bilevel learning framework is constructed to endow the encoder with scene-irrelevant generality across diverse scenes.
arXiv Detail & Related papers (2023-06-02T08:16:21Z)
- Tackling Data Bias in Painting Classification with Style Transfer [12.88476464580968]
We propose a system to handle data bias in small painting datasets such as the Kaokore dataset.
Our system consists of two stages: style transfer and classification.
arXiv Detail & Related papers (2023-01-06T14:33:53Z)
- Task Formulation Matters When Learning Continually: A Case Study in Visual Question Answering [58.82325933356066]
Continual learning aims to train a model incrementally on a sequence of tasks without forgetting previous knowledge.
We present a detailed study of how different settings affect performance for Visual Question Answering.
arXiv Detail & Related papers (2022-09-30T19:12:58Z)
- Playing to distraction: towards a robust training of CNN classifiers through visual explanation techniques [1.2321022105220707]
We present a novel and robust training scheme that integrates visual explanation techniques in the learning process.
In particular, we work on the challenging EgoFoodPlaces dataset, achieving state-of-the-art results with a lower level of complexity.
arXiv Detail & Related papers (2020-12-28T10:24:32Z)
- Two-Level Adversarial Visual-Semantic Coupling for Generalized Zero-shot Learning [21.89909688056478]
We propose a new two-level joint idea to augment the generative network with an inference network during training.
This provides strong cross-modal interaction for effective transfer of knowledge between visual and semantic domains.
We evaluate our approach on four benchmark datasets against several state-of-the-art methods and report its performance.
arXiv Detail & Related papers (2020-07-15T15:34:09Z)