Related papers: Using Multimodal Foundation Models and Clustering for Improved Style Ambiguity Loss

Related papers

Integrating Domain Knowledge into Large Language Models for Enhanced Fashion Recommendations [5.251304651964696]
We introduce the Fashion Large Language Model (FLLM), which employs auto-prompt generation training strategies to enhance its capacity for delivering personalized fashion advice. Our results show that this approach surpasses existing models in accuracy, interpretability, and few-shot learning capabilities.
arXiv Detail & Related papers (2025-01-03T21:49:44Z)
Leveraging Hierarchical Taxonomies in Prompt-based Continual Learning [41.13568563835089]
We find that applying human habits of organizing and connecting information can serve as an efficient strategy when training deep learning models. We propose a novel regularization loss function that encourages models to focus more on challenging knowledge areas.
arXiv Detail & Related papers (2024-10-06T01:30:40Z)
Using Style Ambiguity Loss to Improve Aesthetics of Diffusion Models [0.0]
Teaching text-to-image models to be creative involves using style ambiguity loss. In this work, we explore using the style ambiguity training objective, used to approximate creativity, on a diffusion model. We find that the models trained with style ambiguity loss can generate better images than the baseline diffusion models and GANs.
arXiv Detail & Related papers (2024-10-02T22:05:30Z)
Automatic Generation of Fashion Images using Prompting in Generative Machine Learning Models [1.8817715864806608]
This work investigates methodologies for generating tailored fashion descriptions using two distinct Large Language Models and a Stable Diffusion model for fashion image creation. Emphasizing adaptability in AI-driven fashion creativity, we focus on prompting techniques, such as zero-shot and few-shot learning. Evaluation combines quantitative metrics such as CLIPscore with qualitative human judgment, highlighting strengths in creativity, coherence, and aesthetic appeal across diverse styles.
arXiv Detail & Related papers (2024-07-20T17:37:51Z)
An Improved Method for Personalizing Diffusion Models [23.20529652769131]
Diffusion models have demonstrated impressive image generation capabilities. Personalized approaches, such as textual inversion and Dreambooth, enhance model individualization using specific images. Our proposed approach aims to retain the model's original knowledge during new information integration.
arXiv Detail & Related papers (2024-07-07T09:52:04Z)
Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images. We identify model weaknesses by testing the model using the counterfactual image dataset. We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z)
Few-shot Calligraphy Style Learning [0.0]
"Presidifussion" is a novel approach to learning and replicating the unique style of calligraphy of President Xu. We introduce innovative techniques of font image conditioning and stroke information conditioning, enabling the model to capture the intricate structural elements of Chinese characters. This work not only presents a breakthrough in the digital preservation of calligraphic art but also sets a new standard for data-efficient generative modeling in the domain of cultural heritage digitization.
arXiv Detail & Related papers (2024-04-26T07:17:09Z)
HiCAST: Highly Customized Arbitrary Style Transfer with Adapter Enhanced Diffusion Models [84.12784265734238]
The goal of Arbitrary Style Transfer (AST) is injecting the artistic features of a style reference into a given image/video. We propose HiCAST, which is capable of explicitly customizing the stylization results according to various source of semantic clues. A novel learning objective is leveraged for video diffusion model training, which significantly improve cross-frame temporal consistency.
arXiv Detail & Related papers (2024-01-11T12:26:23Z)
Phasic Content Fusing Diffusion Model with Directional Distribution Consistency for Few-Shot Model Adaption [73.98706049140098]
We propose a novel phasic content fusing few-shot diffusion model with directional distribution consistency loss. Specifically, we design a phasic training strategy with phasic content fusion to help our model learn content and style information when t is large. Finally, we propose a cross-domain structure guidance strategy that enhances structure consistency during domain adaptation.
arXiv Detail & Related papers (2023-09-07T14:14:11Z)
Training Diffusion Models with Reinforcement Learning [82.29328477109826]
Diffusion models are trained with an approximation to the log-likelihood objective. In this paper, we investigate reinforcement learning methods for directly optimizing diffusion models for downstream objectives. We describe how posing denoising as a multi-step decision-making problem enables a class of policy gradient algorithms.
arXiv Detail & Related papers (2023-05-22T17:57:41Z)
Creative divergent synthesis with generative models [3.655021726150369]
Machine learning approaches now achieve impressive generation capabilities in numerous domains such as image, audio or video. We propose various perspectives on how this complicated goal could ever be achieved, and provide preliminary results on our novel training objective called textitBounded Adversarial Divergence (BAD)
arXiv Detail & Related papers (2022-11-16T12:12:31Z)
DST: Dynamic Substitute Training for Data-free Black-box Attack [79.61601742693713]
We propose a novel dynamic substitute training attack method to encourage substitute model to learn better and faster from the target model. We introduce a task-driven graph-based structure information learning constrain to improve the quality of generated training data.
arXiv Detail & Related papers (2022-04-03T02:29:11Z)
Stylized Adversarial Defense [105.88250594033053]
adversarial training creates perturbation patterns and includes them in the training set to robustify the model. We propose to exploit additional information from the feature space to craft stronger adversaries. Our adversarial training approach demonstrates strong robustness compared to state-of-the-art defenses.
arXiv Detail & Related papers (2020-07-29T08:38:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.