Related papers: ISAC: Training-Free Instance-to-Semantic Attention Control for Improving Multi-Instance Generation

URL: http://arxiv.org/abs/2505.20935v1
Date: Tue, 27 May 2025 09:23:10 GMT
Title: ISAC: Training-Free Instance-to-Semantic Attention Control for Improving Multi-Instance Generation
Authors: Sanghyun Jo, Wooyeol Lee, Ziseok Lee, Kyungsu Kim
Abstract summary: Instance-to-Semantic Attention Control (ISAC) explicitly resolves incomplete instance formation and semantic entanglement. ISAC achieves up to 52% average multi-class accuracy and 83% average multi-instance accuracy.
Score: 1.3624495460189863
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Text-to-image diffusion models excel at generating single-instance scenes but struggle with multi-instance scenarios, often merging or omitting objects. Unlike previous training-free approaches that rely solely on semantic-level guidance without addressing instance individuation, our training-free method, Instance-to-Semantic Attention Control (ISAC), explicitly resolves incomplete instance formation and semantic entanglement through an instance-first modeling approach. This enables ISAC to effectively leverage a hierarchical, tree-structured prompt mechanism, disentangling multiple object instances and individually aligning them with their corresponding semantic labels. Without employing any external models, ISAC achieves up to 52% average multi-class accuracy and 83% average multi-instance accuracy by effectively forming disentangled instances. The code will be made available upon publication.

Related papers

CountZES: Counting via Zero-Shot Exemplar Selection [22.69910219820086]
We propose CountZES, a training-free framework for object counting via zero-shot exemplar selection. CountZES discovers diverse exemplars through three synergistic stages: Detection-Anchored Exemplar (DAE), Density-Guided Exemplar (DGE), and Feature-Consensus Exemplar (FCE).
arXiv Detail & Related papers (2025-12-18T11:12:50Z)
Segment Any Class (SAC): Multi-Class Few-Shot Semantic Segmentation via Class Region Proposals [0.0]
We present Segment Any Class (SAC), a training-free approach that task-adapts SAM for multi-class segmentation. SAC generates Class-Region Proposals (CRP) on query images, which allows us to automatically generate class-aware prompts. SAC solely utilizes automated prompting and achieves superior results over state-of-the-art methods on the COCO-20i benchmark.
arXiv Detail & Related papers (2024-11-21T01:04:53Z)
Instance Consistency Regularization for Semi-Supervised 3D Instance Segmentation [50.51125319374404]
We propose a novel self-training network, InsTeacher3D, to explore and exploit pure instance knowledge from unlabeled data. Experimental results on multiple large-scale datasets show that InsTeacher3D significantly outperforms prior state-of-the-art semi-supervised approaches.
arXiv Detail & Related papers (2024-06-24T16:35:58Z)
Weakly Supervised 3D Instance Segmentation without Instance-level Annotations [57.615325809883636]
3D semantic scene understanding tasks have achieved great success with the emergence of deep learning, but often require a huge amount of manually annotated training data. We propose the first weakly-supervised 3D instance segmentation method that only requires categorical semantic labels as supervision. By generating pseudo instance labels from categorical semantic labels, our designed approach can also assist existing methods for learning 3D instance segmentation at reduced annotation cost.
arXiv Detail & Related papers (2023-08-03T12:30:52Z)
Rethinking Multiple Instance Learning for Whole Slide Image Classification: A Good Instance Classifier is All You Need [18.832471712088353]
We propose an instance-level weakly supervised contrastive learning algorithm for the first time under the MIL setting. We also propose an accurate pseudo label generation method through prototype learning.
arXiv Detail & Related papers (2023-07-05T12:44:52Z)
SIM: Semantic-aware Instance Mask Generation for Box-Supervised Instance Segmentation [22.930296667684125]
We propose a new box-supervised instance segmentation approach by developing a Semantic-aware Instance Mask (SIM) generation paradigm. Considering that the semantic-aware prototypes cannot distinguish different instances of the same semantics, we propose a self-correction mechanism. Extensive experimental results demonstrate the superiority of our proposed SIM approach over other state-of-the-art methods.
arXiv Detail & Related papers (2023-03-14T05:59:25Z)
Instance-specific and Model-adaptive Supervision for Semi-supervised Semantic Segmentation [49.82432158155329]
We propose an instance-specific and model-adaptive supervision for semi-supervised semantic segmentation, named iMAS. iMAS learns from unlabeled instances progressively by weighing their corresponding consistency losses based on the evaluated hardness.
arXiv Detail & Related papers (2022-11-21T10:37:28Z)
Object-Aware Self-supervised Multi-Label Learning [9.496981642855769]
We propose an Object-Aware Self-Supervision (OASS) method to obtain more fine-grained representations for multi-label learning. The proposed method can be leveraged to efficiently generate Class-Specific Instances (CSI) in a proposal-free fashion. Experiments on the VOC2012 dataset for multi-label classification demonstrate the effectiveness of the proposed method against the state-of-the-art counterparts.
arXiv Detail & Related papers (2022-05-14T10:14:08Z)
Learning to Imagine: Diversify Memory for Incremental Learning using Unlabeled Data [69.30452751012568]
We develop a learnable feature generator to diversify exemplars by adaptively generating diverse counterparts of exemplars. We introduce semantic contrastive learning to enforce the generated samples to be semantically consistent with exemplars. Our method does not bring any extra inference cost and outperforms state-of-the-art methods on two benchmarks.
arXiv Detail & Related papers (2022-04-19T15:15:18Z)
Learning to Detect Instance-level Salient Objects Using Complementary Image Labels [55.049347205603304]
We present the first weakly-supervised approach to the salient instance detection problem. We propose a novel weakly-supervised network with three branches: a Saliency Detection Branch leveraging class consistency information to locate candidate objects; a Boundary Detection Branch exploiting class discrepancy information to delineate object boundaries; and a Centroid Detection Branch using subitizing information to detect salient instance centroids.
arXiv Detail & Related papers (2021-11-19T10:15:22Z)
Approximating Instance-Dependent Noise via Instance-Confidence Embedding [87.65718705642819]
Label noise in multiclass classification is a major obstacle to the deployment of learning systems. We investigate the instance-dependent noise (IDN) model and propose an efficient approximation of IDN to capture the instance-specific label corruption.
arXiv Detail & Related papers (2021-03-25T02:33:30Z)
Train a One-Million-Way Instance Classifier for Unsupervised Visual Representation Learning [45.510042484456854]
This paper presents a simple unsupervised visual representation learning method with a pretext task of discriminating all images in a dataset using a parametric, instance-level computation. The overall framework is a replica of a supervised classification model, where semantic classes (e.g., dog, bird, and ship) are replaced by instance IDs. Scaling up the classification task from thousands of semantic labels to millions of instance labels brings specific challenges, including 1) the large-scale softmax classifier; 2) the slow convergence due to the infrequent visiting of instance samples; and 3) the massive number of negative classes that can be noisy.
arXiv Detail & Related papers (2021-02-09T14:44:18Z)
How to trust unlabeled data? Instance Credibility Inference for Few-Shot Learning [47.21354101796544]
This paper presents a statistical approach, dubbed Instance Credibility Inference (ICI), to exploit the support of unlabeled instances for few-shot visual recognition. We rank the credibility of pseudo-labeled instances along the regularization path of their corresponding incidental parameters, and the most trustworthy pseudo-labeled examples are preserved as the augmented labeled instances.
arXiv Detail & Related papers (2020-07-15T03:38:09Z)

This list is automatically generated from the titles and abstracts of the papers on this site.