Related papers: Instance-Level Generation for Representation Learning

Instance-Level Generation for Representation Learning

URL: http://arxiv.org/abs/2510.09171v1
Date: Fri, 10 Oct 2025 09:14:33 GMT
Title: Instance-Level Generation for Representation Learning
Authors: Yankun Wu, Zakaria Laskar, Giorgos Kordopatis-Zilos, Noa Garcia, Giorgos Tolias,
Abstract summary: Instance-level recognition (ILR) focuses on identifying individual objects rather than broad categories.<n>We introduce a novel approach that synthetically generates diverse object instances from multiple domains.<n>Our method is the first to address ILR-specific challenges without relying on any real images.
Score: 20.97048848139392
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Instance-level recognition (ILR) focuses on identifying individual objects rather than broad categories, offering the highest granularity in image classification. However, this fine-grained nature makes creating large-scale annotated datasets challenging, limiting ILR's real-world applicability across domains. To overcome this, we introduce a novel approach that synthetically generates diverse object instances from multiple domains under varied conditions and backgrounds, forming a large-scale training set. Unlike prior work on automatic data synthesis, our method is the first to address ILR-specific challenges without relying on any real images. Fine-tuning foundation vision models on the generated data significantly improves retrieval performance across seven ILR benchmarks spanning multiple domains. Our approach offers a new, efficient, and effective alternative to extensive data collection and curation, introducing a new ILR paradigm where the only input is the names of the target domains, unlocking a wide range of real-world applications.

Related papers

FeDecider: An LLM-Based Framework for Federated Cross-Domain Recommendation [75.50721642765994]
Large language model (LLM)-based recommendation models have demonstrated impressive performance.<n>We propose an LLM-based framework for Federated cross-domain recommendation, FeDecider.<n>Extensive experiments across diverse datasets validate the effectiveness of our proposed FeDecider.
arXiv Detail & Related papers (2026-02-17T21:42:28Z)
Cross-Domain Few-Shot Segmentation via Multi-view Progressive Adaptation [84.97054460338109]
Cross-Domain Few-Shot aims to segment in data-scarce domains conditioned on a few exemplars.<n>We propose Multi-view Progressive Adaptation, which progressively adapts few-shot capability to target domains from both data and strategy perspectives.<n> MPA effectively adapts few-shot capability to target domains, outperforming state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2026-02-05T02:16:44Z)
Exploring User Retrieval Integration towards Large Language Models for Cross-Domain Sequential Recommendation [66.72195610471624]
Cross-Domain Sequential Recommendation aims to mine and transfer users' sequential preferences across different domains. We propose a novel framework named URLLM, which aims to improve the CDSR performance by exploring the User Retrieval approach.
arXiv Detail & Related papers (2024-06-05T09:19:54Z)
UniGen: Universal Domain Generalization for Sentiment Classification via Zero-shot Dataset Generation [6.3823202275924125]
We propose a novel approach to universal domain generalization that generates a dataset regardless of the target domain. Our experiments indicate that the proposed method accomplishes generalizability across various domains while using a parameter set that is orders of magnitude smaller than PLMs.
arXiv Detail & Related papers (2024-05-02T05:46:13Z)
Few-shot Object Localization [37.347898735345574]
This paper defines a novel task named Few-Shot Object localization (FSOL) It aims to achieve precise localization with limited samples. This task achieves generalized object localization by leveraging a small number of labeled support samples to query the positional information of objects within corresponding images. Experimental results demonstrate a significant performance improvement of our approach in the FSOL task, establishing an efficient benchmark for further research.
arXiv Detail & Related papers (2024-03-19T05:50:48Z)
Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization [61.64304227831361]
Single-domain generalization aims to learn a model from single source domain data to achieve generalized performance on other unseen target domains. We propose a dynamic object-centric perception network based on prompt learning, aiming to adapt to the variations in image complexity.
arXiv Detail & Related papers (2024-02-28T16:16:51Z)
Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets. This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets. We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)
Multi-Domain Adversarial Feature Generalization for Person Re-Identification [52.835955258959785]
We propose a multi-dataset feature generalization network (MMFA-AAE) It is capable of learning a universal domain-invariant feature representation from multiple labeled datasets and generalizing it to unseen' camera systems. It also surpasses many state-of-the-art supervised methods and unsupervised domain adaptation methods by a large margin.
arXiv Detail & Related papers (2020-11-25T08:03:15Z)
A Universal Representation Transformer Layer for Few-Shot Image Classification [43.31379752656756]
Few-shot classification aims to recognize unseen classes when presented with only a small number of samples. We consider the problem of multi-domain few-shot image classification, where unseen classes and examples come from diverse data sources. Here, we propose a Universal Representation Transformer layer, that meta-learns to leverage universal features for few-shot classification.
arXiv Detail & Related papers (2020-06-21T03:08:00Z)
Universal-RCNN: Universal Object Detector via Transferable Graph R-CNN [117.80737222754306]
We present a novel universal object detector called Universal-RCNN. We first generate a global semantic pool by integrating all high-level semantic representation of all the categories. An Intra-Domain Reasoning Module learns and propagates the sparse graph representation within one dataset guided by a spatial-aware GCN.
arXiv Detail & Related papers (2020-02-18T07:57:45Z)
Unsupervised Domain Adaptation in Person re-ID via k-Reciprocal Clustering and Large-Scale Heterogeneous Environment Synthesis [76.46004354572956]
We introduce an unsupervised domain adaptation approach for person re-identification. Experimental results show that the proposed ktCUDA and SHRED approach achieves an average improvement of +5.7 mAP in re-identification performance.
arXiv Detail & Related papers (2020-01-14T17:43:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.