COLLAGE: Adaptive Fusion-based Retrieval for Augmented Policy Learning
- URL: http://arxiv.org/abs/2508.01131v1
- Date: Sat, 02 Aug 2025 01:23:09 GMT
- Title: COLLAGE: Adaptive Fusion-based Retrieval for Augmented Policy Learning
- Authors: Sateesh Kumar, Shivin Dass, Georgios Pavlakos, Roberto Martín-Martín,
- Abstract summary: We present COLLAGE, a method for COLLective data AGgrEgation in few-shot imitation learning.<n>Collage uses an adaptive late fusion mechanism to guide the selection of relevant demonstrations based on a task-specific combination of multiple cues.<n>Collage outperforms state-of-the-art retrieval and multi-task learning approaches by 5.1% in simulation across 10 tasks, and by 16.6% in the real world across 6 tasks.
- Score: 19.173177969412656
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we study the problem of data retrieval for few-shot imitation learning: selecting data from a large dataset to train a performant policy for a specific task, given only a few target demonstrations. Prior methods retrieve data using a single-feature distance heuristic, assuming that the best demonstrations are those that most closely resemble the target examples in visual, semantic, or motion space. However, this approach captures only a subset of the relevant information and can introduce detrimental demonstrations, e.g., retrieving data from unrelated tasks due to similar scene layouts, or selecting similar motions from tasks with divergent goals. We present COLLAGE, a method for COLLective data AGgrEgation in few-shot imitation learning that uses an adaptive late fusion mechanism to guide the selection of relevant demonstrations based on a task-specific combination of multiple cues. COLLAGE follows a simple, flexible, and efficient recipe: it assigns weights to subsets of the dataset that are pre-selected using a single feature (e.g., appearance, shape, or language similarity), based on how well a policy trained on each subset predicts actions in the target demonstrations. These weights are then used to perform importance sampling during policy training, sampling data more densely or sparsely according to estimated relevance. COLLAGE is general and feature-agnostic, allowing it to combine any number of subsets selected by any retrieval heuristic, and to identify which subsets provide the greatest benefit for the target task. In extensive experiments, COLLAGE outperforms state-of-the-art retrieval and multi-task learning approaches by 5.1% in simulation across 10 tasks, and by 16.6% in the real world across 6 tasks, where we perform retrieval from the large-scale DROID dataset. More information at https://robin-lab.cs.utexas.edu/COLLAGE .
Related papers
- Large Language Models are Demonstration Pre-Selectors for Themselves [57.101804269100185]
In-context learning (ICL) with large language models (LLMs) delivers strong few-shot performance by choosing few-shot demonstrations from the entire training data.<n>FEw yet Essential Demonstration prE-selectoR is a novel pre-selection framework that identifies a representative subset of demonstrations.<n>FEw yet Essential Demonstration prE-selectoR can reduce training data size by over 20% while maintaining performance.
arXiv Detail & Related papers (2025-06-06T12:29:03Z) - Task-Specific Data Selection for Instruction Tuning via Monosemantic Neuronal Activations [19.25205110583291]
A critical bottleneck is selecting the most relevant data to maximize task-specific performance.<n>Existing data selection approaches include unstable influence-based methods and more stable distribution alignment methods.<n>We introduce a dedicated similarity metric for this space to better identify task-relevant data.
arXiv Detail & Related papers (2025-03-19T11:35:57Z) - COBRA: COmBinatorial Retrieval Augmentation for Few-Shot Adaptation [5.840239260337972]
We propose COBRA (COmBinatorial Retrieval Augmentation), which employs an alternative CMI measure that considers both diversity and similarity to a target dataset.<n> COBRA introduces negligible computational overhead to the cost of retrieval while providing significant gains in downstream model performance.
arXiv Detail & Related papers (2024-12-23T16:10:07Z) - Adapt-$\infty$: Scalable Continual Multimodal Instruction Tuning via Dynamic Data Selection [89.42023974249122]
Adapt-$infty$ is a new multi-way and adaptive data selection approach for lifelong instruction tuning.<n>We construct pseudo-skill clusters by grouping gradient-based sample vectors.<n>We select the best-performing data selector for each skill cluster from a pool of selector experts.<n>This data selector samples a subset of the most important samples from each skill cluster for training.
arXiv Detail & Related papers (2024-10-14T15:48:09Z) - Target-Aware Language Modeling via Granular Data Sampling [25.957424920194914]
Language model pretraining generally targets a broad range of use cases and incorporates data from diverse sources.
A cost-effective and straightforward approach is sampling with low-dimensional data features.
We show that pretrained models perform on par with the full RefinedWeb data and outperform randomly selected samples for model sizes ranging from 125M to 1.5B.
arXiv Detail & Related papers (2024-09-23T04:52:17Z) - LLM-Select: Feature Selection with Large Language Models [64.5099482021597]
Large language models (LLMs) are capable of selecting the most predictive features, with performance rivaling the standard tools of data science.<n>Our findings suggest that LLMs may be useful not only for selecting the best features for training but also for deciding which features to collect in the first place.
arXiv Detail & Related papers (2024-07-02T22:23:40Z) - LESS: Selecting Influential Data for Targeted Instruction Tuning [64.78894228923619]
We propose LESS, an efficient algorithm to estimate data influences and perform Low-rank gradiEnt Similarity Search for instruction data selection.
We show that training on a LESS-selected 5% of the data can often outperform training on the full dataset across diverse downstream tasks.
Our method goes beyond surface form cues to identify data that the necessary reasoning skills for the intended downstream application.
arXiv Detail & Related papers (2024-02-06T19:18:04Z) - One-Shot Learning as Instruction Data Prospector for Large Language Models [108.81681547472138]
textscNuggets uses one-shot learning to select high-quality instruction data from extensive datasets.
We show that instruction tuning with the top 1% of examples curated by textscNuggets substantially outperforms conventional methods employing the entire dataset.
arXiv Detail & Related papers (2023-12-16T03:33:12Z) - GistScore: Learning Better Representations for In-Context Example
Selection with Gist Bottlenecks [3.9638110494107095]
In-context Learning (ICL) is the ability of Large Language Models (LLMs) to perform new tasks when conditioned on prompts.
We propose Example Gisting, a novel approach for training example encoders through supervised fine-tuning.
We show that our fine-tuned models get state-of-the-art ICL performance with over 20% absolute gain over off-the-shelf retrievers.
arXiv Detail & Related papers (2023-11-16T06:28:05Z) - Unified Demonstration Retriever for In-Context Learning [56.06473069923567]
Unified Demonstration Retriever (textbfUDR) is a single model to retrieve demonstrations for a wide range of tasks.
We propose a multi-task list-wise ranking training framework, with an iterative mining strategy to find high-quality candidates.
Experiments on 30+ tasks across 13 task families and multiple data domains show that UDR significantly outperforms baselines.
arXiv Detail & Related papers (2023-05-07T16:07:11Z) - Towards General and Efficient Active Learning [20.888364610175987]
Active learning aims to select the most informative samples to exploit limited annotation budgets.
We propose a novel general and efficient active learning (GEAL) method in this paper.
Our method can conduct data selection processes on different datasets with a single-pass inference of the same model.
arXiv Detail & Related papers (2021-12-15T08:35:28Z) - Improving Multi-Turn Response Selection Models with Complementary
Last-Utterance Selection by Instance Weighting [84.9716460244444]
We consider utilizing the underlying correlation in the data resource itself to derive different kinds of supervision signals.
We conduct extensive experiments in two public datasets and obtain significant improvement in both datasets.
arXiv Detail & Related papers (2020-02-18T06:29:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.