Online Foundation Model Selection in Robotics
- URL: http://arxiv.org/abs/2402.08570v1
- Date: Tue, 13 Feb 2024 16:14:32 GMT
- Title: Online Foundation Model Selection in Robotics
- Authors: Po-han Li, Oyku Selin Toprak, Aditya Narayanan, Ufuk Topcu, Sandeep
Chinchali
- Abstract summary: Foundation models have recently expanded into robotics after excelling in computer vision and natural language processing.
Users with access to both face a problem when deciding between effective yet costly closed-source models and free but less powerful open-source alternatives.
We propose a novel solution that combines an open-source encoder to output context and an online learning algorithm that processes this context.
- Score: 18.65707136264266
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Foundation models have recently expanded into robotics after excelling in
computer vision and natural language processing. The models are accessible in
two ways: open-source or paid, closed-source options. Users with access to both
face a problem when deciding between effective yet costly closed-source models
and free but less powerful open-source alternatives. We call it the model
selection problem. Existing supervised-learning methods are impractical due to
the high cost of collecting extensive training data from closed-source models.
Hence, we focus on the online learning setting where algorithms learn while
collecting data, eliminating the need for large pre-collected datasets. We thus
formulate a user-centric online model selection problem and propose a novel
solution that combines an open-source encoder to output context and an online
learning algorithm that processes this context. The encoder distills vast data
distributions into low-dimensional features, i.e., the context, without
additional training. The online learning algorithm aims to maximize a composite
reward that includes model performance, execution time, and costs based on the
context extracted from the data. It results in an improved trade-off between
selecting open-source and closed-source models compared to non-contextual
methods, as validated by our theoretical analysis. Experiments across
language-based robotic benchmarks such as the Waymo Open Dataset, ALFRED, and Open
X-Embodiment demonstrate real-world applications of the solution. The results
show that the solution significantly improves the task success rate by up to
14%.
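The abstract describes a contextual online learning algorithm that uses encoder features as context and maximizes a composite reward of performance, execution time, and cost. As an illustration only, the idea can be sketched as a LinUCB-style contextual bandit over two arms (open-source vs. closed-source model); the arm names, context dimension, reward weights, and simulated rewards below are assumptions, not the paper's exact algorithm.

```python
import numpy as np

class ModelSelector:
    """LinUCB-style contextual bandit over candidate foundation models."""

    def __init__(self, n_arms=2, dim=8, alpha=1.0):
        self.alpha = alpha
        # One ridge-regression estimate per arm (model choice).
        self.A = [np.eye(dim) for _ in range(n_arms)]
        self.b = [np.zeros(dim) for _ in range(n_arms)]

    def select(self, context):
        """Pick the arm with the highest upper confidence bound."""
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b
            ucb = theta @ context + self.alpha * np.sqrt(context @ A_inv @ context)
            scores.append(ucb)
        return int(np.argmax(scores))

    def update(self, arm, context, reward):
        """Reward is composite, e.g. performance - w1*latency - w2*cost."""
        self.A[arm] += np.outer(context, context)
        self.b[arm] += reward * context

rng = np.random.default_rng(0)
sel = ModelSelector()
for _ in range(100):
    ctx = rng.normal(size=8)  # encoder output features (the "context")
    arm = sel.select(ctx)
    # Simulated composite reward: the closed-source arm (1) performs
    # better on some contexts but pays a fixed cost penalty.
    reward = (0.5 + 0.4 * ctx[0] if arm == 1 else 0.3) - 0.1 * arm
    sel.update(arm, ctx, reward)
```

With context, the learner can route "easy" inputs to the free open-source model and reserve the costly closed-source model for inputs where its advantage outweighs the cost penalty, which is the trade-off the abstract's theoretical analysis formalizes.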
Related papers
- The Challenger: When Do New Data Sources Justify Switching Machine Learning Models? [2.7998963147546143]
We study the problem of deciding whether an organization should replace a trained incumbent model with a challenger relying on newly available features.
We develop a unified economic and statistical framework that links learning-curve dynamics, data-acquisition and retraining costs, and discounting of future gains.
arXiv Detail & Related papers (2025-12-20T15:03:40Z) - Lift What You Can: Green Online Learning with Heterogeneous Ensembles [3.5523355921740163]
We present a policy for choosing which models to train on incoming data.
Most notably, we propose the novel $\zeta$-policy, which focuses on training near-optimal models at reduced cost.
In our experiments across 11 benchmark datasets, we find empirical evidence that our $\zeta$-policy is a strong contribution to the state of the art.
arXiv Detail & Related papers (2025-09-23T13:14:37Z) - SPaRFT: Self-Paced Reinforcement Fine-Tuning for Large Language Models [51.74498855100541]
Large language models (LLMs) have shown strong reasoning capabilities when fine-tuned with reinforcement learning (RL).
We propose SPaRFT, a self-paced learning framework that enables efficient learning based on the capability of the model being trained.
arXiv Detail & Related papers (2025-08-07T03:50:48Z) - Intention-Conditioned Flow Occupancy Models [69.79049994662591]
Large-scale pre-training has fundamentally changed how machine learning research is done today.
Applying this same framework to reinforcement learning is appealing because it offers compelling avenues for addressing core challenges in RL.
Recent advances in generative AI have provided new tools for modeling highly complex distributions.
arXiv Detail & Related papers (2025-06-10T15:27:46Z) - Attribute-to-Delete: Machine Unlearning via Datamodel Matching [65.13151619119782]
Machine unlearning -- efficiently removing the influence of a small "forget set" of training data from a pre-trained machine learning model -- has recently attracted interest.
Recent research shows that machine unlearning techniques do not hold up in such a challenging setting.
arXiv Detail & Related papers (2024-10-30T17:20:10Z) - Unlearnable Algorithms for In-context Learning [36.895152458323764]
In this paper, we focus on efficient unlearning methods for the task adaptation phase of a pretrained large language model.
We observe that an LLM's ability to do in-context learning for task adaptation allows for efficient exact unlearning of task adaptation training data.
We propose a new holistic measure of unlearning cost which accounts for varying inference costs.
arXiv Detail & Related papers (2024-02-01T16:43:04Z) - Equitable-FL: Federated Learning with Sparsity for Resource-Constrained
Environment [10.980548731600116]
We propose a sparse form of federated learning that performs well in a resource-constrained environment.
Our goal is to make learning possible, regardless of a node's space, computing, or bandwidth scarcity.
Results obtained from experiments performed for training convolutional neural networks validate the efficacy of Equitable-FL.
arXiv Detail & Related papers (2023-09-02T08:40:17Z) - Anytime Model Selection in Linear Bandits [61.97047189786905]
We develop ALEXP, which has an exponentially improved dependence on $M$ for its regret.
Our approach utilizes a novel time-uniform analysis of the Lasso, establishing a new connection between online learning and high-dimensional statistics.
arXiv Detail & Related papers (2023-07-24T15:44:30Z) - Distilling from Similar Tasks for Transfer Learning on a Budget [38.998980344852846]
Transfer learning is an effective solution for training with few labels, however often at the expense of a computationally costly fine-tuning of large base models.
We propose to mitigate this unpleasant trade-off between compute and accuracy via semi-supervised cross-domain distillation.
Our methods need no access to source data, and merely need features and pseudo-labels of the source models.
arXiv Detail & Related papers (2023-04-24T17:59:01Z) - Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
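As an illustrative sketch of merging models in parameter space: the simplest instance is a (possibly weighted) average of each tensor across models with identical architectures. The cited paper's actual fusion method is more sophisticated; the dictionary layout and toy values here are assumptions.

```python
import numpy as np

def merge_weights(state_dicts, coeffs=None):
    """Merge same-architecture models by averaging each parameter tensor."""
    n = len(state_dicts)
    coeffs = coeffs or [1.0 / n] * n
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(c * sd[key] for c, sd in zip(coeffs, state_dicts))
    return merged

# Two toy "fine-tuned" models sharing one layer.
m1 = {"layer.weight": np.array([[1.0, 2.0], [3.0, 4.0]])}
m2 = {"layer.weight": np.array([[3.0, 2.0], [1.0, 0.0]])}
fused = merge_weights([m1, m2])
# fused["layer.weight"] → [[2.0, 2.0], [2.0, 2.0]]
```

No training data is touched at any point, which is what makes the fusion "dataless".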
arXiv Detail & Related papers (2022-12-19T20:46:43Z) - Learning to Optimize Permutation Flow Shop Scheduling via Graph-based
Imitation Learning [70.65666982566655]
Permutation flow shop scheduling (PFSS) is widely used in manufacturing systems.
We propose to train the model via expert-driven imitation learning, which accelerates convergence more stably and accurately.
Our model's network parameters are reduced to only 37% of theirs, and the solution gap of our model towards the expert solutions decreases from 6.8% to 1.3% on average.
arXiv Detail & Related papers (2022-10-31T09:46:26Z) - Implicit Offline Reinforcement Learning via Supervised Learning [83.8241505499762]
Offline Reinforcement Learning (RL) via Supervised Learning is a simple and effective way to learn robotic skills from a dataset collected by policies of different expertise levels.
We show how implicit models can leverage return information and match or outperform explicit algorithms to acquire robotic skills from fixed datasets.
arXiv Detail & Related papers (2022-10-21T21:59:42Z) - Deep Learning with Multiple Data Set: A Weighted Goal Programming
Approach [2.7393821783237184]
Large-scale data analysis is growing at an exponential rate as data proliferates in our societies.
Deep Learning models require plenty of resources, and distributed training is needed.
This paper presents a Multicriteria approach for distributed learning.
arXiv Detail & Related papers (2021-11-27T07:10:25Z) - You Only Compress Once: Optimal Data Compression for Estimating Linear
Models [1.2845031126178592]
Many engineering systems that use linear models achieve computational efficiency through distributed systems and expert configuration.
Conditionally sufficient statistics is a unified data compression and estimation strategy.
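For intuition on sufficient statistics as a compression strategy: ordinary least squares needs only $X^\top X$ and $X^\top y$, so data shards can be compressed independently and the statistics summed before estimation. This is a minimal sketch of that classical idea, not the paper's more general "conditionally sufficient statistics"; the shard split and variable names are assumptions.

```python
import numpy as np

def compress(X, y):
    """Reduce a data shard to the OLS sufficient statistics."""
    return X.T @ X, X.T @ y

def fit_from_stats(stats):
    """Solve the normal equations from summed per-shard statistics."""
    XtX = sum(s[0] for s in stats)
    Xty = sum(s[1] for s in stats)
    return np.linalg.solve(XtX, Xty)

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true  # noise-free for clarity
# Compress two shards separately, then combine and fit.
stats = [compress(X[:500], y[:500]), compress(X[500:], y[500:])]
beta_hat = fit_from_stats(stats)
```

Each shard shrinks from 1000 rows to a 3x3 matrix and a length-3 vector, yet the estimate is identical to fitting on the full data, which is why such statistics enable "compress once" pipelines in distributed systems.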
arXiv Detail & Related papers (2021-02-22T19:00:18Z) - Decentralized Federated Learning Preserves Model and Data Privacy [77.454688257702]
We propose a fully decentralized approach, which allows to share knowledge between trained models.
Students are trained on the output of their teachers via synthetically generated input data.
The results show that an untrained student model, trained on the teacher's output, reaches F1-scores comparable to the teacher's.
arXiv Detail & Related papers (2021-02-01T14:38:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.