Coresets via Bilevel Optimization for Continual Learning and Streaming
- URL: http://arxiv.org/abs/2006.03875v2
- Date: Thu, 22 Oct 2020 17:53:39 GMT
- Title: Coresets via Bilevel Optimization for Continual Learning and Streaming
- Authors: Zalán Borsos, Mojmír Mutný, Andreas Krause
- Abstract summary: We propose a novel coreset construction via cardinality-constrained bilevel optimization.
We show how our framework can efficiently generate coresets for deep neural networks, and demonstrate its empirical benefits in continual learning and in streaming settings.
- Score: 86.67190358712064
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Coresets are small data summaries that are sufficient for model training.
They can be maintained online, enabling efficient handling of large data
streams under resource constraints. However, existing constructions are limited
to simple models such as k-means and logistic regression. In this work, we
propose a novel coreset construction via cardinality-constrained bilevel
optimization. We show how our framework can efficiently generate coresets for
deep neural networks, and demonstrate its empirical benefits in continual
learning and in streaming settings.
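To make the idea of cardinality-constrained bilevel coreset selection concrete, here is a minimal sketch, not the authors' algorithm: it uses a weighted ridge-regression inner problem (whose closed form makes the implicit hypergradient exact) in place of a deep network, relaxes the cardinality constraint to soft per-point weights, and rounds to the top-k weights at the end. All function names and hyperparameters are illustrative assumptions.

```python
import numpy as np

def inner_solve(X, y, w, lam=1e-2):
    """Inner problem: closed-form weighted ridge regression.
    Returns the system matrix A (reused for the implicit gradient) and theta."""
    A = X.T @ (w[:, None] * X) + lam * np.eye(X.shape[1])
    theta = np.linalg.solve(A, X.T @ (w * y))
    return A, theta

def bilevel_coreset(X, y, k, steps=100, lr=0.5, lam=1e-2):
    """Outer problem: descend on soft selection weights w so that the model
    trained on the weighted subset fits the FULL dataset well, then enforce
    the cardinality constraint by keeping the k largest weights."""
    n = X.shape[0]
    w = np.full(n, 0.5)                          # relaxed selection weights in [0, 1]
    for _ in range(steps):
        A, theta = inner_solve(X, y, w, lam)
        resid = X @ theta - y                    # residuals of outer (full-data) loss
        g = 2.0 * X.T @ resid / n                # dF/dtheta for F = mean squared error
        v = np.linalg.solve(A, g)                # implicit-function-theorem solve (A symmetric)
        grad_w = (X @ v) * (y - X @ theta)       # exact hypergradient dF/dw_i for ridge
        w = np.clip(w - lr * grad_w, 0.0, 1.0)   # projected gradient step
    return np.argsort(w)[-k:]                    # round: indices of the k largest weights

# Usage on synthetic data (illustrative):
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)
coreset_idx = bilevel_coreset(X, y, k=20)
```

The ridge inner problem is chosen purely so the hypergradient has an exact closed form; for deep networks the paper's setting requires approximating this implicit gradient rather than solving the linear system directly.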
Related papers
- Refined Coreset Selection: Towards Minimal Coreset Size under Model
Performance Constraints [69.27190330994635]
Coreset selection is powerful in reducing computational costs and accelerating data processing for deep learning algorithms.
We propose an innovative method, which maintains optimization priority order over the model performance and coreset size.
Empirically, extensive experiments confirm its superiority, often yielding better model performance with smaller coreset sizes.
arXiv Detail & Related papers (2023-11-15T03:43:04Z)
- Composable Core-sets for Diversity Approximation on Multi-Dataset Streams [4.765131728094872]
Composable core-sets are core-sets with the property that subsets of the core set can be unioned together to obtain an approximation for the original data.
We introduce a core-set construction algorithm for constructing composable core-sets to summarize streamed data for use in active learning environments.
arXiv Detail & Related papers (2023-08-10T23:24:51Z)
- Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
Existing BNNs neglect the intrinsic bilinear relationship between real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
- Adaptive Second Order Coresets for Data-efficient Machine Learning [5.362258158646462]
Training machine learning models on datasets incurs substantial computational costs.
We propose AdaCore to extract subsets of the training examples for efficient machine learning.
arXiv Detail & Related papers (2022-07-28T05:43:09Z)
- Data Summarization via Bilevel Optimization [48.89977988203108]
A simple yet powerful approach is to operate on small subsets of data.
In this work, we propose a generic coreset framework that formulates the coreset selection as a cardinality-constrained bilevel optimization problem.
arXiv Detail & Related papers (2021-09-26T09:08:38Z)
- Robust Coreset for Continuous-and-Bounded Learning (with Outliers) [30.91741925182613]
We propose a novel robust coreset method for the continuous-and-bounded learning problem (with outliers).
Our robust coreset can be efficiently maintained in a fully-dynamic environment.
arXiv Detail & Related papers (2021-06-30T19:24:20Z)
- Top-KAST: Top-K Always Sparse Training [50.05611544535801]
We propose Top-KAST, a method that preserves constant sparsity throughout training.
We show that it performs comparably to or better than previous works when training models on the established ImageNet benchmark.
In addition to our ImageNet results, we also demonstrate our approach in the domain of language modeling.
arXiv Detail & Related papers (2021-06-07T11:13:05Z)
- On Coresets for Support Vector Machines [61.928187390362176]
A coreset is a small, representative subset of the original data points.
We show that our algorithm can be used to extend the applicability of any off-the-shelf SVM solver to streaming, distributed, and dynamic data settings.
arXiv Detail & Related papers (2020-02-15T23:25:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.