Continual Learning From a Stream of APIs
- URL: http://arxiv.org/abs/2309.00023v2
- Date: Thu, 12 Sep 2024 08:34:10 GMT
- Title: Continual Learning From a Stream of APIs
- Authors: Enneng Yang, Zhenyi Wang, Li Shen, Nan Yin, Tongliang Liu, Guibing Guo, Xingwei Wang, Dacheng Tao,
- Abstract summary: Continual learning (CL) aims to learn new tasks without forgetting previous tasks.
Existing CL methods require a large amount of raw data, which is often unavailable due to copyright considerations and privacy risks.
This paper considers two practical-yet-novel CL settings: data-efficient CL (DECL-APIs) and data-free CL (DFCL-APIs)
- Score: 90.41825351073908
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual learning (CL) aims to learn new tasks without forgetting previous tasks. However, existing CL methods require a large amount of raw data, which is often unavailable due to copyright considerations and privacy risks. Instead, stakeholders usually release pre-trained machine learning models as a service (MLaaS), which users can access via APIs. This paper considers two practical-yet-novel CL settings: data-efficient CL (DECL-APIs) and data-free CL (DFCL-APIs), which achieve CL from a stream of APIs with partial or no raw data. Performing CL under these two new settings faces several challenges: unavailable full raw data, unknown model parameters, heterogeneous models of arbitrary architecture and scale, and catastrophic forgetting of previous APIs. To overcome these issues, we propose a novel data-free cooperative continual distillation learning framework that distills knowledge from a stream of APIs into a CL model by generating pseudo data, just by querying APIs. Specifically, our framework includes two cooperative generators and one CL model, forming their training as an adversarial game. We first use the CL model and the current API as fixed discriminators to train generators via a derivative-free method. Generators adversarially generate hard and diverse synthetic data to maximize the response gap between the CL model and the API. Next, we train the CL model by minimizing the gap between the responses of the CL model and the black-box API on synthetic data, to transfer the API's knowledge to the CL model. Furthermore, we propose a new regularization term based on network similarity to prevent catastrophic forgetting of previous APIs.Our method performs comparably to classic CL with full raw data on the MNIST and SVHN in the DFCL-APIs setting. In the DECL-APIs setting, our method achieves 0.97x, 0.75x and 0.69x performance of classic CL on CIFAR10, CIFAR100, and MiniImageNet.
Related papers
- SEAL: Suite for Evaluating API-use of LLMs [1.2528321519119252]
SEAL is an end-to-end testbed designed to evaluate large language models in real-world API usage.
It standardizes existing benchmarks, integrates an agent system for testing API retrieval and planning, and addresses the instability of real-time APIs.
arXiv Detail & Related papers (2024-09-23T20:16:49Z) - FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking [57.53742155914176]
API call generation is the cornerstone of large language models' tool-using ability.
Existing supervised and in-context learning approaches suffer from high training costs, poor data efficiency, and generated API calls that can be unfaithful to the API documentation and the user's request.
We propose an output-side optimization approach called FANTASE to address these limitations.
arXiv Detail & Related papers (2024-07-18T23:44:02Z) - Private-Library-Oriented Code Generation with Large Language Models [52.73999698194344]
This paper focuses on utilizing large language models (LLMs) for code generation in private libraries.
We propose a novel framework that emulates the process of programmers writing private code.
We create four private library benchmarks, including TorchDataEval, TorchDataComplexEval, MonkeyEval, and BeatNumEval.
arXiv Detail & Related papers (2023-07-28T07:43:13Z) - Don't Memorize; Mimic The Past: Federated Class Incremental Learning
Without Episodic Memory [36.4406505365313]
This paper presents a framework for federated class incremental learning that utilizes a generative model to synthesize samples from past distributions instead of storing part of past data.
The generative model is trained on the server using data-free methods at the end of each task without requesting data from clients.
arXiv Detail & Related papers (2023-07-02T07:06:45Z) - Learning to Learn from APIs: Black-Box Data-Free Meta-Learning [95.41441357931397]
Data-free meta-learning (DFML) aims to enable efficient learning of new tasks by meta-learning from a collection of pre-trained models without access to the training data.
Existing DFML work can only meta-learn from (i) white-box and (ii) small-scale pre-trained models.
We propose a Bi-level Data-free Meta Knowledge Distillation (BiDf-MKD) framework to transfer more general meta knowledge from a collection of black-box APIs to one single model.
arXiv Detail & Related papers (2023-05-28T18:00:12Z) - Computationally Budgeted Continual Learning: What Does Matter? [128.0827987414154]
Continual Learning (CL) aims to sequentially train models on streams of incoming data that vary in distribution by preserving previous knowledge while adapting to new data.
Current CL literature focuses on restricted access to previously seen data, while imposing no constraints on the computational budget for training.
We revisit this problem with a large-scale benchmark and analyze the performance of traditional CL approaches in a compute-constrained setting.
arXiv Detail & Related papers (2023-03-20T14:50:27Z) - TARGET: Federated Class-Continual Learning via Exemplar-Free
Distillation [9.556059871106351]
This paper focuses on an under-explored yet important problem: Federated Class-Continual Learning (FCCL)
Existing FCCL works suffer from various limitations, such as requiring additional datasets or storing the private data from previous tasks.
We propose a novel method called TARGET, which alleviates catastrophic forgetting in FCCL while preserving client data privacy.
arXiv Detail & Related papers (2023-03-13T09:11:54Z) - The CLEAR Benchmark: Continual LEArning on Real-World Imagery [77.98377088698984]
Continual learning (CL) is widely regarded as crucial challenge for lifelong AI.
We introduce CLEAR, the first continual image classification benchmark dataset with a natural temporal evolution of visual concepts.
We find that a simple unsupervised pre-training step can already boost state-of-the-art CL algorithms.
arXiv Detail & Related papers (2022-01-17T09:09:09Z) - Prototypes-Guided Memory Replay for Continual Learning [13.459792148030717]
Continual learning (CL) refers to a machine learning paradigm that using only a small account of training samples and previously learned knowledge to enhance learning performance.
The major difficulty in CL is catastrophic forgetting of previously learned tasks, caused by shifts in data distributions.
We propose a memory-efficient CL method, incorporating it into an online meta-learning model.
arXiv Detail & Related papers (2021-08-28T13:00:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.