MelissaDL x Breed: Towards Data-Efficient On-line Supervised Training of Multi-parametric Surrogates with Active Learning
- URL: http://arxiv.org/abs/2410.05860v1
- Date: Tue, 8 Oct 2024 09:52:15 GMT
- Title: MelissaDL x Breed: Towards Data-Efficient On-line Supervised Training of Multi-parametric Surrogates with Active Learning
- Authors: Sofya Dymchenko, Abhishek Purandare, Bruno Raffin
- Abstract summary: We introduce a new active learning method to enhance data-efficiency for on-line surrogate training.
The surrogate is trained to predict a given timestep directly for different initial and boundary condition parameters.
Preliminary results for the 2D heat PDE demonstrate the potential of this method, called Breed, to improve the generalization capabilities of surrogates.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Artificial intelligence is transforming scientific computing with deep neural network surrogates that approximate solutions to partial differential equations (PDEs). Traditional off-line training methods face issues with storage and I/O efficiency, as the training dataset has to be computed with numerical solvers up-front. Our previous work, the Melissa framework, addresses these problems by enabling data to be created "on-the-fly" and streamed directly into the training process. In this paper, we introduce a new active learning method to enhance data-efficiency for on-line surrogate training. The surrogate is direct and multi-parametric, i.e., it is trained to predict a given timestep directly for different initial and boundary condition parameters. Our approach uses Adaptive Multiple Importance Sampling guided by training loss statistics to focus neural network training on the difficult areas of the parameter space. Preliminary results for the 2D heat PDE demonstrate the potential of this method, called Breed, to improve the generalization capabilities of surrogates while reducing computational overhead.
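As a reading aid, here is a minimal, hypothetical sketch of the loss-guided adaptive sampling idea described in the abstract, in the spirit of Adaptive Multiple Importance Sampling: candidate initial/boundary-condition parameters are drawn from a proposal distribution, scored by the surrogate's training loss, and the proposal is refit toward high-loss regions. This is not the authors' implementation; the names (`surrogate_loss`, `breed_like_round`), the Gaussian proposal, and the unit-cube parameter domain are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def surrogate_loss(params):
    """Hypothetical stand-in for the surrogate's training loss at these parameters."""
    # Pretend the surrogate struggles near one corner of the parameter domain.
    return np.exp(-10.0 * np.sum((params - 0.8) ** 2))

def breed_like_round(mean, std, n_samples=64):
    """One sampling round: draw parameters, weight by loss, refit the proposal."""
    # 1. Draw candidate initial/boundary-condition parameters from the proposal.
    candidates = rng.normal(mean, std, size=(n_samples, mean.size))
    candidates = np.clip(candidates, 0.0, 1.0)  # keep inside the parameter domain

    # 2. In the real workflow each candidate would be handed to a solver to
    #    produce a training sample; here we only evaluate a toy loss on it.
    losses = np.array([surrogate_loss(c) for c in candidates])

    # 3. Normalized importance weights: high-loss ("difficult") regions receive
    #    more probability mass in the next proposal.
    weights = losses / losses.sum()

    # 4. Refit the Gaussian proposal by loss-weighted moment matching.
    new_mean = weights @ candidates
    new_std = np.sqrt(weights @ (candidates - new_mean) ** 2) + 1e-3
    return new_mean, new_std

mean, std = np.full(2, 0.5), np.full(2, 0.3)
for step in range(5):
    mean, std = breed_like_round(mean, std)
    print(f"round {step}: proposal mean={mean.round(3)}, std={std.round(3)}")
```

In the actual Melissa/Breed workflow the selected parameters would be dispatched to solver instances and the resulting fields streamed into training; the sketch only covers the sampling logic.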
Related papers
- Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning [62.984693936073974]
Value-based reinforcement learning can learn effective policies for a wide range of multi-turn problems.
Current value-based RL methods have proven particularly challenging to scale to the setting of large language models.
We propose a novel offline RL algorithm that addresses these drawbacks, casting Q-learning as a modified supervised fine-tuning problem.
arXiv Detail & Related papers (2024-11-07T21:36:52Z)
- Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning [45.78096783448304]
In this work, seeking data efficiency, we design unsupervised pretraining for PDE operator learning.
We mine unlabeled PDE data without simulated solutions, and we pretrain neural operators with physics-inspired reconstruction-based proxy tasks.
Our method is highly data-efficient, more generalizable, and even outperforms conventional vision-pretrained models.
arXiv Detail & Related papers (2024-02-24T06:27:33Z)
- Machine Unlearning of Pre-trained Large Language Models [17.40601262379265]
This study investigates the concept of the 'right to be forgotten' within the context of large language models (LLMs).
We explore machine unlearning as a pivotal solution, with a focus on pre-trained models.
arXiv Detail & Related papers (2024-02-23T07:43:26Z)
- Fast Machine Unlearning Without Retraining Through Selective Synaptic Dampening [51.34904967046097]
We present Selective Synaptic Dampening (SSD), a novel two-step, post hoc, retrain-free approach to machine unlearning that is fast, performant, and does not require long-term storage of the training data.
arXiv Detail & Related papers (2023-08-15T11:30:45Z)
- Training Deep Surrogate Models with Large Scale Online Learning [48.7576911714538]
Deep learning algorithms have emerged as a viable alternative for obtaining fast solutions for PDEs.
Models are usually trained on synthetic data generated by solvers, stored on disk and read back for training.
This work proposes an open source online training framework for deep surrogate models (a minimal streaming-training sketch is given after this list).
arXiv Detail & Related papers (2023-06-28T12:02:27Z)
- Transferring Learning Trajectories of Neural Networks [2.2299983745857896]
Training deep neural networks (DNNs) is computationally expensive.
We formulate the problem of "transferring" a given learning trajectory from one initial parameter to another one.
We empirically show that the transferred parameters achieve non-trivial accuracy before any direct training, and can be trained significantly faster than training from scratch.
arXiv Detail & Related papers (2023-05-23T14:46:32Z)
- Variational operator learning: A unified paradigm marrying training neural operators and solving partial differential equations [9.148052787201797]
We propose a novel paradigm that provides a unified framework of training neural operators and solving PDEs with the variational form.
With a label-free training set and a 5-label-only shift set, VOL learns solution operators whose test errors decrease as a power law in the amount of unlabeled data.
arXiv Detail & Related papers (2023-04-09T13:20:19Z)
- Knowledge Distillation as Efficient Pre-training: Faster Convergence, Higher Data-efficiency, and Better Transferability [53.27240222619834]
Knowledge Distillation as Efficient Pre-training aims to efficiently transfer the learned feature representation from pre-trained models to new student models for future downstream tasks.
Our method performs comparably with supervised pre-training counterparts on 3 downstream tasks and 9 downstream datasets while requiring 10x less data and 5x less pre-training time.
arXiv Detail & Related papers (2022-03-10T06:23:41Z)
- Adaptive Serverless Learning [114.36410688552579]
We propose a novel adaptive decentralized training approach, which can compute the learning rate from data dynamically.
Our theoretical results reveal that the proposed algorithm can achieve linear speedup with respect to the number of workers.
To reduce the communication overhead, we further propose a communication-efficient adaptive decentralized training approach.
arXiv Detail & Related papers (2020-08-24T13:23:02Z)
- AWAC: Accelerating Online Reinforcement Learning with Offline Datasets [84.94748183816547]
We show that our method, advantage weighted actor critic (AWAC), enables rapid learning of skills with a combination of prior demonstration data and online experience.
Our results show that incorporating prior data can reduce the time required to learn a range of robotic skills to practical time-scales.
arXiv Detail & Related papers (2020-06-16T17:54:41Z)
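The entry above on large-scale online learning for deep surrogate models, like the Melissa setting of the main paper, replaces a disk-resident dataset with samples streamed from running solvers into the trainer. The sketch below is a rough, hypothetical illustration of that producer/consumer pattern, not the framework's actual API: a stand-in solver thread pushes (parameters, field) pairs into a queue, and the training loop consumes them as they arrive.

```python
import queue
import threading
import numpy as np

sample_queue = queue.Queue(maxsize=128)  # buffers samples between solver and trainer

def solver_worker(n_runs):
    """Stand-in for numerical solver runs producing training samples on-the-fly."""
    rng = np.random.default_rng()
    for _ in range(n_runs):
        params = rng.uniform(0.0, 1.0, size=2)               # e.g. initial/boundary conditions
        field = np.outer(np.sin(params[0] * np.arange(8)),   # fake "solution field" output
                         np.cos(params[1] * np.arange(8)))
        sample_queue.put((params, field))
    sample_queue.put(None)                                    # sentinel: no more data

def training_loop():
    """Consume streamed samples; a real loop would update the surrogate's weights here."""
    seen = 0
    while True:
        item = sample_queue.get()
        if item is None:
            break
        params, field = item
        # surrogate.train_step(params, field)  # hypothetical update call
        seen += 1
    print(f"trained on {seen} streamed samples")

producer = threading.Thread(target=solver_worker, args=(32,))
producer.start()
training_loop()
producer.join()
```

A real deployment would run many solver instances in parallel and replace the commented-out train_step call with an actual surrogate update.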