Improving Sample Efficiency of Deep Learning Models in Electricity Market
- URL: http://arxiv.org/abs/2210.05599v1
- Date: Tue, 11 Oct 2022 16:35:13 GMT
- Title: Improving Sample Efficiency of Deep Learning Models in Electricity Market
- Authors: Guangchun Ruan, Jianxiao Wang, Haiwang Zhong, Qing Xia, Chongqing Kang
- Abstract summary: We propose a general framework, namely Knowledge-Augmented Training (KAT), to improve the sample efficiency.
We propose a novel data augmentation technique to generate some synthetic data, which are later processed by an improved training strategy.
Modern learning theories demonstrate the effectiveness of our method in terms of effective prediction error feedback, a reliable loss function, and rich gradient noise.
- Score: 0.41998444721319217
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The superior performance of deep learning relies heavily on a large
collection of sample data, but the data insufficiency problem turns out to be
relatively common in global electricity markets. How to prevent overfitting in
this case becomes a fundamental challenge when training deep learning models in
different market applications. With this in mind, we propose a general
framework, namely Knowledge-Augmented Training (KAT), to improve the sample
efficiency, and the main idea is to incorporate domain knowledge into the
training procedures of deep learning models. Specifically, we propose a novel
data augmentation technique to generate some synthetic data, which are later
processed by an improved training strategy. This KAT methodology follows and
realizes the idea of combining analytical and deep learning models together.
Modern learning theories demonstrate the effectiveness of our method in terms
of effective prediction error feedback, a reliable loss function, and rich
gradient noise. Finally, we study two popular applications in detail: user
modeling and probabilistic price forecasting. The proposed method outperforms
other competitors in all numerical tests, and the underlying reasons are
explained by further statistical and visualization results.
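The abstract describes the KAT workflow only at a high level: an analytical (domain-knowledge) model generates synthetic data, which are then combined with the scarce real data during training. The following is a minimal sketch of that general idea, not the authors' implementation. Everything specific in it is an assumption made for illustration: the linear demand curve in `analytical_demand`, the down-weighting factor `synthetic_weight`, all function names, and the use of closed-form weighted least squares in place of a deep learning model.

```python
import random


def analytical_demand(price):
    """Hypothetical domain-knowledge model: linear price-responsive demand."""
    return 100.0 - 2.0 * price


def make_synthetic(n, rng, low=10.0, high=40.0):
    """Sample prices uniformly and label them with the analytical model."""
    prices = [rng.uniform(low, high) for _ in range(n)]
    return [(p, analytical_demand(p)) for p in prices]


def fit_weighted_line(samples, weights):
    """Closed-form weighted least squares for y = slope * x + intercept."""
    total = sum(weights)
    mean_x = sum(w * x for (x, _), w in zip(samples, weights)) / total
    mean_y = sum(w * y for (_, y), w in zip(samples, weights)) / total
    cov = sum(w * (x - mean_x) * (y - mean_y)
              for (x, y), w in zip(samples, weights))
    var = sum(w * (x - mean_x) ** 2 for (x, _), w in zip(samples, weights))
    slope = cov / var
    return slope, mean_y - slope * mean_x


def train_with_augmentation(real, synthetic, synthetic_weight=0.3):
    """Fit on real plus synthetic samples, down-weighting the synthetic ones."""
    samples = real + synthetic
    weights = [1.0] * len(real) + [synthetic_weight] * len(synthetic)
    return fit_weighted_line(samples, weights)


rng = random.Random(0)
# Three noisy "real" observations stand in for a data-scarce market.
real = [(p, analytical_demand(p) + rng.gauss(0.0, 8.0))
        for p in (12.0, 15.0, 18.0)]
synthetic = make_synthetic(50, rng)

slope_real, intercept_real = fit_weighted_line(real, [1.0] * len(real))
slope_aug, intercept_aug = train_with_augmentation(real, synthetic)
print("real only :", slope_real, intercept_real)
print("augmented :", slope_aug, intercept_aug)
```

The per-sample weight is one simple way to keep the synthetic data from dominating the real observations; the paper's own "improved training strategy" may work quite differently.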
Related papers
- Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining [55.262510814326035]
Existing reweighting strategies primarily focus on group-level data importance.
We introduce novel algorithms for dynamic, instance-level data reweighting.
Our framework allows us to devise reweighting strategies deprioritizing redundant or uninformative data.
arXiv Detail & Related papers (2025-02-10T17:57:15Z)
- Enhancing Training Data Attribution for Large Language Models with Fitting Error Consideration [74.09687562334682]
We introduce a novel training data attribution method called Debias and Denoise Attribution (DDA).
Our method significantly outperforms existing approaches, achieving an averaged AUC of 91.64%.
DDA exhibits strong generality and scalability across various sources and different-scale models like LLaMA2, QWEN2, and Mistral.
arXiv Detail & Related papers (2024-10-02T07:14:26Z)
- Applied Federated Model Personalisation in the Industrial Domain: A Comparative Study [5.999474111757664]
Three suggested strategies to tackle this challenge include Active Learning, Knowledge Distillation, and Local Memorization.
The present study delves into the fundamental principles of these three approaches and proposes an advanced Federated Learning System.
The original and optimised models are then compared in both local and federated contexts.
arXiv Detail & Related papers (2024-09-10T23:00:19Z)
- Unlearning with Control: Assessing Real-world Utility for Large Language Model Unlearning [97.2995389188179]
Recent research has begun to approach large language model (LLM) unlearning via gradient ascent (GA).
Despite their simplicity and efficiency, we suggest that GA-based methods are prone to excessive unlearning.
We propose several controlling methods that can regulate the extent of excessive unlearning.
arXiv Detail & Related papers (2024-06-13T14:41:00Z)
- A Comprehensive Study on Model Initialization Techniques Ensuring Efficient Federated Learning [0.0]
Federated learning (FL) has emerged as a promising paradigm for training machine learning models in a distributed and privacy-preserving manner.
The choice of model initialization method plays a crucial role in the performance, convergence speed, communication efficiency, and privacy guarantees of federated learning systems.
Our research meticulously compares, categorizes, and delineates the merits and demerits of each technique, examining their applicability across diverse FL scenarios.
arXiv Detail & Related papers (2023-10-31T23:26:58Z)
- A Note on Generalization in Variational Autoencoders: How Effective Is Synthetic Data & Overparameterization? [11.15942317329723]
Variational autoencoders (VAEs) are deep probabilistic models that are used in scientific applications.
Our motivation comes from the recent discussion on whether the increasing amount of publicly accessible synthetic data will improve or hurt currently trained generative models.
Our investigation shows that both training on samples from a pre-trained diffusion model and using more parameters at certain layers can effectively mitigate overfitting in VAEs.
arXiv Detail & Related papers (2023-10-30T15:38:39Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Learnability of Competitive Threshold Models [11.005966612053262]
We study the learnability of the competitive threshold model from a theoretical perspective.
We demonstrate how competitive threshold models can be seamlessly simulated by artificial neural networks.
arXiv Detail & Related papers (2022-05-08T01:11:51Z)
- Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
- MixKD: Towards Efficient Distillation of Large-scale Language Models [129.73786264834894]
We propose MixKD, a data-agnostic distillation framework, to endow the resulting model with stronger generalization ability.
We prove from a theoretical perspective that under reasonable conditions MixKD gives rise to a smaller gap between the generalization error and the empirical error.
Experiments under a limited-data setting and ablation studies further demonstrate the advantages of the proposed approach.
arXiv Detail & Related papers (2020-11-01T18:47:51Z)
- Domain Knowledge Integration By Gradient Matching For Sample-Efficient Reinforcement Learning [0.0]
We propose a gradient matching algorithm to improve sample efficiency by utilizing target slope information from the dynamics to aid the model-free learner.
We demonstrate this by presenting a technique for matching the gradient information from the model-based learner with the model-free component in an abstract low-dimensional space.
arXiv Detail & Related papers (2020-05-28T05:02:47Z)