Improving Sample Efficiency of Deep Learning Models in Electricity Market
- URL: http://arxiv.org/abs/2210.05599v1
- Date: Tue, 11 Oct 2022 16:35:13 GMT
- Title: Improving Sample Efficiency of Deep Learning Models in Electricity Market
- Authors: Guangchun Ruan, Jianxiao Wang, Haiwang Zhong, Qing Xia, Chongqing Kang
- Abstract summary: We propose a general framework, namely Knowledge-Augmented Training (KAT), to improve sample efficiency.
We propose a novel data augmentation technique to generate synthetic data, which are then processed by an improved training strategy.
Modern learning theories demonstrate the effectiveness of our method in terms of effective prediction error feedback, a reliable loss function, and rich gradient noise.
- Score: 0.41998444721319217
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The superior performance of deep learning relies heavily on large
collections of sample data, but data insufficiency is relatively common in
global electricity markets. Preventing overfitting in this setting therefore
becomes a fundamental challenge when training deep learning models for
different market applications. With this in mind, we propose a general
framework, Knowledge-Augmented Training (KAT), to improve sample efficiency;
the main idea is to incorporate domain knowledge into the training procedures
of deep learning models. Specifically, we propose a novel data augmentation
technique to generate synthetic data, which are then processed by an improved
training strategy. The KAT methodology follows and realizes the idea of
combining analytical and deep learning models. Modern learning theories
demonstrate the effectiveness of our method in terms of effective prediction
error feedback, a reliable loss function, and rich gradient noise. Finally, we
study two popular applications in detail: user modeling and probabilistic
price forecasting. The proposed method outperforms competing approaches in all
numerical tests, and the underlying reasons are explained by further
statistical and visualization results.
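The abstract describes KAT only at a high level, so here is a minimal sketch of one plausible reading: synthetic samples labeled by an analytical (domain-knowledge) model are mixed into training with a reduced loss weight. The functions `synthesize_samples` and `kat_train`, the analytical stand-in, and the `synth_weight` parameter are illustrative assumptions, not the paper's actual implementation.

```python
import torch
from torch import nn

# Hypothetical analytical model standing in for the domain knowledge
# (e.g., a simplified price-response curve); not the paper's formulation.
def analytical_response(x):
    return torch.tanh(x @ torch.ones(x.shape[1], 1))

def synthesize_samples(n_samples, n_features, analytical_model=analytical_response):
    """Draw random feature vectors and label them with the analytical model."""
    x = torch.rand(n_samples, n_features)
    with torch.no_grad():
        y = analytical_model(x)  # labels come from domain knowledge, not measurements
    return x, y

def kat_train(net, real_loader, synth_loader, epochs=50, synth_weight=0.3, lr=1e-3):
    """Train on real batches plus down-weighted synthetic batches (KAT-style sketch)."""
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for (xr, yr), (xs, ys) in zip(real_loader, synth_loader):
            opt.zero_grad()
            loss = loss_fn(net(xr), yr) + synth_weight * loss_fn(net(xs), ys)
            loss.backward()
            opt.step()
    return net
```

Down-weighting the synthetic batches is one simple way to keep the knowledge-derived labels from dominating the measured data; the paper's improved training strategy may differ in detail.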
Related papers
- Enhancing Training Data Attribution for Large Language Models with Fitting Error Consideration [74.09687562334682]
We introduce a novel training data attribution method called Debias and Denoise Attribution (DDA)
Our method significantly outperforms existing approaches, achieving an average AUC of 91.64%.
DDA exhibits strong generality and scalability across various sources and different-scale models like LLaMA2, QWEN2, and Mistral.
arXiv Detail & Related papers (2024-10-02T07:14:26Z)
- Applied Federated Model Personalisation in the Industrial Domain: A Comparative Study [5.999474111757664]
Three suggested strategies to tackle this challenge include Active Learning, Knowledge Distillation, and Local Memorization.
The present study delves into the fundamental principles of these three approaches and proposes an advanced Federated Learning System.
The results of the original and optimised models are then compared in both local and federated contexts using a comparative analysis.
arXiv Detail & Related papers (2024-09-10T23:00:19Z)
- Unlearning with Control: Assessing Real-world Utility for Large Language Model Unlearning [97.2995389188179]
Recent research has begun to approach large language model (LLM) unlearning via gradient ascent (GA).
Despite their simplicity and efficiency, we suggest that GA-based methods are prone to excessive unlearning.
We propose several controlling methods that can regulate the extent of excessive unlearning.
arXiv Detail & Related papers (2024-06-13T14:41:00Z)
- A Comprehensive Study on Model Initialization Techniques Ensuring Efficient Federated Learning [0.0]
Federated learning (FL) has emerged as a promising paradigm for training machine learning models in a distributed and privacy-preserving manner.
The choice of model initialization method plays a crucial role in the performance, convergence speed, communication efficiency, and privacy guarantees of federated learning systems.
Our research meticulously compares, categorizes, and delineates the merits and demerits of each technique, examining their applicability across diverse FL scenarios.
arXiv Detail & Related papers (2023-10-31T23:26:58Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Federated Pruning: Improving Neural Network Efficiency with Federated Learning [24.36174705715827]
We propose Federated Pruning to train a reduced model under the federated setting.
We explore different pruning schemes and provide empirical evidence of the effectiveness of our methods.
arXiv Detail & Related papers (2022-09-14T00:48:37Z)
- Learnability of Competitive Threshold Models [11.005966612053262]
We study the learnability of the competitive threshold model from a theoretical perspective.
We demonstrate how competitive threshold models can be seamlessly simulated by artificial neural networks.
arXiv Detail & Related papers (2022-05-08T01:11:51Z)
- Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
- MixKD: Towards Efficient Distillation of Large-scale Language Models [129.73786264834894]
We propose MixKD, a data-agnostic distillation framework, to endow the resulting model with stronger generalization ability.
We prove from a theoretical perspective that under reasonable conditions MixKD gives rise to a smaller gap between the generalization error and the empirical error.
Experiments under a limited-data setting and ablation studies further demonstrate the advantages of the proposed approach (a brief mixup-distillation sketch appears after this list).
arXiv Detail & Related papers (2020-11-01T18:47:51Z)
- Domain Knowledge Integration By Gradient Matching For Sample-Efficient Reinforcement Learning [0.0]
We propose a gradient matching algorithm to improve sample efficiency by utilizing target slope information from the dynamics to aid the model-free learner.
We demonstrate this by presenting a technique for matching the gradient information from the model-based learner with the model-free component in an abstract low-dimensional space.
arXiv Detail & Related papers (2020-05-28T05:02:47Z)
- On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of risk and gradients thereof, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that compared to data augmentation, feature averaging reduces generalization error when used with convex losses, and tightens PAC-Bayes bounds.
arXiv Detail & Related papers (2020-05-01T02:08:58Z)
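The MixKD entry above gives only the headline claim; below is a minimal sketch of the general mixup-distillation idea it refers to (student outputs on mixup-interpolated inputs are matched to interpolated teacher outputs), assuming for simplicity that inputs are continuous vectors such as embeddings. The function name `mixkd_step` and the hyperparameters are illustrative assumptions, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def mixkd_step(student, teacher, x1, x2, opt, alpha=0.4, temperature=2.0):
    """One mixup-distillation step (sketch of the general idea, not MixKD's exact loss)."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    x_mix = lam * x1 + (1 - lam) * x2                        # mixup in input/embedding space
    with torch.no_grad():
        t_mix = lam * teacher(x1) + (1 - lam) * teacher(x2)  # interpolated teacher logits
    s_logits = student(x_mix)
    # Student matches the softened, interpolated teacher distribution.
    loss = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=-1),
        F.softmax(t_mix / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```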