Improving Sample Efficiency of Deep Learning Models in Electricity Market
- URL: http://arxiv.org/abs/2210.05599v1
- Date: Tue, 11 Oct 2022 16:35:13 GMT
- Title: Improving Sample Efficiency of Deep Learning Models in Electricity Market
- Authors: Guangchun Ruan, Jianxiao Wang, Haiwang Zhong, Qing Xia, Chongqing Kang
- Abstract summary: We propose a general framework, namely Knowledge-Augmented Training (KAT), to improve sample efficiency.
We propose a novel data augmentation technique to generate synthetic data, which are then processed by an improved training strategy.
Modern learning theories demonstrate the effectiveness of our method in terms of effective prediction error feedback, a reliable loss function, and rich gradient noise.
- Score: 0.41998444721319217
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The superior performance of deep learning relies heavily on large
collections of sample data, but data insufficiency is relatively common in
global electricity markets. Preventing overfitting in this setting therefore
becomes a fundamental challenge when training deep learning models for
different market applications. With this in mind, we propose a general
framework, Knowledge-Augmented Training (KAT), to improve sample efficiency;
the main idea is to incorporate domain knowledge into the training procedures
of deep learning models. Specifically, we propose a novel data augmentation
technique to generate synthetic data, which are then processed by an improved
training strategy. The KAT methodology follows and realizes the idea of
combining analytical and deep learning models. Modern learning theories
demonstrate the effectiveness of our method in terms of effective prediction
error feedback, a reliable loss function, and rich gradient noise. Finally, we
study two popular applications in detail: user modeling and probabilistic
price forecasting. The proposed method outperforms competing approaches in all
numerical tests, and the underlying reasons are explained by further
statistical and visualization results.
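The abstract describes KAT only at a high level, so here is a minimal sketch of one plausible reading: synthetic samples labeled by an analytical (domain-knowledge) model are mixed into training with a reduced loss weight. The functions `synthesize_samples` and `kat_train`, the analytical stand-in, and the `synth_weight` parameter are illustrative assumptions, not the paper's actual implementation.

```python
import torch
from torch import nn

# Hypothetical analytical model standing in for the domain knowledge
# (e.g., a simplified price-response curve); not the paper's formulation.
def analytical_response(x):
    return torch.tanh(x @ torch.ones(x.shape[1], 1))

def synthesize_samples(n_samples, n_features, analytical_model=analytical_response):
    """Draw random feature vectors and label them with the analytical model."""
    x = torch.rand(n_samples, n_features)
    with torch.no_grad():
        y = analytical_model(x)  # labels come from domain knowledge, not measurements
    return x, y

def kat_train(net, real_loader, synth_loader, epochs=50, synth_weight=0.3, lr=1e-3):
    """Train on real batches plus down-weighted synthetic batches (KAT-style sketch)."""
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for (xr, yr), (xs, ys) in zip(real_loader, synth_loader):
            opt.zero_grad()
            loss = loss_fn(net(xr), yr) + synth_weight * loss_fn(net(xs), ys)
            loss.backward()
            opt.step()
    return net
```

Down-weighting the synthetic batches is one simple way to keep the knowledge-derived labels from dominating the measured data; the paper's improved training strategy may differ in detail.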
Related papers
- Enhancing Training Data Attribution for Large Language Models with Fitting Error Consideration [74.09687562334682]
We introduce a novel training data attribution method called Debias and Denoise Attribution (DDA)
Our method significantly outperforms existing approaches, achieving an average AUC of 91.64%.
DDA exhibits strong generality and scalability across various sources and different-scale models like LLaMA2, QWEN2, and Mistral.
arXiv Detail & Related papers (2024-10-02T07:14:26Z)
- Applied Federated Model Personalisation in the Industrial Domain: A Comparative Study [5.999474111757664]
Three suggested strategies to tackle this challenge include Active Learning, Knowledge Distillation, and Local Memorization.
The present study delves into the fundamental principles of these three approaches and proposes an advanced Federated Learning System.
The results of the original and optimised models are then compared in both local and federated contexts using a comparative analysis.
arXiv Detail & Related papers (2024-09-10T23:00:19Z)
- Unlearning with Control: Assessing Real-world Utility for Large Language Model Unlearning [97.2995389188179]
Recent research has begun to approach large language model (LLM) unlearning via gradient ascent (GA).
Despite their simplicity and efficiency, we suggest that GA-based methods are prone to excessive unlearning.
We propose several controlling methods that can regulate the extent of excessive unlearning.
arXiv Detail & Related papers (2024-06-13T14:41:00Z)
- A Comprehensive Study on Model Initialization Techniques Ensuring Efficient Federated Learning [0.0]
Federated learning (FL) has emerged as a promising paradigm for training machine learning models in a distributed and privacy-preserving manner.
The choice of model initialization method plays a crucial role in the performance, convergence speed, communication efficiency, and privacy guarantees of federated learning systems.
Our research meticulously compares, categorizes, and delineates the merits and demerits of each technique, examining their applicability across diverse FL scenarios.
arXiv Detail & Related papers (2023-10-31T23:26:58Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Federated Pruning: Improving Neural Network Efficiency with Federated Learning [24.36174705715827]
We propose Federated Pruning to train a reduced model under the federated setting.
We explore different pruning schemes and provide empirical evidence of the effectiveness of our methods.
arXiv Detail & Related papers (2022-09-14T00:48:37Z)
- Learnability of Competitive Threshold Models [11.005966612053262]
We study the learnability of the competitive threshold model from a theoretical perspective.
We demonstrate how competitive threshold models can be seamlessly simulated by artificial neural networks.
arXiv Detail & Related papers (2022-05-08T01:11:51Z)
- Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
- MixKD: Towards Efficient Distillation of Large-scale Language Models [129.73786264834894]
We propose MixKD, a data-agnostic distillation framework, to endow the resulting model with stronger generalization ability.
We prove from a theoretical perspective that under reasonable conditions MixKD gives rise to a smaller gap between the generalization error and the empirical error.
Experiments under a limited-data setting and ablation studies further demonstrate the advantages of the proposed approach (a brief mixup-distillation sketch appears after this list).
arXiv Detail & Related papers (2020-11-01T18:47:51Z)
- Domain Knowledge Integration By Gradient Matching For Sample-Efficient Reinforcement Learning [0.0]
We propose a gradient matching algorithm to improve sample efficiency by utilizing target slope information from the dynamics to aid the model-free learner.
We demonstrate this by presenting a technique for matching the gradient information from the model-based learner with the model-free component in an abstract low-dimensional space.
arXiv Detail & Related papers (2020-05-28T05:02:47Z)
- On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of risk and gradients thereof, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that compared to data augmentation, feature averaging reduces generalization error when used with convex losses, and tightens PAC-Bayes bounds.
arXiv Detail & Related papers (2020-05-01T02:08:58Z)
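The MixKD entry above gives only the headline claim; below is a minimal sketch of the general mixup-distillation idea it refers to (student outputs on mixup-interpolated inputs are matched to interpolated teacher outputs), assuming for simplicity that inputs are continuous vectors such as embeddings. The function name `mixkd_step` and the hyperparameters are illustrative assumptions, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def mixkd_step(student, teacher, x1, x2, opt, alpha=0.4, temperature=2.0):
    """One mixup-distillation step (sketch of the general idea, not MixKD's exact loss)."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    x_mix = lam * x1 + (1 - lam) * x2                        # mixup in input/embedding space
    with torch.no_grad():
        t_mix = lam * teacher(x1) + (1 - lam) * teacher(x2)  # interpolated teacher logits
    s_logits = student(x_mix)
    # Student matches the softened, interpolated teacher distribution.
    loss = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=-1),
        F.softmax(t_mix / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```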