Related papers: Using Gradient to Boost the Generalization Performance of Deep Learning Models for Fluid Dynamics

Using Gradient to Boost the Generalization Performance of Deep Learning Models for Fluid Dynamics

URL: http://arxiv.org/abs/2212.00716v1
Date: Sun, 9 Oct 2022 10:20:09 GMT
Title: Using Gradient to Boost the Generalization Performance of Deep Learning Models for Fluid Dynamics
Authors: Eduardo Vital Brasil
Abstract summary: We present a novel work to increase the generalization capabilities of Deep Learning. Our strategy has shown good results towards a better generalization of DL networks.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Nowadays, Computational Fluid Dynamics (CFD) is a fundamental tool for industrial design. However, the computational cost of doing such simulations is expensive and can be detrimental for real-world use cases where many simulations are necessary, such as the task of shape optimization. Recently, Deep Learning (DL) has achieved a significant leap in a wide spectrum of applications and became a good candidate for physical systems, opening perspectives to CFD. To circumvent the computational bottleneck of CFD, DL models have been used to learn on Euclidean data, and more recently, on non-Euclidean data such as unstuctured grids and manifolds, allowing much faster and more efficient (memory, hardware) surrogate models. Nevertheless, DL presents the intrinsic limitation of extrapolating (generalizing) out of training data distribution (design space). In this study, we present a novel work to increase the generalization capabilities of Deep Learning. To do so, we incorporate the physical gradients (derivatives of the outputs w.r.t. the inputs) to the DL models. Our strategy has shown good results towards a better generalization of DL networks and our methodological/ theoretical study is corroborated with empirical validation, including an ablation study.

Related papers

DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs [70.91804882618243]
This paper proposes DSMoE, a novel approach that achieves sparsification by partitioning pre-trained FFN layers into computational blocks. We implement adaptive expert routing using sigmoid activation and straight-through estimators, enabling tokens to flexibly access different aspects of model knowledge. Experiments on LLaMA models demonstrate that under equivalent computational constraints, DSMoE achieves superior performance compared to existing pruning and MoE approaches.
arXiv Detail & Related papers (2025-02-18T02:37:26Z)
Multiscale Stochastic Gradient Descent: Efficiently Training Convolutional Neural Networks [6.805997961535213]
Multiscale Gradient Descent (Multiscale-SGD) is a novel optimization approach that exploits coarse-to-fine training strategies to estimate the gradient at a fraction of the cost. We introduce a new class of learnable, scale-independent Mesh-Free Convolutions (MFCs) that ensure consistent gradient behavior across resolutions. Our results establish a new paradigm for the efficient training of deep networks, enabling practical scalability in high-resolution and multiscale learning tasks.
arXiv Detail & Related papers (2025-01-22T09:13:47Z)
Graph Neural Networks and Differential Equations: A hybrid approach for data assimilation of fluid flows [0.0]
This study presents a novel hybrid approach that combines Graph Neural Networks (GNNs) with Reynolds-Averaged Navier Stokes (RANS) equations. The results demonstrate significant improvements in the accuracy of the reconstructed mean flow compared to purely data-driven models.
arXiv Detail & Related papers (2024-11-14T14:31:52Z)
DCP: Learning Accelerator Dataflow for Neural Network via Propagation [52.06154296196845]
This work proposes an efficient data-centric approach, named Dataflow Code Propagation (DCP), to automatically find the optimal dataflow for DNN layers in seconds without human effort. DCP learns a neural predictor to efficiently update the dataflow codes towards the desired gradient directions to minimize various optimization objectives. For example, without using additional training data, DCP surpasses the GAMMA method that performs a full search using thousands of samples.
arXiv Detail & Related papers (2024-10-09T05:16:44Z)
Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning. Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation. Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning [45.78096783448304]
In this work, seeking data efficiency, we design unsupervised pretraining for PDE operator learning. We mine unlabeled PDE data without simulated solutions, and we pretrain neural operators with physics-inspired reconstruction-based proxy tasks. Our method is highly data-efficient, more generalizable, and even outperforms conventional vision-pretrained models.
arXiv Detail & Related papers (2024-02-24T06:27:33Z)
Training Deep Surrogate Models with Large Scale Online Learning [48.7576911714538]
Deep learning algorithms have emerged as a viable alternative for obtaining fast solutions for PDEs. Models are usually trained on synthetic data generated by solvers, stored on disk and read back for training. It proposes an open source online training framework for deep surrogate models.
arXiv Detail & Related papers (2023-06-28T12:02:27Z)
Meta-Learning for Airflow Simulations with Graph Neural Networks [3.52359746858894]
We present a meta-learning approach to enhance the performance of learned models on out-of-distribution (OoD) samples. Specifically, we set the airflow simulation in CFD over various airfoils as a meta-learning problem, where each set of examples defined on a single airfoil shape is treated as a separate task. We experimentally demonstrate the efficiency of the proposed approach for improving the OoD generalization performance of learned models.
arXiv Detail & Related papers (2023-06-18T19:25:13Z)
Unifying Synergies between Self-supervised Learning and Dynamic Computation [53.66628188936682]
We present a novel perspective on the interplay between SSL and DC paradigms. We show that it is feasible to simultaneously learn a dense and gated sub-network from scratch in a SSL setting. The co-evolution during pre-training of both dense and gated encoder offers a good accuracy-efficiency trade-off.
arXiv Detail & Related papers (2023-01-22T17:12:58Z)
Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks. This article introduces BAIT, a practical representation of tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
Scalable Deep-Learning-Accelerated Topology Optimization for Additively Manufactured Materials [4.221095652322005]
Topology optimization (TO) is a popular and powerful computational approach for designing novel structures, materials, and devices. To address these issues, we propose a general scalable deep-learning (DL) based TO framework, referred to as SDL-TO. Our framework accelerates TO by learning the iterative history data and simultaneously training on the mapping between the given design and its gradient.
arXiv Detail & Related papers (2020-11-28T17:38:31Z)
Learning to Continuously Optimize Wireless Resource In Episodically Dynamic Environment [55.91291559442884]
This work develops a methodology that enables data-driven methods to continuously learn and optimize in a dynamic environment. We propose to build the notion of continual learning into the modeling process of learning wireless systems. Our design is based on a novel min-max formulation which ensures certain fairness" across different data samples.
arXiv Detail & Related papers (2020-11-16T08:24:34Z)

This list is automatically generated from the titles and abstracts of the papers in this site.