Related papers: An Introduction to Automatic Differentiation for Machine Learning

An Introduction to Automatic Differentiation for Machine Learning

URL: http://arxiv.org/abs/2110.06209v1
Date: Tue, 12 Oct 2021 00:10:28 GMT
Title: An Introduction to Automatic Differentiation for Machine Learning
Authors: Davan Harrison
Abstract summary: Neural network models are typically implemented using frameworks that perform gradient based optimization methods to fit a model to a dataset. These frameworks use a technique of calculating derivatives called automatic differentiation (AD) which removes the burden of performing derivative calculations from the model designer.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Machine learning and neural network models in particular have been improving the state of the art performance on many artificial intelligence related tasks. Neural network models are typically implemented using frameworks that perform gradient based optimization methods to fit a model to a dataset. These frameworks use a technique of calculating derivatives called automatic differentiation (AD) which removes the burden of performing derivative calculations from the model designer. In this report we describe AD, its motivations, and different implementation approaches. We briefly describe dataflow programming as it relates to AD. Lastly, we present example programs that are implemented with Tensorflow and PyTorch, which are two commonly used AD frameworks.
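As a rough illustration of the technique the abstract describes, the following is a minimal sketch of reverse-mode AD on scalars in plain Python. It is not taken from the paper, whose examples use Tensorflow and PyTorch; the names Var, sin, and backward are hypothetical and chosen only for this sketch. Each operation records its inputs together with the local partial derivatives, and backward() applies the chain rule over the recorded graph in reverse order.

```python
import math


class Var:
    """Scalar node in a recorded computation graph (hypothetical name)."""

    def __init__(self, value, parents=()):
        self.value = value
        # Each entry is (parent_node, d(self)/d(parent)).
        self.parents = parents
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    def backward(self):
        # Build a topological order so each node is processed only
        # after every node that depends on it.
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in node.parents:
                # Chain rule: accumulate the contribution of this path.
                parent.grad += local * node.grad


def sin(x):
    """Sine of a Var, recording cos(x) as the local derivative."""
    return Var(math.sin(x.value), ((x, math.cos(x.value)),))


x = Var(2.0)
y = Var(3.0)
z = x * y + sin(x)
z.backward()
# Analytically, dz/dx = y + cos(x) and dz/dy = x.
```

After z.backward(), x.grad and y.grad hold the partial derivatives of z, so the caller never writes a derivative by hand. Frameworks such as Tensorflow and PyTorch apply the same idea to tensors with far more machinery.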

Related papers

DaCe AD: Unifying High-Performance Automatic Differentiation for Machine Learning and Scientific Computing [54.73410106410609]
This work presents DaCe AD, a general, efficient automatic differentiation engine that requires no code modifications. DaCe AD uses a novel ILP-based algorithm to optimize the trade-off between storing and recomputing to achieve maximum performance within a given memory constraint.
arXiv Detail & Related papers (2025-09-02T11:09:45Z)
Efficient Machine Unlearning via Influence Approximation [75.31015485113993]
Influence-based unlearning has emerged as a prominent approach to estimate the impact of individual training samples on model parameters without retraining. This paper establishes a theoretical link between memorizing (incremental learning) and forgetting (unlearning). We introduce the Influence Approximation Unlearning algorithm for efficient machine unlearning from the incremental perspective.
arXiv Detail & Related papers (2025-07-31T05:34:27Z)
AI-Aided Kalman Filters [65.35350122917914]
The Kalman filter (KF) and its variants are among the most celebrated algorithms in signal processing. Recent developments illustrate the possibility of fusing deep neural networks (DNNs) with classic Kalman-type filtering. This article provides a tutorial-style overview of design approaches for incorporating AI in aiding KF-type algorithms.
arXiv Detail & Related papers (2024-10-16T06:47:53Z)
Multi-GPU Approach for Training of Graph ML Models on large CFD Meshes [0.0]
Mesh-based numerical solvers are an important part in many design tool chains. Machine Learning based surrogate models are fast in predicting approximate solutions but often lack accuracy. This paper scales a state-of-the-art surrogate model from the domain of graph-based machine learning to industry-relevant mesh sizes.
arXiv Detail & Related papers (2023-07-25T15:49:25Z)
Towards a population-informed approach to the definition of data-driven models for structural dynamics [0.0]
A population-based scheme is followed here and two different machine-learning algorithms from the meta-learning domain are used. The algorithms seem to perform as intended and outperform a traditional machine-learning algorithm at approximating the quantities of interest.
arXiv Detail & Related papers (2023-07-19T09:45:41Z)
MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE). In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity. Experiments on public datasets validate the effectiveness with better validity, sparsity and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
Model Reprogramming: Resource-Efficient Cross-Domain Machine Learning [65.268245109828]
In data-rich domains such as vision, language, and speech, deep learning prevails to deliver high-performance task-specific models. Deep learning in resource-limited domains still faces multiple challenges including (i) limited data, (ii) constrained model development cost, and (iii) lack of adequate pre-trained models for effective finetuning. Model reprogramming enables resource-efficient cross-domain machine learning by repurposing a well-developed pre-trained model from a source domain to solve tasks in a target domain without model finetuning.
arXiv Detail & Related papers (2022-02-22T02:33:54Z)
Assemble Foundation Models for Automatic Code Summarization [9.53949558569201]
We propose a flexible and robust approach for automatic code summarization based on neural networks. We assemble available foundation models, such as CodeBERT and GPT-2, into a single model named AdaMo. We introduce two adaptive schemes from the perspective of knowledge transfer, namely continuous pretraining and intermediate finetuning.
arXiv Detail & Related papers (2022-01-13T21:38:33Z)
Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks. This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
TrackMPNN: A Message Passing Graph Neural Architecture for Multi-Object Tracking [8.791710193028903]
This study follows many previous approaches to multi-object tracking (MOT) that model the problem using graph-based data structures. We create a framework based on dynamic undirected graphs that represent the data association problem over multiple timesteps. We also provide solutions and propositions for the computational problems that need to be addressed to create a memory-efficient, real-time, online algorithm.
arXiv Detail & Related papers (2021-01-11T21:52:25Z)
Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques. Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance. We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
PermuteAttack: Counterfactual Explanation of Machine Learning Credit Scorecards [0.0]
This paper is a note on new directions and methodologies for validation and explanation of Machine Learning (ML) models employed for retail credit scoring in finance. Our proposed framework draws motivation from the field of Artificial Intelligence (AI) security and adversarial ML.
arXiv Detail & Related papers (2020-08-24T00:05:13Z)
Model-Augmented Actor-Critic: Backpropagating through Paths [81.86992776864729]
Current model-based reinforcement learning approaches use the model simply as a learned black-box simulator. We show how to make more effective use of the model by exploiting its differentiability.
arXiv Detail & Related papers (2020-05-16T19:18:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.