Model-Based Deep Learning: On the Intersection of Deep Learning and
Optimization
- URL: http://arxiv.org/abs/2205.02640v1
- Date: Thu, 5 May 2022 13:40:08 GMT
- Title: Model-Based Deep Learning: On the Intersection of Deep Learning and
Optimization
- Authors: Nir Shlezinger, Yonina C. Eldar, and Stephen P. Boyd
- Abstract summary: Decision-making algorithms are used in a multitude of applications.
Deep learning approaches that use highly parametric architectures tuned from data without relying on mathematical models are becoming increasingly popular.
Model-based optimization and data-centric deep learning are often considered to be distinct disciplines.
- Score: 101.32332941117271
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decision-making algorithms are used in a multitude of applications.
Conventional approaches for designing decision algorithms employ principled and
simplified modelling, based on which one can determine decisions via tractable
optimization. More recently, deep learning approaches that use highly
parametric architectures tuned from data without relying on mathematical
models are becoming increasingly popular. Model-based optimization and
data-centric deep learning are often considered to be distinct disciplines.
Here, we characterize them as edges of a continuous spectrum varying in
specificity and parameterization, and provide a tutorial-style presentation of
the methodologies lying in the middle ground of this spectrum, referred to as
model-based deep learning. We accompany our presentation with running examples
in super-resolution and stochastic control, and show how they are expressed
using the provided characterization and specialized in each of the detailed
methodologies. The gains of combining model-based optimization and deep
learning are demonstrated using experimental results in various applications,
ranging from biomedical imaging to digital communications.
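A canonical instance of this middle ground, not detailed in the abstract itself, is deep unfolding: a model-based iterative solver is truncated into a fixed number of "layers" whose model-derived parameters can then be tuned from data (as in LISTA-style networks). The sketch below is a generic illustration under that assumption, not the paper's specific method; the function names, the problem sizes, and the per-layer parameterization are illustrative choices.

```python
import numpy as np

def soft_threshold(x, theta):
    # Proximal operator of the l1 norm: shrink each entry toward zero by theta.
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

def unfolded_ista(y, A, step_sizes, thresholds):
    """Run ISTA for sparse recovery, unfolded into len(step_sizes) layers.

    In classical ISTA the step size and threshold are fixed by the model
    (step = 1/L with L the gradient's Lipschitz constant); in a deep-unfolded
    network they become per-layer parameters that training can adjust.
    """
    x = np.zeros(A.shape[1])
    for mu, theta in zip(step_sizes, thresholds):
        # One layer = one proximal-gradient step on 0.5*||y - Ax||^2 + lambda*||x||_1.
        x = soft_threshold(x + mu * A.T @ (y - A @ x), theta)
    return x

# Model-based initialization of the layer parameters.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 50)) / np.sqrt(30)   # sensing matrix
x_true = np.zeros(50)
x_true[[3, 17, 41]] = [1.0, -2.0, 1.5]            # 3-sparse ground truth
y = A @ x_true
L = np.linalg.norm(A, 2) ** 2                     # squared spectral norm
layers = 100
x_hat = unfolded_ista(y, A, [1.0 / L] * layers, [0.05 / L] * layers)
```

With the model-derived parameters above this reduces to plain ISTA; the "deep learning" half of the spectrum enters when the per-layer step sizes and thresholds are trained end-to-end on example pairs (y, x_true), typically yielding far fewer layers for the same accuracy.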
Related papers
- Applied Federated Model Personalisation in the Industrial Domain: A Comparative Study [5.999474111757664]
Three suggested strategies to tackle this challenge include Active Learning, Knowledge Distillation, and Local Memorization.
The present study delves into the fundamental principles of these three approaches and proposes an advanced Federated Learning System.
The results of the original and optimised models are then compared in both local and federated contexts using a comparative analysis.
arXiv Detail & Related papers (2024-09-10T23:00:19Z)
- Supervised Fine-Tuning as Inverse Reinforcement Learning [8.044033685073003]
The prevailing approach to aligning Large Language Models (LLMs) typically relies on human or AI feedback.
In our work, we question the efficacy of such datasets and explore various scenarios where alignment with expert demonstrations proves more realistic.
arXiv Detail & Related papers (2024-03-18T17:52:57Z)
- Simulation-Based Prior Knowledge Elicitation for Parametric Bayesian Models [2.9172603864294024]
We focus on translating domain expert knowledge into corresponding prior distributions over model parameters, a process known as prior elicitation.
A major challenge for existing elicitation methods is how to effectively utilize the different formats in which expert knowledge arrives in order to formulate prior distributions that align with the expert's expectations, regardless of the model structure.
Our results support the claim that our method is largely independent of the underlying model structure and adaptable to various elicitation techniques, including quantile-based, moment-based, and histogram-based methods.
arXiv Detail & Related papers (2023-08-22T10:43:05Z)
- MinT: Boosting Generalization in Mathematical Reasoning via Multi-View Fine-Tuning [53.90744622542961]
Reasoning in mathematical domains remains a significant challenge for small language models (LMs).
We introduce a new method that exploits existing mathematical problem datasets with diverse annotation styles.
Experimental results show that our strategy enables a LLaMA-7B model to outperform prior approaches.
arXiv Detail & Related papers (2023-07-16T05:41:53Z)
- Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
The main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z)
- Planning with Diffusion for Flexible Behavior Synthesis [125.24438991142573]
We consider what it would look like to fold as much of the trajectory optimization pipeline as possible into the modeling problem.
The core of our technical approach lies in a diffusion probabilistic model that plans by iteratively denoising trajectories.
arXiv Detail & Related papers (2022-05-20T07:02:03Z)
- Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
- Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning [137.39196753245105]
We present a new model-based reinforcement learning algorithm that learns a multi-headed dynamics model for dynamics generalization.
We incorporate context learning, which encodes dynamics-specific information from past experiences into the context latent vector.
Our method exhibits superior zero-shot generalization performance across a variety of control tasks, compared to state-of-the-art RL methods.
arXiv Detail & Related papers (2020-10-26T03:20:42Z)
- Experimental Design for Overparameterized Learning with Application to Single Shot Deep Active Learning [5.141687309207561]
Modern machine learning models are trained on large amounts of labeled data.
Access to large volumes of labeled data is often limited or expensive.
We propose a new design strategy for curating the training set.
arXiv Detail & Related papers (2020-09-27T11:27:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.