Exploring Model Transferability through the Lens of Potential Energy
- URL: http://arxiv.org/abs/2308.15074v1
- Date: Tue, 29 Aug 2023 07:15:57 GMT
- Title: Exploring Model Transferability through the Lens of Potential Energy
- Authors: Xiaotong Li, Zixuan Hu, Yixiao Ge, Ying Shan, Ling-Yu Duan
- Abstract summary: Transfer learning has become crucial in computer vision tasks due to the vast availability of pre-trained deep learning models.
Existing methods for measuring the transferability of pre-trained models rely on statistical correlations between encoded static features and task labels.
We present an insightful physics-inspired approach named PED to address these challenges.
- Score: 78.60851825944212
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transfer learning has become crucial in computer vision tasks due to the vast
availability of pre-trained deep learning models. However, selecting the
optimal pre-trained model from a diverse pool for a specific downstream task
remains a challenge. Existing methods for measuring the transferability of
pre-trained models rely on statistical correlations between encoded static
features and task labels, but they overlook the impact of underlying
representation dynamics during fine-tuning, leading to unreliable results,
especially for self-supervised models. In this paper, we present an insightful
physics-inspired approach named PED to address these challenges. We reframe the
challenge of model selection through the lens of potential energy and directly
model the interaction forces that influence fine-tuning dynamics. By capturing
the motion of dynamic representations as they lower the potential energy within a
force-driven physical model, we obtain an enhanced and more stable
observation for estimating transferability. The experimental results on 10
downstream tasks and 12 self-supervised models demonstrate that our approach
can seamlessly integrate into existing ranking techniques and improve their
performance, revealing its effectiveness for the model selection task and its
potential for understanding the mechanism in transfer learning. Code will be
available at https://github.com/lixiaotong97/PED.
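
The force-driven picture described in the abstract can be illustrated with a toy sketch. This is not the authors' PED implementation (their repository is the reference for that). The inverse-distance potential between class centers, the single explicit update step, and every function name below are assumptions chosen only to show what "representations moving to lower a potential energy" can look like in code.

```python
import numpy as np


def class_centers(features, labels):
    """Return per-class mean features and, for each sample, its class index."""
    classes, inverse = np.unique(labels, return_inverse=True)
    centers = np.stack(
        [features[inverse == k].mean(axis=0) for k in range(len(classes))]
    )
    return centers, inverse


def pairwise(centers):
    """Return difference vectors diffs[i, j] = c_i - c_j and their norms."""
    diffs = centers[:, None, :] - centers[None, :, :]
    return diffs, np.linalg.norm(diffs, axis=-1)


def potential_energy(centers):
    """Toy potential (assumption): sum of 1 / distance over distinct center pairs."""
    _, dist = pairwise(centers)
    upper = np.triu_indices(len(centers), k=1)
    return float(np.sum(1.0 / dist[upper]))


def repulsive_step(features, labels, step=0.1):
    """Shift every sample along the net repulsive force acting on its class center.

    The force is the negative gradient of the toy potential above, so a small
    step lowers that potential. Class centers are assumed to be distinct.
    """
    centers, inverse = class_centers(features, labels)
    diffs, dist = pairwise(centers)
    np.fill_diagonal(dist, np.inf)  # a center exerts no force on itself
    forces = (diffs / dist[..., None] ** 3).sum(axis=1)
    return features + step * forces[inverse]


# Hypothetical toy data: three 2-D clusters standing in for encoded features.
rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
labels = np.repeat(np.arange(3), 20)
features = means[labels] + 0.1 * rng.normal(size=(60, 2))

moved = repulsive_step(features, labels)
before = potential_energy(class_centers(features, labels)[0])
after = potential_energy(class_centers(moved, labels)[0])
print(f"potential energy before: {before:.4f}, after: {after:.4f}")
```

In a pipeline of this shape, the adjusted features (`moved` here) would be handed to an existing transferability metric in place of the static ones, which is one way to read the abstract's claim that the approach integrates into existing ranking techniques.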
Related papers
- Latent-Predictive Empowerment: Measuring Empowerment without a Simulator [56.53777237504011]
We present Latent-Predictive Empowerment (LPE), an algorithm that can compute empowerment in a more practical manner.
LPE learns large skillsets by maximizing an objective that is a principled replacement for the mutual information between skills and states.
Date: 2024-10-15T00:41:18Z
- Enhancing Dynamical System Modeling through Interpretable Machine Learning Augmentations: A Case Study in Cathodic Electrophoretic Deposition [0.8796261172196743]
We introduce a comprehensive data-driven framework aimed at enhancing the modeling of physical systems.
As a demonstrative application, we pursue the modeling of cathodic electrophoretic deposition (EPD), commonly known as e-coating.
Date: 2024-01-16T14:58:21Z
- Model-Based Reinforcement Learning with Multi-Task Offline Pretraining [59.82457030180094]
We present a model-based RL method that learns to transfer potentially useful dynamics and action demonstrations from offline data to a novel task.
The main idea is to use the world models not only as simulators for behavior learning but also as tools to measure the task relevance.
We demonstrate the advantages of our approach compared with the state-of-the-art methods in Meta-World and DeepMind Control Suite.
Date: 2023-06-06T02:24:41Z
- Model-Based Reinforcement Learning with Isolated Imaginations [61.67183143982074]
We propose Iso-Dream++, a model-based reinforcement learning approach.
We perform policy optimization based on the decoupled latent imaginations.
This enables long-horizon visuomotor control tasks to benefit from isolating mixed dynamics sources in the wild.
Date: 2023-03-27T02:55:56Z
- STDEN: Towards Physics-Guided Neural Networks for Traffic Flow Prediction [31.49270000605409]
The lack of integration between physical principles and data-driven models is a major factor limiting progress in this field.
We propose a physics-guided deep learning model named Spatio-Temporal Differential Equation Network (STDEN), which casts the physical mechanism of traffic flow dynamics into a deep neural network framework.
Experiments on three real-world traffic datasets in Beijing show that our model outperforms state-of-the-art baselines by a significant margin.
Date: 2022-09-01T04:58:18Z
- Your Autoregressive Generative Model Can be Better If You Treat It as an Energy-Based One [83.5162421521224]
We propose a unique method termed E-ARM for training autoregressive generative models.
E-ARM takes advantage of a well-designed energy-based learning objective.
We show that E-ARM can be trained efficiently and is capable of alleviating the exposure bias problem.
Date: 2022-06-26T10:58:41Z
- End-to-End Learning of Hybrid Inverse Dynamics Models for Precise and Compliant Impedance Control [16.88250694156719]
We present a novel hybrid model formulation that enables us to identify fully physically consistent inertial parameters of a rigid body dynamics model.
We compare our approach against state-of-the-art inverse dynamics models on a 7 degree of freedom manipulator.
Date: 2022-05-27T07:39:28Z
- Planning from Pixels using Inverse Dynamics Models [44.16528631970381]
We propose a novel way to learn latent world models by learning to predict sequences of future actions conditioned on task completion.
We evaluate our method on challenging visual goal completion tasks and show a substantial increase in performance compared to prior model-free approaches.
Date: 2020-12-04T06:07:36Z
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
Date: 2020-07-14T16:42:59Z
This list is automatically generated from the titles and abstracts of the papers on this site.