Improving Gradient-Trend Identification: Fast-Adaptive Moment Estimation
with Finance-Inspired Triple Exponential Moving Average
- URL: http://arxiv.org/abs/2306.01423v2
- Date: Thu, 21 Dec 2023 08:39:17 GMT
- Title: Improving Gradient-Trend Identification: Fast-Adaptive Moment Estimation
with Finance-Inspired Triple Exponential Moving Average
- Authors: Roi Peleg, Teddy Lazebnik, Assaf Hoogi
- Abstract summary: We introduce a novel optimizer called fast-adaptive moment estimation (FAME).
Inspired by the triple exponential moving average (TEMA) used in the financial domain, FAME improves the precision of identifying gradient trends.
By introducing TEMA into the optimization process, FAME identifies gradient trends with higher accuracy and less lag.
- Score: 2.480023305418
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The performance of deep networks depends significantly on their
optimizers. With existing optimizers, precise and efficient recognition of the
gradient trend remains a challenge. Existing optimizers predominantly adopt
techniques based on the first-order exponential moving average (EMA), which
results in noticeable delays that impede real-time tracking of the gradient
trend and consequently yields sub-optimal performance. To overcome this
limitation, we introduce a novel optimizer called fast-adaptive moment
estimation (FAME). Inspired by the triple exponential moving average (TEMA)
used in the financial domain, FAME leverages the potency of higher-order TEMA
to improve the precision of identifying gradient trends. TEMA plays a central
role in the learning process as it actively influences optimization dynamics;
this role differs from its conventional passive role as a technical indicator
in financial contexts. Because of the introduction of TEMA into the
optimization process, FAME can identify gradient trends with higher accuracy
and fewer lag issues, thereby offering smoother and more consistent responses
to gradient fluctuations compared to conventional first-order EMA. To study the
effectiveness of our novel FAME optimizer, we conducted comprehensive
experiments encompassing six diverse computer-vision benchmarks and tasks,
spanning detection, classification, and semantic comprehension. We integrated
FAME into 15 learning architectures and compared its performance with those of
six popular optimizers. Results clearly showed that FAME is more robust and
accurate and provides superior performance stability by minimizing noise (i.e.,
trend fluctuations). Notably, FAME achieves higher accuracy levels in
remarkably fewer training epochs than its counterparts, clearly indicating its
significance for optimizing deep networks in computer-vision tasks.
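Code sketch (not from the paper): to make the TEMA mechanism concrete, below is a minimal NumPy sketch of an Adam-style update in which the first moment's EMA is replaced by a finance-style TEMA, i.e., 3*EMA1 - 3*EMA2 + EMA3 computed over three cascaded EMAs of the gradient. The function name fame_step, the hyperparameters, and the bias correction are illustrative assumptions; the paper defines the exact FAME update rule.

    import numpy as np

    def fame_step(theta, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        # One optimizer step. The first moment is a triple exponential moving
        # average (TEMA) of the gradient instead of Adam's single EMA.
        state["e1"] = beta1 * state["e1"] + (1 - beta1) * grad          # EMA of grad
        state["e2"] = beta1 * state["e2"] + (1 - beta1) * state["e1"]   # EMA of EMA
        state["e3"] = beta1 * state["e3"] + (1 - beta1) * state["e2"]   # EMA of EMA of EMA
        # Finance-style TEMA: the 3/-3/+1 combination cancels most single-EMA lag.
        m = 3 * state["e1"] - 3 * state["e2"] + state["e3"]
        # Second moment kept as in Adam.
        state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2
        state["t"] += 1
        # Adam-style bias correction (an assumption; the paper's rule may differ).
        m_hat = m / (1 - beta1 ** state["t"])
        v_hat = state["v"] / (1 - beta2 ** state["t"])
        return theta - lr * m_hat / (np.sqrt(v_hat) + eps)

    # Toy usage: minimize f(theta) = theta^2 starting from theta = 5.
    state = {"e1": 0.0, "e2": 0.0, "e3": 0.0, "v": 0.0, "t": 0}
    theta = 5.0
    for _ in range(200):
        theta = fame_step(theta, 2.0 * theta, state, lr=0.1)

The 3/-3/+1 weighting is what lets TEMA respond to a change in the gradient trend faster than a single EMA with the same smoothing factor, which is the property the abstract credits for FAME's faster, less laggy trend tracking.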
Related papers
- Understanding Optimization in Deep Learning with Central Flows [53.66160508990508]
We show that RMSProp's implicit behavior can be explicitly captured by a "central flow": a differential equation.
We show that these flows can empirically predict long-term optimization trajectories of generic neural networks.
arXiv Detail & Related papers (2024-10-31T17:58:13Z)
- Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate [105.86576388991713]
We introduce a normalized gradient difference (NGDiff) algorithm, enabling us to have better control over the trade-off between the objectives.
We provide a theoretical analysis and empirically demonstrate the superior performance of NGDiff among state-of-the-art unlearning methods on the TOFU and MUSE datasets.
arXiv Detail & Related papers (2024-10-29T14:41:44Z)
- Improving Instance Optimization in Deformable Image Registration with Gradient Projection [7.6061804149819885]
Deformable image registration is inherently a multi-objective optimization problem.
These conflicting objectives often lead to poor optimization outcomes.
Deep learning methods have recently gained popularity in this domain due to their efficiency in processing large datasets.
arXiv Detail & Related papers (2024-10-21T08:27:13Z)
- HGSLoc: 3DGS-based Heuristic Camera Pose Refinement [13.393035855468428]
Visual localization refers to the process of determining camera poses and orientation within a known scene representation.
In this paper, we propose HGSLoc, which integrates 3D reconstruction with a refinement strategy to achieve higher pose estimation accuracy.
Our method demonstrates a faster rendering speed and higher localization accuracy compared to NeRF-based neural rendering approaches.
arXiv Detail & Related papers (2024-09-17T06:48:48Z)
- Track Everything Everywhere Fast and Robustly [46.362962852140015]
We propose a novel test-time optimization approach for efficiently tracking any pixel in a video.
We introduce a novel invertible deformation network, CaDeX++, which factorizes the function representation into a local spatial-temporal feature grid.
Our experiments demonstrate a substantial improvement in training speed (more than 10 times faster), robustness, and accuracy in tracking over the SoTA optimization-based method OmniMotion.
arXiv Detail & Related papers (2024-03-26T17:58:22Z)
- Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers [109.52244418498974]
We propose a novel Admeta (A Double exponential Moving averagE To Adaptive and non-adaptive momentum) framework.
We provide two implementations, AdmetaR and AdmetaS, the former based on RAdam and the latter based on SGDM.
arXiv Detail & Related papers (2023-07-02T18:16:06Z)
- Learning Large-scale Neural Fields via Context Pruned Meta-Learning [60.93679437452872]
We introduce an efficient optimization-based meta-learning technique for large-scale neural field training.
We show how gradient re-scaling at meta-test time allows the learning of extremely high-quality neural fields.
Our framework is model-agnostic, intuitive, straightforward to implement, and shows significant reconstruction improvements for a wide range of signals.
arXiv Detail & Related papers (2023-02-01T17:32:16Z)
- Optimization-Inspired Learning with Architecture Augmentations and Control Mechanisms for Low-Level Vision [74.9260745577362]
This paper proposes a unified optimization-inspired learning framework to aggregate Generative, Discriminative, and Corrective (GDC) principles.
We construct three propagative modules to effectively solve the optimization models with flexible combinations.
Experiments across varied low-level vision tasks validate the efficacy and adaptability of GDC.
arXiv Detail & Related papers (2020-12-10T03:24:53Z)
- Transferable Graph Optimizers for ML Compilers [18.353830282858834]
We propose an end-to-end, transferable deep reinforcement learning method for computational graph optimization (GO).
GO generates decisions on the entire graph rather than on each individual node autoregressively, drastically speeding up the search compared to prior methods.
GO achieves 21% improvement over human experts and 18% improvement over the prior state of the art with 15x faster convergence.
arXiv Detail & Related papers (2020-10-21T20:28:33Z)
- Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z)