Related papers: Zero Grads: Learning Local Surrogate Losses for Non-Differentiable Graphics

Zero Grads: Learning Local Surrogate Losses for Non-Differentiable Graphics

URL: http://arxiv.org/abs/2308.05739v2
Date: Tue, 7 May 2024 14:50:00 GMT
Title: Zero Grads: Learning Local Surrogate Losses for Non-Differentiable Graphics
Authors: Michael Fischer, Tobias Ritschel,
Abstract summary: We show how to automatically optimize graphics for undefined or zero gradients. The fitting is performed online, alongside the parameter optimization, and self-supervised, without pre-computed data or pre-trained models. In contrast to other algorithms, our approach scales well to higher dimensions, which we demonstrate on problems with up to 35k variables.
Score: 13.558059987555234
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Gradient-based optimization is now ubiquitous across graphics, but unfortunately can not be applied to problems with undefined or zero gradients. To circumvent this issue, the loss function can be manually replaced by a ``surrogate'' that has similar minima but is differentiable. Our proposed framework, ZeroGrads, automates this process by learning a neural approximation of the objective function, which in turn can be used to differentiate through arbitrary black-box graphics pipelines. We train the surrogate on an actively smoothed version of the objective and encourage locality, focusing the surrogate's capacity on what matters at the current training episode. The fitting is performed online, alongside the parameter optimization, and self-supervised, without pre-computed data or pre-trained models. As sampling the objective is expensive (it requires a full rendering or simulator run), we devise an efficient sampling scheme that allows for tractable run-times and competitive performance at little overhead. We demonstrate optimizing diverse non-convex, non-differentiable black-box problems in graphics, such as visibility in rendering, discrete parameter spaces in procedural modelling or optimal control in physics-driven animation. In contrast to other derivative-free algorithms, our approach scales well to higher dimensions, which we demonstrate on problems with up to 35k interlinked variables.

Related papers

Locality-aware Surrogates for Gradient-based Black-box Optimization [5.541187309945154]
We propose locality-aware surrogate models for model-based black-box optimization. We first establish a theoretical connection between gradient alignment and the minimization of a Gradient Path Integral Equation loss. We develop a scalable training algorithm that minimizes the GradPIE loss, enabling both offline and online learning.
arXiv Detail & Related papers (2025-01-31T14:28:47Z)
Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems. Such problems are encountered in medicine, physics, and machine learning. We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z)
Unified Gradient-Based Machine Unlearning with Remain Geometry Enhancement [29.675650285351768]
Machine unlearning (MU) has emerged to enhance the privacy and trustworthiness of deep neural networks. Approximate MU is a practical method for large-scale models. We propose a fast-slow parameter update strategy to implicitly approximate the up-to-date salient unlearning direction.
arXiv Detail & Related papers (2024-09-29T15:17:33Z)
Just One Byte (per gradient): A Note on Low-Bandwidth Decentralized Language Model Finetuning Using Shared Randomness [86.61582747039053]
Language model training in distributed settings is limited by the communication cost of exchanges. We extend recent work using shared randomness to perform distributed fine-tuning with low bandwidth.
arXiv Detail & Related papers (2023-06-16T17:59:51Z)
Target-based Surrogates for Stochastic Optimization [26.35752393302125]
We consider minimizing functions for which it is expensive to compute the (possibly) gradient. Such functions are prevalent in computation reinforcement learning, imitation learning and adversarial training. Our framework allows the use of standard optimization algorithms to construct surrogates which can be minimized efficiently.
arXiv Detail & Related papers (2023-02-06T08:08:34Z)
Plateau-reduced Differentiable Path Tracing [18.174063717952187]
We show that inverse rendering might not converge due to inherent plateaus, i.e. regions of zero gradient, in the objective function. We propose to alleviate this by convolving the high-dimensional rendering function that maps parameters to images with an additional kernel that blurs the parameter space.
arXiv Detail & Related papers (2022-11-30T18:58:53Z)
RISP: Rendering-Invariant State Predictor with Differentiable Simulation and Rendering for Cross-Domain Parameter Estimation [110.4255414234771]
Existing solutions require massive training data or lack generalizability to unknown rendering configurations. We propose a novel approach that marries domain randomization and differentiable rendering gradients to address this problem. Our approach achieves significantly lower reconstruction errors and has better generalizability among unknown rendering configurations.
arXiv Detail & Related papers (2022-05-11T17:59:51Z)
DiffSkill: Skill Abstraction from Differentiable Physics for Deformable Object Manipulations with Tools [96.38972082580294]
DiffSkill is a novel framework that uses a differentiable physics simulator for skill abstraction to solve deformable object manipulation tasks. In particular, we first obtain short-horizon skills using individual tools from a gradient-based simulator. We then learn a neural skill abstractor from the demonstration trajectories which takes RGBD images as input.
arXiv Detail & Related papers (2022-03-31T17:59:38Z)
Joint inference and input optimization in equilibrium networks [68.63726855991052]
deep equilibrium model is a class of models that foregoes traditional network depth and instead computes the output of a network by finding the fixed point of a single nonlinear layer. We show that there is a natural synergy between these two settings. We demonstrate this strategy on various tasks such as training generative models while optimizing over latent codes, training models for inverse problems like denoising and inpainting, adversarial training and gradient based meta-learning.
arXiv Detail & Related papers (2021-11-25T19:59:33Z)
Self-Tuning Stochastic Optimization with Curvature-Aware Gradient Filtering [53.523517926927894]
We explore the use of exact per-sample Hessian-vector products and gradients to construct self-tuning quadratics. We prove that our model-based procedure converges in noisy gradient setting. This is an interesting step for constructing self-tuning quadratics.
arXiv Detail & Related papers (2020-11-09T22:07:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.