Tunable Convolutions with Parametric Multi-Loss Optimization
- URL: http://arxiv.org/abs/2304.00898v1
- Date: Mon, 3 Apr 2023 11:36:10 GMT
- Title: Tunable Convolutions with Parametric Multi-Loss Optimization
- Authors: Matteo Maggioni, Thomas Tanay, Francesca Babiloni, Steven McDonagh, Aleš Leonardis
- Abstract summary: The behavior of neural networks is irremediably determined by the specific loss and data used during training.
It is often desirable to tune the model at inference time based on external factors such as preferences of the user or dynamic characteristics of the data.
This is especially important to balance the perception-distortion trade-off of ill-posed image-to-image translation tasks.
- Score: 5.658123802733283
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The behavior of neural networks is irremediably determined by the specific
loss and data used during training. However, it is often desirable to tune the model
at inference time based on external factors such as preferences of the user or
dynamic characteristics of the data. This is especially important to balance
the perception-distortion trade-off of ill-posed image-to-image translation
tasks. In this work, we propose to optimize a parametric tunable convolutional
layer, which includes a number of different kernels, using a parametric
multi-loss, which includes an equal number of objectives. Our key insight is to
use a shared set of parameters to dynamically interpolate both the objectives
and the kernels. During training, these parameters are sampled at random to
explicitly optimize all possible combinations of objectives and consequently
disentangle their effect into the corresponding kernels. During inference,
these parameters become interactive inputs of the model, hence enabling reliable
and consistent control over the model behavior. Extensive experimental results
demonstrate that our tunable convolutions effectively work as a drop-in
replacement for traditional convolutions in existing neural networks at
virtually no extra computational cost, outperforming state-of-the-art control
strategies in a wide range of applications, including image denoising,
deblurring, super-resolution, and style transfer.
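The following is a minimal, self-contained PyTorch sketch of the idea described in the abstract, not the authors' implementation: a layer holding p kernel banks is blended by a shared parameter vector w, and the same w weights p training objectives; w is sampled at random during training and supplied interactively at inference. All names (TunableConv2d, parametric_multi_loss) and the particular pair of losses are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TunableConv2d(nn.Module):
    """Convolution whose kernel is a convex combination of p kernel banks,
    blended by an external parameter vector w (one weight per objective)."""

    def __init__(self, in_ch, out_ch, kernel_size, p, padding=1):
        super().__init__()
        # p parallel kernel banks and biases, one per objective
        self.weight = nn.Parameter(
            0.01 * torch.randn(p, out_ch, in_ch, kernel_size, kernel_size))
        self.bias = nn.Parameter(torch.zeros(p, out_ch))
        self.padding = padding

    def forward(self, x, w):
        # w: shape (p,), non-negative and summing to one
        kernel = torch.einsum('p,poikl->oikl', w, self.weight)
        bias = torch.einsum('p,po->o', w, self.bias)
        return F.conv2d(x, kernel, bias, padding=self.padding)

def parametric_multi_loss(losses, w):
    # The same weights interpolate the objectives.
    return sum(wi * li for wi, li in zip(w, losses))

# Training step: sample w at random so every combination of objectives
# is optimized and its effect is disentangled into the kernels.
p = 2                                   # e.g. a fidelity and a perceptual objective
model = TunableConv2d(3, 3, 3, p)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
x, target = torch.rand(4, 3, 32, 32), torch.rand(4, 3, 32, 32)  # toy batch

w = torch.distributions.Dirichlet(torch.ones(p)).sample()  # random convex weights
out = model(x, w)
losses = [F.mse_loss(out, target),      # distortion-oriented objective
          F.l1_loss(out, target)]       # stand-in for a perceptual objective
loss = parametric_multi_loss(losses, w)
opt.zero_grad(); loss.backward(); opt.step()

# Inference: w becomes an interactive input, e.g. 30% fidelity / 70% perceptual.
with torch.no_grad():
    out = model(x, torch.tensor([0.3, 0.7]))
```

In the paper, such tunable layers act as drop-in replacements for the convolutions of an existing network; the single-layer toy setup above is only meant to show how one shared parameter vector couples the kernels and the objectives.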
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Adaptive Sampling for Continuous Group Equivariant Neural Networks [5.141137421503899]
We introduce an adaptive sampling approach that dynamically adjusts the sampling process to the symmetries in the data.
Our findings demonstrate improved model performance and a marginal increase in memory efficiency.
arXiv Detail & Related papers (2024-09-13T11:50:09Z)
- Subject-specific Deep Neural Networks for Count Data with High-cardinality Categorical Features [1.2289361708127877]
We propose a novel hierarchical likelihood learning framework for introducing gamma random effects into a Poisson deep neural network.
The proposed method simultaneously yields maximum likelihood estimators for fixed parameters and best unbiased predictors for random effects.
State-of-the-art network architectures can be easily implemented into the proposed h-likelihood framework.
arXiv Detail & Related papers (2023-10-18T01:54:48Z)
- Sparse-Input Neural Network using Group Concave Regularization [10.103025766129006]
Simultaneous feature selection and non-linear function estimation are challenging in neural networks.
We propose a framework of sparse-input neural networks using group concave regularization for feature selection in both low-dimensional and high-dimensional settings.
arXiv Detail & Related papers (2023-07-01T13:47:09Z)
- End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
- Bayesian optimization of distributed neurodynamical controller models for spatial navigation [1.9249287163937971]
We introduce the NeuroSwarms controller, in which agent-based interactions are modeled by analogy to neuronal network interactions.
This complexity precludes linear analyses of stability, controllability, and performance typically used to study conventional swarm models.
We present a framework for tuning dynamical controller models of autonomous multi-agent systems based on Bayesian Optimization.
arXiv Detail & Related papers (2021-10-31T21:43:06Z)
- Rate Distortion Characteristic Modeling for Neural Image Compression [59.25700168404325]
End-to-end optimization capability offers neural image compression (NIC) superior lossy compression performance.
However, distinct models must be trained to reach different points in the rate-distortion (R-D) space.
We make efforts to formulate the essential mathematical functions to describe the R-D behavior of NIC using deep network and statistical modeling.
arXiv Detail & Related papers (2021-06-24T12:23:05Z)
- A Flexible Framework for Designing Trainable Priors with Adaptive Smoothing and Game Encoding [57.1077544780653]
We introduce a general framework for designing and training neural network layers whose forward passes can be interpreted as solving non-smooth convex optimization problems.
We focus on convex games, solved by local agents represented by the nodes of a graph and interacting through regularization functions.
This approach is appealing for solving imaging problems, as it allows the use of classical image priors within deep models that are trainable end to end.
arXiv Detail & Related papers (2020-06-26T08:34:54Z)
- Neural Control Variates [71.42768823631918]
We show that a set of neural networks can face the challenge of finding a good approximation of the integrand.
We derive a theoretically optimal, variance-minimizing loss function, and propose an alternative, composite loss for stable online training in practice.
Specifically, we show that the learned light-field approximation is of sufficient quality for high-order bounces, allowing us to omit the error correction and thereby dramatically reduce the noise at the cost of negligible visible bias.
arXiv Detail & Related papers (2020-06-02T11:17:55Z)
- Understanding the Effects of Data Parallelism and Sparsity on Neural Network Training [126.49572353148262]
We study two factors in neural network training: data parallelism and sparsity.
Despite their promising benefits, understanding of their effects on neural network training remains elusive.
arXiv Detail & Related papers (2020-03-25T10:49:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.