Lifting Architectural Constraints of Injective Flows
- URL: http://arxiv.org/abs/2306.01843v5
- Date: Thu, 27 Jun 2024 06:51:18 GMT
- Title: Lifting Architectural Constraints of Injective Flows
- Authors: Peter Sorrenson, Felix Draxler, Armand Rousselot, Sander Hummerich, Lea Zimmermann, Ullrich Köthe
- Abstract summary: Normalizing Flows explicitly maximize a full-dimensional likelihood on the training data.
Injective Flows fix this by jointly learning a manifold and the distribution on it.
We show that naively learning both the data manifold and the distribution on it can lead to divergent solutions.
- Score: 7.452460759055847
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Normalizing Flows explicitly maximize a full-dimensional likelihood on the training data. However, real data is typically only supported on a lower-dimensional manifold, leading the model to expend significant compute on modeling noise. Injective Flows fix this by jointly learning a manifold and the distribution on it. So far, they have been limited by restrictive architectures and/or high computational cost. We lift both constraints with a new efficient estimator for the maximum likelihood loss, compatible with free-form bottleneck architectures. We further show that naively learning both the data manifold and the distribution on it can lead to divergent solutions, and use this insight to motivate a stable maximum likelihood training objective. We perform extensive experiments on toy, tabular and image data, demonstrating the competitive performance of the resulting model.
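As a rough illustration of the setup described in the abstract, the sketch below trains an unconstrained (free-form) encoder/decoder bottleneck with a reconstruction term, a standard-normal latent prior, and a single-probe surrogate for the log-determinant term. All names, network sizes, and the specific Hutchinson-style surrogate are illustrative assumptions, not the authors' implementation of their estimator.

```python
import torch
import torch.nn as nn

# Hypothetical free-form bottleneck pair; sizes are placeholders.
data_dim, latent_dim = 64, 8
encoder = nn.Sequential(nn.Linear(data_dim, 256), nn.SiLU(), nn.Linear(256, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.SiLU(), nn.Linear(256, data_dim))

def sketch_loss(x, beta=10.0):
    """Reconstruction + latent prior - surrogate log-det (simplified stand-in)."""
    x = x.requires_grad_(True)
    z = encoder(x)
    x_hat = decoder(z)

    recon = ((x - x_hat) ** 2).sum(dim=1)   # keeps the learned manifold near the data
    prior = 0.5 * (z ** 2).sum(dim=1)        # standard-normal negative log-density in latent space

    # Single random probe pairing a decoder jvp with a stop-gradded encoder vjp,
    # standing in for the log|det| term of the likelihood on the manifold.
    v = torch.randn_like(z)
    _, Jdec_v = torch.autograd.functional.jvp(decoder, z, v, create_graph=True)
    vJ_enc = torch.autograd.grad(z, x, grad_outputs=v, retain_graph=True)[0].detach()
    logdet_surrogate = (vJ_enc * Jdec_v).sum(dim=1)

    return (beta * recon + prior - logdet_surrogate).mean()

# Illustrative usage on random data standing in for a real dataset.
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
for step in range(100):
    batch = torch.randn(128, data_dim)
    loss = sketch_loss(batch)
    opt.zero_grad()
    loss.backward()
    opt.step()
```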
Related papers
- Expanding Expressiveness of Diffusion Models with Limited Data via Self-Distillation based Fine-Tuning [24.791783885165923]
Training diffusion models on limited datasets poses challenges in terms of limited generation capacity and expressiveness.
We propose Self-Distillation for Fine-Tuning diffusion models (SDFT) to address these challenges.
arXiv Detail & Related papers (2023-11-02T06:24:06Z)
- Balancing Act: Constraining Disparate Impact in Sparse Models [20.058720715290434]
We propose a constrained optimization approach that directly addresses the disparate impact of pruning.
Our formulation bounds the accuracy change between the dense and sparse models, for each sub-group.
Experimental results demonstrate that our technique scales reliably to problems involving large models and hundreds of protected sub-groups.
arXiv Detail & Related papers (2023-10-31T17:37:35Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- ChiroDiff: Modelling chirographic data with Diffusion Models [132.5223191478268]
We introduce a powerful model class, namely Denoising Diffusion Probabilistic Models (DDPMs), for chirographic data.
Our model, named "ChiroDiff", is non-autoregressive, learns to capture holistic concepts, and therefore remains resilient to higher temporal sampling rates.
arXiv Detail & Related papers (2023-04-07T15:17:48Z)
- Towards Efficient Task-Driven Model Reprogramming with Foundation Models [52.411508216448716]
Vision foundation models exhibit impressive power, benefiting from the extremely large model capacity and broad training data.
However, in practice, downstream scenarios may only support a small model due to the limited computational resources or efficiency considerations.
This brings a critical challenge for the real-world application of foundation models: one has to transfer the knowledge of a foundation model to the downstream task.
arXiv Detail & Related papers (2023-04-05T07:28:33Z)
- Wasserstein Distributional Learning [5.830831796910439]
Wasserstein Distributional Learning (WDL) is a flexible density-on-scalar regression modeling framework.
We show that WDL better characterizes and uncovers the nonlinear dependence of the conditional densities.
We demonstrate the effectiveness of the WDL framework through simulations and real-world applications.
arXiv Detail & Related papers (2022-09-12T02:32:17Z)
- Multi-Scale Architectures Matter: On the Adversarial Robustness of Flow-based Lossless Compression [16.109578069331135]
Flow-based models perform better due to their excellent probability density estimation and satisfactory inference speed.
Multi-scale architecture provides a shortcut from the shallow layer to the output layer.
Flows with multi-scale architecture achieve the best trade-off between coding complexity and compression efficiency.
arXiv Detail & Related papers (2022-08-26T15:17:43Z)
- ManiFlow: Implicitly Representing Manifolds with Normalizing Flows [145.9820993054072]
Normalizing Flows (NFs) are flexible explicit generative models that have been shown to accurately model complex real-world data distributions.
We propose an optimization objective that recovers the most likely point on the manifold given a sample from the perturbed distribution.
Finally, we focus on 3D point clouds, for which we utilize the explicit nature of NFs, i.e., surface normals extracted from the gradient of the log-likelihood, and the log-likelihood itself.
arXiv Detail & Related papers (2022-08-18T16:07:59Z)
- Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers.
We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster.
arXiv Detail & Related papers (2021-06-25T22:08:51Z)
- Self Normalizing Flows [65.73510214694987]
We propose a flexible framework for training normalizing flows by replacing expensive terms in the gradient by learned approximate inverses at each layer.
This reduces the computational complexity of each layer's exact update from $\mathcal{O}(D^3)$ to $\mathcal{O}(D^2)$.
We show experimentally that such models are remarkably stable and optimize to similar data likelihood values as their exact gradient counterparts. (A rough sketch of this layer-wise idea appears after this list.)
arXiv Detail & Related papers (2020-11-14T09:51:51Z)
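For the Self Normalizing Flows entry above, the sketch below illustrates the layer-wise idea as its summary describes it: a learned approximate inverse stands in for the expensive $W^{-T}$ term in the gradient of $\log|\det W|$. The toy linear layer, the single-sided reconstruction penalty, and all names are assumptions for illustration, not that paper's exact algorithm.

```python
import torch

D = 16
W = torch.randn(D, D) * 0.1 + torch.eye(D)   # forward weight of a toy linear flow layer z = W x
R = torch.inverse(W).clone()                  # learned approximate inverse, kept close to W^{-1} by SGD
W.requires_grad_(True)
R.requires_grad_(True)
opt = torch.optim.SGD([W, R], lr=1e-3)

for step in range(100):
    x = torch.randn(64, D)                    # toy data batch
    z = x @ W.T                               # forward pass of the layer

    # Standard-normal base density term of the negative log-likelihood.
    nll_base = 0.5 * (z ** 2).sum(dim=1).mean()

    # Surrogate for the -log|det W| term: its exact gradient w.r.t. W is -W^{-T} (O(D^3));
    # with R detached, -(R^T * W).sum() has gradient -R^T instead, at O(D^2) cost.
    # Only the gradient of this term is meaningful, not its value.
    logdet_surrogate = -(R.detach().t() * W).sum()

    # Reconstruction penalty trains R to stay an approximate inverse of W.
    recon = ((x @ W.T @ R.T - x) ** 2).mean()

    loss = nll_base + logdet_surrogate + recon
    opt.zero_grad()
    loss.backward()
    opt.step()
```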