Mixture Manifold Networks: A Computationally Efficient Baseline for
Inverse Modeling
- URL: http://arxiv.org/abs/2211.14366v1
- Date: Fri, 25 Nov 2022 20:18:07 GMT
- Title: Mixture Manifold Networks: A Computationally Efficient Baseline for
Inverse Modeling
- Authors: Gregory P. Spell, Simiao Ren, Leslie M. Collins, Jordan M. Malof
- Abstract summary: We propose and show the efficacy of a new method to address generic inverse problems.
Recent work has shown impressive results using deep learning, but we note that there is a trade-off between model performance and computational time.
- Score: 7.891408798179181
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose and show the efficacy of a new method for addressing generic inverse problems. Inverse modeling is the task whereby one seeks to determine the control parameters of a natural system that produce a given set of observed measurements. Recent work has shown impressive results using deep learning, but we note that there is a trade-off between model performance and computational time. For some applications, the inference time of the best-performing inverse modeling method may be prohibitively long. We present a new method that leverages multiple manifolds as a mixture of backward (i.e., inverse) models in a forward-backward model architecture. These backward models all share a common forward model, and their training is aided by generating training examples with the forward model. The proposed method thus has two innovations: 1) the Mixture Manifold Network (MMN) architecture, and 2) a training procedure that augments the backward models' training data using the forward model. We demonstrate the advantages of our method by comparing it to several baselines on four benchmark inverse problems, and we furthermore provide analysis to motivate its design.
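Below is a minimal, hypothetical sketch of the forward-backward setup described in the abstract: several backward (inverse) networks share one forward network, the forward network generates extra training pairs for the backward networks, and a re-simulation check picks among the backward models' proposals. Network shapes, the training loss, and the selection rule are all assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

def mlp(d_in, d_out, hidden=64):
    return nn.Sequential(nn.Linear(d_in, hidden), nn.ReLU(),
                         nn.Linear(hidden, d_out))

x_dim, y_dim, n_backward = 4, 2, 3
forward_model = mlp(x_dim, y_dim)    # shared forward model; assumed already
                                     # fit to real simulation data
backward_models = [mlp(y_dim, x_dim) for _ in range(n_backward)]

def augmented_pairs(n):
    # Synthetic (x, y) training pairs generated by the forward model
    with torch.no_grad():
        x = torch.rand(n, x_dim)
        return x, forward_model(x)

def train_backward(model, steps=200, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        x, y = augmented_pairs(128)
        loss = (model(y) - x).pow(2).mean()   # fit the inverse mapping
        opt.zero_grad(); loss.backward(); opt.step()

def invert(y):
    # Each backward model proposes a solution; keep the proposal whose
    # re-simulated measurement best matches y (assumed selection rule).
    with torch.no_grad():
        cands = [m(y) for m in backward_models]
        errs = [(forward_model(c) - y).pow(2).mean() for c in cands]
        return cands[int(torch.stack(errs).argmin())]
```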
Related papers
- Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning [78.72226641279863]
Sparse Mixture of Expert (SMoE) models have emerged as a scalable alternative to dense models in language modeling.
Our research explores task-specific model pruning to inform decisions about designing SMoE architectures.
We introduce UNCURL, an adaptive task-aware pruning technique that reduces the number of experts per MoE layer in an offline manner, post-training.
arXiv Detail & Related papers (2024-09-02T22:35:03Z)
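A hypothetical sketch of usage-based expert pruning in the spirit of the entry above; UNCURL's actual criterion is not given in the summary, so the counting rule and top-k selection here are assumptions.

```python
# Hypothetical: rank experts in one MoE layer by how often a task's data
# routes to them, then keep only the top-k (UNCURL's real rule may differ).
import torch

def prune_experts(router_logits, k):
    # router_logits: (num_tokens, num_experts) gathered on the task's data
    counts = torch.zeros(router_logits.shape[1])
    counts.scatter_add_(0, router_logits.argmax(-1),
                        torch.ones(router_logits.shape[0]))
    keep = counts.topk(k).indices.sort().values  # indices of experts to retain
    return keep
```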
- EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
We show that Elect, Mask & Rescale-Merging (EMR-Merging) achieves outstanding performance compared to existing merging methods.
EMR-Merging is tuning-free, requiring no data availability or additional training, while showing impressive performance.
arXiv Detail & Related papers (2024-05-23T05:25:45Z)
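A rough sketch of an elect-mask-rescale procedure consistent with the summary above; the election rule, the per-task masks, and the norm-matching rescaler used here are assumptions, not the paper's exact formulas.

```python
# Hedged sketch of elect-mask-rescale merging over task vectors
# (tau_k = theta_k - theta_pre), operating on flat parameter tensors.
import torch

def emr_merge(theta_pre, thetas):
    taus = [t - theta_pre for t in thetas]       # per-task vectors
    stacked = torch.stack(taus)                  # (K, num_params)
    sign = torch.sign(stacked.sum(dim=0))        # Elect: unified sign
    agree = torch.sign(stacked) == sign          # where each task agrees
    # Unified vector: largest agreeing magnitude per element (assumption)
    mags = torch.where(agree, stacked.abs(), torch.zeros_like(stacked))
    tau_uni = sign * mags.max(dim=0).values
    masks, scales = [], []
    for tau, a in zip(taus, agree):
        m = a.float()                            # Mask: per-task agreement
        denom = (m * tau_uni).abs().sum().clamp_min(1e-12)
        scales.append(tau.abs().sum() / denom)   # Rescale: match task norm
        masks.append(m)
    return tau_uni, masks, scales

def task_weights(theta_pre, tau_uni, mask, scale):
    # Weights reconstructed for a single task at inference time
    return theta_pre + scale * mask * tau_uni
```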
- Solving Inverse Problems with Model Mismatch using Untrained Neural Networks within Model-based Architectures [14.551812310439004]
We introduce an untrained forward-model residual block within the model-based architecture to enforce data consistency in the measurement domain for each instance.
Our approach offers a unified solution that is less parameter-sensitive, requires no additional data, and enables simultaneous fitting of the forward model and reconstruction in a single pass.
arXiv Detail & Related papers (2024-03-07T19:02:13Z)
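A hedged, per-instance sketch of the idea above: a small untrained block absorbs forward-model mismatch while the reconstruction is fitted to the measurement. The operator A (assumed differentiable), the block's shape, and the iterative optimization are assumptions; the paper fits both jointly in a single pass.

```python
# Hypothetical per-instance fit: an untrained residual block corrects a
# mismatched forward operator A while x is refined against measurement y.
import torch
import torch.nn as nn

def fit_instance(A, y, x0, steps=200, lr=1e-2):
    resid = nn.Sequential(nn.Linear(y.numel(), 64), nn.ReLU(),
                          nn.Linear(64, y.numel()))  # untrained residual block
    x = x0.clone().requires_grad_(True)
    opt = torch.optim.Adam(list(resid.parameters()) + [x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        y_hat = A(x) + resid(A(x))   # residual absorbs model mismatch
        loss = (y_hat - y).pow(2).mean()
        loss.backward()
        opt.step()
    return x.detach()
```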
- AdaMerging: Adaptive Model Merging for Multi-Task Learning [68.75885518081357]
This paper introduces Adaptive Model Merging (AdaMerging), which autonomously learns the coefficients for model merging, either task-wise or layer-wise, without relying on the original training data.
Compared to the current state-of-the-art task arithmetic merging scheme, AdaMerging shows an 11% improvement in performance.
arXiv Detail & Related papers (2023-10-04T04:26:33Z)
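A minimal sketch of learning task-wise merging coefficients on unlabeled data. The entropy-minimization objective and the functional forward pass (model_fn) are assumptions based on the summary, not a verified reproduction of the paper.

```python
# Sketch: learn coefficients lam_k for merged weights
# theta = theta_pre + sum_k lam_k * tau_k, by minimizing prediction
# entropy on unlabeled inputs (assumed objective).
import torch

def adamerge(theta_pre, task_vectors, model_fn, unlabeled_loader, steps=100):
    K = len(task_vectors)
    lam = torch.full((K,), 0.3, requires_grad=True)  # merging coefficients
    opt = torch.optim.Adam([lam], lr=1e-3)
    for _, x in zip(range(steps), unlabeled_loader):
        theta = theta_pre + sum(l * tv for l, tv in zip(lam, task_vectors))
        logits = model_fn(theta, x)      # differentiable functional forward
        p = logits.softmax(-1)
        entropy = -(p * p.clamp_min(1e-12).log()).sum(-1).mean()
        opt.zero_grad(); entropy.backward(); opt.step()
    return lam.detach()
```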
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
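One way to merge models in parameter space is a least-squares rule applied per linear layer; this sketch assumes precomputed Gram matrices of layer inputs stand in for the data, and the weight layout and formula may differ from the paper's exact method.

```python
# Hedged sketch: merge the same linear layer from several models by
# solving (sum_i G_i) W = sum_i G_i W_i, where G_i = X_i^T X_i is a
# Gram matrix of that model's layer inputs (assumed layout: W is
# (d_in, d_out)).
import torch

def merge_linear(weights, grams):
    G_sum = sum(grams)                       # (d_in, d_in)
    GW_sum = sum(G @ W for G, W in zip(grams, weights))
    return torch.linalg.solve(G_sum, GW_sum)  # merged (d_in, d_out) weight
```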
- Merging Models with Fisher-Weighted Averaging [24.698591753644077]
We introduce a fundamentally different method for transferring knowledge across models that amounts to "merging" multiple models into one.
Our approach effectively involves computing a weighted average of the models' parameters.
We show that our merging procedure makes it possible to combine models in previously unexplored ways.
arXiv Detail & Related papers (2021-11-18T17:59:35Z)
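A minimal sketch of the weighted parameter average described above, using a diagonal Fisher approximation; the flat-tensor layout and the epsilon guard are assumptions.

```python
# Fisher-weighted averaging: each parameter is averaged across models
# with weights given by its diagonal Fisher information.
import torch

def fisher_merge(thetas, fishers, eps=1e-8):
    # thetas, fishers: lists of matching flat parameter tensors
    num = sum(F * t for F, t in zip(fishers, thetas))
    den = sum(fishers)
    return num / (den + eps)
```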
- Probabilistic Modeling for Human Mesh Recovery [73.11532990173441]
This paper focuses on the problem of 3D human reconstruction from 2D evidence.
We recast the problem as learning a mapping from the input to a distribution of plausible 3D poses.
arXiv Detail & Related papers (2021-08-26T17:55:11Z)
- Sample Efficient Reinforcement Learning via Model-Ensemble Exploration and Exploitation [3.728946517493471]
MEEE is a model-ensemble method that combines optimistic exploration with weighted exploitation.
Our approach outperforms other model-free and model-based state-of-the-art methods, especially in sample complexity.
arXiv Detail & Related papers (2021-07-05T07:18:20Z)
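A hypothetical sketch of optimistic exploration with a dynamics-model ensemble, scoring candidate actions with an uncertainty bonus derived from ensemble disagreement; the scoring rule and interfaces are assumptions, not MEEE's exact algorithm.

```python
# Hypothetical: pick the action whose predicted value plus a disagreement
# bonus (std across the ensemble's next-state predictions) is largest.
import torch

def select_action(models, value_fn, state, candidates, beta=1.0):
    scores = []
    for a in candidates:
        preds = torch.stack([m(state, a) for m in models])  # next-state preds
        mean_next = preds.mean(0)
        disagreement = preds.std(0).mean()   # ensemble uncertainty bonus
        scores.append(value_fn(mean_next) + beta * disagreement)
    return candidates[int(torch.stack(scores).argmax())]
```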
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- Using machine learning to correct model error in data assimilation and forecast applications [0.0]
We propose to use machine learning to correct the error of an existing, knowledge-based model.
The resulting surrogate is a hybrid of the original (knowledge-based) model and the ML model.
Using the hybrid surrogate models for data assimilation (DA) yields a significantly better analysis than using the original model.
arXiv Detail & Related papers (2020-10-23T18:30:45Z)
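A minimal sketch of such a hybrid surrogate: a knowledge-based step plus a learned additive correction. The module names, shapes, and the additive form are assumptions.

```python
# Hybrid surrogate: physical step plus a small learned error-correction term.
import torch
import torch.nn as nn

class HybridModel(nn.Module):
    def __init__(self, physical_step, dim, hidden=64):
        super().__init__()
        self.physical_step = physical_step      # knowledge-based model
        self.correction = nn.Sequential(        # learned model-error term
            nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim))

    def forward(self, x):
        return self.physical_step(x) + self.correction(x)
```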
- Benchmarking deep inverse models over time, and the neural-adjoint method [3.4376560669160394]
We consider the task of solving generic inverse problems, where one wishes to determine the hidden parameters of a natural system.
We conceptualize deep inverse models as different schemes for efficiently, but randomly, exploring the space of possible inverse solutions.
We compare several state-of-the-art inverse modeling approaches on four benchmark tasks.
arXiv Detail & Related papers (2020-09-27T18:32:06Z)
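A hedged sketch of the neural-adjoint idea referenced in the title above: train a neural surrogate of the forward model, then gradient-descend on the input to match a target measurement. Initialization details and the boundary-loss term used in the paper are omitted here.

```python
# Neural-adjoint inference sketch: freeze a trained forward surrogate and
# optimize the input x so the surrogate's output matches the target y.
import torch

def neural_adjoint(forward_net, y_target, dim, steps=500, lr=1e-2):
    forward_net.requires_grad_(False)             # freeze the surrogate
    x = torch.randn(dim, requires_grad=True)      # candidate inverse solution
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = (forward_net(x) - y_target).pow(2).mean()
        loss.backward()                           # gradients flow to x only
        opt.step()
    return x.detach()
```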