Consistent estimation of generative model representations in the data kernel perspective space
- URL: http://arxiv.org/abs/2409.17308v1
- Date: Wed, 25 Sep 2024 19:35:58 GMT
- Title: Consistent estimation of generative model representations in the data kernel perspective space
- Authors: Aranyak Acharyya and Michael W. Trosset and Carey E. Priebe and Hayden S. Helm
- Abstract summary: Generative models, such as large language models and text-to-image diffusion models, produce relevant information when presented with a query.
Different models may produce different information when presented with the same query.
We present novel theoretical results for embedding-based representations of generative models in the context of a set of queries.
- Score: 13.099029073152257
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative models, such as large language models and text-to-image diffusion models, produce relevant information when presented with a query. Different models may produce different information when presented with the same query. As the landscape of generative models evolves, it is important to develop techniques to study and analyze differences in model behaviour. In this paper we present novel theoretical results for embedding-based representations of generative models in the context of a set of queries. We establish sufficient conditions for the consistent estimation of the model embeddings in situations where the query set and the number of models grow.
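The construction behind these results can be sketched concretely: each model's responses to a shared query set are embedded, the per-model response embeddings induce pairwise distances between models, and classical multidimensional scaling turns those distances into low-dimensional model representations. A minimal numpy sketch under those assumptions (the `embed` stub and the toy responses are hypothetical stand-ins, not the paper's pipeline):

```python
import hashlib
import numpy as np

def embed(text):
    """Hypothetical stand-in for a fixed text embedder (e.g. a sentence encoder)."""
    seed = int.from_bytes(hashlib.md5(text.encode()).digest()[:4], "big")
    return np.random.default_rng(seed).standard_normal(16)

def perspective_vector(responses):
    """Concatenate embeddings of one model's responses to the shared query set."""
    return np.concatenate([embed(r) for r in responses])

def classical_mds(D, d=2):
    """Embed an n x n distance matrix into R^d via classical MDS."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                   # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:d]              # top-d eigenpairs
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

# Toy setup: 3 models answering the same 2 queries.
responses_by_model = [
    ["Paris is the capital.", "Water boils at 100 C."],
    ["The capital is Paris.", "Water boils at 100 degrees."],
    ["I cannot answer that.", "It depends on pressure."],
]
X = np.stack([perspective_vector(r) for r in responses_by_model])
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)   # model-by-model distances
print(classical_mds(D, d=2))   # each row: one model's point in the perspective space
```

The consistency results concern what happens to such embeddings as both the number of queries and the number of models grow.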
Related papers
- Embedding-based statistical inference on generative models [10.948308354932639]
We extend results related to embedding-based representations of generative models to classical statistical inference settings.
We demonstrate that using the perspective space as the basis of a notion of "similar" is effective for multiple model-level inference tasks.
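As one illustration of model-level inference in such a space, here is a hedged sketch of a simple permutation test for whether two groups of model embeddings differ in mean; the test and all names are illustrative, not the paper's procedure:

```python
import numpy as np

def perm_test(emb_a, emb_b, n_perm=10_000, seed=0):
    """Permutation test on the distance between group means in perspective space.
    emb_a, emb_b: (n_a, d) and (n_b, d) arrays of model embeddings.
    Returns an approximate p-value for 'the two groups have the same mean'."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([emb_a, emb_b])
    n_a = len(emb_a)
    stat = np.linalg.norm(emb_a.mean(0) - emb_b.mean(0))
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        a, b = pooled[perm[:n_a]], pooled[perm[n_a:]]
        if np.linalg.norm(a.mean(0) - b.mean(0)) >= stat:
            count += 1
    return (count + 1) / (n_perm + 1)

# Toy example: two families of models, slightly shifted in perspective space.
rng = np.random.default_rng(1)
group_a = rng.standard_normal((6, 2))
group_b = rng.standard_normal((6, 2)) + 0.8
print(perm_test(group_a, group_b))
```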
arXiv Detail & Related papers (2024-10-01T22:28:39Z)
- A Survey on Diffusion Models for Time Series and Spatio-Temporal Data [92.1255811066468]
We review the use of diffusion models in time series and spatio-temporal data, categorizing them by model, task type, data modality, and practical application domain.
We categorize diffusion models into unconditioned and conditioned types, and discuss time series and spatio-temporal data separately.
Our survey covers their application extensively in various fields including healthcare, recommendation, climate, energy, audio, and transportation.
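For readers new to the unconditioned/conditioned split, the difference shows up only in the noise predictor's signature; a toy DDPM-style ancestral sampler illustrating both cases (the `toy_eps` function is a hypothetical stand-in for a trained network):

```python
import numpy as np

def ddpm_sample(eps_model, shape, betas, cond=None, seed=0):
    """Ancestral DDPM sampling; `cond` switches between the survey's two types:
    unconditioned (cond=None) and conditioned (e.g. on labels or history)."""
    rng = np.random.default_rng(seed)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.standard_normal(shape)
    for t in range(len(betas) - 1, -1, -1):
        eps = eps_model(x, t) if cond is None else eps_model(x, t, cond)
        # DDPM posterior mean, then add noise except at the final step.
        x = (x - betas[t] / np.sqrt(1 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            x += np.sqrt(betas[t]) * rng.standard_normal(shape)
    return x

def toy_eps(x, t, cond=None):
    """Dummy noise predictor standing in for a trained network."""
    return 0.1 * x

print(ddpm_sample(toy_eps, (8,), betas=np.linspace(1e-4, 0.02, 50)).round(2))
```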
arXiv Detail & Related papers (2024-04-29T17:19:40Z)
- Representer Point Selection for Explaining Regularized High-dimensional Models [105.75758452952357]
We introduce a class of sample-based explanations we term high-dimensional representers.
Our workhorse is a novel representer theorem for general regularized high-dimensional models.
We study the empirical performance of our proposed methods on three real-world binary classification datasets and two recommender system datasets.
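For intuition, the classical representer decomposition for an l2-regularized linear model writes a test prediction as a sum of per-training-point contributions; the numpy sketch below shows that base case (illustrative only; the paper's contribution is the extension to general regularized high-dimensional models):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_l2_logreg(X, y, lam, iters=2000, lr=0.5):
    """Gradient descent on 1/n * logistic loss + lam * ||w||^2."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        grad = X.T @ (sigmoid(X @ w) - y) / n + 2 * lam * w
        w -= lr * grad
    return w

def representer_values(X, y, w, x_test, lam):
    """At the optimum w = sum_i alpha_i x_i, so the test logit decomposes as
    sum_i alpha_i <x_i, x_test>, with alpha_i = -loss'(w.x_i, y_i) / (2*lam*n)."""
    n = len(X)
    alpha = -(sigmoid(X @ w) - y) / (2 * lam * n)
    return alpha * (X @ x_test)   # contribution of each training point

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
y = (X[:, 0] + 0.3 * rng.standard_normal(100) > 0).astype(float)
lam = 0.01
w = fit_l2_logreg(X, y, lam)
contrib = representer_values(X, y, w, X[0], lam)
print(np.allclose(contrib.sum(), X[0] @ w, atol=1e-3))   # decomposition check
```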
arXiv Detail & Related papers (2023-05-31T16:23:58Z)
- Content-Based Search for Deep Generative Models [45.322081206025544]
We introduce the task of content-based model search: given a query and a large set of generative models, find the models that best match the query.
As each generative model produces a distribution of images, we formulate the search task as an optimization problem to select the model with the highest probability of generating content similar to the query.
We demonstrate that our method outperforms several baselines on Generative Model Zoo, a new benchmark we create for the model retrieval task.
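A hedged sketch of the idea: score each model by a Monte Carlo proxy for the probability that it generates content similar to the query, then rank models by that score (the toy samplers and identity embedder below are illustrative, not the paper's exact formulation):

```python
import numpy as np

def score_model(sample_fn, query_emb, embed_fn, n=64, seed=0):
    """Monte Carlo proxy for P(model generates content similar to the query):
    average cosine similarity between the query and n sampled outputs."""
    rng = np.random.default_rng(seed)
    sims = []
    for _ in range(n):
        e = embed_fn(sample_fn(rng))
        sims.append(e @ query_emb / (np.linalg.norm(e) * np.linalg.norm(query_emb)))
    return float(np.mean(sims))

def search(models, query_emb, embed_fn):
    """Return models ranked by estimated match to the query."""
    scores = {name: score_model(fn, query_emb, embed_fn) for name, fn in models.items()}
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Toy 'models' are samplers over a 4-d feature space; embed_fn is the identity here.
models = {
    "faces":  lambda rng: rng.standard_normal(4) + np.array([3, 0, 0, 0.]),
    "scenes": lambda rng: rng.standard_normal(4) + np.array([0, 3, 0, 0.]),
}
print(search(models, query_emb=np.array([1, 0, 0, 0.]), embed_fn=lambda x: x))
```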
arXiv Detail & Related papers (2022-10-06T17:59:51Z)
- Language Model Cascades [72.18809575261498]
Repeated interactions at test-time with a single model, or the composition of multiple models together, further expands capabilities.
Cases with control flow and dynamic structure require techniques from probabilistic programming.
We formalize several existing techniques from this perspective, including scratchpads / chain of thought, verifiers, STaR, selection-inference, and tool use.
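A minimal sketch of two such cascades, scratchpads/chain of thought and a verifier, as plainly composed sampling calls (the `llm` stub is hypothetical; a real cascade would call an actual model):

```python
import random

rng = random.Random(0)

def llm(prompt):
    """Hypothetical stand-in: sample one string from a language model."""
    return rng.choice(["...some intermediate reasoning...", "42", "yes", "no"])

def scratchpad(question):
    """Scratchpad / chain of thought as a cascade: sample an intermediate
    'thought' string, then condition the answer on it."""
    thought = llm("Q: " + question + "\nThink step by step:")
    return llm("Q: " + question + "\nThoughts: " + thought + "\nFinal answer:")

def with_verifier(question, n=5):
    """Sample n candidate answers; keep the first one a (stub) verifier accepts."""
    candidates = [scratchpad(question) for _ in range(n)]
    accepted = [a for a in candidates
                if llm("Is '" + a + "' a good answer to '" + question + "'?") == "yes"]
    return accepted[0] if accepted else candidates[0]

print(with_verifier("What is 6 * 7?"))
```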
arXiv Detail & Related papers (2022-07-21T07:35:18Z)
- Geometric and Topological Inference for Deep Representations of Complex Networks [13.173307471333619]
We present a class of statistics that emphasize the topology as well as the geometry of representations.
We evaluate these statistics in terms of the sensitivity and specificity that they afford when used for model selection.
These new methods enable brain and computer scientists to visualize the dynamic representational transformations learned by brains and models.
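The geometric side of such statistics can be illustrated with a second-order comparison of representations: build each model's representational dissimilarity matrix over a fixed stimulus set and correlate them (a standard construction sketched here with numpy; the paper's topological statistics, e.g. persistence-based ones, are not shown):

```python
import numpy as np

def rdm(acts):
    """Representational dissimilarity matrix: pairwise distances between the
    representations of a fixed stimulus set (rows of `acts`)."""
    return np.linalg.norm(acts[:, None, :] - acts[None, :, :], axis=-1)

def rdm_similarity(acts_a, acts_b):
    """Geometry statistic: Spearman correlation of the two RDMs' upper
    triangles (a common second-order comparison of representations)."""
    iu = np.triu_indices(len(acts_a), k=1)
    a, b = rdm(acts_a)[iu], rdm(acts_b)[iu]
    ra = a.argsort().argsort().astype(float)   # ranks -> Spearman correlation
    rb = b.argsort().argsort().astype(float)
    return np.corrcoef(ra, rb)[0, 1]

# Toy: two 'layers' viewing the same 20 stimuli; B is a rotated, noisy copy of A.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 8))
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))       # random rotation
B = A @ Q + 0.1 * rng.standard_normal((20, 8))
print(rdm_similarity(A, B))   # near 1: same geometry despite different axes
```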
arXiv Detail & Related papers (2022-03-10T17:14:14Z)
- An Ample Approach to Data and Modeling [1.0152838128195467]
We describe a framework for how models can be built, integrating concepts and methods from a wide range of fields.
The reference M* meta-model framework is presented, which relies critically on associating whole datasets with their respective models in terms of a strict equivalence relation.
Several considerations about how the developed framework can provide insights about data clustering, complexity, collaborative research, deep learning, and creativity are then presented.
arXiv Detail & Related papers (2021-10-05T01:26:09Z)
- Model-agnostic multi-objective approach for the evolutionary discovery of mathematical models [55.41644538483948]
In modern data science, it is often more important to understand the properties of a model, and which of its parts could be replaced to obtain better results.
We use multi-objective evolutionary optimization for composite data-driven model learning to obtain the algorithm's desired properties.
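The core selection step in any such multi-objective search is Pareto dominance; a minimal sketch with toy objectives (illustrative only, not the paper's algorithm):

```python
import numpy as np

def pareto_front(scores):
    """Return indices of non-dominated rows; scores[i, j] = objective j of
    candidate model i, all objectives to be minimized (e.g. error, complexity)."""
    n = len(scores)
    keep = []
    for i in range(n):
        dominated = any(
            np.all(scores[j] <= scores[i]) and np.any(scores[j] < scores[i])
            for j in range(n) if j != i
        )
        if not dominated:
            keep.append(i)
    return keep

# Toy population of candidate models: (validation error, model complexity).
scores = np.array([[0.10, 9.0], [0.12, 4.0], [0.30, 2.0], [0.11, 9.5], [0.35, 2.5]])
print(pareto_front(scores))   # [0, 1, 2]: the error/complexity trade-off curve
```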
arXiv Detail & Related papers (2021-07-07T11:17:09Z)
- Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers.
We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster than their ODE-based counterparts.
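A loose numpy paraphrase of the gating structure that makes this possible: the hidden state at elapsed time t is a sigmoid-gated blend of learned heads, evaluated in closed form rather than by an ODE solver (simplified relative to the paper's formulation; all shapes and initializations are illustrative):

```python
import numpy as np

def mlp(params, z):
    """One-hidden-layer network used for each of the three heads f, g, h."""
    W1, b1, W2, b2 = params
    return np.tanh(z @ W1 + b1) @ W2 + b2

def cfc_cell(x, inp, t, theta_f, theta_g, theta_h):
    """Closed-form continuous-depth update (simplified): a time-dependent
    sigmoid gate blends two heads, so the state at elapsed time t is
    computed in one shot, with no numerical ODE solve."""
    z = np.concatenate([x, inp])
    gate = 1.0 / (1.0 + np.exp(mlp(theta_f, z) * t))   # sigmoid(-f(z) * t)
    return gate * mlp(theta_g, z) + (1.0 - gate) * mlp(theta_h, z)

def init(d_in, d_hid, d_out, rng):
    return (rng.standard_normal((d_in, d_hid)) * 0.3, np.zeros(d_hid),
            rng.standard_normal((d_hid, d_out)) * 0.3, np.zeros(d_out))

rng = np.random.default_rng(0)
d = 4
params = [init(2 * d, 8, d, rng) for _ in range(3)]
x = np.zeros(d)
for t, u in [(0.1, np.ones(d)), (0.5, np.zeros(d)), (0.2, -np.ones(d))]:
    x = cfc_cell(x, u, t, *params)   # irregularly-sampled inputs, no solver
print(x.round(3))
```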
arXiv Detail & Related papers (2021-06-25T22:08:51Z)
- Conditional Generative Models for Counterfactual Explanations [0.0]
We propose a general framework to generate sparse, in-distribution counterfactual model explanations.
The framework is flexible with respect to the type of generative model used as well as the task of the underlying predictive model.
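One way to read the framework: the conditional generator supplies in-distribution candidates for a target class, and sparsity is enforced when choosing among them. A rejection-style toy sketch of those two roles (the paper's method is optimization-based; everything here is illustrative):

```python
import numpy as np

def counterfactual(x, target, generator, classifier, n=256, seed=0):
    """Rejection-style sketch: draw candidates from a conditional generator,
    keep those the classifier assigns to `target`, and return the sparsest
    change (fewest features differing from x) -- 'sparse, in-distribution'."""
    rng = np.random.default_rng(seed)
    best, best_sparsity = None, np.inf
    for _ in range(n):
        cand = generator(target, rng)
        if classifier(cand) != target:
            continue
        sparsity = np.sum(np.abs(cand - x) > 0.1)   # changed-feature count
        if sparsity < best_sparsity:
            best, best_sparsity = cand, sparsity
    return best

# Toy world: class = sign of feature 0; generator samples near class prototypes.
classifier = lambda z: int(z[0] > 0)
prototypes = {0: np.array([-2., 0., 0.]), 1: np.array([2., 0., 0.])}
generator = lambda c, rng: prototypes[c] + 0.5 * rng.standard_normal(3)
x = np.array([-2., 1., -1.])            # currently class 0
print(counterfactual(x, target=1, generator=generator, classifier=classifier))
```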
arXiv Detail & Related papers (2021-01-25T14:31:13Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.