Divergence-Minimization for Latent-Structure Models: Monotone Operators, Contraction Guarantees, and Robust Inference
- URL: http://arxiv.org/abs/2511.17974v1
- Date: Sat, 22 Nov 2025 08:25:29 GMT
- Title: Divergence-Minimization for Latent-Structure Models: Monotone Operators, Contraction Guarantees, and Robust Inference
- Authors: Lei Li, Anand N. Vidyashankar
- Abstract summary: We develop a divergence-minimization (DM) framework for robust and efficient inference in latent-mixture models. By optimizing a residual-adjusted divergence, the DM approach recovers EM as a special case and yields robust alternatives.
- Score: 5.373905622325275
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop a divergence-minimization (DM) framework for robust and efficient inference in latent-mixture models. By optimizing a residual-adjusted divergence, the DM approach recovers EM as a special case and yields robust alternatives through different divergence choices. We establish that the sample objective decreases monotonically along the iterates, leading the DM sequence to stationary points under standard conditions, and that at the population level the operator exhibits local contractivity near the minimizer. Additionally, we verify consistency and $\sqrt{n}$-asymptotic normality of minimum-divergence estimators and of finitely many DM iterations, showing that under correct specification their limiting covariance matches the Fisher information. Robustness is analyzed via the residual-adjustment function, yielding bounded influence functions and a strictly positive breakdown bound for bounded-RAF divergences, and we contrast this with the non-robust behaviour of KL/EM. Next, we address the challenge of determining the number of mixture components by proposing a penalized divergence criterion combined with repeated sample splitting, which delivers consistent order selection and valid post-selection inference. Empirically, DM instantiations based on Hellinger and negative exponential divergences deliver accurate inference and remain stable under contamination in mixture and image-segmentation tasks. The results clarify connections to MM and proximal-point methods and offer practical defaults, making DM a drop-in alternative to EM for robust latent-structure inference.
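The abstract does not spell out the update rule, so the sketch below is only an illustration of the general residual-adjustment idea, not the authors' DM algorithm. It shows an EM-style iteration for a two-component Gaussian mixture in which Pearson residuals $\delta = \hat f/m_\theta - 1$ and a residual-adjustment function $A(\delta)$ reweight observations: taking $A(\delta) = \delta$ (the KL case) gives unit weights and recovers plain EM, while the Hellinger RAF $A(\delta) = 2(\sqrt{\delta+1}-1)$ downweights points the model cannot explain. The kernel density estimate, weight clipping, and toy data are illustrative assumptions.

```python
# Illustrative sketch only: a residual-adjusted, EM-style update for a
# two-component Gaussian mixture. This is NOT the paper's DM algorithm,
# just a common minimum-disparity construction used here for intuition.
import numpy as np
from scipy.stats import norm, gaussian_kde

def hellinger_raf(delta):
    # Residual-adjustment function associated with the Hellinger divergence.
    return 2.0 * (np.sqrt(delta + 1.0) - 1.0)

def dm_step(x, pi, mu, sigma, raf=hellinger_raf):
    """One residual-adjusted EM-style update (illustrative)."""
    # Component densities and mixture density under the current parameters.
    comp = np.stack([norm.pdf(x, mu[k], sigma[k]) for k in range(2)])  # (2, n)
    mix = pi[:, None] * comp
    m_theta = np.clip(mix.sum(axis=0), 1e-300, None)

    # Pearson residuals of a kernel density estimate against the model.
    f_hat = gaussian_kde(x)(x)
    delta = f_hat / m_theta - 1.0

    # Residual-adjusted weights; with raf(delta) = delta they are all 1,
    # and the iteration reduces to ordinary EM (the KL case).
    w = np.clip((raf(delta) + 1.0) / (delta + 1.0), 0.0, 1.0)

    # Weighted E-step responsibilities and M-step updates.
    resp = mix / m_theta                      # posterior component probabilities
    wr = w * resp                             # downweight poorly explained points
    nk = wr.sum(axis=1)
    pi_new = nk / nk.sum()
    mu_new = (wr * x).sum(axis=1) / nk
    var_new = (wr * (x - mu_new[:, None]) ** 2).sum(axis=1) / nk
    return pi_new, mu_new, np.sqrt(var_new)

# Toy usage: two genuine clusters plus gross contamination at 30.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 450), rng.normal(5, 1, 450), rng.normal(30, 1, 100)])
pi, mu, sigma = np.array([0.5, 0.5]), np.array([-1.0, 6.0]), np.array([1.0, 1.0])
for _ in range(50):
    pi, mu, sigma = dm_step(x, pi, mu, sigma)
print(pi, mu, sigma)
```

Under these assumptions the contaminating points should receive near-zero weight, so the fitted components stay near the two genuine clusters; swapping in the identity RAF reproduces the usual EM sensitivity to contamination, which is the contrast the abstract draws between bounded-RAF divergences and KL/EM.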
Related papers
- Multivariate Time Series Data Imputation via Distributionally Robust Regularization [2.3351357479046717]
Imputation is compromised by mismatch between observed and true data distributions. We propose the Distributionally Robust Regularized Imputer Objective (DRIO). Experiments show DRIO consistently improves imputation under both missing-completely-at-random and missing-not-at-random settings.
arXiv Detail & Related papers (2026-01-31T18:15:03Z) - Decoupling the Effect of Chain-of-Thought Reasoning: A Human Label Variation Perspective [60.45433515408158]
We show that long Chain-of-Thought (CoT) serves as a decisive decision-maker for the top option but fails to function as a granular distribution calibrator for ambiguous tasks. We observe a distinct "decoupled mechanism": while CoT improves distributional alignment, final accuracy is dictated by CoT content.
arXiv Detail & Related papers (2026-01-06T16:26:40Z) - Dendrograms of Mixing Measures for Softmax-Gated Gaussian Mixture of Experts: Consistency without Model Sweeps [41.371172458797524]
We address the non-identifiability of gating parameters up to common translations, intrinsic gate-expert interactions, and tight numerator-denominator coupling. For model selection, we adapt dendrogram-guided SGMoE, yielding a consistent, sweep-free selector of the number of experts that attains optimal parameter rates. On a dataset of drought-identifiable maize traits, our dendrogram-guided SGMoE selects two experts, exposes a clear mixing hierarchy, stabilizes the likelihood early, and yields interpretable genotype-phenotype maps.
arXiv Detail & Related papers (2025-10-14T17:23:44Z) - Drift No More? Context Equilibria in Multi-Turn LLM Interactions [58.69551510148673]
Context drift is the gradual divergence of a model's outputs from goal-consistent behavior across turns. Unlike single-turn errors, drift unfolds temporally and is poorly captured by static evaluation metrics. We show that multi-turn drift can be understood as a controllable equilibrium phenomenon rather than as inevitable decay.
arXiv Detail & Related papers (2025-10-09T04:48:49Z) - Discretization-free Multicalibration through Loss Minimization over Tree Ensembles [22.276913140687725]
We propose a discretization-free multicalibration method over an ensemble of depth-two decision trees. Our algorithm provably achieves multicalibration, provided that the data distribution satisfies a technical condition we term loss saturation.
arXiv Detail & Related papers (2025-05-23T03:29:58Z) - Decoupling Training-Free Guided Diffusion by ADMM [17.425995507142467]
We propose a novel framework that distinctly decouples the unconditional generation model and the guided loss function.
We develop a new algorithm based on the Alternating Direction Method of Multipliers (ADMM) to adaptively balance these components.
Our experiments demonstrate that our proposed method ADMMDiff consistently generates high-quality samples.
arXiv Detail & Related papers (2024-11-18T23:05:54Z) - Amortized Posterior Sampling with Diffusion Prior Distillation [55.03585818289934]
Amortized Posterior Sampling is a novel variational inference approach for efficient posterior sampling in inverse problems. Our method trains a conditional flow model to minimize the divergence between the variational distribution and the posterior distribution implicitly defined by the diffusion model. Unlike existing methods, our approach is unsupervised, requires no paired training data, and is applicable to both Euclidean and non-Euclidean domains.
arXiv Detail & Related papers (2024-07-25T09:53:12Z) - Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $\alpha$-divergences.
arXiv Detail & Related papers (2023-06-27T08:15:28Z) - Targeted Separation and Convergence with Kernel Discrepancies [61.973643031360254]
Kernel-based discrepancy measures are required to (i) separate a target P from other probability measures or (ii) control weak convergence to P. In this article we derive new sufficient and necessary conditions to ensure (i) and (ii). For MMDs on separable metric spaces, we characterize those kernels that separate Bochner embeddable measures and introduce simple conditions for separating all measures with unbounded kernels.
arXiv Detail & Related papers (2022-09-26T16:41:16Z) - Derivatives and residual distribution of regularized M-estimators with application to adaptive tuning [5.064404027153094]
We study M-estimators with a gradient-Lipschitz loss function regularized with a convex penalty. A practical example is the robust M-estimator constructed with the Huber loss and the Elastic-Net penalty.
arXiv Detail & Related papers (2021-07-11T23:20:16Z) - Variational Refinement for Importance Sampling Using the Forward
Kullback-Leibler Divergence [77.06203118175335]
Variational Inference (VI) is a popular alternative to exact sampling in Bayesian inference.
Importance sampling (IS) is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures.
We propose a novel combination of optimization and sampling techniques for approximate Bayesian inference.
arXiv Detail & Related papers (2021-06-30T11:00:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.