Learning under Singularity: An Information Criterion improving WBIC and sBIC
- URL: http://arxiv.org/abs/2402.12762v2
- Date: Thu, 22 Feb 2024 08:32:24 GMT
- Title: Learning under Singularity: An Information Criterion improving WBIC and sBIC
- Authors: Lirui Liu and Joe Suzuki
- Abstract summary: We introduce a novel Information Criterion (IC), termed Learning under Singularity (LS).
LS is effective without regularity constraints and demonstrates stability.
This approach offers a flexible and robust method for model selection, free from regularity constraints.
- Score: 1.0878040851637998
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a novel Information Criterion (IC), termed Learning under
Singularity (LS), designed to enhance the functionality of the Widely
Applicable Bayes Information Criterion (WBIC) and the Singular Bayesian
Information Criterion (sBIC). LS is effective without regularity constraints
and demonstrates stability. Watanabe defined a statistical model or a learning
machine as regular if the mapping from a parameter to a probability
distribution is one-to-one and its Fisher information matrix is positive
definite. In contrast, models not meeting these conditions are termed singular.
Over the past decade, several information criteria for singular cases have been
proposed, including WBIC and sBIC. WBIC is applicable in non-regular scenarios
but faces challenges with large sample sizes and redundant estimation of known
learning coefficients. Conversely, sBIC is limited in its broader application
due to its dependence on maximum likelihood estimates. LS addresses these
limitations by enhancing the utility of both WBIC and sBIC. It incorporates the
empirical loss from the Widely Applicable Information Criterion (WAIC) to
represent the goodness of fit to the statistical model, along with a penalty
term similar to that of sBIC. This approach offers a flexible and robust method
for model selection, free from regularity constraints.
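The form of the criterion described above (a WAIC-style empirical loss plus an sBIC-like penalty) can be sketched numerically. The following is a schematic illustration only, using a toy one-dimensional Gaussian model with a hypothetical known learning coefficient `lam`; the exact definition of LS is given in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=1.0, size=200)  # toy data
n = len(x)

# Posterior samples for the mean (conjugate normal prior N(0, 10^2), known variance 1).
prior_var, lik_var = 100.0, 1.0
post_var = 1.0 / (1.0 / prior_var + n / lik_var)
post_mean = post_var * x.sum() / lik_var
mu_samples = rng.normal(post_mean, np.sqrt(post_var), size=2000)

# Empirical loss in the WAIC sense: minus the mean log pointwise
# predictive density, averaging p(x_i | mu) over posterior samples.
logpdf = -0.5 * (x[:, None] - mu_samples[None, :]) ** 2 / lik_var \
         - 0.5 * np.log(2 * np.pi * lik_var)
lppd_i = np.log(np.exp(logpdf).mean(axis=1))
empirical_loss = -lppd_i.mean()

# sBIC-style penalty with an assumed known learning coefficient lam.
# For this regular one-parameter model, lam = 1/2 recovers the BIC penalty.
lam = 0.5
ls_score = n * empirical_loss + lam * np.log(n)
print(ls_score)
```

In singular models the learning coefficient is smaller than half the parameter count, which is exactly where such a criterion diverges from the classical BIC.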
Related papers
- Did Models Sufficient Learn? Attribution-Guided Training via Subset-Selected Counterfactual Augmentation [61.248535801314375]
We propose Subset-Selected Counterfactual Augmentation (SS-CA). We develop Counterfactual LIMA to identify minimal spatial region sets whose removal can selectively alter model predictions. Experiments show that SS-CA improves generalization on in-distribution (ID) test data and achieves superior performance on out-of-distribution (OOD) benchmarks.
arXiv Detail & Related papers (2025-11-15T08:39:22Z)
- Continual learning via probabilistic exchangeable sequence modelling [6.269118318460723]
Continual learning (CL) refers to the ability to continuously learn and accumulate new knowledge while retaining useful information from past experiences.
We propose CL-BRUNO, a probabilistic, Neural Process-based CL model that performs scalable and tractable Bayesian update and prediction.
arXiv Detail & Related papers (2025-03-26T17:08:20Z)
- Offline Learning for Combinatorial Multi-armed Bandits [56.96242764723241]
Off-CMAB is the first offline learning framework for CMAB.
Off-CMAB combines pessimistic reward estimations with solvers.
Experiments on synthetic and real-world datasets highlight the superior performance of CLCB.
arXiv Detail & Related papers (2025-01-31T16:56:18Z)
- Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes [50.544186914115045]
Large language models (LLMs) are increasingly embedded in everyday applications.
Ensuring their alignment with the diverse preferences of individual users has become a critical challenge.
We present a novel framework for few-shot steerable alignment.
arXiv Detail & Related papers (2024-12-18T16:14:59Z)
- Statistically Valid Information Bottleneck via Multiple Hypothesis Testing [35.59201763567714]
We introduce a statistically valid solution to the information bottleneck (IB) problem via multiple hypothesis testing (IB-MHT).
IB-MHT ensures that the learned features meet the IB constraints with high probability, regardless of the size of the available dataset.
Results validate the effectiveness of IB-MHT in outperforming conventional methods in terms of statistical robustness and reliability.
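As background for the multiple-hypothesis-testing ingredient, the sketch below shows a generic Bonferroni correction for family-wise error control; this is a textbook device, not IB-MHT's specific procedure, and the p-values are made up for illustration.

```python
# Bonferroni correction: to keep the family-wise error rate below alpha
# across m simultaneous tests, reject only p-values <= alpha / m.
p_values = [0.001, 0.02, 0.04, 0.30]  # hypothetical per-test p-values
alpha = 0.05
m = len(p_values)
rejected = [p <= alpha / m for p in p_values]
print(rejected)  # only 0.001 clears the corrected threshold 0.0125
```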
arXiv Detail & Related papers (2024-09-11T15:04:32Z)
- Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
arXiv Detail & Related papers (2024-06-05T16:35:30Z)
- Latent Semantic Consensus For Deterministic Geometric Model Fitting [109.44565542031384]
We propose an effective method called Latent Semantic Consensus (LSC).
LSC formulates the model fitting problem into two latent semantic spaces based on data points and model hypotheses.
LSC is able to provide consistent and reliable solutions within only a few milliseconds for general multi-structural model fitting.
arXiv Detail & Related papers (2024-03-11T05:35:38Z)
- Flexible and Robust Counterfactual Explanations with Minimal Satisfiable Perturbations [56.941276017696076]
We propose a conceptually simple yet effective solution named Counterfactual Explanations with Minimal Satisfiable Perturbations (CEMSP).
CEMSP constrains changing values of abnormal features with the help of their semantically meaningful normal ranges.
Compared to existing methods, we conduct comprehensive experiments on both synthetic and real-world datasets to demonstrate that our method provides more robust explanations while preserving flexibility.
arXiv Detail & Related papers (2023-09-09T04:05:56Z)
- Intuitionistic Fuzzy Broad Learning System: Enhancing Robustness Against Noise and Outliers [0.0]
We propose fuzzy broad learning system (F-BLS) and intuitionistic fuzzy broad learning system (IF-BLS) models.
We implement the proposed F-BLS and IF-BLS models to diagnose Alzheimer's disease (AD).
arXiv Detail & Related papers (2023-07-15T21:40:36Z)
- MF-CLIP: Leveraging CLIP as Surrogate Models for No-box Adversarial Attacks [65.86360607693457]
No-box attacks, where adversaries have no prior knowledge, remain relatively underexplored despite their practical relevance.
This work presents a systematic investigation into leveraging large-scale Vision-Language Models (VLMs) as surrogate models for executing no-box attacks.
Our theoretical and empirical analyses reveal a key limitation in the execution of no-box attacks stemming from insufficient discriminative capabilities for direct application of vanilla CLIP as a surrogate model.
We propose MF-CLIP: a novel framework that enhances CLIP's effectiveness as a surrogate model through margin-aware feature space optimization.
arXiv Detail & Related papers (2023-07-13T08:10:48Z)
- Gibbs-Based Information Criteria and the Over-Parameterized Regime [20.22034560278484]
Double-descent refers to the unexpected drop in test loss of a learning algorithm beyond an interpolating threshold.
We update these analyses using the information risk minimization framework and provide Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) for models learned by the Gibbs algorithm.
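For context, the classical regular-model AIC and BIC that such analyses generalize can be computed as below; this is a standard textbook calculation for a Gaussian maximum-likelihood fit, not the paper's Gibbs-algorithm version.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(2.0, 1.5, size=500)  # toy data
n, k = len(x), 2  # two fitted parameters: mean and variance

# Gaussian MLEs and the maximized log-likelihood.
mu_hat, var_hat = x.mean(), x.var()
max_loglik = -0.5 * n * (np.log(2 * np.pi * var_hat) + 1.0)

# AIC = 2k - 2 ln L_hat;  BIC = k ln n - 2 ln L_hat.
aic = 2 * k - 2 * max_loglik
bic = k * np.log(n) - 2 * max_loglik
print(aic, bic)
```

For n > e^2 (about 7.4 samples), the BIC penalty k ln n exceeds the AIC penalty 2k, so BIC selects more parsimonious models.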
arXiv Detail & Related papers (2023-06-08T22:54:48Z)
- Uncertainty Estimation by Fisher Information-based Evidential Deep Learning [61.94125052118442]
Uncertainty estimation is a key factor that makes deep learning reliable in practical applications.
We propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL).
In particular, we introduce Fisher Information Matrix (FIM) to measure the informativeness of evidence carried by each sample, according to which we can dynamically reweight the objective loss terms to make the network more focused on the representation learning of uncertain classes.
arXiv Detail & Related papers (2023-03-03T16:12:59Z)
- Validation Diagnostics for SBI algorithms based on Normalizing Flows [55.41644538483948]
This work proposes easy-to-interpret validation diagnostics for multi-dimensional conditional (posterior) density estimators based on normalizing flows (NF).
It also offers theoretical guarantees based on results of local consistency.
This work should help the design of better specified models or drive the development of novel SBI-algorithms.
arXiv Detail & Related papers (2022-11-17T15:48:06Z)
- Supervised Multivariate Learning with Simultaneous Feature Auto-grouping and Dimension Reduction [7.093830786026851]
This paper proposes a novel clustered reduced-rank learning framework.
It imposes two joint matrix regularizations to automatically group the features in constructing predictive factors.
It is more interpretable than low-rank modeling and relaxes the stringent sparsity assumption in variable selection.
arXiv Detail & Related papers (2021-12-17T20:11:20Z)
- Pointwise Feasibility of Gaussian Process-based Safety-Critical Control under Model Uncertainty [77.18483084440182]
Control Barrier Functions (CBFs) and Control Lyapunov Functions (CLFs) are popular tools for enforcing safety and stability of a controlled system, respectively.
We present a Gaussian Process (GP)-based approach to tackle the problem of model uncertainty in safety-critical controllers that use CBFs and CLFs.
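As background for how a CBF enforces safety, here is a minimal one-dimensional sketch of a CBF safety filter for the single integrator `xdot = u`; the paper's actual contribution, Gaussian-process handling of model uncertainty, is not shown, and the barrier choice here is illustrative.

```python
# Barrier h(x) = x keeps the state in the safe set {x >= 0}.
# The CBF condition hdot + alpha * h >= 0 reduces to u >= -alpha * x,
# so the filter clips the nominal control to the closest safe value.
def cbf_filter(x: float, u_nominal: float, alpha: float = 1.0) -> float:
    return max(u_nominal, -alpha * x)

print(cbf_filter(x=0.5, u_nominal=-2.0))  # unsafe nominal input clipped to -0.5
print(cbf_filter(x=0.5, u_nominal=1.0))   # safe nominal input passes through: 1.0
```

In higher dimensions the same clipping becomes a quadratic program; model uncertainty (the paper's focus) makes the constraint itself uncertain.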
arXiv Detail & Related papers (2021-06-13T23:08:49Z)
- Pool-based sequential active learning with multi kernels [10.203602318836444]
We study a pool-based sequential active learning (AL) in which one sample is queried at each time from a large pool of unlabeled data.
We propose two selection criteria, named expected-kernel-discrepancy (EKD) and expected-kernel-loss (EKL). We also show that the proposed EKD and EKL generalize the concepts of the popular query-by-committee (QBC) and expected-model-change (EMC) criteria.
arXiv Detail & Related papers (2020-10-22T03:54:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.