On Investigating the Conservative Property of Score-Based Generative
Models
- URL: http://arxiv.org/abs/2209.12753v3
- Date: Sun, 4 Jun 2023 15:13:43 GMT
- Title: On Investigating the Conservative Property of Score-Based Generative
Models
- Authors: Chen-Hao Chao, Wei-Fang Sun, Bo-Wun Cheng, Chun-Yi Lee
- Abstract summary: We propose Quasi-Conservative Score-Based Models (QCSBMs) for keeping the advantages of both CSBMs and USBMs.
Our theoretical derivations demonstrate that the training objective of QCSBMs can be efficiently integrated into the training processes.
- Score: 15.121796988652461
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing Score-Based Models (SBMs) can be categorized into constrained SBMs
(CSBMs) or unconstrained SBMs (USBMs) according to their parameterization
approaches. CSBMs model probability density functions as Boltzmann
distributions, and assign their predictions as the negative gradients of some
scalar-valued energy functions. On the other hand, USBMs employ flexible
architectures capable of directly estimating scores without the need to
explicitly model energy functions. In this paper, we demonstrate that the
architectural constraints of CSBMs may limit their modeling ability. In
addition, we show that USBMs' inability to preserve the property of
conservativeness may lead to degraded performance in practice. To address the
above issues, we propose Quasi-Conservative Score-Based Models (QCSBMs) for
keeping the advantages of both CSBMs and USBMs. Our theoretical derivations
demonstrate that the training objective of QCSBMs can be efficiently integrated
into the training processes by leveraging Hutchinson's trace estimator. In
addition, our experimental results on the CIFAR-10, CIFAR-100, ImageNet, and
SVHN datasets validate the effectiveness of QCSBMs. Finally, we justify the
advantage of QCSBMs using an example of a one-layered autoencoder.
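The abstract's key computational device, Hutchinson's trace estimator, approximates tr(A) as the average of v^T A v over random probe vectors v with E[v v^T] = I, which lets QCSBMs estimate the trace of a score network's Jacobian without materializing the full matrix. A minimal sketch of the general estimator in plain Python follows; the Rademacher probes, toy matrix, and sample count are illustrative choices, not the paper's actual training code.

```python
import random

def hutchinson_trace(matvec, dim, num_samples=20000, seed=0):
    """Estimate tr(A) as the average of v^T (A v) over Rademacher probes v.

    `matvec` is a black-box matrix-vector product, so the full matrix A
    (e.g. a network Jacobian) never needs to be formed explicitly.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        v = [rng.choice((-1.0, 1.0)) for _ in range(dim)]  # Rademacher probe
        av = matvec(v)
        total += sum(vi * avi for vi, avi in zip(v, av))   # v^T A v
    return total / num_samples

# Toy check against a small symmetric matrix with known trace 4 + 3 + 5 = 12.
A = [[4.0, 1.0, 0.5],
     [1.0, 3.0, 2.0],
     [0.5, 2.0, 5.0]]

def matvec(v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

est = hutchinson_trace(matvec, dim=3)  # converges to 12.0 as num_samples grows
```

In the QCSBM setting, `matvec` would be replaced by a Jacobian-vector product of the score network (one extra backward pass per probe), which is what makes the regularization term cheap to integrate into training.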
Related papers
- A Novel Approach to Explainable AI with Quantized Active Ingredients in Decision Making [0.0]
We propose an explainable AI framework based on a comparative study of Quantum Boltzmann Machines (QBMs) and Classical Boltzmann Machines (CBMs). We leverage principles of quantum computing within classical machine learning to provide substantive transparency around decision-making. For interpretability, we employ gradient-based saliency maps in QBMs and SHAP (SHapley Additive exPlanations) in CBMs to evaluate feature attributions.
arXiv Detail & Related papers (2026-01-13T17:06:19Z) - Controllable Concept Bottleneck Models [55.03639763625018]
Controllable Concept Bottleneck Models (CCBMs) support three granularities of model editing: concept-label-level, concept-level, and data-level. CCBMs enjoy mathematically rigorous closed-form approximations derived from influence functions that obviate the need for retraining.
arXiv Detail & Related papers (2026-01-01T19:30:06Z) - Post-hoc Stochastic Concept Bottleneck Models [18.935442650741]
Concept Bottleneck Models (CBMs) are interpretable models that predict the target variable through high-level human-understandable concepts. We introduce Post-hoc Stochastic Concept Bottleneck Models (PSCBMs), a lightweight method that augments any pre-trained CBM with a normal distribution over concepts without retraining the backbone model. We show that PSCBMs perform much better than CBMs under interventions, while remaining far more efficient than retraining a similar model from scratch.
arXiv Detail & Related papers (2025-10-09T13:42:54Z) - The Curious Case of In-Training Compression of State Space Models [49.819321766705514]
State Space Models (SSMs) tackle long sequence modeling tasks efficiently, offering both parallelizable training and fast inference. A key design challenge is striking the right balance between maximizing expressivity and limiting this computational burden. Our approach, CompreSSM, applies to Linear Time-Invariant SSMs such as Linear Recurrent Units, but is also extendable to selective models.
arXiv Detail & Related papers (2025-10-03T09:02:33Z) - Discriminative Policy Optimization for Token-Level Reward Models [55.98642069903191]
Process reward models (PRMs) provide more nuanced supervision compared to outcome reward models (ORMs). Q-RM explicitly learns token-level Q-functions from preference data without relying on fine-grained annotations. Reinforcement learning with Q-RM significantly enhances training efficiency, achieving convergence 12 times faster than ORM on GSM8K and 11 times faster than step-level PRM on MATH.
arXiv Detail & Related papers (2025-05-29T11:40:34Z) - QMamba: Post-Training Quantization for Vision State Space Models [45.97843526485619]
State Space Models (SSMs) have gained increasing attention for vision models recently.
Given the computational cost of deploying SSMs on resource-limited edge devices, Post-Training Quantization (PTQ) offers a promising route to efficient SSM deployment.
We propose QMamba, one of the first PTQ frameworks designed for vision SSMs, based on an analysis of the activation distributions in SSMs.
arXiv Detail & Related papers (2025-01-23T12:45:20Z) - EQ-CBM: A Probabilistic Concept Bottleneck with Energy-based Models and Quantized Vectors [4.481898130085069]
Concept bottleneck models (CBMs) have gained attention as an effective approach by leveraging human-understandable concepts to enhance interpretability.
Existing CBMs face challenges due to deterministic concept encoding and reliance on inconsistent concepts, leading to inaccuracies.
We propose EQ-CBM, a novel framework that enhances CBMs through probabilistic concept encoding.
arXiv Detail & Related papers (2024-09-22T23:43:45Z) - QKSAN: A Quantum Kernel Self-Attention Network [53.96779043113156]
A Quantum Kernel Self-Attention Mechanism (QKSAM) is introduced to combine the data representation merit of Quantum Kernel Methods (QKM) with the efficient information extraction capability of SAM.
A Quantum Kernel Self-Attention Network (QKSAN) framework is proposed based on QKSAM, which ingeniously incorporates the Deferred Measurement Principle (DMP) and conditional measurement techniques.
Four QKSAN sub-models are deployed on PennyLane and IBM Qiskit platforms to perform binary classification on MNIST and Fashion MNIST.
arXiv Detail & Related papers (2023-08-25T15:08:19Z) - Human Trajectory Forecasting with Explainable Behavioral Uncertainty [63.62824628085961]
Human trajectory forecasting helps to understand and predict human behaviors, enabling applications from social robots to self-driving cars.
Model-free methods offer superior prediction accuracy but lack explainability, while model-based methods provide explainability but predict less accurately.
We show that BNSP-SFM achieves up to a 50% improvement in prediction accuracy, compared with 11 state-of-the-art methods.
arXiv Detail & Related papers (2023-07-04T16:45:21Z) - Predictable MDP Abstraction for Unsupervised Model-Based RL [93.91375268580806]
We propose predictable MDP abstraction (PMA).
Instead of training a predictive model on the original MDP, we train a model on a transformed MDP with a learned action space.
We theoretically analyze PMA and empirically demonstrate that PMA leads to significant improvements over prior unsupervised model-based RL approaches.
arXiv Detail & Related papers (2023-02-08T07:37:51Z) - Measuring the Driving Forces of Predictive Performance: Application to
Credit Scoring [0.0]
In credit scoring, machine learning models are known to outperform standard parametric models.
We introduce the XPER methodology to decompose a performance metric into contributions associated with the features of a model.
We show that a small number of features can explain a surprisingly large part of the model performance.
arXiv Detail & Related papers (2022-12-12T13:09:46Z) - I saw, I conceived, I concluded: Progressive Concepts as Bottlenecks [2.9398911304923447]
Concept bottleneck models (CBMs) provide explainability and intervention during inference by correcting predicted, intermediate concepts.
This makes CBMs attractive for high-stakes decision-making.
We take the quality assessment of fetal ultrasound scans as a real-life use case for CBM decision support in healthcare.
arXiv Detail & Related papers (2022-11-19T09:31:19Z) - Do Quantum Circuit Born Machines Generalize? [58.720142291102135]
We present the first work in the literature that presents the QCBM's generalization performance as an integral evaluation metric for quantum generative models.
We show that the QCBM is able to effectively learn the reweighted dataset and generate unseen samples with higher quality than those in the training set.
arXiv Detail & Related papers (2022-07-27T17:06:34Z) - EBMs vs. CL: Exploring Self-Supervised Visual Pretraining for Visual
Question Answering [53.40635559899501]
A lack of clean and diverse labeled data is a major roadblock for training models on complex tasks such as visual question answering (VQA).
We review and evaluate self-supervised methods to leverage unlabeled images and pretrain a model, which we then fine-tune on a custom VQA task.
We find that both EBMs and CL can learn representations from unlabeled images that enable training a VQA model on very little annotated data.
arXiv Detail & Related papers (2022-06-29T01:44:23Z) - Your Autoregressive Generative Model Can be Better If You Treat It as an
Energy-Based One [83.5162421521224]
We propose a unique method termed E-ARM for training autoregressive generative models.
E-ARM takes advantage of a well-designed energy-based learning objective.
We show that E-ARM can be trained efficiently and is capable of alleviating the exposure bias problem.
arXiv Detail & Related papers (2022-06-26T10:58:41Z) - Post-hoc Concept Bottleneck Models [11.358495577593441]
Concept Bottleneck Models (CBMs) map the inputs onto a set of interpretable concepts and use the concepts to make predictions.
CBMs are restrictive in practice as they require concept labels in the training data to learn the bottleneck and do not leverage strong pretrained models.
We show that we can turn any neural network into a Post-hoc Concept Bottleneck Model (PCBM) without sacrificing model performance while still retaining interpretability benefits.
arXiv Detail & Related papers (2022-05-31T00:29:26Z) - $\mathcal{F}$-EBM: Energy Based Learning of Functional Data [1.0896567381206714]
Energy-Based Models (EBMs) have proven to be a highly effective approach for modelling densities on finite-dimensional spaces.
We present a novel class of EBM which is able to learn distributions of functions from functional samples evaluated at finitely many points.
arXiv Detail & Related papers (2022-02-04T01:01:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.