Uncertainty-Masked Bernoulli Diffusion for Camouflaged Object Detection Refinement
- URL: http://arxiv.org/abs/2506.10712v1
- Date: Thu, 12 Jun 2025 14:02:18 GMT
- Title: Uncertainty-Masked Bernoulli Diffusion for Camouflaged Object Detection Refinement
- Authors: Yuqi Shen, Fengyang Xiao, Sujie Hu, Youwei Pang, Yifan Pu, Chengyu Fang, Xiu Li, Chunming He,
- Abstract summary: Camouflaged Object Detection (COD) presents inherent challenges due to subtle visual differences between targets and their backgrounds.<n>We propose the Uncertainty-Masked Bernoulli Diffusion (UMBD) model, the first generative refinement framework specifically designed for COD.<n>UMBD introduces an uncertainty-guided masking mechanism that selectively applies Bernoulli diffusion to residual regions with poor segmentation quality.
- Score: 24.522233459116354
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Camouflaged Object Detection (COD) presents inherent challenges due to the subtle visual differences between targets and their backgrounds. While existing methods have made notable progress, there remains significant potential for post-processing refinement that has yet to be fully explored. To address this limitation, we propose the Uncertainty-Masked Bernoulli Diffusion (UMBD) model, the first generative refinement framework specifically designed for COD. UMBD introduces an uncertainty-guided masking mechanism that selectively applies Bernoulli diffusion to residual regions with poor segmentation quality, enabling targeted refinement while preserving correctly segmented areas. To support this process, we design the Hybrid Uncertainty Quantification Network (HUQNet), which employs a multi-branch architecture and fuses uncertainty from multiple sources to improve estimation accuracy. This enables adaptive guidance during the generative sampling process. The proposed UMBD framework can be seamlessly integrated with a wide range of existing Encoder-Decoder-based COD models, combining their discriminative capabilities with the generative advantages of diffusion-based refinement. Extensive experiments across multiple COD benchmarks demonstrate consistent performance improvements, achieving average gains of 5.5% in MAE and 3.2% in weighted F-measure with only modest computational overhead. Code will be released.
Related papers
- Overcoming Joint Intractability with Lossless Hierarchical Speculative Decoding [58.92526489742584]
We propose provably lossless.<n> verification method that significantly boosts the expected number of accepted tokens.<n>We show that HSD yields consistent improvements in acceptance rates across diverse model families and benchmarks.
arXiv Detail & Related papers (2026-01-09T11:10:29Z) - Beyond Confidence: Adaptive and Coherent Decoding for Diffusion Language Models [64.92045568376705]
Coherent Contextual Decoding (CCD) is a novel inference framework built upon two core innovations.<n>CCD employs a trajectory rectification mechanism that leverages historical context to enhance sequence coherence.<n>Instead of rigid allocations based on diffusion steps, we introduce an adaptive sampling strategy that dynamically adjusts the unmasking budget for each step.
arXiv Detail & Related papers (2025-11-26T09:49:48Z) - PDAC: Efficient Coreset Selection for Continual Learning via Probability Density Awareness [19.191960069245354]
Rehearsal-based Continual Learning (CL) maintains a limited memory buffer to store replay samples for knowledge retention.<n>Current Rehearsal-based CL methods typically construct the memory buffer by selecting a representative subset.<n>We propose the Probability Density-Aware Coreset (PDAC) method to estimate each sample's joint density, enabling efficient density-prioritized buffer selection.
arXiv Detail & Related papers (2025-11-12T17:00:21Z) - Diffusion-Driven Progressive Target Manipulation for Source-Free Domain Adaptation [108.0345347464393]
Source-free domain adaptation (SFDA) is a challenging task that tackles domain shifts using only a pre-trained source model and unlabeled target data.<n>Non-generation SFDA methods suffer from unreliable pseudo-labels in challenging scenarios with large domain discrepancies.<n>We propose a novel generation-based framework named Diffusion-Driven Progressive Target Manipulation.
arXiv Detail & Related papers (2025-10-29T08:38:03Z) - A Gradient Guided Diffusion Framework for Chance Constrained Programming [38.33757282877801]
Chance constrained Diffusion (CCP) is a powerful framework for addressing optimization problems under uncertainty.<n>In this paper, we introduce a novelGuidedOpt-based framework which tackles problems through three key innovations.
arXiv Detail & Related papers (2025-10-14T07:40:55Z) - SPREAD: Sampling-based Pareto front Refinement via Efficient Adaptive Diffusion [0.8594140167290097]
SPREAD is a generative framework based on Denoising Diffusion Probabilistic Models (DDPMs)<n>It learns a conditional diffusion process over points sampled from the decision space.<n>It refines candidates via a sampling scheme that uses an adaptive multiple gradient descent-inspired update for fast convergence.
arXiv Detail & Related papers (2025-09-25T12:09:37Z) - Towards Better Code Generation: Adaptive Decoding with Uncertainty Guidance [28.99265405319943]
We introduce AdaDec, an adaptive decoding framework guided by token-level uncertainty quantified via Shannon entropy.<n>AdaDec achieves up to a 15.5% improvement in Pass@1 accuracy compared to greedy decoding, matches or outperforms traditional beam search.
arXiv Detail & Related papers (2025-06-10T16:49:46Z) - Channel Fingerprint Construction for Massive MIMO: A Deep Conditional Generative Approach [65.47969413708344]
We introduce the concept of CF twins and design a conditional generative diffusion model (CGDM)<n>We employ a variational inference technique to derive the evidence lower bound (ELBO) for the log-marginal distribution of the observed fine-grained CF conditioned on the coarse-grained CF.<n>We show that the proposed approach exhibits significant improvement in reconstruction performance compared to the baselines.
arXiv Detail & Related papers (2025-05-12T01:36:06Z) - One-Step Diffusion Model for Image Motion-Deblurring [85.76149042561507]
We propose a one-step diffusion model for deblurring (OSDD), a novel framework that reduces the denoising process to a single step.<n>To tackle fidelity loss in diffusion models, we introduce an enhanced variational autoencoder (eVAE), which improves structural restoration.<n>Our method achieves strong performance on both full and no-reference metrics.
arXiv Detail & Related papers (2025-03-09T09:39:57Z) - Theoretical Insights in Model Inversion Robustness and Conditional Entropy Maximization for Collaborative Inference Systems [89.35169042718739]
collaborative inference enables end users to leverage powerful deep learning models without exposure of sensitive raw data to cloud servers.<n>Recent studies have revealed that these intermediate features may not sufficiently preserve privacy, as information can be leaked and raw data can be reconstructed via model inversion attacks (MIAs)<n>This work first theoretically proves that the conditional entropy of inputs given intermediate features provides a guaranteed lower bound on the reconstruction mean square error (MSE) under any MIA.<n>Then, we derive a differentiable and solvable measure for bounding this conditional entropy based on the Gaussian mixture estimation and propose a conditional entropy algorithm to enhance the inversion robustness
arXiv Detail & Related papers (2025-03-01T07:15:21Z) - Rectified Diffusion Guidance for Conditional Generation [62.00207951161297]
We revisit the theory behind CFG and rigorously confirm that the improper configuration of the combination coefficients (i.e., the widely used summing-to-one version) brings about expectation shift of the generative distribution.
We propose ReCFG with a relaxation on the guidance coefficients such that denoising with ReCFG strictly aligns with the diffusion theory.
That way the rectified coefficients can be readily pre-computed via traversing the observed data, leaving the sampling speed barely affected.
arXiv Detail & Related papers (2024-10-24T13:41:32Z) - Improved Variational Inference in Discrete VAEs using Error Correcting Codes [3.053842954605396]
This work introduces a novel method to improve inference in discrete Variational Autoencoders by reframing the problem through a generative perspective.<n>We conceptualize the model as a communication system, and propose to leverage Error-Correcting Codes (ECCs) to introduce redundancy in latent representations.<n>We present a proof-of-concept using a Discrete Variational Autoencoder with binary latent variables and low-complexity repetition codes, extending it to a hierarchical structure for disentangling global and local data features.
arXiv Detail & Related papers (2024-10-10T11:59:58Z) - Generative Edge Detection with Stable Diffusion [52.870631376660924]
Edge detection is typically viewed as a pixel-level classification problem mainly addressed by discriminative methods.
We propose a novel approach, named Generative Edge Detector (GED), by fully utilizing the potential of the pre-trained stable diffusion model.
We conduct extensive experiments on multiple datasets and achieve competitive performance.
arXiv Detail & Related papers (2024-10-04T01:52:23Z) - Gaussian Process Upper Confidence Bounds in Distributed Point Target Tracking over Wireless Sensor Networks [8.837529873076235]
This paper proposes a distributed Gaussian process (DGP) approach for point target tracking and derives upper confidence bounds (UCBs) of the state estimates.
A novel hybrid Bayesian filtering method is proposed to enhance the DGP approach by adopting a Poisson measurement likelihood model.
Numerical results demonstrate the tracking accuracy and robustness of the proposed approaches.
arXiv Detail & Related papers (2024-09-11T22:42:11Z) - Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications.
Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space.
We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z) - Predictive Uncertainty Quantification for Bird's Eye View Segmentation: A Benchmark and Novel Loss Function [10.193504550494486]
This paper introduces a benchmark for predictive uncertainty quantification in Bird's Eye View (BEV) segmentation.<n>Our study focuses on the effectiveness of quantified uncertainty in detecting misclassified and out-of-distribution pixels.<n>We propose a novel loss function, Uncertainty-Focal-Cross-Entropy (UFCE), specifically designed for highly imbalanced data.
arXiv Detail & Related papers (2024-05-31T16:32:46Z) - OccGen: Generative Multi-modal 3D Occupancy Prediction for Autonomous Driving [15.331332063879342]
OccGen is a simple yet powerful generative perception model for the task of 3D semantic occupancy prediction.
OccGen adopts a ''noise-to-occupancy'' generative paradigm, progressively inferring and refining the occupancy map.
A key insight of this generative pipeline is that the diffusion denoising process is naturally able to model the coarse-to-fine refinement of the dense 3D occupancy map.
arXiv Detail & Related papers (2024-04-23T13:20:09Z) - CamoDiffusion: Camouflaged Object Detection via Conditional Diffusion
Models [72.93652777646233]
Camouflaged Object Detection (COD) is a challenging task in computer vision due to the high similarity between camouflaged objects and their surroundings.
We propose a new paradigm that treats COD as a conditional mask-generation task leveraging diffusion models.
Our method, dubbed CamoDiffusion, employs the denoising process of diffusion models to iteratively reduce the noise of the mask.
arXiv Detail & Related papers (2023-05-29T07:49:44Z) - Deep Capsule Encoder-Decoder Network for Surrogate Modeling and
Uncertainty Quantification [0.0]
The proposed framework is developed by adapting Capsule Network (CapsNet) architecture into image-to-image regression encoder-decoder network.
The obtained results from performance evaluation indicate that the proposed approach is accurate, efficient, and robust.
arXiv Detail & Related papers (2022-01-19T17:45:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.