Generic Event Boundary Detection via Denoising Diffusion
- URL: http://arxiv.org/abs/2508.12084v1
- Date: Sat, 16 Aug 2025 15:44:34 GMT
- Title: Generic Event Boundary Detection via Denoising Diffusion
- Authors: Jaejun Hwang, Dayoung Gong, Manjin Kim, Minsu Cho
- Abstract summary: Generic event boundary detection aims to identify natural boundaries in a video, segmenting it into distinct and meaningful chunks. Previous methods have focused on deterministic predictions, overlooking the diversity of plausible solutions. We introduce a novel diffusion-based boundary detection model, dubbed DiffGEBD, that tackles the problem of GEBD from a generative perspective.
- Score: 42.88245960369029
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generic event boundary detection (GEBD) aims to identify natural boundaries in a video, segmenting it into distinct and meaningful chunks. Despite the inherent subjectivity of event boundaries, previous methods have focused on deterministic predictions, overlooking the diversity of plausible solutions. In this paper, we introduce a novel diffusion-based boundary detection model, dubbed DiffGEBD, that tackles the problem of GEBD from a generative perspective. The proposed model encodes relevant changes across adjacent frames via temporal self-similarity and then iteratively decodes random noise into plausible event boundaries being conditioned on the encoded features. Classifier-free guidance allows the degree of diversity to be controlled in denoising diffusion. In addition, we introduce a new evaluation metric to assess the quality of predictions considering both diversity and fidelity. Experiments show that our method achieves strong performance on two standard benchmarks, Kinetics-GEBD and TAPOS, generating diverse and plausible event boundaries.
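The two ingredients named in the abstract, temporal self-similarity and classifier-free guidance, can be illustrated with a minimal sketch. This is not the DiffGEBD implementation: the function names `temporal_self_similarity` and `guided_noise`, the use of cosine similarity, and the guidance formula (a commonly used formulation) are all assumptions made for illustration, and the toy features stand in for real frame embeddings.

```python
import numpy as np


def temporal_self_similarity(features):
    """Cosine similarity between every pair of frame features.

    features: array of shape (T, D), one D-dimensional vector per frame.
    Returns a (T, T) matrix; entry (i, j) compares frame i with frame j.
    Cosine similarity is an assumption here, not taken from the paper.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-8, None)  # avoid division by zero
    return unit @ unit.T


def guided_noise(eps_cond, eps_uncond, scale):
    """Classifier-free guidance in one common formulation (assumed).

    scale = 0 gives the unconditional prediction, scale = 1 the
    conditional one, and larger values push further toward the condition.
    """
    return eps_uncond + scale * (eps_cond - eps_uncond)


# Toy video: three frames of one "event" followed by three of another.
frames = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
tsm = temporal_self_similarity(frames)

# Similarity between each frame and its successor drops at the event change.
adjacent = np.diag(tsm, k=1)
boundary_after_frame = int(np.argmin(adjacent))
print(adjacent, boundary_after_frame)
```

In this toy case the adjacent-frame similarity is high within each event and falls to zero between frames 2 and 3, which is the kind of change a boundary detector would condition on. A larger guidance scale in `guided_noise` ties the denoiser more tightly to the conditioning features, which is one plausible reading of how guidance trades diversity for fidelity.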
Related papers
- ABounD: Adversarial Boundary-Driven Few-Shot Learning for Multi-Class Anomaly Detection [24.691181948844136]
ABounD is an Adversarial Boundary-Driven few-shot learning framework for multi-class anomaly detection. It integrates semantic concept learning with decision boundary shaping. Experiments on MVTec-AD and VisA datasets demonstrate state-of-the-art performance.
arXiv Detail & Related papers (2025-11-27T13:18:22Z) - Online Generic Event Boundary Detection [27.34486732049466]
We introduce a new task, Online Generic Event Boundary Detection (On-GEBD), aiming to detect boundaries of generic events immediately in streaming videos. This task faces unique challenges of identifying subtle, taxonomy-free event changes in real time, without access to future frames. We propose a novel On-GEBD framework, inspired by Event Segmentation Theory (EST), which explains how humans segment ongoing activity into events by leveraging discrepancies between predicted and actual information.
arXiv Detail & Related papers (2025-10-08T10:23:45Z) - UniSegDiff: Boosting Unified Lesion Segmentation via a Staged Diffusion Model [53.34835793648352]
We propose UniSegDiff, a novel diffusion model framework for lesion segmentation. UniSegDiff addresses lesion segmentation in a unified manner across multiple modalities and organs. Comprehensive experimental results demonstrate that UniSegDiff significantly outperforms previous state-of-the-art (SOTA) approaches.
arXiv Detail & Related papers (2025-07-24T12:33:10Z) - Uncertainty-Masked Bernoulli Diffusion for Camouflaged Object Detection Refinement [24.522233459116354]
Camouflaged Object Detection (COD) presents inherent challenges due to subtle visual differences between targets and their backgrounds. We propose the Uncertainty-Masked Bernoulli Diffusion (UMBD) model, the first generative refinement framework specifically designed for COD. UMBD introduces an uncertainty-guided masking mechanism that selectively applies Bernoulli diffusion to residual regions with poor segmentation quality.
arXiv Detail & Related papers (2025-06-12T14:02:18Z) - Generative Edge Detection with Stable Diffusion [52.870631376660924]
Edge detection is typically viewed as a pixel-level classification problem mainly addressed by discriminative methods.
We propose a novel approach, named Generative Edge Detector (GED), by fully utilizing the potential of the pre-trained stable diffusion model.
We conduct extensive experiments on multiple datasets and achieve competitive performance.
arXiv Detail & Related papers (2024-10-04T01:52:23Z) - Coarse-to-Fine Proposal Refinement Framework for Audio Temporal Forgery Detection and Localization [60.899082019130766]
We introduce a frame-level detection network (FDN) and a proposal refinement network (PRN) for audio temporal forgery detection and localization.
FDN aims to mine informative inconsistency cues between real and fake frames to obtain discriminative features that are beneficial for roughly indicating forgery regions.
PRN is responsible for predicting confidence scores and regression offsets to refine the coarse-grained proposals derived from the FDN.
arXiv Detail & Related papers (2024-07-23T15:07:52Z) - Fine-grained Dynamic Network for Generic Event Boundary Detection [9.17191007695011]
We propose a novel dynamic pipeline for generic event boundaries named DyBDet.
By introducing a multi-exit network architecture, DyBDet automatically learns the allocation to different video snippets.
Experiments on the challenging Kinetics-GEBD and TAPOS datasets demonstrate that adopting the dynamic strategy significantly benefits GEBD tasks.
arXiv Detail & Related papers (2024-07-05T06:02:46Z) - DiffSED: Sound Event Detection with Denoising Diffusion [70.18051526555512]
We reformulate the SED problem by taking a generative learning perspective.
Specifically, we aim to generate sound temporal boundaries from noisy proposals in a denoising diffusion process.
During training, our model learns to reverse the noising process by converting noisy latent queries to the groundtruth versions.
arXiv Detail & Related papers (2023-08-14T17:29:41Z) - Implicit neural representation for change detection [15.741202788959075]
Most commonly used approaches to detecting changes in point clouds are based on supervised methods.
We propose an unsupervised approach that comprises two components: Implicit Neural Representation (INR) for continuous shape reconstruction and a Gaussian Mixture Model for categorising changes.
We apply our method to a benchmark dataset comprising simulated LiDAR point clouds for urban sprawling.
arXiv Detail & Related papers (2023-07-28T09:26:00Z) - B-BACN: Bayesian Boundary-Aware Convolutional Network for Crack Characterization [4.447467536572625]
Uncertainty of crack detection is challenging due to various factors, such as measurement noises, signal processing, and model simplifications.
A machine learning-based approach is proposed to quantify both epistemic and aleatoric uncertainties concurrently.
We introduce a Bayesian Boundary-Aware Convolutional Network (B-BACN) that emphasizes uncertainty-aware boundary refinement to generate precise and reliable crack boundary detections.
arXiv Detail & Related papers (2023-02-14T04:50:42Z) - UC-Net: Uncertainty Inspired RGB-D Saliency Detection via Conditional Variational Autoencoders [81.5490760424213]
We propose the first framework (UCNet) to employ uncertainty for RGB-D saliency detection by learning from the data labeling process.
Inspired by the saliency data labeling process, we propose a probabilistic RGB-D saliency detection network.
arXiv Detail & Related papers (2020-04-13T04:12:59Z)