Related papers: SOTA: Spike-Navigated Optimal TrAnsport Saliency Region Detection in Composite-bias Videos

URL: http://arxiv.org/abs/2505.00394v1
Date: Thu, 01 May 2025 08:30:40 GMT
Title: SOTA: Spike-Navigated Optimal TrAnsport Saliency Region Detection in Composite-bias Videos
Authors: Wenxuan Liu, Yao Deng, Kang Chen, Xian Zhong, Zhaofei Yu, Tiejun Huang
Abstract summary: Spike-navigated Optimal TrAnsport Saliency Region Detection (SOTA) is a framework that leverages the strengths of spike cameras while mitigating biases in both spatial and temporal dimensions. Our method introduces Spike-based Micro-debias (SM) to capture subtle frame-to-frame variations. SOTA refines predictions by reducing inconsistencies across diverse conditions.
Score: 50.51658520045165
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Existing saliency detection methods struggle in real-world scenarios due to motion blur and occlusions. In contrast, spike cameras, with their high temporal resolution, significantly enhance visual saliency maps. However, the composite noise inherent to spike camera imaging introduces discontinuities in saliency detection. Low-quality samples further distort model predictions, leading to saliency bias. To address these challenges, we propose Spike-navigated Optimal TrAnsport Saliency Region Detection (SOTA), a framework that leverages the strengths of spike cameras while mitigating biases in both spatial and temporal dimensions. Our method introduces Spike-based Micro-debias (SM) to capture subtle frame-to-frame variations and preserve critical details, even under minimal scene or lighting changes. Additionally, Spike-based Global-debias (SG) refines predictions by reducing inconsistencies across diverse conditions. Extensive experiments on real and synthetic datasets demonstrate that SOTA outperforms existing methods by eliminating composite noise bias. Our code and dataset will be released at https://github.com/lwxfight/sota.

Related papers

Dynamic View Synthesis from Small Camera Motion Videos [56.359460602781304]
We present a novel view synthesis for dynamic 3D scenes based on distribution-based depth regularization. We also introduce constraints that enforce the volume density of spatial points before the object boundary along the ray to be near zero, ensuring that our model learns the correct geometry of the scene. We conduct extensive experiments to demonstrate the effectiveness of our approach in representing scenes with small camera motion input, and our results compare favorably to state-of-the-art methods.
arXiv Detail & Related papers (2025-06-29T09:17:55Z)
Multiple Object Tracking in Video SAR: A Benchmark and Tracking Baseline [6.467005601813546]
Video synthetic aperture radar (Video SAR) is used for multi-object tracking. Doppler shifts induced by target motion result in artifacts that are easily mistaken for shadows. A major limitation in this field is the lack of public benchmark datasets for standardized algorithm evaluation.
arXiv Detail & Related papers (2025-06-13T06:12:25Z)
SpikeStereoNet: A Brain-Inspired Framework for Stereo Depth Estimation from Spike Streams [43.43061247688823]
Bio-inspired spike cameras emit asynchronous events at microsecond-level resolution, providing an alternative sensing modality. Existing methods lack specialized stereo algorithms and benchmarks tailored to the spike data. We propose SpikeStereoNet, a brain-inspired framework and the first to estimate stereo depth directly from raw spike streams.
arXiv Detail & Related papers (2025-05-26T04:14:34Z)
SpikeNVS: Enhancing Novel View Synthesis from Blurry Images via Spike Camera [78.20482568602993]
Conventional RGB cameras are susceptible to motion blur. Neuromorphic cameras like event and spike cameras inherently capture more comprehensive temporal information. Our design can enhance novel view synthesis across NeRF and 3DGS.
arXiv Detail & Related papers (2024-04-10T03:31:32Z)
SpikeNeRF: Learning Neural Radiance Fields from Continuous Spike Stream [26.165424006344267]
Spike cameras offer distinct advantages over standard cameras. Existing approaches reliant on spike cameras often assume optimal illumination. We introduce SpikeNeRF, the first work that derives a NeRF-based volumetric scene representation from spike camera data.
arXiv Detail & Related papers (2024-03-17T13:51:25Z)
SpikeReveal: Unlocking Temporal Sequences from Real Blurry Inputs with Spike Streams [44.02794438687478]
Spike cameras have proven effective in capturing motion features and beneficial for solving this ill-posed problem. Existing methods fall into the supervised learning paradigm, which suffers from notable performance degradation when applied to real-world scenarios. We propose the first self-supervised framework for the task of spike-guided motion deblurring.
arXiv Detail & Related papers (2024-03-14T15:29:09Z)
RANRAC: Robust Neural Scene Representations via Random Ray Consensus [12.161889666145127]
RANdom RAy Consensus (RANRAC) is an efficient approach to eliminate the effect of inconsistent data. We formulate a fuzzy adaption of the RANSAC paradigm, enabling its application to large scale models. Results indicate significant improvements compared to state-of-the-art robust methods for novel-view synthesis.
arXiv Detail & Related papers (2023-12-15T13:33:09Z)
Implicit Event-RGBD Neural SLAM [54.74363487009845]
Implicit neural SLAM has achieved remarkable progress recently. Existing methods face significant challenges in non-ideal scenarios. We propose EN-SLAM, the first event-RGBD implicit neural SLAM framework.
arXiv Detail & Related papers (2023-11-18T08:48:58Z)
Spike Stream Denoising via Spike Camera Simulation [64.11994763727631]
We propose a systematic noise model for spike cameras based on their unique circuit. The first benchmark for spike stream denoising is proposed, which includes clear (noisy) spike streams. Experiments show that DnSS has promising performance on the proposed benchmark.
arXiv Detail & Related papers (2023-04-06T14:59:48Z)
Robust Unsupervised Video Anomaly Detection by Multi-Path Frame Prediction [61.17654438176999]
We propose a novel and robust unsupervised video anomaly detection method by frame prediction with proper design. Our proposed method obtains the frame-level AUROC score of 88.3% on the CUHK Avenue dataset.
arXiv Detail & Related papers (2020-11-05T11:34:12Z)
Learning Monocular Dense Depth from Events [53.078665310545745]
Event cameras produce brightness changes in the form of a stream of asynchronous events instead of intensity frames. Recent learning-based approaches have been applied to event-based data, such as monocular depth prediction. We propose a recurrent architecture to solve this task and show significant improvement over standard feed-forward methods.
arXiv Detail & Related papers (2020-10-16T12:36:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.