Fast Neural Scene Flow
- URL: http://arxiv.org/abs/2304.09121v3
- Date: Tue, 29 Aug 2023 12:32:01 GMT
- Title: Fast Neural Scene Flow
- Authors: Xueqian Li, Jianqiao Zheng, Francesco Ferroni, Jhony Kaesemodel Pontes, Simon Lucey
- Abstract summary: A coordinate neural network estimates scene flow at runtime, without any training.
In this paper, we demonstrate that scene flow is different -- with the dominant computational bottleneck stemming from the loss function itself.
Our fast neural scene flow (FNSF) approach reports for the first time real-time performance comparable to learning methods.
- Score: 36.29234109363439
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural Scene Flow Prior (NSFP) is of significant interest to the vision
community due to its inherent robustness to out-of-distribution (OOD) effects
and its ability to deal with dense lidar points. The approach utilizes a
coordinate neural network to estimate scene flow at runtime, without any
training. However, it is up to 100 times slower than current state-of-the-art
learning methods. In other applications, such as image, video, and radiance
function reconstruction, innovations in speeding up the runtime performance of
coordinate networks have centered upon architectural changes. In this paper, we
demonstrate that scene flow is different -- with the dominant computational
bottleneck stemming from the loss function itself (i.e., Chamfer distance).
Further, we rediscover the distance transform (DT) as an efficient,
correspondence-free loss function that dramatically speeds up the runtime
optimization. Our fast neural scene flow (FNSF) approach reports, for the first
time, real-time performance comparable to learning methods, without any training
or OOD bias, on two of the largest open autonomous driving (AV) lidar datasets,
Waymo Open and Argoverse.
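The abstract's central claim, that the Chamfer loss rather than the network is the runtime bottleneck, is easiest to see side by side in code. Below is a minimal sketch (not the authors' implementation) contrasting a Chamfer loss with a precomputed distance transform lookup; the grid resolution, padding, and helper names are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt
from scipy.spatial import cKDTree

def chamfer_loss(pred_pts, target_pts):
    """One common Chamfer variant. Per-iteration cost: two
    nearest-neighbor searches, which is the bottleneck on dense lidar."""
    d_fwd, _ = cKDTree(target_pts).query(pred_pts)   # pred -> target
    d_bwd, _ = cKDTree(pred_pts).query(target_pts)   # target -> pred
    return d_fwd.mean() + d_bwd.mean()

def build_dt_grid(target_pts, cell=0.1, pad=2.0):
    """Built once per scene: a voxel grid storing, for every cell, the
    distance to the nearest occupied (target-point) cell."""
    lo = target_pts.min(axis=0) - pad
    shape = np.ceil((target_pts.max(axis=0) + pad - lo) / cell).astype(int) + 1
    empty = np.ones(shape, dtype=bool)
    idx = np.floor((target_pts - lo) / cell).astype(int)
    empty[tuple(idx.T)] = False                      # mark occupied voxels
    return distance_transform_edt(empty, sampling=cell), lo

def dt_loss(pred_pts, dt_grid, lo, cell=0.1):
    """Correspondence-free loss. Per-iteration cost: one array lookup per
    point. (A differentiable version would interpolate the grid, e.g.
    trilinearly, rather than take the nearest cell.)"""
    idx = np.floor((pred_pts - lo) / cell).astype(int)
    idx = np.clip(idx, 0, np.array(dt_grid.shape) - 1)
    return dt_grid[tuple(idx.T)].mean()
```

The asymmetry is the point: Chamfer rebuilds nearest-neighbor queries at every optimization step, while the DT grid is amortized, built once per scene, after which each loss evaluation is a single lookup per point.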
Related papers
- LeRF: Learning Resampling Function for Adaptive and Efficient Image Interpolation [64.34935748707673]
Recent deep neural networks (DNNs) have made impressive progress in performance by introducing learned data priors.
We propose a novel method of Learning Resampling Function (termed LeRF), which takes advantage of both the structural priors learned by DNNs and the locally continuous assumption.
LeRF assigns spatially varying resampling functions to input image pixels and learns to predict the shapes of these resampling functions with a neural network.
arXiv Detail & Related papers (2024-07-13T16:09:45Z)
- StreamFlow: Streamlined Multi-Frame Optical Flow Estimation for Video Sequences [31.210626775505407]
Occlusions between consecutive frames have long posed a significant challenge in optical flow estimation.
We present a Streamlined In-batch Multi-frame (SIM) pipeline tailored to video input, attaining a similar level of time efficiency to two-frame networks.
StreamFlow excels on the challenging KITTI and Sintel datasets, with particular improvement in occluded areas.
arXiv Detail & Related papers (2023-11-28T07:53:51Z)
- Deep Multi-Threshold Spiking-UNet for Image Processing [51.88730892920031]
This paper introduces the novel concept of Spiking-UNet for image processing, which combines the power of Spiking Neural Networks (SNNs) with the U-Net architecture.
To achieve an efficient Spiking-UNet, we face two primary challenges: ensuring high-fidelity information propagation through the network via spikes and formulating an effective training strategy.
Experimental results show that, on image segmentation and denoising, our Spiking-UNet achieves comparable performance to its non-spiking counterpart.
arXiv Detail & Related papers (2023-07-20T16:00:19Z)
- LOGO-Former: Local-Global Spatio-Temporal Transformer for Dynamic Facial Expression Recognition [19.5702895176141]
Previous methods for dynamic facial expression recognition (DFER) in the wild are mainly based on Convolutional Neural Networks (CNNs), whose local operations ignore the long-range dependencies in videos.
Transformer-based methods achieve better performance for DFER but result in higher FLOPs and computational costs; LOGO-Former applies local-global spatio-temporal attention to capture those dependencies at lower cost.
Experiments on two in-the-wild dynamic facial expression datasets (i.e., DFEW and FERV39K) indicate that our method provides an effective way to make use of the spatial and temporal dependencies for DFER.
arXiv Detail & Related papers (2023-05-05T07:53:13Z)
- EM-driven unsupervised learning for efficient motion segmentation [3.5232234532568376]
This paper presents a CNN-based fully unsupervised method for motion segmentation from optical flow.
We leverage the Expectation-Maximization (EM) framework to design the loss function and the training procedure of our motion segmentation neural network.
Our method outperforms comparable unsupervised methods and is very efficient.
arXiv Detail & Related papers (2022-01-06T14:35:45Z)
- Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic [137.04558017227583]
Actor-critic (AC) algorithms, empowered by neural networks, have had significant empirical success in recent years.
We take a mean-field perspective on the evolution and convergence of feature-based neural AC.
We prove that neural AC finds the globally optimal policy at a sublinear rate.
arXiv Detail & Related papers (2021-12-27T06:09:50Z)
- KORSAL: Key-point Detection based Online Real-Time Spatio-Temporal Action Localization [0.9507070656654633]
Real-time and online action localization in a video is a critical yet highly challenging problem.
Recent attempts achieve this by using computationally intensive 3D CNN architectures or highly redundant two-stream architectures with optical flow.
We propose utilizing fast and efficient key-point based bounding box prediction to spatially localize actions.
Our model achieves a frame rate of 41.8 FPS, which is a 10.7% improvement over contemporary real-time methods.
arXiv Detail & Related papers (2021-11-05T08:39:36Z)
- Neural Scene Flow Prior [30.878829330230797]
Before the deep learning revolution, many perception algorithms were based on runtime optimization in conjunction with a strong prior/regularization penalty.
This paper revisits scene flow with an approach that relies predominantly on runtime optimization and strong regularization.
A central innovation here is the inclusion of a neural scene flow prior, which uses the architecture of neural networks as a new type of implicit regularizer (a minimal sketch of this idea appears after the list below).
arXiv Detail & Related papers (2021-11-01T20:44:12Z)
- Optical-Flow-Reuse-Based Bidirectional Recurrent Network for Space-Time Video Super-Resolution [52.899234731501075]
Space-time video super-resolution (ST-VSR) simultaneously increases the spatial resolution and frame rate for a given video.
Existing methods typically suffer from difficulties in how to efficiently leverage information from a large range of neighboring frames.
We propose a coarse-to-fine bidirectional recurrent neural network instead of using ConvLSTM to leverage knowledge between adjacent frames.
arXiv Detail & Related papers (2021-10-13T15:21:30Z)
- End-to-end Learning for Inter-Vehicle Distance and Relative Velocity Estimation in ADAS with a Monocular Camera [81.66569124029313]
We propose a camera-based inter-vehicle distance and relative velocity estimation method based on end-to-end training of a deep neural network.
The key novelty of our method is the integration of multiple visual clues provided by any two time-consecutive monocular frames.
We also propose a vehicle-centric sampling mechanism to alleviate the effect of perspective distortion in the motion field.
arXiv Detail & Related papers (2020-06-07T08:18:31Z)
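To make the Neural Scene Flow Prior entry above concrete, here is a hedged sketch of the runtime-optimization idea it describes: a small coordinate MLP maps each point to a flow vector and is fit to a single scene at test time, with the network architecture itself serving as the implicit regularizer. Layer widths, iteration count, and the loss hook are illustrative assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn as nn

class SceneFlowMLP(nn.Module):
    """A coordinate network: maps a 3D point to its 3D flow vector.
    The limited capacity and smoothness of the MLP act as the implicit
    regularizer; no learned weights are carried across scenes."""
    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, pts):             # pts: (N, 3) float tensor
        return self.net(pts)

def optimize_flow(pts_t, loss_fn, iters=500, lr=1e-3):
    """Fit the MLP to one scene at runtime. loss_fn is any differentiable
    point loss comparing the warped frame-t points to frame t+1, e.g. a
    torch Chamfer or an interpolated DT lookup (hypothetical hooks)."""
    model = SceneFlowMLP()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(iters):
        opt.zero_grad()
        warped = pts_t + model(pts_t)   # warp frame t toward frame t+1
        loss_fn(warped).backward()
        opt.step()
    with torch.no_grad():
        return model(pts_t)             # estimated per-point scene flow
```

Pairing this loop with the DT loss sketched after the abstract, evaluated through a differentiable (e.g., trilinear) grid interpolation, is the essence of how FNSF removes the Chamfer bottleneck from runtime optimization.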