Efficient Pedestrian Detection in Top-View Fisheye Images Using
Compositions of Perspective View Patches
- URL: http://arxiv.org/abs/2009.02711v2
- Date: Fri, 23 Oct 2020 04:48:30 GMT
- Title: Efficient Pedestrian Detection in Top-View Fisheye Images Using
Compositions of Perspective View Patches
- Authors: Sheng-Ho Chiang, Tsaipei Wang, Yi-Fu Chen
- Abstract summary: Existing detectors designed for perspective images do not perform as successfully on images taken with top-view fisheye cameras.
In our proposed approach, several perspective views are generated from a fisheye image and then form a composite image.
As pedestrians in this composite image are more likely to be upright, existing detectors designed and trained for perspective images can be applied directly without additional training.
The detection performance on several public datasets compares favorably with state-of-the-art results.
- Score: 3.5706999675827413
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pedestrian detection in images is a topic that has been studied extensively,
but existing detectors designed for perspective images do not perform as
successfully on images taken with top-view fisheye cameras, mainly due to the
orientation variation of people in such images. In our proposed approach,
several perspective views are generated from a fisheye image and then
concatenated to form a composite image. As pedestrians in this composite image
are more likely to be upright, existing detectors designed and trained for
perspective images can be applied directly without additional training. We also
describe a new method of mapping detection bounding boxes from the perspective
views to the fisheye frame. The detection performance on several public
datasets compares favorably with state-of-the-art results.
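The view-generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an equidistant fisheye model (r = focal_fish * theta) with the optical axis pointing straight down, and every function name and parameter (`perspective_to_fisheye`, `render_patch`, `compose_views`, `tilt`, `focal_persp`, `focal_fish`) is hypothetical. The paper's bounding-box mapping method is not reproduced here; only the per-pixel mapping from a perspective view back to the fisheye frame is shown.

```python
import math


def perspective_to_fisheye(u, v, width, height, focal_persp,
                           tilt, azimuth, focal_fish, cx, cy):
    """Map pixel (u, v) of a virtual perspective view to fisheye coordinates.

    Assumption: equidistant fisheye projection, camera looking straight down.
    """
    # Offsets from the patch centre (x to the right, y downward in the patch).
    x = u - width / 2.0
    y = v - height / 2.0
    st, ct = math.sin(tilt), math.cos(tilt)
    sa, ca = math.sin(azimuth), math.cos(azimuth)
    # Axes of the virtual camera expressed in the fisheye camera frame.
    # The patch "down" axis points back toward the fisheye centre, so a
    # standing person (head farther from the centre than feet) appears upright.
    x_axis = (-sa, ca, 0.0)
    y_axis = (-ct * ca, -ct * sa, st)
    z_axis = (st * ca, st * sa, ct)
    # Viewing ray through (u, v), in the fisheye frame.
    d = [x * x_axis[i] + y * y_axis[i] + focal_persp * z_axis[i]
         for i in range(3)]
    theta = math.atan2(math.hypot(d[0], d[1]), d[2])  # angle from optical axis
    phi = math.atan2(d[1], d[0])                      # azimuth in the image
    r = focal_fish * theta                            # equidistant model
    return cx + r * math.cos(phi), cy + r * math.sin(phi)


def render_patch(fisheye, width, height, focal_persp, tilt, azimuth, focal_fish):
    """Render one perspective patch by nearest-neighbour sampling."""
    rows, cols = len(fisheye), len(fisheye[0])
    cx, cy = cols / 2.0, rows / 2.0
    patch = []
    for v in range(height):
        row = []
        for u in range(width):
            fx, fy = perspective_to_fisheye(u, v, width, height, focal_persp,
                                            tilt, azimuth, focal_fish, cx, cy)
            ix, iy = int(round(fx)), int(round(fy))
            inside = 0 <= ix < cols and 0 <= iy < rows
            row.append(fisheye[iy][ix] if inside else 0)
        patch.append(row)
    return patch


def compose_views(fisheye, n_views, width, height, focal_persp, tilt, focal_fish):
    """Concatenate n_views patches, evenly spaced in azimuth, side by side."""
    patches = [render_patch(fisheye, width, height, focal_persp, tilt,
                            2.0 * math.pi * k / n_views, focal_fish)
               for k in range(n_views)]
    return [sum((p[v] for p in patches), []) for v in range(height)]
```

Under these assumptions, a detector trained on perspective images could be run on the output of `compose_views`, and a detected point could be carried back to the fisheye frame with `perspective_to_fisheye` after subtracting the horizontal offset of its patch.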
Related papers
- Panoramic Distortion-Aware Tokenization for Person Detection and Localization Using Transformers in Overhead Fisheye Images [9.018416031676136]
Person detection is an open challenge because of factors including person rotation and small-sized persons.
We convert fisheye images into panoramic images using panoramic distortion-aware tokenization.
We propose a person detection and localization method that combines panoramic-image remapping and the tokenization procedure.
arXiv Detail & Related papers (2025-03-18T13:05:41Z)
- Multi-View People Detection in Large Scenes via Supervised View-Wise Contribution Weighting [44.48514301889318]
This paper focuses on improving multi-view people detection by developing a supervised view-wise contribution weighting approach.
A large synthetic dataset is adopted to enhance the model's generalization ability.
Experimental results demonstrate the effectiveness of our approach in achieving promising cross-scene multi-view people detection performance.
arXiv Detail & Related papers (2024-05-30T11:03:27Z)
- Robust Multi-Modal Image Stitching for Improved Scene Understanding [2.0476854378186102]
We've devised a unique and comprehensive image-stitching pipeline that taps into OpenCV's stitching module.
Our approach integrates feature-based matching, transformation estimation, and blending techniques to bring about panoramic views that are of top-tier quality.
arXiv Detail & Related papers (2023-12-28T13:24:48Z)
- SimFIR: A Simple Framework for Fisheye Image Rectification with Self-supervised Representation Learning [105.01294305972037]
We introduce SimFIR, a framework for fisheye image rectification based on self-supervised representation learning.
To learn fine-grained distortion representations, we first split a fisheye image into multiple patches and extract their representations with a Vision Transformer.
The transfer performance on the downstream rectification task is remarkably boosted, which verifies the effectiveness of the learned representations.
arXiv Detail & Related papers (2023-08-17T15:20:17Z)
- A Stronger Stitching Algorithm for Fisheye Images based on Deblurring and Registration [3.6417475195085602]
We devise a stronger stitching algorithm for fisheye images by combining the traditional image processing method with deep learning.
In the fisheye image correction stage, we propose the Attention-based Nonlinear Activation Free Network (ANAFNet) to deblur fisheye images corrected by a calibration method.
For image registration, we propose ORB-FREAK-GMS (OFG), a comprehensive image matching algorithm, to improve registration accuracy.
arXiv Detail & Related papers (2023-07-22T06:54:16Z)
- Self-supervised Interest Point Detection and Description for Fisheye and Perspective Images [7.451395029642832]
Keypoint detection and matching is a fundamental task in many computer vision problems.
In this work, we focus on the case when image distortion is caused by the geometry of the cameras used for image acquisition.
We build on a state-of-the-art approach and derive a self-supervised procedure that enables training an interest point detector and descriptor network.
arXiv Detail & Related papers (2023-06-02T22:39:33Z)
- Generalizable Person Re-Identification via Viewpoint Alignment and Fusion [74.30861504619851]
This work proposes to use a 3D dense pose estimation model and a texture mapping module to map pedestrian images to canonical view images.
Due to the imperfection of the texture mapping module, the canonical view images may lose the discriminative detail clues from the original images.
We show that our method can lead to superior performance over the existing approaches in various evaluation settings.
arXiv Detail & Related papers (2022-12-05T16:24:09Z)
- ObjectFormer for Image Manipulation Detection and Localization [118.89882740099137]
We propose ObjectFormer to detect and localize image manipulations.
We extract high-frequency features of the images and combine them with RGB features as multimodal patch embeddings.
We conduct extensive experiments on various datasets and the results verify the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-03-28T12:27:34Z)
- Series Photo Selection via Multi-view Graph Learning [52.33318426088579]
Series photo selection (SPS) is an important branch of the image aesthetics quality assessment.
We leverage a graph neural network to construct the relationships between multi-view features.
A siamese network is proposed to select the best one from a series of nearly identical photos.
arXiv Detail & Related papers (2022-03-18T04:23:25Z)
- FisheyeSuperPoint: Keypoint Detection and Description Network for Fisheye Images [2.187613144178315]
Keypoint detection and description is a commonly used building block in computer vision systems.
SuperPoint is a self-supervised keypoint detector and descriptor that has achieved state-of-the-art results on homography estimation.
We introduce a fisheye adaptation pipeline to enable training on undistorted fisheye images.
arXiv Detail & Related papers (2021-02-27T11:26:34Z)
- Look here! A parametric learning based approach to redirect visual attention [49.609412873346386]
We introduce an automatic method to make an image region more attention-capturing via subtle image edits.
Our model predicts a distinct set of global parametric transformations to be applied to the foreground and background image regions.
Our edits enable inference at interactive rates on any image size, and easily generalize to videos.
arXiv Detail & Related papers (2020-08-12T16:08:36Z)
- Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images [59.906948203578544]
We introduce a novel learning-based method to reconstruct the high-quality geometry and complex, spatially-varying BRDF of an arbitrary object.
We first estimate per-view depth maps using a deep multi-view stereo network.
These depth maps are used to coarsely align the different views.
We propose a novel multi-view reflectance estimation network architecture.
arXiv Detail & Related papers (2020-03-27T21:28:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.