A comprehensive overview of deep learning models for object detection from videos/images
- URL: http://arxiv.org/abs/2601.14677v1
- Date: Wed, 21 Jan 2026 05:50:21 GMT
- Title: A comprehensive overview of deep learning models for object detection from videos/images
- Authors: Sukana Zulfqar, Sadia Saeed, M. Azam Zia, Anjum Ali, Faisal Mehmood, Abid Ali
- Abstract summary: Object detection in video and image surveillance is a well-established yet rapidly evolving task, strongly influenced by recent deep learning advancements. This review examines architectural innovations, generative model integration, and the use of temporal information to enhance robustness and accuracy. The primary goal is to evaluate the current effectiveness of semantic object detection, while analysing deep learning models and their practical applications.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Object detection in video and image surveillance is a well-established yet rapidly evolving task, strongly influenced by recent deep learning advancements. This review summarises modern techniques by examining architectural innovations, generative model integration, and the use of temporal information to enhance robustness and accuracy. Unlike earlier surveys, it classifies methods based on core architectures, data processing strategies, and surveillance-specific challenges such as dynamic environments, occlusions, lighting variations, and real-time requirements. The primary goal is to evaluate the current effectiveness of semantic object detection, while secondary aims include analysing deep learning models and their practical applications. The review covers CNN-based detectors, GAN-assisted approaches, and temporal fusion methods, highlighting how generative models support tasks such as reconstructing missing frames, reducing occlusions, and normalising illumination. It also outlines preprocessing pipelines, feature extraction progress, benchmarking datasets, and comparative evaluations. Finally, emerging trends in low-latency, efficient, and spatiotemporal learning approaches are identified for future research.
Related papers
- Deep Learning for Crack Detection: A Review of Learning Paradigms, Generalizability, and Datasets [4.874652036065497]
Crack detection plays a crucial role in civil infrastructures, including inspection of pavements, buildings, etc. Deep learning has significantly advanced this field in recent years. Emerging trends are reshaping the landscape, including transitions in learning paradigms and improvements in generalizability.
arXiv Detail & Related papers (2025-08-14T00:47:00Z)
- A Systematic Investigation on Deep Learning-Based Omnidirectional Image and Video Super-Resolution [30.62413133817583]
This paper presents a systematic review of recent progress in omnidirectional image and video super-resolution. We introduce a new dataset, 360Insta, that comprises authentically degraded omnidirectional images and videos. We conduct comprehensive qualitative and quantitative evaluations of existing methods on both public datasets and our proposed dataset.
arXiv Detail & Related papers (2025-06-07T08:24:44Z)
- Underwater Object Detection in the Era of Artificial Intelligence: Current, Challenge, and Future [119.88454942558485]
Underwater object detection (UOD) aims to identify and localise objects in underwater images or videos.
In recent years, artificial intelligence (AI) based methods, especially deep learning methods, have shown promising performance in UOD.
arXiv Detail & Related papers (2024-10-08T00:25:33Z)
- Cross-Target Stance Detection: A Survey of Techniques, Datasets, and Challenges [7.242609314791262]
Cross-target stance detection is the task of determining the viewpoint expressed in a text towards a given target.
With the increasing need to analyze and mine viewpoints and opinions online, the task has recently seen a significant surge in interest.
This review paper examines the advancements in cross-target stance detection over the last decade.
arXiv Detail & Related papers (2024-09-20T15:49:14Z)
- Zero-Shot Object-Centric Representation Learning [72.43369950684057]
We study current object-centric methods through the lens of zero-shot generalization.
We introduce a benchmark comprising eight different synthetic and real-world datasets.
We find that training on diverse real-world images improves transferability to unseen scenarios.
arXiv Detail & Related papers (2024-08-17T10:37:07Z)
- Deep Learning-Based Object Pose Estimation: A Comprehensive Survey [73.74933379151419]
We discuss the recent advances in deep learning-based object pose estimation.
Our survey also covers multiple input data modalities, degrees-of-freedom of output poses, object properties, and downstream tasks.
arXiv Detail & Related papers (2024-05-13T14:44:22Z)
- Efficient Parameter Mining and Freezing for Continual Object Detection [0.0]
We propose efficient ways to identify which layers are the most important for a network to maintain the performance of a detector across sequential updates.
The presented findings highlight the substantial advantages of layer-level parameter isolation in facilitating incremental learning within object detection models.
arXiv Detail & Related papers (2024-02-20T01:07:32Z)
- Information-Theoretic Odometry Learning [83.36195426897768]
We propose a unified information theoretic framework for learning-motivated methods aimed at odometry estimation.
The proposed framework provides an elegant tool for performance evaluation and understanding in information-theoretic language.
arXiv Detail & Related papers (2022-03-11T02:37:35Z)
- Video Salient Object Detection via Contrastive Features and Attention Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z)
- Incremental Object Detection via Meta-Learning [77.55310507917012]
We propose a meta-learning approach that learns to reshape model gradients, such that information across incremental tasks is optimally shared.
In comparison to existing meta-learning methods, our approach is task-agnostic, allows incremental addition of new classes, and scales to high-capacity models for object detection.
arXiv Detail & Related papers (2020-03-17T13:40:00Z)