Context in object detection: a systematic literature review
- URL: http://arxiv.org/abs/2503.23249v1
- Date: Sat, 29 Mar 2025 23:21:28 GMT
- Title: Context in object detection: a systematic literature review
- Authors: Mahtab Jamali, Paul Davidsson, Reza Khoshkangini, Martin Georg Ljungqvist, Radu-Casian Mihailescu
- Abstract summary: This study explores the impact of various context-based approaches to object detection. More than 265 publications are included in this survey, covering different aspects of context in different categories of object detection.
- Score: 1.0310977366592338
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Context is an important factor in computer vision as it offers valuable information to clarify and analyze visual data. Utilizing the contextual information inherent in an image or a video can improve the precision and effectiveness of object detectors. For example, where recognizing an isolated object might be challenging, context information can improve comprehension of the scene. This study explores the impact of various context-based approaches to object detection. Initially, we investigate the role of context in object detection and survey it from several perspectives. We then review and discuss the most recent context-based object detection approaches and compare them. Finally, we conclude by addressing research questions and identifying gaps for further studies. More than 265 publications are included in this survey, covering different aspects of context in different categories of object detection, including general object detection, video object detection, small object detection, camouflaged object detection, zero-shot, one-shot, and few-shot object detection. This literature review presents a comprehensive overview of the latest advancements in context-based object detection, providing valuable contributions such as a thorough understanding of contextual information and effective methods for integrating various context types into object detection, thus benefiting researchers.
Related papers
- The devil is in the fine-grained details: Evaluating open-vocabulary object detectors for fine-grained understanding [8.448399308205266]
We introduce an evaluation protocol based on dynamic vocabulary generation to test whether models detect, discern, and assign the correct fine-grained description to objects.
We further enhance our investigation by evaluating several state-of-the-art open-vocabulary object detectors using the proposed protocol.
arXiv Detail & Related papers (2023-11-29T10:40:52Z)
- Contextual Object Detection with Multimodal Large Language Models [66.15566719178327]
We introduce a novel research problem of contextual object detection.
Three representative scenarios are investigated, including the language cloze test, visual captioning, and question answering.
We present ContextDET, a unified multimodal model that is capable of end-to-end differentiable modeling of visual-language contexts.
arXiv Detail & Related papers (2023-05-29T17:50:33Z)
- A Comprehensive Study on Object Detection Techniques in Unconstrained Environments [0.0]
Object detection is a crucial task in computer vision that aims to identify and localize objects in images or videos.
The recent advancements in deep learning and Convolutional Neural Networks (CNNs) have significantly improved the performance of object detection techniques.
This paper presents a comprehensive study of object detection techniques in unconstrained environments, including various challenges, datasets, and state-of-the-art approaches.
arXiv Detail & Related papers (2023-04-11T15:45:03Z)
- Oriented Object Detection in Optical Remote Sensing Images using Deep Learning: A Survey [10.665235711722076]
Oriented object detection is one of the most fundamental and challenging tasks in remote sensing.
Recent years have witnessed remarkable progress in oriented object detection using deep learning techniques.
arXiv Detail & Related papers (2023-02-21T06:31:53Z)
- Robust Region Feature Synthesizer for Zero-Shot Object Detection [87.79902339984142]
We build a novel zero-shot object detection framework that contains an Intra-class Semantic Diverging component and an Inter-class Structure Preserving component.
It is the first study to carry out zero-shot object detection in remote sensing imagery.
arXiv Detail & Related papers (2022-01-01T03:09:15Z)
- Contrastive Object Detection Using Knowledge Graph Embeddings [72.17159795485915]
We compare the error statistics of the class embeddings learned from a one-hot approach with semantically structured embeddings from natural language processing or knowledge graphs.
We propose a knowledge-embedded design for keypoint-based and transformer-based object detection architectures.
arXiv Detail & Related papers (2021-12-21T17:10:21Z)
- Class-agnostic Object Detection [16.97782147401037]
We propose class-agnostic object detection as a new problem that focuses on detecting objects irrespective of their object-classes.
Specifically, the goal is to predict bounding boxes for all objects in an image but not their object-classes.
We propose training and evaluation protocols for benchmarking class-agnostic detectors to advance future research in this domain.
arXiv Detail & Related papers (2020-11-28T19:22:38Z)
- Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of object with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of 18.9% mAP on COCO is observed if it is evaluated solely on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z)
- Learning Object Detection from Captions via Textual Scene Attributes [70.90708863394902]
We argue that captions contain much richer information about the image, including attributes of objects and their relations.
We present a method that uses the attributes in this "textual scene graph" to train object detectors.
We empirically demonstrate that the resulting model achieves state-of-the-art results on several challenging object detection datasets.
arXiv Detail & Related papers (2020-09-30T10:59:20Z)
- COBE: Contextualized Object Embeddings from Narrated Instructional Video [52.73710465010274]
We propose a new framework for learning Contextualized OBject Embeddings from automatically-transcribed narrations of instructional videos.
We leverage the semantic and compositional structure of language by training a visual detector to predict a contextualized word embedding of the object and its associated narration.
Our experiments show that our detector learns to predict a rich variety of contextual object information, and that it is highly effective in the settings of few-shot and zero-shot learning.
arXiv Detail & Related papers (2020-07-14T19:04:08Z)