Knowledge Distillation for Object Detection: from generic to remote
sensing datasets
- URL: http://arxiv.org/abs/2307.09264v1
- Date: Tue, 18 Jul 2023 13:49:00 GMT
- Title: Knowledge Distillation for Object Detection: from generic to remote
sensing datasets
- Authors: Hoàng-Ân Lê and Minh-Tan Pham
- Abstract summary: We evaluate various off-the-shelf object detection knowledge distillation methods originally developed on generic computer vision datasets.
In particular, methods covering both logit mimicking and feature imitation approaches are applied to vehicle detection using well-known benchmarks such as the xView and VEDAI datasets.
- Score: 7.872075562968697
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge distillation, a well-known model compression technique, is an
active research area in both computer vision and remote sensing communities. In
this paper, we evaluate in a remote sensing context various off-the-shelf
object detection knowledge distillation methods which have been originally
developed on generic computer vision datasets such as Pascal VOC. In
particular, methods covering both logit mimicking and feature imitation
approaches are applied for vehicle detection using well-known benchmarks
such as the xView and VEDAI datasets. Extensive experiments are performed to
compare the relative performance and interrelationships of the methods.
Experimental results show high variations and confirm the importance of result
aggregation and cross validation on remote sensing datasets.
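The abstract names two families of distillation signals, logit mimicking and feature imitation. As a rough illustration only, the sketch below shows the most generic form of each: a temperature-softened KL divergence between teacher and student class logits, and a mean squared error between feature vectors. The function names, the temperature value, and the use of plain Python lists are assumptions made for this example; they are not taken from the paper, whose evaluated methods are detection-specific and likely add components (such as region selection or localization terms) that are not shown here.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over a list of class logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_sum_exp for x in scaled]


def logit_mimicking_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The T^2 factor is the usual scaling that keeps gradient magnitudes
    comparable across temperatures (an assumption of this sketch).
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return temperature ** 2 * kl


def feature_imitation_loss(student_feat, teacher_feat):
    """Mean squared error between two flattened feature vectors."""
    if len(student_feat) != len(teacher_feat):
        raise ValueError("feature vectors must have the same length")
    squared = [(s - t) ** 2 for s, t in zip(student_feat, teacher_feat)]
    return sum(squared) / len(squared)


if __name__ == "__main__":
    # Hypothetical logits and features for a single predicted box.
    teacher = [0.5, 2.0, -1.0]
    student = [0.1, 1.2, -0.3]
    print("logit mimicking loss:", logit_mimicking_loss(student, teacher))
    print("feature imitation loss:", feature_imitation_loss([0.2, 0.4], [0.1, 0.8]))
```

In a training loop, terms like these would typically be weighted and added to the student's ordinary detection loss; the weights are a tuning choice and are not specified in the abstract.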
Related papers
- Local Feature Matching Using Deep Learning: A Survey [19.322545965903608]
Local feature matching enjoys wide-ranging applications in the realm of computer vision, encompassing domains such as image retrieval, 3D reconstruction, and object recognition.
In recent years, the introduction of deep learning models has sparked widespread exploration into local feature matching techniques.
The paper also explores the practical application of local feature matching in diverse domains such as Structure from Motion, Remote Sensing Image Registration, and Medical Image Registration.
arXiv Detail & Related papers (2024-01-31T04:32:41Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- Object-centric Cross-modal Feature Distillation for Event-based Object Detection [87.50272918262361]
RGB detectors still outperform event-based detectors due to sparsity of the event data and missing visual details.
We develop a novel knowledge distillation approach to shrink the performance gap between these two modalities.
We show that object-centric distillation significantly improves the performance of the event-based student object detector.
arXiv Detail & Related papers (2023-11-09T16:33:08Z)
- Improving Cross-dataset Deepfake Detection with Deep Information Decomposition [57.284370468207214]
Deepfake technology poses a significant threat to security and social trust.
Existing detection methods suffer from sharp performance degradation when faced with cross-dataset scenarios.
We propose a deep information decomposition (DID) framework in this paper.
arXiv Detail & Related papers (2023-09-30T12:30:25Z)
- A Functional Data Perspective and Baseline On Multi-Layer Out-of-Distribution Detection [30.499548939422194]
Methods that explore the multiple layers either require a special architecture or a supervised objective to do so.
This work adopts an original approach based on a functional view of the network that exploits the sample's trajectories through the various layers and their statistical dependencies.
We validate our method and empirically demonstrate its effectiveness in OOD detection compared to strong state-of-the-art baselines on computer vision benchmarks.
arXiv Detail & Related papers (2023-06-06T09:14:05Z)
- Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection [65.30079184700755]
This study addresses the issue of fusing infrared and visible images that appear differently for object detection.
Previous approaches discover commonalities underlying the two modalities and fuse them in a common space, either by iterative optimization or with deep networks.
This paper proposes a bilevel optimization formulation for the joint problem of fusion and detection, and then unrolls to a target-aware Dual Adversarial Learning (TarDAL) network for fusion and a commonly used detection network.
arXiv Detail & Related papers (2022-03-30T11:44:56Z)
- Contrastive Object Detection Using Knowledge Graph Embeddings [72.17159795485915]
We compare the error statistics of the class embeddings learned from a one-hot approach with semantically structured embeddings from natural language processing or knowledge graphs.
We propose a knowledge-embedded design for keypoint-based and transformer-based object detection architectures.
arXiv Detail & Related papers (2021-12-21T17:10:21Z)
- Towards an efficient framework for Data Extraction from Chart Images [27.114170963444074]
We adopt state-of-the-art computer vision techniques for the data extraction stage in a data mining system.
To build a robust point detector, a fully convolutional network with a feature fusion module is adopted.
For data conversion, we translate the detected element into data with semantic value.
arXiv Detail & Related papers (2021-05-05T13:18:53Z)
- Visual Relationship Detection with Visual-Linguistic Knowledge from Multimodal Representations [103.00383924074585]
Visual relationship detection aims to reason over relationships among salient objects in images.
We propose a novel approach named Visual-Linguistic Representations from Transformers (RVL-BERT).
RVL-BERT performs spatial reasoning with both visual and language commonsense knowledge learned via self-supervised pre-training.
arXiv Detail & Related papers (2020-09-10T16:15:09Z)
- Sensor Data for Human Activity Recognition: Feature Representation and Benchmarking [27.061240686613182]
The field of Human Activity Recognition (HAR) focuses on obtaining and analysing data captured from monitoring devices (e.g. sensors).
We address the issue of accurately recognising human activities using different Machine Learning (ML) techniques.
arXiv Detail & Related papers (2020-05-15T00:46:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.