Object Detection and Recognition of Swap-Bodies using Camera mounted on
a Vehicle
- URL: http://arxiv.org/abs/2004.08118v1
- Date: Fri, 17 Apr 2020 08:49:54 GMT
- Title: Object Detection and Recognition of Swap-Bodies using Camera mounted on
a Vehicle
- Authors: Ebin Zacharias, Didier Stricker, Martin Teuchler and Kripasindhu
Sarkar
- Abstract summary: This project jointly performs object detection of a swap-body and identifies the type of swap-body by reading its ILU code.
Recent advances in deep learning have substantially improved the state of the art in computer vision.
- Score: 13.702911401489427
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object detection and identification is a challenging area of computer vision
and a fundamental requirement for autonomous cars. This project aims to jointly
perform object detection of a swap-body and to determine the type of swap-body by
reading its ILU code with an efficient optical character recognition (OCR)
method. Recent advances in deep learning have substantially improved the state of
the art in computer vision. Collecting enough images for training the model is a
critical step towards achieving good results. The training data were collected
from different locations with the maximum possible variation, and the collection
process is explained in detail. In addition, the data augmentation methods
applied during training proved effective in improving the performance of the
trained model. Training achieved good results, and the test results are also
provided. The final model was tested on both images and videos. Finally, the
paper draws attention to some of the major challenges faced during various
stages of the project and the solutions applied.
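The abstract credits data augmentation for much of the gain in model performance but does not list the specific transforms here. As a minimal sketch (not the paper's actual pipeline), common image augmentations such as a random horizontal flip and brightness jitter can be expressed directly with NumPy:

```python
import numpy as np

def augment(image, rng):
    """Apply two simple augmentations: random horizontal flip and
    brightness jitter. This is a generic illustration; the paper's
    actual augmentation methods are not specified in the abstract."""
    out = image.astype(np.float32)
    if rng.random() < 0.5:
        out = out[:, ::-1, :]            # flip left-right (H, W, C layout)
    factor = rng.uniform(0.8, 1.2)       # scale pixel intensities +/- 20%
    out = np.clip(out * factor, 0, 255)  # keep values in valid 8-bit range
    return out.astype(np.uint8)

rng = np.random.default_rng(0)
img = np.zeros((64, 64, 3), dtype=np.uint8)  # placeholder training image
aug = augment(img, rng)                      # same shape and dtype as input
```

In practice a detection pipeline would chain several such transforms and apply them on the fly each epoch, so the model never sees exactly the same image twice.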
Related papers
- Explorations in Self-Supervised Learning: Dataset Composition Testing for Object Classification [0.0]
We investigate the impact of sampling and pretraining using datasets with different image characteristics on the performance of self-supervised learning (SSL) models for object classification.
We find that depth pretrained models are more effective on low resolution images, while RGB pretrained models perform better on higher resolution images.
arXiv Detail & Related papers (2024-12-01T11:21:01Z)
- Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing, powered by generative models, pose serious risks.
In this paper, we investigate how detection performance varies across model backbones, types, and datasets.
We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z) - LEAP:D - A Novel Prompt-based Approach for Domain-Generalized Aerial Object Detection [2.1233286062376497]
We introduce an innovative vision-language approach using learnable prompts.
This shift from conventional manual prompts aims to reduce domain-specific knowledge interference.
We streamline the training process with a one-step approach, updating the learnable prompt concurrently with model training.
arXiv Detail & Related papers (2024-11-14T04:39:10Z) - A Simple Background Augmentation Method for Object Detection with Diffusion Model [53.32935683257045]
In computer vision, it is well-known that a lack of data diversity will impair model performance.
We propose a simple yet effective data augmentation approach by leveraging advancements in generative models.
Background augmentation, in particular, significantly improves the models' robustness and generalization capabilities.
arXiv Detail & Related papers (2024-08-01T07:40:00Z) - An Ensemble Model for Distorted Images in Real Scenarios [0.0]
In this paper, we apply the object detector YOLOv7 to detect distorted images from the CDCOCO dataset.
Through carefully designed optimizations, our model achieves excellent performance on the CDCOCO test set.
Our denoising detection model can denoise and repair distorted images, making the model useful in a variety of real-world scenarios and environments.
arXiv Detail & Related papers (2023-09-26T15:12:55Z) - A Dual-Cycled Cross-View Transformer Network for Unified Road Layout
Estimation and 3D Object Detection in the Bird's-Eye-View [4.251500966181852]
We propose a unified model for road layout estimation and 3D object detection inspired by the transformer architecture and the CycleGAN learning framework.
We set up extensive learning scenarios to study the effect of multi-class learning for road layout estimation in various situations.
Experimental results attest to the effectiveness of our model; we achieve state-of-the-art performance in both the road layout estimation and 3D object detection tasks.
arXiv Detail & Related papers (2022-09-19T08:43:38Z) - Advancing Plain Vision Transformer Towards Remote Sensing Foundation
Model [97.9548609175831]
We resort to plain vision transformers with about 100 million parameters and make the first attempt to propose large vision models customized for remote sensing tasks.
Specifically, to handle the large image size and objects of various orientations in RS images, we propose a new rotated varied-size window attention.
Experiments on detection tasks demonstrate the superiority of our model over all state-of-the-art models, achieving 81.16% mAP on the DOTA-V1.0 dataset.
arXiv Detail & Related papers (2022-08-08T09:08:40Z) - Few-Cost Salient Object Detection with Adversarial-Paced Learning [95.0220555274653]
This paper proposes to learn the effective salient object detection model based on the manual annotation on a few training images only.
We name this task as the few-cost salient object detection and propose an adversarial-paced learning (APL)-based framework to facilitate the few-cost learning scenario.
arXiv Detail & Related papers (2021-04-05T14:15:49Z) - Factors of Influence for Transfer Learning across Diverse Appearance
Domains and Task Types [50.1843146606122]
A simple form of transfer learning is common in current state-of-the-art computer vision models.
Previous systematic studies of transfer learning have been limited and the circumstances in which it is expected to work are not fully understood.
In this paper we carry out an extensive experimental exploration of transfer learning across vastly different image domains.
arXiv Detail & Related papers (2021-03-24T16:24:20Z) - Auto-Rectify Network for Unsupervised Indoor Depth Estimation [119.82412041164372]
We establish that the complex ego-motions exhibited in handheld settings are a critical obstacle for learning depth.
We propose a data pre-processing method that rectifies training images by removing their relative rotations for effective learning.
Our results outperform the previous unsupervised SOTA method by a large margin on the challenging NYUv2 dataset.
arXiv Detail & Related papers (2020-06-04T08:59:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.