Related papers: FindVehicle and VehicleFinder: A NER dataset for natural language-based vehicle retrieval and a keyword-based cross-modal vehicle retrieval system

FindVehicle and VehicleFinder: A NER dataset for natural language-based vehicle retrieval and a keyword-based cross-modal vehicle retrieval system

URL: http://arxiv.org/abs/2304.10893v1
Date: Fri, 21 Apr 2023 11:20:23 GMT
Title: FindVehicle and VehicleFinder: A NER dataset for natural language-based vehicle retrieval and a keyword-based cross-modal vehicle retrieval system
Authors: Runwei Guan, Ka Lok Man, Feifan Chen, Shanliang Yao, Rongsheng Hu, Xiaohui Zhu, Jeremy Smith, Eng Gee Lim and Yutao Yue
Abstract summary: Natural language (NL) based vehicle retrieval is a task aiming to retrieve a vehicle that is most consistent with a given NL query from among all candidate vehicles. To tackle these problems and simplify, we borrow the idea from named entity recognition (NER) and construct FindVehicle, a NER dataset in the traffic domain. VehicleFinder achieves 87.7% precision and 89.4% recall when retrieving a target vehicle by text command on our homemade dataset.
Score: 7.078561467480664
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Natural language (NL) based vehicle retrieval is a task aiming to retrieve a vehicle that is most consistent with a given NL query from among all candidate vehicles. Because NL query can be easily obtained, such a task has a promising prospect in building an interactive intelligent traffic system (ITS). Current solutions mainly focus on extracting both text and image features and mapping them to the same latent space to compare the similarity. However, existing methods usually use dependency analysis or semantic role-labelling techniques to find keywords related to vehicle attributes. These techniques may require a lot of pre-processing and post-processing work, and also suffer from extracting the wrong keyword when the NL query is complex. To tackle these problems and simplify, we borrow the idea from named entity recognition (NER) and construct FindVehicle, a NER dataset in the traffic domain. It has 42.3k labelled NL descriptions of vehicle tracks, containing information such as the location, orientation, type and colour of the vehicle. FindVehicle also adopts both overlapping entities and fine-grained entities to meet further requirements. To verify its effectiveness, we propose a baseline NL-based vehicle retrieval model called VehicleFinder. Our experiment shows that by using text encoders pre-trained by FindVehicle, VehicleFinder achieves 87.7\% precision and 89.4\% recall when retrieving a target vehicle by text command on our homemade dataset based on UA-DETRAC. The time cost of VehicleFinder is 279.35 ms on one ARM v8.2 CPU and 93.72 ms on one RTX A4000 GPU, which is much faster than the Transformer-based system. The dataset is open-source via the link https://github.com/GuanRunwei/FindVehicle, and the implementation can be found via the link https://github.com/GuanRunwei/VehicleFinder-CTIM.

Related papers

SalienDet: A Saliency-based Feature Enhancement Algorithm for Object Detection for Autonomous Driving [160.57870373052577]
We propose a saliency-based OD algorithm (SalienDet) to detect unknown objects. Our SalienDet utilizes a saliency-based algorithm to enhance image features for object proposal generation. We design a dataset relabeling approach to differentiate the unknown objects from all objects in training sample set to achieve Open-World Detection.
arXiv Detail & Related papers (2023-05-11T16:19:44Z)
AutoTransfer: AutoML with Knowledge Transfer -- An Application to Graph Neural Networks [75.11008617118908]
AutoML techniques consider each task independently from scratch, leading to high computational cost. Here we propose AutoTransfer, an AutoML solution that improves search efficiency by transferring the prior architectural design knowledge to the novel task of interest.
arXiv Detail & Related papers (2023-03-14T07:23:16Z)
HPointLoc: Point-based Indoor Place Recognition using Synthetic RGB-D Images [58.720142291102135]
We present a novel dataset named as HPointLoc, specially designed for exploring capabilities of visual place recognition in indoor environment. The dataset is based on the popular Habitat simulator, in which it is possible to generate indoor scenes using both own sensor data and open datasets.
arXiv Detail & Related papers (2022-12-30T12:20:56Z)
Symmetric Network with Spatial Relationship Modeling for Natural Language-based Vehicle Retrieval [3.610372087454382]
Natural language (NL) based vehicle retrieval aims to search specific vehicle given text description. We propose a Symmetric Network with Spatial Relationship Modeling (SSM) method for NL-based vehicle retrieval. We achieve 43.92% MRR accuracy on the test set of the 6th AI City Challenge on natural language-based vehicle retrieval track.
arXiv Detail & Related papers (2022-06-22T07:02:04Z)
Connecting Language and Vision for Natural Language-Based Vehicle Retrieval [77.88818029640977]
In this paper, we apply one new modality, i.e., the language description, to search the vehicle of interest. To connect language and vision, we propose to jointly train the state-of-the-art vision models with the transformer-based language model. Our proposed method has achieved the 1st place on the 5th AI City Challenge, yielding competitive performance 18.69% MRR accuracy.
arXiv Detail & Related papers (2021-05-31T11:42:03Z)
SBNet: Segmentation-based Network for Natural Language-based Vehicle Search [8.286899656309476]
Natural language-based vehicle retrieval is a task to find a target vehicle within a given image based on a natural language description as a query. This technology can be applied to various areas including police searching for a suspect vehicle. We propose a deep neural network called SBNet that performs natural language-based segmentation for vehicle retrieval.
arXiv Detail & Related papers (2021-04-22T08:06:17Z)
Topo-boundary: A Benchmark Dataset on Topological Road-boundary Detection Using Aerial Images for Autonomous Driving [11.576868193291997]
We propose a new benchmark dataset, named textitTopo-boundary, for off-line topological road-boundary detection. The dataset contains 21,556 $1000times1000$-sized 4-channel aerial images. We implement and evaluate 3 segmentation-based baselines and 5 graph-based baselines using the dataset.
arXiv Detail & Related papers (2021-03-31T14:42:00Z)
Object Detection and Tracking Algorithms for Vehicle Counting: A Comparative Analysis [3.093890460224435]
Authors deploy several state of the art object detection and tracking algorithms to detect and track different classes of vehicles. Model combinations are validated and compared against the manually counted ground truths of over 9 hours' traffic video data. Results demonstrate that the combination of CenterNet and Deep SORT, Detectron2 and Deep SORT, and YOLOv4 and Deep SORT produced the best overall counting percentage for all vehicles.
arXiv Detail & Related papers (2020-07-31T17:49:27Z)
Vehicle Detection of Multi-source Remote Sensing Data Using Active Fine-tuning Network [26.08837467340853]
The proposed Ms-AFt framework integrates transfer learning, segmentation, and active classification into a unified framework for auto-labeling and detection. The proposed Ms-AFt employs a fine-tuning network to firstly generate a vehicle training set from an unlabeled dataset. Extensive experimental results conducted on two open ISPRS benchmark datasets, demonstrate the superiority and effectiveness of the proposed Ms-AFt for vehicle detection.
arXiv Detail & Related papers (2020-07-16T17:46:46Z)
VehicleNet: Learning Robust Visual Representation for Vehicle Re-identification [116.1587709521173]
We propose to build a large-scale vehicle dataset (called VehicleNet) by harnessing four public vehicle datasets. We design a simple yet effective two-stage progressive approach to learning more robust visual representation from VehicleNet. We achieve the state-of-art accuracy of 86.07% mAP on the private test set of AICity Challenge.
arXiv Detail & Related papers (2020-04-14T05:06:38Z)
PyODDS: An End-to-end Outlier Detection System with Automated Machine Learning [55.32009000204512]
We present PyODDS, an automated end-to-end Python system for Outlier Detection with Database Support. Specifically, we define the search space in the outlier detection pipeline, and produce a search strategy within the given search space. It also provides unified interfaces and visualizations for users with or without data science or machine learning background.
arXiv Detail & Related papers (2020-03-12T03:30:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.