Bent & Broken Bicycles: Leveraging synthetic data for damaged object
re-identification
- URL: http://arxiv.org/abs/2304.07883v1
- Date: Sun, 16 Apr 2023 20:23:58 GMT
- Title: Bent & Broken Bicycles: Leveraging synthetic data for damaged object
re-identification
- Authors: Luca Piano, Filippo Gabriele Pratticò, Alessandro Sebastian Russo,
Lorenzo Lanari, Lia Morra, Fabrizio Lamberti
- Abstract summary: We propose a novel task of damaged object re-identification, which aims at distinguishing changes in visual appearance due to deformations or missing parts from subtle intra-class variations.
We leverage the power of computer-generated imagery to create, in a semi-automatic fashion, high-quality synthetic images of the same bike before and after damage occurs.
As a baseline for this task, we propose TransReI3D, a multi-task, transformer-based deep network unifying damage detection with object re-identification.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Instance-level object re-identification is a fundamental computer vision
task, with applications from image retrieval to intelligent monitoring and
fraud detection. In this work, we propose the novel task of damaged object
re-identification, which aims at distinguishing changes in visual appearance
due to deformations or missing parts from subtle intra-class variations. To
explore this task, we leverage the power of computer-generated imagery to
create, in a semi-automatic fashion, high-quality synthetic images of the same
bike before and after damage occurs. The resulting dataset, Bent & Broken
Bicycles (BBBicycles), contains 39,200 images and 2,800 unique bike instances
spanning 20 different bike models. As a baseline for this task, we propose
TransReI3D, a multi-task, transformer-based deep network unifying damage
detection (framed as a multi-label classification task) with object
re-identification. The BBBicycles dataset is available at
https://huggingface.co/datasets/GrainsPolito/BBBicycles
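The multi-task design described above — one shared backbone feature feeding both a re-identification embedding and independent sigmoid outputs for damage labels — can be sketched as follows. This is a minimal illustrative sketch, not the authors' TransReI3D implementation: the feature dimension, embedding size, number of damage categories, and the `forward` helper are all assumptions chosen for clarity.

```python
import numpy as np

# Hypothetical sketch of a multi-task head in the spirit of TransReI3D:
# a pooled backbone feature feeds (1) an embedding head for instance
# re-identification and (2) a sigmoid multi-label head for damage
# detection. All sizes below are illustrative assumptions.
rng = np.random.default_rng(0)

FEAT_DIM = 768        # e.g. a ViT-Base token dimension (assumed)
EMB_DIM = 256         # re-ID embedding size (assumed)
N_DAMAGE_TYPES = 5    # number of damage categories (assumed)

W_emb = rng.normal(0.0, 0.02, (FEAT_DIM, EMB_DIM))          # re-ID head
W_dmg = rng.normal(0.0, 0.02, (FEAT_DIM, N_DAMAGE_TYPES))   # damage head

def forward(feat):
    """feat: (FEAT_DIM,) pooled backbone feature for one image."""
    emb = feat @ W_emb
    emb = emb / np.linalg.norm(emb)  # L2-normalize for retrieval by cosine
    dmg_logits = feat @ W_dmg
    # Independent sigmoids: damage detection framed as multi-label
    # classification, so several damage types can co-occur on one bike.
    dmg_probs = 1.0 / (1.0 + np.exp(-dmg_logits))
    return emb, dmg_probs

feat = rng.normal(size=FEAT_DIM)
emb, dmg = forward(feat)
```

At inference, the normalized embedding would be compared against a gallery by cosine similarity, while the per-label probabilities flag which damage types are present; training would typically combine a metric-learning loss on `emb` with a binary cross-entropy loss on `dmg`.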
Related papers
- Neuromorphic Synergy for Video Binarization [54.195375576583864]
Bimodal objects serve as a visual form to embed information that can be easily recognized by vision systems.
Neuromorphic cameras offer new capabilities for alleviating motion blur, but it is non-trivial to first de-blur and then binarize the images in real time.
We propose an event-based binary reconstruction method that leverages the prior knowledge of the bimodal target's properties to perform inference independently in both event space and image space.
We also develop an efficient integration method to propagate this binary image to high frame rate binary video.
arXiv Detail & Related papers (2024-02-20T01:43:51Z) - CrashCar101: Procedural Generation for Damage Assessment [6.172653479848284]
We propose a procedural generation pipeline that damages 3D car models.
We obtain synthetic 2D images of damaged cars paired with pixel-accurate annotations for part and damage categories.
For part segmentation, we show that the segmentation models trained on a combination of real data and our synthetic data outperform all models trained only on real data.
arXiv Detail & Related papers (2023-11-11T11:12:28Z) - Towards Viewpoint Robustness in Bird's Eye View Segmentation [85.99907496019972]
We study how AV perception models are affected by changes in camera viewpoint.
Small changes to pitch, yaw, depth, or height of the camera at inference time lead to large drops in performance.
We introduce a technique for novel view synthesis and use it to transform collected data to the viewpoint of target rigs.
arXiv Detail & Related papers (2023-09-11T02:10:07Z) - OSRE: Object-to-Spot Rotation Estimation for Bike Parking Assessment [10.489021696058632]
This paper builds a camera-agnostic, well-annotated synthetic bike rotation dataset.
We then propose an object-to-spot rotation estimator (OSRE) by extending the object detection task to also regress the bike's rotation about two axes.
The proposed OSRE is evaluated on synthetic and real-world data providing promising results.
arXiv Detail & Related papers (2023-03-01T18:34:10Z) - A Fine-Grained Vehicle Detection (FGVD) Dataset for Unconstrained Roads [29.09167268252761]
We introduce the first Fine-Grained Vehicle Detection dataset in the wild, captured from a moving camera mounted on a car.
It contains 5502 scene images with 210 unique fine-grained labels of multiple vehicle types organized in a three-level hierarchy.
While previous classification datasets also include makes for different kinds of cars, the FGVD dataset introduces new class labels for categorizing two-wheelers, autorickshaws, and trucks.
arXiv Detail & Related papers (2022-12-30T06:50:15Z) - Learning Co-segmentation by Segment Swapping for Retrieval and Discovery [67.6609943904996]
The goal of this work is to efficiently identify visually similar patterns from a pair of images.
We generate synthetic training pairs by selecting object segments in an image and copy-pasting them into another image.
We show our approach provides clear improvements for artwork details retrieval on the Brueghel dataset.
arXiv Detail & Related papers (2021-10-29T16:51:16Z) - Unsupervised Learning of 3D Object Categories from Videos in the Wild [75.09720013151247]
We focus on learning a model from multiple views of a large collection of object instances.
We propose a new neural network design, called warp-conditioned ray embedding (WCR), which significantly improves reconstruction.
Our evaluation demonstrates performance improvements over several deep monocular reconstruction baselines on existing benchmarks.
arXiv Detail & Related papers (2021-03-30T17:57:01Z) - CrossTransformers: spatially-aware few-shot transfer [92.33252608837947]
Given new tasks with very little data, modern vision systems degrade remarkably quickly.
We show how the neural network representations which underpin modern vision systems are subject to supervision collapse.
We propose self-supervised learning to encourage general-purpose features that transfer better.
arXiv Detail & Related papers (2020-07-22T15:37:08Z) - Towards Accurate Vehicle Behaviour Classification With Multi-Relational
Graph Convolutional Networks [22.022759283770377]
We propose a pipeline for understanding vehicle behaviour from a monocular image sequence or video.
A temporal sequence of such encodings is fed to a recurrent network to label vehicle behaviours.
The proposed framework can classify a variety of vehicle behaviours with high fidelity on diverse datasets.
arXiv Detail & Related papers (2020-02-03T14:34:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.