RF-Annotate: Automatic RF-Supervised Image Annotation of Common Objects
in Context
- URL: http://arxiv.org/abs/2211.08837v1
- Date: Wed, 16 Nov 2022 11:25:38 GMT
- Title: RF-Annotate: Automatic RF-Supervised Image Annotation of Common Objects
in Context
- Authors: Emerson Sie, Deepak Vasisht
- Abstract summary: Wireless tags are increasingly used to track and identify common items of interest such as retail goods, food, medicine, clothing, books, documents, keys, equipment, and more.
We present RF-Annotate, a pipeline for autonomous pixel-wise image annotation which enables robots to collect labelled visual data of objects of interest as they encounter them within their environment.
- Score: 0.25019493958767397
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Wireless tags are increasingly used to track and identify common items of
interest such as retail goods, food, medicine, clothing, books, documents,
keys, equipment, and more. At the same time, there is a need for labelled
visual data featuring such items for the purpose of training object detection
and recognition models for robots operating in homes, warehouses, stores,
libraries, pharmacies, and so on. In this paper, we ask: can we leverage the
tracking and identification capabilities of such tags as a basis for a
large-scale automatic image annotation system for robotic perception tasks? We
present RF-Annotate, a pipeline for autonomous pixel-wise image annotation
which enables robots to collect labelled visual data of objects of interest as
they encounter them within their environment. Our pipeline uses unmodified
commodity RFID readers and RGB-D cameras, and exploits arbitrary small-scale
motions afforded by mobile robotic platforms to spatially map RFIDs to
corresponding objects in the scene. Our only assumption is that the objects of
interest within the environment are pre-tagged with inexpensive battery-free
RFIDs costing 3-15 cents each. We demonstrate the efficacy of our pipeline on
several RGB-D sequences of tabletop scenes featuring common objects in a
variety of indoor environments.
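As a rough sketch of the final association step, the snippet below assigns each tag's ID to the nearest segmented object cluster in the camera frame. It assumes the tag's 3D position has already been estimated from RFID phase measurements gathered across the robot's motion; the function name and distance threshold are illustrative, not details from the paper.

```python
import numpy as np

# Minimal sketch of the association step: given an estimated 3D tag
# position (assumed here to be already recovered from RFID phase
# measurements across the robot's small-scale motion) and segmented
# object point clusters from an RGB-D frame, label the nearest cluster
# with the tag's ID. Names and the threshold are illustrative.

def associate_tags_to_clusters(tag_positions, clusters, max_dist=0.15):
    """tag_positions: dict {tag_id: (3,) xyz in the camera frame}.
    clusters: list of (N_i, 3) point arrays, one per segmented object.
    Returns {cluster_index: tag_id} for matches within max_dist metres."""
    labels = {}
    centroids = [pts.mean(axis=0) for pts in clusters]
    for tag_id, tag_xyz in tag_positions.items():
        dists = [np.linalg.norm(c - tag_xyz) for c in centroids]
        best = int(np.argmin(dists))
        if dists[best] < max_dist:
            labels[best] = tag_id  # this cluster's pixels inherit the tag ID
    return labels

# Toy usage: two tabletop clusters, one tag near the first cluster.
clusters = [np.random.randn(100, 3) * 0.02 + [0.3, 0.0, 0.8],
            np.random.randn(100, 3) * 0.02 + [-0.2, 0.1, 0.9]]
tags = {"epc_a1b2": np.array([0.31, 0.01, 0.79])}
print(associate_tags_to_clusters(tags, clusters))  # -> {0: 'epc_a1b2'}
```

In the full pipeline, the labelled cluster's pixels would then be written out as a pixel-wise annotation mask for the RGB image.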
Related papers
- Learning Object Properties Using Robot Proprioception via Differentiable Robot-Object Interaction [52.12746368727368]
Differentiable simulation has become a powerful tool for system identification.
Our approach calibrates object properties by using information from the robot, without relying on data from the object itself.
We demonstrate the effectiveness of our method on a low-cost robotic platform.
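As a toy illustration of this idea, the sketch below recovers an unknown object mass by gradient descent so that a simple differentiable torque model matches torques recorded by the robot's joints; the scalar gravity model and all values are stand-ins for a full differentiable simulator.

```python
import numpy as np

# Toy system identification from proprioception alone: fit an unknown
# object mass so predicted gravity torques match "measured" ones.
# The real paper uses a full differentiable simulator; this scalar
# model is only a stand-in.

g, r = 9.81, 0.4             # gravity, lever arm of the held object (m)
true_mass = 1.3
angles = np.linspace(0.1, 1.2, 50)                # recorded joint angles (rad)
tau_meas = true_mass * g * r * np.cos(angles)     # "measured" gravity torque
tau_meas += np.random.normal(0, 0.05, angles.shape)  # sensor noise

m = 0.5                      # initial mass guess
lr = 1e-3
for step in range(200):
    tau_pred = m * g * r * np.cos(angles)
    residual = tau_pred - tau_meas
    grad = np.mean(2 * residual * g * r * np.cos(angles))  # dL/dm
    m -= lr * grad
print(f"estimated mass: {m:.3f} kg (true: {true_mass} kg)")
```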
arXiv Detail & Related papers (2024-10-04T20:48:38Z)
- Multimodal Anomaly Detection based on Deep Auto-Encoder for Object Slip Perception of Mobile Manipulation Robots [22.63980025871784]
The proposed framework integrates heterogeneous data streams collected from various robot sensors, including RGB and depth cameras, a microphone, and a force-torque sensor.
The integrated data is used to train a deep autoencoder to construct latent representations of the multisensory data that indicate the normal status.
Anomalies are then identified by error scores measuring the difference between the trained encoder's latent values for an input and the latent values of its reconstruction.
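A minimal sketch of this latent-space scoring scheme, assuming the fused sensor streams arrive as fixed-size feature vectors; layer sizes and the threshold are illustrative, and in practice the encoder and decoder would first be trained on normal-status data.

```python
import torch
import torch.nn as nn

# Sketch of the described anomaly score: embed the input, reconstruct
# it, re-embed the reconstruction, and score by the latent distance.
# The paper fuses multimodal streams (RGB-D, audio, force-torque)
# into features first; dimensions here are placeholders.

enc = nn.Sequential(nn.Linear(64, 16), nn.ReLU(), nn.Linear(16, 8))
dec = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 64))
# (enc/dec would be trained on normal-status data before scoring)

def anomaly_score(x):
    z = enc(x)                 # latent of the fused sensor features
    x_hat = dec(z)             # reconstruction
    z_hat = enc(x_hat)         # latent of the reconstruction
    return torch.norm(z - z_hat, dim=-1)  # large for abnormal inputs

x = torch.randn(4, 64)          # a batch of fused sensor feature vectors
flags = anomaly_score(x) > 2.0  # threshold tuned on normal-only data
print(flags)
```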
arXiv Detail & Related papers (2024-03-06T09:15:53Z)
- Follow Anything: Open-set detection, tracking, and following in real-time [89.83421771766682]
We present a robotic system to detect, track, and follow any object in real-time.
Our approach, dubbed "follow anything" (FAn), is an open-vocabulary and multimodal model.
FAn can be deployed on a laptop with a lightweight (6-8 GB) graphics card, achieving a throughput of 6-20 frames per second.
arXiv Detail & Related papers (2023-08-10T17:57:06Z)
- NeRF-Supervision: Learning Dense Object Descriptors from Neural Radiance Fields [54.27264716713327]
We show that a Neural Radiance Fields (NeRF) representation of a scene can be used to train dense object descriptors.
We use an optimized NeRF to extract dense correspondences between multiple views of an object, and then use these correspondences as training data for learning a view-invariant representation of the object.
Dense correspondence models supervised with our method significantly outperform off-the-shelf learned descriptors by 106%.
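The sketch below illustrates the underlying geometry with assumed intrinsics and relative pose: given per-pixel depth in one view (e.g., rendered from an optimized NeRF), a pixel can be unprojected to 3D and reprojected into a second view to obtain its dense correspondence.

```python
import numpy as np

# Depth-based correspondence: unproject a pixel from view A to 3D,
# then project into view B. Intrinsics K and the relative pose (R, t)
# are assumed known; the values here are made up.

K = np.array([[500., 0., 320.],
              [0., 500., 240.],
              [0., 0., 1.]])
R = np.eye(3)                    # rotation from view A to view B
t = np.array([0.05, 0., 0.])     # small baseline (m)

def correspond(u, v, depth_a):
    """Map pixel (u, v) with depth depth_a in view A to a pixel in view B."""
    p_a = depth_a * np.linalg.inv(K) @ np.array([u, v, 1.0])  # 3D in A
    p_b = R @ p_a + t                                         # 3D in B
    uvw = K @ p_b
    return uvw[:2] / uvw[2]                                   # pixel in B

print(correspond(320, 240, depth_a=1.5))   # matching pixel in view B
```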
arXiv Detail & Related papers (2022-03-03T18:49:57Z)
- CNN-based Omnidirectional Object Detection for HermesBot Autonomous Delivery Robot with Preliminary Frame Classification [53.56290185900837]
We propose an algorithm for optimizing a neural network for object detection using preliminary binary frame classification.
An autonomous mobile robot with 6 rolling-shutter cameras on the perimeter providing a 360-degree field of view was used as the experimental setup.
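A minimal sketch of the gating idea, with trivial stand-ins for both the binary classifier and the detector:

```python
import numpy as np

# Two-stage idea: a cheap binary classifier decides per frame whether
# anything of interest is present, and the expensive detector runs
# only on positive frames. Both "models" below are trivial stand-ins,
# not the paper's networks.

def contains_object(frame: np.ndarray) -> bool:
    # Stand-in for a lightweight binary CNN: here, a variance heuristic.
    return frame.var() > 10.0

def detect_objects(frame: np.ndarray) -> list:
    # Stand-in for the full detector; returns dummy boxes.
    return [("object", (0, 0, 10, 10))]

def process_ring(frames):
    """Process one 360-degree sweep (e.g., six frames, one per camera)."""
    detections = []
    for frame in frames:
        if contains_object(frame):                    # cheap gate
            detections.extend(detect_objects(frame))  # expensive stage
    return detections

frames = [np.random.randint(0, 255, (240, 320), np.uint8) for _ in range(6)]
print(len(process_ring(frames)))
```

The saving comes from the gate being much cheaper than the detector, so frames with nothing of interest cost almost nothing.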
arXiv Detail & Related papers (2021-10-22T15:05:37Z)
- Domain and Modality Gaps for LiDAR-based Person Detection on Mobile Robots [91.01747068273666]
This paper studies existing LiDAR-based person detectors with a particular focus on mobile robot scenarios.
Experiments revolve around the domain gap between driving and mobile robot scenarios, as well as the modality gap between 3D and 2D LiDAR sensors.
Results provide practical insights into LiDAR-based person detection and facilitate informed decisions for relevant mobile robot designs and applications.
arXiv Detail & Related papers (2021-06-21T16:35:49Z)
- Robotic Grasping of Fully-Occluded Objects using RF Perception [18.339320861642722]
RF-Grasp is a robotic system that can grasp fully-occluded objects in unstructured environments.
RF-Grasp relies on an eye-in-hand camera and batteryless RFID tags attached to objects of interest.
arXiv Detail & Related papers (2020-12-31T04:01:45Z)
- Monitoring Browsing Behavior of Customers in Retail Stores via RFID Imaging [24.007822566345943]
We propose TagSee, a multi-person imaging system based on monostatic RFID imaging.
We implement TagSee using an Impinj Speedway R420 reader and SMARTRAC DogBone RFID tags.
TagSee can achieve a true positive rate (TPR) of more than 90% and a false positive rate (FPR) of less than 10% in multi-person scenarios using training data from just 3-4 users.
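For reference, TPR and FPR can be computed from binary ground truth and predictions as below; the arrays are made up.

```python
import numpy as np

# TPR and FPR as used above, computed on illustrative binary labels.
y_true = np.array([1, 1, 1, 0, 0, 1, 0, 1, 0, 0])
y_pred = np.array([1, 1, 0, 0, 1, 1, 0, 1, 0, 0])

tp = np.sum((y_pred == 1) & (y_true == 1))
fp = np.sum((y_pred == 1) & (y_true == 0))
tpr = tp / np.sum(y_true == 1)   # true positive rate (recall)
fpr = fp / np.sum(y_true == 0)   # false positive rate
print(f"TPR={tpr:.2f}, FPR={fpr:.2f}")
```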
arXiv Detail & Related papers (2020-07-07T16:36:24Z)
- Learning Camera Miscalibration Detection [83.38916296044394]
This paper focuses on a data-driven approach to learn the detection of miscalibration in vision sensors, specifically RGB cameras.
Our contributions include a proposed miscalibration metric for RGB cameras and a novel semi-synthetic dataset generation pipeline based on this metric.
By training a deep convolutional neural network, we demonstrate the effectiveness of our pipeline in identifying whether a recalibration of the camera's intrinsic parameters is required.
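One plausible reading of such a metric, sketched below on synthetic values, is the mean pixel displacement that an intrinsics perturbation induces on reprojected points; the paper defines its own metric, so this is only an assumed illustration.

```python
import numpy as np

# Assumed illustration of a miscalibration metric: the average pixel
# displacement caused by replacing the true intrinsics K with perturbed
# intrinsics K_bad when projecting the same 3D points.

def project(K, pts):
    uvw = (K @ pts.T).T
    return uvw[:, :2] / uvw[:, 2:3]

K = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
K_bad = K.copy()
K_bad[0, 0] *= 1.02          # 2% focal length error

pts = np.random.uniform([-1, -1, 2], [1, 1, 5], (1000, 3))  # 3D points
offsets = np.linalg.norm(project(K, pts) - project(K_bad, pts), axis=1)
print(f"mean reprojection offset: {offsets.mean():.2f} px")
```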
arXiv Detail & Related papers (2020-05-24T10:32:49Z)
- Event-based Robotic Grasping Detection with Neuromorphic Vision Sensor and Event-Stream Dataset [8.030163836902299]
Compared to traditional frame-based computer vision, neuromorphic vision is a small and young research community.
We construct a robotic grasping dataset, named the Event-Stream dataset, with 91 objects.
As the LEDs blink at high frequency, the Event-Stream dataset is annotated at a high frequency of 1 kHz.
We develop a deep neural network for grasping detection which considers the angle learning problem as classification instead of regression.
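A minimal sketch of the classification formulation, assuming an 18-bin discretization of the grasp angle (a choice of ours, not necessarily the paper's):

```python
import torch
import torch.nn as nn

# Angle-as-classification: discretize the grasp angle into bins and
# train with cross-entropy instead of regressing a continuous value.
# The 18-bin choice (10-degree resolution) is an assumption.

N_BINS = 18                                 # bins over [0, 180) degrees

def angle_to_class(angle_deg: float) -> int:
    return int(angle_deg % 180.0 // (180.0 / N_BINS))

head = nn.Linear(128, N_BINS)               # classification head on features
features = torch.randn(4, 128)              # features from a backbone
logits = head(features)
target = torch.tensor([angle_to_class(a) for a in (5.0, 93.0, 170.0, 45.0)])
loss = nn.functional.cross_entropy(logits, target)
print(loss.item())
```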
arXiv Detail & Related papers (2020-04-28T16:55:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.