DeepDarts: Modeling Keypoints as Objects for Automatic Scorekeeping in
Darts using a Single Camera
- URL: http://arxiv.org/abs/2105.09880v1
- Date: Thu, 20 May 2021 16:25:57 GMT
- Title: DeepDarts: Modeling Keypoints as Objects for Automatic Scorekeeping in
Darts using a Single Camera
- Authors: William McNally, Pascale Walters, Kanav Vats, Alexander Wong, John
McPhee
- Abstract summary: Existing multi-camera solutions for automatic scorekeeping in steel-tip darts are very expensive and thus inaccessible to most players.
We present a new approach to keypoint detection and apply it to predict dart scores from a single image taken from any camera angle.
We develop a deep convolutional neural network around this idea and use it to predict dart locations and dartboard calibration points.
- Score: 75.34178733070547
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing multi-camera solutions for automatic scorekeeping in steel-tip darts
are very expensive and thus inaccessible to most players. Motivated to develop
a more accessible low-cost solution, we present a new approach to keypoint
detection and apply it to predict dart scores from a single image taken from
any camera angle. This problem involves detecting multiple keypoints that may
be of the same class and positioned in close proximity to one another. The
widely adopted framework for regressing keypoints using heatmaps is not
well-suited for this task. To address this issue, we instead propose to model
keypoints as objects. We develop a deep convolutional neural network around
this idea and use it to predict dart locations and dartboard calibration points
within an overall pipeline for automatic dart scoring, which we call DeepDarts.
Additionally, we propose several task-specific data augmentation strategies to
improve the generalization of our method. As a proof of concept, two datasets
comprising 16k images originating from two different dartboard setups were
manually collected and annotated to evaluate the system. In the primary dataset
containing 15k images captured from a face-on view of the dartboard using a
smartphone, DeepDarts predicted the total score correctly in 94.7% of the test
images. In a second more challenging dataset containing limited training data
(830 images) and various camera angles, we utilize transfer learning and
extensive data augmentation to achieve a test accuracy of 84.0%. Because
DeepDarts relies only on single images, it has the potential to be deployed on
edge devices, giving anyone with a smartphone access to an automatic dart
scoring system for steel-tip darts. The code and datasets are available.
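To make the scoring stage concrete, the sketch below shows one way predicted keypoints can be turned into a score: the four predicted calibration points define a homography onto a canonical board plane, and each dart's polar coordinates on that plane determine its score. The board radii follow the standard steel-tip specification; the function names, coordinate conventions, and overall structure are illustrative assumptions, not the authors' released implementation.

```python
import numpy as np
import cv2  # OpenCV, used here for the homography estimation

# Standard steel-tip dartboard radii in millimetres, measured from the centre.
R_INNER_BULL = 6.35                       # double bull (50)
R_OUTER_BULL = 15.9                       # single bull (25)
R_TRIPLE_IN, R_TRIPLE_OUT = 99.0, 107.0   # triple ring
R_DOUBLE_IN, R_DOUBLE_OUT = 162.0, 170.0  # double ring (edge of the scoring area)

# Sector numbers clockwise, starting with 20 at 12 o'clock.
SECTORS = [20, 1, 18, 4, 13, 6, 10, 15, 2, 17, 3, 19, 7, 16, 8, 11, 14, 9, 12, 5]

def score_dart(x, y):
    """Score one dart from board-plane coordinates in mm (origin at the bull,
    y pointing down as in image coordinates)."""
    r = np.hypot(x, y)
    if r <= R_INNER_BULL:
        return 50
    if r <= R_OUTER_BULL:
        return 25
    if r > R_DOUBLE_OUT:
        return 0  # landed outside the scoring area
    # Sectors span 18 degrees each, offset by 9 degrees so 20 straddles the top.
    angle = np.degrees(np.arctan2(x, -y))  # 0 deg at 12 o'clock, clockwise positive
    sector = SECTORS[int(((angle + 9.0) % 360.0) // 18.0)]
    if R_TRIPLE_IN <= r <= R_TRIPLE_OUT:
        return 3 * sector
    if R_DOUBLE_IN <= r <= R_DOUBLE_OUT:
        return 2 * sector
    return sector

def total_score(cal_pts_img, cal_pts_board, dart_pts_img):
    """cal_pts_img: the 4 predicted calibration points in pixels;
    cal_pts_board: their known canonical board positions in mm;
    dart_pts_img: predicted dart tip locations in pixels."""
    H = cv2.getPerspectiveTransform(np.float32(cal_pts_img),
                                    np.float32(cal_pts_board))
    darts = cv2.perspectiveTransform(
        np.float32(dart_pts_img).reshape(-1, 1, 2), H).reshape(-1, 2)
    return sum(score_dart(x, y) for x, y in darts)
```

With the calibration points detected at known board locations (for instance, on the outer double ring), cal_pts_board holds those same locations in the canonical millimetre frame, so any camera angle reduces to the same scoring geometry.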
Related papers
- FisheyeDetNet: 360° Surround view Fisheye Camera based Object Detection System for Autonomous Driving [4.972459365804512]
Object detection is a mature problem in autonomous driving with pedestrian detection being one of the first deployed algorithms.
Standard bounding box representation fails in fisheye cameras due to heavy radial distortion, particularly in the periphery.
We design rotated bounding box, ellipse, and generic polygon (polar arc/angle) representations, and define an instance segmentation mIoU metric to analyze them; see the metric sketch below.
The proposed FisheyeDetNet with the polygon representation outperforms the others, achieving a mAP of 49.5% on the Valeo fisheye surround-view dataset for automated driving applications.
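One plausible way to realize an instance segmentation mIoU for such representations is to rasterise predicted and ground-truth polygons into masks and compare them. The sketch below makes that assumption; the helper names are hypothetical and this is not FisheyeDetNet's code.

```python
import numpy as np
import cv2

def polygon_iou(poly_a, poly_b, hw):
    """IoU of two polygons, each an (N, 2) array of pixel vertices,
    rasterised onto a canvas of shape hw = (height, width)."""
    mask_a = np.zeros(hw, dtype=np.uint8)
    mask_b = np.zeros(hw, dtype=np.uint8)
    cv2.fillPoly(mask_a, [np.int32(poly_a)], 1)
    cv2.fillPoly(mask_b, [np.int32(poly_b)], 1)
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return float(inter) / float(union) if union else 0.0

def mean_iou(pred_polys, gt_polys, hw):
    """Mean IoU over already-matched prediction/ground-truth pairs."""
    return float(np.mean([polygon_iou(p, g, hw)
                          for p, g in zip(pred_polys, gt_polys)]))
```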
arXiv Detail & Related papers (2024-04-20T18:50:57Z)
- X-Pose: Detecting Any Keypoints [28.274913140048003]
X-Pose is a novel framework for multi-object keypoint detection in images.
UniKPT is a large-scale dataset that unifies existing keypoint detection datasets.
X-Pose achieves notable improvements over state-of-the-art non-promptable, visual prompt-based, and textual prompt-based methods.
arXiv Detail & Related papers (2023-10-12T17:22:58Z)
- OmniPD: One-Step Person Detection in Top-View Omnidirectional Indoor Scenes [4.297070083645049]
We propose a one-step person detector for top-view omnidirectional indoor scenes based on convolutional neural networks (CNNs).
The method predicts bounding boxes of multiple persons directly in omnidirectional images without perspective transformation.
Our method is applicable to other CNN-based object detectors and can potentially generalize for detecting other objects in omnidirectional images.
arXiv Detail & Related papers (2022-04-14T09:41:53Z)
- SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation [101.55622133406446]
We propose SurroundDepth, a method that incorporates information from multiple surrounding views to predict depth maps across cameras.
Specifically, we employ a joint network to process all the surrounding views and propose a cross-view transformer to effectively fuse information from multiple views; a toy sketch of cross-view fusion follows below.
In experiments, our method achieves state-of-the-art performance on challenging multi-camera depth estimation datasets.
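As a rough illustration of cross-view fusion, the sketch below lets feature tokens from all camera views attend to one another with standard multi-head attention. It conveys the general idea only; the module name, shapes, and structure are assumptions, not SurroundDepth's actual cross-view transformer.

```python
import torch
import torch.nn as nn

class CrossViewFusion(nn.Module):
    """Minimal cross-view attention: tokens from every camera view
    attend jointly to the tokens of all views (illustrative only)."""
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feats):
        # feats: (B, V, N, C) -- batch, camera views, flattened spatial tokens, channels
        B, V, N, C = feats.shape
        tokens = feats.reshape(B, V * N, C)           # pool tokens across views
        fused, _ = self.attn(tokens, tokens, tokens)  # every view sees every view
        return self.norm(tokens + fused).reshape(B, V, N, C)

# Usage: fuse six surround-view feature maps of 24x40 tokens with 64 channels.
feats = torch.randn(2, 6, 24 * 40, 64)
fused = CrossViewFusion(dim=64)(feats)
print(fused.shape)  # torch.Size([2, 6, 960, 64])
```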
arXiv Detail & Related papers (2022-04-07T17:58:47Z)
- VPFNet: Improving 3D Object Detection with Virtual Point based LiDAR and Stereo Data Fusion [62.24001258298076]
VPFNet is a new architecture that cleverly aligns and aggregates point cloud and image data at 'virtual' points; a generic sketch of point-level feature sampling follows below.
Our VPFNet achieves 83.21% moderate 3D AP and 91.86% moderate BEV AP on the KITTI test set, ranking 1st since May 21st, 2021.
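A common ingredient of point-level LiDAR-image fusion is projecting 3D points into the image plane and sampling features at the projected pixels. The sketch below shows only that generic operation; it is not VPFNet's actual virtual-point construction, and all names are illustrative.

```python
import torch
import torch.nn.functional as F

def sample_image_features(points, K, feat_map):
    """Project 3D points (N, 3), given in camera coordinates with positive
    depth, through intrinsics K (3, 3), then bilinearly sample a feature
    map (C, H, W) at the projected pixel locations."""
    C, H, W = feat_map.shape
    uvw = points @ K.T                     # pinhole projection
    uv = uvw[:, :2] / uvw[:, 2:3]          # perspective divide -> pixel coords
    # Normalise pixel coordinates to [-1, 1] for grid_sample (x first, then y).
    grid = torch.stack([uv[:, 0] / (W - 1) * 2 - 1,
                        uv[:, 1] / (H - 1) * 2 - 1], dim=-1).view(1, 1, -1, 2)
    sampled = F.grid_sample(feat_map[None], grid, align_corners=True)
    return sampled.view(C, -1).T           # (N, C) image feature per point
```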
arXiv Detail & Related papers (2021-11-29T08:51:20Z)
- One-Shot Object Affordance Detection in the Wild [76.46484684007706]
Affordance detection refers to identifying the potential action possibilities of objects in an image.
We devise a One-Shot Affordance Detection Network (OSAD-Net) that estimates the human action purpose and transfers it to detect the common affordance across all candidate images.
With complex scenes and rich annotations, our PADv2 dataset can be used as a test bed to benchmark affordance detection methods.
arXiv Detail & Related papers (2021-08-08T14:53:10Z)
- 6D Object Pose Estimation using Keypoints and Part Affinity Fields [24.126513851779936]
The task of 6D object pose estimation from RGB images is an important requirement for autonomous service robots to be able to interact with the real world.
We present a two-step pipeline for estimating the 6 DoF translation and orientation of known objects.
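Pipelines of this kind commonly recover the pose in their second step with a Perspective-n-Point (PnP) solve over the detected keypoints. Below is a minimal OpenCV sketch of that step, assuming the 3D keypoint locations on the object model and the camera intrinsics are known; the names are illustrative and this is not necessarily the paper's exact method.

```python
import numpy as np
import cv2

def estimate_pose(kps_2d, kps_3d, K):
    """Recover a 6 DoF pose from detected 2D keypoints (N, 2), their known
    3D model coordinates (N, 3), and camera intrinsics K (3, 3)."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.float32(kps_3d), np.float32(kps_2d), np.float64(K), None)
    if not ok:
        raise RuntimeError("PnP failed to find a pose")
    R, _ = cv2.Rodrigues(rvec)  # axis-angle vector -> 3x3 rotation matrix
    return R, tvec              # object pose in the camera frame
```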
arXiv Detail & Related papers (2021-07-05T14:41:19Z)
- Robust 2D/3D Vehicle Parsing in CVIS [54.825777404511605]
We present a novel approach to robustly detect and perceive vehicles in different camera views as part of a cooperative vehicle-infrastructure system (CVIS).
Our formulation is designed for arbitrary camera views and makes no assumptions about intrinsic or extrinsic parameters.
In practice, our approach outperforms SOTA methods on 2D detection, instance segmentation, and 6-DoF pose estimation.
arXiv Detail & Related papers (2021-03-11T03:35:05Z)
- Data Augmentation for Object Detection via Differentiable Neural Rendering [71.00447761415388]
It is challenging to train a robust object detector when annotated data is scarce.
Existing approaches to this problem include semi-supervised learning methods that interpolate labeled data from unlabeled data.
We introduce an offline data augmentation method for object detection, which semantically interpolates the training data with novel views.
arXiv Detail & Related papers (2021-03-04T06:31:06Z)
- Structure-Aware Network for Lane Marker Extraction with Dynamic Vision Sensor [14.55881454495574]
We introduce the Dynamic Vision Sensor (DVS), a type of event-based sensor, to the lane marker extraction task.
We generate a high-resolution DVS dataset for lane marker extraction with a resolution of 1280×800 pixels.
We then propose a structure-aware network for lane marker extraction in DVS images.
We evaluate our proposed network with other state-of-the-art lane marker extraction models on this dataset.
arXiv Detail & Related papers (2020-08-14T06:28:20Z)