Web-Scale Generic Object Detection at Microsoft Bing
- URL: http://arxiv.org/abs/2107.01814v1
- Date: Mon, 5 Jul 2021 06:46:09 GMT
- Title: Web-Scale Generic Object Detection at Microsoft Bing
- Authors: Stephen Xi Chen, Saurajit Mukherjee, Unmesh Phadke, Tingting Wang,
Junwon Park, Ravi Theja Yada
- Abstract summary: We present Generic Object Detection (GenOD), one of the largest object detection systems deployed to a web-scale general visual search engine.
It can detect over 900 categories for all Microsoft Bing Visual Search queries in near real-time.
- Score: 4.350999432264304
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper, we present Generic Object Detection (GenOD), one of the
largest object detection systems deployed to a web-scale general visual search
engine that can detect over 900 categories for all Microsoft Bing Visual Search
queries in near real-time. It acts as a fundamental visual query understanding
service that provides object-centric information and shows gains in multiple
production scenarios, improving upon domain-specific models. We discuss the
challenges of collecting data, training, deploying and updating such a
large-scale object detection model with multiple dependencies. We discuss a
data collection pipeline that reduces per-bounding-box labeling cost by 81.5%
and latency by 61.2% while improving on annotation quality. We show that GenOD
can improve weighted average precision by over 20% compared to multiple
domain-specific models. We also improve the model update agility by nearly 2
times with the proposed disjoint detector training compared to joint
fine-tuning. Finally, we demonstrate how GenOD benefits visual search
applications by significantly improving object-level search relevance by 54.9%
and user engagement by 59.9%.
Related papers
- Bayesian Detector Combination for Object Detection with Crowdsourced Annotations [49.43709660948812]
Acquiring fine-grained object detection annotations in unconstrained images is time-consuming, expensive, and prone to noise.
We propose a novel Bayesian Detector Combination (BDC) framework to more effectively train object detectors with noisy crowdsourced annotations.
BDC is model-agnostic, requires no prior knowledge of the annotators' skill level, and seamlessly integrates with existing object detection models.
arXiv Detail & Related papers (2024-07-10T18:00:54Z)
- A High-Resolution Dataset for Instance Detection with Multi-View Instance Capture [15.298790238028356]
Instance detection (InsDet) is a long-lasting problem in robotics and computer vision.
Current InsDet datasets are too small in scale by today's standards.
We introduce a new InsDet dataset and protocol.
arXiv Detail & Related papers (2023-10-30T03:58:41Z)
- ComplETR: Reducing the cost of annotations for object detection in dense scenes with vision transformers [73.29057814695459]
ComplETR is designed to explicitly complete missing annotations in partially annotated dense scene datasets.
This reduces the need to annotate every object instance in the scene, thereby reducing annotation cost.
We show performance improvement for several popular detectors such as Faster R-CNN, Cascade R-CNN, CenterNet2, and Deformable DETR.
arXiv Detail & Related papers (2022-09-13T00:11:16Z)
- Scaling Novel Object Detection with Weakly Supervised Detection Transformers [21.219817483091166]
We propose the Weakly Supervised Detection Transformer, which enables efficient knowledge transfer from a large-scale pretraining dataset to WSOD finetuning.
Our experiments show that our approach outperforms previous state-of-the-art models on large-scale novel object detection datasets.
arXiv Detail & Related papers (2022-07-11T21:45:54Z)
- Dynamic Relevance Learning for Few-Shot Object Detection [6.550840743803705]
We propose a dynamic relevance learning model, which uses the relationship between all support images and the Regions of Interest (RoIs) on the query images to construct a dynamic graph convolutional network (GCN).
The proposed model achieves the best overall performance, which shows its effectiveness in learning more generalized features.
arXiv Detail & Related papers (2021-08-04T18:29:42Z)
- Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)
- Scale-aware Automatic Augmentation for Object Detection [63.087930708444695]
We propose Scale-aware AutoAug to learn data augmentation policies for object detection.
In experiments, Scale-aware AutoAug yields significant and consistent improvement on various object detectors.
arXiv Detail & Related papers (2021-03-31T17:11:14Z)
- End-to-End Multi-Object Tracking with Global Response Map [23.755882375664875]
We present a completely end-to-end approach that takes an image sequence or video as input and directly outputs the located and tracked objects of learned types.
Specifically, with our introduced multi-object representation strategy, a global response map can be accurately generated over frames.
Experimental results based on the MOT16 and MOT17 benchmarks show that our proposed on-line tracker achieved state-of-the-art performance on several tracking metrics.
arXiv Detail & Related papers (2020-07-13T12:30:49Z)
- AutoOD: Automated Outlier Detection via Curiosity-guided Search and Self-imitation Learning [72.99415402575886]
Outlier detection is an important data mining task with numerous practical applications.
We propose AutoOD, an automated outlier detection framework, which aims to search for an optimal neural network model.
Experimental results on various real-world benchmark datasets demonstrate that the deep model identified by AutoOD achieves the best performance.
arXiv Detail & Related papers (2020-06-19T18:57:51Z)
- Condensing Two-stage Detection with Automatic Object Key Part Discovery [87.1034745775229]
Two-stage object detectors generally require excessively large models for their detection heads to achieve high accuracy.
We propose that the model parameters of two-stage detection heads can be condensed and reduced by concentrating on object key parts.
Our proposed technique consistently maintains original performance while waiving around 50% of the model parameters of common two-stage detection heads.
arXiv Detail & Related papers (2020-06-10T01:20:47Z)