Graph Fusion Network for Multi-Oriented Object Detection
- URL: http://arxiv.org/abs/2205.03562v3
- Date: Tue, 20 Jun 2023 03:50:38 GMT
- Title: Graph Fusion Network for Multi-Oriented Object Detection
- Authors: Shi-Xue Zhang, Xiaobin Zhu, Jie-Bo Hou, Xu-Cheng Yin
- Abstract summary: We propose a novel graph fusion network, named GFNet, for multi-oriented object detection.
Our GFNet adaptively fuses dense detection boxes to detect more accurate and holistic multi-oriented object instances.
- Score: 15.451824121019449
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In object detection, non-maximum suppression (NMS) methods are extensively
adopted to remove horizontal duplicates of detected dense boxes when generating
final object instances. However, because dense detection boxes are often of
degraded quality and context information is not explicitly explored, existing NMS
methods based on simple intersection-over-union (IoU) metrics tend to underperform
when detecting multi-oriented and long objects. Unlike general
NMS methods, which rely on duplicate removal, we propose a novel graph fusion network,
named GFNet, for multi-oriented object detection. GFNet is extensible and
adaptively fuses dense detection boxes to detect more accurate and holistic
multi-oriented object instances. Specifically, we first adopt a locality-aware
clustering algorithm to group dense detection boxes into different clusters. We
then construct an instance sub-graph for the detection boxes belonging to each
cluster. Finally, we propose a graph-based fusion network built on a Graph Convolutional
Network (GCN) that learns to reason over and fuse the detection boxes to generate
final instance boxes. Extensive experiments on both publicly available
multi-oriented text datasets (MSRA-TD500, ICDAR2015, ICDAR2017-MLT)
and a multi-oriented object dataset (DOTA) verify the effectiveness and
robustness of our method against general NMS methods in multi-oriented object
detection.
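The contrast the abstract draws between duplicate removal and box fusion can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: boxes are axis-aligned `(x1, y1, x2, y2, score)` tuples rather than oriented boxes, clustering is approximated by connected components over an IoU threshold instead of the paper's locality-aware algorithm, and a score-weighted average stands in for the learned GCN fusion. All function names (`iou`, `greedy_nms`, `cluster_boxes`, `fuse_cluster`) are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: axis-aligned boxes (x1, y1, x2, y2, score); the paper targets oriented boxes.
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], thresh: float = 0.5) -> List[Box]:
    """Baseline: keep the highest-scoring box, discard overlapping duplicates."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) <= thresh for k in kept):
            kept.append(box)
    return kept


def cluster_boxes(boxes: List[Box], thresh: float = 0.5) -> List[List[Box]]:
    """Group boxes into connected components of the 'IoU > thresh' graph.

    Stand-in for the paper's locality-aware clustering (an assumption).
    """
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if iou(boxes[i], boxes[j]) > thresh:
                parent[find(i)] = find(j)

    groups: Dict[int, List[Box]] = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)
    return list(groups.values())


def fuse_cluster(cluster: List[Box]) -> Box:
    """Fuse one cluster by score-weighted averaging of coordinates.

    Stand-in for the learned GCN fusion; assumes all scores are positive.
    """
    total = sum(b[4] for b in cluster)
    coords = tuple(sum(b[k] * b[4] for b in cluster) / total for k in range(4))
    return (coords[0], coords[1], coords[2], coords[3], max(b[4] for b in cluster))


boxes = [(0, 0, 10, 10, 0.9), (1, 0, 11, 10, 0.6), (50, 50, 60, 60, 0.8)]
nms_result = greedy_nms(boxes)
fused_result = [fuse_cluster(c) for c in cluster_boxes(boxes)]
```

In this toy input, NMS simply discards the lower-scoring overlapping box, while the fusion path lets every box in a cluster contribute to the final coordinates. That is the behaviour the abstract argues for, although GFNet learns the fusion rather than using fixed weights.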
Related papers
- Consensus Focus for Object Detection and minority classes [3.739946023378878]
We propose a modified consensus focus for semi-supervised and long-tailed object detection.
Our tests on synthetic driving datasets yielded higher-confidence and more accurate bounding boxes than NMS, soft-NMS, and WBF.
arXiv Detail & Related papers (2024-01-10T19:55:15Z) - Object-Centric Multiple Object Tracking [124.30650395969126]
This paper proposes a video object-centric model for multiple-object tracking pipelines.
It consists of an index-merge module that adapts the object-centric slots into detection outputs and an object memory module.
Benefiting from object-centric learning, we only require sparse detection labels for object localization and feature binding.
arXiv Detail & Related papers (2023-09-01T03:34:12Z) - Illicit item detection in X-ray images for security applications [7.519872646378835]
Automated detection of contraband items in X-ray images can significantly increase public safety.
Modern computer vision algorithms relying on Deep Neural Networks (DNNs) have proven capable of undertaking this task.
This paper proposes a two-fold improvement of such algorithms for the X-ray analysis domain.
arXiv Detail & Related papers (2023-05-03T07:28:05Z) - Discovery-and-Selection: Towards Optimal Multiple Instance Learning for
Weakly Supervised Object Detection [86.86602297364826]
We propose a discovery-and-selection approach fused with multiple instance learning (DS-MIL).
Our proposed DS-MIL approach can consistently improve the baselines, reporting state-of-the-art performance.
arXiv Detail & Related papers (2021-10-18T07:06:57Z) - Multi-Source Domain Adaptation for Object Detection [52.87890831055648]
We propose a unified Faster R-CNN based framework, termed Divide-and-Merge Spindle Network (DMSN).
DMSN can simultaneously enhance domain invariance and preserve discriminative power.
We develop a novel pseudo learning algorithm to approximate optimal parameters of pseudo target subset.
arXiv Detail & Related papers (2021-06-30T03:17:20Z) - Global Correlation Network: End-to-End Joint Multi-Object Detection and
Tracking [2.749204052800622]
We present a novel network, called Global Correlation Network (GCNet), that performs joint multi-object detection and tracking in an end-to-end way.
GCNet introduces the global correlation layer for regression of absolute size and coordinates of bounding boxes instead of offsets prediction.
The pipeline of detection and tracking by GCNet is conceptually simple, which does not need non-maximum suppression, data association, and other complicated tracking strategies.
arXiv Detail & Related papers (2021-03-23T13:16:42Z) - Object Detection Made Simpler by Eliminating Heuristic NMS [70.93004137521946]
We show a simple NMS-free, end-to-end object detection framework.
We attain on par or even improved detection accuracy compared with the original one-stage detector.
arXiv Detail & Related papers (2021-01-28T02:38:29Z) - MRDet: A Multi-Head Network for Accurate Oriented Object Detection in
Aerial Images [51.227489316673484]
We propose an arbitrary-oriented region proposal network (AO-RPN) to generate oriented proposals transformed from horizontal anchors.
To obtain accurate bounding boxes, we decouple the detection task into multiple subtasks and propose a multi-head network.
Each head is specially designed to learn the features optimal for the corresponding task, which allows our network to detect objects accurately.
arXiv Detail & Related papers (2020-12-24T06:36:48Z) - End-to-End Object Detection with Fully Convolutional Network [71.56728221604158]
We introduce a Prediction-aware One-To-One (POTO) label assignment for classification to enable end-to-end detection.
A simple 3D Max Filtering (3DMF) is proposed to utilize the multi-scale features and improve the discriminability of convolutions in the local region.
Our end-to-end framework achieves competitive performance against many state-of-the-art detectors with NMS on COCO and CrowdHuman datasets.
arXiv Detail & Related papers (2020-12-07T09:14:55Z) - End-to-End Multi-Object Tracking with Global Response Map [23.755882375664875]
We present a completely end-to-end approach that takes image-sequence/video as input and outputs directly the located and tracked objects of learned types.
Specifically, with our introduced multi-object representation strategy, a global response map can be accurately generated over frames.
Experimental results based on the MOT16 and MOT17 benchmarks show that our proposed on-line tracker achieved state-of-the-art performance on several tracking metrics.
arXiv Detail & Related papers (2020-07-13T12:30:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.