MPDIoU: A Loss for Efficient and Accurate Bounding Box Regression
- URL: http://arxiv.org/abs/2307.07662v1
- Date: Fri, 14 Jul 2023 23:54:49 GMT
- Title: MPDIoU: A Loss for Efficient and Accurate Bounding Box Regression
- Authors: Ma Siliang, Xu Yong
- Abstract summary: We propose a novel bounding box similarity comparison metric MPDIoU.
The MPDIoU loss function is applied to state-of-the-art instance segmentation (e.g., YOLACT) and object detection (e.g., YOLOv7) models trained on PASCAL VOC, MS COCO, and IIIT5k.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bounding box regression (BBR) has been widely used in object detection and
instance segmentation, which is an important step in object localization.
However, most of the existing loss functions for bounding box regression cannot
be optimized when the predicted box has the same aspect ratio as the
ground-truth box but different width and height values. In
order to tackle the issues mentioned above, we fully explore the geometric
features of horizontal rectangle and propose a novel bounding box similarity
comparison metric MPDIoU based on minimum point distance, which contains all of
the relevant factors considered in the existing loss functions, namely
overlapping or non-overlapping area, central points distance, and deviation of
width and height, while simplifying the calculation process. On this basis, we
propose a bounding box regression loss function based on MPDIoU, called
$L_{MPDIoU}$. Experimental results show that the MPDIoU loss function, applied
to state-of-the-art instance segmentation (e.g., YOLACT) and object detection
(e.g., YOLOv7) models trained on PASCAL VOC, MS COCO, and IIIT5k, outperforms
existing loss functions.
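The abstract describes the metric only in words. As a minimal sketch, assuming the commonly cited formulation (IoU minus the squared distances between the two top-left corners and between the two bottom-right corners, each normalized by the squared diagonal of the input image), the loss could be computed as below. The function name `mpdiou_loss`, the `(x1, y1, x2, y2)` box layout, and the `img_w`/`img_h` parameters are illustrative assumptions, not taken from this page.

```python
def mpdiou_loss(pred, gt, img_w, img_h):
    """Sketch of an MPDIoU-style loss for two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed layout).
    img_w and img_h are the input image size used for normalization.
    """
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt

    # Intersection area (zero if the boxes do not overlap).
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    area_pred = (px2 - px1) * (py2 - py1)
    area_gt = (gx2 - gx1) * (gy2 - gy1)
    union = area_pred + area_gt - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners.
    d1_sq = (px1 - gx1) ** 2 + (py1 - gy1) ** 2  # top-left corners
    d2_sq = (px2 - gx2) ** 2 + (py2 - gy2) ** 2  # bottom-right corners

    # Normalize by the squared image diagonal (assumed normalization).
    norm = img_w ** 2 + img_h ** 2
    mpdiou = iou - d1_sq / norm - d2_sq / norm
    return 1.0 - mpdiou


# Same center and aspect ratio as the ground truth, different sizes.
gt_box = (40, 40, 60, 60)
inner = (45, 45, 55, 55)   # smaller than the ground truth
outer = (30, 30, 70, 70)   # larger than the ground truth
loss_inner = mpdiou_loss(inner, gt_box, 100, 100)
loss_outer = mpdiou_loss(outer, gt_box, 100, 100)
```

In this example both predictions have the same IoU with the ground truth, share its center, and share its aspect ratio, which is the situation the abstract singles out. The corner-distance terms still separate the two cases, so the sketch assigns them different loss values.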
Related papers
- FPDIoU Loss: A Loss Function for Efficient Bounding Box Regression of Rotated Object Detection [10.655167287088368]
We propose a novel metric for arbitrary shapes comparison based on minimum points distance.
$FPDIoU$ loss has been applied to state-of-the-art rotated object detection.
arXiv Detail & Related papers (2024-05-16T09:44:00Z)
- Shape-IoU: More Accurate Metric considering Bounding Box Shape and Scale [5.8666339171606445]
The Shape IoU method can calculate the loss by focusing on the shape and scale of the bounding box itself.
Our method can effectively improve detection performance and outperform existing methods, achieving state-of-the-art performance in different detection tasks.
arXiv Detail & Related papers (2023-12-29T16:05:02Z)
- Edge Based Oriented Object Detection [8.075609633483248]
We propose a unique loss function based on edge gradients to enhance the detection accuracy of oriented objects.
We achieve a mAP increase of 1.3% on the DOTA dataset.
arXiv Detail & Related papers (2023-09-15T09:19:38Z)
- Instance-Variant Loss with Gaussian RBF Kernel for 3D Cross-modal Retrieval [52.41252219453429]
Existing methods treat all instances equally, applying the same penalty strength to instances with varying degrees of difficulty.
This can result in ambiguous convergence or local optima, severely compromising the separability of the feature space.
We propose an Instance-Variant loss to assign different penalty strengths to different instances, improving the space separability.
arXiv Detail & Related papers (2023-05-07T10:12:14Z)
- SIoU Loss: More Powerful Learning for Bounding Box Regression [0.0]
The SIoU loss function is proposed, in which the penalty metrics are redefined to account for the angle of the vector of the desired regression.
Applied to conventional neural networks and datasets, SIoU is shown to improve both the speed of training and the accuracy of inference.
arXiv Detail & Related papers (2022-05-25T12:46:21Z)
- PnP-DETR: Towards Efficient Visual Analysis with Transformers [146.55679348493587]
Recently, DETR pioneered the solution of vision tasks with transformers; it directly translates the image feature map into the object detection result.
The recent transformer-based image recognition model ViT also shows a consistent efficiency gain.
arXiv Detail & Related papers (2021-09-15T01:10:30Z)
- InverseForm: A Loss Function for Structured Boundary-Aware Segmentation [80.39674800972182]
We present a novel boundary-aware loss term for semantic segmentation using an inverse-transformation network.
This plug-in loss term complements the cross-entropy loss in capturing boundary transformations.
We analyze the quantitative and qualitative effects of our loss function on three indoor and outdoor segmentation benchmarks.
arXiv Detail & Related papers (2021-04-06T18:52:45Z)
- Optimization for Oriented Object Detection via Representation Invariance Loss [2.501282372971187]
Mainstream rotation detectors use oriented bounding boxes (OBB) or quadrilateral bounding boxes (QBB) to represent the rotating objects.
We propose a Representation Invariance Loss (RIL) to optimize the bounding box regression for the rotating objects.
Our method achieves consistent and substantial improvement in experiments on remote sensing datasets and scene text datasets.
arXiv Detail & Related papers (2021-03-22T07:55:33Z)
- Focal and Efficient IOU Loss for Accurate Bounding Box Regression [63.14659624634066]
In object detection, bounding box regression (BBR) is a crucial step that determines the object localization performance.
Most previous loss functions for BBR have two main drawbacks: (i) Both $\ell_n$-norm and IOU-based loss functions are inefficient to depict the objective of BBR, which leads to slow convergence and inaccurate regression results.
arXiv Detail & Related papers (2021-01-20T14:33:58Z)
- Canny-VO: Visual Odometry with RGB-D Cameras based on Geometric 3D-2D Edge Alignment [85.32080531133799]
This paper reviews the classical problem of free-form curve registration and applies it to an efficient RGBD visual odometry system called Canny-VO.
Two replacements for the distance transformation commonly used in edge registration are proposed: Approximate Nearest Neighbour Fields and Oriented Nearest Neighbour Fields.
3D2D edge alignment benefits from these alternative formulations in terms of both efficiency and accuracy.
arXiv Detail & Related papers (2020-12-15T11:42:17Z)
- Enhancing Geometric Factors in Model Learning and Inference for Object Detection and Instance Segmentation [91.12575065731883]
We propose Complete-IoU (CIoU) loss and Cluster-NMS for enhancing geometric factors in both bounding box regression and Non-Maximum Suppression (NMS).
The training of deep models using CIoU loss results in consistent AP and AR improvements in comparison to widely adopted $\ell_n$-norm loss and IoU-based loss.
Cluster-NMS is very efficient due to its pure GPU implementation, and geometric factors can be incorporated to improve both AP and AR.
arXiv Detail & Related papers (2020-05-07T16:00:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.