Control Distance IoU and Control Distance IoU Loss Function for Better
Bounding Box Regression
- URL: http://arxiv.org/abs/2103.11696v1
- Date: Mon, 22 Mar 2021 09:57:25 GMT
- Title: Control Distance IoU and Control Distance IoU Loss Function for Better
Bounding Box Regression
- Authors: Dong Chen and Duoqian Miao
- Abstract summary: We first present an evaluation-feedback module, consisting of an evaluation system and a feedback mechanism.
We then focus on both the evaluation system and the feedback mechanism, and propose Control Distance IoU and the Control Distance IoU loss function.
- Score: 11.916482804759479
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Numerous improvements to feedback mechanisms have contributed to the great
progress in object detection. In this paper, we first present an
evaluation-feedback module, which consists of an evaluation system and a
feedback mechanism. We then analyze and summarize the disadvantages of the
traditional evaluation-feedback module and possible improvements. Finally,
focusing on both the evaluation system and the feedback mechanism, we propose
Control Distance IoU and the Control Distance IoU loss function (CDIoU and
CDIoU loss for short), which yield significant enhancements on several
classical and emerging models without increasing parameters or FLOPs.
Experiments and comparative tests show that a coordinated evaluation-feedback
module can effectively improve model performance. CDIoU and CDIoU loss perform
strongly across several models, including Faster R-CNN, YOLOv4, RetinaNet, and
ATSS, with a maximum AP improvement of 1.9% and an average AP improvement of
0.8% on the MS COCO dataset, compared to traditional evaluation-feedback
modules.
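The abstract describes CDIoU loss as an IoU-based regression loss that also measures the distance between prediction and ground-truth boxes. The paper's exact formulas are not reproduced on this page; as a rough illustration of the idea only, a corner-distance penalty added to a standard IoU loss might be sketched as follows (function names and the exact normalization are assumptions, not the authors' reference implementation):

```python
import numpy as np

def iou(a, b):
    """Intersection over Union for two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def corners(box):
    """Four corner points of an [x1, y1, x2, y2] box."""
    x1, y1, x2, y2 = box
    return np.array([[x1, y1], [x2, y1], [x2, y2], [x1, y2]], dtype=float)

def cdiou_loss(pred, target):
    """Sketch of a CDIoU-style loss: (1 - IoU) plus the mean distance
    between corresponding corners, normalized by the diagonal of the
    smallest rectangle enclosing both boxes (assumed normalization)."""
    ex1, ey1 = min(pred[0], target[0]), min(pred[1], target[1])
    ex2, ey2 = max(pred[2], target[2]), max(pred[3], target[3])
    diag = np.hypot(ex2 - ex1, ey2 - ey1)
    dist = np.linalg.norm(corners(pred) - corners(target), axis=1).mean()
    return 1.0 - iou(pred, target) + dist / diag
```

Note that the corner-distance term stays nonzero (and differentiable) even when the boxes do not overlap, which is the usual motivation for distance-aware IoU losses; the penalty adds no learnable parameters, consistent with the abstract's claim of no extra parameters or FLOPs.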
Related papers
- WSM: Decay-Free Learning Rate Schedule via Checkpoint Merging for LLM Pre-training [64.0932926819307]
We present Warmup-Stable and Merge (WSM), a framework that establishes a formal connection between learning rate decay and model merging. WSM provides a unified theoretical foundation for emulating various decay strategies. Our framework consistently outperforms the widely-adopted Warmup-Stable-Decay (WSD) approach across multiple benchmarks.
arXiv Detail & Related papers (2025-07-23T16:02:06Z)
- RefCritic: Training Long Chain-of-Thought Critic Models with Refinement Feedback [57.967762383794806]
RefCritic is a long-chain-of-thought critic module based on reinforcement learning with dual rule-based rewards. We evaluate RefCritic on Qwen2.5-14B-Instruct and DeepSeek-R1-Distill-Qwen-14B across five benchmarks.
arXiv Detail & Related papers (2025-07-20T16:19:51Z)
- Pairwise or Pointwise? Evaluating Feedback Protocols for Bias in LLM-Based Evaluation [57.380464382910375]
We show that the choice of feedback protocol can significantly affect evaluation reliability and induce systematic biases.
In particular, we show that pairwise evaluation protocols are more vulnerable to distracted evaluation.
arXiv Detail & Related papers (2025-04-20T19:05:59Z)
- Review, Refine, Repeat: Understanding Iterative Decoding of AI Agents with Dynamic Evaluation and Selection [71.92083784393418]
Inference-time methods such as Best-of-N (BON) sampling offer a simple yet effective alternative to improve performance.
We propose Iterative Agent Decoding (IAD) which combines iterative refinement with dynamic candidate evaluation and selection guided by a verifier.
arXiv Detail & Related papers (2025-04-02T17:40:47Z)
- AI-in-the-Loop Sensing and Communication Joint Design for Edge Intelligence [65.29835430845893]
We propose a framework that enhances edge intelligence through AI-in-the-loop joint sensing and communication.
A key contribution of our work is establishing an explicit relationship between validation loss and the system's tunable parameters.
We show that our framework reduces communication energy consumption by up to 77 percent and sensing costs measured by the number of samples by up to 52 percent.
arXiv Detail & Related papers (2025-02-14T14:56:58Z)
- Contrastive Learning for Cold Start Recommendation with Adaptive Feature Fusion [2.2194815687410627]
This paper proposes a cold start recommendation model that integrates contrastive learning.
The model dynamically adjusts the weights of key features through an adaptive feature selection module.
It integrates user attributes, item meta-information, and contextual features by combining a multimodal feature fusion mechanism.
arXiv Detail & Related papers (2025-02-05T23:15:31Z)
- VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment [55.7956150385255]
We investigate the efficacy of AI feedback to scale supervision for aligning vision-language models.
We introduce VLFeedback, the first large-scale vision-language feedback dataset.
We train Silkie, an LVLM fine-tuned via direct preference optimization on VLFeedback.
arXiv Detail & Related papers (2024-10-12T07:56:47Z)
- Improved Unet brain tumor image segmentation based on GSConv module and ECA attention mechanism [0.0]
An improved model of medical image segmentation for brain tumor is discussed, which is a deep learning algorithm based on U-Net architecture.
Based on the traditional U-Net, we introduce GSConv module and ECA attention mechanism to improve the performance of the model in medical image segmentation tasks.
arXiv Detail & Related papers (2024-09-20T16:35:19Z)
- Improved Unet model for brain tumor image segmentation based on ASPP-coordinate attention mechanism [9.496880456126709]
We propose an improved Unet model for brain tumor image segmentation.
It combines coordinate attention mechanism and ASPP module to improve the segmentation effect.
Compared to the traditional Unet, the enhanced model offers superior segmentation and edge accuracy.
arXiv Detail & Related papers (2024-09-13T07:08:48Z)
- Explanatory Model Monitoring to Understand the Effects of Feature Shifts on Performance [61.06245197347139]
We propose a novel approach to explain the behavior of a black-box model under feature shifts.
We refer to our method that combines concepts from Optimal Transport and Shapley Values as Explanatory Performance Estimation.
arXiv Detail & Related papers (2024-08-24T18:28:19Z)
- Unified-IoU: For High-Quality Object Detection [1.62877896907106]
We propose a new IoU loss function, called Unified-IoU (UIoU), which is more concerned with the weight assignment between different quality prediction boxes.
Our proposed method achieves better performance on multiple datasets, especially at a high IoU threshold.
arXiv Detail & Related papers (2024-08-13T04:56:45Z)
- Beyond Thumbs Up/Down: Untangling Challenges of Fine-Grained Feedback for Text-to-Image Generation [67.88747330066049]
Fine-grained feedback captures nuanced distinctions in image quality and prompt-alignment.
We show that demonstrating its superiority to coarse-grained feedback is not automatic.
We identify key challenges in eliciting and utilizing fine-grained feedback.
arXiv Detail & Related papers (2024-06-24T17:19:34Z)
- YOLOv10: Real-Time End-to-End Object Detection [68.28699631793967]
YOLOs have emerged as the predominant paradigm in the field of real-time object detection.
The reliance on the non-maximum suppression (NMS) for post-processing hampers the end-to-end deployment of YOLOs.
We introduce the holistic efficiency-accuracy driven model design strategy for YOLOs.
arXiv Detail & Related papers (2024-05-23T11:44:29Z)
- Autonomous Evaluation and Refinement of Digital Agents [57.12281122337407]
We show that domain-general automatic evaluators can significantly improve the performance of agents for web navigation and device control.
We validate the performance of these models in several popular benchmarks for digital agents, finding between 74.4 and 92.9% agreement with oracle evaluation metrics.
arXiv Detail & Related papers (2024-04-09T17:25:47Z)
- Reinforcement Learning from Delayed Observations via World Models [10.298219828693489]
In reinforcement learning settings, agents assume immediate feedback about the effects of their actions after taking them.
In practice, this assumption may not hold true due to physical constraints and can significantly impact the performance of learning algorithms.
We propose leveraging world models, which have shown success in integrating past observations and learning dynamics, to handle observation delays.
arXiv Detail & Related papers (2024-03-18T23:18:27Z)
- Systematic Architectural Design of Scale Transformed Attention Condenser DNNs via Multi-Scale Class Representational Response Similarity Analysis [93.0013343535411]
We propose a novel type of analysis called Multi-Scale Class Representational Response Similarity Analysis (ClassRepSim)
We show that adding STAC modules to ResNet style architectures can result in up to a 1.6% increase in top-1 accuracy.
Results from ClassRepSim analysis can be used to select an effective parameterization of the STAC module resulting in competitive performance.
arXiv Detail & Related papers (2023-06-16T18:29:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.