Towards Automatic Power Battery Detection: New Challenge, Benchmark
Dataset and Baseline
- URL: http://arxiv.org/abs/2312.02528v2
- Date: Thu, 29 Feb 2024 03:56:57 GMT
- Authors: Xiaoqi Zhao, Youwei Pang, Zhenyu Chen, Qian Yu, Lihe Zhang, Hanqi Liu,
Jiaming Zuo, Huchuan Lu
- Abstract summary: We conduct a comprehensive study on a new task named power battery detection (PBD).
It aims to localize the endpoints of the dense cathode and anode plates in X-ray images to evaluate the quality of power batteries.
We propose a novel segmentation-based solution for PBD, termed multi-dimensional collaborative network (MDCNet).
- Score: 70.30473488226093
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We conduct a comprehensive study on a new task named power battery detection
(PBD), which aims to localize the endpoints of the dense cathode and anode plates in
X-ray images to evaluate the quality of power batteries. Existing manufacturers
usually rely on human visual inspection to complete PBD, which makes it difficult
to balance the accuracy and efficiency of detection. To address this issue and
draw more attention to this meaningful task, we first carefully collect a
dataset, called X-ray PBD, which has 1,500 diverse X-ray images selected from
thousands of power batteries from 5 manufacturers, with 7 different types of visual
interference. Then, we propose a novel segmentation-based solution for PBD,
termed multi-dimensional collaborative network (MDCNet). With the help of line
and counting predictors, the representation of the point segmentation branch
can be improved in both semantic and detail aspects. In addition, we design an
effective distance-adaptive mask generation strategy, which can alleviate the
visual challenge caused by the inconsistent distribution density of plates to
provide MDCNet with stable supervision. Without any bells and whistles, our
segmentation-based MDCNet consistently outperforms various other corner
detection, crowd counting and general/tiny object detection-based solutions,
making it a strong baseline that can help facilitate future research in PBD.
Finally, we share some potential difficulties and directions for future research.
The source code and datasets will be publicly available at
https://github.com/Xiaoqi-Zhao-DLUT/X-ray-PBD.
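The abstract does not spell out how the distance-adaptive mask generation strategy works. As a rough illustration of the general idea only, the sketch below draws a disc around each annotated endpoint and scales its radius by the distance to the nearest neighbouring endpoint, so that densely packed plates get small discs and sparse ones get larger discs. The function names (`adaptive_radii`, `point_mask`) and the parameters `scale`, `r_min` and `r_max` are hypothetical and are not taken from the paper or its code release.

```python
import math


def adaptive_radii(points, scale=0.5, r_min=1.0, r_max=8.0):
    """Assumed rule: radius = scale * nearest-neighbour distance, clamped to [r_min, r_max]."""
    radii = []
    for i, (xi, yi) in enumerate(points):
        # Distance to the closest other point; a lone point falls back to r_max.
        nearest = min(
            (math.hypot(xi - xj, yi - yj)
             for j, (xj, yj) in enumerate(points) if j != i),
            default=r_max / scale,
        )
        radii.append(max(r_min, min(r_max, scale * nearest)))
    return radii


def point_mask(points, height, width, **kwargs):
    """Rasterize one filled disc per point into a binary mask (list of rows)."""
    radii = adaptive_radii(points, **kwargs)
    mask = [[0] * width for _ in range(height)]
    for (cx, cy), r in zip(points, radii):
        # Only visit the bounding box of the disc, clipped to the image.
        for y in range(max(0, int(cy - r)), min(height, int(cy + r) + 1)):
            for x in range(max(0, int(cx - r)), min(width, int(cx + r) + 1)):
                if (x - cx) ** 2 + (y - cy) ** 2 <= r * r:
                    mask[y][x] = 1
    return mask
```

With endpoints at x = 2, 6 and 20 on the same row, the two close points receive a radius of 2 pixels and the isolated one a radius of 7, which is the density-dependent behaviour the abstract alludes to. The actual MDCNet supervision may be built quite differently.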
Related papers
- PGNeXt: High-Resolution Salient Object Detection via Pyramid Grafting Network [24.54269823691119]
We present an advanced study on more challenging high-resolution salient object detection (HRSOD) from both dataset and network framework perspectives.
To compensate for the lack of HRSOD datasets, we thoughtfully collect a large-scale high-resolution salient object detection dataset, called UHRSD.
All the images are finely annotated at the pixel level, far exceeding previous low-resolution SOD datasets.
arXiv Detail & Related papers (2024-08-02T09:31:21Z) - Towards Unified 3D Object Detection via Algorithm and Data Unification [70.27631528933482]
We build the first unified multi-modal 3D object detection benchmark MM-Omni3D and extend the aforementioned monocular detector to its multi-modal version.
We name the designed monocular and multi-modal detectors as UniMODE and MM-UniMODE, respectively.
arXiv Detail & Related papers (2024-02-28T18:59:31Z) - InsPLAD: A Dataset and Benchmark for Power Line Asset Inspection in UAV
Images [1.8524180288472398]
This paper introduces InsPLAD, a Power Line Asset Inspection dataset and Benchmark containing 10,607 high-resolution Unmanned Aerial Vehicles colour images.
The dataset contains seventeen unique power line assets captured from real-world operating power lines.
We thoroughly evaluate state-of-the-art and popular methods for three image-level computer vision tasks covered by InsPLAD: object detection, through the AP metric; defect classification, through Balanced Accuracy; and anomaly detection, through the AUROC metric.
arXiv Detail & Related papers (2023-11-02T22:06:23Z) - Autonomous Point Cloud Segmentation for Power Lines Inspection in Smart
Grid [56.838297900091426]
An unsupervised Machine Learning (ML) framework is proposed, to detect, extract and analyze the characteristics of power lines of both high and low voltage.
The proposed framework can efficiently detect the power lines and perform PLC-based hazard analysis.
arXiv Detail & Related papers (2023-08-14T17:14:58Z) - Multi-View Fusion and Distillation for Subgrade Distresses Detection
based on 3D-GPR [19.49863426864145]
We introduce a novel methodology for the subgrade distress detection task by leveraging the multi-view information from 3D-GPR data.
We develop a novel Multi-View Fusion and Distillation framework, GPR-MVFD, specifically designed to optimally utilize the multi-view GPR dataset.
arXiv Detail & Related papers (2023-08-09T08:06:28Z) - Leveraging Multi-view Data for Improved Detection Performance: An
Industrial Use Case [0.5249805590164901]
We present a multi-view object detection framework that offers a fast and precise solution.
We introduce a novel multi-view dataset with semi-automatic ground-truth data, which results in significant labeling resource savings.
Our experiments demonstrate a 15% improvement in mAP for detecting components that range in size from 0.5 to 27.0 mm.
arXiv Detail & Related papers (2023-04-17T09:41:37Z) - Pyramid Grafting Network for One-Stage High Resolution Saliency
Detection [29.013012579688347]
We propose a one-stage framework called Pyramid Grafting Network (PGNet) to extract features from different resolution images independently.
An attention-based Cross-Model Grafting Module (CMGM) is proposed to enable the CNN branch to combine broken detailed information more holistically.
We contribute a new Ultra-High-Resolution Saliency Detection dataset UHRSD, containing 5,920 images at 4K-8K resolutions.
arXiv Detail & Related papers (2022-04-11T12:22:21Z) - Unsupervised Person Re-Identification with Wireless Positioning under
Weak Scene Labeling [131.18390399368997]
We propose to explore unsupervised person re-identification with both visual data and wireless positioning trajectories under weak scene labeling.
Specifically, we propose a novel unsupervised multimodal training framework (UMTF), which models the complementarity of visual data and wireless information.
Our UMTF contains a multimodal data association strategy (MMDA) and a multimodal graph neural network (MMGN).
arXiv Detail & Related papers (2021-10-29T08:25:44Z) - Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for
Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z) - Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scaled pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z)