Efficient Segmentation with Texture in Ore Images Based on
Box-supervised Approach
- URL: http://arxiv.org/abs/2311.05929v1
- Date: Fri, 10 Nov 2023 08:28:22 GMT
- Authors: Guodong Sun and Delong Huang and Yuting Peng and Le Cheng and Bo Wu
and Yang Zhang
- Abstract summary: A box-supervised technique with texture features is proposed to identify complete and independent ores.
The proposed method achieves over 50 frames per second with a small model size of 21.6 MB.
The method maintains a high level of accuracy compared with state-of-the-art approaches on an ore image dataset.
- Score: 6.6773975364173
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Image segmentation methods have been utilized to determine the particle size
distribution of crushed ores. Due to the complex working environment,
high-powered computing equipment is difficult to deploy. At the same time, the
ore distribution is stacked, and it is difficult to identify the complete
features. To address this issue, an effective box-supervised technique with
texture features is provided for ore image segmentation that can identify
complete and independent ores. Firstly, a ghost feature pyramid network
(Ghost-FPN) is proposed to process the features obtained from the backbone to
reduce redundant semantic information and computation generated by complex
networks. Then, an optimized detection head is proposed to obtain features
that maintain accuracy. Finally, Lab color space (Lab) and local binary patterns
(LBP) texture features are combined to form a fusion feature similarity-based
loss function to improve accuracy while incurring no loss. Experiments on MS
COCO have shown that the proposed fusion features are also worth studying on
other types of datasets. Extensive experimental results demonstrate the
effectiveness of the proposed method, which achieves over 50 frames per second
with a small model size of 21.6 MB. Meanwhile, the method maintains a high
level of accuracy compared with state-of-the-art approaches on an ore image
dataset. The source code is available at
https://github.com/MVME-HBUT/OREINST.
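The abstract names Lab color and LBP texture as the two ingredients of the fusion feature similarity-based loss but does not give its exact form. The following is a minimal sketch of how such a pairwise similarity could be computed, not the authors' implementation. It assumes a basic 3x3 LBP, a Lab image that has already been converted from RGB, fusion by simple channel concatenation, and an exponential similarity exp(-distance / theta) between horizontally adjacent pixels. The names `lbp_codes`, `fused_similarity`, `theta`, and `lbp_weight` are hypothetical.

```python
import numpy as np


def lbp_codes(gray):
    """Basic 3x3 local binary pattern code for every interior pixel.

    Each of the eight neighbours contributes one bit, set when the
    neighbour is greater than or equal to the centre pixel.
    """
    g = np.asarray(gray, dtype=np.float64)
    center = g[1:-1, 1:-1]
    h, w = center.shape
    # Neighbour offsets, clockwise starting at the top-left corner.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    codes = np.zeros((h, w), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = g[dy:dy + h, dx:dx + w]
        codes |= (neighbour >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes


def fused_similarity(lab, gray, theta=2.0, lbp_weight=1.0):
    """Similarity between each interior pixel and its right-hand neighbour.

    Assumption: the fused feature is the Lab vector concatenated with the
    LBP code scaled to [0, 1]; the paper's actual fusion rule may differ.
    """
    lab = np.asarray(lab, dtype=np.float64)[1:-1, 1:-1, :]  # crop to LBP size
    lbp = lbp_codes(gray).astype(np.float64) / 255.0
    feat = np.concatenate([lab, lbp_weight * lbp[..., None]], axis=-1)
    dist = np.linalg.norm(feat[:, 1:, :] - feat[:, :-1, :], axis=-1)
    return np.exp(-dist / theta)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    lab_img = rng.normal(size=(6, 7, 3))   # stand-in for a real Lab image
    gray_img = rng.normal(size=(6, 7))     # stand-in for a grayscale image
    print(fused_similarity(lab_img, gray_img).shape)
```

In a box-supervised setting, a similarity map of this kind would typically be used to decide which neighbouring pixel pairs should receive the same mask label. Because it only shapes the training loss, it adds no work at inference time.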
Related papers
- An Efficient MLP-based Point-guided Segmentation Network for Ore Images
with Ambiguous Boundary [12.258442550351178]
This paper proposes a lightweight framework based on Multi-Layer Perceptron (MLP), which focuses on solving the problem of edge burring.
Our approach achieves a remarkable processing speed of over 27 frames per second with a model size of only 73 MB.
Our method delivers a consistently high level of accuracy, with impressive performance scores of 60.4 and 48.9 in $AP_{50}^{box}$ and $AP_{50}^{mask}$ respectively.
arXiv Detail & Related papers (2024-02-27T10:09:29Z)
- PairingNet: A Learning-based Pair-searching and -matching Network for
Image Fragments [6.694162736590122]
We propose a learning-based image fragment pair-searching and -matching approach to solve the challenging restoration problem.
Our proposed network achieves excellent pair-searching accuracy, reduces matching errors, and significantly reduces computational time.
arXiv Detail & Related papers (2023-12-14T07:43:53Z)
- DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the stable diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
- Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization.
This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts.
Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z)
- Efficient Context Integration through Factorized Pyramidal Learning for
Ultra-Lightweight Semantic Segmentation [1.0499611180329804]
We propose a novel Factorized Pyramidal Learning (FPL) module to aggregate rich contextual information in an efficient manner.
We decompose the spatial pyramid into two stages which enables a simple and efficient feature fusion within the module to solve the notorious checkerboard effect.
Based on the FPL module and FIR unit, we propose an ultra-lightweight real-time network, called FPLNet, which achieves state-of-the-art accuracy-efficiency trade-off.
arXiv Detail & Related papers (2023-02-23T05:34:51Z)
- EPMF: Efficient Perception-aware Multi-sensor Fusion for 3D Semantic Segmentation [62.210091681352914]
We study multi-sensor fusion for 3D semantic segmentation for many applications, such as autonomous driving and robotics.
In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF).
We propose a two-stream network to extract features from the two modalities separately. The extracted features are fused by effective residual-based fusion modules.
arXiv Detail & Related papers (2021-06-21T10:47:26Z)
- Towards an efficient framework for Data Extraction from Chart Images [27.114170963444074]
We adopt state-of-the-art computer vision techniques for the data extraction stage in a data mining system.
For building a robust point detector, a fully convolutional network with feature fusion module is adopted.
For data conversion, we translate the detected element into data with semantic value.
arXiv Detail & Related papers (2021-05-05T13:18:53Z)
- Lightweight Convolutional Neural Network with Gaussian-based Grasping
Representation for Robotic Grasping Detection [4.683939045230724]
Current object detectors struggle to strike a balance between high accuracy and fast inference speed.
We present an efficient and robust fully convolutional neural network model to perform robotic grasping pose estimation.
The network is an order of magnitude smaller than other excellent algorithms.
arXiv Detail & Related papers (2021-01-25T16:36:53Z)
- Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt the graph propagation to capture the observed spatial contexts.
We then apply the attention mechanism on the propagation, which encourages the network to model the contextual information adaptively.
Finally, we introduce the symmetric gated fusion strategy to exploit the extracted multi-modal features effectively.
Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves the state-of-the-art performance on two benchmarks.
arXiv Detail & Related papers (2020-08-25T06:00:06Z)
- Progressively Guided Alternate Refinement Network for RGB-D Salient
Object Detection [63.18846475183332]
We aim to develop an efficient and compact deep network for RGB-D salient object detection.
We propose a progressively guided alternate refinement network to refine it.
Our model outperforms existing state-of-the-art approaches by a large margin.
arXiv Detail & Related papers (2020-08-17T02:55:06Z)
- High-Order Information Matters: Learning Relation and Topology for
Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment.
Our framework significantly outperforms the state of the art by 6.5% mAP on the Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)