Related papers: Efficient Segmentation with Texture in Ore Images Based on Box-supervised Approach

URL: http://arxiv.org/abs/2311.05929v1
Date: Fri, 10 Nov 2023 08:28:22 GMT
Title: Efficient Segmentation with Texture in Ore Images Based on Box-supervised Approach
Authors: Guodong Sun and Delong Huang and Yuting Peng and Le Cheng and Bo Wu and Yang Zhang
Abstract summary: A box-supervised technique with texture features is proposed to identify complete and independent ores. The proposed method achieves over 50 frames per second with a small model size of 21.6 MB. The method maintains a high level of accuracy compared with the state-of-the-art approaches on ore image dataset.
Score: 6.6773975364173
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Image segmentation methods have been utilized to determine the particle size distribution of crushed ores. Due to the complex working environment, high-powered computing equipment is difficult to deploy. At the same time, the ore distribution is stacked, and it is difficult to identify the complete features. To address this issue, an effective box-supervised technique with texture features is provided for ore image segmentation that can identify complete and independent ores. Firstly, a ghost feature pyramid network (Ghost-FPN) is proposed to process the features obtained from the backbone to reduce redundant semantic information and computation generated by complex networks. Then, an optimized detection head is proposed to obtain the feature to maintain accuracy. Finally, Lab color space (Lab) and local binary patterns (LBP) texture features are combined to form a fusion feature similarity-based loss function to improve accuracy while incurring no loss. Experiments on MS COCO have shown that the proposed fusion features are also worth studying on other types of datasets. Extensive experimental results demonstrate the effectiveness of the proposed method, which achieves over 50 frames per second with a small model size of 21.6 MB. Meanwhile, the method maintains a high level of accuracy compared with the state-of-the-art approaches on ore image dataset. The source code is available at https://github.com/MVME-HBUT/OREINST.
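Illustrative sketch (not the authors' implementation): the abstract says Lab color and LBP texture features are fused into a similarity-based loss but does not give the formula. The code below shows one plausible way such a fused pairwise similarity could be computed with NumPy. The function names `lbp_8` and `fused_similarity`, the exponential similarity form, the Hamming distance on LBP codes, the weighted-sum fusion, and all parameter values are assumptions made for illustration. The input `lab` is assumed to be already converted to Lab space (for example with `skimage.color.rgb2lab`).

```python
import numpy as np


def lbp_8(gray: np.ndarray) -> np.ndarray:
    """Basic 3x3 local binary pattern code for every pixel of a 2-D image."""
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    # Replicate border pixels so every pixel has eight neighbors.
    padded = np.pad(gray, 1, mode="edge")
    # Neighbor offsets, clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros((h, w), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        # Set this bit where the neighbor is at least as bright as the center.
        codes |= (neighbor >= gray).astype(np.uint8) * np.uint8(1 << bit)
    return codes


def fused_similarity(lab, codes, theta_color=2.0, theta_texture=2.0, weight=0.5):
    """Similarity in (0, 1] between each pixel and its right-hand neighbor.

    Hypothetical fusion: a weighted sum of a color term and a texture term.
    """
    lab = np.asarray(lab, dtype=np.float64)
    # Euclidean distance between neighboring Lab vectors.
    color_dist = np.linalg.norm(lab[:, 1:] - lab[:, :-1], axis=-1)
    # Hamming distance between neighboring LBP codes (count of differing bits).
    xor = np.bitwise_xor(codes[:, 1:], codes[:, :-1])
    hamming = np.unpackbits(xor[..., None], axis=-1).sum(axis=-1, dtype=np.float64)
    color_sim = np.exp(-color_dist / theta_color)
    texture_sim = np.exp(-hamming / theta_texture)
    return weight * color_sim + (1.0 - weight) * texture_sim
```

In a box-supervised setting, a similarity map of this kind would typically be thresholded to pick pixel pairs that are encouraged to share the same mask label. The actual loss used by the paper should be taken from the linked source code.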

Related papers

Focus Through Motion: RGB-Event Collaborative Token Sparsification for Efficient Object Detection [56.88160531995454]
Existing RGB-Event detection methods process the low-information regions of both modalities uniformly during feature extraction and fusion. We propose FocusMamba, which performs adaptive collaborative sparsification of multimodal features. Experiments on the DSEC-Det and PKU-DAVIS-SOD datasets demonstrate that the proposed method achieves superior performance in both accuracy and efficiency.
arXiv Detail & Related papers (2025-09-04T04:18:46Z)
An Efficient MLP-based Point-guided Segmentation Network for Ore Images with Ambiguous Boundary [12.258442550351178]
This paper proposes a lightweight framework based on Multi-Layer Perceptron (MLP), which focuses on solving the problem of edge burring. Our approach achieves a remarkable processing speed of over 27 frames per second with a model size of only 73 MB. Our method delivers a consistently high level of accuracy, with impressive performance scores of 60.4 and 48.9 in $AP_{50}^{box}$ and $AP_{50}^{mask}$ respectively.
arXiv Detail & Related papers (2024-02-27T10:09:29Z)
PairingNet: A Learning-based Pair-searching and -matching Network for Image Fragments [6.694162736590122]
We propose a learning-based image fragment pair-searching and -matching approach to solve the challenging restoration problem. Our proposed network achieves excellent pair-searching accuracy, reduces matching errors, and significantly reduces computational time.
arXiv Detail & Related papers (2023-12-14T07:43:53Z)
DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection. It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the stable diffusion's denoising network, and a feature-space pre-trained feature extractor. Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization. This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts. Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z)
Efficient Context Integration through Factorized Pyramidal Learning for Ultra-Lightweight Semantic Segmentation [1.0499611180329804]
We propose a novel Factorized Pyramidal Learning (FPL) module to aggregate rich contextual information in an efficient manner. We decompose the spatial pyramid into two stages which enables a simple and efficient feature fusion within the module to solve the notorious checkerboard effect. Based on the FPL module and FIR unit, we propose an ultra-lightweight real-time network, called FPLNet, which achieves state-of-the-art accuracy-efficiency trade-off.
arXiv Detail & Related papers (2023-02-23T05:34:51Z)
EPMF: Efficient Perception-aware Multi-sensor Fusion for 3D Semantic Segmentation [62.210091681352914]
We study multi-sensor fusion for 3D semantic segmentation for many applications, such as autonomous driving and robotics. In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF). We propose a two-stream network to extract features from the two modalities separately. The extracted features are fused by effective residual-based fusion modules.
arXiv Detail & Related papers (2021-06-21T10:47:26Z)
Towards an efficient framework for Data Extraction from Chart Images [27.114170963444074]
We adopt state-of-the-art computer vision techniques for the data extraction stage in a data mining system. For building a robust point detector, a fully convolutional network with feature fusion module is adopted. For data conversion, we translate the detected element into data with semantic value.
arXiv Detail & Related papers (2021-05-05T13:18:53Z)
Lightweight Convolutional Neural Network with Gaussian-based Grasping Representation for Robotic Grasping Detection [4.683939045230724]
Current object detectors struggle to strike a balance between high accuracy and fast inference speed. We present an efficient and robust fully convolutional neural network model to perform robotic grasping pose estimation. The network is an order of magnitude smaller than other excellent algorithms.
arXiv Detail & Related papers (2021-01-25T16:36:53Z)
Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt the graph propagation to capture the observed spatial contexts. We then apply the attention mechanism on the propagation, which encourages the network to model the contextual information adaptively. Finally, we introduce the symmetric gated fusion strategy to exploit the extracted multi-modal features effectively. Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves the state-of-the-art performance on two benchmarks.
arXiv Detail & Related papers (2020-08-25T06:00:06Z)
Progressively Guided Alternate Refinement Network for RGB-D Salient Object Detection [63.18846475183332]
We aim to develop an efficient and compact deep network for RGB-D salient object detection. We propose a progressively guided alternate refinement network to refine it. Our model outperforms existing state-of-the-art approaches by a large margin.
arXiv Detail & Related papers (2020-08-17T02:55:06Z)
High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment. Our framework significantly outperforms the state of the art by 6.5% mAP on the Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.