Dual-Stream Global-Local Feature Collaborative Representation Network for Scene Classification of Mining Area
- URL: http://arxiv.org/abs/2507.20216v2
- Date: Thu, 31 Jul 2025 09:00:13 GMT
- Title: Dual-Stream Global-Local Feature Collaborative Representation Network for Scene Classification of Mining Area
- Authors: Shuqi Fan, Haoyi Wang, Xianju Li
- Abstract summary: This study fuses multi-source data to construct a multi-modal mine land cover scene classification dataset. We propose a dual-branch fusion model utilizing collaborative representation to decompose global features into key semantic vectors. The overall accuracy of this model is 83.63%, which outperforms other comparative models.
- Score: 2.4578723416255754
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scene classification of mining areas provides accurate foundational data for geological environment monitoring and resource development planning. This study fuses multi-source data to construct a multi-modal mine land cover scene classification dataset. A significant challenge in mining area classification lies in the complex spatial layouts and multi-scale characteristics of these scenes. By extracting global and local features, it becomes possible to comprehensively reflect the spatial distribution, thereby enabling a more accurate capture of the holistic characteristics of mining scenes. We propose a dual-branch fusion model utilizing collaborative representation to decompose global features into a set of key semantic vectors. This model comprises three key components: (1) Multi-scale Global Transformer Branch: It leverages adjacent large-scale features to generate global channel attention features for small-scale features, effectively capturing the multi-scale feature relationships. (2) Local Enhancement Collaborative Representation Branch: It refines the attention weights by leveraging local features and reconstructed key semantic sets, ensuring that the local context and detailed characteristics of the mining area are effectively integrated. This enhances the model's sensitivity to fine-grained spatial variations. (3) Dual-Branch Deep Feature Fusion Module: It fuses the complementary features of the two branches to incorporate more scene information. This fusion strengthens the model's ability to distinguish and classify complex mining landscapes. Finally, this study employs multi-loss computation to ensure a balanced integration of the modules. The overall accuracy of this model is 83.63%, outperforming all comparison models. Additionally, it achieves the best performance across all other evaluation metrics.
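To make the three components concrete, here is a minimal PyTorch sketch of the dual-branch design as described in the abstract. All module names, tensor shapes, and the ridge-regularized collaborative-representation solver are illustrative assumptions; this is not the authors' released code.

```python
# Hedged sketch of the dual-branch design; shapes, names, and the
# ridge-style collaborative-representation solver are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalChannelAttention(nn.Module):
    """Component (1): an adjacent large-scale feature map generates
    global channel attention for the small-scale feature map."""
    def __init__(self, channels: int):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // 4),
            nn.ReLU(inplace=True),
            nn.Linear(channels // 4, channels),
            nn.Sigmoid(),
        )

    def forward(self, small: torch.Tensor, large: torch.Tensor) -> torch.Tensor:
        # Pool the large-scale map into a channel descriptor, then gate
        # the small-scale map channel-wise with it.
        w = self.fc(F.adaptive_avg_pool2d(large, 1).flatten(1))
        return small * w[:, :, None, None]

def collaborative_representation(local_feats: torch.Tensor,
                                 semantic_set: torch.Tensor,
                                 lam: float = 1e-2) -> torch.Tensor:
    """Component (2): reconstruct local features over the key semantic
    vectors decomposed from the global features (ridge-regularized
    least squares, the classic collaborative-representation form).
    local_feats: (B, N, C); semantic_set: (K, C)."""
    D = semantic_set
    G = D @ D.t()                                   # (K, K) Gram matrix
    I = torch.eye(G.size(0), dtype=D.dtype, device=D.device)
    P = torch.linalg.solve(G + lam * I, D)          # (K, C) projector
    codes = local_feats @ P.t()                     # (B, N, K) codes
    return codes @ D                                # (B, N, C) reconstruction

class DualBranchFusion(nn.Module):
    """Component (3): fuse the complementary branch features and classify."""
    def __init__(self, channels: int, num_classes: int):
        super().__init__()
        self.head = nn.Linear(2 * channels, num_classes)

    def forward(self, global_feat: torch.Tensor, local_feat: torch.Tensor) -> torch.Tensor:
        return self.head(torch.cat([global_feat, local_feat], dim=-1))
```

A plausible reading of the multi-loss computation is one cross-entropy term per branch plus one on the fused logits, with weights tuned to balance the modules; the abstract does not specify the exact weighting.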
Related papers
- TerraFM: A Scalable Foundation Model for Unified Multisensor Earth Observation [65.74990259650984]
We introduce TerraFM, a scalable self-supervised learning model that leverages globally distributed Sentinel-1 and Sentinel-2 imagery. Our training strategy integrates local-global contrastive learning and introduces a dual-centering mechanism. TerraFM achieves strong generalization on both classification and segmentation tasks, outperforming prior models on GEO-Bench and Copernicus-Bench.
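The summary names a dual-centering mechanism without defining it. A minimal sketch, assuming "dual" means keeping separate running centers for the global-view and local-view teacher outputs (in the spirit of DINO-style centering):

```python
# Assumption: "dual-centering" = one EMA center per view granularity.
import torch

def center_logits(logits: torch.Tensor, running_center: torch.Tensor,
                  momentum: float = 0.9) -> torch.Tensor:
    """Subtract an EMA center from teacher logits to discourage collapse."""
    running_center.mul_(momentum).add_((1 - momentum) * logits.mean(dim=0))
    return logits - running_center

dim = 256
global_center = torch.zeros(dim)   # center for global (scene-level) views
local_center = torch.zeros(dim)    # center for local (patch-level) views
g = center_logits(torch.randn(32, dim), global_center)
l = center_logits(torch.randn(32, dim), local_center)
```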
arXiv Detail & Related papers (2025-06-06T17:59:50Z)
- Prototype-Based Information Compensation Network for Multi-Source Remote Sensing Data Classification [56.065032039986725]
Multi-source remote sensing data joint classification aims to improve the accuracy and reliability of land cover classification. Existing methods confront two challenges: inter-frequency multi-source feature coupling and inconsistency of complementary information exploration. We present a Prototype-based Information Compensation Network (PICNet) for land cover classification based on HSI and SAR/LiDAR data.
arXiv Detail & Related papers (2025-05-06T22:30:23Z)
- Multilateral Cascading Network for Semantic Segmentation of Large-Scale Outdoor Point Clouds [6.253217784798542]
The Multilateral Cascading Network (MCNet) is designed to address this challenge. MCNet comprises two key components: a Multilateral Cascading Attention Enhancement (MCAE) module and a Point Cross Stage Partial (P-CSP) module. Our results surpass the previous best result by 2.1% in overall mIoU and yield an average improvement of 15.9% for small-sample object categories.
arXiv Detail & Related papers (2024-09-21T02:23:01Z)
- PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
The integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection.
We propose a novel two-stage 3D object detector, called Point-Voxel Attention Fusion Network (PVAFN).
PVAFN uses a multi-pooling strategy to integrate both multi-scale and region-specific information effectively.
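The summary only names a multi-pooling strategy; the toy function below, with hypothetical shapes, shows one common way to combine pooled summaries so that both peak activations and region-wide averages survive. PVAFN's actual pooling layout may differ.

```python
import torch

def multi_pool(features: torch.Tensor) -> torch.Tensor:
    """features: (B, N, C) point/voxel features inside one region proposal.
    Concatenate max- and mean-pooled descriptors (a simple stand-in for
    the multi-scale, region-specific pooling the summary mentions)."""
    return torch.cat([features.max(dim=1).values,
                      features.mean(dim=1)], dim=-1)   # (B, 2C)
```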
arXiv Detail & Related papers (2024-08-26T19:43:01Z)
- Hierarchical Attention and Parallel Filter Fusion Network for Multi-Source Data Classification [33.26466989592473]
We propose a hierarchical attention and parallel filter fusion network for multi-source data classification.
Our proposed method achieves overall accuracies (OA) of 91.44% and 80.51% on the respective datasets.
arXiv Detail & Related papers (2024-08-22T23:14:22Z)
- Monocular Per-Object Distance Estimation with Masked Object Modeling [33.59920084936913]
Our paper draws inspiration from Masked Image Modeling (MiM) and extends it to multi-object tasks. Our strategy, termed Masked Object Modeling (MoM), enables a novel application of masking techniques. We evaluate the effectiveness of MoM on a novel reference architecture (DistFormer) on the standard KITTI, NuScenes, and MOTSynth datasets.
arXiv Detail & Related papers (2024-01-06T10:56:36Z)
- DETR Doesn't Need Multi-Scale or Locality Design [69.56292005230185]
This paper presents an improved DETR detector that maintains a "plain" nature.
It uses a single-scale feature map and global cross-attention calculations without specific locality constraints.
We show that two simple techniques are surprisingly effective within a plain design at compensating for the lack of multi-scale feature maps and locality constraints.
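A minimal sketch of the "plain" setting the summary describes, where object queries attend globally over one single-scale feature map; the two compensating techniques themselves are not spelled out here, and the sizes below are illustrative:

```python
import torch
import torch.nn as nn

queries = torch.randn(100, 1, 256)      # object queries: (len, batch, dim)
memory = torch.randn(32 * 32, 1, 256)   # one single-scale map, flattened
cross_attn = nn.MultiheadAttention(embed_dim=256, num_heads=8)
# Every query attends to every spatial location: no locality constraint,
# no multi-scale feature pyramid.
out, _ = cross_attn(query=queries, key=memory, value=memory)
```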
arXiv Detail & Related papers (2023-08-03T17:59:04Z)
- DuAT: Dual-Aggregation Transformer Network for Medical Image Segmentation [21.717520350930705]
Transformer-based models have been widely demonstrated to be successful in computer vision tasks.
However, they are often dominated by the features of large patterns, leading to the loss of local details.
We propose a Dual-Aggregation Transformer Network called DuAT, which is characterized by two innovative designs.
Our proposed model outperforms state-of-the-art methods in segmenting skin lesions and polyps in colonoscopy images.
arXiv Detail & Related papers (2022-12-21T07:54:02Z)
- Global Aggregation then Local Distribution for Scene Parsing [99.1095068574454]
We show that our approach can be modularized as an end-to-end trainable block and easily plugged into existing semantic segmentation networks.
Our approach sets a new state of the art on major semantic segmentation benchmarks, including Cityscapes, ADE20K, PASCAL Context, CamVid, and COCO-Stuff.
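Going by the title alone, a global-aggregate-then-locally-distribute block can be sketched as a residual unit, which would explain why it plugs into existing networks; the layer choices below are assumptions, not the paper's exact design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GALDBlock(nn.Module):
    """Aggregate context globally, then let each pixel decide how much
    of it to absorb (local distribution via a learned gate)."""
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        g = F.adaptive_avg_pool2d(x, 1)     # global aggregation: (B, C, 1, 1)
        w = torch.sigmoid(self.gate(x))     # per-pixel gates: (B, C, H, W)
        return x + w * g                    # residual, so it drops into any backbone
```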
arXiv Detail & Related papers (2021-07-28T03:46:57Z)
- Detail Preserved Point Cloud Completion via Separated Feature Aggregation [26.566021924980706]
Point cloud shape completion is a challenging problem in 3D vision and robotics.
We propose two different feature aggregation strategies, named global & local feature aggregation (GLFA) and residual feature aggregation (RFA).
Our proposed network outperforms current state-of-the-art methods, especially on detail preservation.
arXiv Detail & Related papers (2020-07-05T16:11:55Z)
- Global Context-Aware Progressive Aggregation Network for Salient Object Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
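A rough sketch of fusing the three feature kinds the summary lists (low-level appearance, high-level semantic, global context); the real GCPANet modules are more elaborate, and every name below is hypothetical:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TriFeatureFusion(nn.Module):
    def __init__(self, c_low: int, c_high: int, c_out: int):
        super().__init__()
        self.proj_low = nn.Conv2d(c_low, c_out, kernel_size=1)
        self.proj_high = nn.Conv2d(c_high, c_out, kernel_size=1)

    def forward(self, low: torch.Tensor, high: torch.Tensor) -> torch.Tensor:
        # Upsample semantics to the appearance resolution, then broadcast a
        # globally pooled context vector over all positions.
        high = F.interpolate(self.proj_high(high), size=low.shape[-2:],
                             mode="bilinear", align_corners=False)
        low = self.proj_low(low)
        global_ctx = F.adaptive_avg_pool2d(high, 1)
        return low + high + global_ctx
```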
arXiv Detail & Related papers (2020-03-02T04:26:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.