In the Search for Optimal Multi-view Learning Models for Crop Classification with Global Remote Sensing Data
- URL: http://arxiv.org/abs/2403.16582v2
- Date: Wed, 4 Sep 2024 11:14:18 GMT
- Title: In the Search for Optimal Multi-view Learning Models for Crop Classification with Global Remote Sensing Data
- Authors: Francisco Mena, Diego Arenas, Andreas Dengel
- Abstract summary: We use the CropHarvest dataset for validation, which provides optical, radar, weather time series, and topographic information as input data.
We suggest identifying the optimal encoder architecture tailored for a particular fusion strategy, and then determining the most suitable fusion strategy for the classification task.
- Score: 5.143097874851516
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Studying and analyzing cropland is a difficult task due to its dynamic and heterogeneous growth behavior. Usually, diverse data sources can be collected for its estimation. Although deep learning models have proven to excel in the crop classification task, they face substantial challenges when dealing with multiple inputs, a setting known as Multi-View Learning (MVL). The methods used in the MVL scenario can be structured based on the encoder architecture, the fusion strategy, and the optimization technique. The literature has primarily focused on using specific encoder architectures for local regions, lacking a deeper exploration of the other components in the MVL methodology. In contrast, we investigate the simultaneous selection of the fusion strategy and encoder architecture, assessing global-scale cropland and crop-type classification. We consider five fusion strategies (Input, Feature, Decision, Ensemble, Hybrid) and five temporal encoders (LSTM, GRU, TempCNN, TAE, L-TAE) as possible configurations of the MVL method. We use the CropHarvest dataset for validation, which provides optical, radar, and weather time series together with topographic information as input data. We found that in scenarios with a limited number of labeled samples, no single configuration is sufficient for all cases. Instead, a specialized combination of encoder and fusion strategy should be sought for each task. To streamline this search, we suggest first identifying the optimal encoder architecture for a particular fusion strategy, and then determining the most suitable fusion strategy for the classification task. We provide a methodological framework for researchers exploring crop classification through an MVL methodology.
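As a rough illustration of the configuration space described in the abstract, the sketch below wires one temporal encoder per view and supports Input-, Feature-, and Decision-level fusion in PyTorch. This is a minimal sketch, not the authors' implementation: the view names, channel counts, binary cropland target, and the use of the last hidden state as the view embedding are illustrative assumptions, and the Ensemble and Hybrid strategies as well as the TempCNN/TAE/L-TAE encoders are omitted for brevity.

```python
# Minimal sketch of a multi-view crop classifier with a configurable
# temporal encoder and fusion strategy (assumptions noted above).
import torch
import torch.nn as nn


def make_encoder(kind: str, in_dim: int, hidden: int) -> nn.Module:
    # Only two of the five temporal encoders from the paper are sketched here.
    rnn_cls = {"gru": nn.GRU, "lstm": nn.LSTM}[kind]
    return rnn_cls(in_dim, hidden, batch_first=True)


class MultiViewClassifier(nn.Module):
    def __init__(self, view_dims: dict, encoder: str = "gru",
                 fusion: str = "feature", hidden: int = 64, n_classes: int = 2):
        super().__init__()
        self.fusion = fusion
        if fusion == "input":
            # Input fusion: concatenate views along the channel axis, one shared encoder.
            self.encoders = nn.ModuleDict(
                {"all": make_encoder(encoder, sum(view_dims.values()), hidden)})
            self.heads = nn.ModuleDict({"all": nn.Linear(hidden, n_classes)})
        else:
            # Feature/Decision fusion: one encoder per view.
            self.encoders = nn.ModuleDict(
                {v: make_encoder(encoder, d, hidden) for v, d in view_dims.items()})
            if fusion == "feature":
                # Feature fusion: concatenate per-view embeddings, single classification head.
                self.heads = nn.ModuleDict(
                    {"all": nn.Linear(hidden * len(view_dims), n_classes)})
            elif fusion == "decision":
                # Decision fusion: one head per view, per-view logits are averaged.
                self.heads = nn.ModuleDict(
                    {v: nn.Linear(hidden, n_classes) for v in view_dims})
            else:
                raise ValueError(f"unsupported fusion strategy: {fusion}")

    def _encode(self, name: str, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.encoders[name](x)   # (batch, time, hidden)
        return out[:, -1]                 # last time step as the view embedding

    def forward(self, views: dict) -> torch.Tensor:
        if self.fusion == "input":
            x = torch.cat([views[v] for v in sorted(views)], dim=-1)
            return self.heads["all"](self._encode("all", x))
        feats = {v: self._encode(v, x) for v, x in views.items()}
        if self.fusion == "feature":
            return self.heads["all"](
                torch.cat([feats[v] for v in sorted(feats)], dim=-1))
        logits = [self.heads[v](feats[v]) for v in feats]  # decision fusion
        return torch.stack(logits).mean(dim=0)


# Toy usage: 12 time steps, hypothetical channel counts per view.
views = {"optical": torch.randn(8, 12, 11),
         "radar": torch.randn(8, 12, 2),
         "weather": torch.randn(8, 12, 2)}
model = MultiViewClassifier({"optical": 11, "radar": 2, "weather": 2},
                            encoder="gru", fusion="decision")
print(model(views).shape)  # torch.Size([8, 2])
```

Under the search heuristic suggested in the abstract, one would first sweep the `encoder` argument while holding a fusion strategy fixed, and then sweep `fusion` using the best encoder found.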
Related papers
- Towards a Unified View of Preference Learning for Large Language Models: A Survey [88.66719962576005]
Large Language Models (LLMs) exhibit remarkably powerful capabilities.
One of the crucial factors for achieving this success is aligning the LLM's output with human preferences.
We decompose all the strategies in preference learning into four components: model, data, feedback, and algorithm.
arXiv Detail & Related papers (2024-09-04T15:11:55Z)
- Hierarchical Attention and Parallel Filter Fusion Network for Multi-Source Data Classification [33.26466989592473]
We propose a hierarchical attention and parallel filter fusion network for multi-source data classification.
Our proposed method achieves 91.44% and 80.51% overall accuracy (OA) on the respective datasets.
arXiv Detail & Related papers (2024-08-22T23:14:22Z)
- MT-HCCAR: Multi-Task Deep Learning with Hierarchical Classification and Attention-based Regression for Cloud Property Retrieval [4.24122904716917]
This paper introduces MT-HCCAR, an end-to-end deep learning model employing multi-task learning to tackle cloud masking, cloud phase retrieval, and cloud optical thickness (COT) prediction.
MT-HCCAR integrates a hierarchical classification network (HC) and a classification-assisted attention-based regression network (CAR) to enhance precision and robustness in cloud labeling and COT prediction.
arXiv Detail & Related papers (2024-01-29T19:50:50Z)
- Contextualization Distillation from Large Language Model for Knowledge Graph Completion [51.126166442122546]
We introduce the Contextualization Distillation strategy, a plug-in-and-play approach compatible with both discriminative and generative KGC frameworks.
Our method begins by instructing large language models to transform compact, structural triplets into context-rich segments.
Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach.
arXiv Detail & Related papers (2024-01-28T08:56:49Z)
- RGM: A Robust Generalizable Matching Model [49.60975442871967]
We propose a deep model for sparse and dense matching, termed RGM (Robust Generalist Matching).
To narrow the gap between synthetic training samples and real-world scenarios, we build a new, large-scale dataset with sparse correspondence ground truth.
We are able to mix up various dense and sparse matching datasets, significantly improving the training diversity.
arXiv Detail & Related papers (2023-10-18T07:30:08Z)
- SeisCLIP: A seismology foundation model pre-trained by multi-modal data for multi-purpose seismic feature extraction [16.01738433164131]
We develop SeisCLIP, a seismology foundation model trained through contrastive learning from multi-modal data.
It consists of a transformer encoder for extracting crucial features from the time-frequency seismic spectrum and a foundational encoder for integrating the phase and source information of the same event.
Notably, SeisCLIP's performance surpasses that of baseline methods in event classification, localization, and focal mechanism analysis tasks.
arXiv Detail & Related papers (2023-09-05T15:40:13Z)
- Learning the Right Layers: a Data-Driven Layer-Aggregation Strategy for Semi-Supervised Learning on Multilayer Graphs [2.752817022620644]
Clustering (or community detection) on multilayer graphs poses several additional complications.
One of the major challenges is to establish the extent to which each layer contributes to the cluster assignment.
We propose a parameter-free Laplacian-regularized model that learns an optimal nonlinear combination of the different layers from the available input labels.
arXiv Detail & Related papers (2023-05-31T19:50:11Z)
- Generating Multidimensional Clusters With Support Lines [0.0]
We present Clugen, a modular procedure for synthetic data generation.
Clugen is open source, comprehensively unit tested and documented.
We demonstrate that Clugen is fit for use in the assessment of clustering algorithms.
arXiv Detail & Related papers (2023-01-24T22:08:24Z)
- Adaptive Context Selection for Polyp Segmentation [99.9959901908053]
We propose an adaptive context selection based encoder-decoder framework composed of a Local Context Attention (LCA) module, a Global Context Module (GCM), and an Adaptive Selection Module (ASM).
LCA modules deliver local context features from encoder layers to decoder layers, enhancing attention to the hard regions identified by the prediction map of the previous layer.
GCM further explores global context features and sends them to the decoder layers. ASM adaptively selects and aggregates context features through channel-wise attention.
arXiv Detail & Related papers (2023-01-12T04:06:44Z)
- Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention [100.81495948184649]
We present Perceiver-VL, a vision-and-language framework that efficiently handles high-dimensional multimodal inputs such as long videos and text.
Our framework scales with linear complexity, in contrast to the quadratic complexity of self-attention used in many state-of-the-art transformer-based models.
arXiv Detail & Related papers (2022-11-21T18:22:39Z)
- Efficient Data-specific Model Search for Collaborative Filtering [56.60519991956558]
Collaborative filtering (CF) is a fundamental approach for recommender systems.
In this paper, motivated by the recent advances in automated machine learning (AutoML), we propose to design a data-specific CF model.
The key is a new framework that unifies state-of-the-art (SOTA) CF methods and splits them into disjoint stages: input encoding, embedding function, interaction, and prediction function.
arXiv Detail & Related papers (2021-06-14T14:30:32Z)