Fusion of Satellite Images and Weather Data with Transformer Networks
for Downy Mildew Disease Detection
- URL: http://arxiv.org/abs/2209.02797v1
- Date: Tue, 6 Sep 2022 19:55:16 GMT
- Authors: William Maillet, Maryam Ouhami, Adel Hafiane
- Abstract summary: Crop diseases significantly affect the quantity and quality of agricultural production.
In this paper, we propose a new approach to realize data fusion using three transformers.
The architecture is built from three main components, a Vision Transformer and two transformer-encoders, allowing to fuse both image and weather modalities.
- Score: 3.6868861317674524
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Crop diseases significantly affect the quantity and quality of agricultural
production. In a context where the goal of precision agriculture is to minimize
or even avoid the use of pesticides, weather and remote sensing data with deep
learning can play a pivotal role in detecting crop diseases, allowing localized
treatment of crops. However, combining heterogeneous data such as weather data and
images remains a challenging and actively studied task. Recent developments in
transformer architectures have shown the possibility of fusing data from
different domains, for instance text and image. The current trend is to customize
a single transformer to create a multimodal fusion model. Conversely, we propose a
new approach that realizes data fusion using three transformers. In this paper, we
first solved the missing satellite images problem by interpolating them with a
ConvLSTM model. Then, we proposed a multimodal fusion architecture that jointly
learns to process visual and weather information. The architecture is built
from three main components, a Vision Transformer and two transformer encoders,
which together fuse the image and weather modalities. The results of the proposed
method are promising, achieving 97% overall accuracy.
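As a rough illustration of the three-transformer idea described above (not the authors' implementation), the flow can be sketched in NumPy with untrained random weights: one ViT-style encoder for image patches, one encoder for the weather time series, and a third encoder that fuses the concatenated tokens. Every dimension, variable name, and the two-class head here is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # shared token dimension (assumed for illustration)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def encoder(tokens, d=D):
    """One single-head self-attention block with random, untrained projections."""
    Wq = rng.standard_normal((tokens.shape[-1], d)) / np.sqrt(d)
    Wk = rng.standard_normal((tokens.shape[-1], d)) / np.sqrt(d)
    Wv = rng.standard_normal((tokens.shape[-1], d)) / np.sqrt(d)
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    return softmax(Q @ K.T / np.sqrt(d)) @ V

def patch_tokens(image, patch=8, d=D):
    """ViT-style embedding: split the image into patches, flatten, project."""
    H, W, C = image.shape
    patches = [image[i:i + patch, j:j + patch].reshape(-1)
               for i in range(0, H, patch) for j in range(0, W, patch)]
    P = np.stack(patches)  # (num_patches, patch*patch*C)
    W_embed = rng.standard_normal((P.shape[1], d)) / np.sqrt(P.shape[1])
    return P @ W_embed

image = rng.random((32, 32, 3))   # one satellite tile (H, W, C), synthetic
weather = rng.random((14, 5))     # 14 days x 5 weather variables, synthetic

W_weather = rng.standard_normal((5, D)) / np.sqrt(5)

img_tokens = encoder(patch_tokens(image))            # transformer 1: ViT stand-in
wx_tokens = encoder(weather @ W_weather)             # transformer 2: weather encoder
fused = encoder(np.vstack([img_tokens, wx_tokens]))  # transformer 3: cross-modal fusion

W_head = rng.standard_normal((D, 2)) / np.sqrt(D)
probs = softmax(fused.mean(axis=0) @ W_head)         # e.g. healthy vs. diseased
print(probs.shape)
```

A real implementation would add multi-head attention, layer normalization, feed-forward sublayers, positional embeddings, and training; this sketch only shows how the three encoders relate and how the two modalities meet at the fusion stage.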
Related papers
- Transformer Fusion with Optimal Transport [25.022849817421964]
Fusion is a technique for merging multiple independently-trained neural networks in order to combine their capabilities.
This paper presents a systematic approach for fusing two or more transformer-based networks exploiting Optimal Transport to (soft-)align the various architectural components.
arXiv Detail & Related papers (2023-10-09T13:40:31Z)
- Mutual Information-driven Triple Interaction Network for Efficient Image Dehazing [54.168567276280505]
We propose a novel Mutual Information-driven Triple interaction Network (MITNet) for image dehazing.
The first stage, named amplitude-guided haze removal, aims to recover the amplitude spectrum of the hazy images for haze removal.
The second stage, named phase-guided structure refinement, is devoted to learning the transformation and refinement of the phase spectrum.
arXiv Detail & Related papers (2023-08-14T08:23:58Z)
- Improving FHB Screening in Wheat Breeding Using an Efficient Transformer Model [0.0]
Fusarium head blight is a devastating disease that causes significant economic losses annually on small grains.
Image processing techniques have been developed using supervised machine learning algorithms for the early detection of FHB.
A new Context Bridge is proposed to integrate the local representation capability of the U-Net network in the transformer model.
arXiv Detail & Related papers (2023-08-07T15:44:58Z) - HST-MRF: Heterogeneous Swin Transformer with Multi-Receptive Field for
Medical Image Segmentation [5.51045524851432]
We propose a Heterogeneous Swin Transformer with Multi-Receptive Field (HST-MRF) model for medical image segmentation.
The main purpose is to solve the loss of structural information caused by patch partitioning in transformers.
Experimental results show that our proposed method outperforms state-of-the-art models and can achieve superior performance.
arXiv Detail & Related papers (2023-04-10T14:30:03Z) - TFormer: A throughout fusion transformer for multi-modal skin lesion
diagnosis [6.899641625551976]
We introduce a pure transformer-based method, which we refer to as Throughout Fusion Transformer (TFormer)", for sufficient information intergration in MSLD.
We then carefully design a stack of dual-branch hierarchical multi-modal transformer (HMT) blocks to fuse information across different image modalities in a stage-by-stage way.
Our TFormer outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2022-11-21T12:07:05Z) - MISSU: 3D Medical Image Segmentation via Self-distilling TransUNet [55.16833099336073]
We propose to self-distill a Transformer-based UNet for medical image segmentation.
It simultaneously learns global semantic information and local spatial-detailed features.
Our MISSU achieves the best performance over previous state-of-the-art methods.
arXiv Detail & Related papers (2022-06-02T07:38:53Z) - TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with
Transformers [49.689566246504356]
We propose TransFusion, a robust solution to LiDAR-camera fusion with a soft-association mechanism to handle inferior image conditions.
TransFusion achieves state-of-the-art performance on large-scale datasets.
We extend the proposed method to the 3D tracking task and achieve the 1st place in the leaderboard of nuScenes tracking.
arXiv Detail & Related papers (2022-03-22T07:15:13Z) - VoRTX: Volumetric 3D Reconstruction With Transformers for Voxelwise View
Selection and Fusion [68.68537312256144]
VoRTX is an end-to-end volumetric 3D reconstruction network using transformers for wide-baseline, multi-view feature fusion.
We train our model on ScanNet and show that it produces better reconstructions than state-of-the-art methods.
arXiv Detail & Related papers (2021-12-01T02:18:11Z) - Visual Saliency Transformer [127.33678448761599]
We develop a novel unified model based on a pure transformer, Visual Saliency Transformer (VST), for both RGB and RGB-D salient object detection (SOD)
It takes image patches as inputs and leverages the transformer to propagate global contexts among image patches.
Experimental results show that our model outperforms existing state-of-the-art results on both RGB and RGB-D SOD benchmark datasets.
arXiv Detail & Related papers (2021-04-25T08:24:06Z) - TransVG: End-to-End Visual Grounding with Transformers [102.11922622103613]
We present a transformer-based framework for visual grounding, namely TransVG, to address the task of grounding a language query to an image.
We show that the complex fusion modules can be replaced by a simple stack of transformer encoder layers with higher performance.
arXiv Detail & Related papers (2021-04-17T13:35:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.