Fusion of Satellite Images and Weather Data with Transformer Networks
for Downy Mildew Disease Detection
- URL: http://arxiv.org/abs/2209.02797v1
- Date: Tue, 6 Sep 2022 19:55:16 GMT
- Authors: William Maillet, Maryam Ouhami, Adel Hafiane
- Abstract summary: Crop diseases significantly affect the quantity and quality of agricultural production.
In this paper, we propose a new approach to realize data fusion using three transformers.
The architecture is built from three main components, a Vision Transformer and two transformer-encoders, allowing to fuse both image and weather modalities.
- Score: 3.6868861317674524
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Crop diseases significantly affect the quantity and quality of agricultural
production. In a context where the goal of precision agriculture is to minimize
or even avoid the use of pesticides, weather and remote sensing data with deep
learning can play a pivotal role in detecting crop diseases, allowing localized
treatment of crops. However, combining heterogeneous data such as weather data and
images remains a challenging and actively studied task. Recent developments in
transformer architectures have shown the possibility of fusing data from
different domains, for instance text and image. The current trend is to customize
a single transformer to create a multimodal fusion model. Conversely, we propose a
new approach that realizes data fusion using three transformers. In this paper, we
first solved the missing satellite images problem by interpolating them with a
ConvLSTM model. Then, we proposed a multimodal fusion architecture that jointly
learns to process visual and weather information. The architecture is built
from three main components, a Vision Transformer and two transformer encoders,
which together fuse the image and weather modalities. The results of the proposed
method are promising, achieving 97% overall accuracy.
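As a rough illustration of the three-transformer idea described above (not the authors' implementation), the flow can be sketched in NumPy with untrained random weights: one ViT-style encoder for image patches, one encoder for the weather time series, and a third encoder that fuses the concatenated tokens. Every dimension, variable name, and the two-class head here is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # shared token dimension (assumed for illustration)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def encoder(tokens, d=D):
    """One single-head self-attention block with random, untrained projections."""
    Wq = rng.standard_normal((tokens.shape[-1], d)) / np.sqrt(d)
    Wk = rng.standard_normal((tokens.shape[-1], d)) / np.sqrt(d)
    Wv = rng.standard_normal((tokens.shape[-1], d)) / np.sqrt(d)
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    return softmax(Q @ K.T / np.sqrt(d)) @ V

def patch_tokens(image, patch=8, d=D):
    """ViT-style embedding: split the image into patches, flatten, project."""
    H, W, C = image.shape
    patches = [image[i:i + patch, j:j + patch].reshape(-1)
               for i in range(0, H, patch) for j in range(0, W, patch)]
    P = np.stack(patches)  # (num_patches, patch*patch*C)
    W_embed = rng.standard_normal((P.shape[1], d)) / np.sqrt(P.shape[1])
    return P @ W_embed

image = rng.random((32, 32, 3))   # one satellite tile (H, W, C), synthetic
weather = rng.random((14, 5))     # 14 days x 5 weather variables, synthetic

W_weather = rng.standard_normal((5, D)) / np.sqrt(5)

img_tokens = encoder(patch_tokens(image))            # transformer 1: ViT stand-in
wx_tokens = encoder(weather @ W_weather)             # transformer 2: weather encoder
fused = encoder(np.vstack([img_tokens, wx_tokens]))  # transformer 3: cross-modal fusion

W_head = rng.standard_normal((D, 2)) / np.sqrt(D)
probs = softmax(fused.mean(axis=0) @ W_head)         # e.g. healthy vs. diseased
print(probs.shape)
```

A real implementation would add multi-head attention, layer normalization, feed-forward sublayers, positional embeddings, and training; this sketch only shows how the three encoders relate and how the two modalities meet at the fusion stage.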
Related papers
- Transformer Fusion with Optimal Transport [25.022849817421964]
Fusion is a technique for merging multiple independently-trained neural networks in order to combine their capabilities.
This paper presents a systematic approach for fusing two or more transformer-based networks exploiting Optimal Transport to (soft-)align the various architectural components.
arXiv Detail & Related papers (2023-10-09T13:40:31Z)
- Mutual Information-driven Triple Interaction Network for Efficient Image Dehazing [54.168567276280505]
We propose a novel Mutual Information-driven Triple interaction Network (MITNet) for image dehazing.
The first stage, named amplitude-guided haze removal, aims to recover the amplitude spectrum of the hazy images for haze removal.
The second stage, named phase-guided structure refinement, is devoted to learning the transformation and refinement of the phase spectrum.
arXiv Detail & Related papers (2023-08-14T08:23:58Z)
- Improving FHB Screening in Wheat Breeding Using an Efficient Transformer Model [0.0]
Fusarium head blight is a devastating disease that causes significant economic losses annually on small grains.
Image processing techniques have been developed using supervised machine learning algorithms for the early detection of FHB.
A new Context Bridge is proposed to integrate the local representation capability of the U-Net network in the transformer model.
arXiv Detail & Related papers (2023-08-07T15:44:58Z) - HST-MRF: Heterogeneous Swin Transformer with Multi-Receptive Field for
Medical Image Segmentation [5.51045524851432]
We propose a Heterogeneous Swin Transformer with Multi-Receptive Field (HST-MRF) model for medical image segmentation.
The main purpose is to solve the loss of structural information caused by patch partitioning in transformers.
Experimental results show that our proposed method outperforms state-of-the-art models and can achieve superior performance.
arXiv Detail & Related papers (2023-04-10T14:30:03Z) - TFormer: A throughout fusion transformer for multi-modal skin lesion
diagnosis [6.899641625551976]
We introduce a pure transformer-based method, which we refer to as Throughout Fusion Transformer (TFormer)", for sufficient information intergration in MSLD.
We then carefully design a stack of dual-branch hierarchical multi-modal transformer (HMT) blocks to fuse information across different image modalities in a stage-by-stage way.
Our TFormer outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2022-11-21T12:07:05Z) - MISSU: 3D Medical Image Segmentation via Self-distilling TransUNet [55.16833099336073]
We propose to self-distill a Transformer-based UNet for medical image segmentation.
It simultaneously learns global semantic information and local spatial-detailed features.
Our MISSU achieves the best performance over previous state-of-the-art methods.
arXiv Detail & Related papers (2022-06-02T07:38:53Z) - TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with
Transformers [49.689566246504356]
We propose TransFusion, a robust solution to LiDAR-camera fusion with a soft-association mechanism to handle inferior image conditions.
TransFusion achieves state-of-the-art performance on large-scale datasets.
We extend the proposed method to the 3D tracking task and achieve the 1st place in the leaderboard of nuScenes tracking.
arXiv Detail & Related papers (2022-03-22T07:15:13Z) - VoRTX: Volumetric 3D Reconstruction With Transformers for Voxelwise View
Selection and Fusion [68.68537312256144]
VoRTX is an end-to-end volumetric 3D reconstruction network using transformers for wide-baseline, multi-view feature fusion.
We train our model on ScanNet and show that it produces better reconstructions than state-of-the-art methods.
arXiv Detail & Related papers (2021-12-01T02:18:11Z) - Visual Saliency Transformer [127.33678448761599]
We develop a novel unified model based on a pure transformer, Visual Saliency Transformer (VST), for both RGB and RGB-D salient object detection (SOD)
It takes image patches as inputs and leverages the transformer to propagate global contexts among image patches.
Experimental results show that our model outperforms existing state-of-the-art results on both RGB and RGB-D SOD benchmark datasets.
arXiv Detail & Related papers (2021-04-25T08:24:06Z) - TransVG: End-to-End Visual Grounding with Transformers [102.11922622103613]
We present a transformer-based framework for visual grounding, namely TransVG, to address the task of grounding a language query to an image.
We show that the complex fusion modules can be replaced by a simple stack of transformer encoder layers with higher performance.
arXiv Detail & Related papers (2021-04-17T13:35:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.