VisText-Mosquito: A Multimodal Dataset and Benchmark for AI-Based Mosquito Breeding Site Detection and Reasoning
- URL: http://arxiv.org/abs/2506.14629v1
- Date: Tue, 17 Jun 2025 15:24:30 GMT
- Title: VisText-Mosquito: A Multimodal Dataset and Benchmark for AI-Based Mosquito Breeding Site Detection and Reasoning
- Authors: Md. Adnanul Islam, Md. Faiyaz Abdullah Sayeedi, Md. Asaduzzaman Shuvo, Muhammad Ziaur Rahman, Shahanur Rahman Bappy, Raiyan Rahman, Swakkhar Shatabda,
- Abstract summary: VisText-Mosquito is a multimodal dataset to support automated detection, segmentation, and reasoning for mosquito breeding site analysis.<n>The dataset includes 1,828 annotated images for object detection, 142 images for water surface segmentation, and natural language reasoning texts linked to each image.<n>This paper showcases how AI-based detection can proactively address mosquito-borne disease risks.
- Score: 1.083674643223243
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mosquito-borne diseases pose a major global health risk, requiring early detection and proactive control of breeding sites to prevent outbreaks. In this paper, we present VisText-Mosquito, a multimodal dataset that integrates visual and textual data to support automated detection, segmentation, and reasoning for mosquito breeding site analysis. The dataset includes 1,828 annotated images for object detection, 142 images for water surface segmentation, and natural language reasoning texts linked to each image. The YOLOv9s model achieves the highest precision of 0.92926 and mAP@50 of 0.92891 for object detection, while YOLOv11n-Seg reaches a segmentation precision of 0.91587 and mAP@50 of 0.79795. For reasoning generation, our fine-tuned BLIP model achieves a final loss of 0.0028, with a BLEU score of 54.7, BERTScore of 0.91, and ROUGE-L of 0.87. This dataset and model framework emphasize the theme "Prevention is Better than Cure", showcasing how AI-based detection can proactively address mosquito-borne disease risks. The dataset and implementation code are publicly available at GitHub: https://github.com/adnanul-islam-jisun/VisText-Mosquito
Related papers
- CRTRE: Causal Rule Generation with Target Trial Emulation Framework [47.2836994469923]
We introduce a novel method called causal rule generation with target trial emulation framework (CRTRE)
CRTRE applies randomize trial design principles to estimate the causal effect of association rules.
We then incorporate such association rules for the downstream applications such as prediction of disease onsets.
arXiv Detail & Related papers (2024-11-10T02:40:06Z) - Effective and Efficient Adversarial Detection for Vision-Language Models via A Single Vector [97.92369017531038]
We build a new laRge-scale Adervsarial images dataset with Diverse hArmful Responses (RADAR)
We then develop a novel iN-time Embedding-based AdveRSarial Image DEtection (NEARSIDE) method, which exploits a single vector that distilled from the hidden states of Visual Language Models (VLMs) to achieve the detection of adversarial images against benign ones in the input.
arXiv Detail & Related papers (2024-10-30T10:33:10Z) - Unlearnable Examples Detection via Iterative Filtering [84.59070204221366]
Deep neural networks are proven to be vulnerable to data poisoning attacks.
It is quite beneficial and challenging to detect poisoned samples from a mixed dataset.
We propose an Iterative Filtering approach for UEs identification.
arXiv Detail & Related papers (2024-08-15T13:26:13Z) - Attention Based Feature Fusion Network for Monkeypox Skin Lesion Detection [0.09642500063568188]
Recent monkeypox outbreak has raised significant public health concerns.
Deep learning algorithms can be used to identify diseases, including COVID-19.
We introduce a lightweight model that merges two pre-trained architectures to classify human monkeypox disease.
arXiv Detail & Related papers (2024-08-13T05:21:03Z) - MosquitoFusion: A Multiclass Dataset for Real-Time Detection of Mosquitoes, Swarms, and Breeding Sites Using Deep Learning [0.0]
We present an integrated approach to real-time mosquito detection using our multiclass dataset (MosquitoFusion) containing 1204 diverse images.
The pre-trained YOLOv8 model, trained on this dataset, achieved a mean Average Precision (mAP@50) of 57.1%, with precision at 73.4% and recall at 50.5%.
arXiv Detail & Related papers (2024-04-01T21:49:05Z) - DeepSeaNet: Improving Underwater Object Detection using EfficientDet [0.0]
This project involves implementing and evaluating various object detection models on an annotated underwater dataset.
The dataset comprises annotated image sequences of fish, crabs, starfish, and other aquatic animals captured in Limfjorden water with limited visibility.
I compare the results of YOLOv3 (31.10% mean Average Precision (mAP)), YOLOv4 (83.72% mAP), YOLOv5 (97.6%), YOLOv8 (98.20%), EfficientDet (98.56% mAP) and Detectron2 (95.20% mAP) on the same dataset.
arXiv Detail & Related papers (2023-05-26T13:41:35Z) - Uncertainty-inspired Open Set Learning for Retinal Anomaly
Identification [71.06194656633447]
We establish an uncertainty-inspired open-set (UIOS) model, which was trained with fundus images of 9 retinal conditions.
Our UIOS model with thresholding strategy achieved an F1 score of 99.55%, 97.01% and 91.91% for the internal testing set.
UIOS correctly predicted high uncertainty scores, which would prompt the need for a manual check in the datasets of non-target categories retinal diseases, low-quality fundus images, and non-fundus images.
arXiv Detail & Related papers (2023-04-08T10:47:41Z) - End-to-End Zero-Shot HOI Detection via Vision and Language Knowledge
Distillation [86.41437210485932]
We aim at advancing zero-shot HOI detection to detect both seen and unseen HOIs simultaneously.
We propose a novel end-to-end zero-shot HOI Detection framework via vision-language knowledge distillation.
Our method outperforms the previous SOTA by 8.92% on unseen mAP and 10.18% on overall mAP.
arXiv Detail & Related papers (2022-04-01T07:27:19Z) - Semantic Segmentation of Anaemic RBCs Using Multilevel Deep
Convolutional Encoder-Decoder Network [2.5398817423053037]
We propose a convolutional neural network (CNN) model for semantic segmentation of red blood cells.
The proposed model preserved pixel-level semantic information extracted in one layer and then passed to the next layer to choose relevant features.
This phenomenon helps to precise pixel-level counting of healthy and anaemic-RBC elements along with morphological analysis.
arXiv Detail & Related papers (2022-02-09T17:31:50Z) - Osteoporosis Prescreening using Panoramic Radiographs through a Deep
Convolutional Neural Network with Attention Mechanism [65.70943212672023]
Deep convolutional neural network (CNN) with an attention module can detect osteoporosis on panoramic radiographs.
dataset of 70 panoramic radiographs (PRs) from 70 different subjects of age between 49 to 60 was used.
arXiv Detail & Related papers (2021-10-19T00:03:57Z) - The Report on China-Spain Joint Clinical Testing for Rapid COVID-19 Risk
Screening by Eye-region Manifestations [59.48245489413308]
We developed and tested a COVID-19 rapid prescreening model using the eye-region images captured in China and Spain with cellphone cameras.
The performance was measured using area under receiver-operating-characteristic curve (AUC), sensitivity, specificity, accuracy, and F1.
arXiv Detail & Related papers (2021-09-18T02:28:01Z) - Automatic Detection of Aedes aegypti Breeding Grounds Based on Deep
Networks with Spatio-Temporal Consistency [2.4858193569899907]
Aedes aegypti mosquito infects millions of people with diseases such as dengue zika, chikungunya, and urban yellow fever.
Main form to combat these diseases is to avoid mosquito reproduction by searching for and eliminating the potential mosquito breeding grounds.
In this work, we introduce a comprehensive dataset of aerial videos, acquired with an unmanned aerial vehicle, containing possible mosquito breeding sites.
arXiv Detail & Related papers (2020-07-29T14:30:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.