Priority prediction of Asian Hornet sighting report using machine
learning methods
- URL: http://arxiv.org/abs/2107.05465v1
- Date: Mon, 28 Jun 2021 07:33:53 GMT
- Title: Priority prediction of Asian Hornet sighting report using machine
learning methods
- Authors: Yixin Liu, Jiaxin Guo, Jieyang Dong, Luoqian Jiang and Haoyuan Ouyang
- Abstract summary: The Asian giant hornet (Vespa mandarinia) is devastating not only to native bee colonies, but also to local apiculture.
We propose a method to predict the priority of sighting reports based on machine learning.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: An infamous invader of the North American ecosystem, the Asian giant hornet
(Vespa mandarinia) is devastating not only to native bee colonies, but also to
local apiculture. One of the most effective ways to combat this harmful species
is to locate and destroy its nests. By mobilizing the public to actively
report possible sightings of the Asian giant hornet, the government could promptly
send inspectors to confirm and possibly destroy the nests. However, such
confirmation requires lab expertise, and manually checking the reports one by
one consumes considerable human resources. Furthermore, given the limited
knowledge of the public about the Asian giant hornet and the randomness of
report submission, only a few of the numerous reports prove positive, i.e.
correspond to existing nests. How to classify or prioritize the reports efficiently and
automatically, so as to determine the dispatch of personnel, is of great
significance to the control of the Asian giant hornet. In this paper, we
propose a method to predict the priority of sighting reports based on machine
learning. We model the problem of optimal prioritization of sighting reports as
a problem of classification and prediction. We extracted a variety of rich
features in the report: location, time, image(s), and textual description.
Based on these characteristics, we propose a classification model based on
logistic regression to predict the credibility of a certain report.
Furthermore, our model quantifies the impact between reports to get the
priority ranking of the reports. Extensive experiments on the public dataset
from the WSDA (the Washington State Department of Agriculture) have proved the
effectiveness of our method.
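
The two steps named in the abstract, scoring each report's credibility with logistic regression and then ranking the reports, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the three features, the toy training data, and all function names are hypothetical, and the impact between reports that the paper quantifies is not modeled here.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def credibility(w, b, x):
    """Predicted probability that a report is a true sighting."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def rank_reports(w, b, reports):
    """Return report ids sorted by descending predicted credibility."""
    scored = [(credibility(w, b, feats), rid) for rid, feats in reports.items()]
    return [rid for _, rid in sorted(scored, key=lambda s: -s[0])]

# Hypothetical features per report (assumptions, not from the paper):
# [closeness to a confirmed sighting in 0..1, has image 0/1, text score in 0..1]
X_train = [
    [0.9, 1, 0.8],
    [0.8, 1, 0.6],
    [0.7, 0, 0.9],
    [0.2, 0, 0.1],
    [0.1, 1, 0.2],
    [0.3, 0, 0.3],
]
y_train = [1, 1, 1, 0, 0, 0]  # 1 = confirmed positive, 0 = negative

w, b = fit_logistic(X_train, y_train)

# Unverified reports awaiting inspection (made-up values).
pending = {
    "A": [0.85, 1, 0.7],
    "B": [0.15, 0, 0.2],
    "C": [0.50, 1, 0.5],
}
ranking = rank_reports(w, b, pending)
print(ranking)
```

Inspectors would then work through `ranking` from the front. Sorting by predicted probability is the simplest possible prioritization and stands in here for the paper's ranking step.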
Related papers
- Thinking Racial Bias in Fair Forgery Detection: Models, Datasets and Evaluations [63.52709761339949]
We first contribute a dedicated dataset called the Fair Forgery Detection (FairFD) dataset, where we prove the racial bias of public state-of-the-art (SOTA) methods.
We design novel metrics including Approach Averaged Metric and Utility Regularized Metric, which can avoid deceptive results.
We also present an effective and robust post-processing technique, Bias Pruning with Fair Activations (BPFA), which improves fairness without requiring retraining or weight updates.
arXiv Detail & Related papers (2024-07-19T14:53:18Z) - Hierarchical Multi-label Classification for Fine-level Event Extraction from Aviation Accident Reports [18.005377921658308]
This article argues that we can identify the events more accurately by leveraging the event taxonomy.
We achieve this hierarchical classification task by incorporating a novel hierarchical attention module into BERT.
It has been shown that fine-level prediction accuracy is highly improved, and the regularization term can be beneficial to the rare event identification problem.
arXiv Detail & Related papers (2024-03-26T17:51:06Z) - A Bayesian Spatial Model to Correct Under-Reporting in Urban
Crowdsourcing [1.850972250657274]
Decision-makers often observe the occurrence of events through a reporting process.
We show how to overcome this challenge by leveraging the fact that events are spatially correlated.
arXiv Detail & Related papers (2023-12-18T23:40:56Z) - SatBird: Bird Species Distribution Modeling with Remote Sensing and
Citizen Science Data [68.2366021016172]
We present SatBird, a satellite dataset of locations in the USA with labels derived from presence-absence observation data from the citizen science database eBird.
We also provide a dataset in Kenya representing low-data regimes.
We benchmark a set of baselines on our dataset, including SOTA models for remote sensing tasks.
arXiv Detail & Related papers (2023-11-02T02:00:27Z) - Automated Labeling of German Chest X-Ray Radiology Reports using Deep
Learning [50.591267188664666]
We propose a deep learning-based CheXpert label prediction model, pre-trained on reports labeled by a rule-based German CheXpert model.
Our results demonstrate the effectiveness of our approach, which significantly outperformed the rule-based model on all three tasks.
arXiv Detail & Related papers (2023-06-09T16:08:35Z) - Automatic Classification of Bug Reports Based on Multiple Text
Information and Reports' Intention [37.67372105858311]
This paper proposes a new automatic classification method for bug reports.
The innovation is that when categorizing bug reports, in addition to using the text information of the report, the intention of the report is also considered.
Our proposed method achieves better performance, with an F-measure ranging from 87.3% to 95.5%.
arXiv Detail & Related papers (2022-08-02T06:44:51Z) - Spatial Monitoring and Insect Behavioural Analysis Using Computer Vision
for Precision Pollination [6.2997667081978825]
Insects are the most important global pollinators of crops and play a key role in maintaining the sustainability of natural ecosystems.
Current computer vision facilitated insect tracking in complex outdoor environments is restricted in spatial coverage.
This article introduces a novel system to facilitate markerless data capture for insect counting, insect motion tracking, behaviour analysis and pollination prediction.
arXiv Detail & Related papers (2022-05-10T05:11:28Z) - Quantifying Spatial Under-reporting Disparities in Resident
Crowdsourcing [5.701305404173138]
We develop a method to identify reporting delays without using external ground-truth data.
We apply our method to over 100,000 resident reports made in New York City and to over 900,000 reports made in Chicago.
arXiv Detail & Related papers (2022-04-19T02:54:16Z) - COLD: A Benchmark for Chinese Offensive Language Detection [54.60909500459201]
We use COLDataset, a Chinese offensive language dataset with 37k annotated sentences.
We also propose COLDetector to study the output offensiveness of popular Chinese language models.
Our resources and analyses are intended to help detoxify the Chinese online communities and evaluate the safety performance of generative language models.
arXiv Detail & Related papers (2022-01-16T11:47:23Z) - Hidden Biases in Unreliable News Detection Datasets [60.71991809782698]
We show that selection bias during data collection leads to undesired artifacts in the datasets.
We observed a significant drop (>10%) in accuracy for all models tested in a clean split with no train/test source overlap.
We suggest future dataset creation include a simple model as a difficulty/bias probe and future model development use a clean non-overlapping site and date split.
arXiv Detail & Related papers (2021-04-20T17:16:41Z) - Showing Your Work Doesn't Always Work [73.63200097493576]
"Show Your Work: Improved Reporting of Experimental Results" advocates for reporting the expected validation effectiveness of the best-tuned model.
We analytically show that their estimator is biased and uses error-prone assumptions.
We derive an unbiased alternative and bolster our claims with empirical evidence from statistical simulation.
arXiv Detail & Related papers (2020-04-28T17:59:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.