Related papers: Automated Landfill Detection Using Deep Learning: A Comparative Study of Lightweight and Custom Architectures with the AerialWaste Dataset

URL: http://arxiv.org/abs/2508.18315v1
Date: Sat, 23 Aug 2025 19:52:24 GMT
Title: Automated Landfill Detection Using Deep Learning: A Comparative Study of Lightweight and Custom Architectures with the AerialWaste Dataset
Authors: Nowshin Sharmily, Rusab Sarmun, Muhammad E. H. Chowdhury, Mir Hamidul Hussain, Saad Bin Abul Kashem, Molla E Majid, Amith Khandakar
Abstract summary: The AerialWaste dataset is a large collection of 10434 images of the Lombardy region of Italy. Deep learning models were trained and validated on the dataset. Binary classification could be performed on this dataset with 92.33% accuracy, 92.67% precision, 92.33% sensitivity, 92.41% F1 score and 92.71% specificity.
Score: 7.803636044185931
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Illegal landfills pose a hazardous threat to people all over the world. Because manually locating landfills is arduous, many go unnoticed by authorities and later cause serious harm to people and the environment. Deep learning can play a significant role in identifying these landfills while saving valuable time, manpower and resources. Despite the urgency of the problem, good-quality public datasets for illegal landfill detection are hard to find due to security concerns. The AerialWaste dataset, however, is a large collection of 10434 images of the Lombardy region of Italy. The images are of varying quality and were collected from three different sources: AGEA Orthophotos, WorldView-3, and Google Earth. The dataset contains professionally curated, diverse and high-quality images, which makes it particularly suitable for scalable and impactful research. When we trained several models to compare results, we found complex and heavy models to be prone to overfitting and to memorizing training data instead of learning patterns. We therefore chose simpler, lightweight models that could learn general features from the dataset. In this study, MobileNetV2, GoogLeNet, DenseNet, MobileViT and other lightweight deep learning models were trained and validated on the dataset, as they achieved significant success with less overfitting. Because some of these models substantially improved performance, we combined the best-performing models into an ensemble model. With the help of ensemble and fusion techniques, binary classification could be performed on this dataset with 92.33% accuracy, 92.67% precision, 92.33% sensitivity, 92.41% F1 score and 92.71% specificity.
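The abstract reports an ensemble of the best-performing models and five binary-classification metrics. The sketch below illustrates one common way such a pipeline can look; it is not the authors' method. It assumes soft voting (averaging each model's positive-class probability), which the abstract does not specify, and the function names `soft_vote` and `binary_metrics`, the 0.5 threshold, and the toy probabilities are all hypothetical.

```python
from typing import Dict, List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[float]], threshold: float = 0.5) -> List[int]:
    """Average per-model positive-class probabilities, then threshold.

    Assumption: soft voting stands in for the unspecified fusion technique.
    """
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    preds = []
    for i in range(n_samples):
        mean_p = sum(model[i] for model in prob_sets) / n_models
        preds.append(1 if mean_p >= threshold else 0)
    return preds


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute the five metrics named in the abstract from confusion counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)  # also called recall
    specificity = ratio(tn, tn + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Toy data (hypothetical): 1 = landfill, 0 = no landfill.
y_true = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.6, 0.3, 0.2, 0.6, 0.1]
model_b = [0.8, 0.7, 0.4, 0.1, 0.5, 0.3]
model_c = [0.7, 0.4, 0.2, 0.3, 0.7, 0.2]

preds = soft_vote([model_a, model_b, model_c])
metrics = binary_metrics(y_true, preds)
print(preds)
print(metrics)
```

Per-class metrics like these can differ from averaged variants (for example, class-weighted precision or F1), so the values printed here are not directly comparable to the percentages in the abstract without knowing which averaging the paper used.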

Related papers

AerialMegaDepth: Learning Aerial-Ground Reconstruction and View Synthesis [57.249817395828174]
We propose a scalable framework combining pseudo-synthetic renderings from 3D city-wide meshes with real, ground-level crowd-sourced images. The pseudo-synthetic data simulates a wide range of aerial viewpoints, while the real, crowd-sourced images help improve visual fidelity for ground-level images. Using this hybrid dataset, we fine-tune several state-of-the-art algorithms and achieve significant improvements on real-world, zero-shot aerial-ground tasks.
arXiv Detail & Related papers (2025-04-17T17:57:05Z)
FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration [66.61201445650323]
Existing methods suffer from a generalization bottleneck in real-world scenarios. We contribute a million-scale dataset with two notable advantages over existing training data. We propose a robust model, FoundIR, to better address a broader range of restoration tasks in real-world scenarios.
arXiv Detail & Related papers (2024-12-02T12:08:40Z)
Community Forensics: Using Thousands of Generators to Train Fake Image Detectors [15.166026536032142]
One of the key challenges of detecting AI-generated images is spotting images that have been created by previously unseen generative models. We propose a new dataset that is significantly larger and more diverse than prior work. The resulting dataset contains 2.7M images that have been sampled from 4803 different models.
arXiv Detail & Related papers (2024-11-06T18:59:41Z)
An evaluation of Deep Learning based stereo dense matching dataset shift from aerial images and a large scale stereo dataset [2.048226951354646]
We present a method for generating ground-truth disparity maps directly from Light Detection and Ranging (LiDAR) and images. We evaluate 11 dense matching methods across datasets with diverse scene types, image resolutions, and geometric configurations.
arXiv Detail & Related papers (2024-02-19T20:33:46Z)
Large-scale Weakly Supervised Learning for Road Extraction from Satellite Imagery [9.28701721082481]
This paper proposes to leverage OpenStreetMap road data as weak labels and large scale satellite imagery to pre-train semantic segmentation models. Using as much as 100 times more data than the widely used DeepGlobe road dataset, our model exceeds the top performer of the current DeepGlobe leaderboard.
arXiv Detail & Related papers (2023-09-14T16:16:57Z)
Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation [151.70234052015948]
We propose a novel approach that encourages the optimization algorithm to seek a flat trajectory. We show that, with regularization towards the flat trajectory, the weights trained on synthetic data are robust against accumulated error perturbations. Our method, called Flat Trajectory Distillation (FTD), is shown to boost the performance of gradient-matching methods by up to 4.7%.
arXiv Detail & Related papers (2022-11-20T15:49:11Z)
Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision [38.22842778742829]
Discriminative self-supervised learning allows training models on any random group of internet images. We train models on billions of random images without any data pre-processing or prior assumptions about what we want the model to learn. We extensively study and validate our model performance on over 50 benchmarks including fairness, robustness to distribution shift, geographical diversity, fine-grained recognition, image copy detection and many image classification datasets.
arXiv Detail & Related papers (2022-02-16T22:26:47Z)
A Novel Disaster Image Dataset and Characteristics Analysis using Attention Model [2.1473182295633224]
This dataset contains images collected from various sources for three different disasters: fire, water and land. There are 13,720 manually annotated images in this dataset, and each image is annotated by three individuals. A three-layer attention model (TLAM) is trained, and an average five-fold validation accuracy of 95.88% is achieved.
arXiv Detail & Related papers (2021-07-02T21:18:20Z)
Hidden Biases in Unreliable News Detection Datasets [60.71991809782698]
We show that selection bias during data collection leads to undesired artifacts in the datasets. We observed a significant drop (>10%) in accuracy for all models tested in a clean split with no train/test source overlap. We suggest future dataset creation include a simple model as a difficulty/bias probe and future model development use a clean non-overlapping site and date split.
arXiv Detail & Related papers (2021-04-20T17:16:41Z)
Deep Traffic Sign Detection and Recognition Without Target Domain Real Images [52.079665469286496]
We propose a novel database generation method that requires templates of the traffic signs but no real image from the target domain. The method does not aim to surpass training with real data, but to be a compatible alternative when real data is not available. On large data sets, training with a fully synthetic data set almost matches the performance of training with a real one.
arXiv Detail & Related papers (2020-07-30T21:06:47Z)
GraspNet: A Large-Scale Clustered and Densely Annotated Dataset for Object Grasping [49.777649953381676]
We contribute a large-scale grasp pose detection dataset with a unified evaluation system. Our dataset contains 87,040 RGBD images with over 370 million grasp poses.
arXiv Detail & Related papers (2019-12-31T18:15:11Z)

This list is automatically generated from the titles and abstracts of the papers in this site.