Indoor Air Quality Dataset with Activities of Daily Living in Low to Middle-income Communities
- URL: http://arxiv.org/abs/2407.14501v3
- Date: Mon, 23 Sep 2024 06:46:19 GMT
- Title: Indoor Air Quality Dataset with Activities of Daily Living in Low to Middle-income Communities
- Authors: Prasenjit Karmakar, Swadhin Pradhan, Sandip Chakraborty
- Abstract summary: We present measurements of air quality from 30 indoor sites over six months during summer and winter seasons in India.
The dataset contains various types of indoor environments.
It can provide the basis for data-driven learning model research aimed at coping with unique pollution patterns in developing countries.
- Score: 5.019848446554892
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, indoor air pollution has posed a significant threat to our society, claiming over 3.2 million lives annually. Developing nations, such as India, are most affected since lack of knowledge, inadequate regulation, and outdoor air pollution lead to severe daily exposure to pollutants. However, only a limited number of studies have attempted to understand how indoor air pollution affects developing countries like India. To address this gap, we present spatiotemporal measurements of air quality from 30 indoor sites over six months during summer and winter seasons. The sites are geographically located across four regions of type: rural, suburban, and urban, covering the typical low to middle-income population in India. The dataset contains various types of indoor environments (e.g., studio apartments, classrooms, research laboratories, food canteens, and residential households), and can provide the basis for data-driven learning model research aimed at coping with unique pollution patterns in developing countries. This unique dataset demands advanced data cleaning and imputation techniques for handling missing data due to power failure or network outages during data collection. Furthermore, through a simple speech-to-text application, we provide real-time indoor activity labels annotated by occupants. Therefore, environmentalists and ML enthusiasts can utilize this dataset to understand the complex patterns of the pollutants under different indoor activities, identify recurring sources of pollution, forecast exposure, improve floor plans and room structures of modern indoor designs, develop pollution-aware recommender systems, etc.
Related papers
- Causal Representation Learning in Temporal Data via Single-Parent Decoding [66.34294989334728]
Scientific research often seeks to understand the causal structure underlying high-level variables in a system.
Scientists typically collect low-level measurements, such as geographically distributed temperature readings.
We propose a differentiable method, Causal Discovery with Single-parent Decoding, that simultaneously learns the underlying latents and a causal graph over them.
arXiv Detail & Related papers (2024-10-09T15:57:50Z)
- Urban Air Pollution Forecasting: a Machine Learning Approach leveraging Satellite Observations and Meteorological Forecasts [0.11249583407496218]
Air pollution poses a significant threat to public health and well-being, particularly in urban areas.
This study introduces a series of machine-learning models that integrate data from the Sentinel-5P satellite, meteorological conditions, and topological characteristics to forecast future levels of five major pollutants.
arXiv Detail & Related papers (2024-05-30T10:02:53Z)
- Back to the Future: GNN-based NO$_2$ Forecasting via Future Covariates [49.93577170464313]
We deal with air quality observations in a city-wide network of ground monitoring stations.
We propose a conditioning block that embeds past and future covariates into the current observations.
We find that conditioning on future weather information has a greater impact than considering past traffic conditions.
arXiv Detail & Related papers (2024-04-08T09:13:16Z)
- SatBird: Bird Species Distribution Modeling with Remote Sensing and Citizen Science Data [68.2366021016172]
We present SatBird, a satellite dataset of locations in the USA with labels derived from presence-absence observation data from the citizen science database eBird.
We also provide a dataset in Kenya representing low-data regimes.
We benchmark a set of baselines on our dataset, including SOTA models for remote sensing tasks.
arXiv Detail & Related papers (2023-11-02T02:00:27Z)
- Multimodal Dataset from Harsh Sub-Terranean Environment with Aerosol Particles for Frontier Exploration [55.41644538483948]
This paper introduces a multimodal dataset from the harsh and unstructured underground environment with aerosol particles.
It contains synchronized raw data measurements from all onboard sensors in Robot Operating System (ROS) format.
The focus of this paper is not only to capture both temporal and spatial data diversities but also to present the impact of harsh conditions on captured data.
arXiv Detail & Related papers (2023-04-27T20:21:18Z)
- Multi-scale Digital Twin: Developing a fast and physics-informed surrogate model for groundwater contamination with uncertain climate models [53.44486283038738]
Climate change exacerbates the long-term soil management problem of groundwater contamination.
We develop a physics-informed machine learning surrogate model using U-Net enhanced Fourier Neural Contaminated (PDENO)
In parallel, we develop a convolutional autoencoder combined with climate data to reduce the dimensionality of climatic region similarities across the United States.
arXiv Detail & Related papers (2022-11-20T06:46:35Z)
- Air Pollution Hotspot Detection and Source Feature Analysis using Cross-domain Urban Data [2.458537954999774]
Areas adjacent to pollution sources often have high ambient pollution concentrations, and those areas are commonly referred to as air pollution hotspots.
We propose a two-step approach to detect hotspots from mobile sensing data, which includes local spike detection and sample-weighted clustering.
As a soft-validation, we build hotspot inference models for cities with and without mobile sensing data.
arXiv Detail & Related papers (2022-11-15T18:44:03Z)
- Detecting Elevated Air Pollution Levels by Monitoring Web Search Queries: Deep Learning-Based Time Series Forecasting [7.978612711536259]
Prior work relied on modeling pollutant concentrations collected from ground-based monitors and meteorological data for long-term forecasting.
This study aims to develop and validate models to nowcast the observed pollution levels using Web search data, which is publicly available in near real-time from major search engines.
We developed novel machine learning-based models using both traditional supervised classification methods and state-of-the-art deep learning methods to detect elevated air pollution levels at the US city level.
arXiv Detail & Related papers (2022-11-09T23:56:35Z)
- Jalisco's multiclass land cover analysis and classification using a novel lightweight convnet with real-world multispectral and relief data [51.715517570634994]
We present our novel lightweight (only 89k parameters) Convolution Neural Network (ConvNet) to make LC classification and analysis.
In this work, we combine three real-world open data sources to obtain 13 channels.
Our embedded analysis anticipates the limited performance in some classes and gives us the opportunity to group the most similar.
arXiv Detail & Related papers (2022-01-26T14:58:51Z)
- Deciphering Environmental Air Pollution with Large Scale City Data [0.0]
Various factors, ranging from traffic and power-plant emissions to household emissions and natural causes, are known to be primary causal agents or influencers behind rising air pollution levels.
We introduce a large scale city-wise dataset for exploring the relationships among these agents over a long period of time.
Also, we provide a set of benchmarks for the problem of estimating or forecasting pollutant levels with a set of diverse models and methodologies.
arXiv Detail & Related papers (2021-09-09T22:00:51Z)
- Use of Remote Sensing Data to Identify Air Pollution Signatures in India [0.3683202928838613]
The launch of the Sentinel-5P satellite has helped in the observation of a wider variety of air pollutants.
The clustering signatures can be used to identify states and districts based on the types of pollutants emitted by various pollution sources.
arXiv Detail & Related papers (2020-12-01T11:06:23Z)