Project RISE: Recognizing Industrial Smoke Emissions
- URL: http://arxiv.org/abs/2005.06111v9
- Date: Mon, 29 Apr 2024 22:16:08 GMT
- Title: Project RISE: Recognizing Industrial Smoke Emissions
- Authors: Yen-Chia Hsu, Ting-Hao 'Kenneth' Huang, Ting-Yao Hu, Paul Dille, Sean Prendi, Ryan Hoffman, Anastasia Tsuhlares, Jessica Pachuta, Randy Sargent, Illah Nourbakhsh
- Abstract summary: We introduce RISE, the first large-scale video dataset for Recognizing Industrial Smoke Emissions.
Our dataset contains 12,567 clips from 19 distinct views from cameras that monitored three industrial facilities.
We ran experiments using deep neural networks to establish a strong performance baseline and reveal smoke recognition challenges.
- Score: 11.113345640040277
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Industrial smoke emissions pose a significant concern to human health. Prior works have shown that using Computer Vision (CV) techniques to identify smoke as visual evidence can influence the attitude of regulators and empower citizens to pursue environmental justice. However, existing datasets are not of sufficient quality nor quantity to train the robust CV models needed to support air quality advocacy. We introduce RISE, the first large-scale video dataset for Recognizing Industrial Smoke Emissions. We adopted a citizen science approach to collaborate with local community members to annotate whether a video clip has smoke emissions. Our dataset contains 12,567 clips from 19 distinct views from cameras that monitored three industrial facilities. These daytime clips span 30 days over two years, including all four seasons. We ran experiments using deep neural networks to establish a strong performance baseline and reveal smoke recognition challenges. Our survey study discussed community feedback, and our data analysis displayed opportunities for integrating citizen scientists and crowd workers into the application of Artificial Intelligence for Social Impact.
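The summary mentions deep neural network baselines but not their architecture, so the following is only a minimal sketch of a clip-level smoke classifier in that spirit, assuming PyTorch; the 3D-CNN layout, clip shape, and layer sizes are illustrative assumptions, not the paper's actual baseline.
```python
# Minimal sketch (assumption, not the paper's exact baseline): a small 3D CNN
# that classifies a short RGB video clip as smoke (1) or no smoke (0).
import torch
import torch.nn as nn

class SmokeClipClassifier(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1),  # input: (B, 3, T, H, W)
            nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=2),                 # halve time and space
            nn.Conv3d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),                     # global spatio-temporal pooling
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(clip).flatten(1))

model = SmokeClipClassifier()
clip = torch.randn(2, 3, 8, 64, 64)  # 2 hypothetical 8-frame 64x64 clips
logits = model(clip)                  # shape: (2, 2)
```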
Related papers
- Uncovering Hidden Subspaces in Video Diffusion Models Using Re-Identification [6.408114351192012]
We show that models trained on synthetic data for specific downstream tasks still perform worse than those trained on real data.
This discrepancy may be partly due to the sampling space being a subspace of the training videos.
In this paper, we first show that training privacy-preserving models in latent space is computationally more efficient and generalizes better.
arXiv Detail & Related papers (2024-11-07T18:32:00Z)
- Towards Generalist Robot Learning from Internet Video: A Survey [56.621902345314645]
Scaling deep learning to huge internet-scraped datasets has yielded remarkably general capabilities in natural language processing and visual understanding and generation.
Data is scarce and expensive to collect in robotics, and robot learning has thus struggled to match the generality of capabilities observed in other domains.
Learning from Videos (LfV) methods seek to address this data bottleneck by augmenting traditional robot data with large internet-scraped video datasets.
arXiv Detail & Related papers (2024-04-30T15:57:41Z)
- IPAD: Industrial Process Anomaly Detection Dataset [71.39058003212614]
Video anomaly detection (VAD) is a challenging task aiming to recognize anomalies in video frames.
We propose a new dataset, IPAD, specifically designed for VAD in industrial scenarios.
This dataset covers 16 different industrial devices and contains over 6 hours of both synthetic and real-world video footage.
arXiv Detail & Related papers (2024-04-23T13:38:01Z)
- Learning Human Action Recognition Representations Without Real Humans [66.61527869763819]
We present a benchmark that leverages real-world videos with humans removed and synthetic data containing virtual humans to pre-train a model.
We then evaluate the transferability of the representation learned on this data to a diverse set of downstream action recognition benchmarks.
Our approach outperforms previous baselines by up to 5%.
arXiv Detail & Related papers (2023-11-10T18:38:14Z)
- GreenEyes: An Air Quality Evaluating Model based on WaveNet [11.513011576336744]
We propose a deep neural network model that consists of a WaveNet-based backbone block for learning sequence representations and an LSTM with a Temporal Attention module.
We show that our model can effectively predict the air quality level at the next timestamp given any segment of air quality data from the dataset (a minimal architectural sketch follows this entry).
arXiv Detail & Related papers (2022-12-08T10:28:57Z)
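As a rough illustration of the architecture this summary describes, here is a minimal sketch, assuming PyTorch: dilated causal 1-D convolutions (WaveNet-style) feed an LSTM, and a simple temporal-attention pooling over the LSTM outputs classifies the next timestamp's air quality level. The layer sizes, the three dilation levels, the attention formulation, and the six-level output are all illustrative assumptions, not the published GreenEyes model.
```python
# Minimal sketch (assumption, not the paper's exact model): WaveNet-style
# dilated causal convolutions + LSTM + temporal attention over sensor segments.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GreenEyesLikeModel(nn.Module):
    def __init__(self, in_features: int = 8, hidden: int = 64, num_levels: int = 6):
        super().__init__()
        # WaveNet-style backbone: kernel-2 convolutions with dilations 1, 2, 4.
        self.convs = nn.ModuleList([
            nn.Conv1d(in_features if d == 1 else hidden, hidden,
                      kernel_size=2, dilation=d)
            for d in (1, 2, 4)
        ])
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)      # temporal attention scores
        self.head = nn.Linear(hidden, num_levels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, features) -> (batch, features, time) for Conv1d
        h = x.transpose(1, 2)
        for conv in self.convs:
            h = F.pad(h, (conv.dilation[0], 0))  # causal: pad the past only
            h = F.relu(conv(h))
        h, _ = self.lstm(h.transpose(1, 2))      # (batch, time, hidden)
        w = torch.softmax(self.attn(h), dim=1)   # (batch, time, 1)
        context = (w * h).sum(dim=1)             # attention-weighted pooling
        return self.head(context)                # logits over air quality levels

model = GreenEyesLikeModel()
segment = torch.randn(4, 48, 8)  # 4 hypothetical 48-step sensor segments
logits = model(segment)           # shape: (4, 6)
```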
- Differentiable Frequency-based Disentanglement for Aerial Video Action Recognition [56.91538445510214]
We present a learning algorithm for human activity recognition in videos.
Our approach is designed for UAV videos, which are mainly acquired from obliquely placed dynamic cameras.
We conduct extensive experiments on the UAV Human dataset and the NEC Drone dataset.
arXiv Detail & Related papers (2022-09-15T22:16:52Z)
- Using Statistical Models to Detect Occupancy in Buildings through Monitoring VOC, CO$_2$, and other Environmental Factors [2.1485350418225244]
Previous research has relied on CO$_2$ sensors and vision-based techniques to determine occupancy patterns.
Volatile Organic Compounds (VOCs) are another pollutant originating from the occupants.
arXiv Detail & Related papers (2022-03-07T22:25:11Z)
- STCNet: Spatio-Temporal Cross Network for Industrial Smoke Detection [52.648906951532155]
We propose a novel Spatio-Temporal Cross Network (STCNet) to recognize industrial smoke emissions.
The proposed STCNet involves a spatial pathway to extract texture features and a temporal pathway to capture smoke motion information.
We show that STCNet outperforms the best competitors on the challenging RISE industrial smoke detection dataset by 6.2% (a minimal sketch of the two-pathway idea follows this entry).
arXiv Detail & Related papers (2020-11-10T02:28:47Z)
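As a rough illustration of the two-pathway idea described above, here is a minimal sketch, assuming PyTorch: a spatial pathway extracts texture features from a single frame while a temporal pathway operates on frame differences as a cheap motion proxy, and the two feature vectors are concatenated for classification. The pathway layouts, the frame-difference motion proxy, and the fusion are illustrative assumptions, not the published STCNet.
```python
# Minimal sketch (assumption, not the published STCNet): a spatial pathway on
# one frame plus a temporal pathway on frame differences, fused for smoke
# classification.
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.AdaptiveAvgPool2d(4),  # pool to a fixed 4x4 feature map
    )

class TwoPathwaySmokeNet(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.spatial = conv_block(3, 32)    # texture from one RGB frame
        self.temporal = conv_block(3, 32)   # motion proxy: frame differences
        self.head = nn.Linear(2 * 32 * 4 * 4, num_classes)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, frames, 3, height, width)
        middle = clip[:, clip.shape[1] // 2]              # one representative frame
        diffs = (clip[:, 1:] - clip[:, :-1]).mean(dim=1)  # averaged frame deltas
        feats = torch.cat([self.spatial(middle).flatten(1),
                           self.temporal(diffs).flatten(1)], dim=1)
        return self.head(feats)

model = TwoPathwaySmokeNet()
clip = torch.randn(2, 8, 3, 64, 64)  # 2 hypothetical 8-frame clips
logits = model(clip)                  # shape: (2, 2)
```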
- MSNet: A Multilevel Instance Segmentation Network for Natural Disaster Damage Assessment in Aerial Videos [74.22132693931145]
We study the problem of efficiently assessing building damage after natural disasters like hurricanes, floods or fires.
The first contribution is a new dataset, consisting of user-generated aerial videos from social media with annotations of instance-level building damage masks.
The second contribution is a new model, namely MSNet, which contains novel region proposal network designs.
arXiv Detail & Related papers (2020-06-30T02:23:05Z)
- All you can stream: Investigating the role of user behavior for greenhouse gas intensity of video streaming [0.0]
Life cycle assessments (LCA) need to broaden their perspective from a merely technological one to one that includes user decisions and behavior.
Quantitative data on user behavior (e.g. streaming duration, choice of end device, and resolution) are often lacking or difficult to integrate into LCA.
This study combined LCA with an online survey (N = 91, 7 consecutive days of assessment).
Results show that the CO$_2$ intensity of video streaming depends on several factors; for climate intensity, there is a factor of 10 between streaming on a smart TV and on a smartphone.
arXiv Detail & Related papers (2020-06-19T13:38:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.