VayuBuddy: an LLM-Powered Chatbot to Democratize Air Quality Insights
- URL: http://arxiv.org/abs/2411.12760v1
- Date: Sat, 16 Nov 2024 08:02:35 GMT
- Title: VayuBuddy: an LLM-Powered Chatbot to Democratize Air Quality Insights
- Authors: Zeel B Patel, Yash Bachwana, Nitish Sharma, Sarath Guttikunda, Nipun Batra
- Abstract summary: VayuBuddy is a Large Language Model (LLM)-powered chatbot for air quality sensor data analysis.
VayuBuddy receives questions in natural language, analyses the structured sensor data with LLM-generated Python code, and provides answers in natural language.
VayuBuddy can also generate visual analyses such as line plots, map plots, bar charts, and many others from the sensor data.
- Score: 2.2754055137802074
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Nearly 6.7 million lives are lost due to air pollution every year. While policymakers are working on mitigation strategies, public awareness can help reduce exposure to air pollution. Air pollution data from government-installed sensors is often publicly available in raw format, but there is a non-trivial barrier for various stakeholders in deriving meaningful insights from that data. In this work, we present VayuBuddy, a Large Language Model (LLM)-powered chatbot system to reduce the barrier between the stakeholders and air quality sensor data. VayuBuddy receives questions in natural language, analyses the structured sensor data with LLM-generated Python code, and provides answers in natural language. We use the data from Indian government air quality sensors. We benchmark the capabilities of 7 LLMs on 45 diverse question-answer pairs prepared by us. Additionally, VayuBuddy can generate visual analyses such as line plots, map plots, bar charts, and many others from the sensor data, as we demonstrate in this work.
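The question-to-code-to-answer loop described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the readings are made-up values, `fake_llm_generate_code` is a hypothetical stand-in for a real LLM call, and `answer_question` is a hypothetical helper name.

```python
from statistics import mean

# Hypothetical structured sensor readings (values are invented for illustration).
READINGS = [
    {"city": "Delhi", "month": "2023-01", "pm25": 180.0},
    {"city": "Delhi", "month": "2023-02", "pm25": 140.0},
    {"city": "Mumbai", "month": "2023-01", "pm25": 90.0},
    {"city": "Mumbai", "month": "2023-02", "pm25": 70.0},
]


def fake_llm_generate_code(question: str) -> str:
    """Hypothetical stand-in for an LLM call.

    A real system would send the question plus a description of the data
    schema to an LLM; here we return fixed Python source that stores its
    result in a variable named `answer`.
    """
    return (
        "by_city = {}\n"
        "for row in readings:\n"
        "    by_city.setdefault(row['city'], []).append(row['pm25'])\n"
        "worst = max(by_city, key=lambda c: mean(by_city[c]))\n"
        "answer = f'{worst} has the highest average PM2.5 "
        "({mean(by_city[worst]):.1f}).'\n"
    )


def answer_question(question: str, readings: list) -> str:
    """Generate code for the question, run it on the data, return the answer."""
    code = fake_llm_generate_code(question)
    # The generated code sees only the names placed in this namespace.
    namespace = {"readings": readings, "mean": mean}
    exec(code, namespace)  # assumption: a real deployment would sandbox this step
    return namespace["answer"]


if __name__ == "__main__":
    print(answer_question("Which city has the highest average PM2.5?", READINGS))
```

Executing generated code with `exec` keeps the sketch short; any real deployment would need to isolate that step, since LLM-generated code is untrusted input.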
Related papers
- AirCast: Improving Air Pollution Forecasting Through Multi-Variable Data Alignment [46.56288727659417]
Air pollution remains a leading global health risk, exacerbated by rapid industrialization and urbanization.
We introduce AirCast, a novel multi-variable air pollution forecasting model.
AirCast employs a multi-task head architecture that simultaneously forecasts atmospheric conditions and pollutant concentrations.
arXiv Detail & Related papers (2025-02-25T07:34:18Z) - Use of Air Quality Sensor Network Data for Real-time Pollution-Aware POI Suggestion [10.782779065468558]
This demo paper introduces AirSense-R, a privacy-preserving mobile application that delivers real-time, pollution-aware recommendations for POIs.
By merging live air quality data from AirSENCE sensor networks with user preferences, the system enables health-conscious decision-making.
arXiv Detail & Related papers (2025-02-13T10:36:17Z) - AirCalypse: Can Twitter Help in Urban Air Quality Measurement and Who are the Influential Users? [0.6441880253307178]
This work is an empirical study on using Twitter as a "Sensor" to measure air quality.
The focal point of this work is to identify the users who have been actively tweeting during air pollution events in Delhi.
We further study the behavior, i.e., perception of pollution from those users' posts with respect to the actual air pollution levels using the physical sensors.
arXiv Detail & Related papers (2025-01-25T11:13:15Z) - NewsInterview: a Dataset and a Playground to Evaluate LLMs' Ground Gap via Informational Interviews [65.35458530702442]
We focus on journalistic interviews, a domain rich in grounding communication and abundant in data.
We curate a dataset of 40,000 two-person informational interviews from NPR and CNN.
LLMs are significantly less likely than human interviewers to use acknowledgements and to pivot to higher-level questions.
arXiv Detail & Related papers (2024-11-21T01:37:38Z) - In Generative AI we Trust: Can Chatbots Effectively Verify Political Information? [39.58317527488534]
This article presents a comparative analysis of the ability of two large language model (LLM)-based chatbots, ChatGPT and Bing Chat, to detect veracity of political information.
We use AI auditing methodology to investigate how chatbots evaluate true, false, and borderline statements on five topics: COVID-19, Russian aggression against Ukraine, the Holocaust, climate change, and LGBTQ+ related debates.
The results show high performance of ChatGPT for the baseline veracity evaluation task, with 72 percent of the cases evaluated correctly on average across languages without pre-training.
arXiv Detail & Related papers (2023-12-20T15:17:03Z) - Gaussian Processes for Monitoring Air-Quality in Kampala [3.173497841606415]
We investigate the use of Gaussian Processes for nowcasting the current air-pollution in places where there are no sensors and forecasting the air-pollution in the future at the sensor locations.
We focus on the city of Kampala in Uganda, using data from AirQo's network of sensors.
arXiv Detail & Related papers (2023-11-28T09:25:23Z) - GreenEyes: An Air Quality Evaluating Model based on WaveNet [11.513011576336744]
We propose a deep neural network model, which consists of a WaveNet-based backbone block for learning representations of sequences and an LSTM with a Temporal Attention module.
We show our model can effectively predict the air quality level of the next timestamp given any segment of the air quality data from the data set.
arXiv Detail & Related papers (2022-12-08T10:28:57Z) - Predicting air quality via multimodal AI and satellite imagery [0.2492060267829796]
This paper seeks to create a multi-modal machine learning model for predicting air-quality metrics where monitoring stations do not exist.
A new dataset of European pollution monitoring station measurements is created with features including altitude, population, etc. from the ESA Copernicus project.
These predictions are then aggregated to create an "air-quality index" that could be used to compare air quality over different regions.
arXiv Detail & Related papers (2022-11-01T22:56:15Z) - Exploring Contextual Representation and Multi-Modality for End-to-End Autonomous Driving [58.879758550901364]
Recent perception systems enhance spatial understanding with sensor fusion but often lack full environmental context.
We introduce a framework that integrates three cameras to emulate the human field of view, coupled with top-down bird-eye-view semantic data to enhance contextual representation.
Our method achieves a displacement error of 0.67m in open-loop settings, surpassing current methods by 6.9% on the nuScenes dataset.
arXiv Detail & Related papers (2022-10-13T05:56:20Z) - COLD: A Benchmark for Chinese Offensive Language Detection [54.60909500459201]
We use COLDataset, a Chinese offensive language dataset with 37k annotated sentences.
We also propose COLDetector to study output offensiveness of popular Chinese language models.
Our resources and analyses are intended to help detoxify the Chinese online communities and evaluate the safety performance of generative language models.
arXiv Detail & Related papers (2022-01-16T11:47:23Z) - A dataset for multi-sensor drone detection [67.75999072448555]
The use of small and remotely controlled unmanned aerial vehicles (UAVs) has increased in recent years.
Most studies on drone detection fail to specify the type of acquisition device, the drone type, the detection range, or the dataset.
We contribute with an annotated multi-sensor database for drone detection that includes infrared and visible videos and audio files.
arXiv Detail & Related papers (2021-11-02T20:52:03Z) - An Experimental Urban Case Study with Various Data Sources and a Model for Traffic Estimation [65.28133251370055]
We organize an experimental campaign with video measurement in an area within the urban network of Zurich, Switzerland.
We focus on capturing the traffic state in terms of traffic flow and travel times by ensuring measurements from established thermal cameras.
We propose a simple yet efficient Multiple Linear Regression (MLR) model to estimate travel times with fusion of various data sources.
arXiv Detail & Related papers (2021-08-02T08:13:57Z) - Survey: Leakage and Privacy at Inference Time [59.957056214792665]
Leakage of data from publicly available Machine Learning (ML) models is an area of growing significance.
We focus on inference-time leakage, as the most likely scenario for publicly available models.
We propose a taxonomy across involuntary and malevolent leakage, available defences, followed by the currently available assessment metrics and applications.
arXiv Detail & Related papers (2021-07-04T12:59:16Z) - AiR -- An Augmented Reality Application for Visualizing Air Pollution [5.564705758320338]
AiR considers the air quality measured by CPCB, in a locality detected by the user's GPS or in a locality of user's choice, and visualizes various air pollutants present in the locality.
AiR also creates awareness in an interactive manner about the different pollutants, sources, and their impacts on health.
arXiv Detail & Related papers (2020-06-03T10:03:47Z)