EcoVerse: An Annotated Twitter Dataset for Eco-Relevance Classification, Environmental Impact Analysis, and Stance Detection
- URL: http://arxiv.org/abs/2404.05133v1
- Date: Mon, 8 Apr 2024 01:21:11 GMT
- Title: EcoVerse: An Annotated Twitter Dataset for Eco-Relevance Classification, Environmental Impact Analysis, and Stance Detection
- Authors: Francesca Grasso, Stefano Locci, Giovanni Siragusa, Luigi Di Caro
- Abstract summary: EcoVerse is an annotated English Twitter dataset of 3,023 tweets spanning a wide spectrum of environmental topics.
We propose a three-level annotation scheme designed for Eco-Relevance Classification and Stance Detection, and introduce an original approach to Environmental Impact Analysis.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The anthropogenic ecological crisis constitutes a significant challenge that all within the academy must urgently face, including the Natural Language Processing (NLP) community. While recent years have seen increasing work on climate-centric discourse, crucial environmental and ecological topics outside of climate change remain largely unaddressed, despite their importance. Mainstream NLP tasks, such as sentiment analysis, dominate the scene, but there remains an untouched space in the literature involving the analysis of the environmental impacts of certain events and practices. To address this gap, this paper presents EcoVerse, an annotated English Twitter dataset of 3,023 tweets spanning a wide spectrum of environmental topics. We propose a three-level annotation scheme designed for Eco-Relevance Classification and Stance Detection, and introduce an original approach to Environmental Impact Analysis. We detail the data collection, filtering, and labeling process that led to the creation of the dataset. Remarkable Inter-Annotator Agreement indicates that the annotation scheme produces consistent annotations of high quality. Subsequent classification experiments using BERT-based models, including ClimateBERT, are presented. These yield encouraging results, while also indicating room for a model specifically tailored for environmental texts. The dataset is made freely available to stimulate further research.
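The abstract reports strong Inter-Annotator Agreement, but this summary does not name the metric used. As a minimal sketch, assuming two annotators and Cohen's kappa (an assumption, not confirmed by the source), agreement on a binary eco-relevance label could be computed as follows. The function name `cohen_kappa`, the label strings, and the toy annotations are all hypothetical and are not taken from the EcoVerse dataset.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: from each annotator's own label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy annotations (not real EcoVerse data).
annotator_1 = ["eco", "eco", "not_eco", "eco", "not_eco", "not_eco"]
annotator_2 = ["eco", "eco", "not_eco", "not_eco", "not_eco", "not_eco"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when one label dominates. With more than two annotators, a different statistic such as Fleiss' kappa or Krippendorff's alpha would be needed.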
Related papers
- VegeDiff: Latent Diffusion Model for Geospatial Vegetation Forecasting [58.12667617617306]
We propose VegeDiff for the geospatial vegetation forecasting task.
VegeDiff is the first to employ a diffusion model to probabilistically capture the uncertainties in vegetation change processes.
By capturing the uncertainties in vegetation changes and modeling the complex influence of relevant variables, VegeDiff outperforms existing deterministic methods.
arXiv Detail & Related papers (2024-07-17T14:15:52Z)
- Towards A Comprehensive Assessment of AI's Environmental Impact [0.5982922468400899]
The recent surge of interest in machine learning has sparked a trend towards large-scale adoption of AI/ML.
There is a need for a framework that monitors the environmental impact and degradation from AI/ML throughout its lifecycle.
This study proposes a methodology to track environmental variables relating to the multifaceted impact of AI around datacenters using openly available energy data and globally acquired satellite observations.
arXiv Detail & Related papers (2024-05-22T21:19:35Z)
- FREE: The Foundational Semantic Recognition for Modeling Environmental Ecosystems [28.166089112650926]
FREE maps available environmental data into a text space and then converts the traditional predictive modeling task in environmental science to the semantic recognition problem.
When used for long-term prediction, FREE has the flexibility to incorporate newly collected observations to enhance future prediction.
The efficacy of FREE is evaluated in the context of two societally important real-world applications, predicting stream water temperature in the Delaware River Basin and predicting annual corn yield in Illinois and Iowa.
arXiv Detail & Related papers (2023-11-17T00:53:09Z)
- SatBird: Bird Species Distribution Modeling with Remote Sensing and Citizen Science Data [68.2366021016172]
We present SatBird, a satellite dataset of locations in the USA with labels derived from presence-absence observation data from the citizen science database eBird.
We also provide a dataset in Kenya representing low-data regimes.
We benchmark a set of baselines on our dataset, including SOTA models for remote sensing tasks.
arXiv Detail & Related papers (2023-11-02T02:00:27Z)
- A Comparative Study of Machine Learning Algorithms for Anomaly Detection in Industrial Environments: Performance and Environmental Impact [62.997667081978825]
This study seeks to address the demands of high-performance machine learning models with environmental sustainability.
Traditional machine learning algorithms, such as Decision Trees and Random Forests, demonstrate robust efficiency and performance.
However, superior outcomes were obtained with optimised configurations, albeit with a commensurate increase in resource consumption.
arXiv Detail & Related papers (2023-07-01T15:18:00Z)
- Environmental Claim Detection [6.2887102994549595]
This paper introduces the task of environmental claim detection.
We release an expert-annotated dataset and models trained on this dataset.
We find that the number of environmental claims has steadily increased since the Paris Agreement in 2015.
arXiv Detail & Related papers (2022-09-01T14:51:07Z)
- Delving into High-Quality Synthetic Face Occlusion Segmentation Datasets [83.749895930242]
We propose two techniques for producing high-quality naturalistic synthetic occluded faces.
We empirically show the effectiveness and robustness of both methods, even for unseen occlusions.
We present two high-resolution real-world occluded face datasets with fine-grained annotations, RealOcc and RealOcc-Wild.
arXiv Detail & Related papers (2022-05-12T17:03:57Z)
- Unraveling the hidden environmental impacts of AI solutions for environment [0.04588028371034406]
In the past ten years artificial intelligence has encountered such dramatic progress that it is seen now as a tool of choice to solve environmental issues.
The deep learning community began to realize that training models with more and more parameters required a lot of energy and as a consequence GHG emissions.
This article proposes to study the possible negative impact of "AI for green".
arXiv Detail & Related papers (2021-10-22T14:56:47Z)
- Analyzing Sustainability Reports Using Natural Language Processing [68.8204255655161]
In recent years, companies have increasingly been aiming to both mitigate their environmental impact and adapt to the changing climate context.
This is reported via increasingly exhaustive reports, which cover many types of climate risks and exposures under the umbrella of Environmental, Social, and Governance (ESG).
We present this tool and the methodology that we used to develop it in the present article.
arXiv Detail & Related papers (2020-11-03T21:22:42Z)
- Ecological Reinforcement Learning [76.9893572776141]
We study the kinds of environment properties that can make learning under such conditions easier.
Understanding how properties of the environment impact the performance of reinforcement learning agents can help us to structure our tasks in ways that make learning tractable.
arXiv Detail & Related papers (2020-06-22T17:55:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.