ClimaText: A Dataset for Climate Change Topic Detection
- URL: http://arxiv.org/abs/2012.00483v2
- Date: Sat, 2 Jan 2021 16:13:06 GMT
- Title: ClimaText: A Dataset for Climate Change Topic Detection
- Authors: Francesco S. Varini, Jordan Boyd-Graber, Massimiliano Ciaramita, and Markus Leippold
- Abstract summary: We introduce ClimaText, a dataset for sentence-based climate change topic detection.
We find that popular keyword-based models are not adequate for such a complex and evolving task.
Our analysis reveals a great potential for improvement in several directions.
- Score: 2.9767565026354186
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Climate change communication in the mass media and other textual sources may
affect and shape public perception. Extracting climate change information from
these sources is an important task, e.g., for filtering content and
e-discovery, sentiment analysis, automatic summarization, question-answering,
and fact-checking. However, automating this process is a challenge, as climate
change is a complex, fast-moving, and often ambiguous topic with scarce
resources for popular text-based AI tasks. In this paper, we introduce
ClimaText, a dataset for sentence-based climate change topic
detection, which we make publicly available. We explore different approaches to
identify the climate change topic in various text sources. We find that popular
keyword-based models are not adequate for such a complex and evolving task.
Context-based algorithms like BERT \cite{devlin2018bert} can detect, in
addition to many trivial cases, a variety of complex and implicit topic
patterns. Nevertheless, our analysis reveals a great potential for improvement
in several directions, such as capturing the discussion on indirect
effects of climate change. Hence, we hope this work can serve as a good
starting point for further research on this topic.
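To illustrate why keyword matching can fall short for sentence-level topic detection, here is a minimal sketch of a keyword-based detector. The keyword list, the function name `is_climate_sentence`, and the two example sentences are assumptions made for illustration; they are not taken from the paper or the dataset.

```python
import re

# Hypothetical keyword list, chosen for illustration only (not from the paper).
CLIMATE_KEYWORDS = (
    "climate change",
    "global warming",
    "greenhouse gas",
    "carbon emissions",
)


def is_climate_sentence(sentence: str, keywords=CLIMATE_KEYWORDS) -> bool:
    """Return True if any keyword phrase occurs in the sentence (case-insensitive)."""
    text = sentence.lower()
    # Word boundaries prevent matches inside longer, unrelated words.
    return any(re.search(r"\b" + re.escape(k) + r"\b", text) for k in keywords)


# Illustrative sentences (assumed, not drawn from ClimaText).
explicit = "Global warming is accelerating the loss of Arctic sea ice."
implicit = "Coastal towns are rebuilding sea walls as storm surges grow stronger every decade."

print(is_climate_sentence(explicit))  # True: contains the phrase "global warming"
print(is_climate_sentence(implicit))  # False: no listed keyword appears
```

The second sentence could plausibly be read as discussing an indirect effect of climate change, yet it contains none of the listed phrases, so a fixed keyword list misses it. A context-based classifier is one way to address cases like this.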
Related papers
- Analyzing Regional Impacts of Climate Change using Natural Language
Processing Techniques [0.9387233631570752]
We use BERT (Bidirectional Encoder Representations from Transformers) for Named Entity Recognition (NER) to identify specific geographies within the climate literature.
We conduct region-specific climate trend analyses to pinpoint the predominant themes or concerns related to climate change within a particular area.
These in-depth examinations of location-specific climate data enable the creation of more customized policy-making, adaptation, and mitigation strategies.
arXiv Detail & Related papers (2024-01-11T16:44:59Z)
- Comparing Data-Driven and Mechanistic Models for Predicting Phenology in
Deciduous Broadleaf Forests [47.285748922842444]
We train a deep neural network to predict a phenological index from meteorological time series.
We find that this approach outperforms traditional process-based models.
arXiv Detail & Related papers (2024-01-08T15:29:23Z)
- ClimateNLP: Analyzing Public Sentiment Towards Climate Change Using
Natural Language Processing [0.0]
This paper employs natural language processing (NLP) techniques to analyze climate change discourse and quantify the sentiment of climate change-related tweets.
The objective is to discern the sentiment individuals express and uncover patterns in public opinion concerning climate change.
arXiv Detail & Related papers (2023-10-12T07:48:50Z)
- Using Natural Language Processing and Networks to Automate Structured Literature Reviews: An Application to Farmers Climate Change Adaptation [0.0]
This work aims to use Natural Language Processing sensibly by extracting relations between variables and synthesizing the findings using networks.
As an example, we apply our methodology to the analysis of farmers' adaptation to climate change.
Results show that the use of Natural Language Processing together with networks in a descriptive manner offers a fast and interpretable way to synthesize literature review findings.
arXiv Detail & Related papers (2023-06-16T10:05:47Z)
- Climate Change & Computer Audition: A Call to Action and Overview on
Audio Intelligence to Help Save the Planet [98.97255654573662]
This work provides an overview of areas in which audio intelligence can contribute to overcoming climate-related challenges.
We categorise potential computer audition applications according to the five elements of earth, water, air, fire, and aether.
arXiv Detail & Related papers (2022-03-10T13:32:31Z)
- Trend and Thoughts: Understanding Climate Change Concern using Machine
Learning and Social Media Data [3.7384509727711923]
We constructed a massive climate change Twitter dataset and conducted comprehensive analysis using machine learning.
By conducting topic modeling and natural language processing, we show the relationship between the number of tweets about climate change and major climate events.
Our dataset was published on Kaggle and can be used in further research.
arXiv Detail & Related papers (2021-11-06T19:59:03Z)
- Unsupervised Summarization for Chat Logs with Topic-Oriented Ranking and
Context-Aware Auto-Encoders [59.038157066874255]
We propose a novel framework called RankAE to perform chat summarization without employing manually labeled data.
RankAE consists of a topic-oriented ranking strategy that selects topic utterances according to centrality and diversity simultaneously.
A denoising auto-encoder is designed to generate succinct but context-informative summaries based on the selected utterances.
arXiv Detail & Related papers (2020-12-14T07:31:17Z)
- Analyzing Sustainability Reports Using Natural Language Processing [68.8204255655161]
In recent years, companies have increasingly been aiming to both mitigate their environmental impact and adapt to the changing climate context.
This is reported via increasingly exhaustive reports, which cover many types of climate risks and exposures under the umbrella of Environmental, Social, and Governance (ESG).
We present this tool and the methodology that we used to develop it in the present article.
arXiv Detail & Related papers (2020-11-03T21:22:42Z)
- From Talk to Action with Accountability: Monitoring the Public
Discussion of Policy Makers with Deep Neural Networks and Topic Modelling [0.0]
We propose a multi-source topic aggregation system (MuSTAS).
MuSTAS processes policy makers' speech and rhetoric from several publicly available sources into an easily digestible topic summary.
This topic digest will serve the general public and civil society in assessing where, how, and when politicians talk about climate and climate policies.
arXiv Detail & Related papers (2020-10-16T12:21:01Z)
- Inquisitive Question Generation for High Level Text Comprehension [60.21497846332531]
We introduce INQUISITIVE, a dataset of 19K questions that are elicited while a person is reading through a document.
We show that readers engage in a series of pragmatic strategies to seek information.
We evaluate question generation models based on GPT-2 and show that our model is able to generate reasonable questions.
arXiv Detail & Related papers (2020-10-04T19:03:39Z)
- Detecting and Classifying Malevolent Dialogue Responses: Taxonomy, Data
and Methodology [68.8836704199096]
Corpus-based conversational interfaces are able to generate more diverse and natural responses than template-based or retrieval-based agents.
With the increased generative capacity of corpus-based conversational agents comes the need to classify and filter out malevolent responses.
Previous studies on the topic of recognizing and classifying inappropriate content are mostly focused on a certain category of malevolence.
arXiv Detail & Related papers (2020-08-21T22:43:27Z)