Efficient Aspect-Based Summarization of Climate Change Reports with Small Language Models
- URL: http://arxiv.org/abs/2411.14272v1
- Date: Thu, 21 Nov 2024 16:28:32 GMT
- Title: Efficient Aspect-Based Summarization of Climate Change Reports with Small Language Models
- Authors: Iacopo Ghinassi, Leonardo Catalano, Tommaso Colella,
- Abstract summary: We release a new dataset for Aspect-Based Summarization (ABS) of Climate Change reports.
We employ different Large Language Models (LLMs) and so-called Small Language Models (SLMs) to tackle this problem in an unsupervised way.
We also show that SLMs are not significantly worse at this task while leading to a reduced carbon footprint.
- Score: 0.0
- Abstract: The use of Natural Language Processing (NLP) to help decision-makers with Climate Change action has recently been highlighted as a use case aligning with a broader drive towards NLP technologies for social good. In this context, Aspect-Based Summarization (ABS) systems that extract and summarize relevant information are particularly useful, as they give stakeholders a convenient way of finding relevant information in expert-curated reports. In this work, we release a new dataset for ABS of Climate Change reports, and we employ different Large Language Models (LLMs) and so-called Small Language Models (SLMs) to tackle this problem in an unsupervised way. We also show that SLMs are not significantly worse at this task while leading to a reduced carbon footprint; we do so by applying, for the first time, an existing framework that considers both energy efficiency and task performance to the evaluation of zero-shot generative models for ABS. Overall, our results show that modern language models, both big and small, can effectively tackle ABS for Climate Change reports, but more research is needed when the problem is framed as a Retrieval Augmented Generation (RAG) problem. Our work and dataset will help foster efforts in this direction.
Related papers
- Enhancing SLM via ChatGPT and Dataset Augmentation [0.3844771221441211]
We employ knowledge distillation-based techniques and synthetic dataset augmentation to bridge the performance gap between large language models (LLMs) and small language models (SLMs).
Our methods involve two forms of rationale generation--information extraction and informed reasoning--to enrich the ANLI dataset.
Our findings reveal that the incorporation of synthetic rationales significantly improves the model's ability to comprehend natural language, leading to 1.3% and 2.3% higher classification accuracy, respectively, on the ANLI dataset.
arXiv Detail & Related papers (2024-09-19T09:24:36Z)
- InkubaLM: A small language model for low-resource African languages [9.426968756845389]
InkubaLM is a small language model with 0.4 billion parameters.
It achieves performance comparable to models with significantly larger parameter counts.
It demonstrates remarkable consistency across multiple languages.
arXiv Detail & Related papers (2024-08-30T05:42:31Z)
- Reporting and Analysing the Environmental Impact of Language Models on the Example of Commonsense Question Answering with External Knowledge [7.419725234099729]
ChatGPT sparked social interest in Large Language Models (LLMs).
LLMs demand substantial computational resources and are very costly to train, both financially and environmentally.
In this study, we infused the T5 LLM with external knowledge and fine-tuned the model for a question-answering task.
arXiv Detail & Related papers (2024-07-24T16:16:16Z)
- Large Language Models for Next Point-of-Interest Recommendation [53.93503291553005]
Location-Based Social Network (LBSN) data is often used for the next Point of Interest (POI) recommendation task.
One frequently disregarded challenge is how to effectively use the abundant contextual information present in LBSN data.
We propose a framework that uses pretrained Large Language Models (LLMs) to tackle this challenge.
arXiv Detail & Related papers (2024-04-19T13:28:36Z)
- Assessing Privacy Risks in Language Models: A Case Study on Summarization Tasks [65.21536453075275]
We focus on the summarization task and investigate the membership inference (MI) attack.
We exploit text similarity and the model's resistance to document modifications as potential MI signals.
We discuss several safeguards for training summarization models to protect against MI attacks and discuss the inherent trade-off between privacy and utility.
arXiv Detail & Related papers (2023-10-20T05:44:39Z)
- Data-efficient, Explainable and Safe Box Manipulation: Illustrating the Advantages of Physical Priors in Model-Predictive Control [0.0]
We model a payload manipulation problem based on a real robotic system, and show that leveraging prior knowledge about the dynamics of the environment in an MPC framework can lead to improvements in explainability, safety and data-efficiency.
arXiv Detail & Related papers (2023-03-02T20:28:19Z)
- Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning [104.58874584354787]
In recent years, pre-trained large language models (LLMs) have demonstrated remarkable efficiency in achieving an inference-time few-shot learning capability known as in-context learning.
This study aims to examine the in-context learning phenomenon through a Bayesian lens, viewing real-world LLMs as latent variable models.
arXiv Detail & Related papers (2023-01-27T18:59:01Z)
- Privacy Adhering Machine Un-learning in NLP [66.17039929803933]
In the real world, industry uses Machine Learning to build models on user data.
Mandates to remove user data require effort in terms of both data and model retraining.
Continuous removal of data and model retraining steps do not scale.
We propose Machine Unlearning to tackle this challenge.
arXiv Detail & Related papers (2022-12-19T16:06:45Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- Improving Classifier Training Efficiency for Automatic Cyberbullying Detection with Feature Density [58.64907136562178]
We study the effectiveness of Feature Density (FD) using different linguistically-backed feature preprocessing methods.
We hypothesise that estimating dataset complexity allows for the reduction of the number of required experiments.
The difference in linguistic complexity of datasets allows us to additionally discuss the efficacy of linguistically-backed word preprocessing.
arXiv Detail & Related papers (2021-11-02T15:48:28Z)