Putting Context in Context: the Impact of Discussion Structure on Text
Classification
- URL: http://arxiv.org/abs/2402.02975v1
- Date: Mon, 5 Feb 2024 12:56:22 GMT
- Title: Putting Context in Context: the Impact of Discussion Structure on Text
Classification
- Authors: Nicolò Penzo, Antonio Longa, Bruno Lepri, Sara Tonelli, Marco
Guerini
- Abstract summary: We propose a series of experiments on a large dataset for stance detection in English.
We evaluate the contribution of different types of contextual information.
We show that structural information can be highly beneficial to text classification but only under certain circumstances.
- Score: 13.15873889847739
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current text classification approaches usually focus on the content to be
classified. Contextual aspects (both linguistic and extra-linguistic) are
usually neglected, even in tasks based on online discussions. Still, in many
cases the multi-party and multi-turn nature of the context from which these
elements are selected can be fruitfully exploited. In this work, we propose a
series of experiments on a large dataset for stance detection in English, in
which we evaluate the contribution of different types of contextual
information, i.e. linguistic, structural and temporal, by feeding them as
natural language input into a transformer-based model. We also experiment with
different amounts of training data and analyse the topology of local discussion
networks in a privacy-compliant way. Results show that structural information
can be highly beneficial to text classification but only under certain
circumstances (e.g. depending on the amount of training data and on discussion
chain complexity). Indeed, we show that contextual information on smaller
datasets from other classification tasks does not yield significant
improvements. Our framework, based on local discussion networks, allows the
integration of structural information, while minimising user profiling, thus
preserving their privacy.
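The abstract describes feeding structural and temporal context into a transformer as natural language while minimising user profiling. A minimal sketch of what such privacy-preserving context serialization could look like; the Message fields, placeholder scheme, and separator token are illustrative assumptions, not the authors' actual implementation:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    author: str               # raw user id (never exposed in the model input)
    text: str
    reply_to: Optional[int]   # index of the parent message, None for the root
    timestamp: str            # ISO-8601; kept for temporal features (unused here)


def serialize_context(chain: list[Message], target_idx: int) -> str:
    """Linearize a discussion chain into a single natural-language string.

    Authors are replaced with anonymous placeholders (USER1, USER2, ...)
    so no profiling information leaks into the model input; only the local
    reply structure is preserved.
    """
    placeholders: dict[str, str] = {}

    def anon(author: str) -> str:
        if author not in placeholders:
            placeholders[author] = f"USER{len(placeholders) + 1}"
        return placeholders[author]

    parts = []
    for i, msg in enumerate(chain):
        rel = ("starts the discussion" if msg.reply_to is None
               else f"replies to {anon(chain[msg.reply_to].author)}")
        marker = " [TARGET]" if i == target_idx else ""
        parts.append(f"{anon(msg.author)} {rel}:{marker} {msg.text}")
    return " </s> ".join(parts)
```

For a two-message chain where `bob` replies to `alice` and the reply is the classification target, this produces `"USER1 starts the discussion: Vaccines work. </s> USER2 replies to USER1: [TARGET] I disagree."`, which can then be tokenized as ordinary text for a transformer classifier.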
Related papers
- Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is the ability to understand instructions written in natural language (prompts).
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
- An Information-Theoretic Approach to Analyze NLP Classification Tasks [3.273958158967657]
This work provides an information-theoretic framework to analyse the influence of inputs for text classification tasks.
Each text element has two components: an associated semantic meaning and a linguistic realization.
Multiple-choice reading comprehension (MCRC) and sentiment classification (SC) are selected to showcase the framework.
arXiv Detail & Related papers (2024-02-01T19:49:44Z)
- Multi-Dimensional Evaluation of Text Summarization with In-Context Learning [79.02280189976562]
In this paper, we study the efficacy of large language models as multi-dimensional evaluators using in-context learning.
Our experiments show that in-context learning-based evaluators are competitive with learned evaluation frameworks for the task of text summarization.
We then analyze the effects of factors such as the selection and number of in-context examples on performance.
arXiv Detail & Related papers (2023-06-01T23:27:49Z)
- Idioms, Probing and Dangerous Things: Towards Structural Probing for Idiomaticity in Vector Space [2.5288257442251107]
The goal of this paper is to learn more about how idiomatic information is structurally encoded in embeddings.
We perform a comparative probing study of static (GloVe) and contextual (BERT) embeddings.
Our experiments indicate that both encode some idiomatic information to varying degrees, but yield conflicting evidence as to whether idiomaticity is encoded in the vector norm.
arXiv Detail & Related papers (2023-04-27T17:06:20Z)
- Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding [143.5927158318524]
Temporal grounding is the task of locating a specific segment from an untrimmed video according to a query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We argue that the inherent structured semantics inside the videos and language is the crucial factor to achieve compositional generalization.
arXiv Detail & Related papers (2023-01-22T08:02:23Z)
- Contextual information integration for stance detection via cross-attention [59.662413798388485]
Stance detection deals with identifying an author's stance towards a target.
Most existing stance detection models are limited because they do not consider relevant contextual information.
We propose an approach to integrate contextual information as text.
arXiv Detail & Related papers (2022-11-03T15:04:29Z)
- A Knowledge-Enhanced Adversarial Model for Cross-lingual Structured Sentiment Analysis [31.05169054736711]
Cross-lingual structured sentiment analysis aims to transfer knowledge from a source language to a target one.
We propose a Knowledge-Enhanced Adversarial Model (KEAM) with both implicit distributed and explicit structural knowledge.
We conduct experiments on five datasets and compare KEAM with both supervised and unsupervised methods.
arXiv Detail & Related papers (2022-05-31T03:07:51Z)
- Open-set Text Recognition via Character-Context Decoupling [16.2819099852748]
The open-set text recognition task is an emerging challenge that requires an extra capability to recognize novel characters during evaluation.
We argue that a major cause of the limited performance for current methods is the confounding effect of contextual information over the visual information of individual characters.
A Character-Context Decoupling framework is proposed to alleviate this problem by separating contextual information and character-visual information.
arXiv Detail & Related papers (2022-04-12T05:43:46Z)
- Sentiment analysis in tweets: an assessment study from classical to modern text representation models [59.107260266206445]
Short texts published on Twitter have attracted significant attention as a rich source of information.
Their inherent characteristics, such as their informal and noisy linguistic style, remain challenging for many natural language processing (NLP) tasks.
This study presents an assessment of existing language models for distinguishing the sentiment expressed in tweets, using a rich collection of 22 datasets.
arXiv Detail & Related papers (2021-05-29T21:05:28Z)
- Contextual Argument Component Classification for Class Discussions [1.0152838128195467]
We show how two different types of contextual information, local discourse context and speaker context, can be incorporated into a computational model for classifying argument components.
We find that both context types can improve performance, although the improvements are dependent on context size and position.
arXiv Detail & Related papers (2021-02-20T08:48:07Z)
- A Comparative Study on Structural and Semantic Properties of Sentence Embeddings [77.34726150561087]
We propose a set of experiments using a widely-used large-scale data set for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.