Quantitative Discourse Cohesion Analysis of Scientific Scholarly Texts using Multilayer Networks
- URL: http://arxiv.org/abs/2205.07532v1
- Date: Mon, 16 May 2022 09:10:41 GMT
- Title: Quantitative Discourse Cohesion Analysis of Scientific Scholarly Texts using Multilayer Networks
- Authors: Vasudha Bhatnagar, Swagata Duari, S.K. Gupta
- Abstract summary: We aim to computationally analyze the discourse cohesion in scientific scholarly texts using multilayer network representation.
We design section-level and document-level metrics to assess the extent of lexical cohesion in text.
We present an analytical framework, CHIAA (CHeck It Again, Author), to provide pointers to the author for potential improvements in the manuscript.
- Score: 10.556468838821338
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Discourse cohesion facilitates text comprehension and helps the reader form a
coherent narrative. In this study, we aim to computationally analyze the
discourse cohesion in scientific scholarly texts using multilayer network
representation and quantify the writing quality of the document. Exploiting the
hierarchical structure of scientific scholarly texts, we design section-level
and document-level metrics to assess the extent of lexical cohesion in text. We
use a publicly available dataset along with a curated set of contrasting
examples to validate the proposed metrics by comparing them against select
indices computed using existing cohesion analysis tools. We observe that the
proposed metrics correlate as expected with the existing cohesion indices.
We also present an analytical framework, CHIAA (CHeck It Again, Author), to
provide pointers to the author for potential improvements in the manuscript
with the help of the section-level and document-level metrics. The proposed
CHIAA framework furnishes a clear and precise prescription to the author for
improving the writing by localizing regions of text with cohesion gaps. We
demonstrate the efficacy of the CHIAA framework using succinct examples of
cohesion-deficient text excerpts from the experimental dataset.
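To make the approach concrete, below is a minimal sketch of one way such a multilayer lexical cohesion analysis could be set up: each section is a layer, sentences are nodes, intra-layer links join sentences that share content words, and inter-layer overlap connects sections. The tokenization, the metric definitions, and the 0.3 gap threshold are illustrative assumptions, not the authors' exact formulation.

```python
# Illustrative sketch only: a tiny multilayer lexical network in the spirit of the
# paper, not its exact construction. Sections are layers, sentences are nodes,
# and sentences are linked when they share content words.
from itertools import combinations

STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "we", "for", "with"}

def content_words(sentence):
    """Lowercase tokens minus a tiny stopword list; a real system would lemmatize."""
    return {w.strip(".,;:()").lower() for w in sentence.split()} - STOPWORDS

def section_cohesion(sentences):
    """Intra-layer cohesion: fraction of sentence pairs sharing a content word."""
    words = [content_words(s) for s in sentences]
    pairs = list(combinations(range(len(words)), 2))
    if not pairs:
        return 1.0
    return sum(1 for i, j in pairs if words[i] & words[j]) / len(pairs)

def inter_section_overlap(sections):
    """Inter-layer cohesion: average Jaccard overlap of section vocabularies."""
    vocab = {name: set().union(*map(content_words, sents))
             for name, sents in sections.items()}
    pairs = list(combinations(vocab, 2))
    if not pairs:
        return 1.0
    return sum(len(vocab[a] & vocab[b]) / max(len(vocab[a] | vocab[b]), 1)
               for a, b in pairs) / len(pairs)

def chiaa_style_report(sections, threshold=0.3):
    """Flag sections whose intra-layer cohesion falls below a (hypothetical) threshold."""
    section_scores = {name: section_cohesion(sents) for name, sents in sections.items()}
    return {
        "section_cohesion": section_scores,
        "document_inter_section_overlap": inter_section_overlap(sections),
        "cohesion_gaps": [n for n, s in section_scores.items() if s < threshold],
    }

if __name__ == "__main__":
    doc = {
        "Introduction": [
            "Discourse cohesion helps the reader form a coherent narrative.",
            "We analyze cohesion in scholarly texts with multilayer networks.",
        ],
        "Method": [
            "Each section of the manuscript becomes one layer of the network.",
            "Unrelated sentence about an entirely different topic.",
        ],
    }
    print(chiaa_style_report(doc))
```

In this toy example the "Method" layer is reported as a cohesion gap, which mirrors the kind of pointer to a weakly cohesive region that the CHIAA framework is described as giving the author.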
Related papers
- Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness [3.2925222641796554]
"pointer-guided segment ordering" (SO) is a novel pre-training technique aimed at enhancing the contextual understanding of paragraph-level text representations.
Our experiments show that pointer-guided pre-training significantly enhances the model's ability to understand complex document structures.
arXiv Detail & Related papers (2024-06-06T15:17:51Z) - Re3: A Holistic Framework and Dataset for Modeling Collaborative Document Revision [62.12545440385489]
We introduce Re3, a framework for joint analysis of collaborative document revision.
We present Re3-Sci, a large corpus of aligned scientific paper revisions manually labeled according to their action and intent.
We use the new data to provide first empirical insights into collaborative document revision in the academic domain.
arXiv Detail & Related papers (2024-05-31T21:19:09Z) - BBScore: A Brownian Bridge Based Metric for Assessing Text Coherence [20.507596002357655]
Coherent texts inherently manifest a sequential and cohesive interplay among sentences.
BBScore is a reference-free metric grounded in Brownian bridge theory for assessing text coherence; an illustrative sketch of the underlying idea appears after this list.
arXiv Detail & Related papers (2023-12-28T08:34:17Z) - Multi-Dimensional Evaluation of Text Summarization with In-Context
Learning [79.02280189976562]
In this paper, we study the efficacy of large language models as multi-dimensional evaluators using in-context learning.
Our experiments show that in-context learning-based evaluators are competitive with learned evaluation frameworks for the task of text summarization.
We then analyze the effects of factors such as the selection and number of in-context examples on performance.
arXiv Detail & Related papers (2023-06-01T23:27:49Z) - Improve Discourse Dependency Parsing with Contextualized Representations [28.916249926065273]
We propose to take advantage of transformers to encode contextualized representations of units at different levels.
Motivated by the observation of writing patterns commonly shared across articles, we propose a novel method that treats discourse relation identification as a sequence labelling task.
arXiv Detail & Related papers (2022-05-04T14:35:38Z) - Revise and Resubmit: An Intertextual Model of Text-based Collaboration
in Peer Review [52.359007622096684]
Peer review is a key component of the publishing process in most fields of science.
Existing NLP studies focus on the analysis of individual texts, whereas editorial assistance often requires modeling interactions between pairs of texts.
arXiv Detail & Related papers (2022-04-22T16:39:38Z) - Author Clustering and Topic Estimation for Short Texts [69.54017251622211]
We propose a novel model that expands on the Latent Dirichlet Allocation by modeling strong dependence among the words in the same document.
We also simultaneously cluster users, removing the need for post-hoc cluster estimation.
Our method performs as well as or better than traditional approaches to problems arising in short texts.
arXiv Detail & Related papers (2021-06-15T20:55:55Z) - TextEssence: A Tool for Interactive Analysis of Semantic Shifts Between
Corpora [14.844685568451833]
We introduce TextEssence, an interactive system designed to enable comparative analysis of corpora using embeddings.
TextEssence includes visual, neighbor-based, and similarity-based modes of embedding analysis in a lightweight, web-based interface.
arXiv Detail & Related papers (2021-03-19T21:26:28Z) - Hierarchical Bi-Directional Self-Attention Networks for Paper Review
Rating Recommendation [81.55533657694016]
We propose a Hierarchical bi-directional self-attention Network framework (HabNet) for paper review rating prediction and recommendation.
Specifically, we leverage the hierarchical structure of the paper reviews with three levels of encoders: a sentence encoder (level one), an intra-review encoder (level two), and an inter-review encoder (level three).
We are able to identify useful predictors to make the final acceptance decision, as well as to help discover the inconsistency between numerical review ratings and text sentiment conveyed by reviewers.
arXiv Detail & Related papers (2020-11-02T08:07:50Z) - Improving Text Generation Evaluation with Batch Centering and Tempered
Word Mover Distance [24.49032191669509]
We present two techniques for improving encoding representations for similarity metrics.
We show results for various BERT-backbone learned metrics, achieving state-of-the-art correlation with human ratings on several benchmarks.
arXiv Detail & Related papers (2020-10-13T03:46:25Z) - Multilevel Text Alignment with Cross-Document Attention [59.76351805607481]
Existing alignment methods operate at a single, predefined level.
We propose a new learning approach that equips previously established hierarchical attention encoders for representing documents with a cross-document attention component.
arXiv Detail & Related papers (2020-10-03T02:52:28Z)