TraSE: Towards Tackling Authorial Style from a Cognitive Science
Perspective
- URL: http://arxiv.org/abs/2206.10706v2
- Date: Wed, 6 Dec 2023 02:49:49 GMT
- Title: TraSE: Towards Tackling Authorial Style from a Cognitive Science
Perspective
- Authors: Ronald Wilson, Avanti Bhandarkar and Damon Woodard
- Abstract summary: Authorship attribution experiments with over 27,000 authors and 1.4 million samples in a cross-domain scenario resulted in 90% attribution accuracy.
A qualitative analysis is performed on TraSE using physical human characteristics, like age, to validate its claim on capturing cognitive traits.
- Score: 4.123763595394021
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stylistic analysis of text is a key task in research areas ranging from
authorship attribution to forensic analysis and personality profiling. The
existing approaches for stylistic analysis are plagued by issues like topic
influence, lack of discriminability for a large number of authors and the
requirement for large amounts of diverse data. In this paper, the sources of
these issues are identified along with the necessity for a cognitive
perspective on authorial style in addressing them. A novel feature
representation, called Trajectory-based Style Estimation (TraSE), is introduced
to support this purpose. Authorship attribution experiments with over 27,000
authors and 1.4 million samples in a cross-domain scenario resulted in 90%
attribution accuracy, suggesting that the feature representation is immune to
such negative influences and an excellent candidate for stylistic analysis.
Finally, a qualitative analysis is performed on TraSE using physical human
characteristics, like age, to validate its claim on capturing cognitive traits.
Related papers
- Enhancing Representation Generalization in Authorship Identification [9.148691357200216]
Authorship identification ascertains the authorship of texts whose origins remain undisclosed.
Modern authorship identification methods have proven effective in distinguishing authorial styles.
The presented work addresses the challenge of enhancing the generalization of stylistic representations in authorship identification.
arXiv Detail & Related papers (2023-09-30T17:11:00Z)
- Sensitivity, Performance, Robustness: Deconstructing the Effect of
Sociodemographic Prompting [64.80538055623842]
Sociodemographic prompting is a technique that steers the output of prompt-based models towards answers that humans with specific sociodemographic profiles would give.
We show that sociodemographic information affects model predictions and can be beneficial for improving zero-shot learning in subjective NLP tasks.
arXiv Detail & Related papers (2023-09-13T15:42:06Z)
- PART: Pre-trained Authorship Representation Transformer [64.78260098263489]
Authors writing documents imprint identifying information within their texts: vocabulary, registry, punctuation, misspellings, or even emoji usage.
Previous works use hand-crafted features or classification tasks to train their authorship models, leading to poor performance on out-of-domain authors.
We propose a contrastively trained model fit to learn authorship embeddings instead of semantics.
arXiv Detail & Related papers (2022-09-30T11:08:39Z)
- An Interdisciplinary Perspective on Evaluation and Experimental Design
for Visual Text Analytics: Position Paper [24.586485898038312]
In this paper, we focus on the issues of evaluating visual text analytics approaches.
We identify four key groups of challenges for evaluating visual text analytics approaches.
arXiv Detail & Related papers (2022-09-23T11:47:37Z)
- Mitigating Bias in Facial Analysis Systems by Incorporating Label
Diversity [4.089080285684415]
We introduce a novel learning method that combines subjective human-based labels and objective annotations based on mathematical definitions of facial traits.
Our method successfully mitigates unintended biases, while maintaining significant accuracy on the downstream task.
arXiv Detail & Related papers (2022-04-13T13:17:27Z)
- A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
- Artificial Text Detection via Examining the Topology of Attention Maps [58.46367297712477]
We propose three novel types of interpretable topological features for this task based on Topological Data Analysis (TDA).
We empirically show that the features derived from the BERT model outperform count- and neural-based baselines by up to 10% on three common datasets.
The probing analysis of the features reveals their sensitivity to the surface and syntactic properties.
arXiv Detail & Related papers (2021-09-10T12:13:45Z)
- The Sensitivity of Word Embeddings-based Author Detection Models to
Semantic-preserving Adversarial Perturbations [3.7552532139404797]
Authorship analysis is an important subject in the field of natural language processing.
This paper explores the limitations and sensitiveness of established approaches to adversarial manipulations of inputs.
arXiv Detail & Related papers (2021-02-23T19:55:45Z)
- Weakly-Supervised Aspect-Based Sentiment Analysis via Joint
Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn <sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z)
- Survey on Visual Sentiment Analysis [87.20223213370004]
This paper reviews pertinent publications and tries to present an exhaustive overview of the field of Visual Sentiment Analysis.
The paper also describes principles of design of general Visual Sentiment Analysis systems from three main points of view.
A formalization of the problem is discussed, considering different levels of granularity, as well as the components that can affect the sentiment toward an image in different ways.
arXiv Detail & Related papers (2020-04-24T10:15:22Z)
- A white-box analysis on the writer-independent dichotomy transformation
applied to offline handwritten signature verification [13.751795751395091]
A writer-independent (WI) framework is used to train a single model to perform signature verification for all writers.
In WI systems, a single model is trained to perform signature verification for all writers from a dissimilarity space generated by the dichotomy transformation.
We present a white-box analysis of this approach highlighting how it handles the challenges, the dynamic selection of references through fusion function, and its application for transfer learning.
arXiv Detail & Related papers (2020-04-03T19:59:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.