Towards Improved Model Design for Authorship Identification: A Survey on
Writing Style Understanding
- URL: http://arxiv.org/abs/2009.14445v1
- Date: Wed, 30 Sep 2020 05:17:42 GMT
- Title: Towards Improved Model Design for Authorship Identification: A Survey on
Writing Style Understanding
- Authors: Weicheng Ma, Ruibo Liu, Lili Wang and Soroush Vosoughi
- Abstract summary: Authorship identification tasks rely heavily on linguistic styles.
Traditional machine learning methods based on handcrafted feature sets are already approaching their performance limits.
We describe outstanding methods in style-related tasks in general and analyze how they are used in combination in the top-performing models.
- Score: 30.642840676899734
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Authorship identification tasks, which rely heavily on linguistic styles,
have always been an important part of Natural Language Understanding (NLU)
research. While other tasks based on linguistic style understanding benefit
from deep learning methods, these methods have not performed as well as
traditional machine learning methods in many authorship-based tasks. As these
tasks become increasingly challenging, however, traditional machine learning
methods based on handcrafted feature sets are already approaching their
performance limits. Thus, to inspire future applications of deep
learning methods in authorship-based tasks in ways that benefit the extraction
of stylistic features, we survey authorship-based tasks and other tasks related
to writing style understanding. We first describe our survey results on the
current state of research in both sets of tasks and summarize existing
achievements and problems in authorship-related tasks. We then describe
outstanding methods in style-related tasks in general and analyze how they are
used in combination in the top-performing models. We are optimistic about the
applicability of these models to authorship-based tasks and hope our survey
will help advance research in this field.
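To make the contrast concrete, the sketch below illustrates the kind of handcrafted stylometric feature set that traditional authorship-identification methods rely on. It is a minimal illustrative example assuming standard stylometry features (function-word rates, average sentence length, character trigrams); it is not the feature set of any paper in this survey.
```python
# Minimal illustrative sketch, not any surveyed paper's feature set:
# handcrafted stylometric features of the kind traditionally fed to
# classifiers (e.g., SVMs) for authorship identification.
import re
from collections import Counter

# A few common English function words; real stylometry uses hundreds.
FUNCTION_WORDS = ["the", "of", "and", "to", "in", "that", "is", "was"]

def stylometric_features(text: str) -> dict:
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    counts = Counter(words)
    # Function-word relative frequencies: topic-independent style markers.
    feats = {f"fw_{w}": counts[w] / n_words for w in FUNCTION_WORDS}
    # Average sentence length in words.
    feats["avg_sentence_len"] = n_words / max(len(sentences), 1)
    # Frequencies of the most common character trigrams, another
    # classic authorship signal.
    trigrams = Counter(text[i:i + 3] for i in range(len(text) - 2))
    total = max(sum(trigrams.values()), 1)
    for gram, c in trigrams.most_common(10):
        feats[f"tri_{gram}"] = c / total
    return feats

print(stylometric_features("The cat sat on the mat. The dog ran to the cat."))
```
Feature vectors like these are what the abstract means by handcrafted feature sets whose discriminative power is plateauing as tasks grow harder.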
Related papers
- A Survey of Imitation Learning Methods, Environments and Metrics [9.967130899041651]
Imitation learning is an approach in which an agent learns how to execute a task by trying to mimic how one or more teachers perform it.
This learning approach offers a compromise between the time it takes to learn a new task and the effort needed to collect teacher samples for the agent.
The field of imitation learning has received much attention from researchers in recent years, resulting in many new methods and applications. A minimal sketch of one of the simplest such methods, behavioral cloning, follows this entry.
arXiv Detail & Related papers (2024-04-30T11:13:23Z)
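The following is a generic sketch of behavioral cloning, which reduces imitation learning to supervised learning on teacher (state, action) pairs. The network size and toy data are assumptions for illustration; this is not code from the surveyed paper.
```python
# Generic behavioral-cloning sketch: fit a policy to teacher
# (state, action) pairs by supervised learning. Illustrative only.
import torch
import torch.nn as nn

def behavioral_cloning(states: torch.Tensor, actions: torch.Tensor,
                       n_actions: int, epochs: int = 100) -> nn.Module:
    policy = nn.Sequential(nn.Linear(states.size(1), 64), nn.ReLU(),
                           nn.Linear(64, n_actions))
    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(policy(states), actions)  # match teacher actions
        loss.backward()
        opt.step()
    return policy

# Toy usage: 32 teacher demonstrations with 8-dim states, 4 actions.
demo_s, demo_a = torch.randn(32, 8), torch.randint(0, 4, (32,))
policy = behavioral_cloning(demo_s, demo_a, n_actions=4)
```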
- Scalable Language Model with Generalized Continual Learning [58.700439919096155]
The Joint Adaptive Re-Parameterization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks.
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z)
- Creating a Trajectory for Code Writing: Algorithmic Reasoning Tasks [0.923607423080658]
This paper describes instruments and the machine learning models used for validating them.
We have used the data collected in an introductory programming course in the penultimate week of the semester.
Preliminary research suggests ART-type instruments can be combined with specific machine learning models to act as an effective learning trajectory.
arXiv Detail & Related papers (2024-04-03T05:07:01Z)
- Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding [52.723297744257536]
Pre-trained language models (LMs) have shown effectiveness in scientific literature understanding tasks.
We propose a multi-task contrastive learning framework, SciMult, to facilitate common knowledge sharing across different literature understanding tasks; a generic sketch of such a contrastive objective follows this entry.
arXiv Detail & Related papers (2023-05-23T16:47:22Z)
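As promised above, here is a generic in-batch contrastive (InfoNCE-style) objective of the kind such multi-task contrastive frameworks build on. This is the textbook formulation under the usual paired-view assumption, not SciMult's actual loss or code.
```python
# Generic InfoNCE-style in-batch contrastive loss; illustrative only.
# Row i of `keys` is assumed to be the positive for row i of `queries`;
# all other rows in the batch serve as negatives.
import torch
import torch.nn.functional as F

def info_nce(queries: torch.Tensor, keys: torch.Tensor, tau: float = 0.07):
    q = F.normalize(queries, dim=-1)            # unit-norm embeddings
    k = F.normalize(keys, dim=-1)
    logits = q @ k.t() / tau                    # scaled cosine similarities
    labels = torch.arange(q.size(0))            # positives on the diagonal
    return F.cross_entropy(logits, labels)

# Toy usage: a batch of 8 paired 128-dim embeddings.
loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))
```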
- Coarse-to-Fine: Hierarchical Multi-task Learning for Natural Language Understanding [51.31622274823167]
We propose a hierarchical framework with a coarse-to-fine paradigm, with the bottom level shared across all the tasks, the mid-level divided into different groups, and the top level assigned to each of the tasks.
This allows our model to learn basic language properties from all tasks, boost performance on relevant tasks, and reduce the negative impact from irrelevant tasks. A schematic sketch of this shared/grouped/task-specific structure follows this entry.
arXiv Detail & Related papers (2022-08-19T02:46:20Z)
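Below is a hypothetical minimal rendering of the coarse-to-fine structure described above: a bottom layer shared by all tasks, mid-level layers shared within task groups, and task-specific heads on top. Layer types, sizes, and the example task grouping are assumptions for illustration, not the paper's architecture code.
```python
# Hypothetical sketch of a coarse-to-fine multi-task model: bottom
# shared by all tasks, mid shared within task groups, top per task.
# Sizes and the task grouping are invented for illustration.
import torch
import torch.nn as nn

class CoarseToFineModel(nn.Module):
    def __init__(self, dim: int, task_groups: dict, n_classes: int):
        super().__init__()
        self.shared = nn.Linear(dim, dim)                   # bottom: all tasks
        self.group_layers = nn.ModuleDict(
            {g: nn.Linear(dim, dim) for g in task_groups})  # mid: per group
        self.heads = nn.ModuleDict(
            {t: nn.Linear(dim, n_classes)                   # top: per task
             for tasks in task_groups.values() for t in tasks})
        self.task_to_group = {t: g for g, ts in task_groups.items() for t in ts}

    def forward(self, x: torch.Tensor, task: str) -> torch.Tensor:
        h = torch.relu(self.shared(x))
        h = torch.relu(self.group_layers[self.task_to_group[task]](h))
        return self.heads[task](h)

# Toy usage with two hypothetical task groups.
model = CoarseToFineModel(64, {"nli": ["mnli", "rte"], "sim": ["sts"]}, n_classes=3)
logits = model(torch.randn(4, 64), task="rte")
```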
- Context-Aware Language Modeling for Goal-Oriented Dialogue Systems [84.65707332816353]
We formulate goal-oriented dialogue as a partially observed Markov decision process; the standard POMDP tuple is recalled after this entry.
We derive a simple and effective method to finetune language models in a goal-aware way.
We evaluate our method on a practical flight-booking task using AirDialogue.
arXiv Detail & Related papers (2022-04-18T17:23:11Z)
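For reference, the partially observed Markov decision process invoked in the entry above is the standard tuple below; this is the textbook definition, not the paper's specific instantiation. In goal-oriented dialogue, the user's goal is typically the unobserved part of the state and utterances serve as observations.
```latex
% Standard POMDP tuple (textbook form):
%   S: states            A: actions
%   T(s' | s, a): transition distribution
%   R(s, a): reward      \Omega: observations
%   O(o | s', a): observation distribution
%   \gamma \in [0, 1): discount factor
\mathcal{M} = (S, A, T, R, \Omega, O, \gamma)
```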
- Survey on Automated Short Answer Grading with Deep Learning: from Word Embeddings to Transformers [5.968260239320591]
Automated short answer grading (ASAG) has gained attention in education as a means to scale educational tasks to the growing number of students.
Recent progress in Natural Language Processing and Machine Learning has largely influenced the field of ASAG.
arXiv Detail & Related papers (2022-03-11T13:47:08Z)
- Analyzing the Limits of Self-Supervision in Handling Bias in Language [52.26068057260399]
We evaluate how well language models capture the semantics of four tasks for bias: diagnosis, identification, extraction and rephrasing.
Our analyses indicate that language models are capable of performing these tasks to widely varying degrees across different bias dimensions, such as gender and political affiliation.
arXiv Detail & Related papers (2021-12-16T05:36:08Z)
- Positioning yourself in the maze of Neural Text Generation: A Task-Agnostic Survey [54.34370423151014]
This paper surveys the components of modeling approaches relaying task impacts across various generation tasks such as storytelling, summarization, and translation.
We present an abstraction of the imperative techniques with respect to learning paradigms, pretraining, modeling approaches, decoding and the key challenges outstanding in the field in each of them.
arXiv Detail & Related papers (2020-10-14T17:54:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.