A Computational Framework to Identify Self-Aspects in Text
- URL: http://arxiv.org/abs/2507.13115v1
- Date: Thu, 17 Jul 2025 13:31:04 GMT
- Title: A Computational Framework to Identify Self-Aspects in Text
- Authors: Jaya Caporusso, Matthew Purver, Senja Pollak
- Abstract summary: The Self is a multifaceted construct, and it is reflected in language. Many of the aspects of the Self align with psychological and other well-researched phenomena. This proposal introduces a plan to develop a computational framework to identify Self-aspects in text.
- Score: 9.187473897664105
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This Ph.D. proposal introduces a plan to develop a computational framework to identify Self-aspects in text. The Self is a multifaceted construct, and it is reflected in language. While it is described across disciplines like cognitive science and phenomenology, it remains underexplored in natural language processing (NLP). Many of the aspects of the Self align with psychological and other well-researched phenomena (e.g., those related to mental health), highlighting the need for systematic NLP-based analysis. In line with this, we plan to introduce an ontology of Self-aspects and a gold-standard annotated dataset. Using this foundation, we will develop and evaluate conventional discriminative models, generative large language models, and embedding-based retrieval approaches against four main criteria: interpretability, ground-truth adherence, accuracy, and computational efficiency. Top-performing models will be applied in case studies in mental health and empirical phenomenology.
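To make the embedding-based retrieval idea concrete, here is a minimal sketch that labels a sentence with the Self-aspect whose prototype description it most resembles. It assumes the sentence-transformers library; the aspect labels and descriptions are hypothetical placeholders, since the proposal's actual ontology is still to be developed.

```python
# Sketch of embedding-based retrieval for Self-aspect identification.
# Aspect labels and descriptions below are illustrative assumptions,
# not the proposal's ontology.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical Self-aspect prototypes.
aspects = {
    "bodily self": "statements about one's own body and physical sensations",
    "narrative self": "statements about one's life story, past, and future plans",
    "social self": "statements about one's roles and relations to other people",
}

aspect_names = list(aspects.keys())
prototype_vecs = model.encode(list(aspects.values()), normalize_embeddings=True)

def identify_self_aspect(sentence: str) -> str:
    """Return the aspect whose prototype embedding is closest in cosine similarity."""
    vec = model.encode([sentence], normalize_embeddings=True)[0]
    scores = prototype_vecs @ vec  # vectors are normalized, so dot = cosine
    return aspect_names[int(np.argmax(scores))]

print(identify_self_aspect("I keep retelling the story of how I left home."))
```

Retrieval of this kind needs no task-specific training, which is one reason it can be compared against discriminative and generative models on computational efficiency.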
Related papers
- Distinguishing AI-Generated and Human-Written Text Through Psycholinguistic Analysis [0.0]
This research specifically maps 31 distinct stylometric features to cognitive processes such as lexical retrieval, discourse planning, cognitive load management, and metacognitive self-monitoring. This framework contributes to the development of reliable tools aimed at preserving academic integrity in the era of generative AI.
arXiv Detail & Related papers (2025-05-03T12:06:53Z) - Learning from Impairment: Leveraging Insights from Clinical Linguistics in Language Modelling Research [1.544681800932596]
This position paper investigates the potential of integrating insights from language impairment research and its clinical treatment to develop human-inspired learning strategies and evaluation frameworks for language models (LMs). We inspect the theoretical underpinnings of some influential linguistically motivated training approaches derived from neurolinguistics and, particularly, aphasiology, aimed at enhancing the recovery and generalization of linguistic skills in aphasia treatment. We highlight how these insights can inform the design of rigorous assessments for LMs, specifically in their handling of complex syntactic phenomena, as well as their implications for developing human-like learning strategies.
arXiv Detail & Related papers (2024-12-20T10:53:21Z) - STRICTA: Structured Reasoning in Critical Text Assessment for Peer Review and Beyond [68.47402386668846]
We introduce Structured Reasoning In Critical Text Assessment (STRICTA) to model text assessment as an explicit, step-wise reasoning process. STRICTA breaks down the assessment into a graph of interconnected reasoning steps, drawing on causality theory. We apply STRICTA to a dataset of over 4000 reasoning steps from roughly 40 biomedical experts on more than 20 papers.
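A minimal sketch of what a graph of interconnected reasoning steps could look like in code; the step names and fields are illustrative assumptions, not STRICTA's actual schema.

```python
# Illustrative step-graph data structure in the spirit of STRICTA.
# Field names and steps are assumptions, not the paper's schema.
from dataclasses import dataclass, field

@dataclass
class ReasoningStep:
    step_id: str
    question: str                                          # what the assessor examines here
    depends_on: list[str] = field(default_factory=list)    # causal predecessor steps
    answer: str | None = None

# A toy peer-review assessment graph: later steps depend on earlier findings.
steps = {
    "claims": ReasoningStep("claims", "What does the paper claim?"),
    "evidence": ReasoningStep("evidence", "Does the evidence support the claims?",
                              depends_on=["claims"]),
    "verdict": ReasoningStep("verdict", "Overall soundness judgment?",
                             depends_on=["claims", "evidence"]),
}

def ready_steps(steps):
    """Steps whose dependencies are all answered and that can be assessed next."""
    return [s for s in steps.values()
            if s.answer is None and all(steps[d].answer for d in s.depends_on)]

print([s.step_id for s in ready_steps(steps)])  # -> ['claims']
```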
arXiv Detail & Related papers (2024-09-09T06:55:37Z) - Language Evolution with Deep Learning [49.879239655532324]
Computational modeling plays an essential role in the study of language emergence.
It aims to simulate the conditions and learning processes that could trigger the emergence of a structured language.
This chapter explores another class of computational models that have recently revolutionized the field of machine learning: deep learning models.
arXiv Detail & Related papers (2024-03-18T16:52:54Z) - Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
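A schematic of the general concept-bottleneck idea, assuming a generic PLM sentence embedding as input; this is a sketch of the technique, not the paper's implementation.

```python
# Schematic concept-bottleneck head: the model first predicts
# human-interpretable concept scores, and the final label is computed
# only from those scores, making the decision path inspectable.
import torch
import torch.nn as nn

class ConceptBottleneckHead(nn.Module):
    def __init__(self, hidden_dim: int, num_concepts: int, num_labels: int):
        super().__init__()
        self.concept_layer = nn.Linear(hidden_dim, num_concepts)
        self.label_layer = nn.Linear(num_concepts, num_labels)

    def forward(self, plm_embedding: torch.Tensor):
        concepts = torch.sigmoid(self.concept_layer(plm_embedding))  # interpretable bottleneck
        logits = self.label_layer(concepts)  # label depends only on concept scores
        return concepts, logits

# Stand-in for a PLM sentence embedding (hypothetical dimensions).
head = ConceptBottleneckHead(hidden_dim=768, num_concepts=8, num_labels=2)
concepts, logits = head(torch.randn(1, 768))
print(concepts.shape, logits.shape)  # torch.Size([1, 8]) torch.Size([1, 2])
```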
arXiv Detail & Related papers (2023-11-08T20:41:18Z) - Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language Modelling [70.23876429382969]
We propose a benchmark that can evaluate intra-sentence discourse properties across a diverse set of NLP tasks.
Disco-Bench consists of 9 document-level testsets in the literature domain, which contain rich discourse phenomena.
For linguistic analysis, we also design a diagnostic test suite that can examine whether the target models learn discourse knowledge.
arXiv Detail & Related papers (2023-07-16T15:18:25Z) - Interactive Natural Language Processing [67.87925315773924]
Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within the field of NLP.
This paper offers a comprehensive survey of iNLP, starting by proposing a unified definition and framework of the concept.
arXiv Detail & Related papers (2023-05-22T17:18:29Z) - Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach [77.61131357420201]
We conduct a thorough and rigorous study on fairness disparities in peer review with the help of large language models (LMs).
We collect, assemble, and maintain a comprehensive relational database for the International Conference on Learning Representations (ICLR) conference from 2017 to date.
We postulate and study fairness disparities on multiple protective attributes of interest, including author gender, geography, and author and institutional prestige.
arXiv Detail & Related papers (2022-11-07T16:19:42Z) - Design considerations for a hierarchical semantic compositional framework for medical natural language understanding [3.7003326903946756]
We describe a framework inspired by mechanisms of human cognition in an attempt to jump the NLP performance curve.
The paper describes insights from four key aspects, including semantic memory, semantic composition, and semantic activation.
We discuss the design of a generative semantic model and an associated semantic parser used to transform a free-text sentence into a logical representation of its meaning.
arXiv Detail & Related papers (2022-04-05T09:04:34Z) - A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
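For context, the snippet below shows the standard linear-probe recipe on synthetic data (plainly not the paper's latent-variable formulation): train a classifier on frozen representations to predict a linguistic property, and read the degree of encoding from its accuracy.

```python
# Standard linear-probe baseline for intrinsic probing (a simplified
# illustration on synthetic data, not the paper's latent-variable method).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic "contextual embeddings" where one direction encodes, say, number.
labels = rng.integers(0, 2, size=500)      # 0 = singular, 1 = plural
reps = rng.normal(size=(500, 64))
reps[:, 0] += 2.0 * labels                 # inject the property signal

X_train, X_test, y_train, y_test = train_test_split(reps, labels, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"probe accuracy: {probe.score(X_test, y_test):.2f}")  # well above chance
```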
arXiv Detail & Related papers (2022-01-20T15:01:12Z) - Schrödinger's Tree -- On Syntax and Neural Language Models [10.296219074343785]
Language models have emerged as NLP's workhorse, displaying increasingly fluent generation capabilities.
We observe a lack of clarity across numerous dimensions, which influences the hypotheses that researchers form.
We outline the implications of the different types of research questions exhibited in studies on syntax.
arXiv Detail & Related papers (2021-10-17T18:25:23Z) - Modelling Compositionality and Structure Dependence in Natural Language [0.12183405753834563]
Drawing on linguistics and set theory, a formalisation of these ideas is presented in the first half of this thesis.
We see how cognitive systems that process language must satisfy certain functional constraints.
Using advances in word embedding techniques, a model of relational learning is simulated.
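A toy illustration of how relational structure can surface as shared offsets in an embedding space, using hand-made rather than trained vectors so the example stays self-contained.

```python
# Toy illustration of relational learning with embeddings: the same vector
# offset encodes the same relation. Vectors are hand-made, not trained.
import numpy as np

# Hypothetical 2-d embeddings chosen so that a "gender" offset is shared.
emb = {
    "king":  np.array([1.0, 1.0]),
    "queen": np.array([1.0, 0.0]),
    "man":   np.array([0.0, 1.0]),
    "woman": np.array([0.0, 0.0]),
}

def nearest(vec, vocab):
    """Word whose embedding is closest to vec in Euclidean distance."""
    return min(vocab, key=lambda w: np.linalg.norm(vocab[w] - vec))

# king - man + woman ~ queen: relational analogy via vector arithmetic.
print(nearest(emb["king"] - emb["man"] + emb["woman"], emb))  # -> queen
```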
arXiv Detail & Related papers (2020-11-22T17:28:50Z)