Semiotic Complexity and Its Epistemological Implications for Modeling Culture
- URL: http://arxiv.org/abs/2508.00095v1
- Date: Thu, 31 Jul 2025 18:44:48 GMT
- Title: Semiotic Complexity and Its Epistemological Implications for Modeling Culture
- Authors: Zachary K. Stine, James E. Deitrick,
- Abstract summary: We frame such modeling as engaging in translation work from a cultural, linguistic domain into a computational, mathematical domain.<n> Translators benefit from articulating the internal theory of their translation process, and so do computational humanists in their work.<n>We lay out several recommendations for researchers to account better for these issues in their own work.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Greater theorizing of methods in the computational humanities is needed for epistemological and interpretive clarity, and therefore the maturation of the field. In this paper, we frame such modeling work as engaging in translation work from a cultural, linguistic domain into a computational, mathematical domain, and back again. Translators benefit from articulating the theory of their translation process, and so do computational humanists in their work -- to ensure internal consistency, avoid subtle yet consequential translation errors, and facilitate interpretive transparency. Our contribution in this paper is to lay out a particularly consequential dimension of the lack of theorizing and the sorts of translation errors that emerge in our modeling practices as a result. Along these lines we introduce the idea of semiotic complexity as the degree to which the meaning of some text may vary across interpretive lenses, and make the case that dominant modeling practices -- especially around evaluation -- commit a translation error by treating semiotically complex data as semiotically simple when it seems epistemologically convenient by conferring superficial clarity. We then lay out several recommendations for researchers to better account for these epistemological issues in their own work.
Related papers
- Less is More: Local Intrinsic Dimensions of Contextual Language Models [13.561226514150695]
We introduce a novel perspective based on the geometric properties of contextual latent embeddings to study the effects of training and fine-tuning.<n>We show that the local dimensions provide insights into the model's training dynamics and generalization ability.<n>Our experiments suggest configuring a practical: reductions in the mean local dimension tend to accompany and predict subsequent performance gains.
arXiv Detail & Related papers (2025-06-01T14:30:46Z) - Reasoning Inconsistencies and How to Mitigate Them in Deep Learning [4.124590489579409]
This thesis contributes two techniques for detecting and quantifying predictive inconsistencies.<n>To mitigate inconsistencies from biases in training data, this thesis presents a data efficient sampling method.<n>Finally, the thesis offers two techniques to optimize the models for complex reasoning tasks.
arXiv Detail & Related papers (2025-04-03T13:40:55Z) - Analysis and Visualization of Linguistic Structures in Large Language Models: Neural Representations of Verb-Particle Constructions in BERT [0.0]
This study investigates the internal representations of verb-particle combinations within large language models (LLMs)<n>We analyse the representational efficacy of its layers for various verb-particle constructions such as 'agree on', 'come back', and 'give up'<n>Results show that BERT's middle layers most effectively capture syntactic structures, with significant variability in representational accuracy across different verb categories.
arXiv Detail & Related papers (2024-12-19T09:21:39Z) - Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate limitations by leveraging improved explanations for Transformers.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z) - Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their black-box'' nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z) - Modeling Target-Side Morphology in Neural Machine Translation: A
Comparison of Strategies [72.56158036639707]
Morphologically rich languages pose difficulties to machine translation.
A large amount of differently inflected word surface forms entails a larger vocabulary.
Some inflected forms of infrequent terms typically do not appear in the training corpus.
Linguistic agreement requires the system to correctly match the grammatical categories between inflected word forms in the output sentence.
arXiv Detail & Related papers (2022-03-25T10:13:20Z) - A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z) - Schr\"odinger's Tree -- On Syntax and Neural Language Models [10.296219074343785]
Language models have emerged as NLP's workhorse, displaying increasingly fluent generation capabilities.
We observe a lack of clarity across numerous dimensions, which influences the hypotheses that researchers form.
We outline the implications of the different types of research questions exhibited in studies on syntax.
arXiv Detail & Related papers (2021-10-17T18:25:23Z) - Under the Microscope: Interpreting Readability Assessment Models for
Filipino [0.0]
We dissect machine learning-based readability assessment models in Filipino by performing global and local model interpretation.
Results show that using a model trained with top features from global interpretation obtained higher performance than the ones using features selected by Spearman correlation.
arXiv Detail & Related papers (2021-10-01T01:27:10Z) - Interpretable Deep Learning: Interpretations, Interpretability,
Trustworthiness, and Beyond [49.93153180169685]
We introduce and clarify two basic concepts-interpretations and interpretability-that people usually get confused.
We elaborate the design of several recent interpretation algorithms, from different perspectives, through proposing a new taxonomy.
We summarize the existing work in evaluating models' interpretability using "trustworthy" interpretation algorithms.
arXiv Detail & Related papers (2021-03-19T08:40:30Z) - Enriching Non-Autoregressive Transformer with Syntactic and
SemanticStructures for Neural Machine Translation [54.864148836486166]
We propose to incorporate the explicit syntactic and semantic structures of languages into a non-autoregressive Transformer.
Our model achieves a significantly faster speed, as well as keeps the translation quality when compared with several state-of-the-art non-autoregressive models.
arXiv Detail & Related papers (2021-01-22T04:12:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.