Under the Microscope: Interpreting Readability Assessment Models for
Filipino
- URL: http://arxiv.org/abs/2110.00157v1
- Date: Fri, 1 Oct 2021 01:27:10 GMT
- Title: Under the Microscope: Interpreting Readability Assessment Models for
Filipino
- Authors: Joseph Marvin Imperial, Ethel Ong
- Abstract summary: We dissect machine learning-based readability assessment models in Filipino by performing global and local model interpretation.
Results show that a model trained with the top features from global interpretation obtains higher performance than models using features selected by Spearman correlation.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Readability assessment is the process of identifying the level of ease or
difficulty of a certain piece of text for its intended audience. Approaches
have evolved from the use of arithmetic formulas to more complex
pattern-recognizing models trained using machine learning algorithms. While
these approaches provide competitive results, limited work has been done on
quantitatively analyzing how linguistic variables affect model inference. In
this work, we dissect machine learning-based readability assessment models in
Filipino by performing global and local model interpretation to understand the
contributions of different linguistic features and discuss their implications in
the context of the Filipino language. Results show that a model trained with the
top features from global interpretation obtains higher performance than models
using features selected by Spearman correlation. We also empirically observe
local feature-weight boundaries that discriminate reading difficulty at an
extremely fine-grained level, together with the corresponding effects when their
values are perturbed.
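To make the comparison above concrete, here is a minimal sketch (not the authors' code) of the two feature-selection routes the abstract contrasts: ranking linguistic features by Spearman correlation with the difficulty label versus ranking them by a global interpretation method, with permutation importance standing in for whatever interpretation technique the paper actually uses. The dataset path, feature names, and the Random Forest model are illustrative assumptions.

```python
# Sketch: global-interpretation vs. Spearman feature selection for readability.
import pandas as pd
from scipy.stats import spearmanr
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

df = pd.read_csv("filipino_readability_features.csv")   # hypothetical dataset
X, y = df.drop(columns=["grade_level"]), df["grade_level"]  # hypothetical label
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Route 1: rank each linguistic feature by |Spearman correlation| with the label.
spearman_rank = {
    col: abs(spearmanr(X_train[col], y_train)[0]) for col in X_train.columns
}
top_by_spearman = sorted(spearman_rank, key=spearman_rank.get, reverse=True)[:10]

# Route 2: global interpretation of a trained model (permutation importance here).
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
global_imp = permutation_importance(model, X_test, y_test, n_repeats=10,
                                    random_state=0)
top_by_global = list(X_train.columns[global_imp.importances_mean.argsort()[::-1][:10]])

# Retrain on each feature subset and compare, mirroring the reported comparison.
for name, feats in [("spearman", top_by_spearman), ("global", top_by_global)]:
    m = RandomForestClassifier(random_state=0).fit(X_train[feats], y_train)
    print(name, m.score(X_test[feats], y_test))

# Local view: perturb one feature of a single document and watch the predicted
# class probabilities shift (a crude stand-in for the paper's local analysis).
sample = X_test.iloc[[0]].copy()
sample["avg_word_length"] += 0.5   # hypothetical feature name
print(model.predict_proba(X_test.iloc[[0]]), model.predict_proba(sample))
```

Permutation importance is used here only because it is a readily available global method in scikit-learn; the paper itself may rely on a different interpretation framework (e.g., SHAP-style attributions).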
Related papers
- Holmes: A Benchmark to Assess the Linguistic Competence of Language Models [59.627729608055006]
We introduce Holmes, a new benchmark designed to assess the linguistic competence of language models (LMs).
We use computation-based probing to examine LMs' internal representations regarding distinct linguistic phenomena.
As a result, we meet recent calls to disentangle LMs' linguistic competence from other cognitive abilities.
arXiv Detail & Related papers (2024-04-29T17:58:36Z) - Exploring Tokenization Strategies and Vocabulary Sizes for Enhanced Arabic Language Models [0.0]
This paper examines the impact of tokenization strategies and vocabulary sizes on the performance of Arabic language models.
Our study finds that vocabulary size has a limited impact on model performance when the model size is held constant.
The paper's recommendations include refining tokenization strategies to address dialect challenges, enhancing model robustness across diverse linguistic contexts, and expanding datasets to cover the rich dialectal variety of Arabic.
arXiv Detail & Related papers (2024-03-17T07:44:44Z) - Can Large Language Models Understand Context? [17.196362853457412]
This paper introduces a context understanding benchmark by adapting existing datasets to suit the evaluation of generative models.
Experimental results indicate that pre-trained dense models struggle with understanding more nuanced contextual features when compared to state-of-the-art fine-tuned models.
As LLM compression holds growing significance in both research and real-world applications, we assess the context understanding of quantized models under in-context-learning settings.
arXiv Detail & Related papers (2024-02-01T18:55:29Z) - Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z) - An Empirical Investigation of Commonsense Self-Supervision with
Knowledge Graphs [67.23285413610243]
Self-supervision based on the information extracted from large knowledge graphs has been shown to improve the generalization of language models.
We study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting language models.
arXiv Detail & Related papers (2022-05-21T19:49:04Z) - Scaling Language Models: Methods, Analysis & Insights from Training
Gopher [83.98181046650664]
We present an analysis of Transformer-based language model performance across a wide range of model scales.
Gains from scale are largest in areas such as reading comprehension, fact-checking, and the identification of toxic language.
We discuss the application of language models to AI safety and the mitigation of downstream harms.
arXiv Detail & Related papers (2021-12-08T19:41:47Z) - Automated Speech Scoring System Under The Lens: Evaluating and
interpreting the linguistic cues for language proficiency [26.70127591966917]
We utilize classical machine learning models to formulate a speech scoring task as both a classification and a regression problem.
First, we extract linguistic features under five categories (fluency, pronunciation, content, grammar and vocabulary, and acoustic) and train models to grade responses.
We find that the regression-based models perform equivalently to or better than the classification approach.
arXiv Detail & Related papers (2021-11-30T06:28:58Z) - Modeling morphology with Linear Discriminative Learning: considerations
and design choices [1.3535770763481905]
This study addresses a series of methodological questions that arise when modeling inflectional morphology with Linear Discriminative Learning.
We illustrate how decisions made about the representation of form and meaning influence model performance.
We discuss how the model can be set up to approximate the learning of inflected words in context.
arXiv Detail & Related papers (2021-06-15T07:37:52Z) - Explaining the Deep Natural Language Processing by Mining Textual
Interpretable Features [3.819533618886143]
T-EBAnO provides prediction-local and class-based, model-global explanation strategies tailored to deep natural-language models.
It provides an objective, human-readable, domain-specific assessment of the reasons behind the automatic decision-making process.
arXiv Detail & Related papers (2021-06-12T06:25:09Z) - Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z) - Explaining Black Box Predictions and Unveiling Data Artifacts through
Influence Functions [55.660255727031725]
Influence functions explain the decisions of a model by identifying influential training examples (the standard formulation is sketched after this list).
We conduct a comparison between influence functions and common word-saliency methods on representative tasks.
We develop a new measure based on influence functions that can reveal artifacts in training data.
arXiv Detail & Related papers (2020-05-14T00:45:23Z)
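For reference, the influence-function approximation that the last entry builds on, in its standard formulation following Koh and Liang; the summary above does not say whether the paper uses exactly this variant.

```latex
% Influence of upweighting a training point z on the loss at a test point z_test,
% using the empirical Hessian of the training loss at the fitted parameters.
\mathcal{I}_{\mathrm{up,loss}}(z, z_{\mathrm{test}})
  = -\,\nabla_{\theta} L(z_{\mathrm{test}}, \hat{\theta})^{\top}
      H_{\hat{\theta}}^{-1}\,
      \nabla_{\theta} L(z, \hat{\theta}),
\qquad
H_{\hat{\theta}} = \frac{1}{n} \sum_{i=1}^{n} \nabla_{\theta}^{2} L(z_i, \hat{\theta})
```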