Discriminatory Expressions to Produce Interpretable Models in Short Documents
- URL: http://arxiv.org/abs/2012.02104v2
- Date: Mon, 15 Feb 2021 14:25:09 GMT
- Title: Discriminatory Expressions to Produce Interpretable Models in Short Documents
- Authors: Manuel Francisco and Juan Luis Castro
- Abstract summary: State-of-the-art models are black boxes that should not be used to solve problems that may have a social impact.
This paper presents a feature selection mechanism that improves comprehensibility by using fewer but more meaningful features.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Social Networking Sites (SNS) are one of the most important means of
communication. In particular, microblogging sites are widely used as analysis
avenues due to their peculiarities (promptness, short texts...). Countless
studies use SNS in novel ways, but machine learning has focused mainly on
classification performance rather than on interpretability or other quality
metrics. Thus, state-of-the-art models are black boxes that should not be used
to solve problems that may have a social impact. When
the problem requires transparency, it is necessary to build interpretable
pipelines. Although the classifier itself may be interpretable, the resulting
models are often too complex to be considered comprehensible, making it
impossible for humans to understand their actual decisions. This paper presents
a feature selection mechanism that improves comprehensibility by using fewer
but more meaningful features while achieving good performance in microblogging
contexts
where interpretability is mandatory. Moreover, we present a ranking method to
evaluate features in terms of statistical relevance and bias. We conducted
exhaustive tests on five different datasets to evaluate classification
performance, generalisation capacity and model complexity. Results show that
our proposal outperforms the alternatives and is the most stable in terms of
accuracy, generalisation and comprehensibility.
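As a rough illustration of the kind of pipeline the abstract describes, the sketch below ranks candidate n-gram features with a statistical relevance test, keeps only a handful of them, and fits a shallow, human-readable classifier. The chi-squared ranking, the toy data and the decision tree are illustrative stand-ins; the paper's own discriminatory-expression selection and bias-aware ranking method are not reproduced here.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy, tweet-like corpus with binary labels (illustrative only).
texts = ["great match today", "awful service again",
         "love this phone", "worst flight ever"]
labels = [1, 0, 1, 0]

pipeline = Pipeline([
    ("vectorise", CountVectorizer(ngram_range=(1, 2))),  # candidate expressions
    ("select", SelectKBest(chi2, k=4)),                  # keep few, statistically relevant features
    ("classify", DecisionTreeClassifier(max_depth=3)),   # shallow, human-readable model
])
pipeline.fit(texts, labels)

# The surviving features and the tree rules remain inspectable by a human.
names = pipeline.named_steps["vectorise"].get_feature_names_out()
kept = names[pipeline.named_steps["select"].get_support()]
print(kept)
print(export_text(pipeline.named_steps["classify"], feature_names=list(kept)))
```

Keeping the selected vocabulary small is what makes the final tree readable: with only a few surviving expressions, every decision path can be stated in plain language.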
Related papers
- Prospector Heads: Generalized Feature Attribution for Large Models & Data [82.02696069543454]
We introduce prospector heads, an efficient and interpretable alternative to explanation-based attribution methods.
We demonstrate how prospector heads enable improved interpretation and discovery of class-specific patterns in input data.
arXiv Detail & Related papers (2024-02-18T23:01:28Z)
- InterpretCC: Intrinsic User-Centric Interpretability through Global Mixture of Experts [31.738009841932374]
Interpretability for neural networks is a trade-off between three key requirements.
We present InterpretCC, a family of interpretable-by-design neural networks that guarantee human-centric interpretability.
arXiv Detail & Related papers (2024-02-05T11:55:50Z)
- Even-if Explanations: Formal Foundations, Priorities and Complexity [18.126159829450028]
We show that both linear and tree-based models are strictly more interpretable than neural networks.
We introduce a preference-based framework that enables users to personalize explanations based on their preferences.
arXiv Detail & Related papers (2024-01-17T11:38:58Z)
- Explaining Relation Classification Models with Semantic Extents [1.7604348079019634]
A lack of explainability is currently a complicating factor in many real-world applications.
We introduce semantic extents, a concept to analyze decision patterns for the relation classification task.
We provide an annotation tool and a software framework to determine semantic extents for humans and models.
arXiv Detail & Related papers (2023-08-04T08:17:52Z)
- Fairness-guided Few-shot Prompting for Large Language Models [93.05624064699965]
In-context learning can suffer from high instability due to variations in training examples, example order, and prompt formats.
We introduce a metric to evaluate the predictive bias of a fixed prompt against labels or given attributes.
We propose a greedy search strategy to identify a near-optimal prompt that improves the performance of in-context learning (a minimal sketch appears after this list).
arXiv Detail & Related papers (2023-03-23T12:28:25Z)
- Learning Optimal Fair Classification Trees: Trade-offs Between Interpretability, Fairness, and Accuracy [7.215903549622416]
We propose a mixed integer optimization framework for learning optimal classification trees.
We benchmark our method against state-of-the-art approaches for fair classification on popular datasets.
Our method consistently finds decisions with almost full parity, while other methods rarely do.
arXiv Detail & Related papers (2022-01-24T19:47:10Z)
- More Than Words: Towards Better Quality Interpretations of Text Classifiers [16.66535643383862]
We show that token-based interpretability, while being a convenient first choice given the input interfaces of the ML models, is not the most effective one in all situations.
We show that higher-level feature attributions offer several advantages: 1) they are more robust as measured by the randomization tests, 2) they lead to lower variability when using approximation-based methods like SHAP, and 3) they are more intelligible to humans in situations where the linguistic coherence resides at a higher level.
arXiv Detail & Related papers (2021-12-23T10:18:50Z)
- AES Systems Are Both Overstable And Oversensitive: Explaining Why And Proposing Defenses [66.49753193098356]
We investigate the reason behind the surprising adversarial brittleness of scoring models.
Our results indicate that autoscoring models, despite getting trained as "end-to-end" models, behave like bag-of-words models.
We propose detection-based protection models that can detect oversensitivity- and overstability-causing samples with high accuracy.
arXiv Detail & Related papers (2021-09-24T03:49:38Z)
- Generative Counterfactuals for Neural Networks via Attribute-Informed Perturbation [51.29486247405601]
We design a framework to generate counterfactuals for raw data instances with the proposed Attribute-Informed Perturbation (AIP).
By utilizing generative models conditioned with different attributes, counterfactuals with desired labels can be obtained effectively and efficiently.
Experimental results on real-world texts and images demonstrate the effectiveness, sample quality as well as efficiency of our designed framework.
arXiv Detail & Related papers (2021-01-18T08:37:13Z)
- Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
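The fairness-guided prompting entry above mentions a predictive-bias metric and a greedy prompt search; the sketch below shows one plausible reading of that idea, assuming bias is measured as the divergence of the model's predicted label distribution from uniform on a content-free probe input. The function names and the `score_prompt` callable are hypothetical, not that paper's actual API.

```python
import math
from typing import Callable, Sequence


def predictive_bias(label_probs: Sequence[float]) -> float:
    """KL divergence from the uniform distribution; 0.0 means an unbiased prompt."""
    k = len(label_probs)
    return sum(p * math.log(p * k) for p in label_probs if p > 0)


def greedy_prompt_search(
    candidates: list[str],
    # Hypothetical scorer: runs the prompt on a content-free probe and
    # returns the model's label probabilities.
    score_prompt: Callable[[list[str]], Sequence[float]],
    budget: int,
) -> list[str]:
    """Greedily add the demonstration that keeps predictive bias lowest."""
    prompt: list[str] = []
    for _ in range(budget):
        remaining = [ex for ex in candidates if ex not in prompt]
        best = min(remaining,
                   key=lambda ex: predictive_bias(score_prompt(prompt + [ex])))
        prompt.append(best)
    return prompt
```

Greedy selection here trades optimality for tractability: evaluating every subset of demonstrations is exponential, while adding one lowest-bias example per step needs only budget × len(candidates) model calls.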
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences arising from its use.