NxPlain: Web-based Tool for Discovery of Latent Concepts
- URL: http://arxiv.org/abs/2303.03019v1
- Date: Mon, 6 Mar 2023 10:45:24 GMT
- Title: NxPlain: Web-based Tool for Discovery of Latent Concepts
- Authors: Fahim Dalvi and Nadir Durrani and Hassan Sajjad and Tamim Jaban and
Musab Husaini and Ummar Abbas
- Abstract summary: We present NxPlain, a web application that provides an explanation of a model's prediction using latent concepts.
NxPlain discovers latent concepts learned in a deep NLP model, provides an interpretation of the knowledge learned in the model, and explains its predictions based on the concepts used.
- Score: 16.446370662629555
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The proliferation of deep neural networks in various domains has seen an
increased need for the interpretability of these models, especially in
scenarios where fairness and trust are as important as model performance. A large
body of independent work has been carried out to: i) analyze what linguistic and
non-linguistic knowledge is learned within these models, and ii) highlight the
salient parts of the input. We present NxPlain, a web application that provides
an explanation of a model's prediction using latent concepts. NxPlain discovers
latent concepts learned in a deep NLP model, provides an interpretation of the
knowledge learned in the model, and explains its predictions based on the
concepts used. The application allows users to browse through the latent
concepts in an intuitive order, letting them efficiently scan through the most
salient concepts with a global corpus-level view and a local sentence-level
view. Our
tool is useful for debugging, unraveling model bias, and for highlighting
spurious correlations in a model. A hosted demo is available here:
https://nxplain.qcri.org.
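The concept discovery underlying NxPlain (and ConceptX below) groups contextualized token representations so that each cluster acts as a latent concept. A minimal sketch of that idea, assuming a HuggingFace BERT encoder and agglomerative clustering; the model, layer, and cluster count here are illustrative placeholders, not NxPlain's actual configuration:

```python
# Minimal sketch: discover "latent concepts" by clustering contextual
# token embeddings. Model, layer, and cluster count are illustrative.
import numpy as np
import torch
from sklearn.cluster import AgglomerativeClustering
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModel.from_pretrained("bert-base-cased")

sentences = ["The cat sat on the mat.", "Stocks rallied on Monday."]
tokens, vectors = [], []
for sent in sentences:
    enc = tokenizer(sent, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, dim)
    ids = enc["input_ids"][0].tolist()
    for tok, vec in zip(tokenizer.convert_ids_to_tokens(ids), hidden):
        if tok not in ("[CLS]", "[SEP]"):
            tokens.append(tok)
            vectors.append(vec.numpy())

# Each cluster of token occurrences is treated as one latent concept.
n_concepts = 5
labels = AgglomerativeClustering(n_clusters=n_concepts).fit_predict(np.stack(vectors))
for c in range(n_concepts):
    print(f"concept {c}:", [t for t, l in zip(tokens, labels) if l == c])
```

Inspecting the token occurrences that fall into each cluster gives the corpus-level view of a concept; per-sentence cluster assignments give the local, sentence-level view.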
Related papers
- SOLD: Reinforcement Learning with Slot Object-Centric Latent Dynamics [16.020835290802548]
Slot-Attention for Object-centric Latent Dynamics is a novel algorithm that learns object-centric dynamics models from pixel inputs.
We demonstrate that the structured latent space not only improves model interpretability but also provides a valuable input space for behavior models to reason over.
Our results show that SOLD outperforms DreamerV3, a state-of-the-art model-based RL algorithm, across a range of benchmark robotic environments.
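The slot-attention mechanism such object-centric models build on is well defined (Locatello et al.'s Slot Attention); a minimal single-iteration sketch of the slot update, with illustrative shapes rather than SOLD's actual implementation:

```python
# Minimal sketch of one slot-attention refinement step (Locatello et
# al.), the mechanism object-centric latents build on. Shapes and the
# GRU-based update are illustrative, not SOLD's actual code.
import torch
import torch.nn as nn

dim, n_slots, n_inputs = 64, 4, 16
to_q, to_k, to_v = (nn.Linear(dim, dim, bias=False) for _ in range(3))
gru = nn.GRUCell(dim, dim)

inputs = torch.randn(1, n_inputs, dim)  # e.g. flattened image features
slots = torch.randn(1, n_slots, dim)    # randomly initialised slots

q, k, v = to_q(slots), to_k(inputs), to_v(inputs)
# Softmax over slots: slots compete for each input location.
attn = torch.softmax(q @ k.transpose(1, 2) / dim ** 0.5, dim=1)
attn = attn / attn.sum(dim=-1, keepdim=True)  # weighted mean per slot
updates = attn @ v                            # (1, n_slots, dim)
slots = gru(updates.reshape(-1, dim), slots.reshape(-1, dim)).reshape(1, n_slots, dim)
```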
arXiv Detail & Related papers (2024-10-11T14:03:31Z)
- Restyling Unsupervised Concept Based Interpretable Networks with Generative Models [14.604305230535026]
We propose a novel method that relies on mapping the concept features to the latent space of a pretrained generative model.
We quantitatively ascertain the efficacy of our method in terms of accuracy of the interpretable prediction network, fidelity of reconstruction, as well as faithfulness and consistency of learnt concepts.
arXiv Detail & Related papers (2024-07-01T14:39:41Z)
- Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
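A concept bottleneck routes the prediction through human-interpretable concept scores, so each output can be read off the concepts. A minimal sketch of the architecture; the input features, concept count, and sizes are placeholders rather than the paper's setup:

```python
# Minimal concept-bottleneck sketch: the label head sees only predicted
# concept scores, so every prediction can be read off the concepts.
# Input features, concept count, and sizes are placeholders.
import torch
import torch.nn as nn

class ConceptBottleneck(nn.Module):
    def __init__(self, input_dim=768, n_concepts=8, n_classes=2):
        super().__init__()
        self.to_concepts = nn.Linear(input_dim, n_concepts)  # x -> concepts
        self.to_label = nn.Linear(n_concepts, n_classes)     # concepts -> y

    def forward(self, x):
        concepts = torch.sigmoid(self.to_concepts(x))  # interpretable layer
        return self.to_label(concepts), concepts

model = ConceptBottleneck()
logits, concepts = model(torch.randn(4, 768))  # e.g. sentence embeddings
```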
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
- Evaluating and Explaining Large Language Models for Code Using Syntactic Structures [74.93762031957883]
This paper introduces ASTxplainer, an explainability method specific to Large Language Models for code.
At its core, ASTxplainer provides an automated method for aligning token predictions with AST nodes.
We perform an empirical evaluation on 12 popular LLMs for code using a curated dataset of the most popular GitHub projects.
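Aligning token predictions with AST nodes reduces to mapping token character offsets onto node source spans. A single-language sketch using Python's own ast module (ASTxplainer itself is more general; the source snippet and helper below are purely illustrative):

```python
# Minimal sketch of token-to-AST alignment: find every AST node whose
# source span covers a character offset. Source snippet is illustrative.
import ast

source = "def add(a, b):\n    return a + b\n"
tree = ast.parse(source)

# Cumulative character offset at the start of each line.
starts = [0]
for line in source.splitlines(keepends=True):
    starts.append(starts[-1] + len(line))

def covering_nodes(offset):
    """Names of all AST nodes whose span contains the given offset."""
    hits = []
    for node in ast.walk(tree):
        if getattr(node, "end_lineno", None) is None:
            continue
        begin = starts[node.lineno - 1] + node.col_offset
        end = starts[node.end_lineno - 1] + node.end_col_offset
        if begin <= offset < end:
            hits.append(type(node).__name__)
    return hits

# Nodes covering the 'a' in 'a + b': FunctionDef, Return, BinOp, Name.
print(covering_nodes(source.index("a + b")))
```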
arXiv Detail & Related papers (2023-08-07T18:50:57Z)
- CommonsenseVIS: Visualizing and Understanding Commonsense Reasoning Capabilities of Natural Language Models [30.63276809199399]
We present CommonsenseVIS, a visual explanatory system that utilizes external commonsense knowledge bases to contextualize model behavior for commonsense question-answering.
Our system features multi-level visualization and interactive model probing and editing for different concepts and their underlying relations.
arXiv Detail & Related papers (2023-07-23T17:16:13Z)
- SINC: Self-Supervised In-Context Learning for Vision-Language Tasks [64.44336003123102]
We propose a framework to enable in-context learning in large language models.
A meta-model can learn on self-supervised prompts consisting of tailored demonstrations.
Experiments show that SINC outperforms gradient-based methods in various vision-language tasks.
arXiv Detail & Related papers (2023-07-15T08:33:08Z)
- COCKATIEL: COntinuous Concept ranKed ATtribution with Interpretable ELements for explaining neural net classifiers on NLP tasks [3.475906200620518]
COCKATIEL is a novel, post-hoc, concept-based, model-agnostic XAI technique.
It generates meaningful explanations from the last layer of a neural net model trained on an NLP classification task.
It does so without compromising the accuracy of the underlying model or requiring a new one to be trained.
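COCKATIEL's concept extraction step factorizes last-layer activations with non-negative matrix factorization; a minimal sketch of that step, with random stand-in activations and an arbitrary component count:

```python
# Minimal sketch of concept extraction via non-negative matrix
# factorization of last-layer activations. Random stand-in data.
import numpy as np
from sklearn.decomposition import NMF

activations = np.abs(np.random.randn(200, 768))   # e.g. post-ReLU features
nmf = NMF(n_components=10, init="nndsvd", max_iter=500)
concept_scores = nmf.fit_transform(activations)   # (samples, concepts)
concept_basis = nmf.components_                   # (concepts, features)
# concept_scores[i, k]: strength of concept k in sample i; ranking these
# per prediction yields the concept attributions.
```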
arXiv Detail & Related papers (2023-05-11T12:22:20Z)
- ConceptX: A Framework for Latent Concept Analysis [21.760620298330235]
We present ConceptX, a human-in-the-loop framework for interpreting and annotating the latent representational space of pre-trained Language Models (pLMs).
We use an unsupervised method to discover concepts learned in these models and enable a graphical interface for humans to generate explanations for the concepts.
arXiv Detail & Related papers (2022-11-12T11:31:09Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
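The core search can be sketched as gradient descent over several latent perturbations, with a classifier-flip term and a diversity term that keeps them apart. All modules and loss weights below are placeholders; the paper's actual losses are richer:

```python
# Minimal sketch of diverse counterfactual search in a latent space:
# push K perturbed codes toward a target class while penalising their
# pairwise similarity. Classifier and weights are placeholders.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
classifier = torch.nn.Linear(16, 2)                    # stand-in classifier on latents
z = torch.randn(16)                                    # latent code of the input
deltas = (0.01 * torch.randn(4, 16)).requires_grad_()  # K=4 perturbations
opt = torch.optim.Adam([deltas], lr=0.1)
target = torch.ones(4, dtype=torch.long)               # desired (flipped) class

for _ in range(100):
    flip = F.cross_entropy(classifier(z + deltas), target)
    # Diversity: penalise cosine similarity between the K perturbations.
    sims = F.cosine_similarity(deltas.unsqueeze(0), deltas.unsqueeze(1), dim=-1)
    diversity = (sims - torch.eye(4)).abs().mean()
    loss = flip + 0.1 * diversity + 0.01 * deltas.norm(dim=-1).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```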
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- Generative Counterfactuals for Neural Networks via Attribute-Informed Perturbation [51.29486247405601]
We design a framework to generate counterfactuals for raw data instances with the proposed Attribute-Informed Perturbation (AIP).
By utilizing generative models conditioned with different attributes, counterfactuals with desired labels can be obtained effectively and efficiently.
Experimental results on real-world texts and images demonstrate the effectiveness, sample quality as well as efficiency of our designed framework.
arXiv Detail & Related papers (2021-01-18T08:37:13Z)
- Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples [84.8370546614042]
The black-box nature of deep learning models has posed unanswered questions about what they learn from data.
A Generative Adversarial Network (GAN) and multi-objective optimization are used to furnish a plausible attack on the audited model.
Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework.
arXiv Detail & Related papers (2020-03-25T11:08:56Z)