TCNL: Transparent and Controllable Network Learning Via Embedding
Human-Guided Concepts
- URL: http://arxiv.org/abs/2210.03274v1
- Date: Fri, 7 Oct 2022 01:18:37 GMT
- Title: TCNL: Transparent and Controllable Network Learning Via Embedding
Human-Guided Concepts
- Authors: Zhihao Wang, Chuang Zhu
- Abstract summary: We propose a novel method, Transparent and Controllable Network Learning (TCNL), to overcome such challenges.
Towards the goal of improving transparency-interpretability, in TCNL we define concepts for specific classification tasks through a scientific study of human intuition.
We also build the concept mapper to visualize features extracted by the concept extractor in a human-intuitive way.
- Score: 10.890006696574803
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Explaining deep learning models is of vital importance for understanding
artificial intelligence systems, improving safety, and evaluating fairness. To
better understand and control CNN models, many transparency-interpretability
methods have been proposed. However, most of these works are not intuitive for
human understanding and offer insufficient human control over the CNN model. We
propose a novel method, Transparent and Controllable Network Learning (TCNL),
to overcome these challenges. Towards the goal of improving
transparency-interpretability, in TCNL we define concepts for specific
classification tasks through a scientific study of human intuition and
incorporate this concept information into the CNN model. In TCNL, a shallow
feature extractor first produces preliminary features. Several concept feature
extractors are then built right after the shallow feature extractor to learn
high-dimensional concept representations, and each concept feature extractor is
encouraged to encode information related to its predefined concept. We also
build a concept mapper to visualize the features extracted by the concept
extractors in a human-intuitive way. TCNL provides a generalizable approach to
transparency-interpretability: researchers can define concepts for a given
classification task and encourage the model to encode the corresponding concept
information, which to a certain extent improves the transparency-interpretability
and controllability of the CNN model. The datasets (with concept sets) for our
experiments will also be released (https://github.com/bupt-ai-cz/TCNL).
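As a reading aid, the pipeline above can be pictured with a minimal PyTorch sketch: a shared shallow extractor, one concept feature extractor per predefined concept, a concept mapper that decodes concept features for visualization, and a classifier over the concept features. All module sizes, the number of concepts, and the names below are illustrative assumptions, not the authors' implementation.

```python
# Minimal, illustrative sketch of a TCNL-style pipeline: a shared shallow feature
# extractor, one concept feature extractor per predefined concept, a concept mapper
# that decodes concept features back to image space for visualization, and a
# classifier over the concatenated concept features. All sizes are assumptions.
import torch
import torch.nn as nn


class TCNLSketch(nn.Module):
    def __init__(self, num_concepts: int = 3, num_classes: int = 10):
        super().__init__()
        # Shallow feature extractor: produces preliminary features.
        self.shallow = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # One concept feature extractor per predefined concept.
        self.concept_extractors = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(64, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(4),
            )
            for _ in range(num_concepts)
        ])
        # Concept mapper: decodes a concept feature into an image-like map so
        # humans can inspect what each concept branch encodes.
        self.concept_mapper = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=4), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=4), nn.Sigmoid(),
        )
        self.classifier = nn.Linear(num_concepts * 64 * 4 * 4, num_classes)

    def forward(self, x):
        h = self.shallow(x)
        concept_feats = [ext(h) for ext in self.concept_extractors]
        concept_vis = [self.concept_mapper(f) for f in concept_feats]
        flat = torch.cat([f.flatten(1) for f in concept_feats], dim=1)
        return self.classifier(flat), concept_feats, concept_vis


# Usage on a 64x64 RGB batch; in the paper each concept branch would additionally
# be supervised to encode its predefined concept, which is omitted here.
model = TCNLSketch()
logits, feats, vis = model(torch.randn(2, 3, 64, 64))
print(logits.shape, vis[0].shape)  # torch.Size([2, 10]) torch.Size([2, 3, 64, 64])
```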
Related papers
- Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery [52.498055901649025]
Concept Bottleneck Models (CBMs) have been proposed to address the 'black-box' problem of deep neural networks.
We propose a novel CBM approach -- called Discover-then-Name-CBM (DN-CBM) -- that inverts the typical paradigm.
Our concept extraction strategy is efficient, since it is agnostic to the downstream task, and uses concepts already known to the model.
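For readers unfamiliar with CBMs, the bottleneck idea can be sketched in a few lines: a backbone predicts human-nameable concept scores, and the label is a transparent linear function of those scores. This is a generic illustration under assumed dimensions, not the DN-CBM method itself.

```python
# Minimal sketch of a generic concept bottleneck: features -> concept scores -> label.
# This illustrates the CBM idea referenced above, not DN-CBM; the backbone,
# concept count, and class count are illustrative assumptions.
import torch
import torch.nn as nn


class ConceptBottleneck(nn.Module):
    def __init__(self, feat_dim=512, num_concepts=20, num_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(  # stand-in for any image encoder
            nn.Flatten(), nn.Linear(3 * 32 * 32, feat_dim), nn.ReLU(),
        )
        # Bottleneck: each unit is meant to correspond to a nameable concept.
        self.to_concepts = nn.Linear(feat_dim, num_concepts)
        # The prediction is a transparent linear function of concept scores.
        self.to_label = nn.Linear(num_concepts, num_classes)

    def forward(self, x):
        concepts = torch.sigmoid(self.to_concepts(self.backbone(x)))
        return self.to_label(concepts), concepts


model = ConceptBottleneck()
logits, concepts = model(torch.randn(4, 3, 32, 32))
print(logits.shape, concepts.shape)  # torch.Size([4, 10]) torch.Size([4, 20])
```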
arXiv Detail & Related papers (2024-07-19T17:50:11Z)
- Restyling Unsupervised Concept Based Interpretable Networks with Generative Models [14.604305230535026]
We propose a novel method that relies on mapping the concept features to the latent space of a pretrained generative model.
We quantitatively ascertain the efficacy of our method in terms of accuracy of the interpretable prediction network, fidelity of reconstruction, as well as faithfulness and consistency of learnt concepts.
arXiv Detail & Related papers (2024-07-01T14:39:41Z)
- Concept Distillation: Leveraging Human-Centered Explanations for Model Improvement [3.026365073195727]
Concept Activation Vectors (CAVs) estimate a model's sensitivity and possible biases to a given concept.
We extend CAVs from post-hoc analysis to ante-hoc training in order to reduce model bias through fine-tuning.
We show applications of concept-sensitive training to debias several classification problems.
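A CAV can be estimated post hoc as the normal of a linear classifier that separates a layer's activations on concept examples from activations on random examples. The sketch below uses placeholder activations and scikit-learn, and only illustrates this standard estimation step, not the paper's ante-hoc training.

```python
# Rough sketch of estimating a Concept Activation Vector (CAV): fit a linear
# classifier separating a layer's activations on concept images vs. random images,
# and take its (normalized) weight vector as the concept direction. The activations
# here are random placeholders; in practice they come from a chosen model layer.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
acts_concept = rng.normal(loc=1.0, size=(100, 256))  # activations of concept images
acts_random = rng.normal(loc=0.0, size=(100, 256))   # activations of random images

X = np.vstack([acts_concept, acts_random])
y = np.array([1] * 100 + [0] * 100)

clf = LogisticRegression(max_iter=1000).fit(X, y)
cav = clf.coef_[0] / np.linalg.norm(clf.coef_[0])  # unit-norm concept direction

# Concept sensitivity of a prediction can then be probed via the directional
# derivative of a class logit along `cav`; Concept Distillation goes further and
# uses such sensitivities as a training signal to reduce bias.
print(cav.shape)  # (256,)
```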
arXiv Detail & Related papers (2023-11-26T14:00:14Z)
- Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their 'black-box' nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
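The pretext task can be pictured as a small head that takes a backbone feature plus a 3D query point and predicts occupancy, trained with binary cross-entropy. The sketch below is an illustrative stand-in under assumed shapes, not the ALSO implementation.

```python
# Illustrative sketch of an occupancy-style pretext task: given a global feature
# from a point-cloud backbone and a 3D query point, predict whether the query is
# occupied (near the scanned surface) or free. Backbone output and labels are
# placeholders; in the real method the labels are derived from the raw points.
import torch
import torch.nn as nn
import torch.nn.functional as F


class OccupancyHead(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 3, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, scene_feat, query_xyz):
        # scene_feat: (B, feat_dim) backbone feature; query_xyz: (B, Q, 3) queries.
        B, Q, _ = query_xyz.shape
        feat = scene_feat.unsqueeze(1).expand(B, Q, scene_feat.shape[-1])
        return self.mlp(torch.cat([feat, query_xyz], dim=-1)).squeeze(-1)


head = OccupancyHead()
scene_feat = torch.randn(2, 128)
queries = torch.rand(2, 64, 3)
labels = torch.randint(0, 2, (2, 64)).float()
loss = F.binary_cross_entropy_with_logits(head(scene_feat, queries), labels)
print(loss.item())
```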
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
- Visual Recognition with Deep Nearest Centroids [57.35144702563746]
We devise deep nearest centroids (DNC), a conceptually elegant yet surprisingly effective network for large-scale visual recognition.
Compared with parametric counterparts, DNC performs better on image classification (CIFAR-10, ImageNet) and greatly boosts pixel recognition (ADE20K, Cityscapes).
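The nearest-centroid decision rule itself is simple to sketch: each class is represented by the mean of its features, and a query takes the label of the closest centroid. The example below uses random placeholder features and omits DNC's end-to-end training.

```python
# Minimal sketch of the nearest-centroid decision rule behind DNC: each class is
# represented by the mean (centroid) of its training features, and a query is
# assigned to the class with the closest centroid. Features are placeholders.
import torch

num_classes, feat_dim = 5, 64
train_feats = torch.randn(500, feat_dim)
train_labels = torch.randint(0, num_classes, (500,))

# Class centroids: mean feature per class.
centroids = torch.stack([
    train_feats[train_labels == c].mean(dim=0) for c in range(num_classes)
])

# Classify queries by distance to each centroid.
queries = torch.randn(8, feat_dim)
dists = torch.cdist(queries, centroids)  # (8, num_classes)
pred = dists.argmin(dim=1)
print(pred)
```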
arXiv Detail & Related papers (2022-09-15T15:47:31Z)
- Human-Centered Concept Explanations for Neural Networks [47.71169918421306]
We introduce concept explanations, including the class of Concept Activation Vectors (CAVs).
We then discuss approaches to automatically extract concepts, and approaches to address some of their caveats.
Finally, we discuss some case studies that showcase the utility of such concept-based explanations in synthetic settings and real world applications.
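A TCAV-style concept score can be sketched as the fraction of examples whose class logit increases along the concept direction, i.e. the fraction of positive directional derivatives; the gradients and CAV below are random placeholders for quantities taken from a real model.

```python
# Sketch of a TCAV-style concept score: the fraction of examples for which the
# class logit increases in the direction of the concept activation vector, i.e.
# the fraction of positive directional derivatives. Gradients and CAV are random
# placeholders standing in for quantities computed from an actual model layer.
import numpy as np

rng = np.random.default_rng(1)
grads = rng.normal(size=(200, 256))  # d(class logit)/d(activations) per example
cav = rng.normal(size=256)
cav /= np.linalg.norm(cav)

directional_derivs = grads @ cav                 # concept sensitivity per example
tcav_score = float((directional_derivs > 0).mean())
print(f"TCAV score: {tcav_score:.2f}")           # near 0.5 for random data
```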
arXiv Detail & Related papers (2022-02-25T01:27:31Z)
- Expressive Explanations of DNNs by Combining Concept Analysis with ILP [0.3867363075280543]
We use inherent features learned by the network to build a global, expressive, verbal explanation of the rationale of a feed-forward convolutional deep neural network (DNN).
We show that our explanation is faithful to the original black-box model.
arXiv Detail & Related papers (2021-05-16T07:00:27Z)
- The Mind's Eye: Visualizing Class-Agnostic Features of CNNs [92.39082696657874]
We propose an approach to visually interpret CNN features given a set of images by creating corresponding images that depict the most informative features of a specific layer.
Our method uses a dual-objective activation and distance loss, without requiring a generator network nor modifications to the original model.
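Generic activation maximization with a distance regularizer gives a rough picture of such dual-objective visualization: optimize an image to raise a layer's activations while keeping it close to a reference image. The toy model, layer choice, and loss weights below are assumptions, not the paper's exact objective.

```python
# Rough sketch of dual-objective feature visualization: maximize a chosen layer's
# activations (activation term) while staying close to a reference image (distance
# term). This illustrates the general recipe, not the paper's exact losses; the
# tiny model and the 0.1 weight are placeholders.
import torch
import torch.nn as nn

# Stand-in CNN; in practice this would be the pretrained model under study.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
)
for p in model.parameters():
    p.requires_grad_(False)

reference = torch.rand(1, 3, 64, 64)            # image whose features we depict
image = reference.clone().requires_grad_(True)  # the image being optimized
opt = torch.optim.Adam([image], lr=0.05)

for step in range(100):
    opt.zero_grad()
    activation_term = -model(image).mean()              # push activations up
    distance_term = (image - reference).pow(2).mean()   # stay near the reference
    loss = activation_term + 0.1 * distance_term
    loss.backward()
    opt.step()

print(loss.item())
```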
arXiv Detail & Related papers (2021-01-29T07:46:39Z)
- Learning Semantically Meaningful Features for Interpretable Classifications [17.88784870849724]
SemCNN learns associations between visual features and word phrases.
Experiment results on multiple benchmark datasets demonstrate that SemCNN can learn features with clear semantic meaning.
arXiv Detail & Related papers (2021-01-11T14:35:16Z)
- Learning Interpretable Concept-Based Models with Human Feedback [36.65337734891338]
We propose an approach for learning a set of transparent concept definitions in high-dimensional data that relies on users labeling concept features.
Our method produces concepts that both align with users' intuitive sense of what a concept means, and facilitate prediction of the downstream label by a transparent machine learning model.
arXiv Detail & Related papers (2020-12-04T23:41:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information shown and is not responsible for any consequences of its use.