The FIX Benchmark: Extracting Features Interpretable to eXperts
- URL: http://arxiv.org/abs/2409.13684v2
- Date: Wed, 9 Oct 2024 17:47:01 GMT
- Title: The FIX Benchmark: Extracting Features Interpretable to eXperts
- Authors: Helen Jin, Shreya Havaldar, Chaehyeon Kim, Anton Xue, Weiqiu You, Helen Qu, Marco Gatti, Daniel A Hashimoto, Bhuvnesh Jain, Amin Madani, Masao Sako, Lyle Ungar, Eric Wong,
- Abstract summary: We present FIX (Features Interpretable to eXperts), a benchmark for measuring how well a collection of features aligns with expert knowledge.
We propose FIXScore, a unified expert alignment measure applicable to diverse real-world settings across cosmology, psychology, and medicine domains.
- Score: 9.688218822056823
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Feature-based methods are commonly used to explain model predictions, but these methods often implicitly assume that interpretable features are readily available. However, this is often not the case for high-dimensional data, and it can be hard even for domain experts to mathematically specify which features are important. Can we instead automatically extract collections or groups of features that are aligned with expert knowledge? To address this gap, we present FIX (Features Interpretable to eXperts), a benchmark for measuring how well a collection of features aligns with expert knowledge. In collaboration with domain experts, we propose FIXScore, a unified expert alignment measure applicable to diverse real-world settings across cosmology, psychology, and medicine domains in vision, language and time series data modalities. With FIXScore, we find that popular feature-based explanation methods have poor alignment with expert-specified knowledge, highlighting the need for new methods that can better identify features interpretable to experts.
Related papers
- On the Biased Assessment of Expert Finding Systems [11.083396379885478]
In large organisations, identifying experts on a given topic is crucial in leveraging the internal knowledge spread across teams and departments.
This case study provides an analysis of how these recommendations can impact the evaluation of expert finding systems.
We show that system-validated annotations lead to overestimated performance of traditional term-based retrieval models.
We also augment knowledge areas with synonyms to uncover a strong bias towards literal mentions of their constituent words.
arXiv Detail & Related papers (2024-10-07T13:19:08Z) - Beyond Parameter Count: Implicit Bias in Soft Mixture of Experts [44.09546603624385]
We introduce a notion of expert specialization for Soft MoE.
We show that when there are many small experts, the architecture is implicitly biased in a fashion that allows us to efficiently approximate the specialized expert subset.
arXiv Detail & Related papers (2024-09-02T00:39:00Z) - Prospector Heads: Generalized Feature Attribution for Large Models & Data [82.02696069543454]
We introduce prospector heads, an efficient and interpretable alternative to explanation-based attribution methods.
We demonstrate how prospector heads enable improved interpretation and discovery of class-specific patterns in input data.
arXiv Detail & Related papers (2024-02-18T23:01:28Z) - Causal Discovery with Language Models as Imperfect Experts [119.22928856942292]
We consider how expert knowledge can be used to improve the data-driven identification of causal graphs.
We propose strategies for amending such expert knowledge based on consistency properties.
We report a case study, on real data, where a large language model is used as an imperfect expert.
arXiv Detail & Related papers (2023-07-05T16:01:38Z) - Entry Dependent Expert Selection in Distributed Gaussian Processes Using
Multilabel Classification [12.622412402489951]
An ensemble technique combines local predictions from Gaussian experts trained on different partitions of the data.
This paper proposes a flexible expert selection approach based on the characteristics of entry data points.
arXiv Detail & Related papers (2022-11-17T23:23:26Z) - MoEC: Mixture of Expert Clusters [93.63738535295866]
Sparsely Mixture of Experts (MoE) has received great interest due to its promising scaling capability with affordable computational overhead.
MoE converts dense layers into sparse experts, and utilizes a gated routing network to make experts conditionally activated.
However, as the number of experts grows, MoE with outrageous parameters suffers from overfitting and sparse data allocation.
arXiv Detail & Related papers (2022-07-19T06:09:55Z) - Utilizing Expert Features for Contrastive Learning of Time-Series
Representations [4.960805676180953]
We present an approach that incorporates expert knowledge for time-series representation learning.
Our method employs expert features to replace the commonly used data transformations in previous contrastive learning approaches.
We demonstrate on three real-world time-series datasets that ExpCLR surpasses several state-of-the-art methods for both unsupervised and semi-supervised representation learning.
arXiv Detail & Related papers (2022-06-23T07:56:27Z) - The Curious Layperson: Fine-Grained Image Recognition without Expert
Labels [90.88501867321573]
We consider a new problem: fine-grained image recognition without expert annotations.
We learn a model to describe the visual appearance of objects using non-expert image descriptions.
We then train a fine-grained textual similarity model that matches image descriptions with documents on a sentence-level basis.
arXiv Detail & Related papers (2021-11-05T17:58:37Z) - Towards Unbiased and Accurate Deferral to Multiple Experts [19.24068936057053]
We propose a framework that simultaneously learns a classifier and a deferral system, with the deferral system choosing to defer to one or more human experts.
We test our framework on a synthetic dataset and a content moderation dataset with biased synthetic experts, and show that it significantly improves the accuracy and fairness of the final predictions.
arXiv Detail & Related papers (2021-02-25T17:08:39Z) - Leveraging Expert Consistency to Improve Algorithmic Decision Support [62.61153549123407]
We explore the use of historical expert decisions as a rich source of information that can be combined with observed outcomes to narrow the construct gap.
We propose an influence function-based methodology to estimate expert consistency indirectly when each case in the data is assessed by a single expert.
Our empirical evaluation, using simulations in a clinical setting and real-world data from the child welfare domain, indicates that the proposed approach successfully narrows the construct gap.
arXiv Detail & Related papers (2021-01-24T05:40:29Z) - Expertise Style Transfer: A New Task Towards Better Communication
between Experts and Laymen [88.30492014778943]
We propose a new task of expertise style transfer and contribute a manually annotated dataset.
Solving this task not only simplifies the professional language, but also improves the accuracy and expertise level of laymen descriptions.
We establish the benchmark performance of five state-of-the-art models for style transfer and text simplification.
arXiv Detail & Related papers (2020-05-02T04:50:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.