On the Biased Assessment of Expert Finding Systems
- URL: http://arxiv.org/abs/2410.05018v1
- Date: Mon, 7 Oct 2024 13:19:08 GMT
- Title: On the Biased Assessment of Expert Finding Systems
- Authors: Jens-Joris Decorte, Jeroen Van Hautte, Chris Develder, Thomas Demeester
- Abstract summary: In large organisations, identifying experts on a given topic is crucial in leveraging the internal knowledge spread across teams and departments.
This case study provides an analysis of how automated annotation recommendations can impact the evaluation of expert finding systems.
We show that system-validated annotations lead to overestimated performance of traditional term-based retrieval models.
We also augment knowledge areas with synonyms to uncover a strong bias towards literal mentions of their constituent words.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In large organisations, identifying experts on a given topic is crucial in leveraging the internal knowledge spread across teams and departments. So-called enterprise expert retrieval systems automatically discover and structure employees' expertise based on the vast amount of heterogeneous data available about them and the work they perform. Evaluating these systems requires comprehensive ground truth expert annotations, which are hard to obtain. Therefore, the annotation process typically relies on automated recommendations of knowledge areas to validate. This case study provides an analysis of how these recommendations can impact the evaluation of expert finding systems. We demonstrate on a popular benchmark that system-validated annotations lead to overestimated performance of traditional term-based retrieval models and even invalidate comparisons with more recent neural methods. We also augment knowledge areas with synonyms to uncover a strong bias towards literal mentions of their constituent words. Finally, we propose constraints to the annotation process to prevent these biased evaluations, and show that this still allows annotation suggestions of high utility. These findings should inform benchmark creation or selection for expert finding, to guarantee meaningful comparison of methods.
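To make the synonym-augmentation analysis concrete, the sketch below contrasts a toy term-overlap (term-based) score for a knowledge area with the score for a synonym-substituted paraphrase of it. The helper names (`term_overlap_score`, `augment_with_synonyms`), the synonym table, and the toy employee profile are illustrative assumptions, not the paper's actual benchmark or retrieval model.

```python
from collections import Counter
from typing import Dict, List

def term_overlap_score(query_terms: List[str], doc_terms: Counter) -> float:
    """Toy term-based relevance: count how often the query terms literally occur in the profile."""
    return float(sum(doc_terms[t] for t in query_terms))

def augment_with_synonyms(area: str, synonyms: Dict[str, List[str]]) -> List[str]:
    """Rephrase a knowledge area by swapping each word for a synonym, when one is available."""
    return [synonyms.get(t, [t])[0] for t in area.lower().split()]

# Hypothetical employee profile built from the terms in their documents (assumed data).
profile = Counter("machine learning models for information retrieval".split())

area = "information retrieval"
synonym_table = {"information": ["document"], "retrieval": ["search"]}  # assumed synonym table

literal_score = term_overlap_score(area.split(), profile)
paraphrase_score = term_overlap_score(augment_with_synonyms(area, synonym_table), profile)

print(literal_score, paraphrase_score)  # 2.0 0.0 -- the term-based score collapses under paraphrasing
```

A purely term-based model rewards the literal phrasing and misses the paraphrase entirely, which is the literal-mention bias the abstract describes; retrievers that match at the semantic level are not penalised in the same way.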
Related papers
- Designing an Interpretable Interface for Contextual Bandits [0.0]
We design a new interface to explain to domain experts the underlying behaviour of a bandit.
Our findings suggest that by carefully balancing technical rigour with accessible presentation, it is possible to empower non-experts to manage complex machine learning systems.
arXiv Detail & Related papers (2024-09-23T15:47:44Z) - The FIX Benchmark: Extracting Features Interpretable to eXperts [9.688218822056823]
We present FIX (Features Interpretable to eXperts), a benchmark for measuring how well a collection of features aligns with expert knowledge.
We propose FIXScore, a unified expert alignment measure applicable to diverse real-world settings across cosmology, psychology, and medicine domains.
arXiv Detail & Related papers (2024-09-20T17:53:03Z) - Improving Retrieval in Theme-specific Applications using a Corpus
Topical Taxonomy [52.426623750562335]
We introduce the ToTER (Topical Taxonomy Enhanced Retrieval) framework.
ToTER identifies the central topics of queries and documents with the guidance of the taxonomy, and exploits their topical relatedness to supplement missing contexts.
As a plug-and-play framework, ToTER can be flexibly employed to enhance various PLM-based retrievers.
arXiv Detail & Related papers (2024-03-07T02:34:54Z) - Causal Discovery with Language Models as Imperfect Experts [119.22928856942292]
We consider how expert knowledge can be used to improve the data-driven identification of causal graphs.
We propose strategies for amending such expert knowledge based on consistency properties.
We report a case study, on real data, where a large language model is used as an imperfect expert.
arXiv Detail & Related papers (2023-07-05T16:01:38Z) - Re-Examining Human Annotations for Interpretable NLP [80.81532239566992]
We conduct controlled experiments via crowdsourcing websites on two widely used datasets in Interpretable NLP.
We compare the annotation results obtained from recruiting workers satisfying different levels of qualification.
Our results reveal that annotation quality depends heavily on the workers' qualification, and that the instructions can steer workers toward providing particular annotations.
arXiv Detail & Related papers (2022-04-10T02:27:30Z) - A Comprehensive Overview of Recommender System and Sentiment Analysis [1.370633147306388]
This paper gives a comprehensive overview to help researchers who aim to work with recommender systems and sentiment analysis.
It covers background on the recommender system concept, including the phases, approaches, and performance metrics used in recommender systems.
It then discusses the sentiment analysis concept, highlighting its main aspects, such as levels and approaches, with a focus on aspect-based sentiment analysis.
arXiv Detail & Related papers (2021-09-18T01:08:41Z) - Human readable network troubleshooting based on anomaly detection and
feature scoring [11.593495085674343]
We present a system based on (i) unsupervised learning methods for detecting anomalies in the time domain, (ii) an attention mechanism to rank features in the feature space and (iii) an expert knowledge module.
We thoroughly evaluate the performance of the full system and of its individual building blocks.
arXiv Detail & Related papers (2021-08-26T14:20:36Z) - Estimation of Fair Ranking Metrics with Incomplete Judgments [70.37717864975387]
We propose a sampling strategy and estimation technique for four fair ranking metrics.
We formulate a robust and unbiased estimator which can operate even with a very limited number of labeled items.
arXiv Detail & Related papers (2021-08-11T10:57:00Z) - Rethinking Search: Making Experts out of Dilettantes [55.90140165205178]
When experiencing an information need, users want to engage with an expert, but often turn to an information retrieval system, such as a search engine.
This paper examines how ideas from classical information retrieval and large pre-trained language models can be synthesized and evolved into systems that truly deliver on the promise of expert advice.
arXiv Detail & Related papers (2021-05-05T18:40:00Z) - Fairness-Aware Explainable Recommendation over Knowledge Graphs [73.81994676695346]
We analyze different groups of users according to their level of activity, and find that bias exists in recommendation performance between different groups.
We show that inactive users may be more susceptible to receiving unsatisfactory recommendations, due to insufficient training data for the inactive users.
We propose a fairness constrained approach via re-ranking to mitigate this problem in the context of explainable recommendation over knowledge graphs.
arXiv Detail & Related papers (2020-06-03T05:04:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.