ASPIRE: Assistive System for Performance Evaluation in IR
- URL: http://arxiv.org/abs/2412.15759v1
- Date: Fri, 20 Dec 2024 10:25:28 GMT
- Title: ASPIRE: Assistive System for Performance Evaluation in IR
- Authors: Georgios Peikos, Wojciech Kusa, Symeon Symeonidis,
- Abstract summary: ASPIRE (Assistive System for Performance Evaluation in IR) is a visual analytics tool for in-depth analysis of Information Retrieval experiments.
ASPIRE supports four key aspects of IR experiment evaluation and analysis: single/multi-experiment comparisons, query-level analysis, query characteristics-performance interplay, and collection-based retrieval analysis.
- Score: 1.9441753575523208
- Abstract: Information Retrieval (IR) evaluation involves far more complexity than merely presenting performance measures in a table. Researchers often need to compare multiple models across various dimensions, such as the Precision-Recall trade-off and response time, to understand the reasons behind the varying performance of specific queries for different models. We introduce ASPIRE (Assistive System for Performance Evaluation in IR), a visual analytics tool designed to address these complexities by providing an extensive and user-friendly interface for in-depth analysis of IR experiments. ASPIRE supports four key aspects of IR experiment evaluation and analysis: single/multi-experiment comparisons, query-level analysis, query characteristics-performance interplay, and collection-based retrieval analysis. We showcase the functionality of ASPIRE using the TREC Clinical Trials collection. ASPIRE is an open-source toolkit available online: https://github.com/GiorgosPeikos/ASPIRE
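As a minimal, hedged illustration of the query-level, multi-experiment comparison ASPIRE supports (the toolkit's own implementation lives in the linked repository), the sketch below uses the pytrec_eval library to score two runs per query and surface the queries where the experiments diverge most. The qrels and runs are toy placeholders.

```python
# Minimal sketch of query-level run comparison (not ASPIRE's actual code).
# Requires: pip install pytrec_eval
import pytrec_eval

# Toy relevance judgments and two retrieval runs (placeholder data).
qrels = {
    "q1": {"d1": 1, "d2": 0, "d3": 1},
    "q2": {"d4": 1, "d5": 1},
}
run_a = {
    "q1": {"d1": 0.9, "d3": 0.8, "d2": 0.1},
    "q2": {"d5": 0.7, "d4": 0.2},
}
run_b = {
    "q1": {"d2": 0.9, "d1": 0.3, "d3": 0.2},
    "q2": {"d4": 0.8, "d5": 0.6},
}

evaluator = pytrec_eval.RelevanceEvaluator(qrels, {"ndcg", "map"})
scores_a = evaluator.evaluate(run_a)  # {qid: {measure: value}}
scores_b = evaluator.evaluate(run_b)

# Rank queries by the nDCG gap between the two experiments: the queries
# with the largest gaps are where a per-query analysis pays off.
gaps = sorted(
    ((qid, scores_a[qid]["ndcg"] - scores_b[qid]["ndcg"]) for qid in qrels),
    key=lambda x: abs(x[1]),
    reverse=True,
)
for qid, gap in gaps:
    print(f"{qid}: ndcg(A) - ndcg(B) = {gap:+.3f}")
```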
Related papers
- VAMP: Visual Analytics for Microservices Performance [2.5824043688763543]
Existing distributed tracing tools use swimlane views as the primary means of supporting performance analysis.
We introduce vamp, a visual analytics tool that enables the performance analysis of multiple end-to-end requests at once.
We show how vamp aids in identifying RPC execution-time deviations with a significant impact on end-to-end performance; a minimal sketch of this idea follows the entry.
arXiv Detail & Related papers (2024-04-22T15:22:56Z)
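The following is a hypothetical sketch of the deviation-detection idea described above, not vamp's actual code: group RPC spans by operation and flag spans whose execution time far exceeds that operation's typical latency. The span data and the 2x-median threshold are illustrative assumptions.

```python
# Hypothetical sketch of flagging RPC execution-time deviations,
# in the spirit of vamp's analysis (not its actual code).
from collections import defaultdict
from statistics import median

# Toy trace data: (operation name, execution time in ms) per RPC span.
spans = [
    ("auth.Check", 12.0), ("auth.Check", 11.5), ("auth.Check", 48.0),
    ("auth.Check", 12.3),
    ("cart.Get", 5.0), ("cart.Get", 5.4), ("cart.Get", 5.1),
]

by_op = defaultdict(list)
for op, ms in spans:
    by_op[op].append(ms)

# Flag spans that take more than twice the operation's median latency;
# such spans are candidates for a large end-to-end impact.
for op, times in by_op.items():
    m = median(times)
    for ms in times:
        if ms > 2 * m:
            print(f"deviating span: {op} took {ms} ms (median {m:.1f} ms)")
```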
- RAGGED: Towards Informed Design of Retrieval Augmented Generation Systems [51.171355532527365]
Retrieval-augmented generation (RAG) can significantly improve the performance of language models (LMs).
RAGGED is a framework for analyzing RAG configurations across various document-based question answering tasks; a toy configuration sweep is sketched after this entry.
arXiv Detail & Related papers (2024-03-14T02:26:31Z)
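RAGGED's concrete interface is not given in the summary above; the sketch below illustrates, under toy stand-ins for the retriever, reader, corpus, and dataset, the kind of configuration analysis it describes: sweeping the number of retrieved passages and measuring downstream answer accuracy.

```python
# Hypothetical sketch of a RAG configuration sweep (not RAGGED's actual API).

# Toy corpus and dataset; a real analysis would plug in an actual retriever/reader.
CORPUS = {
    "p1": "Paris is the capital of France.",
    "p2": "Berlin is the capital of Germany.",
    "p3": "The Seine flows through Paris.",
}
DATASET = [("What is the capital of France?", "Paris")]

def retrieve(question: str, k: int) -> list[str]:
    # Stand-in retriever: rank passages by word overlap with the question.
    words = set(question.lower().split())
    ranked = sorted(CORPUS.values(),
                    key=lambda p: len(words & set(p.lower().split())),
                    reverse=True)
    return ranked[:k]

def generate_answer(question: str, passages: list[str]) -> str:
    # Stand-in reader: return the first capitalized word not already
    # present in the question.
    for p in passages:
        for token in p.split():
            if token[0].isupper() and token.lower() not in question.lower():
                return token.strip(".")
    return ""

# RAGGED-style sweep: how does answer accuracy change with the number
# of passages given to the reader?
for k in (1, 2, 3):
    correct = sum(
        generate_answer(q, retrieve(q, k)).lower() == gold.lower()
        for q, gold in DATASET
    )
    print(f"k={k}: accuracy={correct / len(DATASET):.2f}")
```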
- BIRCO: A Benchmark of Information Retrieval Tasks with Complex Objectives [2.3420045370973828]
We present the Benchmark of Information Retrieval (IR) tasks with Complex Objectives (BIRCO).
BIRCO evaluates the ability of IR systems to retrieve documents given multi-faceted user objectives.
arXiv Detail & Related papers (2024-02-21T22:22:30Z)
- Building Interpretable and Reliable Open Information Retriever for New Domains Overnight [67.03842581848299]
Information retrieval is a critical component of many downstream tasks, such as open-domain question answering (QA).
We propose an information retrieval pipeline that uses an entity/event linking model and a query decomposition model to focus more accurately on the different information units of a query; a toy sketch of this decompose-then-retrieve shape follows this entry.
We show that, while being more interpretable and reliable, our proposed pipeline significantly improves passage coverage and denotation accuracy across five IR and QA benchmarks.
arXiv Detail & Related papers (2023-08-09T07:47:17Z)
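The pipeline's models are not available from the summary; the following is a hypothetical sketch of its decompose-then-retrieve shape, with trivial stand-ins for the query decomposition model, the entity/event linking model, and the retriever.

```python
# Hypothetical sketch of a decompose-then-retrieve pipeline
# (the shape of the proposed approach, not the paper's actual models).

def decompose(query: str) -> list[str]:
    # Stand-in for the query decomposition model: split on "and".
    return [part.strip() for part in query.split(" and ")]

def link_entities(sub_query: str) -> list[str]:
    # Stand-in for the entity/event linking model: capitalized words.
    return [w.strip("?.,") for w in sub_query.split() if w[:1].isupper()]

def retrieve(sub_query: str, entities: list[str]) -> list[str]:
    # Stand-in retriever: return passage IDs (here, synthetic labels).
    return [f"passage-for-{e}" for e in entities] or [f"passage-for-{sub_query}"]

query = "Who founded SpaceX and when did Falcon 1 first launch?"
evidence = []
for sub in decompose(query):               # one retrieval per information unit
    evidence.extend(retrieve(sub, link_entities(sub)))
print(evidence)  # merged passage pool handed to the downstream QA reader
```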
- PyRCA: A Library for Metric-based Root Cause Analysis [66.72542200701807]
PyRCA is an open-source machine learning library for Root Cause Analysis (RCA) in Artificial Intelligence for IT Operations (AIOps).
It provides a holistic framework for uncovering complicated causal dependencies among metrics and automatically locating the root causes of incidents; a generic sketch of the idea follows this entry.
arXiv Detail & Related papers (2023-06-20T09:55:10Z)
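Rather than guess at PyRCA's actual API, here is a generic sketch of the core idea behind metric-based RCA: given a causal graph over metrics and the set of metrics that look anomalous during an incident, score candidate root causes by how much of the anomaly they can explain downstream. The graph, the anomaly set, and the scoring rule are illustrative assumptions; networkx is used only for graph reachability.

```python
# Generic sketch of causal-graph root-cause scoring (the idea behind
# metric-based RCA; this is not PyRCA's actual API).
import networkx as nx

# Toy causal graph over service metrics: an edge u -> v means u influences v.
g = nx.DiGraph()
g.add_edges_from([
    ("db.latency", "api.latency"),
    ("api.latency", "frontend.latency"),
    ("cache.hit_rate", "api.latency"),
])

anomalous = {"db.latency", "api.latency", "frontend.latency"}

# Score each anomalous metric as a candidate root cause by the fraction
# of anomalies it can reach in the causal graph: the most upstream
# candidate explains the most.
scores = {
    m: len((nx.descendants(g, m) | {m}) & anomalous) / len(anomalous)
    for m in anomalous
}
for metric, score in sorted(scores.items(), key=lambda x: -x[1]):
    print(f"{metric}: explains {score:.0%} of anomalous metrics")
```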
- Revisiting the Evaluation of Image Synthesis with GANs [55.72247435112475]
This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models.
In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set; a Fréchet-distance sketch of the middle question follows this entry.
arXiv Detail & Related papers (2023-04-04T17:54:32Z)
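As one concrete instance of "how to calculate a fair distance using selected samples," the snippet below computes the standard Fréchet distance between Gaussian fits of two feature sets, the computation underlying FID. Random features stand in for real image embeddings; this is not the paper's exact protocol.

```python
# Standard Fréchet distance between Gaussian fits of two feature sets
# (the computation behind FID); random features stand in for real
# embeddings of generated and reference images.
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean = sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):  # numerical noise can add tiny imaginary parts
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))

rng = np.random.default_rng(0)
real_feats = rng.normal(0.0, 1.0, size=(500, 16))  # e.g. pooled CNN features
fake_feats = rng.normal(0.3, 1.1, size=(500, 16))
print(f"Frechet distance: {frechet_distance(real_feats, fake_feats):.3f}")
```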
- Interactive Dimensionality Reduction for Comparative Analysis [28.52130400665133]
We introduce an interactive DR framework where we integrate our new DR method, called ULCA, with an interactive visual interface.
ULCA unifies two DR schemes, discriminant analysis and contrastive learning, to support various comparative analysis tasks.
We develop an optimization algorithm that enables analysts to interactively refine ULCA results.
arXiv Detail & Related papers (2021-06-29T15:05:36Z)
- Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences between models and datasets, as well as the interplay between them; a toy attribute-bucketed breakdown is sketched after this entry.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z)
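A minimal sketch of the attribute-bucketed breakdown behind interpretable NER evaluation (not the paper's actual tool): compute precision, recall, and F1 separately per entity-length bucket over toy span predictions.

```python
# Minimal sketch of attribute-bucketed NER evaluation (in the spirit of
# interpretable evaluation; not the paper's actual tool). Entities are
# (sentence_id, start, end, type) tuples; the data is toy.
from collections import defaultdict

gold = {(0, 0, 2, "PER"), (0, 5, 6, "LOC"), (1, 2, 5, "ORG")}
pred = {(0, 0, 2, "PER"), (0, 5, 6, "ORG"), (1, 2, 5, "ORG")}

def length_bucket(entity):
    span = entity[2] - entity[1]
    return "short (1-2 tokens)" if span <= 2 else "long (3+ tokens)"

buckets = defaultdict(lambda: {"tp": 0, "gold": 0, "pred": 0})
for e in gold:
    buckets[length_bucket(e)]["gold"] += 1
for e in pred:
    buckets[length_bucket(e)]["pred"] += 1
    if e in gold:
        buckets[length_bucket(e)]["tp"] += 1

# Per-bucket F1 exposes where a model fails (e.g. only on long entities).
for name, c in sorted(buckets.items()):
    p = c["tp"] / c["pred"] if c["pred"] else 0.0
    r = c["tp"] / c["gold"] if c["gold"] else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    print(f"{name}: P={p:.2f} R={r:.2f} F1={f1:.2f}")
```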
- Shared Space Transfer Learning for analyzing multi-site fMRI data [83.41324371491774]
Multi-voxel pattern analysis (MVPA) learns predictive models from task-based functional magnetic resonance imaging (fMRI) data.
MVPA works best with a well-designed feature set and an adequate sample size.
Most fMRI datasets are noisy, high-dimensional, expensive to collect, and limited to small sample sizes.
This paper proposes Shared Space Transfer Learning (SSTL), a novel transfer learning approach.
arXiv Detail & Related papers (2020-10-24T08:50:26Z)
- IOHanalyzer: Detailed Performance Analyses for Iterative Optimization Heuristics [3.967483941966979]
IOHanalyzer is a new user-friendly tool for the analysis, comparison, and visualization of performance data of iterative optimization heuristics (IOHs).
IOHanalyzer provides detailed statistics about both the fixed-target running times and the fixed-budget performance of the benchmarked algorithms; both statistics are sketched after this entry.
IOHanalyzer can directly process performance data from the main benchmarking platforms.
arXiv Detail & Related papers (2020-07-08T08:20:19Z)
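IOHanalyzer itself consumes IOHprofiler-style logs; as a plain-Python sketch of the two statistics named above, the snippet below computes fixed-budget best-so-far quality and fixed-target running time from toy (evaluations, best-so-far quality) run logs.

```python
# Sketch of the two core statistics IOHanalyzer reports, computed from
# toy run logs (not IOHanalyzer's actual code). Each run is a list of
# (evaluations_used, best_so_far_quality) pairs, quality maximized.
runs = [
    [(1, 0.2), (10, 0.5), (100, 0.8), (1000, 0.9)],
    [(1, 0.1), (10, 0.6), (100, 0.7), (1000, 0.95)],
]

def fixed_budget(run, budget):
    """Best quality reached within the given evaluation budget."""
    feasible = [q for evals, q in run if evals <= budget]
    return max(feasible) if feasible else float("-inf")

def fixed_target(run, target):
    """Evaluations needed to first reach the target quality (None = never)."""
    for evals, q in run:
        if q >= target:
            return evals
    return None

budget, target = 100, 0.75
print("fixed-budget best-so-far:", [fixed_budget(r, budget) for r in runs])
print("fixed-target running time:", [fixed_target(r, target) for r in runs])
```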
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and accepts no responsibility for any consequences of its use.