LISP -- A Rich Interaction Dataset and Loggable Interactive Search Platform
- URL: http://arxiv.org/abs/2601.09366v1
- Date: Wed, 14 Jan 2026 10:49:13 GMT
- Title: LISP -- A Rich Interaction Dataset and Loggable Interactive Search Platform
- Authors: Jana Isabelle Friese, Andreas Konstantin Kruff, Philipp Schaer, Norbert Fuhr, Nicola Ferro,
- Abstract summary: We present a reusable dataset and accompanying infrastructure for studying human search behavior in Interactive Information Retrieval (IIR)<n>The dataset combines detailed interaction logs from 61 participants with user characteristics, including perceptual speed, topic-specific interest, search expertise, and demographic information.
- Score: 10.637323019551035
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a reusable dataset and accompanying infrastructure for studying human search behavior in Interactive Information Retrieval (IIR). The dataset combines detailed interaction logs from 61 participants (122 sessions) with user characteristics, including perceptual speed, topic-specific interest, search expertise, and demographic information. To facilitate reproducibility and reuse, we provide a fully documented study setup, a web-based perceptual speed test, and a framework for conducting similar user studies. Our work allows researchers to investigate individual and contextual factors affecting search behavior, and to develop or validate user simulators that account for such variability. We illustrate the datasets potential through an illustrative analysis and release all resources as open-access, supporting reproducible research and resource sharing in the IIR community.
Related papers
- Understanding Usage and Engagement in AI-Powered Scientific Research Tools: The Asta Interaction Dataset [47.98539809308384]
We analyze the Asta Interaction dataset, a large-scale resource comprising over 200,000 user queries and interaction logs.<n>We characterize query patterns, engagement behaviors, and how usage evolves with experience.<n>We release the anonymized dataset and analysis with a new query taxonomy to inform future designs of real-world AI research assistants.
arXiv Detail & Related papers (2026-02-26T18:40:28Z) - ECHO: An Open Research Platform for Evaluation of Chat, Human Behavior, and Outcomes [6.989051035721272]
ECHO is an open research platform designed to support mixed-method studies of human interaction with both conversational AI systems and Web search engines.<n>It enables researchers from varying disciplines to orchestrate end-to-end experimental that integrate consent and background surveys, chat-based and search-based information-seeking sessions, writing or judgment tasks, and pre-task evaluations within a unified, low-coding-load framework.
arXiv Detail & Related papers (2026-02-10T21:10:38Z) - Linking Heterogeneous Data with Coordinated Agent Flows for Social Media Analysis [24.70488591952602]
Social media platforms generate massive volumes of heterogeneous data.<n>We present SIA (Social Insight Agents), an agent system that links heterogeneous multi-modal data.<n>We show that SIA effectively discovers diverse and meaningful insights from social media.
arXiv Detail & Related papers (2025-10-30T06:22:49Z) - Qilin: A Multimodal Information Retrieval Dataset with APP-level User Sessions [39.2477761959206]
The challenge of improving user experiences in complex systems with search and recommendation (S&R) has drawn significant attention from both academia and industry.<n>We present a novel multimodal information retrieval dataset in this paper, namely Qilin.<n>The dataset is collected from Xiaohongshu, a popular social platform with over 300 million monthly active users and an average search penetration rate of over 70%.
arXiv Detail & Related papers (2025-03-01T14:15:00Z) - A Comprehensive Survey on Composed Image Retrieval [54.54527281731775]
Composed Image Retrieval (CIR) is an emerging yet challenging task that allows users to search for target images using a multimodal query.<n>There is currently no comprehensive review of CIR to provide a timely overview of this field.<n>We synthesize insights from over 120 publications in top conferences and journals, including ACM TOIS, SIGIR, and CVPR.
arXiv Detail & Related papers (2025-02-19T01:37:24Z) - A Comprehensive Survey on Imbalanced Data Learning [56.65067795190842]
imbalanced data is prevalent in various types of raw data and hinders the performance of machine learning.<n>This survey systematically analyzes various real-world data formats.<n>It concludes existing researches for different data formats into four categories: data re-balancing, feature representation, training strategy, and ensemble learning.
arXiv Detail & Related papers (2025-02-13T04:53:17Z) - User-centric evaluation of explainability of AI with and for humans: a comprehensive empirical study [5.775094401949666]
This study is located in the Human-Centered Artificial Intelligence (HCAI)
It focuses on the results of a user-centered assessment of commonly used eXplainable Artificial Intelligence (XAI) algorithms.
arXiv Detail & Related papers (2024-10-21T12:32:39Z) - DISCOVER: A Data-driven Interactive System for Comprehensive Observation, Visualization, and ExploRation of Human Behaviour [6.716560115378451]
We introduce a modular, flexible, yet user-friendly software framework specifically developed to streamline computational-driven data exploration for human behavior analysis.
Our primary objective is to democratize access to advanced computational methodologies, thereby enabling researchers across disciplines to engage in detailed behavioral analysis without the need for extensive technical proficiency.
arXiv Detail & Related papers (2024-07-18T11:28:52Z) - STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases [93.96463520716759]
We develop STARK, a large-scale Semi-structure retrieval benchmark on Textual and Knowledge Bases.
Our benchmark covers three domains: product search, academic paper search, and queries in precision medicine.
We design a novel pipeline to synthesize realistic user queries that integrate diverse relational information and complex textual properties.
arXiv Detail & Related papers (2024-04-19T22:54:54Z) - Learning Representations without Compositional Assumptions [79.12273403390311]
We propose a data-driven approach that learns feature set dependencies by representing feature sets as graph nodes and their relationships as learnable edges.
We also introduce LEGATO, a novel hierarchical graph autoencoder that learns a smaller, latent graph to aggregate information from multiple views dynamically.
arXiv Detail & Related papers (2023-05-31T10:36:10Z) - A Unified Comparison of User Modeling Techniques for Predicting Data
Interaction and Detecting Exploration Bias [17.518601254380275]
We compare and rank eight user modeling algorithms based on their performance on a diverse set of four user study datasets.
Based on our findings, we highlight open challenges and new directions for analyzing user interactions and visualization provenance.
arXiv Detail & Related papers (2022-08-09T19:51:10Z) - Co-Located Human-Human Interaction Analysis using Nonverbal Cues: A
Survey [71.43956423427397]
We aim to identify the nonverbal cues and computational methodologies resulting in effective performance.
This survey differs from its counterparts by involving the widest spectrum of social phenomena and interaction settings.
Some major observations are: the most often used nonverbal cue, computational method, interaction environment, and sensing approach are speaking activity, support vector machines, and meetings composed of 3-4 persons equipped with microphones and cameras, respectively.
arXiv Detail & Related papers (2022-07-20T13:37:57Z) - Mining Implicit Entity Preference from User-Item Interaction Data for
Knowledge Graph Completion via Adversarial Learning [82.46332224556257]
We propose a novel adversarial learning approach by leveraging user interaction data for the Knowledge Graph Completion task.
Our generator is isolated from user interaction data, and serves to improve the performance of the discriminator.
To discover implicit entity preference of users, we design an elaborate collaborative learning algorithms based on graph neural networks.
arXiv Detail & Related papers (2020-03-28T05:47:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.