Report on the 1st Workshop on Large Language Model for Evaluation in Information Retrieval (LLM4Eval 2024) at SIGIR 2024
- URL: http://arxiv.org/abs/2408.05388v1
- Date: Fri, 09 Aug 2024 23:55:58 GMT
- Title: Report on the 1st Workshop on Large Language Model for Evaluation in Information Retrieval (LLM4Eval 2024) at SIGIR 2024
- Authors: Hossein A. Rahmani, Clemencia Siro, Mohammad Aliannejadi, Nick Craswell, Charles L. A. Clarke, Guglielmo Faggioli, Bhaskar Mitra, Paul Thomas, Emine Yilmaz
- Abstract summary: The aim was to bring information retrieval researchers together around the topic of LLMs for evaluation in information retrieval.
Given the novelty of the topic, the workshop was focused around multi-sided discussions.
- Score: 37.103230004631996
- Abstract: The first edition of the workshop on Large Language Model for Evaluation in Information Retrieval (LLM4Eval 2024) took place in July 2024, co-located with the ACM SIGIR Conference 2024 in the USA (SIGIR 2024). The aim was to bring information retrieval researchers together around the topic of LLMs for evaluation in information retrieval, which has gathered attention with the advancement of large language models and generative AI. Given the novelty of the topic, the workshop was focused around multi-sided discussions, namely panels and poster sessions of the accepted proceedings papers.
Related papers
- LLM+KG@VLDB'24 Workshop Summary [9.347889830892182]
Large language models (LLMs) and knowledge graphs (KGs) have emerged as a hot topic.
At the LLM+KG'24 workshop, held in conjunction with VLDB 2024 in Guangzhou, China, one of the key themes explored was important data management challenges and opportunities.
This report outlines the major directions and approaches presented by various speakers during the workshop.
arXiv Detail & Related papers (2024-10-02T19:35:35Z)
- Report on the Workshop on Simulations for Information Access (Sim4IA 2024) at SIGIR 2024 [33.22176229332443]
This paper is a report on the Workshop on Simulations for Information Access (Sim4IA) at SIGIR 2024.
Key takeaways were user simulation's importance in academia and industry, the possible bridging of online and offline evaluation, and the issues of organizing a companion shared task around user simulations for information access.
arXiv Detail & Related papers (2024-09-26T16:32:10Z)
- Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews [51.453135368388686]
We present an approach for estimating the fraction of text in a large corpus which is likely to be substantially modified or produced by a large language model (LLM).
Our maximum likelihood model leverages expert-written and AI-generated reference texts to accurately and efficiently examine real-world LLM-use at the corpus level.
arXiv Detail & Related papers (2024-03-11T21:51:39Z)
- Recent Advances, Applications, and Open Challenges in Machine Learning for Health: Reflections from Research Roundtables at ML4H 2023 Symposium [71.81297744767885]
The third ML4H symposium was held in person on December 10, 2023, in New Orleans, Louisiana, USA.
We organized eleven in-person roundtables and four virtual roundtables at ML4H 2022.
This document serves as a comprehensive review paper, summarizing the recent advancements in machine learning for healthcare.
arXiv Detail & Related papers (2024-03-03T22:21:58Z)
- Perception Test 2023: A Summary of the First Challenge And Outcome [67.0525378209708]
The First Perception Test challenge was held as a half-day workshop alongside the IEEE/CVF International Conference on Computer Vision (ICCV) 2023.
The goal was to benchmark state-of-the-art video models on the recently proposed Perception Test benchmark.
We summarise in this report the task descriptions, metrics, baselines, and results.
arXiv Detail & Related papers (2023-12-20T15:12:27Z)
- Sixth International Workshop on Languages for Modelling Variability (MODEVAR 2024) [0.0]
This is the proceedings of the Sixth International Workshop on Languages for Modelling Variability (MODEVAR 2024), which was held in Bern, Switzerland, on February 6th, 2024.
arXiv Detail & Related papers (2023-12-19T08:28:06Z)
- Proceedings Fifth International Workshop on Formal Methods for Autonomous Systems [0.0]
FMAS 2023 was co-located with the 18th International Conference on integrated Formal Methods (iFM'22).
The workshop itself was held at Scheltema Leiden, a renovated 19th Century blanket factory alongside the canal.
arXiv Detail & Related papers (2023-11-15T14:20:56Z)
- Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond [89.54151859266202]
The 2023 Multilingual Speech Universal Performance Benchmark (ML-SUPERB) Challenge expands upon the acclaimed SUPERB framework.
The challenge garnered 12 model submissions and 54 language corpora, resulting in a comprehensive benchmark encompassing 154 languages.
The findings indicate that merely scaling models is not the definitive solution for multilingual speech tasks.
arXiv Detail & Related papers (2023-10-09T08:30:01Z)
- L-Eval: Instituting Standardized Evaluation for Long Context Language Models [91.05820785008527]
We propose L-Eval to institute a more standardized evaluation for long context language models (LCLMs).
We build a new evaluation suite containing 20 sub-tasks, 508 long documents, and over 2,000 human-labeled query-response pairs.
Results show that popular n-gram matching metrics generally cannot correlate well with human judgment.
arXiv Detail & Related papers (2023-07-20T17:59:41Z)
- Proceedings End-to-End Compositional Models of Vector-Based Semantics [0.0]
The workshop was sponsored by the research project 'A composition calculus for vector-based semantic modelling with a localization for Dutch'.
The present volume collects the contributed papers and the abstracts of the invited talks.
arXiv Detail & Related papers (2022-08-10T12:50:12Z)