Demo: Soccer Information Retrieval via Natural Queries using SoccerRAG
- URL: http://arxiv.org/abs/2406.01280v2
- Date: Mon, 22 Jul 2024 06:44:20 GMT
- Title: Demo: Soccer Information Retrieval via Natural Queries using SoccerRAG
- Authors: Aleksander Theo Strand, Sushant Gautam, Cise Midoglu, Pål Halvorsen
- Abstract summary: SoccerRAG is an innovative framework designed to harness the power of Retrieval Augmented Generation (RAG) and Large Language Models (LLMs) to extract soccer-related information through natural language queries.
By leveraging a multimodal dataset, SoccerRAG supports dynamic querying and automatic data validation.
We present a novel interactive user interface (UI) based on the Chainlit framework which wraps around the core functionality.
- Score: 42.095162323265676
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid evolution of digital sports media necessitates sophisticated information retrieval systems that can efficiently parse extensive multimodal datasets. This paper demonstrates SoccerRAG, an innovative framework designed to harness the power of Retrieval Augmented Generation (RAG) and Large Language Models (LLMs) to extract soccer-related information through natural language queries. By leveraging a multimodal dataset, SoccerRAG supports dynamic querying and automatic data validation, enhancing user interaction and accessibility to sports archives. We present a novel interactive user interface (UI) based on the Chainlit framework which wraps around the core functionality and enables users to interact with the SoccerRAG framework in a chatbot-like visual manner.
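The abstract describes an LLM-backed RAG pipeline wrapped in a Chainlit chat UI but gives no implementation details. The following is a minimal, hypothetical sketch of that architecture: the `retrieve` and `generate` helpers are illustrative stand-ins for SoccerRAG's actual retrieval and LLM components, while the decorator and message calls reflect Chainlit's real API.

```python
# Minimal sketch: a Chainlit chat UI wrapped around a RAG pipeline.
# `retrieve` and `generate` are hypothetical placeholders, not SoccerRAG's
# actual components; only the Chainlit wiring reflects the real framework.
import chainlit as cl

def retrieve(query: str) -> list[str]:
    # Placeholder: a real system would query a vector store or database
    # built from the multimodal soccer dataset.
    return ["Example fact: Team A beat Team B 2-1 in the 2016 final."]

def generate(query: str, context: list[str]) -> str:
    # Placeholder: a real system would prompt an LLM with the retrieved
    # context and the user's question.
    return f"Based on {len(context)} retrieved passage(s): {context[0]}"

@cl.on_message
async def handle_message(message: cl.Message):
    # Each user message triggers retrieval followed by generation,
    # and the answer is sent back into the chat window.
    passages = retrieve(message.content)
    answer = generate(message.content, passages)
    await cl.Message(content=answer).send()
```

Saved as app.py, this would be served with `chainlit run app.py`, yielding the chatbot-like visual interaction the abstract describes.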
Related papers
- GridMind: A Multi-Agent NLP Framework for Unified, Cross-Modal NFL Data Insights [0.0]
This paper introduces GridMind, a framework that unifies structured, semi-structured, and unstructured data through Retrieval-Augmented Generation (RAG) and large language models (LLMs).
This approach aligns with the evolving field of multimodal representation learning, where unified models are increasingly essential for real-time, cross-modal interactions.
arXiv Detail & Related papers (2025-03-24T18:33:36Z)
- Towards Text-Image Interleaved Retrieval [49.96332254241075]
We introduce the text-image interleaved retrieval (TIIR) task, where the query and document are interleaved text-image sequences.
We construct a TIIR benchmark based on naturally interleaved wikiHow tutorials, where a specific pipeline is designed to generate interleaved queries.
We propose a novel Matryoshka Multimodal Embedder (MME), which compresses the number of visual tokens at different levels of granularity (see the illustrative sketch after this list).
arXiv Detail & Related papers (2025-02-18T12:00:47Z)
- A Proposed Large Language Model-Based Smart Search for Archive System [0.0]
This study presents a novel framework for smart search in digital archival systems.
By employing a Retrieval-Augmented Generation (RAG) approach, the framework enables the processing of natural language queries.
We present the architecture and implementation of the system and evaluate its performance in four experiments.
arXiv Detail & Related papers (2025-01-13T02:53:07Z)
- Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining [67.87810796668981]
Iris introduces two key techniques: Information-Sensitive Cropping (ISC) and Self-Refining Dual Learning (SRDL).
Iris achieves state-of-the-art performance across multiple benchmarks with only 850K GUI annotations.
These improvements translate to significant gains in both web and OS agent downstream tasks.
arXiv Detail & Related papers (2024-12-13T18:40:10Z)
- Enhancing Multimodal Query Representation via Visual Dialogues for End-to-End Knowledge Retrieval [26.585985828583304]
We propose an end-to-end multimodal retrieval system, Ret-XKnow, to endow a text retriever with the ability to understand multimodal queries.
To effectively learn multimodal interaction, we also introduce the Visual Dialogue-to-Retrieval dataset automatically constructed from visual dialogue datasets.
We demonstrate that our approach not only significantly improves retrieval performance in zero-shot settings but also achieves substantial improvements in fine-tuning scenarios.
arXiv Detail & Related papers (2024-11-13T04:32:58Z)
- GUI Agents with Foundation Models: A Comprehensive Survey [52.991688542729385]
This survey consolidates recent research on (M)LLM-based GUI agents.
We highlight key innovations in data, frameworks, and applications.
We hope this paper will inspire further developments in the field of (M)LLM-based GUI agents.
arXiv Detail & Related papers (2024-11-07T17:28:10Z)
- EDGE: Enhanced Grounded GUI Understanding with Enriched Multi-Granularity Synthetic Data [15.801018643716437]
This paper aims to enhance the GUI understanding and interacting capabilities of large vision-language models (LVLMs) through a data-driven approach.
We propose EDGE, a general data synthesis framework that automatically generates large-scale, multi-granularity training data from webpages across the Web.
Our approach significantly reduces the dependence on manual annotations, empowering researchers to harness the vast public resources available on the Web to advance their work.
arXiv Detail & Related papers (2024-10-25T10:46:17Z)
- An Interactive Multi-modal Query Answering System with Retrieval-Augmented Large Language Models [21.892975397847316]
We present an interactive Multi-modal Query Answering (MQA) system, empowered by our newly developed multi-modal retrieval framework and navigation graph index.
One notable aspect of MQA is its use of contrastive learning to assess the significance of different modalities (a generic sketch of this idea appears after this list).
The system achieves efficient retrieval through our advanced navigation graph index, refined using computational pruning techniques.
arXiv Detail & Related papers (2024-07-05T02:01:49Z)
- LLaRA: Supercharging Robot Learning Data for Vision-Language Policy [56.505551117094534]
Vision Language Models (VLMs) can process state information as visual-textual prompts and respond with policy decisions in text.
We propose LLaRA: Large Language and Robotics Assistant, a framework that formulates robot action policy as conversations.
arXiv Detail & Related papers (2024-06-28T17:59:12Z)
- SoccerRAG: Multimodal Soccer Information Retrieval via Natural Queries [42.095162323265676]
SoccerRAG is an innovative framework designed to harness the power of Retrieval Augmented Generation (RAG) and Large Language Models (LLMs) to extract soccer-related information through natural language queries.
By leveraging a multimodal dataset, SoccerRAG supports dynamic querying and automatic data validation.
Our evaluations indicate that SoccerRAG effectively handles complex queries, offering significant improvements over traditional retrieval systems.
arXiv Detail & Related papers (2024-06-03T12:39:04Z)
- SoccerNet-Echoes: A Soccer Game Audio Commentary Dataset [46.60191376520379]
This paper presents SoccerNet-Echoes, an augmentation of the SoccerNet dataset with automatically generated transcriptions of audio commentaries from soccer game broadcasts.
By incorporating textual data alongside visual and auditory content, SoccerNet-Echoes aims to serve as a comprehensive resource for the development of algorithms specialized in capturing the dynamics of soccer games.
arXiv Detail & Related papers (2024-05-12T18:25:38Z)
- KamerRaad: Enhancing Information Retrieval in Belgian National Politics through Hierarchical Summarization and Conversational Interfaces [55.00702535694059]
KamerRaad is an AI tool that leverages large language models to help citizens interactively engage with Belgian political information.
The tool extracts and concisely summarizes key excerpts from parliamentary proceedings, and then supports follow-up interaction powered by generative AI.
arXiv Detail & Related papers (2024-04-22T15:01:39Z)
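The MME entry above mentions compressing visual tokens at multiple granularities, but this listing gives no architectural details. The sketch below is a generic, hypothetical illustration of multi-granularity token compression via adaptive pooling; the function name, granularity values, and pooling choice are assumptions, not the paper's actual method.

```python
# Hypothetical multi-granularity visual-token compression; the actual
# Matryoshka Multimodal Embedder is not specified in this listing.
import torch
import torch.nn.functional as F

def compress_visual_tokens(tokens: torch.Tensor, granularities=(4, 16, 64)):
    # tokens: (seq_len, dim) patch-token sequence from an image encoder.
    compressed = {}
    for g in granularities:
        # Adaptive average pooling shrinks the sequence to g tokens,
        # giving a coarse-to-fine family of representations.
        pooled = F.adaptive_avg_pool1d(tokens.t().unsqueeze(0), g)
        compressed[g] = pooled.squeeze(0).t()  # shape: (g, dim)
    return compressed

tokens = torch.randn(256, 512)  # e.g., 256 patch tokens of width 512
print({g: t.shape for g, t in compress_visual_tokens(tokens).items()})
```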
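Similarly, the MQA entry notes that contrastive learning is used to assess modality significance. One simple, hypothetical realization (the system's real design is not described here) learns softmax weights over per-modality similarity scores with an InfoNCE-style loss:

```python
# Hypothetical sketch: learning modality weights contrastively.
# The MQA system's actual formulation is not given in this listing.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityWeightedScorer(nn.Module):
    def __init__(self, num_modalities: int):
        super().__init__()
        # One learnable logit per modality; softmax turns them into weights.
        self.logits = nn.Parameter(torch.zeros(num_modalities))

    def forward(self, sims: torch.Tensor) -> torch.Tensor:
        # sims: (batch, num_candidates, num_modalities) per-modality
        # similarities between a query and candidate documents.
        weights = torch.softmax(self.logits, dim=0)
        return (sims * weights).sum(dim=-1)  # fused score per candidate

def info_nce_loss(scores: torch.Tensor, temperature: float = 0.07):
    # scores: (batch, num_candidates); by convention candidate 0 is the
    # positive, so the loss pushes its fused score above the negatives.
    targets = torch.zeros(scores.size(0), dtype=torch.long)
    return F.cross_entropy(scores / temperature, targets)

scorer = ModalityWeightedScorer(num_modalities=3)   # e.g., text/video/audio
sims = torch.randn(8, 10, 3)        # 8 queries, 10 candidates, 3 modalities
loss = info_nce_loss(scorer(sims))  # train weights to favor the positive
loss.backward()                     # gradients flow into scorer.logits
```

After training, the softmax of `scorer.logits` indicates how much each modality contributes to retrieval, which is one way to "assess the significance of different modalities".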
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.