Related papers: From Multimodal Perception to Strategic Reasoning: A Survey on AI-Generated Game Commentary

Related papers

Let the Barbarians In: How AI Can Accelerate Systems Performance Research [80.43506848683633]
We term this iterative cycle of generation, evaluation, and refinement AI-Driven Research for Systems.<n>We demonstrate that ADRS-generated solutions can match or even outperform human state-of-the-art designs.
arXiv Detail & Related papers (2025-12-16T18:51:23Z)
Deep Research: A Systematic Survey [118.82795024422722]
Deep Research (DR) aims to combine the reasoning capabilities of large language models with external tools, such as search engines.<n>This survey presents a comprehensive and systematic overview of deep research systems.
arXiv Detail & Related papers (2025-11-24T15:28:28Z)
A Survey on Video Anomaly Detection via Deep Learning: Human, Vehicle, and Environment [2.3349787245442966]
Video Anomaly Detection (VAD) has emerged as a pivotal task in computer vision, with broad relevance across multiple fields.<n>Recent advances in deep learning have driven significant progress in this area, yet the field remains fragmented across domains and learning paradigms.<n>This survey offers a comprehensive perspective on VAD, systematically organizing the literature across various supervision levels.
arXiv Detail & Related papers (2025-08-19T18:50:49Z)
A Survey of Automatic Evaluation Methods on Text, Visual and Speech Generations [58.105900601078595]
We present a comprehensive review and a unified taxonomy of automatic evaluation methods for generated content across all three modalities.<n>Our analysis begins by examining evaluation methods for text generation, where techniques are most mature.<n>We then extend this framework to image and audio generation, demonstrating its broad applicability.
arXiv Detail & Related papers (2025-06-06T11:09:46Z)
Enhancing Knowledge Graph Completion with Entity Neighborhood and Relation Context [12.539576594311127]
We propose KGC-ERC, a framework that integrates both types of context to enrich the input of generative language models and enhance their reasoning capabilities.<n>Experiments on the Wikidata5M, Wiki27K, and FB15K-237-N datasets show that KGC-ERC outperforms or matches state-of-the-art baselines in predictive performance and scalability.
arXiv Detail & Related papers (2025-03-29T20:04:50Z)
A Survey on Knowledge-Oriented Retrieval-Augmented Generation [45.65542434522205]
Retrieval-Augmented Generation (RAG) has gained significant attention in recent years.<n>RAG combines large-scale retrieval systems with generative models.<n>We discuss the key characteristics of RAG, such as its ability to augment generative models with dynamic external knowledge.
arXiv Detail & Related papers (2025-03-11T01:59:35Z)
BabelBench: An Omni Benchmark for Code-Driven Analysis of Multimodal and Multistructured Data [61.936320820180875]
Large language models (LLMs) have become increasingly pivotal across various domains. BabelBench is an innovative benchmark framework that evaluates the proficiency of LLMs in managing multimodal multistructured data with code execution. Our experimental findings on BabelBench indicate that even cutting-edge models like ChatGPT 4 exhibit substantial room for improvement.
arXiv Detail & Related papers (2024-10-01T15:11:24Z)
On the Element-Wise Representation and Reasoning in Zero-Shot Image Recognition: A Systematic Survey [82.49623756124357]
Zero-shot image recognition (ZSIR) aims to recognize and reason in unseen domains by learning generalized knowledge from limited data.<n>This paper thoroughly investigates recent advances in element-wise ZSIR and provides a basis for its future development.
arXiv Detail & Related papers (2024-08-09T05:49:21Z)
A Comprehensive Survey on Underwater Image Enhancement Based on Deep Learning [51.7818820745221]
Underwater image enhancement (UIE) presents a significant challenge within computer vision research. Despite the development of numerous UIE algorithms, a thorough and systematic review is still absent.
arXiv Detail & Related papers (2024-05-30T04:46:40Z)
How Far Are We From AGI: Are LLMs All We Need? [15.705756259264932]
AGI is distinguished by its ability to execute diverse real-world tasks with efficiency and effectiveness comparable to human intelligence. This paper outlines the requisite capability frameworks for AGI, integrating the internal, interface, and system dimensions. To give tangible insights into the ubiquitous impact of the integration of AI, we outline existing challenges and potential pathways toward AGI in multiple domains.
arXiv Detail & Related papers (2024-05-16T17:59:02Z)
ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications [10.529898520273063]
ACLSum is a novel summarization dataset carefully crafted and evaluated by domain experts. In contrast to previous datasets, ACLSum facilitates multi-aspect summarization of scientific papers.
arXiv Detail & Related papers (2024-03-08T13:32:01Z)
A Literature Review of Literature Reviews in Pattern Analysis and Machine Intelligence [51.26815896167173]
We present a comprehensive tertiary analysis of PAMI reviews along three complementary dimensions.<n>Our analyses reveal distinctive organizational patterns as well as persistent gaps in current review practices.<n>Finally, our evaluation of state-of-the-art AI-generated reviews indicates encouraging advances in coherence and organization.
arXiv Detail & Related papers (2024-02-20T11:28:50Z)
CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models [49.16989035566899]
Retrieval-Augmented Generation (RAG) is a technique that enhances the capabilities of large language models (LLMs) by incorporating external knowledge sources. This paper constructs a large-scale and more comprehensive benchmark, and evaluates all the components of RAG systems in various RAG application scenarios.
arXiv Detail & Related papers (2024-01-30T14:25:32Z)
Retrieval-Augmented Generation for Large Language Models: A Survey [17.82361213043507]
Large Language Models (LLMs) showcase impressive capabilities but encounter challenges like hallucination. Retrieval-Augmented Generation (RAG) has emerged as a promising solution by incorporating knowledge from external databases.
arXiv Detail & Related papers (2023-12-18T07:47:33Z)
Striking Gold in Advertising: Standardization and Exploration of Ad Text Generation [5.3558730908641525]
We propose a first benchmark dataset, CAMERA, to standardize the task of ATG. Our experiments show the current state and the remaining challenges. We also explore how existing metrics in ATG and an LLM-based evaluator align with human evaluations.
arXiv Detail & Related papers (2023-09-21T12:51:24Z)
GENEVA: Benchmarking Generalizability for Event Argument Extraction with Hundreds of Event Types and Argument Roles [77.05288144035056]
Event Argument Extraction (EAE) has focused on improving model generalizability to cater to new events and domains. Standard benchmarking datasets like ACE and ERE cover less than 40 event types and 25 entity-centric argument roles.
arXiv Detail & Related papers (2022-05-25T05:46:28Z)
Representation-Centric Survey of Skeletal Action Recognition and the ANUBIS Benchmark [43.00059447663327]
3D skeleton-based human action recognition has emerged as a powerful alternative to traditional RGB and depth-based approaches.<n>Despite remarkable progress, current research remains fragmented across diverse input representations.<n>ANUBIS is a large-scale, challenging skeleton action dataset designed to address critical gaps in existing benchmarks.
arXiv Detail & Related papers (2022-05-04T14:03:43Z)
Open Domain Question Answering over Virtual Documents: A Unified Approach for Data and Text [62.489652395307914]
We use the data-to-text method as a means for encoding structured knowledge for knowledge-intensive applications, i.e. open-domain question answering (QA) Specifically, we propose a verbalizer-retriever-reader framework for open-domain QA over data and text where verbalized tables from Wikipedia and triples from Wikidata are used as augmented knowledge sources. We show that our Unified Data and Text QA, UDT-QA, can effectively benefit from the expanded knowledge index, leading to large gains over text-only baselines.
arXiv Detail & Related papers (2021-10-16T00:11:21Z)
Artificial Intelligence Narratives: An Objective Perspective on Current Developments [0.0]
This work provides a starting point for researchers interested in gaining a deeper understanding of the big picture of artificial intelligence (AI) An essential takeaway for the reader is that AI must be understood as an umbrella term encompassing a plethora of different methods, schools of thought, and their respective historical movements.
arXiv Detail & Related papers (2021-03-18T17:33:00Z)
GENIE: A Leaderboard for Human-in-the-Loop Evaluation of Text Generation [83.10599735938618]
Leaderboards have eased model development for many NLP datasets by standardizing their evaluation and delegating it to an independent external repository. This work introduces GENIE, an human evaluation leaderboard, which brings the ease of leaderboards to text generation tasks.
arXiv Detail & Related papers (2021-01-17T00:40:47Z)
KILT: a Benchmark for Knowledge Intensive Language Tasks [102.33046195554886]
We present a benchmark for knowledge-intensive language tasks (KILT) All tasks in KILT are grounded in the same snapshot of Wikipedia. We find that a shared dense vector index coupled with a seq2seq model is a strong baseline.
arXiv Detail & Related papers (2020-09-04T15:32:19Z)

This list is automatically generated from the titles and abstracts of the papers in this site.