Related papers: "Rebuilding" Statistics in the Age of AI: A Town Hall Discussion on Culture, Infrastructure, and Training

URL: http://arxiv.org/abs/2601.17510v1
Date: Sat, 24 Jan 2026 16:15:04 GMT
Title: "Rebuilding" Statistics in the Age of AI: A Town Hall Discussion on Culture, Infrastructure, and Training
Authors: David L. Donoho, Jian Kang, Xihong Lin, Bhramar Mukherjee, Dan Nettleton, Rebecca Nugent, Abel Rodriguez, Eric P. Xing, Tian Zheng, Hongtu Zhu
Abstract summary: The town hall was structured around open panel discussion and extensive audience Q&A. The preprint aims to support transparency, community reflection, and ongoing dialogue about the evolving role of statistics in the data- and AI-centric future.
Score: 39.65771411355121
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This article presents the full, original record of the 2024 Joint Statistical Meetings (JSM) town hall, "Statistics in the Age of AI," which convened leading statisticians to discuss how the field is evolving in response to advances in artificial intelligence, foundation models, large-scale empirical modeling, and data-intensive infrastructures. The town hall was structured around open panel discussion and extensive audience Q&A, with the aim of eliciting candid, experience-driven perspectives rather than formal presentations or prepared statements. This document preserves the extended exchanges among panelists and audience members, with minimal editorial intervention, and organizes the conversation around five recurring questions concerning disciplinary culture and practices, data curation and "data work," engagement with modern empirical modeling, training for large-scale AI applications, and partnerships with key AI stakeholders. By providing an archival record of this discussion, the preprint aims to support transparency, community reflection, and ongoing dialogue about the evolving role of statistics in the data- and AI-centric future.

Related papers

The Rise of AI Agent Communities: Large-Scale Analysis of Discourse and Interaction on Moltbook [62.2627874717318]
Moltbook is a Reddit-like social platform where AI agents create posts and interact with other agents through comments and replies. Using a public API snapshot collected about five days after launch, we address three research questions: what AI agents discuss, how they post, and how they interact. We show that agents' writing is predominantly neutral, with positivity appearing in community engagement and assistance-oriented content.
arXiv Detail & Related papers (2026-02-13T05:28:31Z)
An Overview of Large Language Models for Statisticians [109.38601458831545]
Large Language Models (LLMs) have emerged as transformative tools in artificial intelligence (AI). This paper explores potential areas where statisticians can make important contributions to the development of LLMs. We focus on issues such as uncertainty quantification, interpretability, fairness, privacy, watermarking and model adaptation.
arXiv Detail & Related papers (2025-02-25T03:40:36Z)
Future of Information Retrieval Research in the Age of Generative AI [61.56371468069577]
In the fast-evolving field of information retrieval (IR), the integration of generative AI technologies such as large language models (LLMs) is transforming how users search for and interact with information. Recognizing this paradigm shift, a visioning workshop was held in July 2024 to discuss the future of IR in the age of generative AI. This report contains a summary of discussions as potentially important research topics and contains a list of recommendations for academics, industry practitioners, institutions, evaluation campaigns, and funding agencies.
arXiv Detail & Related papers (2024-12-03T00:01:48Z)
Constraining Participation: Affordances of Feedback Features in Interfaces to Large Language Models [49.266685603250416]
Large language models (LLMs) are now accessible to anyone with a computer, a web browser, and an internet connection via browser-based interfaces. This article examines how interactive feedback features in ChatGPT's interface afford user participation in LLMs.
arXiv Detail & Related papers (2024-08-27T13:50:37Z)
Auto-survey Challenge [0.0]
We present a novel platform for evaluating the capability of Large Language Models (LLMs) to autonomously compose and critique survey papers. Within this framework, we organized a competition for the AutoML conference 2023. Entrants are tasked with presenting stand-alone models adept at authoring articles from designated prompts and subsequently appraising them.
arXiv Detail & Related papers (2023-10-06T09:12:35Z)
Data Augmentation for Conversational AI [17.48107304359591]
Data augmentation (DA) is an effective approach to alleviate the data scarcity problem in conversational systems. This tutorial provides a comprehensive and up-to-date overview of DA approaches in the context of conversational systems.
arXiv Detail & Related papers (2023-09-09T09:56:35Z)
PIPPA: A Partially Synthetic Conversational Dataset [13.393459829805144]
We introduce a partially-synthetic dataset named PIPPA (Personal Interaction Pairs between People and AI). PIPPA is a result of a community-driven crowdsourcing effort involving a group of role-play enthusiasts. The dataset comprises over 1 million utterances that are distributed across 26,000 conversation sessions.
arXiv Detail & Related papers (2023-08-11T00:33:26Z)
Automatic Evaluation and Moderation of Open-domain Dialogue Systems [59.305712262126264]
A long-standing challenge for researchers is the lack of effective automatic evaluation metrics. This paper describes the data, baselines, and results obtained for Track 5 at the Dialogue System Technology Challenge 10 (DSTC10).
arXiv Detail & Related papers (2021-11-03T10:08:05Z)
A Hierarchical Network for Abstractive Meeting Summarization with Cross-Domain Pretraining [52.11221075687124]
We propose a novel abstractive summary network that adapts to the meeting scenario. We design a hierarchical structure to accommodate long meeting transcripts and a role vector to depict the difference among speakers. Our model outperforms previous approaches in both automatic metrics and human evaluation.
arXiv Detail & Related papers (2020-04-04T21:00:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.