The Compute Divide in Machine Learning: A Threat to Academic
Contribution and Scrutiny?
- URL: http://arxiv.org/abs/2401.02452v2
- Date: Mon, 8 Jan 2024 12:37:58 GMT
- Title: The Compute Divide in Machine Learning: A Threat to Academic
Contribution and Scrutiny?
- Authors: Tamay Besiroglu, Sage Andrus Bergerson, Amelia Michael, Lennart Heim,
Xueyun Luo, Neil Thompson
- Abstract summary: We show that a compute divide has coincided with a reduced representation of academic-only research teams in compute-intensive research topics.
To address the challenges arising from this trend, we recommend approaches aimed at thoughtfully expanding academic insights.
- Score: 1.0985060632689174
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There are pronounced differences in the extent to which industrial and
academic AI labs use computing resources. We provide a data-driven survey of
the role of the compute divide in shaping machine learning research. We show
that a compute divide has coincided with a reduced representation of
academic-only research teams in compute-intensive research topics, especially
foundation models. We argue that academia will likely play a smaller role in
advancing the associated techniques, providing critical evaluation and
scrutiny, and in the diffusion of such models. Concurrent with this change in
research focus, there is a noticeable shift in academic research towards
embracing open-source, pre-trained models developed within the industry. To
address the challenges arising from this trend, especially reduced scrutiny of
influential models, we recommend approaches aimed at thoughtfully expanding
academic insights. Nationally sponsored computing infrastructure coupled with
open science initiatives could judiciously boost academic compute access,
prioritizing research on interpretability, safety and security. Structured
access programs and third-party auditing may also allow measured external
evaluation of industry systems.
Related papers
- Retrieval-Enhanced Machine Learning: Synthesis and Opportunities [60.34182805429511]
Retrieval-enhancement can be extended to a broader spectrum of machine learning (ML).
This work introduces a formal framework of this paradigm, Retrieval-Enhanced Machine Learning (REML), by synthesizing the literature in various domains in ML with consistent notation, which is missing from the current literature.
The goal of this work is to equip researchers across various disciplines with a comprehensive, formally structured framework of retrieval-enhanced models, thereby fostering interdisciplinary future research.
arXiv Detail & Related papers (2024-07-17T20:01:21Z)
- Mapping Computer Science Research: Trends, Influences, and Predictions [0.0]
We employ advanced machine learning techniques, including Decision Tree and Logistic Regression models, to predict trending research areas.
Our analysis reveals that the number of references cited in research papers (Reference Count) plays a pivotal role in determining trending research areas.
The Logistic Regression model outperforms the Decision Tree model in predicting trends, exhibiting higher accuracy, precision, recall, and F1 score.
arXiv Detail & Related papers (2023-08-01T16:59:25Z)
- Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach [77.61131357420201]
We conduct a thorough and rigorous study on fairness disparities in peer review with the help of large language models (LMs).
We collect, assemble, and maintain a comprehensive relational database for the International Conference on Learning Representations (ICLR) from 2017 to date.
We postulate and study fairness disparities on multiple protective attributes of interest, including author gender, geography, and author and institutional prestige.
arXiv Detail & Related papers (2022-11-07T16:19:42Z)
- Statistical Foundation Behind Machine Learning and Its Impact on Computer Vision [8.974457198386414]
This paper revisits the principle of uniform convergence in statistical learning, discusses how it acts as the foundation behind machine learning, and attempts to gain a better understanding of the essential problem that current deep learning algorithms are solving.
Using computer vision as an example domain in machine learning, the discussion shows that recent research trends in leveraging increasingly large-scale data to perform pre-training for representation learning are largely to reduce the discrepancy between a practically tractable empirical loss and its ultimately desired but intractable expected loss.
arXiv Detail & Related papers (2022-09-06T17:59:04Z)
- Research Trends and Applications of Data Augmentation Algorithms [77.34726150561087]
We identify the main areas of application of data augmentation algorithms, the types of algorithms used, significant research trends, their progression over time and research gaps in data augmentation literature.
We expect readers to understand the potential of data augmentation, as well as identify future research directions and open questions within data augmentation research.
arXiv Detail & Related papers (2022-07-18T11:38:32Z)
- Evaluation Methods and Measures for Causal Learning Algorithms [33.07234268724662]
We focus on the two fundamental causal-inference tasks and causality-aware machine learning tasks.
The survey seeks to bring to the forefront the urgency of developing publicly available benchmarks and consensus-building standards for causal learning evaluation with observational data.
arXiv Detail & Related papers (2022-02-07T00:24:34Z)
- Flashlight: Enabling Innovation in Tools for Machine Learning [50.63188263773778]
We introduce Flashlight, an open-source library built to spur innovation in machine learning tools and systems.
We see Flashlight as a tool enabling research that can benefit widely used libraries downstream and bring machine learning and systems researchers closer together.
arXiv Detail & Related papers (2022-01-29T01:03:29Z)
- Artificial Intelligence for IT Operations (AIOPS) Workshop White Paper [50.25428141435537]
Artificial Intelligence for IT Operations (AIOps) is an emerging interdisciplinary field arising in the intersection between machine learning, big data, streaming analytics, and the management of IT operations.
The main aim of the AIOPS workshop is to bring together researchers from both academia and industry to present their experiences, results, and work in progress in this field.
arXiv Detail & Related papers (2021-01-15T10:43:10Z)
- Knowledge as Invariance -- History and Perspectives of Knowledge-augmented Machine Learning [69.99522650448213]
Research in machine learning is at a turning point.
Research interests are shifting away from increasing the performance of highly parameterized models on exceedingly specific tasks.
This white paper provides an introduction and discussion of this emerging field in machine learning research.
arXiv Detail & Related papers (2020-12-21T15:07:19Z)
- A narrowing of AI research? [0.0]
We study the evolution of the thematic diversity of AI research in academia and the private sector.
We measure the influence of private companies in AI research through the citations they receive and their collaborations with other institutions.
arXiv Detail & Related papers (2020-09-22T08:23:56Z)