Sources of Irreproducibility in Machine Learning: A Review
- URL: http://arxiv.org/abs/2204.07610v2
- Date: Fri, 14 Apr 2023 17:31:27 GMT
- Title: Sources of Irreproducibility in Machine Learning: A Review
- Authors: Odd Erik Gundersen, Kevin Coakley, Christine Kirkpatrick and Yolanda Gil
- Abstract summary: There exists no theoretical framework that relates experiment design choices to their potential effects on the conclusions.
The objective of this paper is to develop a framework that enables applied data science practitioners and researchers to understand which experiment design choices can lead to false findings.
- Score: 3.905855359082687
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Background: Many published machine learning studies are irreproducible.
Issues with methodology and a failure to properly account for variation introduced by
the algorithms themselves or their implementations are cited as the main
contributors to this irreproducibility. Problem: There exists no theoretical
framework that relates experiment design choices to their potential effects on the
conclusions. Without such a framework, it is much harder for practitioners and
researchers to evaluate experiment results and describe the limitations of
experiments. The lack of such a framework also makes it harder for independent
researchers to systematically attribute the causes of failed reproducibility
experiments. Objective: The objective of this paper is to develop a framework
that enables applied data science practitioners and researchers to understand
which experiment design choices can lead to false findings, and how, and thereby
to help them analyze the conclusions of reproducibility experiments. Method: We
have compiled an extensive list of factors reported in the literature that can
lead to machine learning studies being irreproducible. These factors are
organized and categorized in a reproducibility framework motivated by the
stages of the scientific method. The factors are analyzed for how they can
affect the conclusions drawn from experiments. A model comparison study is used
as an example. Conclusion: We provide a framework that describes machine
learning methodology from experimental design decisions to the conclusions
inferred from them.
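A model comparison study makes the abstract's point concrete. The following minimal sketch (our illustration, not code from the paper) trains two scikit-learn classifiers on identical data while varying only the seed that drives the algorithms' internal randomness; when the resulting accuracy ranges overlap, a single-run comparison can rank the models either way, which is exactly the kind of fragile conclusion the framework is meant to expose.

```python
# Illustrative sketch only: seed-driven variation in a model comparison.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

scores = {"forest": [], "mlp": []}
for seed in range(10):
    # Identical data and hyperparameters; only algorithm-internal randomness
    # (bootstrap sampling, weight initialization) changes between runs.
    forest = RandomForestClassifier(n_estimators=50, random_state=seed)
    mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=seed)
    scores["forest"].append(forest.fit(X_tr, y_tr).score(X_te, y_te))
    scores["mlp"].append(mlp.fit(X_tr, y_tr).score(X_te, y_te))

for name, s in scores.items():
    print(f"{name}: mean={np.mean(s):.3f}, range=[{min(s):.3f}, {max(s):.3f}]")
# Overlapping ranges mean a single-seed comparison could rank the models
# either way -- one concrete source of irreproducible conclusions.
```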
Related papers
- Hypothesizing Missing Causal Variables with LLMs [55.28678224020973] (arXiv, 2024-09-04)
We formulate a novel task where the input is a partial causal graph with missing variables, and the output is a hypothesis about the missing variables to complete the partial graph.
We show the strong ability of LLMs to hypothesize the mediation variables between a cause and its effect.
We also observe surprising results where some of the open-source models outperform the closed GPT-4 model.
- Design Principles for Falsifiable, Replicable and Reproducible Empirical ML Research [2.3265565167163906] (arXiv, 2024-05-28)
Empirical research plays a fundamental role in the machine learning domain.
We propose a model for the empirical research process, accompanied by guidelines to uphold the validity of empirical research.
- Smoke and Mirrors in Causal Downstream Tasks [59.90654397037007] (arXiv, 2024-05-27)
This paper looks at the causal inference task of treatment effect estimation, where the outcome of interest is recorded in high-dimensional observations.
We compare 6,480 models fine-tuned from state-of-the-art visual backbones, and find that sampling and modeling choices significantly affect the accuracy of the causal estimate.
Our results suggest that future benchmarks should carefully consider real downstream scientific questions, especially causal ones.
- Examining the Effect of Implementation Factors on Deep Learning Reproducibility [1.4295431367554867] (arXiv, 2023-12-11)
Three deep learning experiments were each run five times on 13 different hardware environments and four different software environments.
Accuracy on the same deterministic examples varied by more than 6% due to hardware or software environment variations alone.
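A common mitigation for implementation-level variation is to pin seeds and request deterministic kernels. Below is a minimal sketch, assuming PyTorch (not code from the paper); note that this cannot remove cross-hardware differences like those the study measures.

```python
# Hedged sketch: reduce implementation-level nondeterminism in PyTorch.
import os
import random

import numpy as np
import torch

def make_deterministic(seed: int = 0) -> None:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    # Raise an error instead of silently using a nondeterministic CUDA kernel.
    torch.use_deterministic_algorithms(True)
    # Required by cuBLAS for deterministic behavior on CUDA (set before GPU use).
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
```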
- A Double Machine Learning Approach to Combining Experimental and Observational Data [59.29868677652324] (arXiv, 2023-07-04)
We propose a double machine learning approach to combine experimental and observational studies.
Our framework tests for violations of external validity and ignorability under milder assumptions.
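The summary does not spell out the estimator, so as background, here is a generic double machine learning sketch for the partially linear model (after Chernozhukov et al.), using cross-fitting so that nuisance-model overfitting does not bias the effect estimate; the paper's specific validity tests are not reproduced here.

```python
# Generic double-ML sketch (partially linear model), for background only.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

def dml_effect(X, d, y, n_folds=5, seed=0):
    """Estimate theta in y = theta*d + g(X) + noise via residual-on-residual."""
    d_res, y_res = np.empty_like(d), np.empty_like(y)
    for train, test in KFold(n_folds, shuffle=True, random_state=seed).split(X):
        # Fit nuisance models out-of-fold (cross-fitting).
        g = RandomForestRegressor(random_state=seed).fit(X[train], y[train])
        m = RandomForestRegressor(random_state=seed).fit(X[train], d[train])
        y_res[test] = y[test] - g.predict(X[test])
        d_res[test] = d[test] - m.predict(X[test])
    return float(d_res @ y_res / (d_res @ d_res))

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
d = X[:, 0] + rng.normal(size=500)                  # confounded treatment
y = 2.0 * d + X[:, 0] ** 2 + rng.normal(size=500)   # true effect is 2.0
print(dml_effect(X, d, y))  # should land near 2.0
```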
- A Causal Framework for Decomposing Spurious Variations [68.12191782657437] (arXiv, 2023-06-08)
We develop tools for decomposing spurious variations in Markovian and Semi-Markovian models.
We prove the first results that allow a non-parametric decomposition of spurious effects.
The described approach has several applications, ranging from explainable and fair AI to questions in epidemiology and medicine.
- Testing Causality in Scientific Modelling Software [0.26388783516590225] (arXiv, 2022-09-01)
The Causal Testing Framework uses causal inference techniques to establish causal effects from existing data.
We present three case studies covering real-world scientific models, demonstrating how the Causal Testing Framework can infer metamorphic test outcomes.
- Observing Interventions: A logic for thinking about experiments [62.997667081978825] (arXiv, 2021-11-25)
This paper makes a first step towards a logic of learning from experiments.
Crucial for our approach is the idea that the notion of an intervention can be used as a formal expression of a (real or hypothetical) experiment.
For all the proposed logical systems, we provide a sound and complete axiomatization.
- A Guide to Reproducible Research in Signal Processing and Machine Learning [9.69596041242667] (arXiv, 2021-08-27)
In 2016, a survey conducted by the journal Nature found that 50% of researchers were unable to reproduce their own experiments.
We aim to present signal processing researchers with a set of practical tools and strategies that can help mitigate many of the obstacles to producing reproducible computational experiments.
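One such practical tool is recording an environment snapshot alongside every result, so that a failed reproduction can at least be attributed. The following is a minimal sketch under our own assumptions; the guide's concrete recommendations may differ.

```python
# Hedged sketch: store environment metadata next to experiment results.
import json
import platform
import sys

import numpy as np

def environment_snapshot() -> dict:
    """Capture the facts most often needed to re-run an experiment."""
    return {
        "python": sys.version,
        "platform": platform.platform(),
        "numpy": np.__version__,
    }

# Hypothetical result values, shown only to illustrate the record format.
record = {"accuracy": 0.93, "seed": 0, "env": environment_snapshot()}
with open("run_metadata.json", "w") as f:
    json.dump(record, f, indent=2)
```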
- Optimal Learning for Sequential Decisions in Laboratory Experimentation [0.0] (arXiv, 2020-04-11)
This tutorial aims to provide experimental scientists with a foundation in the science of making decisions.
We introduce the concept of a learning policy, and review the major categories of policies.
We then introduce a policy, known as the knowledge gradient, that maximizes the value of information from each experiment.
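For independent normal beliefs with known measurement noise, the knowledge gradient has a closed form (this is the textbook version due to Frazier and Powell, not necessarily the exact formulation in the tutorial): measure the alternative whose single observation most improves the best posterior mean.

```python
# Textbook knowledge-gradient sketch for independent normal beliefs.
import numpy as np
from scipy.stats import norm

def knowledge_gradient(mu, sigma2, noise2):
    """nu[i] = expected gain in the max posterior mean from sampling i once."""
    mu, sigma2 = np.asarray(mu, float), np.asarray(sigma2, float)
    # Std. dev. of the change in the posterior mean after one observation.
    sigma_tilde = sigma2 / np.sqrt(sigma2 + noise2)
    nu = np.empty_like(mu)
    for i in range(mu.size):
        best_other = np.max(np.delete(mu, i))
        z = -abs(mu[i] - best_other) / sigma_tilde[i]
        nu[i] = sigma_tilde[i] * (z * norm.cdf(z) + norm.pdf(z))
    return nu

mu = [1.0, 1.2, 0.8]      # current belief means for three experiments
sigma2 = [0.5, 0.1, 1.0]  # belief variances (uncertainty about each mean)
print(np.argmax(knowledge_gradient(mu, sigma2, noise2=0.2)))  # next to run
```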
- A Survey on Causal Inference [64.45536158710014] (arXiv, 2020-02-05)
Causal inference is a critical research topic across many domains, such as statistics, computer science, education, public policy and economics.
Various causal effect estimation methods for observational data have sprung up.
This list is automatically generated from the titles and abstracts of the papers on this site.