Related papers: Automatically Finding and Categorizing Replication Studies

Related papers

LLM-Assisted Replication for Quantitative Social Science [11.948334549403583]
Large language models (LLMs) have accelerated scientific production by streamlining writing, coding, and reviewing. We present an LLM-based system that replicates statistical analyses from social science papers and flags potential problems.
arXiv Detail & Related papers (2026-02-04T15:17:49Z)
Hypothesis Hunting with Evolving Networks of Autonomous Scientific Agents [52.50038914857797]
We term this process hypothesis hunting: the cumulative search for insight through sustained exploration across vast and complex hypothesis spaces. We introduce AScience, a framework modeling discovery as the interaction of agents, networks, and evaluation norms, and implement it as ASCollab. Experiments show that such social dynamics enable the accumulation of expert-rated results along the diversity-quality-novelty frontier.
arXiv Detail & Related papers (2025-10-08T08:47:07Z)
Beyond Memorization: Reasoning-Driven Synthesis as a Mitigation Strategy Against Benchmark Contamination [77.69093448529455]
We present an empirical study using an infinitely scalable framework to synthesize research-level QA directly from arXiv papers. We find no significant performance decay near knowledge cutoff dates for models of various sizes, developers, and release dates. We hypothesize that the multi-step reasoning required by our synthesis pipeline offered additional complexity that goes deeper than shallow memorization.
arXiv Detail & Related papers (2025-08-26T16:41:37Z)
100 Days After DeepSeek-R1: A Survey on Replication Studies and More Directions for Reasoning Language Models [58.98176123850354]
The recent release of DeepSeek-R1 has generated widespread social impact and sparked enthusiasm in the research community for exploring the explicit reasoning paradigm of language models. The implementation details of the released models have not been fully open-sourced by DeepSeek, including DeepSeek-R1-Zero, DeepSeek-R1, and the distilled small models. Many replication studies have emerged aiming to reproduce the strong performance achieved by DeepSeek-R1, reaching comparable performance through similar training procedures and fully open-source data resources.
arXiv Detail & Related papers (2025-05-01T14:28:35Z)
Replication Packages in Software Engineering Secondary Studies: A Systematic Mapping [0.9421843976231371]
Systematic reviews (SRs) summarize state-of-the-art evidence in science, including software engineering (SE). We examined 528 secondary studies published between 2013 and 2023 to analyze the availability and reporting of replication packages.
arXiv Detail & Related papers (2025-04-17T05:11:39Z)
ResearchBench: Benchmarking LLMs in Scientific Discovery via Inspiration-Based Task Decomposition [67.26124739345332]
Large language models (LLMs) have demonstrated potential in assisting scientific research, yet their ability to discover high-quality research hypotheses remains unexamined. We introduce the first large-scale benchmark for evaluating LLMs with a near-sufficient set of sub-tasks of scientific discovery. We develop an automated framework that extracts critical components - research questions, background surveys, inspirations, and hypotheses - from scientific papers.
arXiv Detail & Related papers (2025-03-27T08:09:15Z)
Using Large Language Models to Create AI Personas for Replication and Prediction of Media Effects: An Empirical Test of 133 Published Experimental Research Findings [0.3749861135832072]
This report analyzes the potential for large language models (LLMs) to expedite accurate replication of message effects studies. We tested LLM-powered participants by replicating 133 experimental findings from 14 papers containing 45 recent studies in the Journal of Marketing. Our LLM replications successfully reproduced 76% of the original main effects (84 out of 111), demonstrating strong potential for AI-assisted replication of studies in which people respond to media stimuli.
arXiv Detail & Related papers (2024-08-28T18:14:39Z)
Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues. We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space. A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z)
Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions. We propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$.
arXiv Detail & Related papers (2023-12-08T16:06:29Z)
Sample Complexity Bounds for Score-Matching: Causal Discovery and Generative Modeling [82.36856860383291]
We demonstrate that accurate estimation of the score function is achievable by training a standard deep ReLU neural network. We establish bounds on the error rate of recovering causal relationships using the score-matching-based causal discovery method.
arXiv Detail & Related papers (2023-10-27T13:09:56Z)
In-class Data Analysis Replications: Teaching Students while Testing Science [16.951059542542843]
In the present study, we incorporated data analysis replications in the project component of the Applied Data Analysis course taught at EPFL. We find discrepancies between what students expect of data analysis replications and what they experience. We identify tangible benefits of the in-class data analysis replications for scientific communities.
arXiv Detail & Related papers (2023-08-31T06:53:22Z)
Replicable Reinforcement Learning [15.857503103543308]
We provide a provably replicable algorithm for parallel value iteration, and a provably replicable version of R-max in the episodic setting. These are the first formal replicability results for control problems, which present different challenges for replication than batch learning settings.
arXiv Detail & Related papers (2023-05-24T16:05:15Z)
A Study on Reproducibility and Replicability of Table Structure Recognition Methods [3.8366337377024298]
We examine both reproducibility and replicability of a corpus of 16 papers on table structure recognition (TSR). We reproduce results consistent with the original in only four of the 16 papers studied. No paper is identified as replicable using the new dataset.
arXiv Detail & Related papers (2023-04-20T16:30:58Z)
Building a Relation Extraction Baseline for Gene-Disease Associations: A Reproducibility Study [0.0]
We reproduce DEXTER, a system to automatically extract Gene-Disease Associations from biomedical abstracts. The goal is to provide a benchmark for future work on Relation Extraction (RE).
arXiv Detail & Related papers (2022-07-04T08:19:43Z)
Mode recovery in neural autoregressive sequence modeling [55.05526174291747]
Recent studies have revealed unexpected and undesirable properties of neural autoregressive sequence models. We investigate how the modes, or local maxima, of a distribution are maintained throughout the full learning chain. We conclude that future research must consider the entire learning chain in order to fully understand the potentials and perils.
arXiv Detail & Related papers (2021-06-10T02:17:28Z)
Reproducibility Companion Paper: Knowledge Enhanced Neural Fashion Trend Forecasting [78.046352507802]
We provide an artifact that allows the replication of the experiments using a Python implementation. We reproduce the experiments conducted in the original paper and obtain performance similar to that previously reported.
arXiv Detail & Related papers (2021-05-25T10:53:11Z)
Identifying Statistical Bias in Dataset Replication [102.92137353938388]
We study a replication of the ImageNet dataset on which models exhibit a significant (11-14%) drop in accuracy. After correcting for the identified statistical bias, only an estimated $3.6\% \pm 1.5\%$ of the original $11.7\% \pm 1.0\%$ accuracy drop remains unaccounted for.
arXiv Detail & Related papers (2020-05-19T17:48:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.