Can ChatGPT be used to generate scientific hypotheses?
- URL: http://arxiv.org/abs/2304.12208v1
- Date: Thu, 30 Mar 2023 20:40:52 GMT
- Title: Can ChatGPT be used to generate scientific hypotheses?
- Authors: Yang Jeong Park, Daniel Kaplan, Zhichu Ren, Chia-Wei Hsu, Changhao Li,
Haowei Xu, Sipei Li and Ju Li
- Abstract summary: generative AI seems to be able to effectively structure vast amounts of scientific knowledge and provide interesting and testable hypotheses.
The future scientific enterprise may include synergistic efforts with a swarm of "hypothesis machines", challenged by automated experimentation and adversarial peer reviews.
- Score: 0.2010294990327175
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate whether large language models can perform the creative
hypothesis generation that human researchers regularly do. While the error rate
is high, generative AI seems to be able to effectively structure vast amounts
of scientific knowledge and provide interesting and testable hypotheses. The
future scientific enterprise may include synergistic efforts with a swarm of
"hypothesis machines", challenged by automated experimentation and adversarial
peer reviews.
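As a minimal illustration of the kind of prompting workflow the paper investigates, the sketch below asks a chat model for testable hypotheses in a given domain. The model name, prompt wording, and use of the `openai` Python client are assumptions for illustration, not the authors' exact setup.

```python
# Minimal sketch: asking a chat model for testable hypotheses in a given domain.
# Assumes the `openai` Python package (>=1.0) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def generate_hypotheses(topic: str, n: int = 3) -> str:
    """Ask the model for n testable hypotheses about `topic`."""
    prompt = (
        f"Propose {n} novel, testable scientific hypotheses about {topic}. "
        "For each, state the hypothesis, the key assumption, and an experiment "
        "that could falsify it."
    )
    response = client.chat.completions.create(
        model="gpt-4",  # assumption: any capable chat model
        messages=[
            {"role": "system", "content": "You are a careful scientific assistant."},
            {"role": "user", "content": prompt},
        ],
        temperature=0.8,  # higher temperature encourages more speculative ideas
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(generate_hypotheses("solid-state lithium-ion electrolytes"))
```

The "swarm of hypothesis machines" framing amounts to running many such calls and filtering the output with downstream experiments and review, which the paper argues is where most of the value lies.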
Related papers
- Position: Intelligent Science Laboratory Requires the Integration of Cognitive and Embodied AI [98.19195693735487]
We propose the paradigm of Intelligent Science Laboratories (ISLs).
ISLs are a multi-layered, closed-loop framework that deeply integrates cognitive and embodied intelligence.
We argue that such systems are essential for overcoming the current limitations of scientific discovery.
arXiv Detail & Related papers (2025-06-24T13:31:44Z)
- Matter-of-Fact: A Benchmark for Verifying the Feasibility of Literature-Supported Claims in Materials Science [1.7113423851651721]
We introduce Matter-of-Fact, a challenge dataset for determining the feasibility of hypotheses framed as claims.
We show that strong baselines that include retrieval augmented generation over scientific literature and code generation fail to exceed 72% performance.
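A rough sketch of what a retrieval-augmented baseline of this kind could look like is shown below; the TF-IDF retriever, the prompt, and the `judge_claim` helper are illustrative assumptions, not the benchmark's actual baseline.

```python
# Illustrative RAG-style feasibility judgment: retrieve the most similar passages,
# then ask an LLM (passed in as a callable) whether the claim looks feasible.
from typing import Callable, List
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def judge_claim(claim: str, corpus: List[str], llm: Callable[[str], str], k: int = 3) -> str:
    """Retrieve the k most similar passages and ask the LLM for a feasibility verdict."""
    vectorizer = TfidfVectorizer(stop_words="english")
    doc_matrix = vectorizer.fit_transform(corpus)
    claim_vec = vectorizer.transform([claim])
    scores = cosine_similarity(claim_vec, doc_matrix).ravel()
    top_passages = [corpus[i] for i in scores.argsort()[::-1][:k]]

    prompt = (
        "Claim: " + claim + "\n\nEvidence:\n- " + "\n- ".join(top_passages) +
        "\n\nIs the claim feasible given the evidence? "
        "Answer 'feasible' or 'infeasible' with a brief reason."
    )
    return llm(prompt)
```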
arXiv Detail & Related papers (2025-06-04T19:43:18Z)
- AI Scientists Fail Without Strong Implementation Capability [33.232300349142285]
The emergence of the Artificial Intelligence (AI) Scientist represents a paradigm shift in scientific discovery.
Recent AI Scientist studies demonstrate sufficient capabilities for independent scientific discovery.
Despite this substantial progress, the AI Scientist has yet to produce a groundbreaking achievement in the domain of computer science.
arXiv Detail & Related papers (2025-06-02T06:59:10Z)
- Toward Reliable Scientific Hypothesis Generation: Evaluating Truthfulness and Hallucination in Large Language Models [18.850296587858946]
We introduce TruthHypo, a benchmark for assessing the capabilities of large language models in generating truthful hypotheses.
KnowHD is a knowledge-based hallucination detector that evaluates how well hypotheses are grounded in existing knowledge.
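KnowHD's internals are not described here; the toy grounding score below is only a sketch of the general idea, under the assumption that embedding similarity against a small knowledge base is an acceptable proxy for support.

```python
# Toy grounding check: score each sentence of a hypothesis by its best match
# against a small knowledge base. An illustrative proxy, not KnowHD itself.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

def grounding_score(hypothesis_sentences, knowledge_snippets, threshold=0.6):
    """Return the fraction of hypothesis sentences supported by some snippet."""
    hyp_emb = model.encode(hypothesis_sentences, convert_to_tensor=True)
    kb_emb = model.encode(knowledge_snippets, convert_to_tensor=True)
    sims = util.cos_sim(hyp_emb, kb_emb)              # [n_hyp, n_kb] similarity matrix
    supported = sims.max(dim=1).values >= threshold   # best-matching snippet per sentence
    return supported.float().mean().item()
```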
arXiv Detail & Related papers (2025-05-20T16:49:40Z)
- Sparks of Science: Hypothesis Generation Using Structured Paper Data [1.250723303641055]
We introduce HypoGen, the first dataset of approximately 5500 structured problem-hypothesis pairs extracted from top-tier computer science conferences.
We frame hypothesis generation as conditional language modelling, with the model fine-tuned on the Bit-Flip-Spark and Chain-of-Reasoning structure.
We show that by fine-tuning on our HypoGen dataset we improve the novelty, feasibility, and overall quality of the generated hypotheses.
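A minimal sketch of conditional language modelling on problem-hypothesis pairs follows; the base model, the "PROBLEM → HYPOTHESIS" text format, and the hyperparameters are illustrative assumptions, not the HypoGen setup.

```python
# Toy fine-tuning loop: condition a causal LM on a problem statement and train it
# to continue with the hypothesis. Illustrative format and hyperparameters only.
import torch
from torch.optim import AdamW
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = AdamW(model.parameters(), lr=5e-5)

pairs = [  # stand-in for structured problem-hypothesis pairs
    ("Lithium dendrites short-circuit solid-state batteries.",
     "A compliant interlayer that homogenizes current density will suppress dendrite growth."),
]

model.train()
for problem, hypothesis in pairs:
    text = f"PROBLEM: {problem}\nHYPOTHESIS: {hypothesis}{tokenizer.eos_token}"
    batch = tokenizer(text, return_tensors="pt")
    # Labels equal to input_ids => standard next-token (conditional LM) objective.
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

At inference time only the problem prefix would be supplied, and the model samples the hypothesis continuation.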
arXiv Detail & Related papers (2025-04-17T14:29:18Z)
- Scaling Laws in Scientific Discovery with AI and Robot Scientists [72.3420699173245]
An autonomous generalist scientist (AGS) concept combines agentic AI and embodied robotics to automate the entire research lifecycle.
AGS aims to significantly reduce the time and resources needed for scientific discovery.
As these autonomous systems become increasingly integrated into the research process, we hypothesize that scientific discovery might adhere to new scaling laws.
arXiv Detail & Related papers (2025-03-28T14:00:27Z)
- Towards an AI co-scientist [48.11351101913404]
We introduce an AI co-scientist, a multi-agent system built on Gemini 2.0.
The AI co-scientist is intended to help uncover new, original knowledge and to formulate demonstrably novel research hypotheses.
The system's design incorporates a generate, debate, and evolve approach to hypothesis generation, inspired by the scientific method.
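The summary does not specify how the agents are implemented; a skeletal generate-debate-evolve loop might look like the following, where `llm` is any prompt-to-text callable and the prompts are assumptions.

```python
# Skeleton of a generate -> debate -> evolve loop over candidate hypotheses.
# `llm` is any callable mapping a prompt string to a model response string.
from typing import Callable, List

def co_scientist_round(goal: str, llm: Callable[[str], str], n: int = 4, rounds: int = 3) -> List[str]:
    # Generate an initial pool of candidate hypotheses.
    pool = [llm(f"Propose one testable hypothesis for: {goal}") for _ in range(n)]
    for _ in range(rounds):
        # Debate: have the model critique each candidate.
        critiques = [llm(f"Critique this hypothesis rigorously:\n{h}") for h in pool]
        # Evolve: revise each candidate in light of its critique.
        pool = [
            llm(f"Revise the hypothesis to address the critique.\n"
                f"Hypothesis: {h}\nCritique: {c}\nRevised hypothesis:")
            for h, c in zip(pool, critiques)
        ]
    return pool
```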
arXiv Detail & Related papers (2025-02-26T06:17:13Z)
- Two Heads Are Better Than One: A Multi-Agent System Has the Potential to Improve Scientific Idea Generation [48.29699224989952]
VirSci organizes a team of agents to collaboratively generate, evaluate, and refine research ideas.
We show that this multi-agent approach outperforms the state-of-the-art method in producing novel and impactful scientific ideas.
arXiv Detail & Related papers (2024-10-12T07:16:22Z)
- Hypothesizing Missing Causal Variables with LLMs [55.28678224020973]
We formulate a novel task where the input is a partial causal graph with missing variables, and the output is a hypothesis about the missing variables to complete the partial graph.
We show the strong ability of LLMs to hypothesize the mediation variables between a cause and its effect.
We also observe surprising results where some of the open-source models outperform the closed GPT-4 model.
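One way to cast the task programmatically, assuming the partial graph is given as a list of directed edges and `llm` is any prompt-to-text callable (both the encoding and the prompt are illustrative assumptions):

```python
# Illustrative prompt construction for hypothesizing a missing mediator:
# serialize the known edges, point out the cause-effect pair of interest,
# and ask the model to name a plausible intervening variable.
from typing import Callable, List, Tuple

def hypothesize_mediator(edges: List[Tuple[str, str]], cause: str, effect: str,
                         llm: Callable[[str], str]) -> str:
    graph_text = "\n".join(f"{a} -> {b}" for a, b in edges)
    prompt = (
        "Known causal edges:\n" + graph_text +
        f"\n\nWhat unobserved variable most plausibly mediates the effect of "
        f"'{cause}' on '{effect}'? Name it and justify briefly."
    )
    return llm(prompt)

# Example (hypothetical):
# hypothesize_mediator([("smoking", "lung cancer")], "smoking", "lung cancer", llm)
```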
arXiv Detail & Related papers (2024-09-04T10:37:44Z)
- "Turing Tests" For An AI Scientist [0.0]
This paper proposes a "Turing test for an AI scientist" to assess whether an AI agent can conduct scientific research independently.
We propose seven benchmark tests that evaluate an AI agent's ability to make groundbreaking discoveries in various scientific domains.
arXiv Detail & Related papers (2024-05-22T05:14:27Z)
- LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery [141.39722070734737]
We propose to enhance the knowledge-driven, abstract reasoning abilities of Large Language Models with the computational strength of simulations.
We introduce Scientific Generative Agent (SGA), a bilevel optimization framework.
We conduct experiments to demonstrate our framework's efficacy in law discovery and molecular design.
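A stripped-down version of the bilevel idea: the LLM proposes a discrete design (outer level) and a numerical optimizer fits its continuous parameters against a simulator (inner level). The simulator, parameterization, and prompts below are placeholders, not SGA's actual components.

```python
# Bilevel loop sketch: the LLM proposes candidate designs (outer level); for each
# candidate, scipy fits continuous parameters against a simulator (inner level),
# and the resulting score is fed back into the next LLM proposal.
from typing import Callable
import numpy as np
from scipy.optimize import minimize

def simulate_loss(design: str, params: np.ndarray) -> float:
    """Placeholder simulator: returns a loss for a design with given parameters."""
    return float(np.sum(params ** 2))  # stand-in objective

def bilevel_search(llm: Callable[[str], str], goal: str, iterations: int = 5) -> str:
    feedback, best_design, best_loss = "", None, float("inf")
    for _ in range(iterations):
        design = llm(f"Goal: {goal}\nPrevious feedback: {feedback}\n"
                     "Propose one candidate design:")
        # Inner level: optimize continuous parameters for the proposed design.
        inner = minimize(lambda p: simulate_loss(design, p),
                         x0=np.zeros(3), method="Nelder-Mead")
        if inner.fun < best_loss:
            best_design, best_loss = design, inner.fun
        feedback = f"Design '{design}' achieved simulated loss {inner.fun:.3f}."
    return best_design
```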
arXiv Detail & Related papers (2024-05-16T03:04:10Z)
- ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models [56.08917291606421]
ResearchAgent is a large language model-powered research idea writing agent.
It generates problems, methods, and experiment designs while iteratively refining them based on scientific literature.
We experimentally validate our ResearchAgent on scientific publications across multiple disciplines.
arXiv Detail & Related papers (2024-04-11T13:36:29Z)
- The Generative AI Paradox: "What It Can Create, It May Not Understand" [81.89252713236746]
The recent wave of generative AI has sparked excitement and concern over potentially superhuman levels of artificial intelligence.
At the same time, models still show basic errors in understanding that would not be expected even in non-expert humans.
This presents us with an apparent paradox: how do we reconcile seemingly superhuman capabilities with the persistence of errors that few humans would make?
arXiv Detail & Related papers (2023-10-31T18:07:07Z)
- Can Large Language Models Discern Evidence for Scientific Hypotheses? Case Studies in the Social Sciences [3.9985385067438344]
A strong hypothesis is a best guess based on existing evidence and informed by a comprehensive view of relevant literature.
With the exponential increase in the number of scientific articles published annually, manually aggregating and synthesizing the evidence related to a given hypothesis is a challenge.
We share a novel dataset for the task of scientific hypothesis evidencing using community-driven annotations of studies in the social sciences.
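The dataset details are not reproduced here; one natural baseline for hypothesis evidencing is a textual-entailment classifier. The sketch below, using an off-the-shelf zero-shot NLI pipeline, is an assumption about how such a baseline might be set up rather than the paper's method.

```python
# Zero-shot baseline sketch: treat "does this study support the hypothesis?"
# as an entailment-style classification over the study abstract.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

def evidencing_label(hypothesis: str, study_abstract: str) -> str:
    result = classifier(
        study_abstract,
        candidate_labels=["supports the hypothesis", "refutes the hypothesis", "is neutral"],
        hypothesis_template=f"Regarding the claim '{hypothesis}', this study {{}}.",
    )
    return result["labels"][0]  # highest-scoring label
```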
arXiv Detail & Related papers (2023-09-07T04:15:17Z)
- Large Language Models for Automated Open-domain Scientific Hypotheses Discovery [50.40483334131271]
This work proposes the first dataset for social science academic hypotheses discovery.
Unlike previous settings, the new dataset requires (1) using open-domain data (raw web corpus) as observations; and (2) proposing hypotheses that are new even to humanity.
A multi-module framework is developed for the task, including three different feedback mechanisms to boost performance.
arXiv Detail & Related papers (2023-09-06T05:19:41Z)
- Accelerating science with human-aware artificial intelligence [2.7786142348700658]
We show that incorporating the distribution of human expertise by training unsupervised models dramatically improves (up to 400%) AI prediction of future discoveries.
These models succeed by predicting human predictions and the scientists who will make them.
Accelerating human discovery or probing its blind spots, human-aware AI enables us to move toward and beyond the contemporary scientific frontier.
arXiv Detail & Related papers (2023-06-02T12:43:23Z)
- Accelerating science with human versus alien artificial intelligences [3.6354412526174196]
We show that incorporating the distribution of human expertise into self-supervised models dramatically improves AI prediction of future human discoveries and inventions.
These models succeed by predicting human predictions and the scientists who will make them.
By tuning AI to avoid the crowd, however, it generates scientifically promising "alien" hypotheses unlikely to be imagined or pursued without intervention.
arXiv Detail & Related papers (2021-04-12T03:50:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented (including all content) and is not responsible for any consequences of its use.