SciMON: Scientific Inspiration Machines Optimized for Novelty
- URL: http://arxiv.org/abs/2305.14259v7
- Date: Mon, 3 Jun 2024 21:15:28 GMT
- Title: SciMON: Scientific Inspiration Machines Optimized for Novelty
- Authors: Qingyun Wang, Doug Downey, Heng Ji, Tom Hope
- Abstract summary: We explore and enhance the ability of neural language models to generate novel scientific directions grounded in literature.
We take a dramatic departure with a novel setting in which models take background contexts as input.
We present SciMON, a modeling framework that uses retrieval of "inspirations" from past scientific papers.
- Score: 68.46036589035539
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We explore and enhance the ability of neural language models to generate novel scientific directions grounded in literature. Work on literature-based hypothesis generation has traditionally focused on binary link prediction, which severely limits the expressivity of hypotheses. This line of work also does not focus on optimizing novelty. We take a dramatic departure with a novel setting in which models use background contexts (e.g., problems, experimental settings, goals) as input, and output natural language ideas grounded in literature. We present SciMON, a modeling framework that uses retrieval of "inspirations" from past scientific papers, and explicitly optimizes for novelty by iteratively comparing to prior papers and updating idea suggestions until sufficient novelty is achieved. Comprehensive evaluations reveal that GPT-4 tends to generate ideas with overall low technical depth and novelty, while our methods partially mitigate this issue. Our work represents a first step toward evaluating and developing language models that generate new ideas derived from the scientific literature.
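The abstract describes an iterative procedure: draft an idea from a background context plus retrieved "inspirations", compare the draft against prior papers, and revise until it is sufficiently novel. Below is a minimal sketch of such a loop, assuming a caller-supplied generation function and a sentence-embedding model for similarity; the function names, the embedding model, and the threshold are illustrative assumptions, not SciMON's actual implementation.

```python
# Minimal sketch of an iterative novelty loop like the one the abstract
# describes. All names, the embedding model, and the threshold are
# illustrative assumptions, not SciMON's actual implementation.
from typing import Callable

from sentence_transformers import SentenceTransformer, util

def novelty_loop(background: str,
                 inspirations: list[str],
                 prior_abstracts: list[str],
                 generate: Callable[[str, list[str], str], str],
                 threshold: float = 0.85,
                 max_iters: int = 5) -> str:
    """Draft an idea, compare it to prior abstracts, and revise until the
    most similar prior paper falls below a similarity threshold."""
    encoder = SentenceTransformer("all-MiniLM-L6-v2")
    prior_embs = encoder.encode(prior_abstracts, convert_to_tensor=True)
    feedback, idea = "", ""
    for _ in range(max_iters):
        idea = generate(background, inspirations, feedback)  # LM call
        idea_emb = encoder.encode(idea, convert_to_tensor=True)
        sims = util.cos_sim(idea_emb, prior_embs)[0]
        best = int(sims.argmax())
        if float(sims[best]) < threshold:  # sufficiently novel: stop
            return idea
        # Otherwise feed the nearest prior paper back as a critique.
        feedback = "Too similar to prior work; revise: " + prior_abstracts[best]
    return idea
```

The `generate` callable stands in for the language model that drafts ideas; in the setting the abstract describes, it would condition on the background context and the retrieved inspirations.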
Related papers
- Self-reflecting Large Language Models: A Hegelian Dialectical Approach [13.910371970437708]
Investigating NLP through a philosophical lens has recently caught researchers' attention, as it connects computational methods with classical schools of philosophy.
This paper introduces a philosophical approach inspired by the Hegelian dialectic for LLM self-reflection, using a self-dialectical approach to emulate internal critiques and then synthesize new ideas by resolving the contradictory points.
Our experiments show promise in generating new ideas and provide a stepping stone for future research; a toy sketch of such a dialectical loop follows this entry.
arXiv Detail & Related papers (2025-01-24T20:54:29Z)
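A toy sketch of the dialectical loop described above, assuming a generic `llm` callable that maps a prompt to a completion; the prompts and loop structure are my assumptions, not the paper's actual method.

```python
# Toy thesis -> antithesis -> synthesis loop; prompts are illustrative
# assumptions, not the paper's actual prompting scheme.
from typing import Callable

def dialectical_reflection(topic: str, llm: Callable[[str], str],
                           rounds: int = 2) -> str:
    thesis = llm(f"Propose a research idea about: {topic}")
    for _ in range(rounds):
        # Antithesis: the model critiques its own idea.
        antithesis = llm("Critique this idea and list its internal "
                         "contradictions:\n" + thesis)
        # Synthesis: resolve the contradictions into an improved idea.
        thesis = llm("Resolve the contradictions between the idea and the "
                     f"critique into an improved idea.\nIdea: {thesis}\n"
                     f"Critique: {antithesis}")
    return thesis
```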
- Good Idea or Not, Representation of LLM Could Tell [86.36317971482755]
We focus on idea assessment, which aims to leverage the knowledge of large language models to assess the merit of scientific ideas.
We release a benchmark dataset built from nearly four thousand manuscripts with full texts, meticulously designed to train and evaluate the performance of different approaches to this task.
Our findings suggest that the representations of large language models hold more potential for quantifying the value of ideas than their generative outputs (a minimal probe sketch follows this entry).
arXiv Detail & Related papers (2024-09-07T02:07:22Z)
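The finding above suggests scoring ideas from an LM's internal representations rather than its generated text. A minimal sketch follows; the model choice (GPT-2), mean pooling, and the logistic-regression probe are all assumptions rather than the paper's setup.

```python
# Sketch: score ideas from a language model's hidden states rather than
# its generated text. Model, pooling, and probe are assumptions.
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")

def embed(text: str) -> torch.Tensor:
    """Mean-pool the last hidden layer as a fixed-size representation."""
    inputs = tok(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)

def train_probe(ideas: list[str], labels: list[int]) -> LogisticRegression:
    """Fit a linear probe on ideas labeled strong (1) or weak (0)."""
    feats = torch.stack([embed(i) for i in ideas]).numpy()
    return LogisticRegression(max_iter=1000).fit(feats, labels)
```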
- A Survey on Natural Language Counterfactual Generation [7.022371235308068]
Natural language counterfactual generation aims to minimally modify a given text such that the modified text will be classified into a different class.
We propose a new taxonomy that systematically categorizes the generation methods into four groups and summarizes the metrics for evaluating generation quality; a toy illustration of the task follows this entry.
arXiv Detail & Related papers (2024-07-04T15:13:59Z)
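To make the task definition above concrete, here is a toy baseline that greedily substitutes single words until the classifier's label flips; real methods covered by the survey are far more sophisticated, and `classify` plus the substitution lexicon are placeholders.

```python
# Toy illustration of counterfactual generation: flip the classifier's
# label with a minimal (one-word) edit. `classify` and `substitutions`
# are placeholders, not methods from the survey.
from typing import Callable, Optional

def greedy_counterfactual(text: str,
                          classify: Callable[[str], int],
                          substitutions: dict[str, list[str]]
                          ) -> Optional[str]:
    original = classify(text)
    words = text.split()
    for i, word in enumerate(words):
        for alt in substitutions.get(word.lower(), []):
            candidate = " ".join(words[:i] + [alt] + words[i + 1:])
            if classify(candidate) != original:  # label flipped
                return candidate
    return None  # no single-word edit changes the label
```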
- ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models [56.08917291606421]
ResearchAgent is an AI-based system for ideation and operationalization of novel work.
ResearchAgent automatically defines novel problems, proposes methods and designs experiments, while iteratively refining them.
We experimentally validate our ResearchAgent on scientific publications across multiple disciplines.
arXiv Detail & Related papers (2024-04-11T13:36:29Z)
- Grounded Intuition of GPT-Vision's Abilities with Scientific Images [44.44139684561664]
We formalize a process that many have already been trying instinctively: developing a "grounded intuition" of GPT-Vision.
We use our technique to examine alt text generation for scientific figures, finding that GPT-Vision is particularly sensitive to prompting.
Our method and analysis aim to help researchers ramp up their own grounded intuitions of new models while exposing how GPT-Vision can be applied to make information more accessible.
arXiv Detail & Related papers (2023-11-03T17:53:43Z)
- Large Language Models for Automated Open-domain Scientific Hypotheses Discovery [50.40483334131271]
This work proposes the first dataset for social science academic hypotheses discovery.
Unlike previous settings, the new dataset requires (1) using open-domain data (raw web corpus) as observations; and (2) proposing hypotheses even new to humanity.
A multi-module framework is developed for the task, including three different feedback mechanisms to boost performance.
arXiv Detail & Related papers (2023-09-06T05:19:41Z)
- Exploring and Verbalizing Academic Ideas by Concept Co-occurrence [42.16213986603552]
This study devises a framework based on concept co-occurrence for academic idea inspiration.
We construct evolving concept graphs according to the co-occurrence relationship of concepts from 20 disciplines or topics.
We generate a description of an idea based on a new data structure called the co-occurrence citation quintuple (a sketch of the underlying co-occurrence graph follows this entry).
arXiv Detail & Related papers (2023-06-04T07:01:30Z)
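A minimal sketch of the evolving co-occurrence graphs described above: one graph per year, with edge weights counting how often two concepts appear in the same paper. The exact fields of the co-occurrence citation quintuple are not given in the summary, so it is not reconstructed here.

```python
# Sketch of evolving concept co-occurrence graphs: one graph per year,
# edge weights count how often two concepts co-occur in a paper.
from itertools import combinations

import networkx as nx

def build_graphs(papers: list[tuple[int, list[str]]]) -> dict[int, nx.Graph]:
    """papers: (year, concept list) pairs, e.g. (2021, ["LLM", "retrieval"])."""
    graphs: dict[int, nx.Graph] = {}
    for year, concepts in papers:
        g = graphs.setdefault(year, nx.Graph())
        for a, b in combinations(sorted(set(concepts)), 2):
            weight = g.get_edge_data(a, b, {"weight": 0})["weight"]
            g.add_edge(a, b, weight=weight + 1)
    return graphs
```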
- A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond [145.43029264191543]
Non-autoregressive (NAR) generation was first proposed in neural machine translation (NMT) to speed up inference.
While NAR generation can significantly accelerate machine translation inference, the speed-up comes at the cost of translation accuracy compared to autoregressive (AR) generation.
Many new models and algorithms have been proposed to bridge the accuracy gap between NAR and AR generation; a minimal sketch contrasting the two decoding regimes follows this entry.
arXiv Detail & Related papers (2022-04-20T07:25:22Z)
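To make the AR/NAR contrast above concrete, here is a schematic comparison of the two decoding regimes with generic model callables; this illustrates the general idea, not any particular model from the survey.

```python
# Schematic contrast of autoregressive vs. non-autoregressive decoding;
# `step` and `step_all` are generic stand-ins for model calls.
from typing import Callable

def ar_decode(step: Callable[[list[int]], int],
              bos: int, max_len: int) -> list[int]:
    """AR: each token conditions on the previous ones, so decoding costs
    max_len sequential model calls."""
    seq = [bos]
    for _ in range(max_len):
        seq.append(step(seq))  # one model call per token
    return seq[1:]

def nar_decode(step_all: Callable[[list[int], int], list[int]],
               src: list[int], max_len: int) -> list[int]:
    """NAR: all target positions are predicted in one parallel pass,
    trading some accuracy for a large inference speed-up."""
    return step_all(src, max_len)  # a single call for the whole sequence
```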
- Improving Adversarial Text Generation by Modeling the Distant Future [155.83051741029732]
We consider a text planning scheme and present a model-based imitation-learning approach to alleviate issues in adversarial text generation.
We propose a novel guider network that focuses on the generative process over a longer horizon, which can assist next-word prediction and provide intermediate rewards for generator optimization (a simplified sketch follows this entry).
arXiv Detail & Related papers (2020-05-04T05:45:13Z)
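A heavily simplified sketch of the intermediate-reward idea described above: a guider predicts a feature vector for the distant future of the sequence, and the generator is rewarded when the realized continuation matches that prediction. The architecture and reward form are my assumptions, not the paper's exact design.

```python
# Simplified sketch of long-horizon guidance: the guider predicts future
# features from the generated prefix, and the intermediate reward is the
# similarity between prediction and realized continuation. Architecture
# and reward form are assumptions, not the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Guider(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, dim)

    def forward(self, prefix_feats: torch.Tensor) -> torch.Tensor:
        """prefix_feats: (batch, t, dim) features of the generated prefix;
        returns a predicted feature vector for the future continuation."""
        _, h = self.rnn(prefix_feats)
        return self.head(h[-1])

def intermediate_reward(guider: Guider,
                        prefix_feats: torch.Tensor,
                        future_feats: torch.Tensor) -> torch.Tensor:
    """Reward per sequence: cosine similarity between the guider's
    prediction and the mean features of the realized future tokens."""
    pred = guider(prefix_feats)        # (batch, dim)
    target = future_feats.mean(dim=1)  # (batch, dim)
    return F.cosine_similarity(pred, target)
```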
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.