LLMs as Potential Brainstorming Partners for Math and Science Problems
- URL: http://arxiv.org/abs/2310.10677v1
- Date: Tue, 10 Oct 2023 21:16:35 GMT
- Title: LLMs as Potential Brainstorming Partners for Math and Science Problems
- Authors: Sophia Gu
- Abstract summary: A significant chasm still exists between current human-machine intellectual collaborations and the resolution of complex math and science problems.
Recent advancements in Large Language Models (LLMs) offer a promising step towards bridging this divide.
We conduct comprehensive case studies to explore both the capabilities and limitations of the current state-of-the-art LLM, notably GPT-4, in collective brainstorming with humans.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the recent rise of widely successful deep learning models, there is
emerging interest among professionals in various math and science communities
to see and evaluate the state-of-the-art models' abilities to collaborate on
finding or solving problems that often require creativity and thus
brainstorming. While a significant chasm still exists between current
human-machine intellectual collaborations and the resolution of complex math
and science problems, such as the six unsolved Millennium Prize Problems, our
initial investigation into this matter reveals a promising step towards
bridging the divide. This is due to the recent advancements in Large Language
Models (LLMs). More specifically, we conduct comprehensive case studies to
explore both the capabilities and limitations of the current state-of-the-art
LLM, notably GPT-4, in collective brainstorming with humans.
Related papers
- Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning [51.11965014462375]
Multimodal Large Language Models (MLLMs) integrate text, images, and other modalities.
This paper argues that MLLMs can significantly advance scientific reasoning across disciplines such as mathematics, physics, chemistry, and biology.
arXiv Detail & Related papers (2025-02-05T04:05:27Z)
- A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges [25.82535441866882]
This survey provides the first comprehensive analysis of mathematical reasoning in the era of multimodal large language models (MLLMs).
We review over 200 studies published since 2021, and examine the state-of-the-art developments in Math-LLMs.
In particular, we explore the multimodal mathematical reasoning pipeline, as well as the role of (M)LLMs and the associated methodologies.
arXiv Detail & Related papers (2024-12-16T16:21:41Z)
- Assessing the Creativity of LLMs in Proposing Novel Solutions to Mathematical Problems [9.162206328913237]
This study explores the creative potential of Large Language Models (LLMs) in mathematical reasoning.
We introduce a novel framework and benchmark, CreativeMath, which encompasses problems ranging from middle school curricula to Olympic-level competitions.
Our experiments demonstrate that, while LLMs perform well on standard mathematical tasks, their capacity for creative problem-solving varies considerably.
arXiv Detail & Related papers (2024-10-24T00:12:49Z)
- Evaluating Large Vision-and-Language Models on Children's Mathematical Olympiads [74.54183505245553]
A systematic analysis of AI capabilities for joint vision and text reasoning is missing in the current scientific literature.
We evaluate state-of-the-art LVLMs on their mathematical and algorithmic reasoning abilities using visuo-linguistic problems from children's Olympiads.
Our results show that modern LVLMs do demonstrate increasingly powerful reasoning skills in solving problems for higher grades, but lack the foundations to correctly answer problems designed for younger children.
arXiv Detail & Related papers (2024-06-22T05:04:39Z)
- Large Language Models for Mathematical Reasoning: Progresses and Challenges [15.925641169201747]
Large Language Models (LLMs) are geared towards the automated resolution of mathematical problems.
This survey endeavors to address four pivotal dimensions.
It provides a holistic perspective on the current state, accomplishments, and future challenges in this rapidly evolving field.
arXiv Detail & Related papers (2024-01-31T20:26:32Z)
- Adapting Large Language Models for Education: Foundational Capabilities, Potentials, and Challenges [60.62904929065257]
Large language models (LLMs) offer a possibility for resolving this issue by comprehending individual requests.
This paper reviews the recently emerged LLM research related to educational capabilities, including mathematics, writing, programming, reasoning, and knowledge-based question answering.
arXiv Detail & Related papers (2023-12-27T14:37:32Z)
- MacGyver: Are Large Language Models Creative Problem Solvers? [87.70522322728581]
We explore the creative problem-solving capabilities of modern LLMs in a novel constrained setting.
We create MACGYVER, an automatically generated dataset consisting of over 1,600 real-world problems.
We present our collection to both LLMs and humans to compare and contrast their problem-solving abilities.
arXiv Detail & Related papers (2023-11-16T08:52:27Z)
- SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models [70.5763210869525]
We introduce SciBench, an expansive benchmark suite for Large Language Models (LLMs).
SciBench contains a dataset featuring a range of collegiate-level scientific problems from mathematics, chemistry, and physics domains.
The results reveal that the current LLMs fall short of delivering satisfactory performance, with the best overall score of merely 43.22%.
arXiv Detail & Related papers (2023-07-20T07:01:57Z)
- Understanding the Usability Challenges of Machine Learning In High-Stakes Decision Making [67.72855777115772]
Machine learning (ML) is being applied to a diverse and ever-growing set of domains.
In many cases, domain experts -- who often have no expertise in ML or data science -- are asked to use ML predictions to make high-stakes decisions.
We investigate the ML usability challenges present in the domain of child welfare screening through a series of collaborations with child welfare screeners.
arXiv Detail & Related papers (2021-03-02T22:50:45Z)