Solving Probability and Statistics Problems by Program Synthesis
- URL: http://arxiv.org/abs/2111.08267v1
- Date: Tue, 16 Nov 2021 07:34:25 GMT
- Title: Solving Probability and Statistics Problems by Program Synthesis
- Authors: Leonard Tang, Elizabeth Ke, Nikhil Singh, Nakul Verma, and Iddo Drori
- Abstract summary: We solve university-level probability and statistics questions by program synthesis using OpenAI's Codex.
Our work is the first to introduce a new dataset of university-level probability and statistics problems.
- Score: 1.0937094979510211
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We solve university-level probability and statistics questions by program
synthesis using OpenAI's Codex, a Transformer trained on text and fine-tuned on
code. We transform course problems from MIT's 18.05 Introduction to Probability
and Statistics and Harvard's STAT110 Probability into programming tasks. We
then execute the generated code to get a solution. Since these course questions
are grounded in probability, we often aim to have Codex generate probabilistic
programs that simulate a large number of probabilistic dependencies to compute
the solution. Our approach requires prompt engineering to transform the
question from its original form to an explicit, tractable form that results in
a correct program and solution. To estimate the amount of work needed to
translate an original question into its tractable form, we measure the
similarity between original and transformed questions. Our work is the first to
introduce a new dataset of university-level probability and statistics problems
and solve these problems in a scalable fashion using the program synthesis
capabilities of large language models.
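
As a rough illustration of the kind of simulation program the abstract describes, the sketch below estimates a probability by running many random trials and counting how often the event occurs. The question ("What is the probability that two fair dice sum to 7?"), the function names `estimate_probability` and `two_dice_sum_to_seven`, the trial count, and the seed are all hypothetical choices made for this example; they are not taken from the paper's dataset or from Codex output.

```python
import random


def estimate_probability(trial, num_trials=100_000, seed=0):
    """Estimate P(event) as the fraction of simulated trials where it occurs.

    `trial` is a function that takes a random.Random instance and returns
    True when the event of interest happened in that simulated trial.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(1 for _ in range(num_trials) if trial(rng))
    return hits / num_trials


def two_dice_sum_to_seven(rng):
    """Hypothetical question: do two fair six-sided dice sum to 7?"""
    return rng.randint(1, 6) + rng.randint(1, 6) == 7


if __name__ == "__main__":
    estimate = estimate_probability(two_dice_sum_to_seven)
    exact = 6 / 36  # six of the 36 equally likely outcomes sum to 7
    print(f"simulated: {estimate:.4f}, exact: {exact:.4f}")
```

Comparing the simulated estimate with the closed-form value is one simple way to check that a generated program answers the question that was actually asked.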
Related papers
- Bayesian Quantum State Tomography with Python's PyMC [0.0]
We show how to use Python-3's open source PyMC probabilistic programming package to transform an otherwise complicated QST optimization problem into a simple form.
arXiv Detail & Related papers (2022-12-20T21:16:28Z)
- JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding [74.12405417718054]
This paper aims to advance the mathematical intelligence of machines by presenting the first Chinese mathematical pre-trained language model (PLM).
Unlike other standard NLP tasks, mathematical texts are difficult to understand, since they involve mathematical terminology, symbols and formulas in the problem statement.
We design a novel curriculum pre-training approach for improving the learning of mathematical PLMs, consisting of both basic and advanced courses.
arXiv Detail & Related papers (2022-06-13T17:03:52Z)
- An Application of a Multivariate Estimation of Distribution Algorithm to Cancer Chemotherapy [59.40521061783166]
Chemotherapy treatment for cancer is a complex optimisation problem with a large number of interacting variables and constraints.
Although one might expect the more sophisticated algorithm to yield better performance on a complex problem like this, we show that it is outperformed by algorithms using a simpler univariate model.
We hypothesise that this is caused by the more sophisticated algorithm being impeded by the large number of interactions in the problem.
arXiv Detail & Related papers (2022-05-17T15:28:46Z)
- A Conversational Paradigm for Program Synthesis [110.94409515865867]
We propose a conversational program synthesis approach via large language models.
We train a family of large language models, called CodeGen, on natural language and programming language data.
Our findings show the emergence of conversational capabilities and the effectiveness of the proposed conversational program synthesis paradigm.
arXiv Detail & Related papers (2022-03-25T06:55:15Z)
- A Neural Network Solves and Generates Mathematics Problems by Program Synthesis: Calculus, Differential Equations, Linear Algebra, and More [8.437319139670116]
We turn questions into programming tasks, automatically generate programs, and then execute them.
This is the first work to automatically solve, grade, and generate university-level Mathematics course questions at scale.
arXiv Detail & Related papers (2021-12-31T18:57:31Z)
- ProbNum: Probabilistic Numerics in Python [62.52335490524408]
Probabilistic numerical methods (PNMs) solve numerical problems via probabilistic inference.
We present ProbNum: a Python library providing state-of-the-art PNMs.
arXiv Detail & Related papers (2021-12-03T07:20:50Z)
- Solving Linear Algebra by Program Synthesis [1.0660480034605238]
We solve MIT's Linear Algebra 18.06 course and Columbia University's Computational Linear Algebra COMS3251 courses with perfect accuracy by interactive program synthesis.
This surprisingly strong result is achieved by turning the course questions into programming tasks and then running the programs to produce the correct answers.
arXiv Detail & Related papers (2021-11-16T01:16:43Z)
- Measuring Mathematical Problem Solving With the MATH Dataset [55.4376028963537]
We introduce MATH, a dataset of 12,500 challenging competition mathematics problems.
Each problem has a full step-by-step solution which can be used to teach models to generate answer derivations and explanations.
We also contribute a large auxiliary pretraining dataset which helps teach models the fundamentals of mathematics.
arXiv Detail & Related papers (2021-03-05T18:59:39Z)
- Transforming Probabilistic Programs for Model Checking [0.0]
We apply static analysis to probabilistic programs to automate large parts of two crucial model checking methods.
Our method transforms a probabilistic program specifying a density function into an efficient forward-sampling form.
We present an implementation targeting the popular Stan probabilistic programming language.
arXiv Detail & Related papers (2020-08-21T21:06:34Z)
- Marginal likelihood computation for model selection and hypothesis testing: an extensive review [66.37504201165159]
This article provides a comprehensive study of the state-of-the-art of the topic.
We highlight limitations, benefits, connections and differences among the different techniques.
Problems and possible solutions with the use of improper priors are also described.
arXiv Detail & Related papers (2020-05-17T18:31:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.