Related papers: OpenAI Cribbed Our Tax Example, But Can GPT-4 Really Do Tax?

URL: http://arxiv.org/abs/2309.09992v2
Date: Wed, 7 Feb 2024 16:40:22 GMT
Title: OpenAI Cribbed Our Tax Example, But Can GPT-4 Really Do Tax?
Authors: Andrew Blair-Stanek, Nils Holzenberger, Benjamin Van Durme
Score: 50.46167465931653
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The authors explain where OpenAI got the tax law example in its livestream demonstration of GPT-4, why GPT-4 got the wrong answer, and how it fails to reliably calculate taxes.

Related papers

TaxCalcBench: Evaluating Frontier Models on the Tax Calculation Task [0.11999555634662631]
Calculating US personal income taxes is a task that requires building an understanding of vast amounts of English text. We propose TaxCalcBench, a benchmark for determining models' abilities to calculate personal income tax returns.
arXiv Detail & Related papers (2025-07-22T00:37:59Z)
Is GPT-4 conscious? [3.463052279279264]
GPT-4 is often heralded as a leading commercial AI offering. But does it possess consciousness? This paper investigates this key question using the nine qualitative measurements of the Building Blocks theory.
arXiv Detail & Related papers (2024-06-19T05:26:55Z)
Unveiling the Safety of GPT-4o: An Empirical Study using Jailbreak Attacks [65.84623493488633]
This paper conducts a rigorous evaluation of GPT-4o against jailbreak attacks. The newly introduced audio modality opens up new attack vectors for jailbreak attacks on GPT-4o. Existing black-box multimodal jailbreak attack methods are largely ineffective against GPT-4o and GPT-4V.
arXiv Detail & Related papers (2024-06-10T14:18:56Z)
Learning Optimal Tax Design in Nonatomic Congestion Games [63.89699366726275]
We study how to learn the optimal tax design to maximize the efficiency in nonatomic congestion games. It is known that self-interested behavior among the players can harm the system's efficiency.
arXiv Detail & Related papers (2024-02-12T06:32:53Z)
On the Potential and Limitations of Few-Shot In-Context Learning to Generate Metamorphic Specifications for Tax Preparation Software [12.071874385139395]
Nearly 50% of taxpayers filed their individual income taxes using tax software in the U.S. in FY22. This paper formulates the task of generating metamorphic specifications as a translation task between properties extracted from tax documents.
arXiv Detail & Related papers (2023-11-20T18:12:28Z)
Large Language Models' Understanding of Math: Source Criticism and Extrapolation [0.0]
We evaluate the mathematical understanding of the GPT-4 model. It is hard to find scientific evidence suggesting GPT-4 has acquired an understanding of even basic mathematical concepts.
arXiv Detail & Related papers (2023-11-12T07:52:32Z)
How is ChatGPT's behavior changing over time? [72.79311931941876]
We evaluate the March 2023 and June 2023 versions of GPT-3.5 and GPT-4. We find that the performance and behavior of both GPT-3.5 and GPT-4 can vary greatly over time.
arXiv Detail & Related papers (2023-07-18T06:56:08Z)
Gpt-4: A Review on Advancements and Opportunities in Natural Language Processing [0.0]
Generative Pre-trained Transformer 4 (GPT-4) is the fourth-generation language model in the GPT series, developed by OpenAI. GPT-4 has a larger model size (more than one trillion parameters), better multilingual capabilities, improved contextual understanding, and better reasoning capabilities than GPT-3. Some of the potential applications of GPT-4 include chatbots, personal assistants, language translation, text summarization, and question-answering.
arXiv Detail & Related papers (2023-05-04T22:46:43Z)
Sparks of Artificial General Intelligence: Early experiments with GPT-4 [66.1188263570629]
GPT-4, developed by OpenAI, was trained using an unprecedented scale of compute and data. We demonstrate that GPT-4 can solve novel and difficult tasks that span mathematics, coding, vision, medicine, law, psychology and more. We believe GPT-4 could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system.
arXiv Detail & Related papers (2023-03-22T16:51:28Z)
Tax Knowledge Graph for a Smarter and More Personalized TurboTax [0.0]
We will share our innovative and practical approach to representing complicated U.S. and Canadian income tax compliance logic via a large-scale knowledge graph. We will cover how the Tax Knowledge Graph is constructed and automated, how it is used to calculate tax refunds, reasoned to find missing info, and navigated to explain the calculated results.
arXiv Detail & Related papers (2020-09-13T22:41:01Z)
The AI Economist: Improving Equality and Productivity with AI-Driven Tax Policies [119.07163415116686]
We train social planners that discover tax policies that can effectively trade off economic equality and productivity. We present an economic simulation environment that features competitive pressures and market dynamics. We show that AI-driven tax policies improve the trade-off between equality and productivity by 16% over baseline policies.
arXiv Detail & Related papers (2020-04-28T06:57:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.