Taxation Perspectives from Large Language Models: A Case Study on Additional Tax Penalties
- URL: http://arxiv.org/abs/2503.03444v1
- Date: Wed, 05 Mar 2025 12:24:20 GMT
- Title: Taxation Perspectives from Large Language Models: A Case Study on Additional Tax Penalties
- Authors: Eunkyung Choi, Young Jin Suh, Hun Park, Wonseok Hwang
- Abstract summary: We introduce PLAT, a new benchmark designed to assess the ability of LLMs to predict the legitimacy of additional tax penalties. Our experiments with six LLMs reveal that their baseline capabilities are limited, especially when dealing with conflicting issues that demand a comprehensive understanding.
- Score: 5.185522256407782
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: How capable are large language models (LLMs) in the domain of taxation? Although numerous studies have explored the legal domain in general, research dedicated to taxation remains scarce. Moreover, the datasets used in these studies are either simplified, failing to reflect real-world complexities, or unavailable as open source. To address this gap, we introduce PLAT, a new benchmark designed to assess the ability of LLMs to predict the legitimacy of additional tax penalties. PLAT is constructed to evaluate LLMs' understanding of tax law, particularly in cases where resolving the issue requires more than just applying related statutes. Our experiments with six LLMs reveal that their baseline capabilities are limited, especially when dealing with conflicting issues that demand a comprehensive understanding. However, we found that this limitation can be mitigated by enabling retrieval, self-reasoning, and discussion among multiple agents with specific role assignments.
Related papers
- J&H: Evaluating the Robustness of Large Language Models Under Knowledge-Injection Attacks in Legal Domain [12.550611136062722]
We propose a method of legal knowledge injection attacks for robustness testing.
The aim of the framework is to explore whether LLMs perform deductive reasoning when accomplishing legal tasks.
We have collected mistakes that legal experts might make in judicial decisions in the real world.
arXiv Detail & Related papers (2025-03-24T05:42:05Z) - Can Large Language Models Grasp Legal Theories? Enhance Legal Reasoning with Insights from Multi-Agent Collaboration [27.047809869136458]
Large Language Models (LLMs) could struggle to fully understand legal theories and perform legal reasoning tasks.
We introduce a challenging task (confusing charge prediction) to better evaluate LLMs' understanding of legal theories and reasoning capabilities.
We also propose a novel framework: Multi-Agent framework for improving complex Legal Reasoning capability.
arXiv Detail & Related papers (2024-10-03T14:15:00Z) - Transforming Scholarly Landscapes: Influence of Large Language Models on Academic Fields beyond Computer Science [77.31665252336157]
Large Language Models (LLMs) have ushered in a transformative era in Natural Language Processing (NLP).
This work empirically examines the influence and use of LLMs in fields beyond NLP.
arXiv Detail & Related papers (2024-09-29T01:32:35Z) - Exploring Language Model Generalization in Low-Resource Extractive QA [57.14068405860034]
We investigate Extractive Question Answering (EQA) with Large Language Models (LLMs) under domain drift. We devise a series of experiments to explain the performance gap empirically.
arXiv Detail & Related papers (2024-09-27T05:06:43Z) - Are Large Language Models a Good Replacement of Taxonomies? [25.963448807848746]
Large language models (LLMs) demonstrate an impressive ability to internalize knowledge and answer natural language questions.
We ask if the schema of knowledge graph (i.e., taxonomy) is made obsolete by LLMs.
arXiv Detail & Related papers (2024-06-17T01:21:50Z) - Rethinking Interpretability in the Era of Large Language Models [76.1947554386879]
Large language models (LLMs) have demonstrated remarkable capabilities across a wide array of tasks.
The capability to explain in natural language allows LLMs to expand the scale and complexity of patterns that can be given to a human.
These new capabilities raise new challenges, such as hallucinated explanations and immense computational costs.
arXiv Detail & Related papers (2024-01-30T17:38:54Z) - BLT: Can Large Language Models Handle Basic Legal Text? [44.89873147675516]
GPT-4 and Claude perform poorly on basic legal text handling.
Poor performance on the benchmark casts doubt on their reliability as-is for legal practice.
Fine-tuning on the training set brings even a small model to near-perfect performance.
arXiv Detail & Related papers (2023-11-16T09:09:22Z) - A Comprehensive Evaluation of Large Language Models on Legal Judgment Prediction [60.70089334782383]
Large language models (LLMs) have demonstrated great potential for domain-specific applications.
Recent disputes over GPT-4's law evaluation raise questions concerning their performance in real-world legal tasks.
We design practical baseline solutions based on LLMs and test on the task of legal judgment prediction.
arXiv Detail & Related papers (2023-10-18T07:38:04Z) - Precedent-Enhanced Legal Judgment Prediction with LLM and Domain-Model Collaboration [52.57055162778548]
Legal Judgment Prediction (LJP) has become an increasingly crucial task in Legal AI.
Precedents are the previous legal cases with similar facts, which are the basis for the judgment of the subsequent case in national legal systems.
Recent advances in deep learning have enabled a variety of techniques to be used to solve the LJP task.
arXiv Detail & Related papers (2023-10-13T16:47:20Z) - Large Language Models as Tax Attorneys: A Case Study in Legal Capabilities Emergence [5.07013500385659]
This paper explores Large Language Models' (LLMs) capabilities in applying tax law.
Our experiments demonstrate emerging legal understanding capabilities, with improved performance in each subsequent OpenAI model release.
Findings indicate that LLMs, particularly when combined with prompting enhancements and the correct legal texts, can perform at high levels of accuracy but not yet at expert tax lawyer levels.
arXiv Detail & Related papers (2023-06-12T12:40:48Z) - Sentiment Analysis in the Era of Large Language Models: A Reality Check [69.97942065617664]
This paper investigates the capabilities of large language models (LLMs) in performing various sentiment analysis tasks.
We evaluate performance across 13 tasks on 26 datasets and compare the results against small language models (SLMs) trained on domain-specific datasets.
arXiv Detail & Related papers (2023-05-24T10:45:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.