Related papers: A Comprehensive Study on Quality Assurance Tools for Java

A Comprehensive Study on Quality Assurance Tools for Java

URL: http://arxiv.org/abs/2305.16812v2
Date: Wed, 7 Jun 2023 11:45:03 GMT
Title: A Comprehensive Study on Quality Assurance Tools for Java
Authors: Han Liu, Sen Chen, Ruitao Feng, Chengwei Liu, Kaixuan Li, Zhengzi Xu, Liming Nie, Yang Liu, Yixiang Chen
Abstract summary: Quality assurance (QA) tools are receiving more and more attention and are widely used by developers. Most existing research is limited in the following ways:. They compare tools without considering scanning rules analysis. They disagree on the effectiveness of tools due to the study methodology and benchmark dataset. There is no large-scale study on the analysis of time performance.
Score: 15.255117038871337
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Quality assurance (QA) tools are receiving more and more attention and are widely used by developers. Given the wide range of solutions for QA technology, it is still a question of evaluating QA tools. Most existing research is limited in the following ways: (i) They compare tools without considering scanning rules analysis. (ii) They disagree on the effectiveness of tools due to the study methodology and benchmark dataset. (iii) They do not separately analyze the role of the warnings. (iv) There is no large-scale study on the analysis of time performance. To address these problems, in the paper, we systematically select 6 free or open-source tools for a comprehensive study from a list of 148 existing Java QA tools. To carry out a comprehensive study and evaluate tools in multi-level dimensions, we first mapped the scanning rules to the CWE and analyze the coverage and granularity of the scanning rules. Then we conducted an experiment on 5 benchmarks, including 1,425 bugs, to investigate the effectiveness of these tools. Furthermore, we took substantial effort to investigate the effectiveness of warnings by comparing the real labeled bugs with the warnings and investigating their role in bug detection. Finally, we assessed these tools' time performance on 1,049 projects. The useful findings based on our comprehensive study can help developers improve their tools and provide users with suggestions for selecting QA tools.

Related papers

Do AI models help produce verified bug fixes? [62.985237003585674]
Large Language Models are used to produce corrections to software bugs.<n>This paper investigates how programmers use Large Language Models to complement their own skills.<n>The results are a first step towards a proper role for AI and LLMs in providing guaranteed-correct fixes to program bugs.
arXiv Detail & Related papers (2025-07-21T17:30:16Z)
CRITICTOOL: Evaluating Self-Critique Capabilities of Large Language Models in Tool-Calling Error Scenarios [30.20881816731553]
The ability of large language models to utilize external tools has enabled them to tackle an increasingly diverse range of tasks.<n>As the tasks become more complex and long-horizon, the intricate tool utilization process may trigger various unexpected errors.<n>How to effectively handle such errors, including identifying, diagnosing, and recovering from them, has emerged as a key research direction for advancing tool learning.
arXiv Detail & Related papers (2025-06-11T17:59:18Z)
Acting Less is Reasoning More! Teaching Model to Act Efficiently [87.28134636548705]
Tool-integrated reasoning augments large language models with the ability to invoke external tools to solve tasks.<n>Current approaches typically optimize only for final correctness without considering the efficiency or necessity of external tool use.<n>We propose a framework that encourages models to produce accurate answers with minimal tool calls.<n>Our approach reduces tool calls by up to 68.3% and improves tool productivity by up to 215.4%, while maintaining comparable answer accuracy.
arXiv Detail & Related papers (2025-04-21T05:40:05Z)
Vexed by VEX tools: Consistency evaluation of container vulnerability scanners [0.0]
This paper presents a study that analyzed state-of-the-art vulnerability scanning tools applied to containers. We have focused the work on tools following the Vulnerability Exploitability eXchange (VEX) format.
arXiv Detail & Related papers (2025-03-18T16:22:43Z)
Adaptive Tool Use in Large Language Models with Meta-Cognition Trigger [49.81945268343162]
We propose MeCo, an adaptive decision-making strategy for external tool use. MeCo captures high-level cognitive signals in the representation space, guiding when to invoke tools. Our experiments show that MeCo accurately detects LLMs' internal cognitive signals and significantly improves tool-use decision-making.
arXiv Detail & Related papers (2025-02-18T15:45:01Z)
Chatbots im Schulunterricht: Wir testen das Fobizz-Tool zur automatischen Bewertung von Hausaufgaben [0.0]
This study examines the AI-powered grading tool "AI Grading Assistant" by the German company Fobizz. The tool's numerical grades and qualitative feedback are often random and do not improve even when its suggestions are incorporated. The study critiques the broader trend of adopting AI as a quick fix for systemic problems in education.
arXiv Detail & Related papers (2024-12-09T16:50:02Z)
A Comprehensive Study on Static Application Security Testing (SAST) Tools for Android [22.558610938860124]
VulsTotal is a unified evaluation platform for defining and describing tools' supported vulnerability types. We select 11 free and open-sourced SAST tools from a pool of 97 existing options, adhering to clearly defined criteria. We then unify 67 general/common vulnerability types for Android SAST tools.
arXiv Detail & Related papers (2024-10-28T05:10:22Z)
Query Routing for Homogeneous Tools: An Instantiation in the RAG Scenario [62.615210194004106]
Current research on tool learning primarily focuses on selecting the most effective tool from a wide array of options, often overlooking cost-effectiveness. In this paper, we address the selection of homogeneous tools by predicting both their performance and the associated cost required to accomplish a given task.
arXiv Detail & Related papers (2024-06-18T09:24:09Z)
Tool Learning with Large Language Models: A Survey [60.733557487886635]
Tool learning with large language models (LLMs) has emerged as a promising paradigm for augmenting the capabilities of LLMs to tackle highly complex problems. Despite growing attention and rapid advancements in this field, the existing literature remains fragmented and lacks systematic organization.
arXiv Detail & Related papers (2024-05-28T08:01:26Z)
Towards Completeness-Oriented Tool Retrieval for Large Language Models [60.733557487886635]
Real-world systems often incorporate a wide array of tools, making it impractical to input all tools into Large Language Models. Existing tool retrieval methods primarily focus on semantic matching between user queries and tool descriptions. We propose a novel modelagnostic COllaborative Learning-based Tool Retrieval approach, COLT, which captures not only the semantic similarities between user queries and tool descriptions but also takes into account the collaborative information of tools.
arXiv Detail & Related papers (2024-05-25T06:41:23Z)
Efficacy of static analysis tools for software defect detection on open-source projects [0.0]
The study used popular analysis tools such as SonarQube, PMD, Checkstyle, and FindBugs to perform the comparison. The study results show that SonarQube performs considerably well than all other tools in terms of its defect detection.
arXiv Detail & Related papers (2024-05-20T19:05:32Z)
Static Application Security Testing (SAST) Tools for Smart Contracts: How Far Are We? [14.974832502863526]
In recent years, the importance of smart contract security has been heightened by the increasing number of attacks against them. To address this issue, a multitude of static application security testing (SAST) tools have been proposed for detecting vulnerabilities in smart contracts. In this paper, we propose an up-to-date and fine-grained taxonomy that includes 45 unique vulnerability types for smart contracts.
arXiv Detail & Related papers (2024-04-28T13:40:18Z)
What Are Tools Anyway? A Survey from the Language Model Perspective [67.18843218893416]
Language models (LMs) are powerful yet mostly for text generation tasks. We provide a unified definition of tools as external programs used by LMs. We empirically study the efficiency of various tooling methods.
arXiv Detail & Related papers (2024-03-18T17:20:07Z)
TOOLVERIFIER: Generalization to New Tools via Self-Verification [69.85190990517184]
We introduce a self-verification method which distinguishes between close candidates by self-asking contrastive questions during tool selection. Experiments on 4 tasks from the ToolBench benchmark, consisting of 17 unseen tools, demonstrate an average improvement of 22% over few-shot baselines.
arXiv Detail & Related papers (2024-02-21T22:41:38Z)
ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios [48.38419686697733]
We propose ToolEyes, a fine-grained system tailored for the evaluation of large language models' tool learning capabilities in authentic scenarios. The system meticulously examines seven real-world scenarios, analyzing five dimensions crucial to LLMs in tool learning. ToolEyes incorporates a tool library boasting approximately 600 tools, serving as an intermediary between LLMs and the physical world.
arXiv Detail & Related papers (2024-01-01T12:49:36Z)
Automated Grading and Feedback Tools for Programming Education: A Systematic Review [7.776434991976473]
Most papers assess the correctness of assignments in object-oriented languages. Few tools assess the maintainability, readability or documentation of the source code. Most tools offered fully automated assessment to allow for near-instantaneous feedback.
arXiv Detail & Related papers (2023-06-20T17:54:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.