The Impact of AI Tool on Engineering at ANZ Bank: An Empirical Study on GitHub Copilot within Corporate Environment
- URL: http://arxiv.org/abs/2402.05636v2
- Date: Wed, 17 Apr 2024 12:14:40 GMT
- Title: The Impact of AI Tool on Engineering at ANZ Bank: An Empirical Study on GitHub Copilot within Corporate Environment
- Authors: Sayan Chatterjee, Ching Louis Liu, Gareth Rowland, Tim Hogarth
- Abstract summary: This study explores the integration of AI tools in software engineering practices within a large organization.
We focus on ANZ Bank, which employs over 5000 engineers covering all aspects of the software development life cycle.
This paper details an experiment conducted using GitHub Copilot, a notable AI tool, within a controlled environment to evaluate its effectiveness in real-world engineering tasks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The increasing popularity of AI, particularly Large Language Models (LLMs), has significantly impacted various domains, including Software Engineering. This study explores the integration of AI tools in software engineering practices within a large organization. We focus on ANZ Bank, which employs over 5000 engineers covering all aspects of the software development life cycle. This paper details an experiment conducted using GitHub Copilot, a notable AI tool, within a controlled environment to evaluate its effectiveness in real-world engineering tasks. Additionally, this paper shares initial findings on the productivity improvements observed after GitHub Copilot was adopted on a large scale, with about 1000 engineers using it. ANZ Bank's six-week experiment with GitHub Copilot included two weeks of preparation and four weeks of active testing. The study evaluated participant sentiment and the tool's impact on productivity, code quality, and security. Initially, participants used GitHub Copilot for proposed use-cases, with their feedback gathered through regular surveys. In the second phase, they were divided into Control and Copilot groups, each tackling the same Python challenges, and their experiences were again surveyed. Results showed a notable boost in productivity and code quality with GitHub Copilot, though its impact on code security remained inconclusive. Participant responses were overall positive, confirming GitHub Copilot's effectiveness in large-scale software engineering environments. Early data from 1000 engineers also indicated a significant increase in productivity and job satisfaction.
Related papers
- SWE-PolyBench: A multi-language benchmark for repository level evaluation of coding agents [49.73885480071402]
We introduce SWE-PolyBench, a new benchmark for repository-level, execution-based evaluation of coding agents.
SWE-PolyBench contains 2110 instances from 21 repositories and includes tasks in Java (165), JavaScript (1017), TypeScript (729) and Python (199), covering bug fixes, feature additions, and code refactoring.
Our experiments show that current agents exhibit uneven performance across languages and struggle with complex problems while showing higher performance on simpler tasks.
arXiv Detail & Related papers (2025-04-11T17:08:02Z) - Experience with GitHub Copilot for Developer Productivity at Zoominfo [1.631115063641726]
We evaluate GitHub Copilot's deployment and impact on developer productivity at Zoominfo.
We show an average acceptance rate of 33% for suggestions and 20% for lines of code, with high developer satisfaction scores of 72%.
Our findings contribute to the growing body of knowledge about AI-assisted software development in enterprise settings.
arXiv Detail & Related papers (2025-01-23T00:17:48Z) - OpenHands: An Open Platform for AI Software Developers as Generalist Agents [109.8507367518992]
We introduce OpenHands, a platform for the development of AI agents that interact with the world in similar ways to a human developer.
We describe how the platform allows for the implementation of new agents, safe interaction with sandboxed environments for code execution, and incorporation of evaluation benchmarks.
arXiv Detail & Related papers (2024-07-23T17:50:43Z) - Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? [73.81908518992161]
We introduce Spider2-V, the first multimodal agent benchmark focusing on professional data science and engineering.
Spider2-V features real-world tasks in authentic computer environments and incorporates 20 enterprise-level professional applications.
These tasks evaluate the ability of a multimodal agent to perform data-related tasks by writing code and managing the GUI in enterprise data software systems.
arXiv Detail & Related papers (2024-07-15T17:54:37Z) - Transforming Software Development: Evaluating the Efficiency and Challenges of GitHub Copilot in Real-World Projects [0.0]
GitHub Copilot is an AI-powered coding assistant.
This study evaluates the efficiency gains, areas for improvement, and emerging challenges of using GitHub Copilot.
arXiv Detail & Related papers (2024-06-25T19:51:21Z) - Impact of the Availability of ChatGPT on Software Development: A Synthetic Difference in Differences Estimation using GitHub Data [49.1574468325115]
ChatGPT is an AI tool that enhances software production efficiency.
We estimate ChatGPT's effects on the number of git pushes, repositories, and unique developers per 100,000 people.
These results suggest that AI tools like ChatGPT can substantially boost developer productivity, though further analysis is needed to address potential downsides such as low quality code and privacy concerns.
arXiv Detail & Related papers (2024-06-16T19:11:15Z) - Exploring the Effect of Multiple Natural Languages on Code Suggestion Using GitHub Copilot [46.822148186169144]
GitHub Copilot is an AI-enabled tool that automates program synthesis.
Recent studies have extensively examined Copilot's capabilities in various programming tasks.
However, little is known about the effect of different natural languages on code suggestion.
arXiv Detail & Related papers (2024-02-02T14:30:02Z) - DevEval: Evaluating Code Generation in Practical Software Projects [52.16841274646796]
We propose a new benchmark named DevEval, aligned with Developers' experiences in practical projects.
DevEval is collected through a rigorous pipeline, containing 2,690 samples from 119 practical projects.
We assess five popular LLMs on DevEval and reveal their actual abilities in code generation.
arXiv Detail & Related papers (2024-01-12T06:51:30Z) - Exploring the Problems, their Causes and Solutions of AI Pair Programming: A Study on GitHub and Stack Overflow [6.724815667295355]
GitHub Copilot, the AI pair programmer, utilizes machine learning models trained on a large corpus of code snippets to generate code suggestions.
Despite its popularity in software development, there is limited empirical evidence on the actual experiences of practitioners who work with Copilot.
We collected data from 473 GitHub issues, 706 GitHub discussions, and 142 Stack Overflow posts.
arXiv Detail & Related papers (2023-11-02T06:24:38Z) - Measuring the Runtime Performance of Code Produced with GitHub Copilot [1.6021036144262577]
We evaluate the runtime performance of code produced when developers use GitHub Copilot versus when they do not.
Our results suggest that using Copilot may produce code with a significantly slower runtime performance.
arXiv Detail & Related papers (2023-05-10T20:14:52Z) - The GitHub Development Workflow Automation Ecosystems [47.818229204130596]
Large-scale software development has become a highly collaborative endeavour.
This chapter explores the ecosystems of development bots and GitHub Actions.
It provides an extensive survey of the state-of-the-art in this domain.
arXiv Detail & Related papers (2023-05-08T15:24:23Z) - Study of software developers' experience using the Github Copilot Tool in the software development process [0.0]
GitHub Copilot was announced on 29 June 2021.
It uses a trained model to generate code based on human-understandable language.
This research investigates developers' approach to this tool.
arXiv Detail & Related papers (2023-01-12T13:12:54Z) - An Empirical Cybersecurity Evaluation of GitHub Copilot's Code Contributions [8.285068188878578]
GitHub Copilot is a language model trained over open-source GitHub code.
Code often contains bugs, so it is certain that the language model will have learned from exploitable, buggy code.
This raises concerns on the security of Copilot's code contributions.
arXiv Detail & Related papers (2021-08-20T17:30:33Z)