It Takes Two to Tango: Combining Visual and Textual Information for
Detecting Duplicate Video-Based Bug Reports
- URL: http://arxiv.org/abs/2101.09194v2
- Date: Fri, 5 Feb 2021 16:55:22 GMT
- Authors: Nathan Cooper, Carlos Bernal-Cárdenas, Oscar Chaparro, Kevin Moran,
Denys Poshyvanyk
- Abstract summary: This paper presents Tango, a duplicate detection technique that operates purely on video-based bug reports.
We evaluate multiple configurations of Tango in a comprehensive empirical evaluation on 4,860 duplicate detection tasks.
On average, Tango can reduce developer effort by over 60%, illustrating its practicality.
- Score: 19.289285682720177
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When a bug manifests in a user-facing application, it is likely to be exposed
through the graphical user interface (GUI). Given the importance of visual
information to the process of identifying and understanding such bugs, users
are increasingly making use of screenshots and screen-recordings as a means to
report issues to developers. However, when such information is reported en
masse, such as during crowd-sourced testing, managing these artifacts can be a
time-consuming process. As the reporting of screen-recordings in particular
becomes more popular, developers are likely to face challenges related to
manually identifying videos that depict duplicate bugs. Due to their graphical
nature, screen-recordings present challenges for automated analysis that
preclude the use of current duplicate bug report detection techniques. To
overcome these challenges and aid developers in this task, this paper presents
Tango, a duplicate detection technique that operates purely on video-based bug
reports by leveraging both visual and textual information. Tango combines
tailored computer vision techniques, optical character recognition, and text
retrieval. We evaluated multiple configurations of Tango in a comprehensive
empirical evaluation on 4,860 duplicate detection tasks that involved a total
of 180 screen-recordings from six Android apps. Additionally, we conducted a
user study investigating the effort required for developers to manually detect
duplicate video-based bug reports and compared this to the effort required to
use Tango. The results reveal that Tango's optimal configuration is highly
effective at detecting duplicate video-based bug reports, accurately ranking
target duplicate videos in the top-2 returned results in 83% of the tasks.
Additionally, our user study shows that, on average, Tango can reduce developer
effort by over 60%, illustrating its practicality.
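To make the combination of signals concrete, here is a minimal sketch of how a Tango-style ranker might fuse a visual signal with an OCR-derived textual one. It is an illustration under stated assumptions, not the authors' implementation: average color histograms stand in for the paper's tailored visual features, TF-IDF stands in for its text-retrieval component, OCR is assumed to have run upstream, and the equal weighting is arbitrary.

```python
# Minimal sketch of a Tango-style duplicate ranker (illustrative only).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def visual_signature(frames):
    """Mean 8x8x8 color histogram over a video's frames (HxWx3 uint8 arrays)."""
    hists = []
    for frame in frames:
        h, _ = np.histogramdd(frame.reshape(-1, 3).astype(float),
                              bins=(8, 8, 8), range=((0, 256),) * 3)
        hists.append(h.ravel() / max(h.sum(), 1.0))
    return np.mean(hists, axis=0)

def rank_duplicates(query, corpus, w_visual=0.5):
    """Rank corpus videos by weighted visual+textual similarity to the query.

    query and each corpus item are dicts with:
      'frames'   - list of HxWx3 numpy arrays sampled from the recording
      'ocr_text' - text recognized on those frames (OCR assumed done upstream)
    """
    # Textual signal: TF-IDF cosine similarity over the OCR'd screen text.
    texts = [query["ocr_text"]] + [v["ocr_text"] for v in corpus]
    tfidf = TfidfVectorizer().fit_transform(texts)
    text_sims = cosine_similarity(tfidf[0], tfidf[1:]).ravel()

    # Visual signal: cosine similarity between per-video histogram signatures.
    q_sig = visual_signature(query["frames"]).reshape(1, -1)
    vis_sims = np.array([
        cosine_similarity(q_sig,
                          visual_signature(v["frames"]).reshape(1, -1))[0, 0]
        for v in corpus
    ])

    # Linear combination of the two signals; the weighting is an assumption.
    combined = w_visual * vis_sims + (1.0 - w_visual) * text_sims
    return np.argsort(-combined)  # corpus indices, most likely duplicate first
```

The only point of the sketch is the two-signal fusion the title alludes to; the paper's actual visual and textual components are considerably more tailored.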
Related papers
- Semantic GUI Scene Learning and Video Alignment for Detecting Duplicate Video-based Bug Reports [16.45808969240553]
Video-based bug reports are increasingly being used to document bugs for programs centered around a graphical user interface (GUI).
We introduce a new approach, called JANUS, that adapts the scene-learning capabilities of vision transformers to capture subtle visual and textual patterns that manifest on app UI screens.
JANUS also makes use of a video alignment technique capable of adaptively weighting video frames to account for typical bug manifestation patterns.
arXiv Detail & Related papers (2024-07-11T15:48:36Z)
- VDebugger: Harnessing Execution Feedback for Debugging Visual Programs [103.61860743476933]
We introduce VDebugger, a critic-refiner framework trained to localize and debug visual programs by tracking execution step by step.
VDebugger identifies and corrects program errors by leveraging detailed execution feedback, improving interpretability and accuracy.
Evaluations on six datasets demonstrate V Debugger's effectiveness, showing performance improvements of up to 3.2% in downstream task accuracy.
arXiv Detail & Related papers (2024-06-19T11:09:16Z)
- VANE-Bench: Video Anomaly Evaluation Benchmark for Conversational LMMs [64.60035916955837]
VANE-Bench is a benchmark designed to assess the proficiency of Video-LMMs in detecting anomalies and inconsistencies in videos.
Our dataset comprises an array of videos synthetically generated using existing state-of-the-art text-to-video generation models.
We evaluate nine existing Video-LMMs, both open- and closed-source, on this benchmark and find that most models struggle to effectively identify the subtle anomalies.
arXiv Detail & Related papers (2024-06-14T17:59:01Z)
- Toward Rapid Bug Resolution for Android Apps [0.4759142872591625]
This paper describes the existing limitations of bug reports and identifies potential strategies for addressing them.
Our vision is a future in which alleviating these limitations and successfully pursuing the proposed research directions benefits both reporters and developers.
arXiv Detail & Related papers (2023-12-23T18:29:06Z)
- Uncovering Hidden Connections: Iterative Search and Reasoning for Video-grounded Dialog [83.63849872250651]
Video-grounded dialog requires profound understanding of both dialog history and video content for accurate response generation.
We present an iterative search and reasoning framework, which consists of a textual encoder, a visual encoder, and a generator.
arXiv Detail & Related papers (2023-10-11T07:37:13Z)
- FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios [87.12753459582116]
A wider range of tasks now faces an increasing risk of factual errors when handled by generative models.
We propose FacTool, a task- and domain-agnostic framework for detecting factual errors in texts generated by large language models.
arXiv Detail & Related papers (2023-07-25T14:20:51Z)
- Auto-labelling of Bug Report using Natural Language Processing [0.0]
Rule- and query-based solutions recommend long lists of potentially similar bug reports with no clear ranking.
In this paper, we propose a solution using a combination of NLP techniques.
It uses a custom data transformer, a deep neural network, and a non-generalizing machine learning method to retrieve existing identical bug reports.
arXiv Detail & Related papers (2022-12-13T02:32:42Z)
- ASCNet: Self-supervised Video Representation Learning with Appearance-Speed Consistency [62.38914747727636]
We study self-supervised video representation learning, which is a challenging task due to 1) a lack of labels for explicit supervision and 2) unstructured and noisy visual information.
Existing methods mainly use contrastive loss with video clips as the instances and learn visual representation by discriminating instances from each other.
In this paper, we observe that consistency between positive samples is the key to learning robust video representations (a minimal contrastive-loss sketch appears after this list).
arXiv Detail & Related papers (2021-06-04T08:44:50Z)
- Video Exploration via Video-Specific Autoencoders [60.256055890647595]
We present video-specific autoencoders that enable human-controllable video exploration.
We observe that a simple autoencoder trained on multiple frames of a specific video enables one to perform a large variety of video processing and editing tasks.
arXiv Detail & Related papers (2021-03-31T17:56:13Z)
- Translating Video Recordings of Mobile App Usages into Replayable Scenarios [24.992877070869177]
V2S is a lightweight, automated approach for translating video recordings of Android app usages into replayable scenarios.
We performed an extensive evaluation of V2S involving 175 videos depicting 3,534 GUI-based actions collected from users exercising features and reproducing bugs from over 80 popular Android apps.
arXiv Detail & Related papers (2020-05-18T20:11:36Z)
- Advaita: Bug Duplicity Detection System [1.9624064951902522]
The duplicate bug rate (the percentage of reported bugs that are duplicates) ranges from single digits (1 to 9%) to double digits (up to 40%), depending on product maturity, codebase size, and the number of engineers working on the project.
Duplicate detection deals with identifying whether two bug reports convey the same meaning.
This approach considers multiple sets of features, namely basic textual statistics, semantic features, and contextual features (a minimal retrieval sketch in this spirit appears after this list).
arXiv Detail & Related papers (2020-01-24T04:48:39Z)
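For the self-supervised entry above (ASCNet), the following sketch shows the generic instance-discrimination contrastive loss (InfoNCE) that such clip-based methods build on. ASCNet's specific appearance- and speed-consistency terms are not reproduced here, and all names are illustrative.

```python
# Minimal sketch of the InfoNCE contrastive loss used for instance
# discrimination over video clips (illustrative; not ASCNet's exact loss).
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE over a batch: each anchor's positive is the matching row;
    all other positives in the batch act as negatives.

    anchors, positives: (batch, dim) arrays of clip embeddings.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # (batch, batch) similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))          # diagonal = matched pairs

# Toy usage: 4 clips embedded in 8 dimensions.
rng = np.random.default_rng(0)
z1, z2 = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
print(info_nce_loss(z1, z2))
```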
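Several of the text-based entries above (Advaita and the auto-labelling work, as well as Tango's own textual component) rest on ranking reports by textual similarity. The sketch below is a minimal TF-IDF baseline in that spirit; the function name and threshold are illustrative assumptions, not any paper's actual implementation.

```python
# Hypothetical text-retrieval baseline for duplicate bug report detection.
# This is a sketch, not the implementation of Advaita or any paper above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def find_duplicates(new_report, existing_reports, threshold=0.3):
    """Return (index, similarity) pairs for likely duplicates, best first.

    new_report: title + description of the incoming report (string).
    existing_reports: list of strings for previously filed reports.
    threshold: assumed cut-off below which a report is not flagged.
    """
    tfidf = TfidfVectorizer(stop_words="english")
    matrix = tfidf.fit_transform([new_report] + existing_reports)
    sims = cosine_similarity(matrix[0], matrix[1:]).ravel()
    ranked = sorted(enumerate(sims), key=lambda pair: -pair[1])
    return [(i, s) for i, s in ranked if s >= threshold]

# Toy usage:
reports = [
    "App crashes when tapping the login button",
    "Crash on the login screen after pressing sign-in",
    "Dark mode colors are wrong on the settings page",
]
print(find_duplicates("Pressing the login button causes a crash", reports))
```

Approaches like Advaita layer semantic and contextual features on top of this kind of lexical baseline; its inadequacy for purely visual recordings is exactly why Tango adds a visual signal.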