It Takes Two to Tango: Combining Visual and Textual Information for
Detecting Duplicate Video-Based Bug Reports
- URL: http://arxiv.org/abs/2101.09194v2
- Date: Fri, 5 Feb 2021 16:55:22 GMT
- Authors: Nathan Cooper, Carlos Bernal-Cárdenas, Oscar Chaparro, Kevin Moran,
Denys Poshyvanyk
- Abstract summary: This paper presents Tango, a duplicate detection technique that operates purely on video-based bug reports.
We evaluate multiple configurations of Tango in a comprehensive empirical evaluation on 4,860 duplicate detection tasks.
On average, Tango can reduce developer effort by over 60%, illustrating its practicality.
- Score: 19.289285682720177
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When a bug manifests in a user-facing application, it is likely to be exposed
through the graphical user interface (GUI). Given the importance of visual
information to the process of identifying and understanding such bugs, users
are increasingly making use of screenshots and screen-recordings as a means to
report issues to developers. However, when such information is reported en
masse, such as during crowd-sourced testing, managing these artifacts can be a
time-consuming process. As the reporting of screen-recordings in particular
becomes more popular, developers are likely to face challenges related to
manually identifying videos that depict duplicate bugs. Due to their graphical
nature, screen-recordings present challenges for automated analysis that
preclude the use of current duplicate bug report detection techniques. To
overcome these challenges and aid developers in this task, this paper presents
Tango, a duplicate detection technique that operates purely on video-based bug
reports by leveraging both visual and textual information. Tango combines
tailored computer vision techniques, optical character recognition, and text
retrieval. We evaluated multiple configurations of Tango in a comprehensive
empirical evaluation on 4,860 duplicate detection tasks that involved a total
of 180 screen-recordings from six Android apps. Additionally, we conducted a
user study investigating the effort required for developers to manually detect
duplicate video-based bug reports and compared this to the effort required to
use Tango. The results reveal that Tango's optimal configuration is highly
effective at detecting duplicate video-based bug reports, accurately ranking
target duplicate videos in the top-2 returned results in 83% of the tasks.
Additionally, our user study shows that, on average, Tango can reduce developer
effort by over 60%, illustrating its practicality.
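To make the combination of signals concrete, here is a minimal sketch of how a Tango-style ranker might fuse a visual signal with an OCR-derived textual one. It is an illustration under stated assumptions, not the authors' implementation: average color histograms stand in for the paper's tailored visual features, TF-IDF stands in for its text-retrieval component, OCR is assumed to have run upstream, and the equal weighting is arbitrary.

```python
# Minimal sketch of a Tango-style duplicate ranker (illustrative only).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def visual_signature(frames):
    """Mean 8x8x8 color histogram over a video's frames (HxWx3 uint8 arrays)."""
    hists = []
    for frame in frames:
        h, _ = np.histogramdd(frame.reshape(-1, 3).astype(float),
                              bins=(8, 8, 8), range=((0, 256),) * 3)
        hists.append(h.ravel() / max(h.sum(), 1.0))
    return np.mean(hists, axis=0)

def rank_duplicates(query, corpus, w_visual=0.5):
    """Rank corpus videos by weighted visual+textual similarity to the query.

    query and each corpus item are dicts with:
      'frames'   - list of HxWx3 numpy arrays sampled from the recording
      'ocr_text' - text recognized on those frames (OCR assumed done upstream)
    """
    # Textual signal: TF-IDF cosine similarity over the OCR'd screen text.
    texts = [query["ocr_text"]] + [v["ocr_text"] for v in corpus]
    tfidf = TfidfVectorizer().fit_transform(texts)
    text_sims = cosine_similarity(tfidf[0], tfidf[1:]).ravel()

    # Visual signal: cosine similarity between per-video histogram signatures.
    q_sig = visual_signature(query["frames"]).reshape(1, -1)
    vis_sims = np.array([
        cosine_similarity(q_sig,
                          visual_signature(v["frames"]).reshape(1, -1))[0, 0]
        for v in corpus
    ])

    # Linear combination of the two signals; the weighting is an assumption.
    combined = w_visual * vis_sims + (1.0 - w_visual) * text_sims
    return np.argsort(-combined)  # corpus indices, most likely duplicate first
```

The only point of the sketch is the two-signal fusion the title alludes to; the paper's actual visual and textual components are considerably more tailored.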
Related papers
- Semantic GUI Scene Learning and Video Alignment for Detecting Duplicate Video-based Bug Reports [16.45808969240553]
Video-based bug reports are increasingly being used to document bugs for programs centered around a graphical user interface (GUI).
We introduce a new approach, called JANUS, that adapts the scene-learning capabilities of vision transformers to capture subtle visual and textual patterns that manifest on app UI screens.
JANUS also makes use of a video alignment technique capable of adaptively weighting video frames to account for typical bug manifestation patterns.
arXiv Detail & Related papers (2024-07-11T15:48:36Z)
- VDebugger: Harnessing Execution Feedback for Debugging Visual Programs [103.61860743476933]
We introduce VDebugger, a critic-refiner framework trained to localize and debug visual programs by tracking execution step by step.
VDebugger identifies and corrects program errors by leveraging detailed execution feedback, improving interpretability and accuracy.
Evaluations on six datasets demonstrate V Debugger's effectiveness, showing performance improvements of up to 3.2% in downstream task accuracy.
arXiv Detail & Related papers (2024-06-19T11:09:16Z)
- VANE-Bench: Video Anomaly Evaluation Benchmark for Conversational LMMs [64.60035916955837]
VANE-Bench is a benchmark designed to assess the proficiency of Video-LMMs in detecting anomalies and inconsistencies in videos.
Our dataset comprises an array of videos synthetically generated using existing state-of-the-art text-to-video generation models.
We evaluate nine existing Video-LMMs, both open- and closed-source, on this benchmark and find that most models struggle to effectively identify the subtle anomalies.
arXiv Detail & Related papers (2024-06-14T17:59:01Z)
- Toward Rapid Bug Resolution for Android Apps [0.4759142872591625]
This paper describes the existing limitations of bug reports and identifies potential strategies for addressing them.
Our vision is a future in which alleviating these limitations and successfully pursuing the proposed research directions benefits both reporters and developers.
arXiv Detail & Related papers (2023-12-23T18:29:06Z)
- Uncovering Hidden Connections: Iterative Search and Reasoning for Video-grounded Dialog [83.63849872250651]
Video-grounded dialog requires profound understanding of both dialog history and video content for accurate response generation.
We present an iterative search and reasoning framework, which consists of a textual encoder, a visual encoder, and a generator.
arXiv Detail & Related papers (2023-10-11T07:37:13Z)
- FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios [87.12753459582116]
A wider range of tasks now faces an increasing risk of factual errors when handled by generative models.
We propose FacTool, a task- and domain-agnostic framework for detecting factual errors in texts generated by large language models.
arXiv Detail & Related papers (2023-07-25T14:20:51Z)
- Auto-labelling of Bug Report using Natural Language Processing [0.0]
Rule- and query-based solutions recommend long lists of potentially similar bug reports with no clear ranking.
In this paper, we propose a solution using a combination of NLP techniques.
It uses a custom data transformer, a deep neural network, and a non-generalizing machine learning method to retrieve existing identical bug reports.
arXiv Detail & Related papers (2022-12-13T02:32:42Z)
- ASCNet: Self-supervised Video Representation Learning with Appearance-Speed Consistency [62.38914747727636]
We study self-supervised video representation learning, which is a challenging task due to 1) a lack of labels for explicit supervision and 2) unstructured and noisy visual information.
Existing methods mainly use contrastive loss with video clips as the instances and learn visual representation by discriminating instances from each other.
In this paper, we observe that consistency between positive samples is the key to learning robust video representations (a minimal contrastive-loss sketch appears after this list).
arXiv Detail & Related papers (2021-06-04T08:44:50Z)
- Video Exploration via Video-Specific Autoencoders [60.256055890647595]
We present video-specific autoencoders that enable human-controllable video exploration.
We observe that a simple autoencoder trained on multiple frames of a specific video enables one to perform a large variety of video processing and editing tasks.
arXiv Detail & Related papers (2021-03-31T17:56:13Z)
- Translating Video Recordings of Mobile App Usages into Replayable Scenarios [24.992877070869177]
V2S is a lightweight, automated approach for translating video recordings of Android app usages into replayable scenarios.
We performed an extensive evaluation of V2S involving 175 videos depicting 3,534 GUI-based actions collected from users exercising features and reproducing bugs from over 80 popular Android apps.
arXiv Detail & Related papers (2020-05-18T20:11:36Z)
- Advaita: Bug Duplicity Detection System [1.9624064951902522]
The duplicate bug rate (the percentage of reported bugs that are duplicates) ranges from single digits (1 to 9%) to double digits (up to 40%), depending on product maturity, codebase size, and the number of engineers working on the project.
Duplicate detection deals with identifying whether two bug reports convey the same meaning.
This approach considers multiple sets of features, namely basic textual statistics, semantic features, and contextual features (a minimal retrieval sketch in this spirit appears after this list).
arXiv Detail & Related papers (2020-01-24T04:48:39Z)
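For the self-supervised entry above (ASCNet), the following sketch shows the generic instance-discrimination contrastive loss (InfoNCE) that such clip-based methods build on. ASCNet's specific appearance- and speed-consistency terms are not reproduced here, and all names are illustrative.

```python
# Minimal sketch of the InfoNCE contrastive loss used for instance
# discrimination over video clips (illustrative; not ASCNet's exact loss).
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE over a batch: each anchor's positive is the matching row;
    all other positives in the batch act as negatives.

    anchors, positives: (batch, dim) arrays of clip embeddings.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # (batch, batch) similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))          # diagonal = matched pairs

# Toy usage: 4 clips embedded in 8 dimensions.
rng = np.random.default_rng(0)
z1, z2 = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
print(info_nce_loss(z1, z2))
```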
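Several of the text-based entries above (Advaita and the auto-labelling work, as well as Tango's own textual component) rest on ranking reports by textual similarity. The sketch below is a minimal TF-IDF baseline in that spirit; the function name and threshold are illustrative assumptions, not any paper's actual implementation.

```python
# Hypothetical text-retrieval baseline for duplicate bug report detection.
# This is a sketch, not the implementation of Advaita or any paper above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def find_duplicates(new_report, existing_reports, threshold=0.3):
    """Return (index, similarity) pairs for likely duplicates, best first.

    new_report: title + description of the incoming report (string).
    existing_reports: list of strings for previously filed reports.
    threshold: assumed cut-off below which a report is not flagged.
    """
    tfidf = TfidfVectorizer(stop_words="english")
    matrix = tfidf.fit_transform([new_report] + existing_reports)
    sims = cosine_similarity(matrix[0], matrix[1:]).ravel()
    ranked = sorted(enumerate(sims), key=lambda pair: -pair[1])
    return [(i, s) for i, s in ranked if s >= threshold]

# Toy usage:
reports = [
    "App crashes when tapping the login button",
    "Crash on the login screen after pressing sign-in",
    "Dark mode colors are wrong on the settings page",
]
print(find_duplicates("Pressing the login button causes a crash", reports))
```

Approaches like Advaita layer semantic and contextual features on top of this kind of lexical baseline; its inadequacy for purely visual recordings is exactly why Tango adds a visual signal.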