S3M: Siamese Stack (Trace) Similarity Measure
- URL: http://arxiv.org/abs/2103.10526v1
- Date: Thu, 18 Mar 2021 21:10:41 GMT
- Title: S3M: Siamese Stack (Trace) Similarity Measure
- Authors: Aleksandr Khvorov, Roman Vasiliev, George Chernishev, Irving Muller Rodrigues, Dmitrij Koznov, Nikita Povarov
- Abstract summary: We present S3M -- the first approach to computing stack trace similarity based on deep learning.
It is based on a biLSTM encoder and a fully-connected classifier to compute similarity.
Our experiments demonstrate the superiority of our approach over the state-of-the-art on both open-sourced data and a private JetBrains dataset.
- Score: 55.58269472099399
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic crash reporting systems have become a de-facto standard in software
development. These systems monitor target software, and if a crash occurs they
send details to a backend application. Later on, these reports are aggregated
and used in the development process to 1) understand whether it is a new or an
existing issue, 2) assign these bugs to appropriate developers, and 3) gain a
general overview of the application's bug landscape. The efficiency of report
aggregation and subsequent operations heavily depends on the quality of the
report similarity metric. However, a distinctive feature of this kind of report
is that no textual input from the user (i.e., bug description) is available: it
contains only stack trace information.
In this paper, we present S3M ("extreme") -- the first approach to computing
stack trace similarity based on deep learning. It is based on a siamese
architecture that uses a biLSTM encoder and a fully-connected classifier to
compute similarity. Our experiments demonstrate the superiority of our approach
over the state-of-the-art on both open-sourced data and a private JetBrains
dataset. Additionally, we review the impact of stack trace trimming on the
quality of the results.
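
The architecture described in the abstract (one shared biLSTM encoder applied to both stack traces, followed by a fully-connected classifier) can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the class names, the dimensions, the random untrained weights, and the choice of |u - v| and u * v as classifier inputs are all assumptions made here.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTM:
    """Single-layer LSTM with random, untrained weights (illustrative only)."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        n = input_dim + hidden_dim
        # Rows are grouped by gate: input, forget, output, candidate.
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                  for _ in range(4 * hidden_dim)]
        self.b = [0.0] * (4 * hidden_dim)

    def run(self, inputs):
        """Return the final hidden state after reading all inputs."""
        H = self.hidden_dim
        h, c = [0.0] * H, [0.0] * H
        for x in inputs:
            xh = x + h  # concatenate input vector and previous hidden state
            z = [sum(w * v for w, v in zip(row, xh)) + b
                 for row, b in zip(self.w, self.b)]
            i = [sigmoid(v) for v in z[:H]]
            f = [sigmoid(v) for v in z[H:2 * H]]
            o = [sigmoid(v) for v in z[2 * H:3 * H]]
            g = [math.tanh(v) for v in z[3 * H:]]
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h


class SiameseSimilarity:
    """Hypothetical siamese model: both traces pass through the SAME encoder."""

    def __init__(self, vocab, embed_dim=8, hidden_dim=6, seed=0):
        rng = random.Random(seed)
        self.embed = {frame: [rng.uniform(-1, 1) for _ in range(embed_dim)]
                      for frame in vocab}
        self.unk = [0.0] * embed_dim  # embedding for unseen frames
        self.fwd = LSTM(embed_dim, hidden_dim, rng)
        self.bwd = LSTM(embed_dim, hidden_dim, rng)
        in_dim = 4 * hidden_dim  # |u - v| and u * v, each of size 2 * hidden_dim
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                   for _ in range(hidden_dim)]
        self.b1 = [0.0] * hidden_dim
        self.w2 = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
        self.b2 = 0.0

    def encode(self, trace):
        """biLSTM encoding: forward and backward final states, concatenated."""
        xs = [self.embed.get(frame, self.unk) for frame in trace]
        return self.fwd.run(xs) + self.bwd.run(xs[::-1])

    def similarity(self, trace_a, trace_b):
        """Fully-connected classifier on symmetric pair features (assumed)."""
        u, v = self.encode(trace_a), self.encode(trace_b)
        feats = [abs(p - q) for p, q in zip(u, v)] + [p * q for p, q in zip(u, v)]
        hidden = [max(0.0, sum(w * x for w, x in zip(row, feats)) + b)
                  for row, b in zip(self.w1, self.b1)]
        return sigmoid(sum(w * x for w, x in zip(self.w2, hidden)) + self.b2)


# Frame names below are made up for the example.
model = SiameseSimilarity(vocab=["app.Main.run", "app.Db.query", "lib.Io.read"])
score = model.similarity(["app.Main.run", "app.Db.query"],
                         ["app.Db.query", "lib.Io.read"])
print(score)
```

Because the weights are random, the printed score carries no meaning until the model is trained on labelled pairs of similar and dissimilar stack traces. The sketch only shows the data flow: weight sharing between the two branches, and a pair feature that is symmetric, so swapping the two traces yields the same score.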
Related papers
- SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding [56.079013202051094]
We present SegVG, a novel method that transfers the box-level annotation as signals to provide additional pixel-level supervision for Visual Grounding.
This approach allows us to iteratively exploit the annotation as signals for both box-level regression and pixel-level segmentation.
arXiv Detail & Related papers (2024-07-03T15:30:45Z)
- ADer: A Comprehensive Benchmark for Multi-class Visual Anomaly Detection [52.228708947607636]
This paper proposes a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new anomaly detection methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
- Fact Checking Beyond Training Set [64.88575826304024]
We show that the retriever-reader suffers from performance deterioration when it is trained on labeled data from one domain and used in another domain.
We propose an adversarial algorithm to make the retriever component robust against distribution shift.
We then construct eight fact checking scenarios from these datasets, and compare our model to a set of strong baseline models.
arXiv Detail & Related papers (2024-03-27T15:15:14Z)
- EMBERSim: A Large-Scale Databank for Boosting Similarity Search in Malware Analysis [48.5877840394508]
In recent years there has been a shift from quantifications-based malware detection towards machine learning.
We propose to address the deficiencies in the space of similarity research on binary files, starting from EMBER.
We enhance EMBER with similarity information as well as malware class tags, to enable further research in the similarity space.
arXiv Detail & Related papers (2023-10-03T06:58:45Z)
- MaintainoMATE: A GitHub App for Intelligent Automation of Maintenance Activities [3.2228025627337864]
Software development projects rely on issue tracking systems to track maintenance tasks such as bug reports and enhancement requests.
Handling issue reports is critical and requires thorough scanning of the text entered in each report, which makes it a labor-intensive task.
We present a unified framework called MaintainoMATE, which is capable of automatically sorting issue reports into their respective categories and further assigning them to a developer with relevant expertise.
arXiv Detail & Related papers (2023-08-31T05:15:42Z)
- Modeling Entities as Semantic Points for Visual Information Extraction in the Wild [55.91783742370978]
We propose an alternative approach to precisely and robustly extract key information from document images.
We explicitly model entities as semantic points, i.e., center points of entities are enriched with semantic information describing the attributes and relationships of different entities.
The proposed method can achieve significantly enhanced performance on entity labeling and linking, compared with previous state-of-the-art models.
arXiv Detail & Related papers (2023-03-23T08:21:16Z)
- Auto-labelling of Bug Report using Natural Language Processing [0.0]
Rule and Query-based solutions recommend a long list of potential similar bug reports with no clear ranking.
In this paper, we have proposed a solution using a combination of NLP techniques.
It uses a custom data transformer, a deep neural network, and a non-generalizing machine learning method to retrieve existing identical bug reports.
arXiv Detail & Related papers (2022-12-13T02:32:42Z)
- Automatic Classification of Bug Reports Based on Multiple Text Information and Reports' Intention [37.67372105858311]
This paper proposes a new automatic classification method for bug reports.
The innovation is that when categorizing bug reports, in addition to using the text information of the report, the intention of the report is also considered.
Our proposed method achieves better performance, with an F-measure ranging from 87.3% to 95.5%.
arXiv Detail & Related papers (2022-08-02T06:44:51Z)
- Leveraging Structural Properties of Source Code Graphs for Just-In-Time Bug Prediction [6.467090475885797]
A graph is one of the most commonly used representations for understanding relational data.
In this study, we propose a methodology to utilize the relational properties of source code in the form of a graph.
arXiv Detail & Related papers (2022-01-25T07:20:47Z)
- Mining Knowledge Graphs From Incident Reports [3.3395585414528663]
Incident reports filed by customers are largely unstructured, making diagnosis or mitigation non-trivial.
We present an approach to mine and score binary entity relations from co-occurring entity pairs.
We construct knowledge graphs automatically and show that the implicit knowledge in the graph can be used to rank relevant entities for distinct incidents.
arXiv Detail & Related papers (2021-01-15T04:15:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.