Mining Java Memory Errors using Subjective Interesting Subgroups with
Hierarchical Targets
- URL: http://arxiv.org/abs/2310.00781v1
- Date: Sun, 1 Oct 2023 20:24:59 GMT
- Title: Mining Java Memory Errors using Subjective Interesting Subgroups with
Hierarchical Targets
- Authors: Youcef Remil and Anes Bendimerad and Mathieu Chambard and Romain
Mathonat and Marc Plantevit and Mehdi Kaytoue
- Abstract summary: Subgroup Discovery (SD) is a data mining method that can automatically mine incident datasets and extract discriminant patterns to identify the root causes of issues.
We propose a novel SD approach that can handle complex target concepts with hierarchies.
We apply this framework to investigate out-of-memory errors and demonstrate its usefulness in incident diagnosis.
- Score: 1.188383832081829
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Software applications, especially Enterprise Resource Planning (ERP) systems,
are crucial to the day-to-day operations of many industries. Therefore, it is
essential to maintain these systems effectively using tools that can identify,
diagnose, and mitigate their incidents. One promising data-driven approach is
the Subgroup Discovery (SD) technique, a data mining method that can
automatically mine incident datasets and extract discriminant patterns to
identify the root causes of issues. However, current SD solutions have
limitations in handling complex target concepts with multiple attributes
organized hierarchically. To illustrate this scenario, we examine the case of
Java out-of-memory incidents among several possible applications. We have a
dataset that describes these incidents, including their context and the types
of Java objects occupying memory when it reaches saturation, with these types
arranged hierarchically. This scenario inspires us to propose a novel Subgroup
Discovery approach that can handle complex target concepts with hierarchies. To
achieve this, we design a pattern syntax and a quality measure that ensure the
identified subgroups are relevant, non-redundant, and resilient to noise. To
achieve the desired quality measure, we use the Subjective Interestingness
model that incorporates prior knowledge about the data and promotes patterns
that are both informative and surprising relative to that knowledge. We apply
this framework to investigate out-of-memory errors and demonstrate its
usefulness in incident diagnosis. To validate the effectiveness of our approach
and the quality of the identified patterns, we present an empirical study. The
source code and data used in the evaluation are publicly accessible, ensuring
transparency and reproducibility.
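As a rough illustration of the Subgroup Discovery idea described in the abstract, the sketch below enumerates small conjunctive patterns over a toy incident table and ranks them by how strongly they separate the target. It is not the paper's method: it swaps in the classic Weighted Relative Accuracy (WRAcc) score instead of the Subjective Interestingness measure, and it uses a flat binary target rather than a hierarchy of Java object types. The dataset, the attribute names (`env`, `version`, `oom`), and the helper functions are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy incidents: context attributes plus a binary target flag.
INCIDENTS = [
    {"env": "prod", "version": "v2", "oom": True},
    {"env": "prod", "version": "v2", "oom": True},
    {"env": "prod", "version": "v1", "oom": False},
    {"env": "test", "version": "v2", "oom": False},
    {"env": "test", "version": "v1", "oom": False},
    {"env": "test", "version": "v1", "oom": False},
]


def wracc(rows, pattern, target):
    """WRAcc = coverage * (target rate in subgroup - overall target rate)."""
    covered = [r for r in rows if all(r[a] == v for a, v in pattern)]
    if not covered:
        return 0.0
    p_all = sum(r[target] for r in rows) / len(rows)
    p_sub = sum(r[target] for r in covered) / len(covered)
    return len(covered) / len(rows) * (p_sub - p_all)


def top_subgroups(rows, target, max_len=2, k=3):
    """Exhaustively score conjunctions of attribute=value selectors."""
    attrs = sorted(a for a in rows[0] if a != target)
    selectors = sorted({(a, r[a]) for r in rows for a in attrs})
    candidates = []
    for size in range(1, max_len + 1):
        for combo in combinations(selectors, size):
            # Skip patterns that constrain the same attribute twice.
            if len({a for a, _ in combo}) < size:
                continue
            candidates.append((wracc(rows, combo, target), combo))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:k]


if __name__ == "__main__":
    for score, pattern in top_subgroups(INCIDENTS, "oom"):
        description = " AND ".join(f"{a}={v}" for a, v in pattern)
        print(f"{score:.3f}  {description}")
```

Exhaustive enumeration is only workable for tiny tables like this one; real SD tools typically rely on beam search or pruning, and a measure such as the one the abstract describes would also account for prior expectations about the data.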
Related papers
- DATABench: Evaluating Dataset Auditing in Deep Learning from an Adversarial Perspective [59.66984417026933]
We introduce a novel taxonomy, classifying existing methods based on their reliance on internal features (IF) (inherent to the data) versus external features (EF) (artificially introduced for auditing).
We formulate two primary attack types: evasion attacks, designed to conceal the use of a dataset, and forgery attacks, intending to falsely implicate an unused dataset.
Building on the understanding of existing methods and attack objectives, we further propose systematic attack strategies: decoupling, removal, and detection for evasion; adversarial example-based methods for forgery.
Our benchmark, DATABench, comprises 17 evasion attacks, 5 forgery attacks, and 9
arXiv Detail & Related papers (2025-07-08T03:07:15Z) - Segment Concealed Objects with Incomplete Supervision [63.637733655439334]
Incompletely-Supervised Concealed Object Segmentation (ISCOS) involves segmenting objects that seamlessly blend into their surrounding environments.
This task remains highly challenging due to the limited supervision provided by the incompletely annotated training data.
In this paper, we introduce the first unified method for ISCOS to address these challenges.
arXiv Detail & Related papers (2025-06-10T16:25:15Z) - Extending Dataset Pruning to Object Detection: A Variance-based Approach [0.0]
We present the first extension of classification pruning techniques to the object detection domain.
We propose tailored solutions, including a novel scoring method called Variance-based Prediction Score (VPS).
Our work bridges dataset pruning and object detection, paving the way for dataset pruning in complex vision tasks.
arXiv Detail & Related papers (2025-05-22T19:46:51Z) - OptiGait-LGBM: An Efficient Approach of Gait-based Person Re-identification in Non-Overlapping Regions [0.26388783516590225]
We propose an OptiGait-LGBM model capable of recognizing person re-identification using a skeletal model approach.
A benchmark dataset, RUET-GAIT, is introduced to represent uncontrolled gait sequences in complex outdoor environments.
Our aim is to address the aforementioned challenges with minimal computational cost compared to existing methods.
arXiv Detail & Related papers (2025-05-10T08:28:57Z) - Detecting Knowledge Boundary of Vision Large Language Models by Sampling-Based Inference [78.08901120841833]
We propose a method to detect the knowledge boundary of Visual Large Language Models (VLLMs).
We show that our method successfully depicts a VLLM's knowledge boundary based on which we are able to reduce indiscriminate retrieval while maintaining or improving the performance.
arXiv Detail & Related papers (2025-02-25T09:32:08Z) - Collaborative Feature-Logits Contrastive Learning for Open-Set Semi-Supervised Object Detection [75.02249869573994]
In open-set scenarios, the unlabeled dataset contains both in-distribution (ID) classes and out-of-distribution (OOD) classes.
Applying semi-supervised detectors in such settings can lead to misclassifying OOD classes as ID classes.
We propose a simple yet effective method, termed Collaborative Feature-Logits Detector (CFL-Detector).
arXiv Detail & Related papers (2024-11-20T02:57:35Z) - Resilience to the Flowing Unknown: an Open Set Recognition Framework for Data Streams [6.7236795813629]
This work investigates the application of an Open Set Recognition framework that combines classification and clustering to address the over-occupied space problem in streaming scenarios.
arXiv Detail & Related papers (2024-10-31T11:06:54Z) - Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z) - infoVerse: A Universal Framework for Dataset Characterization with
Multidimensional Meta-information [68.76707843019886]
infoVerse is a universal framework for dataset characterization.
infoVerse captures multidimensional characteristics of datasets by incorporating various model-driven meta-information.
In three real-world applications (data pruning, active learning, and data annotation), the samples chosen on infoVerse space consistently outperform strong baselines.
arXiv Detail & Related papers (2023-05-30T18:12:48Z) - Data AUDIT: Identifying Attribute Utility- and Detectability-Induced
Bias in Task Models [8.420252576694583]
We present a first technique for the rigorous, quantitative screening of medical image datasets.
Our method decomposes the risks associated with dataset attributes in terms of their detectability and utility.
Using our method, we show our screening method reliably identifies nearly imperceptible bias-inducing artifacts.
arXiv Detail & Related papers (2023-04-06T16:50:15Z) - INoD: Injected Noise Discriminator for Self-Supervised Representation
Learning in Agricultural Fields [6.891600948991265]
We propose an Injected Noise Discriminator (INoD) which exploits principles of feature replacement and dataset discrimination for self-supervised representation learning.
INoD interleaves feature maps from two disjoint datasets during their convolutional encoding and predicts the dataset affiliation of the resultant feature map as a pretext task.
Our approach enables the network to learn unequivocal representations of objects seen in one dataset while observing them in conjunction with similar features from the disjoint dataset.
arXiv Detail & Related papers (2023-03-31T14:46:31Z) - A Framework for Verifiable and Auditable Federated Anomaly Detection [3.639790324866155]
Federated Learning is an emerging approach to manage cooperation between a group of agents for the solution of Machine Learning tasks.
We present a novel algorithmic architecture that tackles this problem in the particular case of Anomaly Detection.
arXiv Detail & Related papers (2022-03-15T11:34:02Z) - Learning to Detect Instance-level Salient Objects Using Complementary
Image Labels [55.049347205603304]
We present the first weakly-supervised approach to the salient instance detection problem.
We propose a novel weakly-supervised network with three branches: a Saliency Detection Branch leveraging class consistency information to locate candidate objects; a Boundary Detection Branch exploiting class discrepancy information to delineate object boundaries; and a Centroid Detection Branch using subitizing information to detect salient instance centroids.
arXiv Detail & Related papers (2021-11-19T10:15:22Z) - Class Introspection: A Novel Technique for Detecting Unlabeled
Subclasses by Leveraging Classifier Explainability Methods [0.0]
Detecting latent structure is a crucial step in performing analysis of a dataset.
By leveraging instance explanation methods, an existing classifier can be extended to detect latent classes.
This paper also contains a pipeline for analyzing classifiers automatically, and a web application for interactively exploring the results from this technique.
arXiv Detail & Related papers (2021-07-04T14:58:29Z) - Predicting Themes within Complex Unstructured Texts: A Case Study on
Safeguarding Reports [66.39150945184683]
We focus on the problem of automatically identifying the main themes in a safeguarding report using supervised classification approaches.
Our results show the potential of deep learning models to simulate subject-expert behaviour even for complex tasks with limited labelled data.
arXiv Detail & Related papers (2020-10-27T19:48:23Z) - A Few-Shot Sequential Approach for Object Counting [63.82757025821265]
We introduce a class attention mechanism that sequentially attends to objects in the image and extracts their relevant features.
The proposed technique is trained on point-level annotations and uses a novel loss function that disentangles class-dependent and class-agnostic aspects of the model.
We present our results on a variety of object-counting/detection datasets, including FSOD and MS COCO.
arXiv Detail & Related papers (2020-07-03T18:23:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.