Automating DBSCAN via Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2208.04537v1
- Date: Tue, 9 Aug 2022 04:40:11 GMT
- Title: Automating DBSCAN via Deep Reinforcement Learning
- Authors: Ruitong Zhang, Hao Peng, Yingtong Dou, Jia Wu, Qingyun Sun, Jingyi
Zhang, Philip S. Yu
- Abstract summary: We propose a novel Deep Reinforcement Learning guided automatic DBSCAN parameters search framework, namely DRL-DBSCAN.
The framework models the process of adjusting the parameter search direction by perceiving the clustering environment as a Markov decision process.
The framework consistently improves DBSCAN clustering accuracy by up to 26% and 25% on offline and online tasks, respectively.
- Score: 73.82740568765279
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: DBSCAN is widely used in many scientific and engineering fields because of
its simplicity and practicality. However, because its parameters are highly
sensitive, the accuracy of the clustering result depends heavily on practical
experience. In this paper, we first propose a novel Deep Reinforcement Learning
guided automatic DBSCAN parameters search framework, namely DRL-DBSCAN. The
framework models the process of adjusting the parameter search direction by
perceiving the clustering environment as a Markov decision process, which aims
to find the best clustering parameters without manual assistance. DRL-DBSCAN
learns the optimal clustering parameter search policy for different feature
distributions via interacting with the clusters, using a weakly-supervised
reward training policy network. In addition, we also present a recursive search
mechanism driven by the scale of the data to efficiently and controllably
process large parameter spaces. Extensive experiments are conducted on five
artificial and real-world datasets based on the proposed four working modes.
The results of offline and online tasks show that the DRL-DBSCAN not only
consistently improves DBSCAN clustering accuracy by up to 26% and 25%
respectively, but also can stably find the dominant parameters with high
computational efficiency. The code is available at
https://github.com/RingBDStack/DRL-DBSCAN.
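The core idea of the abstract can be illustrated with a toy sketch, not the authors' implementation: cast the search for DBSCAN's eps parameter as a small Markov decision process, where states are points on an eps grid, actions adjust the search direction left or right, and a reward signals clustering quality. DRL-DBSCAN trains a policy network against a weakly-supervised clustering reward; here, a hypothetical unimodal score stands in for that reward and plain tabular Q-learning stands in for the policy network.

```python
import random

# Candidate eps values on a fixed grid (0.05 .. 1.00).
EPS_GRID = [round(0.05 * i, 2) for i in range(1, 21)]

def clustering_reward(eps):
    """Stand-in for an external clustering metric; peaks at eps = 0.5."""
    return 1.0 - abs(eps - 0.5)

def search_eps(episodes=500, steps=20, alpha=0.5, gamma=0.9,
               explore=0.3, seed=0):
    """Tabular Q-learning over the eps grid: action 0 moves the
    search left, action 1 moves it right."""
    rng = random.Random(seed)
    n = len(EPS_GRID)
    q = {(s, a): 0.0 for s in range(n) for a in (0, 1)}
    for _ in range(episodes):
        s = rng.randrange(n)  # start each episode at a random eps
        for _ in range(steps):
            # Epsilon-greedy choice of search direction.
            if rng.random() < explore:
                a = rng.choice((0, 1))
            else:
                a = max((0, 1), key=lambda x: q[(s, x)])
            s2 = max(0, min(n - 1, s + (1 if a == 1 else -1)))
            r = clustering_reward(EPS_GRID[s2])
            # Standard Q-learning update toward the bootstrapped target.
            q[(s, a)] += alpha * (r + gamma * max(q[(s2, 0)], q[(s2, 1)])
                                  - q[(s, a)])
            s = s2
    # Report the eps whose learned state value is highest.
    best = max(range(n), key=lambda i: max(q[(i, 0)], q[(i, 1)]))
    return EPS_GRID[best]
```

In this sketch the learned values concentrate around the reward peak, so the returned eps lands near 0.5; in the actual framework the reward comes from evaluating DBSCAN clusterings on data, and a recursive, data-scale-driven scheme narrows the grid instead of fixing it up front.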
Related papers
- ARLBench: Flexible and Efficient Benchmarking for Hyperparameter Optimization in Reinforcement Learning [42.33815055388433]
ARLBench is a benchmark for hyperparameter optimization (HPO) in reinforcement learning (RL)
It allows comparisons of diverse HPO approaches while being highly efficient in evaluation.
ARLBench is an efficient, flexible, and future-oriented foundation for research on AutoRL.
arXiv Detail & Related papers (2024-09-27T15:22:28Z)
- D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning [99.33607114541861]
We propose a new benchmark for offline RL that focuses on realistic simulations of robotic manipulation and locomotion environments.
Our proposed benchmark covers state-based and image-based domains, and supports both offline RL and online fine-tuning evaluation.
arXiv Detail & Related papers (2024-08-15T22:27:00Z)
- Low-Rank Representations Meets Deep Unfolding: A Generalized and Interpretable Network for Hyperspectral Anomaly Detection [41.50904949744355]
Current hyperspectral anomaly detection (HAD) benchmark datasets suffer from low resolution, simple background, and small size of the detection data.
These factors also limit the performance of the well-known low-rank representation (LRR) models in terms of robustness.
We build a new set of HAD benchmark datasets for improving the robustness of the HAD algorithm in complex scenarios, AIR-HAD for short.
arXiv Detail & Related papers (2024-02-23T14:15:58Z)
- Efficient Architecture Search via Bi-level Data Pruning [70.29970746807882]
This work pioneers an exploration into the critical role of dataset characteristics for DARTS bi-level optimization.
We introduce a new progressive data pruning strategy that utilizes supernet prediction dynamics as the metric.
Comprehensive evaluations on the NAS-Bench-201 search space, DARTS search space, and MobileNet-like search space validate that BDP reduces search costs by over 50%.
arXiv Detail & Related papers (2023-12-21T02:48:44Z)
- Learning Regions of Interest for Bayesian Optimization with Adaptive Level-Set Estimation [84.0621253654014]
We propose a framework, called BALLET, which adaptively filters for a high-confidence region of interest.
We show theoretically that BALLET can efficiently shrink the search space, and can exhibit a tighter regret bound than standard BO.
arXiv Detail & Related papers (2023-07-25T09:45:47Z)
- Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data [28.846826115837825]
Offline reinforcement learning can be used to improve future performance by leveraging historical data.
We introduce a task- and method-agnostic pipeline for automatically training, comparing, selecting, and deploying the best policy.
We show it can have substantial impacts when the dataset is small.
arXiv Detail & Related papers (2022-10-16T21:24:53Z)
- No More Pesky Hyperparameters: Offline Hyperparameter Tuning for RL [28.31529154045046]
We propose a new approach to tune hyperparameters from offline logs of data.
We first learn a model of the environment from the offline data, which we call a calibration model, and then simulate learning in the calibration model.
We empirically investigate the method in a variety of settings to identify when it is effective and when it fails.
arXiv Detail & Related papers (2022-05-18T04:26:23Z)
- AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyperparameter tuning.
We show that using gradient-based data subsets for hyperparameter tuning achieves significantly faster turnaround times, with speedups of 3x-30x.
arXiv Detail & Related papers (2022-03-15T19:25:01Z)
- Hyperparameter Tuning for Deep Reinforcement Learning Applications [0.3553493344868413]
We propose a distributed variable-length genetic algorithm framework to tune hyperparameters for various RL applications.
Our results show that with more generations, the algorithm finds optimal solutions that require fewer training episodes, are computationally cheaper, and are more robust for deployment.
arXiv Detail & Related papers (2022-01-26T20:43:13Z)
- AutoHAS: Efficient Hyperparameter and Architecture Search [104.29883101871083]
AutoHAS learns to alternately update the shared network weights and a reinforcement learning controller.
A temporary weight is introduced to store the updated weight from the selected HPs.
In experiments, we show AutoHAS is efficient and generalizable to different search spaces, baselines and datasets.
arXiv Detail & Related papers (2020-06-05T19:57:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.