Related papers: Pointer Networks with Q-Learning for Combinatorial Optimization

Pointer Networks with Q-Learning for Combinatorial Optimization

URL: http://arxiv.org/abs/2311.02629v3
Date: Mon, 17 Jun 2024 10:27:54 GMT
Title: Pointer Networks with Q-Learning for Combinatorial Optimization
Authors: Alessandro Barro,
Abstract summary: We introduce the Pointer Q-Network (PQN), a hybrid neural architecture that integrates model-free Q-value policy approximation with Pointer Networks (Ptr-Nets) Our empirical results demonstrate the efficacy of this approach, also testing the model in unstable environments.
Score: 55.2480439325792
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We introduce the Pointer Q-Network (PQN), a hybrid neural architecture that integrates model-free Q-value policy approximation with Pointer Networks (Ptr-Nets) to enhance the optimality of attention-based sequence generation, focusing on long-term outcomes. This integration proves particularly effective in solving combinatorial optimization (CO) tasks, especially the Travelling Salesman Problem (TSP), which is the focus of our study. We address this challenge by defining a Markov Decision Process (MDP) compatible with PQN, which involves iterative graph embedding, encoding and decoding by an LSTM-based recurrent neural network. This process generates a context vector and computes raw attention scores, which are dynamically adjusted by Q-values calculated for all available state-action pairs before applying softmax. The resulting attention vector is utilized as an action distribution, with actions selected hinged to exploration-exploitation dynamic adaptibility of PQN. Our empirical results demonstrate the efficacy of this approach, also testing the model in unstable environments.

Related papers

Selective Feature Re-Encoded Quantum Convolutional Neural Network with Joint Optimization for Image Classification [3.8876018618878585]
Quantum convolutional neural networks (QCNNs) have demonstrated promising results in classifying both quantum and classical data.<n>This study proposes a novel strategy to enhance feature processing and a QCNN architecture for improved classification accuracy.
arXiv Detail & Related papers (2025-07-02T18:51:56Z)
Quantum-Classical Hybrid Quantized Neural Network [7.759760132559044]
We present a novel Quadratic Binary Optimization (QBO) model for quantized neural network training, enabling the use of arbitrary activation and loss functions.<n>We employ the Quantum Gradient Conditional Descent (QCGD) algorithm, which leverages quantum computing to directly solve the QCBO problem.<n> Experimental results using a coherent Ising machine (CIM) demonstrate a 94.95% accuracy on the Fashion MNIST classification task, with only 1.1-bit precision.
arXiv Detail & Related papers (2025-06-23T02:12:36Z)
CVaR-Based Variational Quantum Optimization for User Association in Handoff-Aware Vehicular Networks [23.140655547353994]
We present a novel Conditional Value at Risk (CVaR)-based Variational Quantum Eigensolver (VQE) framework to address generalized assignment problems (GAP) in vehicular networks (VNets) Our approach leverages a hybrid quantum-classical structure, integrating a tailored cost function that balances both objective and constraint-specific penalties to improve solution quality and stability. We apply this framework to a user-association problem in VNets, where our method achieves 23.5% improvement compared to the deep neural network (DNN) approach.
arXiv Detail & Related papers (2025-01-14T20:21:06Z)
An Adaptive Collocation Point Strategy For Physics Informed Neural Networks via the QR Discrete Empirical Interpolation Method [1.2289361708127877]
We propose an adaptive collocation point selection strategy utilizing the QR Discrete Empirical Interpolation Method (QR-DEIM) Our results on benchmark PDEs, including the wave, Allen-Cahn, and Burgers' equations, demonstrate that our QR-DEIM-based approach improves PINN accuracy compared to existing methods.
arXiv Detail & Related papers (2025-01-13T21:24:15Z)
Enhancing Variational Quantum Circuit Training: An Improved Neural Network Approach for Barren Plateau Mitigation [0.0]
variational quantum algorithms (VQAs) are among the most promising algorithms in near-term quantum computing. They iteratively update circuit parameters to optimize a cost function. The training of variational quantum circuits (VQCs) is susceptible to a phenomenon known as barren plateaus (BPs)
arXiv Detail & Related papers (2024-11-14T06:43:37Z)
Switchable Decision: Dynamic Neural Generation Networks [98.61113699324429]
We propose a switchable decision to accelerate inference by dynamically assigning resources for each data instance. Our method benefits from less cost during inference while keeping the same accuracy.
arXiv Detail & Related papers (2024-05-07T17:44:54Z)
Parallel Proportional Fusion of Spiking Quantum Neural Network for Optimizing Image Classification [10.069224006497162]
We introduce a novel architecture termed Parallel Proportional Fusion of Quantum and Spiking Neural Networks (PPF-QSNN) The proposed PPF-QSNN outperforms both the existing spiking neural network and the serial quantum neural network across metrics such as accuracy, loss, and robustness. This study lays the groundwork for the advancement and application of quantum advantage in artificial intelligent computations.
arXiv Detail & Related papers (2024-04-01T10:35:35Z)
Quantum Annealing and Graph Neural Networks for Solving TSP with QUBO [0.0]
The paper explores the application of Quadratic Unconstrained Binary Optimization (QUBO) models in solving the Travelling Salesman Problem (TSP) through Quantum Annealing algorithms and Graph Neural Networks. It introduces a Graph Neural Network solution for TSP (QGNN-TSP), which learns the underlying structure of the problem and produces competitive solutions via gradient descent over a QUBO-based loss function.
arXiv Detail & Related papers (2024-02-21T05:55:00Z)
Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement [67.1393112206885]
Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks. We introduce Entropy-Regularized Token-level Policy Optimization (ETPO), an entropy-augmented RL method tailored for optimizing LLMs at the token level. We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks.
arXiv Detail & Related papers (2024-02-09T07:45:26Z)
Multi-Timescale Ensemble Q-learning for Markov Decision Process Policy Optimization [21.30645601474163]
Original Q-learning suffers from performance and complexity challenges across very large networks. New model-free ensemble reinforcement learning algorithm which adapts the classical Q-learning is proposed to handle these challenges. Numerical results show that the proposed algorithm can achieve up to 55% less average policy error with up to 50% less runtime complexity.
arXiv Detail & Related papers (2024-02-08T08:08:23Z)
EPTQ: Enhanced Post-Training Quantization via Hessian-guided Network-wise Optimization [3.3998740964877463]
Quantization is a key method for deploying deep neural networks on edge devices with limited memory and computation resources. We propose a new method for enhanced Post-Training Quantization (EPTQ) that employs a network-wise quantization optimization process.
arXiv Detail & Related papers (2023-09-20T10:50:28Z)
Differentially Private Deep Q-Learning for Pattern Privacy Preservation in MEC Offloading [76.0572817182483]
attackers may eavesdrop on the offloading decisions to infer the edge server's (ES's) queue information and users' usage patterns. We propose an offloading strategy which jointly minimizes the latency, ES's energy consumption, and task dropping rate, while preserving pattern privacy (PP) We develop a Differential Privacy Deep Q-learning based Offloading (DP-DQO) algorithm to solve this problem while addressing the PP issue by injecting noise into the generated offloading decisions.
arXiv Detail & Related papers (2023-02-09T12:50:18Z)
A Deep-Unfolded Reference-Based RPCA Network For Video Foreground-Background Separation [86.35434065681925]
This paper proposes a new deep-unfolding-based network design for the problem of Robust Principal Component Analysis (RPCA) Unlike existing designs, our approach focuses on modeling the temporal correlation between the sparse representations of consecutive video frames. Experimentation using the moving MNIST dataset shows that the proposed network outperforms a recently proposed state-of-the-art RPCA network in the task of video foreground-background separation.
arXiv Detail & Related papers (2020-10-02T11:40:09Z)
Optimistic Exploration even with a Pessimistic Initialisation [57.41327865257504]
Optimistic initialisation is an effective strategy for efficient exploration in reinforcement learning (RL) In particular, in scenarios with only positive rewards, Q-values are initialised at their lowest possible values. We propose a simple count-based augmentation to pessimistically initialised Q-values that separates the source of optimism from the neural network.
arXiv Detail & Related papers (2020-02-26T17:15:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.