Universal Learning Waveform Selection Strategies for Adaptive Target
Tracking
- URL: http://arxiv.org/abs/2202.05294v1
- Date: Thu, 10 Feb 2022 19:21:03 GMT
- Title: Universal Learning Waveform Selection Strategies for Adaptive Target
Tracking
- Authors: Charles E. Thornton, R. Michael Buehrer, Harpreet S. Dhillon, Anthony
F. Martone
- Abstract summary: This work develops a sequential waveform selection scheme which achieves Bellman optimality in any radar scene.
An algorithm based on a multi-alphabet version of the Context-Tree Weighting (CTW) method can be used to optimally solve a broad class of waveform-agile tracking problems.
- Score: 42.4297040396286
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online selection of optimal waveforms for target tracking with active sensors
has long been a problem of interest. Many conventional solutions utilize an
estimation-theoretic interpretation, in which a waveform-specific
Cram\'{e}r-Rao lower bound on measurement error is used to select the optimal
waveform for each tracking step. However, this approach is only valid in the
high SNR regime, and requires a rather restrictive set of assumptions regarding
the target motion and measurement models. Further, due to computational
concerns, many traditional approaches are limited to near-term, or myopic,
optimization, even though radar scenes exhibit strong temporal correlation.
More recently, reinforcement learning has been proposed for waveform selection,
in which the problem is framed as a Markov decision process (MDP), allowing for
long-term planning. However, a major limitation of reinforcement learning is
that the memory length of the underlying Markov process is often unknown for
realistic target and channel dynamics, and a more general framework is
desirable. This work develops a universal sequential waveform selection scheme
which asymptotically achieves Bellman optimality in any radar scene which can
be modeled as a $U^{\text{th}}$ order Markov process for a finite, but unknown,
integer $U$. Our approach is based on well-established tools from the field of
universal source coding, where a stationary source is parsed into variable
length phrases in order to build a context-tree, which is used as a
probabilistic model for the scene's behavior. We show that an algorithm based
on a multi-alphabet version of the Context-Tree Weighting (CTW) method can be
used to optimally solve a broad class of waveform-agile tracking problems while
making minimal assumptions about the environment's behavior.
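The CTW method the abstract refers to mixes Krichevsky-Trofimov (KT) estimators over all tree sources up to a depth bound, so the model adapts to an unknown Markov order $U$. A minimal sketch of the depth-bounded, multi-alphabet CTW probability estimate follows; class and method names are illustrative, this is not the authors' implementation, and the Bellman-optimal planning layer built on top of it is omitted.

```python
import math
from collections import defaultdict

class CTW:
    """Depth-bounded multi-alphabet Context-Tree Weighting (sketch).

    Mixes KT estimators over all tree sources of depth <= D, so the
    estimate adapts to an unknown Markov order U <= D.
    """

    def __init__(self, alphabet_size, depth):
        self.m = alphabet_size
        self.D = depth
        # symbol counts per context; a context is a tuple of past
        # symbols, most recent first
        self.counts = defaultdict(lambda: [0] * alphabet_size)

    def update(self, history, symbol):
        """Record `symbol` observed after `history` (oldest first)."""
        past = tuple(reversed(history[-self.D:]))
        for d in range(len(past) + 1):
            self.counts[past[:d]][symbol] += 1

    def _kt_log_prob(self, ctx):
        """Log KT block probability of the counts stored at `ctx`."""
        c, n = self.counts[ctx], sum(self.counts[ctx])
        logp = sum(math.log(i + 0.5) for cj in c for i in range(cj))
        logp -= sum(math.log(t + self.m / 2.0) for t in range(n))
        return logp

    def _weighted_log_prob(self, ctx):
        """CTW recursion: P_w = (P_e + prod over children of P_w) / 2."""
        le = self._kt_log_prob(ctx)
        if len(ctx) == self.D:
            return le
        # children with zero counts contribute probability 1 (log 0)
        ls = sum(self._weighted_log_prob(ctx + (s,))
                 for s in range(self.m) if sum(self.counts[ctx + (s,)]))
        hi = max(le, ls)
        return math.log(0.5) + hi + math.log(math.exp(le - hi) + math.exp(ls - hi))

    def log_prob(self):
        """Weighted log-probability of all symbols seen so far."""
        return self._weighted_log_prob(())
```

Predictive probabilities for the next symbol follow by the ratio of weighted probabilities before and after a hypothetical update; on a strongly first-order sequence such as `0101...`, the depth-1 mixture assigns far higher probability than a memoryless model would.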
Related papers
- Non-iterative Optimization of Trajectory and Radio Resource for Aerial Network [7.824710236769593]
We address a joint trajectory planning, user association, resource allocation, and power control problem in the aerial IoT network.
Our framework can incorporate various trajectory planning algorithms such as the genetic, tree search, and reinforcement learning.
arXiv Detail & Related papers (2024-05-02T14:21:29Z)
- FlowPG: Action-constrained Policy Gradient with Normalizing Flows [14.98383953401637]
Action-constrained reinforcement learning (ACRL) is a popular approach for solving safety-critical and resource-allocation related decision making problems.
A major challenge in ACRL is to ensure that the agent takes a valid action satisfying the constraints at each step.
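FlowPG enforces validity with a learned normalizing-flow mapping; a much simpler baseline, shown here only to make the validity requirement concrete (this is not FlowPG's mechanism, and the box-plus-budget constraint set is hypothetical), repairs a raw policy output before execution:

```python
import numpy as np

def repair_action(raw, high, budget):
    """Map a raw policy output to a valid action under hypothetical
    constraints: elementwise bounds 0 <= a <= high and a resource
    budget sum(a) <= budget.

    Clipping enforces the box; a uniform rescale enforces the budget
    (scaling toward zero preserves 0 <= a <= high). This is a cheap
    feasibility repair, not an exact Euclidean projection.
    """
    a = np.clip(np.asarray(raw, dtype=float), 0.0, high)
    total = a.sum()
    if total > budget:
        a *= budget / total
    return a
```

A flow-based approach instead learns an invertible map from an unconstrained latent space onto the valid set, avoiding the distortion such ad-hoc repairs introduce.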
arXiv Detail & Related papers (2024-02-07T11:11:46Z)
- One-Dimensional Deep Image Prior for Curve Fitting of S-Parameters from Electromagnetic Solvers [57.441926088870325]
Deep Image Prior (DIP) is a technique that optimizes the weights of a randomly-initialized convolutional neural network to fit a signal from noisy or under-determined measurements.
Relative to publicly available implementations of Vector Fitting (VF), our method shows superior performance on nearly all test examples.
arXiv Detail & Related papers (2023-06-06T20:28:37Z)
- Multi-Objective Policy Gradients with Topological Constraints [108.10241442630289]
We present a new policy gradient algorithm for topological MDPs (TMDPs) via a simple extension of the proximal policy optimization (PPO) algorithm.
We demonstrate this on a real-world multiple-objective navigation problem with an arbitrary ordering of objectives both in simulation and on a real robot.
arXiv Detail & Related papers (2022-09-15T07:22:58Z)
- STORM+: Fully Adaptive SGD with Momentum for Nonconvex Optimization [74.1615979057429]
We investigate stochastic non-convex optimization problems where the objective is an expectation over smooth loss functions.
Our work builds on the STORM algorithm, in conjunction with a novel approach to adaptively set the learning rate and momentum parameters.
arXiv Detail & Related papers (2021-11-01T15:43:36Z)
- An Online Prediction Approach Based on Incremental Support Vector Machine for Dynamic Multiobjective Optimization [19.336520152294213]
We propose a novel prediction algorithm based on an incremental support vector machine (ISVM).
We treat the solving of dynamic multiobjective optimization problems (DMOPs) as an online learning process.
The proposed algorithm can effectively tackle dynamic multiobjective optimization problems.
arXiv Detail & Related papers (2021-02-24T08:51:23Z)
- Online Model Selection for Reinforcement Learning with Function Approximation [50.008542459050155]
We present a meta-algorithm that adapts to the optimal complexity with $\tilde{O}(L^{5/6} T^{2/3})$ regret.
We also show that the meta-algorithm automatically admits significantly improved instance-dependent regret bounds.
arXiv Detail & Related papers (2020-11-19T10:00:54Z)
- Adaptive Sampling for Best Policy Identification in Markov Decision Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov decision processes (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z)
- A Reinforcement Learning based approach for Multi-target Detection in Massive MIMO radar [12.982044791524494]
This paper considers the problem of multi-target detection for massive multiple input multiple output (MMIMO) cognitive radar (CR).
We propose a reinforcement learning (RL) based algorithm for cognitive multi-target detection in the presence of unknown disturbance statistics.
Numerical simulations are performed to assess the performance of the proposed RL-based algorithm in both stationary and dynamic environments.
arXiv Detail & Related papers (2020-05-10T16:29:06Z)
- A data-driven choice of misfit function for FWI using reinforcement learning [0.0]
We use a deep-Q network (DQN) to learn an optimal policy to determine the proper timing to switch between different misfit functions.
Specifically, we train the state-action value function (Q) to predict when to use the conventional L2-norm misfit function or the more advanced optimal-transport matching-filter (OTMF) misfit.
arXiv Detail & Related papers (2020-02-08T12:31:33Z)
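The misfit-switching idea above can be illustrated with a toy value-learning loop. The two-state environment and its rewards are invented here purely for illustration (only the L2/OTMF action names come from the abstract; the paper itself trains a DQN over FWI state features, which this sketch does not reproduce):

```python
import random

# Toy tabular Q-learning for choosing a misfit function per iteration.
ACTIONS = ["L2", "OTMF"]

def train(episodes=500, alpha=0.1, gamma=0.9, eps=0.2, seed=0):
    """Learn when each misfit function pays off in a synthetic
    two-state task: OTMF is rewarded 'early' (where cycle skipping
    would hurt an L2 misfit), L2 is rewarded 'late' (near convergence).
    """
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in ("early", "late") for a in ACTIONS}
    for _ in range(episodes):
        s = "early"
        for step in range(10):
            # epsilon-greedy action selection over the two misfits
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: Q[(s, x)])
            # synthetic reward: OTMF helps early, L2 helps late
            r = 1.0 if (s == "early") == (a == "OTMF") else 0.0
            s2 = "late" if step >= 4 else "early"
            # one-step Q-learning update
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS)
                                  - Q[(s, a)])
            s = s2
    return Q
```

After training, the greedy policy prefers OTMF in the early state and L2 in the late state, mirroring the switching behavior the DQN is trained to discover.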
This list is automatically generated from the titles and abstracts of the papers in this site.