Related papers: Active search and coverage using point-cloud reinforcement learning

Active search and coverage using point-cloud reinforcement learning

URL: http://arxiv.org/abs/2312.11410v1
Date: Mon, 18 Dec 2023 18:16:30 GMT
Title: Active search and coverage using point-cloud reinforcement learning
Authors: Matthias Rosynski and Alexandru Pop and Lucian Busoniu
Abstract summary: This paper presents an end-to-end deep reinforcement learning solution for target search and coverage. We show that deep hierarchical feature learning works for RL and that by using farthest point sampling (FPS) we can reduce the amount of points. We also show that multi-head attention for point-clouds helps to learn the agent faster but converges to the same outcome.
Score: 50.741409008225766
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We consider a problem in which the trajectory of a mobile 3D sensor must be optimized so that certain objects are both found in the overall scene and covered by the point cloud, as fast as possible. This problem is called target search and coverage, and the paper provides an end-to-end deep reinforcement learning (RL) solution to solve it. The deep neural network combines four components: deep hierarchical feature learning occurs in the first stage, followed by multi-head transformers in the second, max-pooling and merging with bypassed information to preserve spatial relationships in the third, and a distributional dueling network in the last stage. To evaluate the method, a simulator is developed where cylinders must be found by a Kinect sensor. A network architecture study shows that deep hierarchical feature learning works for RL and that by using farthest point sampling (FPS) we can reduce the amount of points and achieve not only a reduction of the network size but also better results. We also show that multi-head attention for point-clouds helps to learn the agent faster but converges to the same outcome. Finally, we compare RL using the best network with a greedy baseline that maximizes immediate rewards and requires for that purpose an oracle that predicts the next observation. We decided RL achieves significantly better and more robust results than the greedy strategy.

Related papers

Optimizing Sensor Network Design for Multiple Coverage [0.9668407688201359]
We introduce a new objective function for the greedy (next-best-view) algorithm to design efficient and robust sensor networks. We also introduce a Deep Learning model to accelerate the algorithm for near real-time computations.
arXiv Detail & Related papers (2024-05-15T05:13:20Z)
Multi-Correlation Siamese Transformer Network with Dense Connection for 3D Single Object Tracking [14.47355191520578]
Point cloud-based 3D object tracking is an important task in autonomous driving. It remains challenging to learn the correlation between the template and search branches effectively with the sparse LIDAR point cloud data. We present a multi-correlation Siamese Transformer network that has multiple stages and carries out feature correlation at the end of each stage.
arXiv Detail & Related papers (2023-12-18T09:33:49Z)
Class Anchor Margin Loss for Content-Based Image Retrieval [97.81742911657497]
We propose a novel repeller-attractor loss that falls in the metric learning paradigm, yet directly optimize for the L2 metric without the need of generating pairs. We evaluate the proposed objective in the context of few-shot and full-set training on the CBIR task, by using both convolutional and transformer architectures.
arXiv Detail & Related papers (2023-06-01T12:53:10Z)
Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks [49.808194368781095]
We show that three-layer neural networks have provably richer feature learning capabilities than two-layer networks. This work makes progress towards understanding the provable benefit of three-layer neural networks over two-layer networks in the feature learning regime.
arXiv Detail & Related papers (2023-05-11T17:19:30Z)
OFA$^2$: A Multi-Objective Perspective for the Once-for-All Neural Architecture Search [79.36688444492405]
Once-for-All (OFA) is a Neural Architecture Search (NAS) framework designed to address the problem of searching efficient architectures for devices with different resources constraints. We aim to give one step further in the search for efficiency by explicitly conceiving the search stage as a multi-objective optimization problem.
arXiv Detail & Related papers (2023-03-23T21:30:29Z)
Spatial Layout Consistency for 3D Semantic Segmentation [0.7614628596146599]
We introduce a novel deep convolutional neural network (DCNN) technique for achieving voxel-based semantic segmentation of the ALTM's point clouds. The suggested deep learning method, Semantic Utility Network (SUNet) is a multi-dimensional and multi-resolution network. Our experiments demonstrated that SUNet's spatial layout consistency and a multi-resolution feature aggregation could significantly improve performance.
arXiv Detail & Related papers (2023-03-02T03:24:21Z)
Improved Convergence Guarantees for Shallow Neural Networks [91.3755431537592]
We prove convergence of depth 2 neural networks, trained via gradient descent, to a global minimum. Our model has the following features: regression with quadratic loss function, fully connected feedforward architecture, RelU activations, Gaussian data instances, adversarial labels. They strongly suggest that, at least in our model, the convergence phenomenon extends well beyond the NTK regime''
arXiv Detail & Related papers (2022-12-05T14:47:52Z)
Learning Semantic Segmentation of Large-Scale Point Clouds with Random Sampling [52.464516118826765]
We introduce RandLA-Net, an efficient and lightweight neural architecture to infer per-point semantics for large-scale point clouds. The key to our approach is to use random point sampling instead of more complex point selection approaches. Our RandLA-Net can process 1 million points in a single pass up to 200x faster than existing approaches.
arXiv Detail & Related papers (2021-07-06T05:08:34Z)
FatNet: A Feature-attentive Network for 3D Point Cloud Processing [1.502579291513768]
We introduce a novel feature-attentive neural network layer, a FAT layer, that combines both global point-based features and local edge-based features in order to generate better embeddings. Our architecture achieves state-of-the-art results on the task of point cloud classification, as demonstrated on the ModelNet40 dataset.
arXiv Detail & Related papers (2021-04-07T23:13:56Z)
Decoupled and Memory-Reinforced Networks: Towards Effective Feature Learning for One-Step Person Search [65.51181219410763]
One-step methods have been developed to handle pedestrian detection and identification sub-tasks using a single network. There are two major challenges in the current one-step approaches. We propose a decoupled and memory-reinforced network (DMRNet) to overcome these problems.
arXiv Detail & Related papers (2021-02-22T06:19:45Z)
Scan-based Semantic Segmentation of LiDAR Point Clouds: An Experimental Study [2.6205925938720833]
State of the art methods use deep neural networks to predict semantic classes for each point in a LiDAR scan. A powerful and efficient way to process LiDAR measurements is to use two-dimensional, image-like projections. We demonstrate various techniques to boost the performance and to improve runtime as well as memory constraints.
arXiv Detail & Related papers (2020-04-06T11:08:12Z)

This list is automatically generated from the titles and abstracts of the papers in this site.