DeepSSN: a deep convolutional neural network to assess spatial scene similarity
- URL: http://arxiv.org/abs/2202.04755v1
- Date: Mon, 7 Feb 2022 23:53:20 GMT
- Title: DeepSSN: a deep convolutional neural network to assess spatial scene similarity
- Authors: Danhuai Guo, Shiyin Ge, Shu Zhang, Song Gao, Ran Tao, Yangang Wang
- Abstract summary: We propose a deep convolutional neural network, namely Deep Spatial Scene Network (DeepSSN), to better assess the spatial scene similarity.
We develop a prototype spatial scene search system using the proposed DeepSSN, in which users input spatial queries via sketch maps.
The proposed model is validated using multi-source conflated map data including 131,300 labeled scene samples after data augmentation.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Spatial-query-by-sketch is an intuitive tool to explore human spatial
knowledge about geographic environments and to support communication with scene
database queries. However, traditional sketch-based spatial search methods perform poorly because they cannot capture hidden multi-scale map features from mental sketches. In this research, we propose a deep
convolutional neural network, namely Deep Spatial Scene Network (DeepSSN), to
better assess the spatial scene similarity. In DeepSSN, a triplet loss function
is designed as a comprehensive distance metric to support the similarity
assessment. A positive and negative example mining strategy based on qualitative constraint networks from spatial reasoning is designed to ensure an increasingly clear distinction among triplets during training. Moreover, we
develop a prototype spatial scene search system using the proposed DeepSSN, in which users input spatial queries via sketch maps and the system automatically augments the sketch training data. The proposed model is validated
using multi-source conflated map data including 131,300 labeled scene samples
after data augmentation. The empirical results demonstrate that the DeepSSN
outperforms baseline methods including k-nearest-neighbors, multilayer
perceptron, AlexNet, DenseNet, and ResNet using mean reciprocal rank and
precision metrics. This research advances geographic information retrieval
studies by introducing a novel deep learning method tailored to spatial scene
queries.
Related papers
- Camera-based 3D Semantic Scene Completion with Sparse Guidance Network
We propose a camera-based semantic scene completion framework called SGN.
SGN propagates semantics from semantic-aware seed voxels to the whole scene based on spatial geometry cues.
Our experimental results demonstrate the superiority of our SGN over existing state-of-the-art methods.
arXiv Detail & Related papers (2023-12-10T04:17:27Z) - GraphCSPN: Geometry-Aware Depth Completion via Dynamic GCNs [49.55919802779889]
We propose a Graph Convolution based Spatial Propagation Network (GraphCSPN) as a general approach for depth completion.
In this work, we leverage convolutional neural networks as well as graph neural networks in a complementary way for geometric representation learning.
Our method achieves state-of-the-art performance, especially when only a few propagation steps are used (a generic sketch of such spatial propagation follows below).
arXiv Detail & Related papers (2022-10-19T17:56:03Z) - Semi-Perspective Decoupled Heatmaps for 3D Robot Pose Estimation from
- Semi-Perspective Decoupled Heatmaps for 3D Robot Pose Estimation from Depth Maps
Knowing the exact 3D location of workers and robots in a collaborative environment enables several real-world applications.
We propose a non-invasive framework based on depth devices and deep neural networks to estimate the 3D pose of robots from an external camera.
arXiv Detail & Related papers (2022-07-06T08:52:12Z) - S3Net: 3D LiDAR Sparse Semantic Segmentation Network [1.330528227599978]
S3Net is a novel convolutional neural network for LiDAR point cloud semantic segmentation.
It adopts an encoder-decoder backbone that consists of a Sparse Intra-channel Attention Module (SIntraAM) and a Sparse Inter-channel Attention Module (SInterAM); a generic channel-attention sketch follows below.
arXiv Detail & Related papers (2021-03-15T22:15:24Z) - PLADE-Net: Towards Pixel-Level Accuracy for Self-Supervised Single-View
- PLADE-Net: Towards Pixel-Level Accuracy for Self-Supervised Single-View Depth Estimation with Neural Positional Encoding and Distilled Matting Loss
We propose a self-supervised single-view pixel-level accurate depth estimation network, called PLADE-Net.
Our method shows unprecedented accuracy levels, exceeding 95% in terms of the $\delta_1$ metric on the KITTI dataset (this metric is sketched below).
arXiv Detail & Related papers (2021-03-12T15:54:46Z) - Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks [87.50632573601283]
- Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks
We present a novel method for multi-view depth estimation from a single video.
Our method achieves temporally coherent depth estimation results by using a novel Epipolar Spatio-Temporal (EST) transformer.
To reduce the computational cost, inspired by recent Mixture-of-Experts models, we design a compact hybrid network.
arXiv Detail & Related papers (2020-11-26T04:04:21Z) - Multi-Subspace Neural Network for Image Recognition [33.61205842747625]
In image classification tasks, feature extraction is a long-standing challenge. Intra-class variability increases the difficulty of designing extractors.
Recently, deep learning has drawn much attention for automatically learning features from data.
In this study, we propose a multi-subspace neural network (MSNN) that integrates a key component of the convolutional neural network (CNN), the receptive field, with the subspace concept.
arXiv Detail & Related papers (2020-06-17T02:55:34Z) - Real-Time High-Performance Semantic Image Segmentation of Urban Street
Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps, respectively (a minimal mIoU computation is sketched below).
arXiv Detail & Related papers (2020-03-11T08:45:53Z) - Hyperspectral Classification Based on 3D Asymmetric Inception Network
- Hyperspectral Classification Based on 3D Asymmetric Inception Network with Data Fusion Transfer Learning
We first deliver a 3D asymmetric inception network, AINet, to overcome the overfitting problem.
By emphasizing spectral signatures over the spatial contexts of HSI data, AINet can convey and classify features effectively.
arXiv Detail & Related papers (2020-02-11T06:37:34Z) - Spatial-Spectral Residual Network for Hyperspectral Image
Super-Resolution [82.1739023587565]
We propose a novel spectral-spatial residual network for hyperspectral image super-resolution (SSRNet).
Our method can effectively explore spatial-spectral information by using 3D convolution instead of 2D convolution, which enables the network to better extract potential information.
In each unit, we employ spatial and temporal separable 3D convolution to extract spatial and spectral information, which not only reduces unaffordable memory usage and high computational cost, but also makes the network easier to train (a generic sketch of such a factorized convolution follows below).
arXiv Detail & Related papers (2020-01-14T03:34:55Z)