Satellite Image Classification with Deep Learning
- URL: http://arxiv.org/abs/2010.06497v1
- Date: Tue, 13 Oct 2020 15:56:58 GMT
- Title: Satellite Image Classification with Deep Learning
- Authors: Mark Pritt and Gary Chern
- Abstract summary: We describe a deep learning system for classifying objects and facilities from the IARPA Functional Map of the World (fMoW) dataset into 63 different classes.
The system consists of an ensemble of convolutional neural networks and additional neural networks that integrate satellite metadata with image features.
At the time of writing the system is in 2nd place in the fMoW TopCoder competition.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Satellite imagery is important for many applications including disaster
response, law enforcement, and environmental monitoring. These applications
require the manual identification of objects and facilities in the imagery.
Because the geographic expanses to be covered are great and the analysts
available to conduct the searches are few, automation is required. Yet
traditional object detection and classification algorithms are too inaccurate
and unreliable to solve the problem. Deep learning is a family of machine
learning algorithms that have shown promise for the automation of such tasks.
It has achieved success in image understanding by means of convolutional neural
networks. In this paper we apply them to the problem of object and facility
recognition in high-resolution, multi-spectral satellite imagery. We describe a
deep learning system for classifying objects and facilities from the IARPA
Functional Map of the World (fMoW) dataset into 63 different classes. The
system consists of an ensemble of convolutional neural networks and additional
neural networks that integrate satellite metadata with image features. It is
implemented in Python using the Keras and TensorFlow deep learning libraries
and runs on a Linux server with an NVIDIA Titan X graphics card. At the time of
writing the system is in 2nd place in the fMoW TopCoder competition. Its total
accuracy is 83%, the F1 score is 0.797, and it classifies 15 of the classes
with accuracies of 95% or better.
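The metadata-fusion idea described in the abstract can be sketched in a few lines. This is a minimal NumPy illustration, not the authors' Keras/TensorFlow implementation; the 2048-dimensional image features, the 7-dimensional metadata vector, and the random weights are all placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    """Row-wise softmax."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Assumed dimensions: 2048-d CNN image features plus a small metadata
# vector (e.g. ground sample distance, sun elevation, capture time).
img_feats = rng.standard_normal((4, 2048))  # batch of 4 images
metadata = rng.standard_normal((4, 7))      # matching metadata vectors

# Late fusion: concatenate both modalities, then a linear classifier
# over the 63 fMoW classes (random placeholder weights).
fused = np.concatenate([img_feats, metadata], axis=1)  # shape (4, 2055)
W = rng.standard_normal((2055, 63)) * 0.01
b = np.zeros(63)
probs = softmax(fused @ W + b)

print(probs.shape)                                 # (4, 63)
print(bool(np.allclose(probs.sum(axis=1), 1.0)))   # True: rows are distributions
```

In the paper's actual system this fusion is done by additional neural networks on top of an ensemble of CNNs; the sketch only shows the concatenate-then-classify pattern.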
Related papers
- Self-supervised cross-modality learning for uncertainty-aware object detection and recognition in applications which lack pre-labelled training data [6.892494758401737]
We show how an uncertainty-aware, deep neural network can be trained to detect, recognise and localise objects in 2D RGB images.
Our method can be applied to many important industrial tasks, where labelled datasets are typically unavailable.
arXiv Detail & Related papers (2024-11-05T13:26:31Z)
- Why do CNNs excel at feature extraction? A mathematical explanation [53.807657273043446]
We introduce a novel model for image classification, based on feature extraction, that can be used to generate images resembling real-world datasets.
In our proof, we construct piecewise linear functions that detect the presence of features, and show that they can be realized by a convolutional network.
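A toy version of that construction: a ReLU applied to a correlation is piecewise linear, and with a suitable kernel it fires only where a feature is present. The step-edge signal and finite-difference kernel below are illustrative choices, not taken from the paper.

```python
import numpy as np

# A ReLU over a cross-correlation is a piecewise linear function; with a
# finite-difference kernel it responds only where a step edge is present.
signal = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])  # step edge between indices 2 and 3
kernel = np.array([-1.0, 1.0])                      # edge-detecting kernel
resp = np.maximum(np.correlate(signal, kernel, mode="valid"), 0.0)
print(resp.tolist())  # [0.0, 0.0, 1.0, 0.0, 0.0]
```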
arXiv Detail & Related papers (2023-07-03T10:41:34Z)
- Detection-segmentation convolutional neural network for autonomous vehicle perception [0.0]
Object detection and segmentation are two core modules of an autonomous vehicle perception system.
Currently, the most commonly used algorithms are based on deep neural networks, which guarantee high efficiency but require high-performance computing platforms.
A reduction in the complexity of the network can be achieved by using an appropriate architecture, representation, and computing platform.
arXiv Detail & Related papers (2023-06-30T08:54:52Z)
- Neural Implicit Dense Semantic SLAM [83.04331351572277]
We propose a novel RGBD vSLAM algorithm that learns a memory-efficient, dense 3D geometry, and semantic segmentation of an indoor scene in an online manner.
Our pipeline combines classical 3D vision-based tracking and loop closing with neural fields-based mapping.
Our proposed algorithm can greatly enhance scene perception and assist with a range of robot control problems.
arXiv Detail & Related papers (2023-04-27T23:03:52Z)
- Overhead-MNIST: Machine Learning Baselines for Image Classification [0.0]
Twenty-three machine learning algorithms were trained then scored to establish baseline comparison metrics.
The Overhead-MNIST dataset is a collection of satellite images similar in style to the ubiquitous MNIST hand-written digits.
We present results for the overall best performing algorithm as a baseline for edge deployability and future performance improvement.
arXiv Detail & Related papers (2021-07-01T13:30:39Z)
- A Framework for Fast Scalable BNN Inference using Googlenet and Transfer Learning [0.0]
This thesis aims to achieve high accuracy in object detection with good real-time performance.
The binarized neural network has shown high performance in various vision tasks such as image classification, object detection, and semantic segmentation.
Results show that objects are detected more accurately with the transfer learning method than with existing methods.
arXiv Detail & Related papers (2021-01-04T06:16:52Z)
- MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search [94.80212602202518]
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS).
We employ a one-shot architecture search approach in order to obtain a reduced search cost.
We achieve state-of-the-art results in terms of accuracy-speed trade-off.
arXiv Detail & Related papers (2020-09-29T11:56:01Z)
- SCAN: Learning to Classify Images without Labels [73.69513783788622]
We advocate a two-step approach where feature learning and clustering are decoupled.
A self-supervised task from representation learning is employed to obtain semantically meaningful features.
We obtain promising results on ImageNet, and outperform several semi-supervised learning methods in the low-data regime.
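The decoupling described above can be mimicked in miniature: first obtain features (here, synthetic stand-ins for self-supervised embeddings), then cluster them with the features frozen. The two-blob data and the minimal k-means are assumptions for illustration, not SCAN's actual clustering objective.

```python
import numpy as np

rng = np.random.default_rng(1)

# Step 1 (stand-in): embeddings playing the role of semantically
# meaningful self-supervised features -- two well-separated groups.
feats = np.vstack([rng.normal(0.0, 0.1, (50, 16)),
                   rng.normal(5.0, 0.1, (50, 16))])

# Step 2: cluster the frozen features with a minimal k-means (k = 2),
# deterministically initialised from the first and last points.
centroids = feats[[0, -1]].copy()
for _ in range(10):
    dists = np.linalg.norm(feats[:, None, :] - centroids[None, :, :], axis=2)
    assign = dists.argmin(axis=1)
    centroids = np.array([feats[assign == k].mean(axis=0) for k in range(2)])

# Each group lands in a single cluster, and the two clusters differ.
print(bool((assign[:50] == assign[0]).all()
           and (assign[50:] == assign[50]).all()
           and assign[0] != assign[50]))  # True
```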
arXiv Detail & Related papers (2020-05-25T18:12:33Z)
- Improved Residual Networks for Image and Video Recognition [98.10703825716142]
Residual networks (ResNets) represent a powerful type of convolutional neural network (CNN) architecture.
We show consistent improvements in accuracy and learning convergence over the baseline.
Our proposed approach allows us to train extremely deep networks, while the baseline shows severe optimization issues.
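The residual principle behind such architectures is y = x + F(x); the identity shortcut is what keeps very deep stacks trainable. The sketch below is a generic dense stand-in for a residual block, not the paper's improved design.

```python
import numpy as np

rng = np.random.default_rng(3)

def residual_block(x, W1, W2):
    """y = x + F(x): the identity shortcut eases optimisation in deep stacks."""
    h = np.maximum(x @ W1, 0.0)  # dense + ReLU stands in for conv + ReLU
    return x + h @ W2            # add the shortcut back in

x = rng.standard_normal((2, 8))
W1 = rng.standard_normal((8, 8)) * 0.1
W2 = rng.standard_normal((8, 8)) * 0.1

# With zero weights the block reduces exactly to the identity -- a sanity
# check on why residual learning is easy to optimise initially.
y_id = residual_block(x, np.zeros((8, 8)), np.zeros((8, 8)))
print(bool(np.allclose(y_id, x)))        # True
print(residual_block(x, W1, W2).shape)   # (2, 8)
```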
arXiv Detail & Related papers (2020-04-10T11:09:50Z)
- Self-Supervised Viewpoint Learning From Image Collections [116.56304441362994]
We propose a novel learning framework which incorporates an analysis-by-synthesis paradigm to reconstruct images in a viewpoint aware manner.
We show that our approach performs competitively to fully-supervised approaches for several object categories like human faces, cars, buses, and trains.
arXiv Detail & Related papers (2020-04-03T22:01:41Z)
- Ensembles of Deep Neural Networks for Action Recognition in Still Images [3.7900158137749336]
We propose a transfer learning technique to tackle the lack of massive labeled action recognition datasets.
We also use eight different pre-trained CNNs in our framework and investigate their performance on Stanford 40 dataset.
The best setting of our method is able to achieve 93.17% accuracy on the Stanford 40 dataset.
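A common way to combine several fine-tuned CNNs, as in an ensemble like the one above, is to average their per-class probabilities. The random logits below are placeholders for the outputs of eight pre-trained models on one image; the 40 classes match Stanford 40, but the ensembling rule shown is a generic assumption, not necessarily the paper's exact scheme.

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(z):
    """Row-wise softmax."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Stand-ins for the class-probability outputs of eight fine-tuned CNNs
# on one image (40 action classes, as in Stanford 40).
model_probs = softmax(rng.standard_normal((8, 40)))

# Simple ensembling: average the per-model distributions, then argmax.
ensemble = model_probs.mean(axis=0)
pred = int(ensemble.argmax())

print(bool(np.allclose(ensemble.sum(), 1.0)))  # True: still a distribution
print(0 <= pred < 40)                          # True
```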
arXiv Detail & Related papers (2020-03-22T13:44:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.