Hierarchical Neural Architecture Search for Deep Stereo Matching
- URL: http://arxiv.org/abs/2010.13501v1
- Date: Mon, 26 Oct 2020 11:57:37 GMT
- Title: Hierarchical Neural Architecture Search for Deep Stereo Matching
- Authors: Xuelian Cheng, Yiran Zhong, Mehrtash Harandi, Yuchao Dai, Xiaojun
Chang, Tom Drummond, Hongdong Li, Zongyuan Ge
It is ranked at the top 1 accuracy on KITTI stereo 2012, 2015 and Middlebury benchmarks, as well as the top 1 on SceneFlow dataset.
- Score: 131.94481111956853
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To reduce the human effort in neural network design, Neural
Architecture Search (NAS) has been applied with remarkable success to various
high-level vision tasks such as classification and semantic segmentation. The
underlying idea of NAS is straightforward: by letting the network choose among
a set of operations (e.g., convolutions with different filter sizes), one can
find an architecture that is better adapted to the problem at hand. However,
so far the success of NAS has not extended to low-level geometric vision tasks
such as stereo matching. This is partly because state-of-the-art deep stereo
matching networks, designed by humans, are already enormous; directly applying
NAS to such massive structures is computationally prohibitive with currently
available mainstream computing resources. In this paper, we propose the first
end-to-end hierarchical NAS framework for deep stereo matching, which
incorporates task-specific human knowledge into the architecture search.
Specifically, following the gold-standard pipeline for deep stereo matching
(i.e., feature extraction, feature volume construction, and dense matching),
we jointly optimize the architectures of the entire pipeline. Extensive
experiments show that our searched network outperforms all state-of-the-art
deep stereo matching architectures and ranks first in accuracy on the KITTI
stereo 2012, 2015, and Middlebury benchmarks, as well as first on the
SceneFlow dataset, with substantial improvements in network size and inference
speed. The code is available at
https://github.com/XuelianCheng/LEAStereo.
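For orientation, here is a minimal, self-contained PyTorch sketch of the pipeline the abstract describes: a shared feature extractor, a concatenation-based 4D feature volume, a 3D matching network, and soft-argmin disparity regression, with a DARTS-style mixed operation standing in for the searchable cells. All module names, channel counts, and the candidate-operation set are illustrative assumptions; this is not the LEAStereo implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """DARTS-style mixed operation: a softmax-weighted sum of candidate ops.

    Illustrates the 'choose among a set of operations' idea from the abstract;
    after search, only the highest-weighted candidate would be kept.
    """
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),  # 3x3 conv candidate
            nn.Conv2d(channels, channels, 5, padding=2),  # 5x5 conv candidate
            nn.Identity(),                                # skip candidate
        ])
        # Architecture parameters, trained jointly with the network weights.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))


class TinyStereoNet(nn.Module):
    """Feature net -> 4D feature volume -> 3D matching net -> soft-argmin."""
    def __init__(self, feat_ch=8, max_disp=24):
        super().__init__()
        self.max_disp = max_disp
        self.feature = nn.Sequential(       # shared feature extractor
            nn.Conv2d(3, feat_ch, 3, padding=1), nn.ReLU(), MixedOp(feat_ch))
        self.matching = nn.Sequential(      # dense matching over the volume
            nn.Conv3d(2 * feat_ch, 16, 3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 1, 3, padding=1))

    def build_volume(self, fl, fr):
        # Concatenation-based feature volume of shape (B, 2C, D, H, W).
        b, c, h, w = fl.shape
        vol = fl.new_zeros(b, 2 * c, self.max_disp, h, w)
        for d in range(self.max_disp):
            vol[:, :c, d, :, d:] = fl[:, :, :, d:]
            vol[:, c:, d, :, d:] = fr[:, :, :, :w - d]
        return vol

    def forward(self, left, right):
        fl, fr = self.feature(left), self.feature(right)
        cost = self.matching(self.build_volume(fl, fr)).squeeze(1)  # (B,D,H,W)
        prob = F.softmax(-cost, dim=1)      # per-pixel matching probabilities
        disp = torch.arange(self.max_disp, device=cost.device, dtype=cost.dtype)
        return (prob * disp.view(1, -1, 1, 1)).sum(dim=1)  # soft-argmin


left, right = torch.rand(1, 3, 32, 64), torch.rand(1, 3, 32, 64)
print(TinyStereoNet()(left, right).shape)  # torch.Size([1, 32, 64])
```

In a differentiable search of this kind, the alpha parameters are typically updated on a validation split while the convolution weights are trained on the training split, and the discovered architecture keeps only the strongest candidate in each cell.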
Related papers
- DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions [121.05720140641189]
We develop a family of models with the distilling neural architecture (DNA) techniques.
Our proposed DNA models can rate all architecture candidates, as opposed to previous works that can only access a sub-search space using heuristic algorithms.
Our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively.
arXiv Detail & Related papers (2024-03-02T22:16:47Z)
- GeNAS: Neural Architecture Search with Better Generalization [14.92869716323226]
Recent neural architecture search (NAS) approaches rely on validation loss or accuracy to find the superior network for the target data.
In this paper, we investigate a new neural architecture search measure for excavating architectures with better generalization.
arXiv Detail & Related papers (2023-05-15T12:44:54Z)
- NASiam: Efficient Representation Learning using Neural Architecture Search for Siamese Networks [76.8112416450677]
Siamese networks are among the most popular methods for self-supervised visual representation learning (SSL).
NASiam is a novel approach that, for the first time, uses differentiable NAS to improve the multilayer perceptron projector and predictor (encoder/predictor pair).
NASiam reaches competitive performance on both small-scale (i.e., CIFAR-10/CIFAR-100) and large-scale (i.e., ImageNet) image classification datasets while costing only a few GPU hours.
arXiv Detail & Related papers (2023-01-31T19:48:37Z)
- Generic Neural Architecture Search via Regression [27.78105839644199]
We propose a novel and generic neural architecture search (NAS) framework, termed Generic NAS (GenNAS).
GenNAS does not use task-specific labels; instead, it adopts regression on a set of manually designed synthetic signal bases for architecture evaluation.
We then propose an automatic task search that optimizes the combination of synthetic signals using limited downstream-task-specific labels.
arXiv Detail & Related papers (2021-08-04T08:21:12Z)
- Weak NAS Predictors Are All You Need [91.11570424233709]
Recent predictor-based NAS approaches attempt to solve the problem in two key steps: sampling some architecture-performance pairs and fitting a proxy accuracy predictor.
We shift the paradigm from finding a single complicated predictor that covers the whole architecture space to fitting a set of weaker predictors that progressively move towards the high-performance sub-space (a minimal sketch of this loop appears after this list).
Our method needs fewer samples to find top-performance architectures on NAS-Bench-101 and NAS-Bench-201, and it achieves state-of-the-art ImageNet performance on the NASNet search space.
arXiv Detail & Related papers (2021-02-21T01:58:43Z)
- DC-NAS: Divide-and-Conquer Neural Architecture Search [108.57785531758076]
We present a divide-and-conquer (DC) approach to effectively and efficiently search deep neural architectures.
We achieve a 75.1% top-1 accuracy on the ImageNet dataset, which is higher than that of state-of-the-art methods using the same search space.
arXiv Detail & Related papers (2020-05-29T09:02:16Z)
- Stage-Wise Neural Architecture Search [65.03109178056937]
Modern convolutional networks such as ResNet and NASNet have achieved state-of-the-art results in many computer vision applications.
These networks consist of stages, which are sets of layers that operate on representations in the same resolution.
It has been demonstrated that increasing the number of layers in each stage improves the prediction ability of the network.
However, the resulting architecture becomes computationally expensive in terms of floating point operations, memory requirements and inference time.
arXiv Detail & Related papers (2020-04-23T14:16:39Z)
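The Weak NAS Predictors entry above describes a loop that is easy to make concrete: sample some architecture-performance pairs, fit a cheap proxy predictor, and re-allocate the expensive evaluation budget to the predicted-best region. Below is a minimal, self-contained Python sketch of that loop under toy assumptions (a random-vector search space, a synthetic accuracy function, and a ridge-regression predictor); it illustrates the mechanics only, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy search space: each architecture is a vector of 6 continuous choices.
pool = rng.random((500, 6))

def true_accuracy(arch):
    """Stand-in for an expensive train-and-evaluate run; peaks at all-0.7 choices."""
    return float(-np.sum((arch - 0.7) ** 2))

def fit_predictor(X, y):
    """Weak proxy predictor: ridge regression solved via the normal equations."""
    X1 = np.hstack([X, np.ones((len(X), 1))])
    w = np.linalg.solve(X1.T @ X1 + 1e-3 * np.eye(X1.shape[1]), X1.T @ y)
    return lambda A: np.hstack([A, np.ones((len(A), 1))]) @ w

# Seed the loop with a small random sample of evaluated architectures.
evaluated = set(rng.choice(len(pool), 20, replace=False).tolist())
scores = {i: true_accuracy(pool[i]) for i in evaluated}

for _ in range(3):  # each round: fit a weak predictor, refine the sampling region
    idx = sorted(scores)
    predictor = fit_predictor(pool[idx], np.array([scores[i] for i in idx]))
    # Rank all candidates cheaply, then spend the expensive budget only on
    # the predicted-best architectures that have not been evaluated yet.
    ranking = np.argsort(-predictor(pool))
    for i in [j for j in ranking.tolist() if j not in evaluated][:10]:
        evaluated.add(i)
        scores[i] = true_accuracy(pool[i])

best = max(scores, key=scores.get)
print("best candidate:", np.round(pool[best], 2), "score:", round(scores[best], 3))
```

The real method replaces each toy component here (a learned architecture encoding, actual training runs for labels, stronger predictor families); the point is the progressive re-fitting, which concentrates expensive evaluations where the current predictor believes the optimum lies.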
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.