Hierarchical Neural Architecture Search for Deep Stereo Matching
- URL: http://arxiv.org/abs/2010.13501v1
- Date: Mon, 26 Oct 2020 11:57:37 GMT
- Title: Hierarchical Neural Architecture Search for Deep Stereo Matching
- Authors: Xuelian Cheng, Yiran Zhong, Mehrtash Harandi, Yuchao Dai, Xiaojun
Chang, Tom Drummond, Hongdong Li, Zongyuan Ge
It is ranked at the top 1 accuracy on KITTI stereo 2012, 2015 and Middlebury benchmarks, as well as the top 1 on SceneFlow dataset.
- Score: 131.94481111956853
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To reduce the human effort in neural network design, Neural
Architecture Search (NAS) has been applied with remarkable success to various
high-level vision tasks such as classification and semantic segmentation. The
underlying idea of NAS is straightforward: by letting the network choose among
a set of operations (e.g., convolutions with different filter sizes), one can
find an architecture that is better adapted to the problem at hand. However,
so far the success of NAS has not extended to low-level geometric vision tasks
such as stereo matching. This is partly because state-of-the-art deep stereo
matching networks, designed by humans, are already enormous; directly applying
NAS to such massive structures is computationally prohibitive with currently
available mainstream computing resources. In this paper, we propose the first
end-to-end hierarchical NAS framework for deep stereo matching, which
incorporates task-specific human knowledge into the architecture search.
Specifically, following the gold-standard pipeline for deep stereo matching
(i.e., feature extraction, feature volume construction, and dense matching),
we jointly optimize the architectures of the entire pipeline. Extensive
experiments show that our searched network outperforms all state-of-the-art
deep stereo matching architectures and ranks first in accuracy on the KITTI
stereo 2012, 2015, and Middlebury benchmarks, as well as first on the
SceneFlow dataset, with substantial improvements in network size and inference
speed. The code is available at
https://github.com/XuelianCheng/LEAStereo.
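For orientation, here is a minimal, self-contained PyTorch sketch of the pipeline the abstract describes: a shared feature extractor, a concatenation-based 4D feature volume, a 3D matching network, and soft-argmin disparity regression, with a DARTS-style mixed operation standing in for the searchable cells. All module names, channel counts, and the candidate-operation set are illustrative assumptions; this is not the LEAStereo implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """DARTS-style mixed operation: a softmax-weighted sum of candidate ops.

    Illustrates the 'choose among a set of operations' idea from the abstract;
    after search, only the highest-weighted candidate would be kept.
    """
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),  # 3x3 conv candidate
            nn.Conv2d(channels, channels, 5, padding=2),  # 5x5 conv candidate
            nn.Identity(),                                # skip candidate
        ])
        # Architecture parameters, trained jointly with the network weights.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))


class TinyStereoNet(nn.Module):
    """Feature net -> 4D feature volume -> 3D matching net -> soft-argmin."""
    def __init__(self, feat_ch=8, max_disp=24):
        super().__init__()
        self.max_disp = max_disp
        self.feature = nn.Sequential(       # shared feature extractor
            nn.Conv2d(3, feat_ch, 3, padding=1), nn.ReLU(), MixedOp(feat_ch))
        self.matching = nn.Sequential(      # dense matching over the volume
            nn.Conv3d(2 * feat_ch, 16, 3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 1, 3, padding=1))

    def build_volume(self, fl, fr):
        # Concatenation-based feature volume of shape (B, 2C, D, H, W).
        b, c, h, w = fl.shape
        vol = fl.new_zeros(b, 2 * c, self.max_disp, h, w)
        for d in range(self.max_disp):
            vol[:, :c, d, :, d:] = fl[:, :, :, d:]
            vol[:, c:, d, :, d:] = fr[:, :, :, :w - d]
        return vol

    def forward(self, left, right):
        fl, fr = self.feature(left), self.feature(right)
        cost = self.matching(self.build_volume(fl, fr)).squeeze(1)  # (B,D,H,W)
        prob = F.softmax(-cost, dim=1)      # per-pixel matching probabilities
        disp = torch.arange(self.max_disp, device=cost.device, dtype=cost.dtype)
        return (prob * disp.view(1, -1, 1, 1)).sum(dim=1)  # soft-argmin


left, right = torch.rand(1, 3, 32, 64), torch.rand(1, 3, 32, 64)
print(TinyStereoNet()(left, right).shape)  # torch.Size([1, 32, 64])
```

In a differentiable search of this kind, the alpha parameters are typically updated on a validation split while the convolution weights are trained on the training split, and the discovered architecture keeps only the strongest candidate in each cell.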
Related papers
- DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions [121.05720140641189]
We develop a family of models with the distilling neural architecture (DNA) techniques.
Our proposed DNA models can rate all architecture candidates, as opposed to previous works that can only access a sub-search space using heuristic algorithms.
Our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively.
arXiv Detail & Related papers (2024-03-02T22:16:47Z)
- GeNAS: Neural Architecture Search with Better Generalization [14.92869716323226]
Recent neural architecture search (NAS) approaches rely on validation loss or accuracy to find the superior network for the target data.
In this paper, we investigate a new neural architecture search measure for excavating architectures with better generalization.
arXiv Detail & Related papers (2023-05-15T12:44:54Z)
- NASiam: Efficient Representation Learning using Neural Architecture Search for Siamese Networks [76.8112416450677]
Siamese networks are among the most popular methods for self-supervised visual representation learning (SSL).
NASiam is a novel approach that, for the first time, uses differentiable NAS to improve the multilayer perceptron projector and predictor (encoder/predictor pair).
NASiam reaches competitive performance on both small-scale (i.e., CIFAR-10/CIFAR-100) and large-scale (i.e., ImageNet) image classification datasets while costing only a few GPU hours.
arXiv Detail & Related papers (2023-01-31T19:48:37Z)
- Generic Neural Architecture Search via Regression [27.78105839644199]
We propose a novel and generic neural architecture search (NAS) framework, termed Generic NAS (GenNAS).
GenNAS does not use task-specific labels; instead, it adopts regression on a set of manually designed synthetic signal bases for architecture evaluation.
We then propose an automatic task search that optimizes the combination of synthetic signals using limited downstream-task-specific labels.
arXiv Detail & Related papers (2021-08-04T08:21:12Z)
- Weak NAS Predictors Are All You Need [91.11570424233709]
Recent predictor-based NAS approaches attempt to solve the problem in two key steps: sampling some architecture-performance pairs and fitting a proxy accuracy predictor.
We shift the paradigm from finding a single complicated predictor that covers the whole architecture space to fitting a set of weaker predictors that progressively move towards the high-performance sub-space (a minimal sketch of this loop appears after this list).
Our method needs fewer samples to find top-performance architectures on NAS-Bench-101 and NAS-Bench-201, and it achieves state-of-the-art ImageNet performance on the NASNet search space.
arXiv Detail & Related papers (2021-02-21T01:58:43Z)
- DC-NAS: Divide-and-Conquer Neural Architecture Search [108.57785531758076]
We present a divide-and-conquer (DC) approach to effectively and efficiently search deep neural architectures.
We achieve a 75.1% top-1 accuracy on the ImageNet dataset, which is higher than that of state-of-the-art methods using the same search space.
arXiv Detail & Related papers (2020-05-29T09:02:16Z)
- Stage-Wise Neural Architecture Search [65.03109178056937]
Modern convolutional networks such as ResNet and NASNet have achieved state-of-the-art results in many computer vision applications.
These networks consist of stages, which are sets of layers that operate on representations in the same resolution.
It has been demonstrated that increasing the number of layers in each stage improves the prediction ability of the network.
However, the resulting architecture becomes computationally expensive in terms of floating point operations, memory requirements and inference time.
arXiv Detail & Related papers (2020-04-23T14:16:39Z)
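The Weak NAS Predictors entry above describes a loop that is easy to make concrete: sample some architecture-performance pairs, fit a cheap proxy predictor, and re-allocate the expensive evaluation budget to the predicted-best region. Below is a minimal, self-contained Python sketch of that loop under toy assumptions (a random-vector search space, a synthetic accuracy function, and a ridge-regression predictor); it illustrates the mechanics only, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy search space: each architecture is a vector of 6 continuous choices.
pool = rng.random((500, 6))

def true_accuracy(arch):
    """Stand-in for an expensive train-and-evaluate run; peaks at all-0.7 choices."""
    return float(-np.sum((arch - 0.7) ** 2))

def fit_predictor(X, y):
    """Weak proxy predictor: ridge regression solved via the normal equations."""
    X1 = np.hstack([X, np.ones((len(X), 1))])
    w = np.linalg.solve(X1.T @ X1 + 1e-3 * np.eye(X1.shape[1]), X1.T @ y)
    return lambda A: np.hstack([A, np.ones((len(A), 1))]) @ w

# Seed the loop with a small random sample of evaluated architectures.
evaluated = set(rng.choice(len(pool), 20, replace=False).tolist())
scores = {i: true_accuracy(pool[i]) for i in evaluated}

for _ in range(3):  # each round: fit a weak predictor, refine the sampling region
    idx = sorted(scores)
    predictor = fit_predictor(pool[idx], np.array([scores[i] for i in idx]))
    # Rank all candidates cheaply, then spend the expensive budget only on
    # the predicted-best architectures that have not been evaluated yet.
    ranking = np.argsort(-predictor(pool))
    for i in [j for j in ranking.tolist() if j not in evaluated][:10]:
        evaluated.add(i)
        scores[i] = true_accuracy(pool[i])

best = max(scores, key=scores.get)
print("best candidate:", np.round(pool[best], 2), "score:", round(scores[best], 3))
```

The real method replaces each toy component here (a learned architecture encoding, actual training runs for labels, stronger predictor families); the point is the progressive re-fitting, which concentrates expensive evaluations where the current predictor believes the optimum lies.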
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.