Automatic Detection of Rail Components via A Deep Convolutional
Transformer Network
- URL: http://arxiv.org/abs/2108.02423v1
- Date: Thu, 5 Aug 2021 07:38:04 GMT
- Title: Automatic Detection of Rail Components via A Deep Convolutional
Transformer Network
- Authors: Tiange Wang, Zijun Zhang, Fangfang Yang, and Kwok-Leung Tsui
- Abstract summary: We propose a deep convolutional transformer network based method to detect multi-class rail components including the rail, clip, and bolt.
Our proposed method simplifies the detection pipeline by eliminating the need for prior settings, such as anchor boxes, aspect ratios, and default coordinates, and for post-processing.
Results of a comprehensive computational study show that our proposed method outperforms a set of existing state-of-the-art approaches by large margins.
- Score: 7.557470133155959
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic detection of rail track and its fasteners using continuously
collected railway images is important to maintenance as it can significantly
improve maintenance efficiency and better ensure system safety. Dominant
computer vision-based detection models typically rely on convolutional neural
networks that utilize local image features and cumbersome prior settings to
generate candidate boxes. In this paper, we propose a deep convolutional
transformer network based method to detect multi-class rail components
including the rail, clip, and bolt. We effectively synergize advantages of the
convolutional structure on extracting latent features from raw images as well
as advantages of transformers on selectively determining valuable latent
features to achieve an efficient and accurate performance on rail component
detections. Our proposed method simplifies the detection pipeline by
eliminating the need for prior settings, such as anchor boxes, aspect ratios, and
default coordinates, and for post-processing, such as the threshold for
non-maximum suppression. It also allows users to trade off the quality and
complexity of the detector with limited training data. Results of a comprehensive
computational study show that our proposed method outperforms a set of existing
state-of-the-art approaches by large margins.
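To make concrete the post-processing step that the abstract says is eliminated, the following is a minimal sketch of conventional greedy non-maximum suppression (NMS), the filter that anchor-based detectors typically apply to overlapping candidate boxes. It is not the paper's code: the function names `iou` and `nms`, the (x1, y1, x2, y2) box format, and the example boxes are assumptions chosen for illustration.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes by descending score and keep a box only if its
    IoU with every already-kept box does not exceed iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical candidates: two heavily overlapping boxes and one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]

kept_strict = nms(boxes, scores, iou_threshold=0.5)
kept_loose = nms(boxes, scores, iou_threshold=0.7)
```

In this example the first two boxes overlap with an IoU of about 0.68, so the lower-scoring one is dropped at a threshold of 0.5 but survives at 0.7. That sensitivity to a hand-tuned threshold is the kind of setting the abstract reports removing from the pipeline.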
Related papers
- Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach [58.57026686186709]
We introduce the Convolutional Transformer layer (ConvFormer) and propose a ConvFormer-based Super-Resolution network (CFSR).
CFSR inherits the advantages of both convolution-based and transformer-based approaches.
Experiments demonstrate that CFSR strikes an optimal balance between computational cost and performance.
arXiv Detail & Related papers (2024-01-11T03:08:00Z)
- Efficient Visual Fault Detection for Freight Train Braking System via Heterogeneous Self Distillation in the Wild [8.062167870951706]
This paper proposes a heterogeneous self-distillation framework to ensure detection accuracy and speed.
We employ a novel loss function that makes the network easily concentrate on values near the label to improve learning efficiency.
Our framework can achieve over 37 frames per second and maintain the highest accuracy in comparison with traditional distillation approaches.
arXiv Detail & Related papers (2023-07-03T01:27:39Z)
- RegFormer: An Efficient Projection-Aware Transformer Network for Large-Scale Point Cloud Registration [73.69415797389195]
We propose an end-to-end transformer network (RegFormer) for large-scale point cloud alignment.
Specifically, a projection-aware hierarchical transformer is proposed to capture long-range dependencies and filter outliers.
Our transformer has linear complexity, which guarantees high efficiency even for large-scale scenes.
arXiv Detail & Related papers (2023-03-22T08:47:37Z)
- Improving Transformer-based Image Matching by Cascaded Capturing Spatially Informative Keypoints [44.90917854990362]
We propose a transformer-based cascade matching model, the Cascade feature Matching TRansformer (CasMTR).
We use a simple yet effective Non-Maximum Suppression (NMS) post-process to filter keypoints through the confidence map.
CasMTR achieves state-of-the-art performance in indoor and outdoor pose estimation as well as visual localization.
arXiv Detail & Related papers (2023-03-06T04:32:34Z)
- Integral Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection [78.2325219839805]
imTED improves the state-of-the-art of few-shot object detection by up to 7.6% AP.
Experiments on MS COCO dataset demonstrate that imTED consistently outperforms its counterparts by 2.8%.
arXiv Detail & Related papers (2022-05-19T15:11:20Z)
- Vision Transformer with Convolutions Architecture Search [72.70461709267497]
We propose an architecture search method, Vision Transformer with Convolutions Architecture Search (VTCAS).
The high-performance backbone network searched by VTCAS introduces the desirable features of convolutional neural networks into the Transformer architecture.
It enhances the robustness of the neural network for object recognition, especially in the low illumination indoor scene.
arXiv Detail & Related papers (2022-03-20T02:59:51Z)
- PnP-DETR: Towards Efficient Visual Analysis with Transformers [146.55679348493587]
Recently, DETR pioneered the solution of vision tasks with transformers; it directly translates the image feature map into the object detection result.
The recent transformer-based image recognition model ViT also shows consistent efficiency gain.
arXiv Detail & Related papers (2021-09-15T01:10:30Z)
- Hierarchical Convolutional Neural Network with Feature Preservation and Autotuned Thresholding for Crack Detection [5.735035463793008]
Drone imagery is increasingly used in automated inspection for infrastructure surface defects.
This paper proposes a deep learning approach using hierarchical convolutional neural networks with feature preservation.
The proposed technique is then applied to identify cracks on the surfaces of roads, bridges, or pavements.
arXiv Detail & Related papers (2021-04-21T13:07:58Z)
- A Unified Light Framework for Real-time Fault Detection of Freight Train Images [16.721758280029302]
Real-time fault detection for freight trains plays a vital role in guaranteeing the security and optimal operation of railway transportation.
Despite the promising results of deep learning based approaches, the performance of these fault detectors on freight train images is far from satisfactory in both accuracy and efficiency.
This paper proposes a unified light framework to improve detection accuracy while supporting a real-time operation with a low resource requirement.
arXiv Detail & Related papers (2021-01-31T05:10:20Z)
- Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scaled pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z)
- Detection Method Based on Automatic Visual Shape Clustering for Pin-Missing Defect in Transmission Lines [1.602803566465659]
Bolts are the most numerous fasteners in transmission lines and are prone to losing their split pins.
Automatically detecting pin-missing defects on bolts in transmission lines, so that troubleshooting is timely and efficient, is a difficult problem.
In this paper, an automatic detection model called Automatic Visual Shape Clustering Network (AVSCNet) for pin-missing defect is constructed.
arXiv Detail & Related papers (2020-01-17T10:57:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.