Resource Efficient Perception for Vision Systems
- URL: http://arxiv.org/abs/2405.07166v1
- Date: Sun, 12 May 2024 05:33:00 GMT
- Title: Resource Efficient Perception for Vision Systems
- Authors: A V Subramanyam, Niyati Singal, Vinay K Verma
- Abstract summary: Our study introduces a framework that mitigates the computational cost of processing high-resolution imagery through memory-efficient, patch-based processing.
It incorporates a global context representation alongside local patch information, enabling a comprehensive understanding of the image content.
We demonstrate the effectiveness of our method through superior performance on 7 different benchmarks across classification, object detection, and segmentation.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the rapid advancement in the field of image recognition, the processing of high-resolution imagery remains a computational challenge. However, this processing is pivotal for extracting detailed object insights in areas ranging from autonomous vehicle navigation to medical imaging analyses. Our study introduces a framework aimed at mitigating these challenges by leveraging memory-efficient, patch-based processing for high-resolution images. It incorporates a global context representation alongside local patch information, enabling a comprehensive understanding of the image content. In contrast to traditional training methods, which are limited by memory constraints, our method enables training on ultra-high-resolution images. We demonstrate the effectiveness of our method through superior performance on 7 different benchmarks across classification, object detection, and segmentation. Notably, the proposed method achieves strong performance even on resource-constrained devices such as the Jetson Nano. Our code is available at https://github.com/Visual-Conception-Group/Localized-Perception-Constrained-Vision-Systems.
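The abstract's idea of pairing local patch information with a global context representation can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `toy_encoder` stand-in (per-channel mean and standard deviation), and the strided subsampling used as "global context" are all assumptions made for illustration. In a real system the encoder would be a neural network and the patches would be streamed from disk rather than sliced from an in-memory array.

```python
import numpy as np


def split_into_patches(image, patch):
    """Split an (H, W, C) image into non-overlapping (patch, patch, C) tiles."""
    h, w, c = image.shape
    if h % patch != 0 or w % patch != 0:
        raise ValueError("image height and width must be multiples of patch")
    tiles = image.reshape(h // patch, patch, w // patch, patch, c)
    # Reorder to (row_tile, col_tile, patch, patch, C), then flatten the tile grid.
    return tiles.transpose(0, 2, 1, 3, 4).reshape(-1, patch, patch, c)


def toy_encoder(x):
    """Hypothetical stand-in for a feature extractor: per-channel mean and std."""
    flat = x.reshape(-1, x.shape[-1])
    return np.concatenate([flat.mean(axis=0), flat.std(axis=0)])


def patch_features_with_global_context(image, patch, stride):
    """Return one feature row per patch: [local features | global features]."""
    # Global context (assumption): encode a coarse, strided subsample of the image.
    global_feat = toy_encoder(image[::stride, ::stride])
    # Local information: the encoder sees one patch at a time, so its working
    # set depends on the patch size rather than on the full image resolution.
    local = np.stack([toy_encoder(p) for p in split_into_patches(image, patch)])
    global_tiled = np.broadcast_to(global_feat, (local.shape[0], global_feat.shape[0]))
    return np.concatenate([local, global_tiled], axis=1)
```

Calling `patch_features_with_global_context(image, patch=4, stride=2)` on an 8x8x3 array yields four rows, one per patch, each carrying that patch's local statistics followed by the same shared global statistics.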
Related papers
- SaccadeDet: A Novel Dual-Stage Architecture for Rapid and Accurate Detection in Gigapixel Images [50.742420049839474]
'SaccadeDet' is an innovative architecture for gigapixel-level object detection, inspired by the human eye saccadic movement.
Our approach, evaluated on the PANDA dataset, achieves an 8x speed increase over the state-of-the-art methods.
It also demonstrates significant potential in gigapixel-level pathology analysis through its application to Whole Slide Imaging.
arXiv Detail & Related papers (2024-07-25T11:22:54Z)
- UnSegGNet: Unsupervised Image Segmentation using Graph Neural Networks [9.268228808049951]
This research contributes to the broader field of unsupervised medical imaging and computer vision.
It presents an innovative methodology for image segmentation that aligns with real-world challenges.
The proposed method holds promise for diverse applications, including medical imaging, remote sensing, and object recognition.
arXiv Detail & Related papers (2024-05-09T19:02:00Z)
- On the Effect of Image Resolution on Semantic Segmentation [27.115235051091663]
We show that a model capable of directly producing high-resolution segmentations can match the performance of more complex systems.
Our approach leverages a bottom-up information propagation technique across various scales.
We have rigorously tested our method using leading-edge semantic segmentation datasets.
arXiv Detail & Related papers (2024-02-08T04:21:30Z)
- Layered Rendering Diffusion Model for Zero-Shot Guided Image Synthesis [60.260724486834164]
This paper introduces innovative solutions to enhance spatial controllability in diffusion models reliant on text queries.
We present two key innovations: Vision Guidance and the Layered Rendering Diffusion framework.
We apply our method to three practical applications: bounding box-to-image, semantic mask-to-image and image editing.
arXiv Detail & Related papers (2023-11-30T10:36:19Z)
- Super-Resolving Face Image by Facial Parsing Information [52.1267613768555]
Face super-resolution is a technology that transforms a low-resolution face image into the corresponding high-resolution one.
We build a novel parsing map guided face super-resolution network which extracts the face prior from low-resolution face image.
High-resolution features contain more precise spatial information while low-resolution features provide strong contextual information.
arXiv Detail & Related papers (2023-04-06T08:19:03Z)
- Cross-resolution Face Recognition via Identity-Preserving Network and Knowledge Distillation [12.090322373964124]
Cross-resolution face recognition is a challenging problem for modern deep face recognition systems.
This paper proposes a new approach that enforces the network to focus on the discriminative information stored in the low-frequency components of a low-resolution image.
arXiv Detail & Related papers (2023-03-15T14:52:46Z)
- A Robust Morphological Approach for Semantic Segmentation of Very High Resolution Images [2.2230089845369085]
We develop a robust pipeline that seamlessly extends any existing semantic segmentation algorithm to high resolution images.
Our method does not require the ground truth annotations of the high resolution images.
We show that the semantic segmentation results obtained by our method beat the existing state-of-the-art algorithms on high-resolution images.
arXiv Detail & Related papers (2022-08-02T05:25:35Z)
- Toward an ImageNet Library of Functions for Global Optimization Benchmarking [0.0]
This study proposes to transform the identification problem into an image recognition problem, with a potential to detect conception-free, machine-driven landscape features.
We address it as a supervised multi-class image recognition problem and apply basic artificial neural network models to solve it.
This evident successful learning is another step toward automated feature extraction and local structure deduction of BBO problems.
arXiv Detail & Related papers (2022-06-27T21:05:00Z)
- Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper presents a holistic goal of maintaining spatially-precise high-resolution representations through the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
arXiv Detail & Related papers (2022-04-19T17:59:45Z)
- Pixel Distillation: A New Knowledge Distillation Scheme for Low-Resolution Image Recognition [124.80263629921498]
We propose Pixel Distillation that extends knowledge distillation into the input level while simultaneously breaking architecture constraints.
Such a scheme can achieve flexible cost control for deployment, as it allows the system to adjust both network architecture and image quality according to the overall requirement of resources.
arXiv Detail & Related papers (2021-12-17T14:31:40Z)
- Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.