SAMEdge: An Edge-cloud Video Analytics Architecture for the Segment Anything Model
- URL: http://arxiv.org/abs/2409.14784v1
- Date: Mon, 23 Sep 2024 07:59:09 GMT
- Title: SAMEdge: An Edge-cloud Video Analytics Architecture for the Segment Anything Model
- Authors: Rui Lu, Siping Shi, Yanting Liu, Dan Wang
- Abstract summary: We propose SAMEdge, a novel edge-cloud computing architecture designed to support SAM computations for edge users.
SAMEdge integrates new modules on the edge and the cloud to maximize analytics accuracy for visual-prompt and image-prompt inputs under latency constraints.
- Score: 7.9748022315005
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As artificial intelligence continues to evolve, it is increasingly capable of handling a wide range of video analytics tasks with a single large model. One of the key foundation technologies is the Segment Anything Model (SAM), which allows the video analytics task to be determined on the fly from the user's input prompts. However, real-time response is crucial to the user experience in video analytics applications, and it is hard to achieve with the limited communication and computation resources at the edge, especially with SAM, where users may continuously interact by adding or adjusting prompts. In this paper, we propose SAMEdge, a novel edge-cloud computing architecture designed to support SAM computations for edge users. SAMEdge integrates new modules on the edge and the cloud to maximize analytics accuracy for visual-prompt and image-prompt inputs under latency constraints. It addresses the resource challenges of prompt encoding and image encoding by offering a visual prompt transformation algorithm for visual prompts and efficient workload partitioning for image encoding. SAMEdge is implemented by extending the open-source SAM project from Meta AI. We demonstrate the practical application of SAMEdge through a case study on a Visual Tour Guide application. Our evaluation indicates that SAMEdge significantly enhances the accuracy of the video analytics application under different network bandwidths and across various prompts.
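The abstract names two mechanisms, visual prompt transformation and latency-constrained workload partitioning of image encoding, without detailing them here. As a rough illustration of the partitioning decision only, the sketch below (the layer-wise cost model, function names, and numbers are all hypothetical assumptions, not SAMEdge's actual algorithm) picks the encoder layer after which to offload from edge to cloud so that edge compute, uplink transfer, and cloud compute sum to the smallest end-to-end latency:

```python
# Hypothetical sketch of latency-aware edge-cloud partitioning of an
# image encoder, in the spirit of SAMEdge. Illustrative only.

def best_split(edge_ms, cloud_ms, act_bytes, input_bytes, bw_bytes_per_ms):
    """Choose how many encoder layers to run on the edge before
    offloading the intermediate activation to the cloud.

    edge_ms[i]   -- latency of layer i on the edge device (ms)
    cloud_ms[i]  -- latency of layer i on the cloud server (ms)
    act_bytes[i] -- size of layer i's output activation (bytes)
    input_bytes  -- size of the raw input image (bytes)
    """
    n = len(edge_ms)
    best = (float("inf"), 0)
    for k in range(n + 1):  # k = number of layers run on the edge
        compute = sum(edge_ms[:k]) + sum(cloud_ms[k:])
        payload = input_bytes if k == 0 else act_bytes[k - 1]
        uplink = 0.0 if k == n else payload / bw_bytes_per_ms
        best = min(best, (compute + uplink, k))
    return best  # (end-to-end latency in ms, split point)

# Example: 4-layer encoder, 10 Mbps uplink (~1250 bytes/ms).
latency, split = best_split(
    edge_ms=[40, 40, 40, 40], cloud_ms=[5, 5, 5, 5],
    act_bytes=[4e6, 2e6, 1e6, 5e5], input_bytes=6e6,
    bw_bytes_per_ms=1250,
)
```

A real system would additionally have to account for prompt-encoding cost, downlink transfer of the results, and bandwidth variability, which is the setting the paper's evaluation varies.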
Related papers
- Tuning a SAM-Based Model with Multi-Cognitive Visual Adapter to Remote Sensing Instance Segmentation [4.6570959687411975]
The Segment Anything Model (SAM) demonstrates exceptional generalization capabilities.
SAM's lack of pretraining on massive remote sensing images and its interactive structure limit its automatic mask prediction capabilities.
A Multi-Cognitive SAM-Based Instance Model (MC-SAM SEG) is introduced to apply SAM to the remote sensing domain.
The proposed method, MC-SAM SEG, extracts high-quality features by fine-tuning the SAM-Mona encoder along with a feature aggregator.
arXiv Detail & Related papers (2024-08-16T07:23:22Z)
- Multi-Scale and Detail-Enhanced Segment Anything Model for Salient Object Detection [58.241593208031816]
The Segment Anything Model (SAM) has been proposed as a visual foundation model with strong segmentation and generalization capabilities.
We propose a Multi-scale and Detail-enhanced SAM (MDSAM) for Salient Object Detection (SOD).
Experimental results demonstrate the superior performance of our model on multiple SOD datasets.
arXiv Detail & Related papers (2024-08-08T09:09:37Z)
- Segment Anything for Videos: A Systematic Survey [52.28931543292431]
The recent wave of foundation models has witnessed tremendous success in computer vision (CV) and beyond.
The segment anything model (SAM) has sparked a passion for exploring task-agnostic visual foundation models.
This work conducts a systematic review on SAM for videos in the era of foundation models.
arXiv Detail & Related papers (2024-07-31T02:24:53Z)
- AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning [61.666973416903005]
Segment Anything Model (SAM) has demonstrated its impressive generalization capabilities in open-world scenarios with the guidance of prompts.
We propose a novel framework, termed AlignSAM, for automatic prompting that aligns SAM to an open context.
arXiv Detail & Related papers (2024-06-01T16:21:39Z)
- FocSAM: Delving Deeply into Focused Objects in Segmenting Anything [58.042354516491024]
The Segment Anything Model (SAM) marks a notable milestone in segmentation models.
We propose FocSAM with a pipeline redesigned on two pivotal aspects.
First, we propose Dynamic Window Multi-head Self-Attention (Dwin-MSA) to dynamically refocus SAM's image embeddings on the target object.
Second, we propose Pixel-wise Dynamic ReLU (P-DyReLU) to enable sufficient integration of interactive information from a few initial clicks.
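FocSAM's P-DyReLU is only named in this summary; for intuition, the following is a generic pixel-wise dynamic ReLU sketch in PyTorch, computing y = max_k(a_k x + b_k) with per-pixel coefficients predicted from a conditioning feature map (e.g., one derived from the initial clicks). This is a hypothetical illustration of the general idea, not FocSAM's actual implementation:

```python
import torch
import torch.nn as nn

class PixelwiseDynamicReLU(nn.Module):
    """Generic pixel-wise dynamic ReLU: y = max_k(a_k * x + b_k), with
    per-pixel slopes/intercepts predicted from a conditioning map."""

    def __init__(self, cond_channels: int, k: int = 2):
        super().__init__()
        self.k = k
        # Predict k slopes and k intercepts per spatial location.
        self.coef = nn.Conv2d(cond_channels, 2 * k, kernel_size=1)

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) features; cond: (B, cond_channels, H, W).
        theta = self.coef(cond)
        a, b = theta.chunk(2, dim=1)       # each (B, k, H, W)
        a = 1.0 + torch.tanh(a)            # slopes around 1
        b = 0.5 * torch.tanh(b)            # small per-pixel intercepts
        # Broadcast each (B, 1, H, W) coefficient over channels, take max.
        branches = [a[:, i:i + 1] * x + b[:, i:i + 1] for i in range(self.k)]
        return torch.stack(branches, dim=0).max(dim=0).values
```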
arXiv Detail & Related papers (2024-05-29T02:34:13Z)
- SAM-PD: How Far Can SAM Take Us in Tracking and Segmenting Anything in Videos by Prompt Denoising [37.216493829454706]
We explore the potential of applying the Segment Anything Model to track and segment objects in videos.
Specifically, we iteratively propagate the bounding box of each object's mask in the preceding frame as the prompt for the next frame.
To enhance SAM's denoising capability against position and size variations, we propose a multi-prompt strategy.
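A minimal sketch of this propagation loop with multi-prompt denoising follows; `segment`, `mask_to_box`, and `score` are hypothetical callables standing in for SAM inference, not the authors' code:

```python
import random

def jitter_box(box, s):
    """Randomly perturb a box's position and size by up to fraction s."""
    x0, y0, x1, y1 = box
    w, h = x1 - x0, y1 - y0
    dx, dy = random.uniform(-s, s) * w, random.uniform(-s, s) * h
    dw, dh = random.uniform(-s, s) * w, random.uniform(-s, s) * h
    return (x0 + dx, y0 + dy, x1 + dx + dw, y1 + dy + dh)

def track(frames, init_box, segment, mask_to_box, score,
          n_prompts=8, jitter=0.1):
    """Propagate one object through a video, frame by frame.

    segment(frame, box) -> mask   : SAM-style box-prompted segmentation
    mask_to_box(mask)   -> box    : tight bounding box of a mask
    score(mask)         -> float  : mask quality (e.g., predicted IoU)
    """
    box, masks = init_box, []
    for frame in frames:
        # Multi-prompt denoising: jitter the propagated box and keep
        # the highest-scoring candidate mask.
        candidates = [segment(frame, jitter_box(box, jitter))
                      for _ in range(n_prompts)]
        mask = max(candidates, key=score)
        masks.append(mask)
        box = mask_to_box(mask)  # becomes the prompt for the next frame
    return masks
```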
arXiv Detail & Related papers (2024-03-07T03:52:59Z)
- RefSAM: Efficiently Adapting Segmenting Anything Model for Referring Video Object Segmentation [53.4319652364256]
This paper presents the RefSAM model, which explores the potential of SAM for referring video object segmentation.
Our proposed approach adapts the original SAM model to enhance cross-modality learning by employing a lightweight Cross-Modal MLP.
We employ a parameter-efficient tuning strategy to align and fuse the language and vision features effectively.
arXiv Detail & Related papers (2023-07-03T13:21:58Z)
- The Segment Anything Model (SAM) for Remote Sensing Applications: From Zero to One Shot [6.500451285898152]
This study aims to advance the application of the Segment Anything Model (SAM) in remote sensing image analysis.
SAM is known for its exceptional generalization capabilities and zero-shot learning.
Despite the limitations encountered with lower spatial resolution images, SAM exhibits promising adaptability to remote sensing data analysis.
arXiv Detail & Related papers (2023-06-29T01:49:33Z)
- A Comprehensive Survey on Segment Anything Model for Vision and Beyond [7.920790211915402]
There is an urgent need to design a general class of models, termed foundation models, trained on broad data.
The recently proposed segment anything model (SAM) has made significant progress in breaking the boundaries of segmentation.
This paper introduces the background and terminology for foundation models including SAM, as well as state-of-the-art methods contemporaneous with SAM.
arXiv Detail & Related papers (2023-05-14T16:23:22Z)
- A Survey on Segment Anything Model (SAM): Vision Foundation Model Meets Prompt Engineering [49.732628643634975]
The Segment Anything Model (SAM), developed by Meta AI Research, offers a robust framework for image and video segmentation.
This survey provides a comprehensive exploration of the SAM family, including SAM and SAM 2, highlighting their advancements in granularity and contextual understanding.
arXiv Detail & Related papers (2023-05-12T07:21:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.