Adaptive Memory Management for Video Object Segmentation
- URL: http://arxiv.org/abs/2204.06626v1
- Date: Wed, 13 Apr 2022 19:59:07 GMT
- Title: Adaptive Memory Management for Video Object Segmentation
- Authors: Ali Pourganjalikhan and Charalambos Poullis
- Abstract summary: A matching-based network stores every-k frames in an external memory bank for future inference.
The size of the memory bank gradually increases with the length of the video, which slows down inference speed and makes it impractical to handle arbitrary length videos.
This paper proposes an adaptive memory bank strategy for matching-based networks for semi-supervised video object segmentation (VOS) that can handle videos of arbitrary length by discarding obsolete features.
- Score: 6.282068591820945
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Matching-based networks have achieved state-of-the-art performance for video
object segmentation (VOS) tasks by storing every-k frames in an external memory
bank for future inference. Storing the intermediate frames' predictions
provides the network with richer cues for segmenting an object in the current
frame. However, the size of the memory bank gradually increases with the length
of the video, which slows down inference speed and makes it impractical to
handle arbitrary length videos.
This paper proposes an adaptive memory bank strategy for matching-based
networks for semi-supervised video object segmentation (VOS) that can handle
videos of arbitrary length by discarding obsolete features. Features are
indexed based on their importance in the segmentation of the objects in
previous frames. Based on the index, we discard unimportant features to
accommodate new features. We present our experiments on DAVIS 2016, DAVIS 2017,
and YouTube-VOS demonstrating that our method outperforms state-of-the-art
methods that employ the first-and-latest strategy with fixed-size memory banks,
and achieves performance comparable to the every-k strategy with growing
memory banks. Furthermore, experiments show that our method increases inference
speed by up to 80% over the every-k strategy and 35% over the first-and-latest
strategy.
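The abstract does not spell out how the importance index is computed, so the following is a minimal sketch only, assuming importance is accumulated from the soft matching (attention) weights each stored feature receives while segmenting previous frames. The class name AdaptiveMemoryBank, the capacity parameter, and the eviction rule are illustrative assumptions, not taken from the paper.

```python
import torch


class AdaptiveMemoryBank:
    """Fixed-capacity memory bank that evicts the least-used feature.

    Sketch only: importance is the accumulated attention weight each memory
    slot has received when matching queries from previous frames.
    """

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.keys = []        # per-slot key embeddings, each of shape (C,)
        self.values = []      # per-slot value embeddings, each of shape (C,)
        self.importance = []  # running usage score per slot

    def match(self, query: torch.Tensor) -> torch.Tensor:
        """Read from memory with a (C,)-shaped query and update importance."""
        keys = torch.stack(self.keys)                  # (N, C)
        values = torch.stack(self.values)              # (N, C)
        affinity = torch.softmax(keys @ query, dim=0)  # (N,) soft matching weights
        for i, weight in enumerate(affinity.tolist()):
            self.importance[i] += weight               # credit slots that were used
        return affinity @ values                       # (C,) read-out for the decoder

    def add(self, key: torch.Tensor, value: torch.Tensor) -> None:
        """Store a new feature, discarding the least important one when full."""
        if len(self.keys) >= self.capacity:
            victim = min(range(len(self.importance)),
                         key=self.importance.__getitem__)
            del self.keys[victim], self.values[victim], self.importance[victim]
        self.keys.append(key)
        self.values.append(value)
        self.importance.append(0.0)  # new slots start with no usage history
```

Under these assumptions the bank stays at a constant size regardless of video length, which matches the abstract's claim of handling arbitrarily long videos; the actual per-pixel key/value layout and the exact importance measure would follow the paper.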
Related papers
- Efficient Video Object Segmentation via Modulated Cross-Attention Memory [123.12273176475863]
We propose a transformer-based approach, named MAVOS, to model temporal smoothness without requiring frequent memory expansion.
Our MAVOS achieves a J&F score of 63.3% while operating at 37 frames per second (FPS) on a single V100 GPU.
arXiv Detail & Related papers (2024-03-26T17:59:58Z)
- Memory-Efficient Continual Learning Object Segmentation for Long Video [7.9190306016374485]
We propose two novel techniques to reduce the memory requirement of Online VOS methods while improving modeling accuracy and generalization on long videos.
Motivated by the success of continual learning techniques in preserving previously learned knowledge, we propose Gated-Regularizer Continual Learning (GRCL) and Reconstruction-based Memory Selection Continual Learning (RMSCL).
Experimental results show that the proposed methods are able to improve the performance of Online VOS models by more than 8%, with improved robustness on long-video datasets.
arXiv Detail & Related papers (2023-09-26T21:22:03Z)
- READMem: Robust Embedding Association for a Diverse Memory in Unconstrained Video Object Segmentation [24.813416082160224]
We present READMem, a modular framework for sVOS methods to handle unconstrained videos.
We propose a robust association of the embeddings stored in the memory with query embeddings during the update process.
Our approach achieves competitive results on the Long-time Video dataset (LV1) while not hindering performance on short sequences.
arXiv Detail & Related papers (2023-05-22T08:31:16Z)
- Look Before You Match: Instance Understanding Matters in Video Object Segmentation [114.57723592870097]
In this paper, we argue that instance understanding matters in video object segmentation (VOS).
We present a two-branch network for VOS, where the query-based instance segmentation (IS) branch delves into the instance details of the current frame and the VOS branch performs spatial-temporal matching with the memory bank.
We employ well-learned object queries from the IS branch to inject instance-specific information into the query key, with which instance-augmented matching is further performed.
arXiv Detail & Related papers (2022-12-13T18:59:59Z)
- Per-Clip Video Object Segmentation [110.08925274049409]
Recently, memory-based approaches have shown promising results on semi-supervised video object segmentation.
We treat video object segmentation as clip-wise mask propagation.
We propose a new method tailored for per-clip inference.
arXiv Detail & Related papers (2022-08-03T09:02:29Z)
- Region Aware Video Object Segmentation with Deep Motion Modeling [56.95836951559529]
Region Aware Video Object Segmentation (RAVOS) is a method that predicts regions of interest for efficient object segmentation and memory storage.
For efficient segmentation, object features are extracted according to the ROIs, and an object decoder is designed for object-level segmentation.
For efficient memory storage, we propose motion path memory to filter out redundant context by memorizing the features within the motion path of objects between two frames.
arXiv Detail & Related papers (2022-07-21T01:44:40Z)
- Learning Quality-aware Dynamic Memory for Video Object Segmentation [32.06309833058726]
We propose a Quality-aware Dynamic Memory Network (QDMN) to evaluate the segmentation quality of each frame.
Our QDMN achieves new state-of-the-art performance on both DAVIS and YouTube-VOS benchmarks.
arXiv Detail & Related papers (2022-07-16T12:18:04Z)
- Recurrent Dynamic Embedding for Video Object Segmentation [54.52527157232795]
We propose a Recurrent Dynamic Embedding (RDE) to build a memory bank of constant size.
We propose an unbiased guidance loss during the training stage, which makes the Spatio-temporal Aggregation Module (SAM) more robust in long videos.
We also design a novel self-correction strategy so that the network can repair the embeddings of masks with different qualities in the memory bank.
arXiv Detail & Related papers (2022-05-08T02:24:43Z)
- Video Object Segmentation with Episodic Graph Memory Networks [198.74780033475724]
A graph memory network is developed to address the novel idea of "learning to update the segmentation model".
We exploit an episodic memory network, organized as a fully connected graph, to store frames as nodes and capture cross-frame correlations by edges.
The proposed graph memory network yields a neat yet principled framework, which generalizes well to both one-shot and zero-shot video object segmentation tasks.
arXiv Detail & Related papers (2020-07-14T13:19:19Z)
- Dual Temporal Memory Network for Efficient Video Object Segmentation [42.05305410986511]
One of the fundamental challenges in video object segmentation (VOS) is how to make the most of the temporal information to boost performance.
We present an end-to-end network which stores short- and long-term video sequence information preceding the current frame as the temporal memories.
Our network consists of two temporal sub-networks including a short-term memory sub-network and a long-term memory sub-network.
arXiv Detail & Related papers (2020-03-13T06:07:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.