Segment Any Events via Weighted Adaptation of Pivotal Tokens
- URL: http://arxiv.org/abs/2312.16222v1
- Date: Sun, 24 Dec 2023 12:47:08 GMT
- Title: Segment Any Events via Weighted Adaptation of Pivotal Tokens
- Authors: Zhiwen Chen, Zhiyu Zhu, Yifan Zhang, Junhui Hou, Guangming Shi, and Jinjian Wu
- Abstract summary: This paper focuses on the nuanced challenge of tailoring the Segment Anything Models (SAMs) for integration with event data.
We introduce a multi-scale feature distillation methodology to optimize the alignment of token embeddings originating from event data with their RGB image counterparts.
- Score: 85.39087004253163
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we delve into the nuanced challenge of tailoring the Segment
Anything Models (SAMs) for integration with event data, with the overarching
objective of attaining robust and universal object segmentation within the
event-centric domain. One pivotal issue at the heart of this endeavor is the
precise alignment and calibration of embeddings derived from event-centric data
such that they harmoniously coincide with those originating from RGB imagery.
Capitalizing on the vast repositories of datasets with paired events and RGB
images, our proposition is to harness and extrapolate the profound knowledge
encapsulated within the pre-trained SAM framework. As a cornerstone to
achieving this, we introduce a multi-scale feature distillation methodology.
This methodology rigorously optimizes the alignment of token embeddings
originating from event data with their RGB image counterparts, thereby
preserving and enhancing the robustness of the overall architecture.
Considering the distinct significance that token embeddings from intermediate
layers hold for higher-level embeddings, our strategy is centered on accurately
calibrating the pivotal token embeddings. This targeted calibration is aimed at
effectively managing the discrepancies in high-level embeddings originating
from both the event and image domains. Extensive experiments on different
datasets demonstrate the effectiveness of the proposed distillation method.
Code is available at http://github.com/happychenpipi/EventSAM.
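The abstract describes aligning event-derived token embeddings with their RGB counterparts at several scales, with extra emphasis on pivotal tokens. The following is a minimal sketch of that idea, not the authors' implementation: the function name `weighted_token_distillation_loss`, the use of a per-token mean squared error, and the way `token_weights` is supplied and normalized are all assumptions made for illustration. The abstract does not specify how token importance is computed.

```python
import numpy as np


def weighted_token_distillation_loss(student_feats, teacher_feats, token_weights=None):
    """Average a token-weighted MSE over several feature scales.

    student_feats / teacher_feats: lists of arrays shaped (num_tokens, dim),
        one pair per layer or scale (event branch vs. RGB branch).
    token_weights: optional list of non-negative arrays shaped (num_tokens,).
        Hypothetical importance scores; uniform weights are used if omitted.
    """
    if len(student_feats) != len(teacher_feats):
        raise ValueError("student and teacher must provide the same number of scales")
    if not student_feats:
        raise ValueError("at least one scale is required")

    total = 0.0
    for i, (s, t) in enumerate(zip(student_feats, teacher_feats)):
        s = np.asarray(s, dtype=float)
        t = np.asarray(t, dtype=float)
        # Squared error per token, averaged over the embedding dimension.
        per_token = np.mean((s - t) ** 2, axis=-1)
        if token_weights is None:
            w = np.full(per_token.shape, 1.0 / per_token.size)
        else:
            w = np.asarray(token_weights[i], dtype=float)
            if w.sum() <= 0:
                raise ValueError("token weights must have a positive sum")
            w = w / w.sum()  # normalize so each scale contributes comparably
        total += float(np.sum(w * per_token))
    # Mean over scales.
    return total / len(student_feats)


# Toy example: two tokens, only the first one is misaligned.
event_tokens = [np.zeros((2, 2))]
rgb_tokens = [np.array([[1.0, 1.0], [0.0, 0.0]])]
uniform = weighted_token_distillation_loss(event_tokens, rgb_tokens)
focused = weighted_token_distillation_loss(
    event_tokens, rgb_tokens, token_weights=[np.array([1.0, 0.0])]
)
```

In this toy case, putting all the weight on the misaligned token raises the loss relative to uniform weighting, which is the intended effect of emphasizing pivotal tokens. In a real training setup the RGB branch would typically be frozen and the loss back-propagated through the event branch only.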
Related papers
- EZSR: Event-based Zero-Shot Recognition [21.10165234725309]
This paper studies zero-shot object recognition using event camera data.
We develop an event encoder without relying on additional reconstruction networks.
We achieve 47.84% zero-shot accuracy on the N-ImageNet dataset.
arXiv Detail & Related papers (2024-07-31T14:06:06Z)
- Depth-Guided Semi-Supervised Instance Segmentation [62.80063539262021]
Semi-Supervised Instance Segmentation (SSIS) aims to leverage unlabeled data during training.
Previous frameworks primarily utilized the RGB information of unlabeled images to generate pseudo-labels.
We introduce a Depth-Guided (DG) framework to overcome this limitation.
arXiv Detail & Related papers (2024-06-25T09:36:50Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- Clothes Grasping and Unfolding Based on RGB-D Semantic Segmentation [21.950751953721817]
We propose a novel Bi-directional Fractal Cross Fusion Network (BiFCNet) for semantic segmentation.
We use RGB images with rich color features as input to our network in which the Fractal Cross Fusion module fuses RGB and depth data.
To reduce the cost of real data collection, we propose a data augmentation method based on an adversarial strategy.
arXiv Detail & Related papers (2023-05-05T03:21:55Z)
- Discriminative Co-Saliency and Background Mining Transformer for Co-Salient Object Detection [111.04994415248736]
We propose a Discriminative co-saliency and background Mining Transformer framework (DMT).
We use two types of pre-defined tokens to mine co-saliency and background information via our proposed contrast-induced pixel-to-token correlation and co-saliency token-to-token correlation modules.
Experimental results on three benchmark datasets demonstrate the effectiveness of our proposed method.
arXiv Detail & Related papers (2023-04-30T15:56:47Z)
- Multi-domain Collaborative Feature Representation for Robust Visual Object Tracking [32.760681454334765]
This paper focuses on effectively representing and utilizing complementary features from the frame domain and event domain.
For learning the unique features of the two domains, we utilize a Unique Extractor for Event (UEE) based on Spiking Neural Networks.
Experiments on a standard RGB benchmark and a real event tracking dataset demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2021-08-10T09:01:42Z)
- Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)
- Cluster-level Feature Alignment for Person Re-identification [16.01713931617725]
This paper probes another feature alignment modality, namely cluster-level feature alignment across the whole dataset.
We propose an anchor loss and investigate many variants of cluster-level feature alignment, which consists of iterative aggregation and alignment from an overview of the dataset.
arXiv Detail & Related papers (2020-08-15T23:47:47Z)
- Gradient-Induced Co-Saliency Detection [81.54194063218216]
Co-saliency detection (Co-SOD) aims to segment the common salient foreground in a group of relevant images.
In this paper, inspired by human behavior, we propose a gradient-induced co-saliency detection method.
arXiv Detail & Related papers (2020-04-28T08:40:55Z)