Related papers: A Similarity Alignment Model for Video Copy Segment Matching

URL: http://arxiv.org/abs/2305.15679v1
Date: Thu, 25 May 2023 03:08:51 GMT
Title: A Similarity Alignment Model for Video Copy Segment Matching
Authors: Zhenhua Liu, Feipeng Ma, Tianyi Wang, Fengyun Rao
Abstract summary: Meta AI held the Video Similarity Challenge at CVPR 2023 to push the technology forward. We propose a Similarity Alignment Model for video copy segment matching. Our SAM exhibits superior performance compared to other competitors.
Score: 13.517933749704866
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: With the development of multimedia technology, Video Copy Detection has become a crucial problem for social media platforms. Meta AI held the Video Similarity Challenge at CVPR 2023 to push the technology forward. In this report, we share our winning solution for the Matching Track. We propose a Similarity Alignment Model (SAM) for video copy segment matching. Our SAM exhibits superior performance compared to other competitors, with a 0.108 / 0.144 absolute improvement over the second-place competitor in Phase 1 / Phase 2. Code is available at https://github.com/FeipengMa6/VSC22-Submission/tree/main/VSC22-Matching-Track-1st.

Related papers

VC-Bench: Pioneering the Video Connecting Benchmark with a Dataset and Evaluation Metrics [83.61875204972465]
We introduce Video Connecting, a task that aims to generate smooth intermediate video content between given start and end clips. To bridge this gap, we proposed VC-Bench, a novel benchmark specifically designed for video connecting. VC-Bench focuses on three core aspects: Video Quality Score (VQS), Start-End Consistency Score (SECS), and Transition Smoothness Score (TSS).
arXiv Detail & Related papers (2026-01-27T06:15:12Z)
NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement: Methods and Results [179.05961380270648]
Review of the NTIRE 2025 Challenge on Short-form Video Quality Assessment and Enhancement. The challenge comprises two tracks: (i) Efficient Video Quality Assessment (KVQ), and (ii) Diffusion-based Image Super-Resolution (KwaiSR).
arXiv Detail & Related papers (2025-04-17T17:45:34Z)
DC-SAM: In-Context Segment Anything in Images and Videos via Dual Consistency [91.30252180093333]
We propose the Dual Consistency SAM (DC-SAM) method based on prompt-tuning to adapt SAM and SAM2 for in-context segmentation. Our key insight is to enhance the features of SAM's prompt encoder in segmentation by providing high-quality visual prompts. Although the proposed DC-SAM is primarily designed for images, it can be seamlessly extended to the video domain with the support of SAM2.
arXiv Detail & Related papers (2025-04-16T13:41:59Z)
AIM 2024 Challenge on Video Saliency Prediction: Methods and Results [105.09572982350532]
This paper reviews the Challenge on Video Saliency Prediction at AIM 2024. The goal of the participants was to develop a method for predicting accurate saliency maps for the provided set of video sequences.
arXiv Detail & Related papers (2024-09-23T08:59:22Z)
AIM 2024 Challenge on Compressed Video Quality Assessment: Methods and Results [120.95863275142727]
This paper presents the results of the Compressed Video Quality Assessment challenge, held in conjunction with the Advances in Image Manipulation (AIM) workshop at ECCV 2024. The challenge aimed to evaluate the performance of VQA methods on a diverse dataset of 459 videos encoded with 14 codecs of various compression standards.
arXiv Detail & Related papers (2024-08-21T20:32:45Z)
1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation [81.50620771207329]
We investigate the effectiveness of static-dominant data and frame sampling on referring video object segmentation (RVOS). Our solution achieves a J&F score of 0.5447 in the competition phase and ranks 1st in the MeViS track of the PVUW Challenge.
arXiv Detail & Related papers (2024-06-11T08:05:26Z)
NTIRE 2024 Quality Assessment of AI-Generated Content Challenge [141.37864527005226]
The challenge is divided into the image track and the video track. The winning methods in both tracks have demonstrated superior prediction performance on AIGC.
arXiv Detail & Related papers (2024-04-25T15:36:18Z)
A Dual-level Detection Method for Video Copy Detection [13.517933749704866]
Meta AI held the Video Similarity Challenge at CVPR 2023 to push the technology forward. We propose a dual-level detection method with Video Editing Detection (VED) and Frame Scenes Detection (FSD) to tackle the core challenges of Video Copy Detection.
arXiv Detail & Related papers (2023-05-21T06:19:08Z)
3rd Place Solution to Meta AI Video Similarity Challenge [1.1470070927586016]
This paper presents our 3rd place solution in the Meta AI Video Similarity Challenge (VSC2022). Our approach builds upon existing image copy detection techniques and incorporates several strategies to exploit the properties of video data.
arXiv Detail & Related papers (2023-04-24T10:00:09Z)
Feature-compatible Progressive Learning for Video Copy Detection [30.358206867280426]
Video Copy Detection (VCD) has been developed to identify instances of unauthorized or duplicated video content. This paper presents our second place solutions to the Meta AI Video Similarity Challenge (VSC22), CVPR 2023.
arXiv Detail & Related papers (2023-04-20T13:39:47Z)
M&M Mix: A Multimodal Multiview Transformer Ensemble [77.16389667210427]
This report describes the approach behind our winning solution to the 2022 Epic-Kitchens Action Recognition Challenge. Our approach builds upon our recent work, Multiview Transformer for Video Recognition (MTV), and adapts it to multimodal inputs. Our approach achieved 52.8% Top-1 accuracy on the test set in action classes, which is 4.1% higher than last year's winning entry.
arXiv Detail & Related papers (2022-06-20T15:31:13Z)
Cross-modal Manifold Cutmix for Self-supervised Video Representation Learning [50.544635516455116]
This paper focuses on designing video augmentation for self-supervised learning. We first analyze the best strategy to mix videos to create a new augmented video sample. We propose Cross-Modal Manifold Cutmix (CMMC) that inserts a video tesseract into another video tesseract in the feature space across two different modalities.
arXiv Detail & Related papers (2021-12-07T18:58:33Z)
Top1 Solution of QQ Browser 2021 Ai Algorithm Competition Track 1 : Multimodal Video Similarity [0.6445605125467573]
We describe the solution to the QQ Browser 2021 Ai Algorithm Competition (AIAC) Track 1. In the pretrain phase, we train the model with three tasks: (1) Video Tag Classification (VTC), (2) Mask Language Modeling (MLM), and (3) Mask Frame Modeling (MFM). In the finetune phase, we train the model with video similarity based on rank normalized human labels. Our full pipeline, after ensembling several models, scores 0.852 on the leaderboard, achieving 1st place in the competition.
arXiv Detail & Related papers (2021-10-30T15:38:04Z)

This list is automatically generated from the titles and abstracts of the papers on this site.