D^2LV: A Data-Driven and Local-Verification Approach for Image Copy
Detection
- URL: http://arxiv.org/abs/2111.07090v1
- Date: Sat, 13 Nov 2021 10:56:58 GMT
- Title: D^2LV: A Data-Driven and Local-Verification Approach for Image Copy
Detection
- Authors: Wenhao Wang, Yifan Sun, Weipu Zhang, Yi Yang
- Abstract summary: A data-driven and local-verification approach is proposed to compete in the Image Similarity Challenge: Matching Track at NeurIPS'21.
In D^2LV, unsupervised pre-training replaces the commonly used supervised pre-training.
The proposed approach ranks first out of 1,103 participants on the Facebook AI Image Similarity Challenge: Matching Track.
- Score: 36.473577708618976
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image copy detection is of great importance in real-life social media. In
this paper, a data-driven and local-verification (D^2LV) approach is proposed
to compete in the Image Similarity Challenge: Matching Track at NeurIPS'21. In
D^2LV, unsupervised pre-training replaces the commonly used supervised one.
For training, we design a set of basic transformations and six advanced
transformations, and a simple but effective baseline learns robust
representations. During testing, a
global-local and local-global matching strategy is proposed. The strategy
performs local-verification between reference and query images. Experiments
demonstrate that the proposed method is effective. The proposed approach ranks
first out of 1,103 participants on the Facebook AI Image Similarity Challenge:
Matching Track. The code and trained models are available at
https://github.com/WangWenhao0716/ISC-Track1-Submission.
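The matching strategy named in the abstract can be illustrated with a minimal sketch. This is one plausible reading, not the authors' implementation: it assumes each image has one global descriptor plus descriptors for a few local crops, that "global-local" compares the whole query against reference crops, that "local-global" compares query crops against the whole reference, and that the two directions are combined by taking the maximum cosine similarity. The names `cosine` and `local_verification_score` are hypothetical.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two descriptors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def local_verification_score(
    query_global: Vector,
    query_locals: Sequence[Vector],
    ref_global: Vector,
    ref_locals: Sequence[Vector],
) -> float:
    """Hypothetical two-direction matching score (assumption, see lead-in).

    global-local: whole query vs. each reference crop.
    local-global: each query crop vs. whole reference.
    The best match over both directions is returned.
    """
    global_local = max((cosine(query_global, r) for r in ref_locals), default=-1.0)
    local_global = max((cosine(q, ref_global) for q in query_locals), default=-1.0)
    return max(global_local, local_global)


# Toy descriptors: the query is a copy of one crop of the reference, so the
# whole-image descriptors disagree while one reference crop matches the query.
query_global = [1.0, 0.0]
query_locals = [[1.0, 1.0]]
ref_global = [0.0, 1.0]
ref_locals = [[0.0, 1.0], [1.0, 0.0]]

whole_image = cosine(query_global, ref_global)
score = local_verification_score(query_global, query_locals, ref_global, ref_locals)
print(whole_image, score)
```

In this toy case the whole-image comparison misses the copy, while the crop-level comparison recovers it, which is the kind of situation local verification is meant to handle.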
Related papers
- ForgeryTTT: Zero-Shot Image Manipulation Localization with Test-Time Training [42.58645429356456]
Social media is increasingly plagued by realistic fake images, making it hard to trust content.
Previous algorithms to detect these fakes often fail in new, real-world scenarios because they are trained on specific datasets.
We introduce ForgeryTTT, the first method leveraging test-time training to identify manipulated regions in images.
arXiv Detail & Related papers (2024-10-05T04:41:55Z)
- Towards Seamless Adaptation of Pre-trained Models for Visual Place Recognition [72.35438297011176]
We propose a novel method to realize seamless adaptation of pre-trained models for visual place recognition (VPR).
Specifically, to obtain both global and local features that focus on salient landmarks for discriminating places, we design a hybrid adaptation method.
Experimental results show that our method outperforms the state-of-the-art methods with less training data and training time.
arXiv Detail & Related papers (2024-02-22T12:55:01Z)
- MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments [72.6405488990753]
Self-supervised learning can be used for mitigating the greedy needs of Vision Transformer networks.
We propose a single-stage and standalone method, MOCA, which unifies both desired properties.
We achieve new state-of-the-art results on low-shot settings and strong experimental results in various evaluation protocols.
arXiv Detail & Related papers (2023-07-18T15:46:20Z)
- Few-Shot Learning with Visual Distribution Calibration and Cross-Modal Distribution Alignment [47.53887941065894]
Pre-trained vision-language models have inspired much research on few-shot learning.
With only a few training images, the visual feature distributions are easily distracted by class-irrelevant information in images.
We propose a Selective Attack module that generates spatial attention maps of images to guide the attacks on class-irrelevant image areas.
arXiv Detail & Related papers (2023-05-19T05:45:17Z)
- 1st Place Solution for ECCV 2022 OOD-CV Challenge Image Classification Track [64.49153847504141]
OOD-CV challenge is an out-of-distribution generalization task.
In this challenge, our core solution can be summarized as follows: Noisy Label Learning is a strong Test-Time Domain Adaptation method.
After integrating Test-Time Augmentation and Model Ensemble strategies, our solution ranks first on the Image Classification Leaderboard of the OOD-CV Challenge.
arXiv Detail & Related papers (2023-01-12T03:44:30Z)
- 3rd Place: A Global and Local Dual Retrieval Solution to Facebook AI Image Similarity Challenge [2.4340897078287815]
This paper presents our 3rd place solution to the matching track of Image Similarity Challenge (ISC) 2021 organized by Facebook AI.
We propose a multi-branch retrieval method that combines global descriptors and local descriptors to cover all attack cases.
We show ablation experiments on our method, which reveal the complementary advantages of global and local features.
arXiv Detail & Related papers (2021-12-04T16:25:24Z)
- Bag of Tricks and A Strong baseline for Image Copy Detection [36.473577708618976]
A bag of tricks and a strong baseline are proposed for image copy detection.
We design a descriptor stretching strategy to stabilize the scores of different queries.
The proposed baseline ranks third out of 526 participants on the Facebook AI Image Similarity Challenge: Descriptor Track.
arXiv Detail & Related papers (2021-11-13T13:58:43Z)
- DetCo: Unsupervised Contrastive Learning for Object Detection [64.22416613061888]
Unsupervised contrastive learning achieves great success in learning image representations with CNNs.
We present a novel contrastive learning approach, named DetCo, which fully explores the contrasts between global image and local image patches.
DetCo consistently outperforms the supervised method by 1.6/1.2/1.0 AP on Mask RCNN-C4/FPN/RetinaNet with a 1x schedule.
arXiv Detail & Related papers (2021-02-09T12:47:20Z)
- Un-Mix: Rethinking Image Mixtures for Unsupervised Visual Representation Learning [108.999497144296]
Recently advanced unsupervised learning approaches use the siamese-like framework to compare two "views" from the same image for learning representations.
This work aims to involve the distance concept on label space in the unsupervised learning and let the model be aware of the soft degree of similarity between positive or negative pairs.
Despite its conceptual simplicity, we show empirically that with the solution -- Unsupervised image mixtures (Un-Mix), we can learn subtler, more robust and generalized representations from the transformed input and corresponding new label space.
arXiv Detail & Related papers (2020-03-11T17:59:04Z)