Contrastive Learning with Large Memory Bank and Negative Embedding Subtraction for Accurate Copy Detection
- URL: http://arxiv.org/abs/2112.04323v1
- Date: Wed, 8 Dec 2021 15:08:10 GMT
- Title: Contrastive Learning with Large Memory Bank and Negative Embedding Subtraction for Accurate Copy Detection
- Authors: Shuhei Yokoo
- Abstract summary: Copy detection is a task to determine whether an image is a modified copy of any image in a database.
We trained convolutional neural networks (CNNs) with contrastive learning.
Using our methods, we achieved 1st place in the Facebook AI Image Similarity Challenge: Descriptor Track.
- Score: 1.90365714903665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Copy detection, which is a task to determine whether an image is a modified
copy of any image in a database, is an unsolved problem. Thus, we addressed
copy detection by training convolutional neural networks (CNNs) with
contrastive learning. Training with a large memory-bank and hard data
augmentation enables the CNNs to obtain more discriminative representation. Our
proposed negative embedding subtraction further boosts the copy detection
accuracy. Using our methods, we achieved 1st place in the Facebook AI Image
Similarity Challenge: Descriptor Track. Our code is publicly available here:
https://github.com/lyakaap/ISC21-Descriptor-Track-1st
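The abstract names two ingredients, contrastive training against a large memory bank and negative embedding subtraction, without giving formulas. The sketch below is one plausible reading of each, not the author's implementation: the InfoNCE-style loss, the function names, and the parameters `temperature`, `k`, and `beta` are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit L2 norm."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def memory_bank_contrastive_loss(anchors, positives, memory_bank, temperature=0.05):
    """InfoNCE-style loss (assumed form, not taken from the paper).

    anchors, positives: (B, D) unit-norm embeddings of two views of the same images.
    memory_bank: (M, D) unit-norm embeddings of other images, used as negatives.
    """
    # Similarity of each anchor to its own positive: shape (B, 1).
    pos_logits = np.sum(anchors * positives, axis=1, keepdims=True) / temperature
    # Similarity of each anchor to every stored negative: shape (B, M).
    neg_logits = anchors @ memory_bank.T / temperature
    logits = np.concatenate([pos_logits, neg_logits], axis=1)
    # Subtract the row maximum for a numerically stable log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob_pos = logits[:, 0] - np.log(np.exp(logits).sum(axis=1))
    return float(-log_prob_pos.mean())


def subtract_negative_embeddings(embeddings, negatives, k=3, beta=0.3):
    """Push each embedding away from its k most similar negatives (assumed form).

    Subtracts beta times the mean of the k nearest negative embeddings,
    then re-normalizes the result to unit length.
    """
    sims = embeddings @ negatives.T                # (N, M) cosine similarities
    topk = np.argsort(-sims, axis=1)[:, :k]        # indices of the k nearest negatives
    neighbor_mean = negatives[topk].mean(axis=1)   # (N, k, D) -> (N, D)
    return l2_normalize(embeddings - beta * neighbor_mean)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    queries = l2_normalize(rng.normal(size=(4, 8)))
    bank = l2_normalize(rng.normal(size=(16, 8)))
    print(memory_bank_contrastive_loss(queries, queries, bank))
    print(subtract_negative_embeddings(queries, bank).shape)
```

In this reading, a larger memory bank simply means more rows in `memory_bank`, so each anchor is contrasted against more negatives per step, and the subtraction step is applied to descriptors after training rather than during it.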
Related papers
- Transductive Learning for Near-Duplicate Image Detection in Scanned Photo Collections [0.0]
This paper presents a comparative study of near-duplicate image detection techniques in a real-world use case scenario.
We propose a transductive learning approach that leverages state-of-the-art deep learning architectures such as convolutional neural networks (CNNs) and Vision Transformers (ViTs).
The results show that the proposed approach outperforms the baseline methods at near-duplicate image detection on UKBench and an in-house private dataset.
arXiv Detail & Related papers (2024-10-25T09:56:15Z)
- Generating Aligned Pseudo-Supervision from Non-Aligned Data for Image Restoration in Under-Display Camera [84.41316720913785]
We revisit the classic stereo setup for training data collection -- capturing two images of the same scene with one UDC and one standard camera.
The key idea is to "copy" details from a high-quality reference image and "paste" them on the UDC image.
A novel Transformer-based framework generates well-aligned yet high-quality target data for the corresponding UDC input.
arXiv Detail & Related papers (2023-04-12T17:56:42Z)
- Active Image Indexing [26.33727468288776]
This paper improves the robustness of image copy detection with active indexing.
We reduce the quantization loss of a given image representation by making imperceptible changes to the image before its release.
Experiments show that the retrieval and copy detection of activated images is significantly improved.
arXiv Detail & Related papers (2022-10-05T17:55:15Z)
- A Self-Supervised Descriptor for Image Copy Detection [13.624995441674642]
We introduce SSCD, a model that builds on a self-supervised contrastive training objective.
We adapt this method to the copy detection task by changing the architecture and training objective.
Our approach relies on an entropy regularization term, promoting consistent separation between descriptor vectors.
arXiv Detail & Related papers (2022-02-21T14:25:32Z)
- Bag of Tricks and A Strong baseline for Image Copy Detection [36.473577708618976]
A bag of tricks and a strong baseline are proposed for image copy detection.
We design a descriptor stretching strategy to stabilize the scores of different queries.
The proposed baseline ranks third out of 526 participants on the Facebook AI Image Similarity Challenge: Descriptor Track.
arXiv Detail & Related papers (2021-11-13T13:58:43Z)
- Compact Binary Fingerprint for Image Copy Re-Ranking [0.0]
Image copy detection is a challenging and appealing topic in computer vision and signal processing.
Local keypoint descriptors such as SIFT are used to represent the images, which are then matched and retrieved by matching those descriptors.
Features are quantized so that searching/matching may be made feasible for large databases at the cost of accuracy loss.
arXiv Detail & Related papers (2021-09-16T08:44:56Z)
- Unsupervised Pretraining for Object Detection by Patch Reidentification [72.75287435882798]
Unsupervised representation learning achieves promising performances in pre-training representations for object detectors.
This work proposes a simple yet effective representation learning method for object detection, named patch re-identification (Re-ID).
Our method significantly outperforms its counterparts on COCO in all settings, such as different training iterations and data percentages.
arXiv Detail & Related papers (2021-03-08T15:13:59Z)
- Data Augmentation for Object Detection via Differentiable Neural Rendering [71.00447761415388]
It is challenging to train a robust object detector when annotated data is scarce.
Existing approaches to tackle this problem include semi-supervised learning that interpolates labeled data from unlabeled data.
We introduce an offline data augmentation method for object detection, which semantically interpolates the training data with novel views.
arXiv Detail & Related papers (2021-03-04T06:31:06Z)
- DetCo: Unsupervised Contrastive Learning for Object Detection [64.22416613061888]
Unsupervised contrastive learning achieves great success in learning image representations with CNN.
We present a novel contrastive learning approach, named DetCo, which fully explores the contrasts between global image and local image patches.
DetCo consistently outperforms the supervised method by 1.6/1.2/1.0 AP on Mask RCNN-C4/FPN/RetinaNet with a 1x schedule.
arXiv Detail & Related papers (2021-02-09T12:47:20Z)
- D-Unet: A Dual-encoder U-Net for Image Splicing Forgery Detection and Localization [108.8592577019391]
Image splicing forgery detection is a global binary classification task that distinguishes the tampered and non-tampered regions by image fingerprints.
We propose a novel network called dual-encoder U-Net (D-Unet) for image splicing forgery detection, which employs an unfixed encoder and a fixed encoder.
In an experimental comparison study of D-Unet and state-of-the-art methods, D-Unet outperformed the other methods in image-level and pixel-level detection.
arXiv Detail & Related papers (2020-12-03T10:54:02Z)
- An Empirical Study of the Collapsing Problem in Semi-Supervised 2D Human Pose Estimation [80.02124918255059]
Semi-supervised learning aims to boost the accuracy of a model by exploring unlabeled images.
We learn two networks to mutually teach each other.
The more reliable predictions on easy images in each network are used to teach the other network to learn about the corresponding hard images.
arXiv Detail & Related papers (2020-11-25T03:29:52Z)