Retail-786k: a Large-Scale Dataset for Visual Entity Matching
- URL: http://arxiv.org/abs/2309.17164v2
- Date: Mon, 11 Mar 2024 15:11:11 GMT
- Title: Retail-786k: a Large-Scale Dataset for Visual Entity Matching
- Authors: Bianca Lamm (1 and 2), Janis Keuper (1) ((1) IMLA, Offenburg
University, (2) Markant Services International GmbH)
- Abstract summary: This paper introduces the first publicly available large-scale dataset for "visual entity matching".
We provide a total of ~786k manually annotated, high-resolution product images covering ~18k different individual retail products, which are grouped into ~3k entities.
The proposed "visual entity matching" task constitutes a novel learning problem that cannot be sufficiently solved using standard image-based classification and retrieval algorithms.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Entity Matching (EM) defines the task of learning to group objects by
transferring semantic concepts from example groups (=entities) to unseen data.
Despite the general availability of image data in the context of many
EM-problems, most currently available EM-algorithms solely rely on (textual)
meta data. In this paper, we introduce the first publicly available large-scale
dataset for "visual entity matching", based on a production level use case in
the retail domain. Using scanned advertisement leaflets, collected over several
years from different European retailers, we provide a total of ~786k manually
annotated, high resolution product images containing ~18k different individual
retail products which are grouped into ~3k entities. The annotation of these
product entities is based on a price comparison task, where each entity forms
an equivalence class of comparable products. Following a first baseline
evaluation, we show that the proposed "visual entity matching" constitutes a
novel learning problem which cannot be sufficiently solved using standard
image-based classification and retrieval algorithms. Instead, novel approaches
that allow transferring example-based visual equivalence classes to new data
are needed to address the proposed problem. The aim of this paper is to
provide a benchmark for such algorithms.
Information about the dataset, evaluation code and download instructions are
provided under https://www.retail-786k.org/.
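To make the task concrete, the following is a minimal sketch of the kind of retrieval baseline the abstract argues is insufficient: each entity (equivalence class of comparable products) is represented by the mean of its example-image embeddings, and an unseen product image is assigned to the entity with the most similar prototype. The embeddings are assumed to come from any pretrained image encoder; the function names and the use of cosine similarity are illustrative assumptions, not the paper's actual evaluation protocol.

```python
import numpy as np

def entity_prototypes(embeddings, entity_ids):
    """Build one prototype per entity by averaging and L2-normalizing
    the embeddings of that entity's annotated example images.

    embeddings: (n_images, dim) array from any image encoder (assumed given).
    entity_ids: length-n_images list of entity labels.
    """
    entity_ids = np.asarray(entity_ids)
    ids = sorted(set(entity_ids.tolist()))
    protos = np.stack([embeddings[entity_ids == e].mean(axis=0) for e in ids])
    protos /= np.linalg.norm(protos, axis=1, keepdims=True)
    return ids, protos

def match_entities(queries, ids, protos):
    """Assign each query embedding to the entity whose prototype has the
    highest cosine similarity (a simple nearest-prototype baseline)."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    return [ids[i] for i in (q @ protos.T).argmax(axis=1)]
```

Because entities in Retail-786k are defined by a price-comparison notion of equivalence rather than pure visual identity, such embedding-similarity baselines are exactly where the paper reports standard classification and retrieval approaches fall short.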
Related papers
- Unsupervised Collaborative Metric Learning with Mixed-Scale Groups for General Object Retrieval [28.810040080126324]
This paper presents a novel unsupervised deep metric learning approach, termed unsupervised collaborative metric learning with mixed-scale groups (MS-UGCML).
We show that our approach can learn embeddings for objects of varying scales, with object-level and image-level mAP improvements of up to 6.69% and 10.03%, respectively.
arXiv Detail & Related papers (2024-03-16T04:01:50Z) - Text-Based Product Matching -- Semi-Supervised Clustering Approach [9.748519919202986]
This paper presents a new approach to product matching based on semi-supervised clustering.
We study the properties of this method by experimenting with the IDEC algorithm on a real-world dataset.
arXiv Detail & Related papers (2024-02-01T18:52:26Z) - Thinking Like an Annotator: Generation of Dataset Labeling Instructions [59.603239753484345]
We introduce a new task, Labeling Instruction Generation, to address missing publicly available labeling instructions.
We take a reasonably annotated dataset and: 1) generate a set of examples that are visually representative of each category in the dataset; 2) provide a text label that corresponds to each of the examples.
This framework acts as a proxy to human annotators that can help to both generate a final labeling instruction set and evaluate its quality.
arXiv Detail & Related papers (2023-06-24T18:32:48Z) - Modeling Entities as Semantic Points for Visual Information Extraction
in the Wild [55.91783742370978]
We propose an alternative approach to precisely and robustly extract key information from document images.
We explicitly model entities as semantic points, i.e., center points of entities are enriched with semantic information describing the attributes and relationships of different entities.
The proposed method can achieve significantly enhanced performance on entity labeling and linking, compared with previous state-of-the-art models.
arXiv Detail & Related papers (2023-03-23T08:21:16Z) - Automatic Generation of Product-Image Sequence in E-commerce [46.06263129000091]
Multi-modality Unified Imagesequence (MUIsC) is able to simultaneously detect all categories through learning rule violations.
By Dec 2021, our AGPIS framework has generated high-standard images for about 1.5 million products and achieves a 13.6% reject rate.
arXiv Detail & Related papers (2022-06-26T23:38:42Z) - Open-World Entity Segmentation [70.41548013910402]
We introduce a new image segmentation task, termed Entity Segmentation (ES), with the aim of segmenting all visual entities in an image without considering semantic category labels.
All semantically-meaningful segments are equally treated as categoryless entities and there is no thing-stuff distinction.
ES enables the following: (1) merging multiple datasets to form a large training set without the need to resolve label conflicts; (2) any model trained on one dataset can generalize exceptionally well to other datasets with unseen domains.
arXiv Detail & Related papers (2021-07-29T17:59:05Z) - Simple multi-dataset detection [83.9604523643406]
We present a simple method for training a unified detector on multiple large-scale datasets.
We show how to automatically integrate dataset-specific outputs into a common semantic taxonomy.
Our approach does not require manual taxonomy reconciliation.
arXiv Detail & Related papers (2021-02-25T18:55:58Z) - UniT: Unified Knowledge Transfer for Any-shot Object Detection and
Segmentation [52.487469544343305]
Methods for object detection and segmentation rely on large scale instance-level annotations for training.
We propose an intuitive and unified semi-supervised model that is applicable to a range of supervision levels.
arXiv Detail & Related papers (2020-06-12T22:45:47Z) - Rethinking Object Detection in Retail Stores [55.359582952686175]
We propose a new task, simultaneous object localization and counting, abbreviated as Locount.
Locount requires algorithms to localize groups of objects of interest with the number of instances.
We collect a large-scale object localization and counting dataset with rich annotations in retail stores.
arXiv Detail & Related papers (2020-03-18T14:01:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.