A Strong Baseline for Fashion Retrieval with Person Re-Identification
Models
- URL: http://arxiv.org/abs/2003.04094v1
- Date: Mon, 9 Mar 2020 12:50:15 GMT
- Title: A Strong Baseline for Fashion Retrieval with Person Re-Identification
Models
- Authors: Mikolaj Wieczorek (1), Andrzej Michalowski (1), Anna Wroblewska (1 and
2), Jacek Dabrowski (1) ((1) Synerise, (2) Warsaw University of Technology)
- Abstract summary: Fashion retrieval is the challenging task of finding an exact match for fashion items contained within an image.
We introduce a simple baseline model for fashion retrieval, significantly outperforming previous state-of-the-art results.
We conduct in-depth experiments on Street2Shop and DeepFashion datasets and validate our results.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fashion retrieval is the challenging task of finding an exact match for
fashion items contained within an image. Difficulties arise from the
fine-grained nature of clothing items, very large intra-class and inter-class
variance. Additionally, query and source images for the task usually come from
different domains - street photos and catalogue photos respectively. Due to
these differences, a significant gap in quality, lighting, contrast, background
clutter and item presentation exists between domains. As a result, fashion
retrieval is an active field of research both in academia and the industry.
Inspired by recent advancements in Person Re-Identification research, we
adapt leading ReID models to be used in fashion retrieval tasks. We introduce a
simple baseline model for fashion retrieval, significantly outperforming
previous state-of-the-art results despite a much simpler architecture. We
conduct in-depth experiments on Street2Shop and DeepFashion datasets and
validate our results. Finally, we propose a cross-domain (cross-dataset)
evaluation method to test the robustness of fashion retrieval models.
Related papers
- FashionFail: Addressing Failure Cases in Fashion Object Detection and Segmentation [7.483981721542115]
FashionFail is a new dataset with e-commerce images for object detection and segmentation.
Our analysis reveals the shortcomings of leading models, such as Attribute-Mask R-CNN and Fashionformer.
We propose a baseline approach using naive data augmentation to mitigate common failure cases and improve model robustness.
arXiv Detail & Related papers (2024-04-12T16:28:30Z) - The Impact of Background Removal on Performance of Neural Networks for Fashion Image Classification and Segmentation [3.408972648415128]
We try removing the background from fashion images to boost data quality and increase model performance.
Background removal can effectively work for fashion data in simple and shallow networks that are not susceptible to overfitting.
It can improve model accuracy by up to 5% in the classification on the FashionStyle14 dataset when training models from scratch.
arXiv Detail & Related papers (2023-08-18T18:18:47Z) - LRVS-Fashion: Extending Visual Search with Referring Instructions [13.590668564555195]
We present Referred Visual Search (RVS), a task allowing users to define more precisely the desired similarity.
We release a new large public dataset, LRVS-Fashion, consisting of 272k fashion products with 842k images extracted from fashion catalogs.
arXiv Detail & Related papers (2023-06-05T14:45:38Z) - FaD-VLP: Fashion Vision-and-Language Pre-training towards Unified
Retrieval and Captioning [66.38951790650887]
Multimodal tasks in the fashion domain have significant potential for e-commerce.
We propose a novel fashion-specific pre-training framework based on weakly-supervised triplets constructed from fashion image-text pairs.
We show the triplet-based tasks are an effective addition to standard multimodal pre-training tasks.
arXiv Detail & Related papers (2022-10-26T21:01:19Z) - Progressive Learning for Image Retrieval with Hybrid-Modality Queries [48.79599320198615]
Image retrieval with hybrid-modality queries, also known as composing text and image for image retrieval (CTI-IR)
We decompose the CTI-IR task into a three-stage learning problem to progressively learn the complex knowledge for image retrieval with hybrid-modality queries.
Our proposed model significantly outperforms state-of-the-art methods in the mean of Recall@K by 24.9% and 9.5% on the Fashion-IQ and Shoes benchmark datasets respectively.
arXiv Detail & Related papers (2022-04-24T08:10:06Z) - Fashionformer: A simple, Effective and Unified Baseline for Human
Fashion Segmentation and Recognition [80.74495836502919]
In this work, we focus on joint human fashion segmentation and attribute recognition.
We introduce the object query for segmentation and the attribute query for attribute prediction.
For attribute stream, we design a novel Multi-Layer Rendering module to explore more fine-grained features.
arXiv Detail & Related papers (2022-04-10T11:11:10Z) - Where Does the Performance Improvement Come From? - A Reproducibility
Concern about Image-Text Retrieval [85.03655458677295]
Image-text retrieval has gradually become a major research direction in the field of information retrieval.
We first examine the related concerns and why the focus is on image-text retrieval tasks.
We analyze various aspects of the reproduction of pretrained and nonpretrained retrieval models.
arXiv Detail & Related papers (2022-03-08T05:01:43Z) - Conversational Fashion Image Retrieval via Multiturn Natural Language
Feedback [36.623221002330226]
We study the task of conversational fashion image retrieval via multiturn natural language feedback.
We propose a novel framework that can effectively handle conversational fashion image retrieval with multiturn natural language feedback texts.
arXiv Detail & Related papers (2021-06-08T06:34:25Z) - FashionBERT: Text and Image Matching with Adaptive Loss for Cross-modal
Retrieval [31.822218310945036]
FashionBERT learns high level representations of texts and images.
FashionBERT achieves significant improvements in performances than the baseline and state-of-the-art approaches.
arXiv Detail & Related papers (2020-05-20T00:41:00Z) - Fashionpedia: Ontology, Segmentation, and an Attribute Localization
Dataset [62.77342894987297]
We propose a novel Attribute-Mask RCNN model to jointly perform instance segmentation and localized attribute recognition.
We also demonstrate instance segmentation models pre-trained on Fashionpedia achieve better transfer learning performance on other fashion datasets than ImageNet pre-training.
arXiv Detail & Related papers (2020-04-26T02:38:26Z) - Learning Diverse Fashion Collocation by Neural Graph Filtering [78.9188246136867]
We propose a novel fashion collocation framework, Neural Graph Filtering, that models a flexible set of fashion items via a graph neural network.
By applying symmetric operations on the edge vectors, this framework allows varying numbers of inputs/outputs and is invariant to their ordering.
We evaluate the proposed approach on three popular benchmarks, the Polyvore dataset, the Polyvore-D dataset, and our reorganized Amazon Fashion dataset.
arXiv Detail & Related papers (2020-03-11T16:17:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.