Webly Supervised Fine-Grained Recognition: Benchmark Datasets and An
Approach
- URL: http://arxiv.org/abs/2108.02399v1
- Date: Thu, 5 Aug 2021 06:28:32 GMT
- Title: Webly Supervised Fine-Grained Recognition: Benchmark Datasets and An
Approach
- Authors: Zeren Sun, Yazhou Yao, Xiu-Shen Wei, Yongshun Zhang, Fumin Shen,
Jianxin Wu, Jian Zhang, Heng-Tao Shen
- Abstract summary: We construct two new benchmark webly supervised fine-grained datasets, termed WebFG-496 and WebiNat-5089.
WebiNat-5089 contains 5,089 sub-categories and more than 1.1 million web training images, making it the largest webly supervised fine-grained dataset to date.
As a minor contribution, we also propose a novel webly supervised method (termed "Peer-learning") for benchmarking these datasets.
- Score: 115.91099791629104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning from the web can ease the extreme dependence of deep learning on
large-scale manually labeled datasets. Especially for fine-grained recognition,
which aims to distinguish subordinate categories, leveraging free web data can
significantly reduce labeling costs. Despite its significant
practical and research value, the webly supervised fine-grained recognition
problem is not extensively studied in the computer vision community, largely
due to the lack of high-quality datasets. To fill this gap, in this paper we
construct two new benchmark webly supervised fine-grained datasets, termed
WebFG-496 and WebiNat-5089. Concretely, WebFG-496 consists of
three sub-datasets containing a total of 53,339 web training images with 200
species of birds (Web-bird), 100 types of aircraft (Web-aircraft), and 196
models of cars (Web-car). WebiNat-5089 contains 5,089 sub-categories and
more than 1.1 million web training images, making it the largest webly
supervised fine-grained dataset to date. As a minor contribution, we also propose
a novel webly supervised method (termed "Peer-learning") for benchmarking
these datasets. Comprehensive experimental results and analyses on the two new
benchmark datasets demonstrate that the proposed method achieves superior
performance over competing baseline models and the state-of-the-art. Our
benchmark datasets and the source codes of Peer-learning have been made
available at
https://github.com/NUST-Machine-Intelligence-Laboratory/weblyFG-dataset.
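The abstract does not spell out the Peer-learning algorithm itself. As a rough, hypothetical illustration of one common family of techniques for learning from noisy web labels (co-teaching-style small-loss selection between two peer networks), consider the sketch below. The function names, the keep ratio, and the sample-exchange scheme are illustrative assumptions, not the paper's actual implementation.

```python
# Hedged sketch: small-loss sample exchange between two peer networks,
# in the spirit of co-teaching-style methods for noisy web labels.
# This is NOT the paper's exact Peer-learning algorithm.

def small_loss_indices(losses, keep_ratio):
    """Indices of the keep_ratio fraction of samples with the smallest loss,
    treated as likely-clean under the memorization effect."""
    k = max(1, int(len(losses) * keep_ratio))
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    return sorted(order[:k])

def peer_exchange(losses_a, losses_b, keep_ratio):
    """Each network selects its small-loss subset and hands it to the peer:
    A trains on B's selection, B trains on A's selection."""
    train_a = small_loss_indices(losses_b, keep_ratio)
    train_b = small_loss_indices(losses_a, keep_ratio)
    return train_a, train_b

# Toy batch: sample 3 has a large loss for both peers (likely mislabeled web noise).
losses_a = [0.2, 0.5, 0.3, 2.9]
losses_b = [0.4, 0.1, 0.6, 3.1]
train_a, train_b = peer_exchange(losses_a, losses_b, keep_ratio=0.75)
print(train_a)  # -> [0, 1, 2]
print(train_b)  # -> [0, 1, 2]
```

Exchanging selections between two differently initialized networks is one standard way to keep a single network from reinforcing its own mistakes on noisy web data.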
Related papers
- The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale [30.955171096569618]
FineWeb is a 15-trillion token dataset derived from 96 Common Crawl snapshots.
FineWeb-Edu is a 1.3-trillion token collection of educational text filtered from FineWeb.
arXiv Detail & Related papers (2024-06-25T13:50:56Z)
- From Categories to Classifier: Name-Only Continual Learning by Exploring
the Web [125.75085825742092]
Continual learning often relies on the availability of extensive annotated datasets, an assumption that is unrealistically time-consuming and costly in practice.
We explore a novel paradigm termed name-only continual learning where time and cost constraints prohibit manual annotation.
Our proposed solution leverages the expansive and ever-evolving internet to query and download uncurated webly-supervised data for image classification.
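The name-only setting described above starts from class names alone. A minimal sketch of its first step, forming web-search queries from class names, is below; the query templates are illustrative assumptions, and no actual downloading is performed.

```python
# Minimal sketch of the name-only setting: given only class names, form
# web-search queries to gather uncurated webly supervised training images.
# Template strings here are hypothetical, not from the paper.

def build_queries(class_names, templates=("{} photo", "{} image")):
    """One search query per (class, template) pair."""
    return [t.format(name) for name in class_names for t in templates]

queries = build_queries(["monarch butterfly", "boeing 747"])
print(queries)
# -> ['monarch butterfly photo', 'monarch butterfly image',
#     'boeing 747 photo', 'boeing 747 image']
```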
arXiv Detail & Related papers (2023-11-19T10:43:43Z)
- ELFIS: Expert Learning for Fine-grained Image Recognition Using Subsets [6.632855264705276]
We propose ELFIS, an expert learning framework for Fine-Grained Visual Recognition.
A set of neural networks-based experts are trained focusing on the meta-categories and are integrated into a multi-task framework.
Experiments show accuracy improvements of up to +1.3% on SoTA FGVR benchmarks using both CNNs and transformer-based networks.
arXiv Detail & Related papers (2023-03-16T12:45:19Z)
- GROWN+UP: A Graph Representation Of a Webpage Network Utilizing
Pre-training [0.2538209532048866]
We introduce a task-agnostic deep graph neural network feature extractor that can ingest webpage structures, pre-train self-supervised on massive unlabeled data, and fine-tune to arbitrary tasks on webpages effectively.
We show that our pre-trained model achieves state-of-the-art results using multiple datasets on two very different benchmarks: webpage boilerplate removal and genre classification.
arXiv Detail & Related papers (2022-08-03T13:37:27Z)
- DataPerf: Benchmarks for Data-Centric AI Development [81.03754002516862]
DataPerf is a community-led benchmark suite for evaluating ML datasets and data-centric algorithms.
We provide an open, online platform with multiple rounds of challenges to support this iterative development.
The benchmarks, online evaluation platform, and baseline implementations are open source.
arXiv Detail & Related papers (2022-07-20T17:47:54Z)
- Data-Free Adversarial Knowledge Distillation for Graph Neural Networks [62.71646916191515]
We propose DFAD-GNN, the first end-to-end framework for data-free adversarial knowledge distillation on graph-structured data.
Specifically, DFAD-GNN employs a generative adversarial setup with three components: a pre-trained teacher model and a student model, regarded as two discriminators, and a generator that derives training graphs for distilling knowledge from the teacher into the student.
Our DFAD-GNN significantly surpasses state-of-the-art data-free baselines in the graph classification task.
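The three-component setup above can be illustrated with a deliberately schematic toy, not the paper's GNN implementation: a "generator" proposes inputs where teacher and student disagree most, and the student then fits the frozen teacher there. The scalar models, candidate grid, and learning rate are all assumptions for illustration.

```python
# Schematic toy of data-free adversarial distillation. Models are 1-D
# scalar functions; the "generator" is a coarse search over candidates.

def teacher(x):
    return 2.0 * x + 1.0           # frozen pre-trained model (assumed form)

def make_student(w, b):
    return lambda x: w * x + b     # student with learnable (w, b)

def generator_step(student, candidates):
    """Adversarial step: pick the input maximizing teacher/student disagreement."""
    return max(candidates, key=lambda x: abs(teacher(x) - student(x)))

def student_step(w, b, x, lr=0.1):
    """Distillation step: gradient descent on (student(x) - teacher(x))^2."""
    err = (w * x + b) - teacher(x)
    return w - lr * 2 * err * x, b - lr * 2 * err

w, b = 0.0, 0.0
candidates = [-1.0, -0.5, 0.0, 0.5, 1.0]
for _ in range(200):
    x = generator_step(make_student(w, b), candidates)
    w, b = student_step(w, b, x)

print(round(w, 2), round(b, 2))  # student approaches the teacher: 2.0 1.0
```

The min-max dynamic is the key idea: because no real training data is available, the generator must keep manufacturing inputs that expose the student's remaining disagreement with the teacher.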
arXiv Detail & Related papers (2022-05-08T08:19:40Z)
- The Klarna Product Page Dataset: Web Element Nomination with Graph
Neural Networks and Large Language Models [51.39011092347136]
We introduce the Klarna Product Page dataset, a collection of webpages that surpasses existing datasets in richness and variety.
First, we empirically benchmark a range of Graph Neural Networks (GNNs) on the web element nomination task.
Second, we introduce a training refinement procedure that involves identifying a small number of relevant elements from each page.
Third, we introduce the Challenge Nomination Training Procedure, a novel training approach that further boosts nomination accuracy.
arXiv Detail & Related papers (2021-11-03T12:13:52Z)
- On The State of Data In Computer Vision: Human Annotations Remain
Indispensable for Developing Deep Learning Models [0.0]
High-quality labeled datasets play a crucial role in fueling the development of machine learning (ML).
Since the emergence of the ImageNet dataset and the AlexNet model in 2012, the size of new open-source labeled vision datasets has remained roughly constant.
Only a minority of publications in the computer vision community tackle supervised learning on datasets that are orders of magnitude larger than ImageNet.
arXiv Detail & Related papers (2021-07-31T00:08:21Z)
- Facial Age Estimation using Convolutional Neural Networks [0.0]
This paper is a part of a student project in Machine Learning at the Norwegian University of Science and Technology.
A deep convolutional neural network with five convolutional layers and three fully-connected layers is presented to estimate the ages of individuals based on images.
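To make the five-conv/three-FC architecture concrete, the sketch below counts parameters for such a network. The channel counts, kernel sizes, pooled feature-map size, and single regression output are illustrative assumptions; the paper's exact configuration may differ.

```python
# Hedged sketch: parameter counting for a CNN with five convolutional and
# three fully-connected layers. All layer sizes below are assumptions.

def conv_params(in_ch, out_ch, k):
    """Weights (k*k*in_ch per filter) plus one bias per output channel."""
    return (k * k * in_ch + 1) * out_ch

def fc_params(in_dim, out_dim):
    """Dense weight matrix plus biases."""
    return (in_dim + 1) * out_dim

# Five conv layers (assumed channels), 3x3 kernels throughout.
convs = [(3, 32), (32, 64), (64, 128), (128, 256), (256, 256)]
conv_total = sum(conv_params(i, o, 3) for i, o in convs)

# Three fully-connected layers; 256 channels pooled to assumed 7x7 maps,
# ending in a single age-regression output.
fcs = [(256 * 7 * 7, 1024), (1024, 512), (512, 1)]
fc_total = sum(fc_params(i, o) for i, o in fcs)

print(conv_total, fc_total)  # the FC layers dominate the parameter count
```

Counting this way makes a standard design trade-off visible: the first fully-connected layer after flattening typically holds far more parameters than all convolutional layers combined.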
arXiv Detail & Related papers (2021-05-14T10:09:47Z)
- NWPU-Crowd: A Large-Scale Benchmark for Crowd Counting and Localization [101.13851473792334]
We construct a large-scale congested crowd counting and localization dataset, NWPU-Crowd, consisting of 5,109 images, in a total of 2,133,375 annotated heads with points and boxes.
Compared with other real-world datasets, it contains various illumination scenes and has the largest density range (0 to 20,033).
We describe the data characteristics, evaluate the performance of some mainstream state-of-the-art (SOTA) methods, and analyze the new problems that arise on the new data.
arXiv Detail & Related papers (2020-01-10T09:26:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.