Scene Image Representation by Foreground, Background and Hybrid Features
- URL: http://arxiv.org/abs/2006.03199v1
- Date: Fri, 5 Jun 2020 01:55:24 GMT
- Title: Scene Image Representation by Foreground, Background and Hybrid Features
- Authors: Chiranjibi Sitaula and Yong Xiang and Sunil Aryal and Xuequan Lu
- Abstract summary: We propose to use hybrid features in addition to foreground and background features to represent scene images.
Our method produces state-of-the-art classification performance.
- Score: 17.754713956659188
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous methods for representing scene images based on deep learning
primarily consider either the foreground or background information as the
discriminating clues for the classification task. However, scene images also
require additional information (hybrid) to cope with the inter-class similarity
and intra-class variation problems. In this paper, we propose to use hybrid
features in addition to foreground and background features to represent scene
images. We suppose that these three types of information could jointly help to
represent scene images more accurately. To this end, we adopt three VGG-16
architectures pre-trained on ImageNet, Places, and Hybrid (both ImageNet and
Places) datasets for the corresponding extraction of foreground, background and
hybrid information. All three types of deep features are then
aggregated to form the final features for the representation of scene
images. Extensive experiments on two large benchmark scene datasets (MIT-67 and
SUN-397) show that our method produces state-of-the-art classification
performance.
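The aggregation step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the three feature types are aggregated, so the per-vector L2 normalization followed by concatenation shown here is an assumption, not the authors' confirmed method. The function names `l2_normalize` and `aggregate_features` are hypothetical, and the short toy vectors stand in for features that would come from the VGG-16 models pre-trained on ImageNet, Places, and Hybrid.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Scale a vector to unit Euclidean length (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def aggregate_features(
    foreground: Sequence[float],
    background: Sequence[float],
    hybrid: Sequence[float],
) -> List[float]:
    """Fuse three feature vectors into one image representation.

    Assumption: normalize each vector, then concatenate. The source only
    says the features are "aggregated"; other schemes are possible.
    """
    fused: List[float] = []
    for part in (foreground, background, hybrid):
        fused.extend(l2_normalize(part))
    return fused


# Toy stand-ins for features from the ImageNet-, Places-, and
# Hybrid-pre-trained VGG-16 models (real features are much longer).
fg = [3.0, 4.0]
bg = [0.0, 2.0]
hy = [1.0, 0.0]
fused = aggregate_features(fg, bg, hy)
```

Normalizing each vector before concatenation keeps any one source from dominating the fused representation purely because of its scale; the fused vector could then be passed to any standard classifier.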
Related papers
- Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model [80.61157097223058]
A prevalent strategy to bolster image classification performance is through augmenting the training set with synthetic images generated by T2I models.
In this study, we scrutinize the shortcomings of both current generative and conventional data augmentation techniques.
We introduce an innovative inter-class data augmentation method known as Diff-Mix, which enriches the dataset by performing image translations between classes.
arXiv Detail & Related papers (2024-03-28T17:23:45Z)
- Locate, Assign, Refine: Taming Customized Image Inpainting with Text-Subject Guidance [17.251982243534144]
LAR-Gen is a novel approach for image inpainting that enables seamless inpainting of masked scene images.
Our approach adopts a coarse-to-fine manner to ensure subject identity preservation and local semantic coherence.
Experiments and varied application scenarios demonstrate the superiority of LAR-Gen in terms of both identity preservation and text semantic consistency.
arXiv Detail & Related papers (2024-03-28T16:07:55Z)
- High-Quality Entity Segmentation [110.55724145851725]
CropFormer is designed to tackle the intractability of instance-level segmentation on high-resolution images.
It improves mask prediction by fusing high-res image crops that provide more fine-grained image details and the full image.
With CropFormer, we achieve a significant AP gain of $1.9$ on the challenging entity segmentation task.
arXiv Detail & Related papers (2022-11-10T18:58:22Z)
- Scrape, Cut, Paste and Learn: Automated Dataset Generation Applied to Parcel Logistics [58.720142291102135]
We present a fully automated pipeline to generate a synthetic dataset for instance segmentation in four steps.
We first scrape images for the objects of interest from popular image search engines.
We compare three different methods for image selection: Object-agnostic pre-processing, manual image selection and CNN-based image selection.
arXiv Detail & Related papers (2022-10-18T12:49:04Z)
- Recent Advances in Scene Image Representation and Classification [1.8369974607582584]
We review the existing scene image representation methods that are being used widely for image classification.
We compare their performance both qualitatively (e.g., quality of outputs, pros and cons) and quantitatively (e.g., accuracy).
Overall, this survey provides in-depth insights and applications of recent scene image representation methods for traditional Computer Vision (CV)-based methods, Deep Learning (DL)-based methods, and Search Engine (SE)-based methods.
arXiv Detail & Related papers (2022-06-15T07:12:23Z)
- Knowledge Mining with Scene Text for Fine-Grained Recognition [53.74297368412834]
We propose an end-to-end trainable network that mines implicit contextual knowledge behind scene text images.
We employ KnowBert to retrieve relevant knowledge for semantic representation and combine it with image features for fine-grained classification.
Our method outperforms the state-of-the-art by 3.72% mAP and 5.39% mAP, respectively.
arXiv Detail & Related papers (2022-03-27T05:54:00Z)
- Rectifying the Shortcut Learning of Background: Shared Object Concentration for Few-Shot Image Recognition [101.59989523028264]
Few-Shot image classification aims to utilize pretrained knowledge learned from a large-scale dataset to tackle a series of downstream classification tasks.
We propose COSOC, a novel Few-Shot Learning framework, to automatically identify foreground objects at both the pretraining and evaluation stages.
arXiv Detail & Related papers (2021-07-16T07:46:41Z)
- Devil's in the Details: Aligning Visual Clues for Conditional Embedding in Person Re-Identification [94.77172127405846]
We propose two key recognition patterns to better utilize the detail information of pedestrian images.
CACE-Net achieves state-of-the-art performance on three public datasets.
arXiv Detail & Related papers (2020-09-11T06:28:56Z)
- AiRound and CV-BrCT: Novel Multi-View Datasets for Scene Classification [2.931113769364182]
We present two new publicly available datasets named AiRound and CV-BrCT.
The first one contains triplets of images from the same geographic coordinate with different perspectives of view extracted from various places around the world.
The second dataset contains pairs of aerial and street-level images extracted from southeast Brazil.
arXiv Detail & Related papers (2020-08-03T18:55:46Z)
- Content and Context Features for Scene Image Representation [16.252523139552174]
We propose new techniques to compute content features and context features, and then fuse them together.
For content features, we design multi-scale deep features based on background and foreground information in images.
For context features, we use annotations of similar images available on the web to design a set of filter words (a codebook).
arXiv Detail & Related papers (2020-06-05T03:19:13Z)
- HDF: Hybrid Deep Features for Scene Image Representation [16.252523139552174]
We propose a novel type of features -- hybrid deep features, for scene images.
We exploit both object-based and scene-based features at two levels.
We show that our introduced features can produce state-of-the-art classification accuracies.
arXiv Detail & Related papers (2020-03-22T01:05:08Z)