An Automatic Image Content Retrieval Method for better Mobile Device Display User Experiences
- URL: http://arxiv.org/abs/2108.12068v1
- Date: Thu, 26 Aug 2021 23:44:34 GMT
- Title: An Automatic Image Content Retrieval Method for better Mobile Device Display User Experiences
- Authors: Alessandro Bruno
- Abstract summary: A new mobile application for image content retrieval and classification for mobile device displays is proposed.
The application was run on thousands of pictures and showed encouraging results towards a better user visual experience with mobile displays.
- Score: 91.3755431537592
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: A growing number of commercially available mobile phones come with
integrated high-resolution digital cameras. This enables a new class of
dedicated image analysis applications, such as mobile visual search, image
cropping, object detection, content-based image retrieval, and image
classification. In this paper, a new mobile application for image content
retrieval and classification on mobile device displays is proposed to enrich
the visual experience of users. The application extracts a number of image
crops based on the content of an image, using visual saliency methods that aim
to detect the most important regions of a given image from a perceptual
viewpoint. First, the perceptually most important areas are extracted as the
local maxima of a 2D saliency function. Next, each salient region is cropped
with a bounding box centred on a local maximum of the thresholded saliency map
of the image. Then, each image crop is fed into an image classification system
based on SVM and SIFT descriptors to detect the class of object present in the
crop. The ImageNet repository was used as the reference for semantic category
classification. The mobile application was implemented on the Android platform
using a client-server architecture: the mobile client sends the photo taken by
the camera to the server, which processes the image and returns the results
(image contents such as image crops and the related target classes) to the
client. The application was run on thousands of pictures and showed encouraging
results towards a better visual experience for users of mobile displays.
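The cropping step described in the abstract (local maxima of a thresholded saliency map, with a bounding box centred on each maximum) can be sketched as follows. This is a minimal illustration, not the paper's implementation: OpenCV's spectral-residual detector (from opencv-contrib-python) stands in for the paper's own saliency method, and the box size, saliency threshold, neighbourhood size, and crop count are assumed values.

```python
import cv2
import numpy as np
from scipy.ndimage import maximum_filter

def salient_crops(image_bgr, box_size=256, sal_thresh=0.6, max_crops=5):
    # Stand-in saliency model (requires opencv-contrib-python); the paper
    # uses its own 2D saliency function instead.
    saliency = cv2.saliency.StaticSaliencySpectralResidual_create()
    ok, sal_map = saliency.computeSaliency(image_bgr)
    if not ok:
        return []
    sal_map = cv2.GaussianBlur(sal_map.astype(np.float32), (9, 9), 0)
    sal_map /= sal_map.max() + 1e-8

    # Local maxima of the thresholded saliency map.
    peaks = (sal_map == maximum_filter(sal_map, size=31)) & (sal_map > sal_thresh)
    ys, xs = np.nonzero(peaks)
    order = np.argsort(sal_map[ys, xs])[::-1][:max_crops]

    # Fixed-size bounding boxes centred on each local maximum.
    h, w = sal_map.shape
    half = box_size // 2
    crops = []
    for y, x in zip(ys[order], xs[order]):
        y0, x0 = max(0, y - half), max(0, x - half)
        y1, x1 = min(h, y + half), min(w, x + half)
        crops.append(image_bgr[y0:y1, x0:x1])
    return crops
```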
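The abstract names SVM and SIFT descriptors for classifying each crop but gives no further detail; a common way to combine the two is a bag-of-visual-words pipeline, sketched below under assumed choices (a 500-word k-means codebook and a linear SVM). The training images and labels would come from the reference categories (ImageNet in the paper) and are not reproduced here.

```python
import cv2
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.svm import LinearSVC

N_WORDS = 500              # assumed vocabulary size
sift = cv2.SIFT_create()   # OpenCV >= 4.4

def sift_descriptors(image_bgr):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    _, desc = sift.detectAndCompute(gray, None)
    return desc if desc is not None else np.empty((0, 128), np.float32)

def bow_histogram(desc, codebook):
    # Quantize SIFT descriptors against the visual vocabulary and build
    # a normalized word-count histogram.
    words = codebook.predict(desc) if len(desc) else np.array([], dtype=int)
    hist, _ = np.histogram(words, bins=np.arange(N_WORDS + 1))
    return hist / (hist.sum() + 1e-8)

def train(images, labels):
    # images: list of BGR arrays; labels: list of category ids/names.
    all_desc = np.vstack([sift_descriptors(im) for im in images])
    codebook = MiniBatchKMeans(n_clusters=N_WORDS, random_state=0).fit(all_desc)
    feats = np.array([bow_histogram(sift_descriptors(im), codebook) for im in images])
    clf = LinearSVC().fit(feats, labels)
    return codebook, clf

def classify(crop_bgr, codebook, clf):
    return clf.predict([bow_histogram(sift_descriptors(crop_bgr), codebook)])[0]
```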
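The client-server round trip (photo uploaded by the Android client, crops and classes returned) could look like the minimal Flask endpoint below on the server side. The endpoint name, form-field names, JSON layout, and base64 crop encoding are assumptions rather than details from the paper; the sketch reuses salient_crops, classify, codebook, and clf from the two snippets above.

```python
import base64
import cv2
import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/analyze", methods=["POST"])   # hypothetical endpoint name
def analyze():
    # The mobile client uploads the camera photo as multipart form data.
    raw = np.frombuffer(request.files["photo"].read(), np.uint8)
    image = cv2.imdecode(raw, cv2.IMREAD_COLOR)

    results = []
    for crop in salient_crops(image):            # sketch above
        label = classify(crop, codebook, clf)    # assumed pre-trained, see above
        ok, jpg = cv2.imencode(".jpg", crop)
        results.append({
            "class": str(label),
            "crop_jpeg_base64": base64.b64encode(jpg.tobytes()).decode("ascii"),
        })
    return jsonify(results)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```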
Related papers
- Revisit Anything: Visual Place Recognition via Image Segment Retrieval [8.544326445217369]
Existing visual place recognition pipelines encode the "whole" image and search for matches.
We address this by encoding and searching for "image segments" instead of the whole images.
We show that retrieving these partial representations leads to significantly higher recognition recall than the typical whole image based retrieval.
arXiv Detail & Related papers (2024-09-26T16:49:58Z)
- Improving Image Recognition by Retrieving from Web-Scale Image-Text Data [68.63453336523318]
We introduce an attention-based memory module, which learns the importance of each retrieved example from the memory.
Compared to existing approaches, our method removes the influence of the irrelevant retrieved examples, and retains those that are beneficial to the input query.
We show that it achieves state-of-the-art accuracy on the ImageNet-LT, Places-LT and Webvision datasets.
arXiv Detail & Related papers (2023-04-11T12:12:05Z)
- High-Quality Entity Segmentation [110.55724145851725]
CropFormer is designed to tackle the intractability of instance-level segmentation on high-resolution images.
It improves mask prediction by fusing high-res image crops that provide more fine-grained image details and the full image.
With CropFormer, we achieve a significant AP gain of $1.9$ on the challenging entity segmentation task.
arXiv Detail & Related papers (2022-11-10T18:58:22Z)
- Fast and Efficient Scene Categorization for Autonomous Driving using VAEs [2.694218293356451]
Scene categorization is a useful precursor task that provides prior knowledge for advanced computer vision tasks.
We propose to generate a global descriptor that captures coarse features from the image and use a classification head to map the descriptors to 3 scene categories: Rural, Urban and Suburban.
The proposed global descriptor is very compact with an embedding length of 128, significantly faster to compute, and robust to seasonal and illumination changes.
arXiv Detail & Related papers (2022-10-26T18:50:15Z)
- Facing the Void: Overcoming Missing Data in Multi-View Imagery [0.783788180051711]
We propose a novel technique for multi-view image classification robust to this problem.
The proposed method, based on state-of-the-art deep learning-based approaches and metric learning, can be easily adapted and exploited in other applications and domains.
Results show that the proposed algorithm provides improvements in multi-view image classification accuracy when compared to state-of-the-art methods.
arXiv Detail & Related papers (2022-05-21T13:21:27Z)
- Use Image Clustering to Facilitate Technology Assisted Review [0.5249805590164902]
Technology Assisted Review (TAR) in electronic discovery is witnessing a rising need to incorporate multimedia content in the scope.
We have developed innovative image analytics applications for TAR in the past years, such as image classification, image clustering, and object detection.
arXiv Detail & Related papers (2021-12-16T04:02:51Z)
- Multimodal Icon Annotation For Mobile Applications [11.342641993269693]
We propose a novel deep learning based multi-modal approach that combines the benefits of both pixel and view hierarchy features.
In order to demonstrate the utility provided, we create a high-quality UI dataset by manually annotating the 29 most commonly used icons in Rico.
arXiv Detail & Related papers (2021-07-09T13:57:37Z)
- A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection [56.82077636126353]
We take advantage of object-centric images to improve object detection in scene-centric images.
We present a simple yet surprisingly effective framework to do so.
Our approach can improve the object detection (and instance segmentation) accuracy of rare objects by 50% (and 33%) relatively.
arXiv Detail & Related papers (2021-02-17T17:27:21Z)
- Retrieval Guided Unsupervised Multi-domain Image-to-Image Translation [59.73535607392732]
Image-to-image translation aims to learn a mapping that transforms an image from one visual domain to another.
We propose the use of an image retrieval system to assist the image-to-image translation task.
arXiv Detail & Related papers (2020-08-11T20:11:53Z)
- Geometrically Mappable Image Features [85.81073893916414]
Vision-based localization of an agent in a map is an important problem in robotics and computer vision.
We propose a method that learns image features targeted for image-retrieval-based localization.
arXiv Detail & Related papers (2020-03-21T15:36:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.