Font Impression Estimation in the Wild
- URL: http://arxiv.org/abs/2402.15236v1
- Date: Fri, 23 Feb 2024 10:00:25 GMT
- Title: Font Impression Estimation in the Wild
- Authors: Kazuki Kitajima, Daichi Haraguchi, Seiichi Uchida
- Abstract summary: We use a font dataset annotated with font impressions and a convolutional neural network (CNN) framework for this task.
We propose an exemplar-based impression estimation approach, which relies on a strategy of ensembling impressions of exemplar fonts that are similar to the input image.
We conduct a correlation analysis between book genres and font impressions on real book cover images.
- Score: 7.542892664684078
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses the challenging task of estimating font impressions from
real font images. We use a font dataset annotated with font impressions
and a convolutional neural network (CNN) framework for this task. However,
the impressions attached to individual fonts are often missing and noisy because
of the subjective nature of font impression annotation. To realize stable
impression estimation even with such a dataset, we propose an exemplar-based
impression estimation approach, which relies on a strategy of ensembling
impressions of exemplar fonts that are similar to the input image. In addition,
we train the CNN on synthetic font images that mimic scanned word images so
that it can estimate impressions of font images in the wild. We evaluate the basic
performance of the proposed estimation method quantitatively and qualitatively.
Then, we conduct a correlation analysis between book genres and font
impressions on real book cover images; it is important to note that this
analysis is only possible with our impression estimation method. The analysis
reveals various trends in the correlation between them; this supports the
hypothesis that book cover designers carefully choose a font for a book cover
considering the impression given by the font.
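As a rough illustration of the exemplar-based strategy described above, the sketch below retrieves the exemplar fonts closest to the input image in CNN feature space and ensembles their impression tags by voting. Function and variable names, the feature/tag layout, and the choice of cosine similarity are assumptions for illustration, not the authors' implementation.

```python
import numpy as np
from collections import Counter

def estimate_impressions(query_feat, exemplar_feats, exemplar_tags, k=10, top_n=5):
    """Hypothetical sketch of exemplar-based impression estimation.

    query_feat:     (D,) CNN feature of the input word image.
    exemplar_feats: (N, D) CNN features of the exemplar fonts.
    exemplar_tags:  list of N tag lists, e.g. [["elegant", "serif"], ...].
    """
    # Cosine similarity between the query and every exemplar font.
    q = query_feat / np.linalg.norm(query_feat)
    e = exemplar_feats / np.linalg.norm(exemplar_feats, axis=1, keepdims=True)
    sims = e @ q

    # Keep the k exemplar fonts most similar to the input image.
    nearest = np.argsort(-sims)[:k]

    # Ensemble by vote counting over the neighbors' impression tags;
    # a tag missing or noisy on any single font is averaged out.
    votes = Counter(tag for i in nearest for tag in exemplar_tags[i])
    return votes.most_common(top_n)
```

Because the final tags come from many exemplar fonts rather than one annotation, a tag mistakenly absent from a single similar font still surfaces when its other neighbors carry it.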
Related papers
- GRIF-DM: Generation of Rich Impression Fonts using Diffusion Models [18.15911470339845]
We introduce a diffusion-based method, termed GRIF-DM, to generate fonts that vividly embody specific impressions.
Our experimental results, conducted on the MyFonts dataset, affirm that this method is capable of producing realistic, vibrant, and high-fidelity fonts.
arXiv Detail & Related papers (2024-08-14T02:26:46Z)
- Towards Retrieval-Augmented Architectures for Image Captioning [81.11529834508424]
This work presents a novel approach towards developing image captioning models that utilize an external kNN memory to improve the generation process.
Specifically, we propose two model variants that incorporate a knowledge retriever based on visual similarities.
We experimentally validate our approach on COCO and nocaps datasets and demonstrate that incorporating an explicit external memory can significantly enhance the quality of captions.
arXiv Detail & Related papers (2024-05-21T18:02:07Z)
- Impression-CLIP: Contrastive Shape-Impression Embedding for Fonts [7.542892664684078]
We propose Impression-CLIP, a novel machine-learning model based on CLIP (Contrastive Language-Image Pre-training).
In our experiment, we perform cross-modal retrieval between fonts and impressions through co-embedding.
The results indicate that Impression-CLIP achieves better retrieval accuracy than the state-of-the-art method.
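CLIP-style co-embeddings of this kind are commonly trained with a symmetric contrastive objective over matched pairs. The sketch below illustrates that generic recipe for font/impression pairs; it is an assumption about the standard approach, not Impression-CLIP's actual code, and all names are hypothetical.

```python
import numpy as np

def clip_style_loss(font_embs, impr_embs, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired font/impression
    embeddings (row i of each matrix is one matched pair)."""
    f = font_embs / np.linalg.norm(font_embs, axis=1, keepdims=True)
    t = impr_embs / np.linalg.norm(impr_embs, axis=1, keepdims=True)
    logits = f @ t.T / temperature          # (B, B) similarity matrix

    def xent(l):
        # Cross-entropy with the diagonal (the true pair) as the target.
        l = l - l.max(axis=1, keepdims=True)             # numerical stability
        log_p = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_p))

    # Average the font-to-impression and impression-to-font directions.
    return 0.5 * (xent(logits) + xent(logits.T))
```

At retrieval time, ranking by cosine similarity in the shared space then serves both directions: fonts from impressions and impressions from fonts.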
arXiv Detail & Related papers (2024-02-26T07:07:18Z)
- CF-Font: Content Fusion for Few-shot Font Generation [63.79915037830131]
We propose a content fusion module (CFM) to project the content feature into a linear space defined by the content features of basis fonts.
Our method also allows optimizing the style representation vector of reference images.
We have evaluated our method on a dataset of 300 fonts with 6.5k characters each.
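The projection of a content feature into the linear space of basis-font content features can be pictured as a least-squares fit over those basis features. The sketch below is a deliberately simplified reading of that idea; the paper's actual CFM and its weighting scheme may differ.

```python
import numpy as np

def content_fusion(content_feat, basis_feats):
    """Project a content feature into the span of basis-font content
    features (hypothetical simplification of the CFM idea).

    content_feat: (D,) content feature of the input character.
    basis_feats:  (M, D) content features of the M basis fonts.
    """
    # Least-squares weights w minimizing ||basis_feats.T @ w - content_feat||.
    w, *_ = np.linalg.lstsq(basis_feats.T, content_feat, rcond=None)
    w = w / w.sum()              # normalize the weights to sum to one
    fused = basis_feats.T @ w    # fused feature inside the basis span
    return fused, w
```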
arXiv Detail & Related papers (2023-03-24T14:18:40Z)
- Generating More Pertinent Captions by Leveraging Semantics and Style on Multi-Source Datasets [56.018551958004814]
This paper addresses the task of generating fluent descriptions by training on a non-uniform combination of data sources.
Large-scale datasets with noisy image-text pairs provide a sub-optimal source of supervision.
We propose to leverage and separate semantics and descriptive style by incorporating a style token and keywords extracted by a retrieval component.
arXiv Detail & Related papers (2021-11-24T19:00:05Z)
- Scalable Font Reconstruction with Dual Latent Manifolds [55.29525824849242]
We propose a deep generative model that performs typography analysis and font reconstruction.
Our approach enables us to massively scale up the number of character types we can effectively model.
We evaluate on the task of font reconstruction over various datasets representing character types of many languages.
arXiv Detail & Related papers (2021-09-10T20:37:43Z)
- A Multi-Implicit Neural Representation for Fonts [79.6123184198301]
Font-specific discontinuities like edges and corners are difficult to represent using neural networks.
We introduce multi-implicits to represent fonts as a permutation-invariant set of learned implicit functions, without losing features.
arXiv Detail & Related papers (2021-06-12T21:40:11Z)
- Font Style that Fits an Image -- Font Generation Based on Image Context [7.646713951724013]
We propose a method of generating a book title image based on its context within a book cover.
We propose an end-to-end neural network that inputs the book cover, a target location mask, and a desired book title and outputs stylized text suitable for the cover.
We demonstrate that the proposed method can effectively produce desirable and appropriate book cover text through quantitative and qualitative results.
arXiv Detail & Related papers (2021-05-19T01:53:04Z)
- Which Parts determine the Impression of the Font? [0.0]
Various fonts give different impressions, such as legible, rough, and comic-text.
By focusing on local shapes instead of the whole letter shape, we can realize a letter-shape-independent and more general analysis.
arXiv Detail & Related papers (2021-03-26T02:13:24Z)
- Shared Latent Space of Font Shapes and Impressions [9.205278113241473]
We realize a shared latent space where a font shape image and its impression words are embedded in a cross-modal manner.
This latent space is useful to understand the style-impression correlation and generate font images by specifying several impression words.
arXiv Detail & Related papers (2021-03-23T06:54:45Z)
- Intrinsic Image Captioning Evaluation [53.51379676690971]
We propose a learning-based metric for image captioning, which we call Intrinsic Image Captioning Evaluation (I2CE).
Experimental results show that our method maintains robust performance and gives more flexible scores to candidate captions when encountering semantically similar expressions or less-aligned semantics.
arXiv Detail & Related papers (2020-12-14T08:36:05Z)