FONTNET: On-Device Font Understanding and Prediction Pipeline
- URL: http://arxiv.org/abs/2103.16150v1
- Date: Tue, 30 Mar 2021 08:11:24 GMT
- Title: FONTNET: On-Device Font Understanding and Prediction Pipeline
- Authors: Rakshith S, Rishabh Khurana, Vibhav Agarwal, Jayesh Rajkumar Vachhani,
Guggilla Bhanodai
- Abstract summary: We propose two engines: Font Detection Engine and Font Prediction Engine.
First, we developed a novel CNN architecture for identifying the font style of text in images.
Second, we designed a novel algorithm for predicting similar fonts for a given query font.
Third, we have optimized and deployed the entire engine on-device, which ensures privacy and improves latency in real-time applications such as instant messaging.
- Score: 1.5749416770494706
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Fonts are one of the most basic and core design concepts. Numerous use cases
can benefit from an in-depth understanding of fonts, such as text customization,
which can change text in an image while maintaining font attributes like
style, color, and size. Currently, text recognition solutions can group recognized
text based on line breaks or paragraph breaks; if the font attributes are known,
multiple text blocks can be combined based on context in a meaningful manner.
In this paper, we propose two engines: a Font Detection Engine, which identifies
the font style, color, and size attributes of text in an image, and a Font
Prediction Engine, which predicts similar fonts for a query font. The major
contributions of this paper are three-fold. First, we developed a novel CNN
architecture for identifying the font style of text in images. Second, we designed
a novel algorithm for predicting similar fonts for a given query font. Third,
we have optimized and deployed the entire engine on-device, which ensures
privacy and improves latency in real-time applications such as instant
messaging. We achieve a worst-case on-device inference time of 30 ms and a model
size of 4.5 MB for both engines.
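The abstract does not say how the Font Prediction Engine ranks similar fonts. As a rough illustration of the general idea only, and not the authors' algorithm, the sketch below assumes each font is already represented by a fixed-length embedding vector and ranks the other fonts by cosine similarity to the query. The font names, the toy 3-dimensional vectors, and the helper names `cosine_similarity` and `similar_fonts` are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_fonts(query, embeddings, k=2):
    """Return the k font names whose embeddings are closest to the query font."""
    query_vec = embeddings[query]
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in embeddings.items()
        if name != query  # never recommend the query font itself
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scored[:k]]


# Hypothetical toy embeddings; a real system would obtain these from a trained model.
FONT_EMBEDDINGS = {
    "SerifA": [0.9, 0.1, 0.0],
    "SerifB": [0.8, 0.2, 0.1],
    "SansA": [0.1, 0.9, 0.0],
    "ScriptA": [0.0, 0.2, 0.9],
}

if __name__ == "__main__":
    print(similar_fonts("SerifA", FONT_EMBEDDINGS, k=2))
```

A brute-force scan like this is only reasonable for a small font catalogue; whether the paper uses embeddings, cosine similarity, or something else entirely is not stated in the abstract.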
Related papers
- TextDiffuser-2: Unleashing the Power of Language Models for Text
Rendering [118.30923824681642]
TextDiffuser-2 aims to unleash the power of language models for text rendering.
We utilize the language model within the diffusion model to encode the position and texts at the line level.
We conduct extensive experiments and incorporate user studies involving human participants as well as GPT-4V.
arXiv Detail & Related papers (2023-11-28T04:02:40Z)
- VQ-Font: Few-Shot Font Generation with Structure-Aware Enhancement and
Quantization [52.870638830417]
We propose a VQGAN-based framework (i.e., VQ-Font) to enhance glyph fidelity through token prior refinement and structure-aware enhancement.
Specifically, we pre-train a VQGAN to encapsulate font token prior within a codebook. Subsequently, VQ-Font refines the synthesized glyphs with the codebook to eliminate the domain gap between synthesized and real-world strokes.
arXiv Detail & Related papers (2023-08-27T06:32:20Z)
- Combining OCR Models for Reading Early Modern Printed Books [2.839401411131008]
We study the usage of fine-grained font recognition on OCR for books printed from the 15th to the 18th century.
We show that OCR performance is strongly impacted by font style and that selecting fine-tuned models with font group recognition has a very positive impact on the results.
arXiv Detail & Related papers (2023-05-11T20:43:50Z)
- CF-Font: Content Fusion for Few-shot Font Generation [63.79915037830131]
We propose a content fusion module (CFM) to project the content feature into a linear space defined by the content features of basis fonts.
Our method also allows optimizing the style representation vector of reference images.
We have evaluated our method on a dataset of 300 fonts with 6.5k characters each.
arXiv Detail & Related papers (2023-03-24T14:18:40Z)
- Diff-Font: Diffusion Model for Robust One-Shot Font Generation [110.45944936952309]
We propose a novel one-shot font generation method based on a diffusion model, named Diff-Font.
The proposed model aims to generate the entire font library by giving only one sample as the reference.
The well-trained Diff-Font is not only robust to font gap and font variation, but also achieves promising performance on difficult character generation.
arXiv Detail & Related papers (2022-12-12T13:51:50Z)
- Scalable Font Reconstruction with Dual Latent Manifolds [55.29525824849242]
We propose a deep generative model that performs typography analysis and font reconstruction.
Our approach enables us to massively scale up the number of character types we can effectively model.
We evaluate on the task of font reconstruction over various datasets representing character types of many languages.
arXiv Detail & Related papers (2021-09-10T20:37:43Z)
- Font Completion and Manipulation by Cycling Between Multi-Modality
Representations [113.26243126754704]
We innovate to explore the generation of font glyphs as 2D graphic objects with the graph as an intermediate representation.
We formulate a cross-modality cycled image-to-image structure with a graph between an image encoder and an image.
Our model generates better results than both the image-to-image baseline and previous state-of-the-art methods for glyph completion.
arXiv Detail & Related papers (2021-08-30T02:43:29Z)
- AdaptiFont: Increasing Individuals' Reading Speed with a Generative Font
Model and Bayesian Optimization [3.480626767752489]
AdaptiFont is a human-in-the-loop system aimed at interactively increasing readability of text displayed on a monitor.
We generate new TrueType fonts through active learning, render texts with the new font, and measure individual users' reading speed.
The results of a user study show that this adaptive font generation system finds regions in the font space corresponding to high reading speeds, that these fonts significantly increase participants' reading speed, and that the found fonts are significantly different across individual readers.
arXiv Detail & Related papers (2021-04-21T19:56:28Z)
- Impressions2Font: Generating Fonts by Specifying Impressions [10.345810093530261]
This paper proposes Impressions2Font (Imp2Font) that generates font images with specific impressions.
Imp2Font accepts an arbitrary number of impression words as the condition to generate the font images.
arXiv Detail & Related papers (2021-03-18T06:10:26Z)
- Few-shot Compositional Font Generation with Dual Memory [16.967987801167514]
We propose a novel font generation framework, named Dual Memory-augmented Font Generation Network (DM-Font).
We employ memory components and global-context awareness in the generator to take advantage of the compositionality.
In experiments on Korean-handwriting fonts and Thai-printing fonts, we observe that our method generates samples of significantly better quality with faithful stylization.
arXiv Detail & Related papers (2020-05-21T08:13:40Z)
- Attribute2Font: Creating Fonts You Want From Attributes [32.82714291856353]
Attribute2Font is trained to perform font style transfer between any two fonts conditioned on their attribute values.
A novel unit named Attribute Attention Module is designed to make those generated glyph images better embody the prominent font attributes.
arXiv Detail & Related papers (2020-05-16T04:06:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.