A Probabilistic Generative Model for Typographical Analysis of Early
Modern Printing
- URL: http://arxiv.org/abs/2005.01646v1
- Date: Mon, 4 May 2020 17:01:11 GMT
- Title: A Probabilistic Generative Model for Typographical Analysis of Early
Modern Printing
- Authors: Kartik Goyal, Chris Dyer, Christopher Warren, Max G'Sell, Taylor
Berg-Kirkpatrick
- Abstract summary: We propose a deep and interpretable probabilistic generative model to analyze glyph shapes in printed Early Modern documents.
Our approach introduces a neural editor model that first generates well-understood printing perturbations from template parameters via interpretable latent variables.
We show that our approach outperforms rigid interpretable clustering baselines (Ocular) and overly-flexible deep generative models (VAE) alike on the task of completely unsupervised discovery of typefaces in mixed-font documents.
- Score: 44.62884731273421
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a deep and interpretable probabilistic generative model to analyze
glyph shapes in printed Early Modern documents. We focus on clustering
extracted glyph images into underlying templates in the presence of multiple
confounding sources of variance. Our approach introduces a neural editor model
that first generates well-understood printing phenomena like spatial
perturbations from template parameters via interpretable latent variables, and
then modifies the result by generating a non-interpretable latent vector
responsible for inking variations, jitter, noise from the archiving process,
and other unforeseen phenomena associated with Early Modern printing.
Critically, by introducing an inference network whose input is restricted to
the visual residual between the observation and the interpretably-modified
template, we are able to control and isolate what the vector-valued latent
variable captures. We show that our approach outperforms rigid interpretable
clustering baselines (Ocular) and overly-flexible deep generative models (VAE)
alike on the task of completely unsupervised discovery of typefaces in
mixed-font documents.
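To make the described structure concrete, here is a minimal sketch, assuming PyTorch, of the two key ideas: an interpretable editor that warps a glyph template via well-understood perturbation latents (illustrated here with an affine warp), and an inference network restricted to seeing only the residual between the observation and the warped template. All names, sizes, and the affine parameterization are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InterpretableEditor(nn.Module):
    """Applies a well-understood printing perturbation (here a 2D affine
    warp, standing in for offset/scale/rotation latents) to a glyph template."""
    def forward(self, template, theta):
        # theta: (B, 2, 3) affine parameters drawn from interpretable latents
        grid = F.affine_grid(theta, template.size(), align_corners=False)
        return F.grid_sample(template, grid, align_corners=False)

class ResidualInference(nn.Module):
    """Inference network whose input is restricted to the visual residual
    between observation and warped template, so the non-interpretable
    vector can only explain what the interpretable warp cannot."""
    def __init__(self, z_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU())
        self.mu = nn.Linear(128, z_dim)
        self.logvar = nn.Linear(128, z_dim)

    def forward(self, observation, warped_template):
        residual = observation - warped_template   # the key restriction
        h = self.enc(residual)
        return self.mu(h), self.logvar(h)

# Usage: warp a template, then infer the residual latent (reparameterized).
B = 4
template = torch.rand(B, 1, 28, 28)                 # learned per-cluster template
obs = torch.rand(B, 1, 28, 28)                      # extracted glyph image
theta = torch.eye(2, 3).unsqueeze(0).repeat(B, 1, 1)  # identity warp, for illustration
warped = InterpretableEditor()(template, theta)
mu, logvar = ResidualInference()(obs, warped)
z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
```

Because the inference network never sees the raw observation, anything the affine warp can already explain produces no residual signal, which is what isolates the vector-valued latent to inking, jitter, and archiving noise.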
Related papers
- Sub-graph Based Diffusion Model for Link Prediction [43.15741675617231]
Denoising Diffusion Probabilistic Models (DDPMs) are a contemporary class of generative models with exceptional sample quality.
We build a novel generative model for link prediction, using a dedicated design that decomposes the likelihood estimation process via Bayes' rule.
Our proposed method presents numerous advantages: (1) transferability across datasets without retraining, (2) promising generalization on limited training data, and (3) robustness against graph adversarial attacks.
arXiv Detail & Related papers (2024-09-13T02:23:55Z)
- Prototype Generation: Robust Feature Visualisation for Data Independent Interpretability [1.223779595809275]
Prototype Generation is a stricter and more robust form of feature visualisation for model-agnostic, data-independent interpretability of image classification models.
We demonstrate its ability to generate inputs that result in natural activation paths, countering previous claims that feature visualisation algorithms are untrustworthy due to unnatural internal activations.
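For context, Prototype Generation builds on activation-maximisation-style feature visualisation. The following is a generic sketch of that underlying loop, not the authors' exact procedure: the classifier, class index, learning rate, and step count are all placeholder assumptions.

```python
import torch
import torchvision

model = torchvision.models.resnet18(weights=None).eval()  # placeholder classifier
x = torch.zeros(1, 3, 224, 224, requires_grad=True)       # optimisable input image
opt = torch.optim.Adam([x], lr=0.05)
target_class = 42                                          # hypothetical class index

for _ in range(256):
    opt.zero_grad()
    loss = -model(x)[0, target_class]   # gradient ascent on the target logit
    loss.backward()
    opt.step()

prototype = x.detach()  # a "prototype" input for the target class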
arXiv Detail & Related papers (2023-09-29T11:16:06Z)
- Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection (VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z)
- ChiroDiff: Modelling chirographic data with Diffusion Models [132.5223191478268]
We introduce the powerful model class of Denoising Diffusion Probabilistic Models (DDPMs) for chirographic data.
Our model, named "ChiroDiff", is non-autoregressive: it learns to capture holistic concepts and therefore remains resilient to higher temporal sampling rates.
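The DDPM machinery the summary refers to is standard; below is a minimal sketch of the forward noising process q(x_t | x_0) applied to point-sequence (chirographic) data. The linear noise schedule and tensor shapes are illustrative assumptions, not ChiroDiff's exact parameterisation.

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)        # linear noise schedule
abar = torch.cumprod(1.0 - betas, dim=0)     # cumulative product alpha-bar_t

def noisy_sample(x0, t):
    """Sample x_t given clean point sequences x0 of shape (B, N, 2) at step t."""
    eps = torch.randn_like(x0)
    a = abar[t].sqrt().view(-1, 1, 1)
    s = (1.0 - abar[t]).sqrt().view(-1, 1, 1)
    return a * x0 + s * eps, eps             # the denoiser is trained to predict eps

x0 = torch.randn(8, 64, 2)                   # a batch of 64-point strokes
t = torch.randint(0, T, (8,))
xt, eps = noisy_sample(x0, t)
```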
arXiv Detail & Related papers (2023-04-07T15:17:48Z)
- Molecular Property Prediction by Semantic-invariant Contrastive Learning [26.19431931932982]
We develop a Fragment-based Semantic-Invariant Contrastive Learning (FraSICL) model, built on a fragment-based view generation method, for molecular property prediction.
With the fewest pre-training samples, FraSICL achieves state-of-the-art performance compared with major existing counterpart models.
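Contrastive methods of this kind typically train with an InfoNCE/NT-Xent-style objective over pairs of semantically invariant views; a generic sketch follows. FraSICL's fragment-based view generation and molecular encoder are not shown here, and this is not necessarily its exact loss.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.1):
    """z1, z2: (B, D) embeddings of two semantically invariant views."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau            # pairwise cosine similarities
    labels = torch.arange(z1.size(0))     # positives lie on the diagonal
    return F.cross_entropy(logits, labels)

loss = info_nce(torch.randn(16, 128), torch.randn(16, 128))
```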
arXiv Detail & Related papers (2023-03-13T07:32:37Z)
- DiffusER: Discrete Diffusion via Edit-based Reconstruction [88.62707047517914]
DiffusER is an edit-based generative model for text based on denoising diffusion models.
It can rival autoregressive models on several tasks spanning machine translation, summarization, and style transfer.
It can also perform other varieties of generation that standard autoregressive models are not well-suited for.
arXiv Detail & Related papers (2022-10-30T16:55:23Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
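The first-stage penalty the summary mentions is typified by the beta-VAE objective, sketched below. The paper's first stage could use any such penalty-based method, so this is illustrative rather than their exact formulation.

```python
import torch
import torch.nn.functional as F

def beta_vae_loss(x, x_hat, mu, logvar, beta=4.0):
    """Reconstruction term plus a beta-weighted KL; beta > 1 pressures the
    (aggregate) posterior toward statistically independent latent factors,
    typically at the cost of reconstruction quality."""
    recon = F.mse_loss(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl
```

The multi-stage approach then recovers the reconstruction quality sacrificed by the penalty with a second-stage generative model.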
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- A Probabilistic Formulation of Unsupervised Text Style Transfer [128.80213211598752]
We present a deep generative model for unsupervised text style transfer that unifies previously proposed non-generative techniques.
By hypothesizing a parallel latent sequence that generates each observed sequence, our model learns to transform sequences from one domain to another in a completely unsupervised fashion.
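In equation form, the formulation can be sketched as treating the parallel sequence as latent (a hedged sketch; symbols are illustrative):

```latex
% Each observed sequence x in one domain is generated from a latent
% parallel sequence y in the other domain, with a language-model prior on y.
\[
  p(x) = \sum_{y} p_{\mathrm{LM}}(y)\, p_{\theta}(x \mid y)
\]
% The sum over latent sequences is intractable, so training maximizes a
% variational lower bound with an amortized posterior q_{\phi}(y \mid x).
```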
arXiv Detail & Related papers (2020-02-10T16:20:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.