WSRNet: Joint Spotting and Recognition of Handwritten Words
- URL: http://arxiv.org/abs/2008.07109v1
- Date: Mon, 17 Aug 2020 06:22:05 GMT
- Title: WSRNet: Joint Spotting and Recognition of Handwritten Words
- Authors: George Retsinas, Giorgos Sfikas, Petros Maragos
- Abstract summary: The proposed network is comprised of a non-recurrent CTC branch and a Seq2Seq branch that is further augmented with an Autoencoding module.
We show how to further process these representations with binarization and a retraining scheme to provide compact and highly efficient descriptors.
- Score: 38.212002652391
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we present a unified model that can handle both Keyword
Spotting and Word Recognition with the same network architecture. The proposed
network is comprised of a non-recurrent CTC branch and a Seq2Seq branch that is
further augmented with an Autoencoding module. The related joint loss leads to
a boost in recognition performance, while the Seq2Seq branch is used to create
efficient word representations. We show how to further process these
representations with binarization and a retraining scheme to provide compact
and highly efficient descriptors, suitable for keyword spotting. Numerical
results validate the usefulness of the proposed architecture, as our method
outperforms the previous state-of-the-art in keyword spotting, and provides
results in the ballpark of the leading methods for word recognition.
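The abstract mentions binarized descriptors for keyword spotting but gives no details here. The following is a minimal sketch of how compact binary word descriptors could be compared at query time. The thresholding rule, the Hamming-distance ranking, the function names (`binarize`, `hamming`, `rank_by_hamming`) and the toy values are all assumptions for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def binarize(descriptor: Sequence[float], threshold: float = 0.0) -> List[int]:
    """Map each component to 1 if it exceeds the threshold, else 0 (assumed rule)."""
    return [1 if value > threshold else 0 for value in descriptor]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def rank_by_hamming(query: Sequence[int], gallery: Sequence[Sequence[int]]) -> List[int]:
    """Return gallery indices ordered from the closest code to the farthest."""
    distances = [hamming(query, code) for code in gallery]
    return sorted(range(len(gallery)), key=lambda i: distances[i])


# Toy real-valued descriptors (made-up numbers, not results from the paper).
query = binarize([0.9, -0.2, 0.4, -0.7])
gallery = [
    binarize(d)
    for d in (
        [-0.5, 0.3, -0.1, 0.8],
        [0.7, -0.1, 0.6, -0.3],
        [0.2, 0.5, 0.1, -0.9],
    )
]
ranking = rank_by_hamming(query, gallery)
```

In this toy setup the second gallery word has the same binary code as the query, so it ranks first. Binary codes of this kind can be stored in a few bytes and compared with bitwise operations, which is one plausible reading of "compact and highly efficient".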
Related papers
- A Generative Approach for Wikipedia-Scale Visual Entity Recognition [56.55633052479446]
We address the task of mapping a given query image to one of the 6 million existing entities in Wikipedia.
We introduce a novel Generative Entity Recognition framework, which learns to auto-regressively decode a semantic and discriminative "code" identifying the target entity.
arXiv Detail & Related papers (2024-03-04T13:47:30Z)
- SimCKP: Simple Contrastive Learning of Keyphrase Representations [36.88517357720033]
We propose SimCKP, a simple contrastive learning framework that consists of two stages: 1) An extractor-generator that extracts keyphrases by learning context-aware phrase-level representations in a contrastive manner while also generating keyphrases that do not appear in the document; and 2) A reranker that adapts scores for each generated phrase by likewise aligning their representations with the corresponding document.
arXiv Detail & Related papers (2023-10-12T11:11:54Z)
- Self-Sufficient Framework for Continuous Sign Language Recognition [75.60327502570242]
The goal of this work is to develop a self-sufficient framework for Continuous Sign Language Recognition.
These include the need for complex multi-scale features such as hands, face, and mouth for understanding, and absence of frame-level annotations.
We propose Divide and Focus Convolution (DFConv) which extracts both manual and non-manual features without the need for additional networks or annotations.
DPLR propagates non-spiky frame-level pseudo-labels by combining the ground truth gloss sequence labels with the predicted sequence.
arXiv Detail & Related papers (2023-03-21T11:42:57Z)
- Using virtual edges to extract keywords from texts modeled as complex networks [0.1611401281366893]
We modeled texts as co-occurrence networks, where nodes are words and edges are established by contextual or semantical similarity.
We found that, in fact, the use of virtual edges can improve the discriminability of co-occurrence networks.
arXiv Detail & Related papers (2022-05-04T16:43:03Z)
- Instant One-Shot Word-Learning for Context-Specific Neural Sequence-to-Sequence Speech Recognition [62.997667081978825]
We present an end-to-end ASR system with a word/phrase memory and a mechanism to access this memory to recognize the words and phrases correctly.
In this paper we demonstrate that through this mechanism our system is able to recognize more than 85% of newly added words that it previously failed to recognize.
arXiv Detail & Related papers (2021-07-05T21:08:34Z)
- Unsupervised Key-phrase Extraction and Clustering for Classification Scheme in Scientific Publications [0.0]
We investigate possible ways of automating parts of the Systematic Mapping (SM) and Systematic Review (SR) process.
Key-phrases are extracted from scientific documents using unsupervised methods, which are then used to construct the corresponding Classification Scheme.
We also explore how clustering can be used to group related key-phrases.
arXiv Detail & Related papers (2021-01-25T10:17:33Z)
- Keyphrase Extraction with Dynamic Graph Convolutional Networks and Diversified Inference [50.768682650658384]
Keyphrase extraction (KE) aims to summarize a set of phrases that accurately express a concept or a topic covered in a given document.
The recent Sequence-to-Sequence (Seq2Seq) based generative framework is widely used in the KE task, and it has obtained competitive performance on various benchmarks.
In this paper, we propose to adopt the Dynamic Graph Convolutional Networks (DGCN) to solve the above two problems simultaneously.
arXiv Detail & Related papers (2020-10-24T08:11:23Z)
- Pairwise Learning for Name Disambiguation in Large-Scale Heterogeneous Academic Networks [81.00481125272098]
We introduce Multi-view Attention-based Pairwise Recurrent Neural Network (MA-PairRNN) to solve the name disambiguation problem.
MA-PairRNN combines heterogeneous graph embedding learning and pairwise similarity learning into a framework.
Results on two real-world datasets demonstrate that our framework has a significant and consistent improvement of performance on the name disambiguation task.
arXiv Detail & Related papers (2020-08-30T06:08:20Z)
- Keyphrase Extraction with Span-based Feature Representations [13.790461555410747]
Keyphrases are capable of providing semantic metadata characterizing documents.
There are three approaches to keyphrase extraction: (i) the traditional two-step ranking method, (ii) sequence labeling, and (iii) generation using neural networks.
In this paper, we propose a novel Span Keyphrase Extraction model that extracts span-based feature representations of keyphrases directly from all the content tokens.
arXiv Detail & Related papers (2020-02-13T09:48:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.