Fugu-MT Paper Translation (Abstract): Scalable handwritten text recognition system for lexicographic sources of under-resourced languages and alphabets

Paper abstract: Scalable handwritten text recognition system for lexicographic sources of under-resourced languages and alphabets

arxiv url: http://arxiv.org/abs/2303.16256v1
Date: Tue, 28 Mar 2023 19:06:27 GMT
Status: Translation complete
Last updated in system: 2023-03-30 17:01:36.851070
Title: Scalable handwritten text recognition system for lexicographic sources of under-resourced languages and alphabets
Title (reference translation): Scalable handwritten text recognition system for dictionary sources of languages and alphabets
Authors: Jan Idziak, Artjoms Šeļa, Michał Woźniak, Albert Leśniak, Joanna Byszuk, Maciej Eder
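Abstract summary: The large historical dictionary entitled the Dictionary of the 17th- and 18th-century Polish comprises 2.8 million index cards. We apply a handwritten text recognition solution that involves (1) an optimized detection model, (2) a recognition model to decipher the handwritten content, and (3) a post-processing step using constrained Word Beam Search. Our model achieved an accuracy of 0.881 on the word level, outperforming the base RCNN model.
Reference score (attention score computed by this site): 1.304892050913381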
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The paper discusses an approach to decipher large collections of handwritten index cards of historical dictionaries. Our study provides a working solution that reads the cards and links their lemmas to a searchable list of dictionary entries, for a large historical dictionary entitled the Dictionary of the 17th- and 18th-century Polish, which comprises 2.8 million index cards. We apply a tailored handwritten text recognition (HTR) solution that involves (1) an optimized detection model; (2) a recognition model to decipher the handwritten content, designed as a spatial transformer network (STN) followed by a convolutional neural network (RCNN) with a connectionist temporal classification layer (CTC), trained using a synthetic set of 500,000 generated Polish words of different lengths; (3) a post-processing step using constrained Word Beam Search (WBC): the predictions were matched against a list of dictionary entries known in advance. Our model achieved an accuracy of 0.881 on the word level, which outperforms the base RCNN model. Within this study we produced a set of 20,000 manually annotated index cards that can be used for future benchmarks and transfer learning HTR applications.
Abstract (reference translation): This paper describes a method for deciphering large collections of handwritten index cards of historical dictionaries. The study provides a working solution that reads the cards and links their lemmas to a searchable list of dictionary entries for the Dictionary of the 17th- and 18th-century Polish, which comprises 2.8 million index cards. We apply a tailored handwritten text recognition (HTR) solution that involves (1) an optimized detection model; (2) a recognition model to decipher the handwritten content, designed as a spatial transformer network (STN) followed by a convolutional neural network (RCNN) with a connectionist temporal classification layer (CTC), trained using a synthetic set of 500,000 generated Polish words of different lengths; (3) a post-processing step using constrained Word Beam Search (WBC): the predictions were matched against a list of dictionary entries known in advance. The model achieved an accuracy of 0.881 on the word level, outperforming the base RCNN model. In this study we produced a set of 20,000 manually annotated index cards that can be used for future benchmarks and transfer learning HTR applications.
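
The recognition and post-processing steps named in the abstract (CTC output matched against a list of dictionary entries known in advance) can be illustrated with a small sketch. This is not the authors' implementation: the paper uses constrained Word Beam Search, which restricts the beam during decoding, whereas the sketch below swaps in a simpler stand-in, greedy (best-path) CTC decoding followed by snapping the result to the nearest lexicon entry by edit distance. The blank index, the toy alphabet, the three-word lexicon, and all function names are assumptions made only for illustration.

```python
from typing import Dict, List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical choice)


def ctc_greedy_decode(frame_ids: Sequence[int], alphabet: Dict[int, str]) -> str:
    """Best-path CTC rule: collapse repeated labels, then drop blanks."""
    chars: List[str] = []
    previous = BLANK
    for label in frame_ids:
        if label != previous and label != BLANK:
            chars.append(alphabet[label])
        previous = label
    return "".join(chars)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a single rolling row."""
    row = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        diagonal, row[0] = row[0], i
        for j, cb in enumerate(b, start=1):
            diagonal, row[j] = row[j], min(
                row[j] + 1,             # delete a character of a
                row[j - 1] + 1,         # insert a character of b
                diagonal + (ca != cb),  # substitute, or match at no cost
            )
    return row[-1]


def snap_to_lexicon(word: str, lexicon: Sequence[str]) -> str:
    """Return the lexicon entry closest to the raw prediction."""
    return min(lexicon, key=lambda entry: edit_distance(word, entry))


def word_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predicted words that equal their reference exactly."""
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


if __name__ == "__main__":
    toy_alphabet = {1: "k", 2: "o", 3: "t"}   # hypothetical label map
    toy_lexicon = ["kot", "koń", "dom"]       # hypothetical dictionary entries
    raw = ctc_greedy_decode([1, 0, 2, 0, 2, 3], toy_alphabet)
    print(raw, "->", snap_to_lexicon(raw, toy_lexicon))
```

Matching after decoding is easier to follow than a constrained beam search, but it cannot recover from an early decoding mistake the way a lexicon-constrained beam can, so it should be read only as an illustration of why a known entry list helps word-level accuracy.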

Related papers

Cite Pretrain: Retrieval-Free Knowledge Attribution for Large Language Models [53.17363502535395]
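Trustworthy language models should provide answers that are both correct and verifiable. Current systems insert citations by querying an external retriever at inference time. This paper proposes Active Indexing, which continually pretrains on synthetic QA pairs.
Paper reference translation (metadata) (2025-06-21T04:48:05Z)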
Classification of Non-native Handwritten Characters Using Convolutional Neural Network [0.0]
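English characters written by non-native users are classified with a proposed customized CNN model. We train this CNN on a new dataset called the handwritten isolated English character dataset. A model with five convolutional layers and one hidden layer outperforms state-of-the-art models in character recognition accuracy.
Paper reference translation (metadata) (2024-06-06T21:08:07Z)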
Generative Spoken Language Model based on continuous word-sized audio tokens [52.081868603603844]
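This paper proposes a generative spoken language model based on word-size, continuous-valued audio embeddings. The resulting model is the first generative language model based on word-size continuous embeddings.
Paper reference translation (metadata) (2023-10-08T16:46:14Z)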
Integrating Bidirectional Long Short-Term Memory with Subword Embedding for Authorship Attribution [2.3429306644730854]
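Manifold word-based stylistic markers have been used successfully in deep learning methods to address the intrinsic problem of authorship attribution. The proposed method was experimentally evaluated against state-of-the-art methods on the public corpora CCAT50, IMDb62, Blog50, and Twitter50.
Paper reference translation (metadata) (2023-06-26T11:35:47Z)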
SpellMapper: A non-autoregressive neural spellchecker for ASR customization with candidate retrieval based on n-gram mappings [76.87664008338317]
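Contextual spelling correction models are an alternative to shallow fusion for improving speech recognition. We propose a new algorithm for candidate retrieval based on misspelled n-gram mappings. Experiments on Spoken Wikipedia show a 21.4% word error rate improvement over a baseline ASR system.
Paper reference translation (metadata) (2023-06-04T10:00:12Z)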
CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models [77.45934004406283]
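We systematically study decompounding, the task of splitting compound words into their constituents. We introduce a dataset of 255k compound and non-compound words across 56 languages obtained from Wiktionary. We introduce a new method for training dedicated models for decompounding.
Paper reference translation (metadata) (2023-05-23T16:32:27Z)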
Information Retrieval in long documents: Word clustering approach for improving Semantics [0.0]
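This paper proposes an alternative to deep neural networks for semantic information retrieval in long documents. The new approach leverages clustering techniques and takes word meaning into account in information retrieval systems that target both long and short documents.
Paper reference translation (metadata) (2023-02-20T18:32:57Z)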
Deep LSTM Spoken Term Detection using Wav2Vec 2.0 Recognizer [0.0]
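This paper describes a bootstrapping approach that transfers the knowledge contained in the traditional pronunciation lexicon of a DNN-HMM hybrid ASR into the context of a graph-based Wav2Vec. The proposed method outperforms previously published systems, which combine a DNN-HMM hybrid ASR with a phoneme recognizer, by a large margin on the MALACH data in both English and Czech.
Paper reference translation (metadata) (2022-10-21T11:26:59Z)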
Query Expansion Using Contextual Clue Sampling with Language Models [69.51976926838232]
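This paper proposes combining an effective filtering strategy with fusion of the retrieved documents based on the generation probability of each context. Compared with the well-established dense retrieval model DPR, our lexical-matching-based approach achieves similar top-5/top-20 retrieval accuracy and top-100 retrieval accuracy. For end-to-end QA, the reader model also benefits from our method and achieves the highest Exact-Match score against several competitive baselines.
Paper reference translation (metadata) (2022-10-13T15:18:04Z)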
Lex2Sent: A bagging approach to unsupervised sentiment analysis [0.628122931748758]
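This paper proposes Lex2Sent as a method for classifying texts. To classify a text, we train embedding models to determine the distance between document embeddings and the embeddings of a suitable lexicon. We show that this model outperforms lexica and provides a basis for a high-performing few-shot fine-tuning approach in binary sentiment analysis.
Paper reference translation (metadata) (2022-09-26T20:49:18Z)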
Autoregressive Search Engines: Generating Substrings as Document Identifiers [53.0729058170278]
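Autoregressive language models are emerging as the de facto standard for generating answers. Previous work has explored ways to partition the search space into hierarchical structures. In this work, we propose an alternative that does not force any structure on the search space: using all n-grams in a passage as its identifiers.
Paper reference translation (metadata) (2022-04-22T10:45:01Z)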
Lexically Aware Semi-Supervised Learning for OCR Post-Correction [90.54336622024299]
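Much of the existing linguistic data in many of the world's languages is locked away in non-digitized books and documents. Previous work has demonstrated the utility of neural post-correction methods for recognizing less-well-resourced languages. In this work, we propose a semi-supervised learning method that makes use of raw images.
Paper reference translation (metadata) (2021-11-04T04:39:02Z)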
Classification of Chinese Handwritten Numbers with Labeled Projective Dictionary Pair Learning [1.8594711725515674]
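We design class-specific dictionaries that incorporate three factors: discriminability, sparsity, and classification error. We adopt a new feature space, the histogram of oriented gradients (HOG), to generate the dictionary atoms. The results show improved classification performance (about 98%) compared with state-of-the-art deep learning techniques.
Paper reference translation (metadata) (2020-03-26T01:43:59Z)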

The related papers list is generated automatically from the titles and abstracts of the papers on this site.

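This page shows information on the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any outcome arising from the use of this site (including all information and translations).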