Related papers: Structured Analysis and Comparison of Alphabets in Historical Handwritten Ciphers

Structured Analysis and Comparison of Alphabets in Historical Handwritten Ciphers

URL: http://arxiv.org/abs/2410.21913v1
Date: Tue, 29 Oct 2024 10:12:16 GMT
Title: Structured Analysis and Comparison of Alphabets in Historical Handwritten Ciphers
Authors: Martín Méndez, Pau Torras, Adrià Molina, Jialuo Chen, Oriol Ramos-Terrades, Alicia Fornés,
Abstract summary: We propose the CSI metric, a novel way of comparing pairs of ciphered documents. We assess their effectiveness in an unsupervised clustering scenario utilising visual features, including SIFT, pre-trained learnt embeddings, and OCR descriptors.
Score: 3.423211639513232
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Historical ciphered manuscripts are documents that were typically used in sensitive communications within military and diplomatic contexts or among members of secret societies. These secret messages were concealed by inventing a method of writing employing symbols from diverse sources such as digits, alchemy signs and Latin or Greek characters. When studying a new, unseen cipher, the automatic search and grouping of ciphers with a similar alphabet can aid the scholar in its transcription and cryptanalysis because it indicates a probability that the underlying cipher is similar. In this study, we address this need by proposing the CSI metric, a novel way of comparing pairs of ciphered documents. We assess their effectiveness in an unsupervised clustering scenario utilising visual features, including SIFT, pre-trained learnt embeddings, and OCR descriptors.

Related papers

CipherGuard: Compiler-aided Mitigation against Ciphertext Side-channel Attacks [30.992038220253797]
CipherGuard is a compiler-aided mitigation methodology to counteract ciphertext side channels with high efficiency and security. We demonstrate that CipherGuard can strengthen the security of various cryptographic implementations more efficiently than existing state-of-the-art defense mechanism, i.e., CipherFix.
arXiv Detail & Related papers (2025-02-19T03:22:36Z)
Cryptanalysis via Machine Learning Based Information Theoretic Metrics [58.96805474751668]
We propose two novel applications of machine learning (ML) algorithms to perform cryptanalysis on any cryptosystem. These algorithms can be readily applied in an audit setting to evaluate the robustness of a cryptosystem. We show that our classification model correctly identifies the encryption schemes that are not IND-CPA secure, such as DES, RSA, and AES ECB, with high accuracy.
arXiv Detail & Related papers (2025-01-25T04:53:36Z)
Secure Semantic Communication With Homomorphic Encryption [52.5344514499035]
This paper explores the feasibility of applying homomorphic encryption to SemCom. We propose a task-oriented SemCom scheme secured through homomorphic encryption.
arXiv Detail & Related papers (2025-01-17T13:26:14Z)
The Evolution of Cryptography through Number Theory [55.2480439325792]
cryptography began around 100 years ago, its roots trace back to ancient civilizations like Mesopotamia and Egypt. This paper explores the link between early information hiding techniques and modern cryptographic algorithms like RSA.
arXiv Detail & Related papers (2024-11-11T16:27:57Z)
A Machine Learning-Based Framework for Assessing Cryptographic Indistinguishability of Lightweight Block Ciphers [1.5953412143328967]
Indistinguishability is a fundamental principle of cryptographic security, crucial for securing data transmitted between Internet of Things (IoT) devices. This research investigates the ability of machine learning (ML) in assessing indistinguishability property in encryption systems. We introduce MIND-Crypt, a novel ML-based framework designed to assess the cryptographic indistinguishability of lightweight block ciphers.
arXiv Detail & Related papers (2024-05-30T04:40:13Z)
FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs [54.27040631527217]
We propose a novel framework called FoC to Figure out the Cryptographic functions in stripped binaries. We first build a binary large language model (FoC-BinLLM) to summarize the semantics of cryptographic functions in natural language. We then build a binary code similarity model (FoC-Sim) upon the FoC-BinLLM to create change-sensitive representations and use it to retrieve similar implementations of unknown cryptographic functions in a database.
arXiv Detail & Related papers (2024-03-27T09:45:33Z)
HierCode: A Lightweight Hierarchical Codebook for Zero-shot Chinese Text Recognition [47.86479271322264]
We propose HierCode, a novel and lightweight codebook that exploits the innate hierarchical nature of Chinese characters. HierCode employs a multi-hot encoding strategy, leveraging hierarchical binary tree encoding and prototype learning to create distinctive, informative representations for each character. This approach not only facilitates zero-shot recognition of OOV characters by utilizing shared radicals and structures but also excels in line-level recognition tasks by computing similarity with visual features.
arXiv Detail & Related papers (2024-03-20T17:20:48Z)
Can a Tabula Recta provide security in the XXI century? [0.0]
I discuss how some human-computable algorithms can indeed afford sufficient security in this situation. Three kinds of algorithms are discussed: those that concentrate entropy from shared text sources, stream ciphers based on arithmetic of non-binary spaces, and hash-like algorithms that may be used to generate a password from a challenge text.
arXiv Detail & Related papers (2023-12-05T16:36:27Z)
GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher [85.18213923151717]
Experimental results show certain ciphers succeed almost 100% of the time to bypass the safety alignment of GPT-4 in several safety domains. We propose a novel SelfCipher that uses only role play and several demonstrations in natural language to evoke this capability.
arXiv Detail & Related papers (2023-08-12T04:05:57Z)
CipherSniffer: Classifying Cipher Types [0.0]
We frame the decryption task as a classification problem. We first create a dataset of transpositions, substitutions, text reversals, word reversals, sentence shifts, and unencrypted text.
arXiv Detail & Related papers (2023-06-13T20:18:24Z)
ConTextual Mask Auto-Encoder for Dense Passage Retrieval [49.49460769701308]
CoT-MAE is a simple yet effective generative pre-training method for dense passage retrieval. It learns to compress the sentence semantics into a dense vector through self-supervised and context-supervised masked auto-encoding. We conduct experiments on large-scale passage retrieval benchmarks and show considerable improvements over strong baselines.
arXiv Detail & Related papers (2022-08-16T11:17:22Z)
Enhancing Networking Cipher Algorithms with Natural Language [0.0]
Natural language processing is considered as the weakest link in a networking encryption model. This paper summarizes how languages can be integrated into symmetric encryption as a way to assist in the encryption of vulnerable streams.
arXiv Detail & Related papers (2022-06-22T09:05:52Z)
Open Set Classification of Untranscribed Handwritten Documents [56.0167902098419]
Huge amounts of digital page images of important manuscripts are preserved in archives worldwide. The class or typology'' of a document is perhaps the most important tag to be included in the metadata. The technical problem is one of automatic classification of documents, each consisting of a set of untranscribed handwritten text images.
arXiv Detail & Related papers (2022-06-20T20:43:50Z)
Deep Keyphrase Completion [59.0413813332449]
Keyphrase provides accurate information of document content that is highly compact, concise, full of meanings, and widely used for discourse comprehension, organization, and text retrieval. We propose textitkeyphrase completion (KPC) to generate more keyphrases for document (e.g. scientific publication) taking advantage of document content along with a very limited number of known keyphrases. We name it textitdeep keyphrase completion (DKPC) since it attempts to capture the deep semantic meaning of the document content together with known keyphrases via a deep learning framework
arXiv Detail & Related papers (2021-10-29T07:15:35Z)
Can Sequence-to-Sequence Models Crack Substitution Ciphers? [15.898270650875158]
State-of-the-art decipherment methods use beam search and a neural language model to score candidate hypotheses for a given cipher. We show that our proposed method can decipher text without explicit language identification and can still be robust to noise.
arXiv Detail & Related papers (2020-12-30T17:16:33Z)
A Few-shot Learning Approach for Historical Ciphered Manuscript Recognition [3.0682439731292592]
We propose a novel method for handwritten ciphers recognition based on few-shot object detection. By training on synthetic data, we show that the proposed architecture is able to recognize handwritten ciphers with unseen alphabets.
arXiv Detail & Related papers (2020-09-26T11:49:18Z)

This list is automatically generated from the titles and abstracts of the papers in this site.