Use of Metric Learning for the Recognition of Handwritten Digits, and its Application to Increase the Outreach of Voice-based Communication Platforms
- URL: http://arxiv.org/abs/2504.18948v1
- Date: Sat, 26 Apr 2025 15:14:47 GMT
- Title: Use of Metric Learning for the Recognition of Handwritten Digits, and its Application to Increase the Outreach of Voice-based Communication Platforms
- Authors: Devesh Pant, Dibyendu Talukder, Deepak Kumar, Rachit Pandey, Aaditeshwar Seth, Chetan Arora
- Abstract summary: Paper-based data collection has been argued to be more appropriate in several contexts. We provide a large dataset of handwritten digits, and deep learning based models and methods built using this data. We demonstrate the deployment of these tools in the context of a maternal and child health awareness project in north India.
- Score: 5.316686791692299
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Initiation, monitoring, and evaluation of development programmes can involve field-based data collection about project activities. Data collection through digital devices may not always be feasible, however, for reasons such as smartphones and tablets being unaffordable for field-based cadre, or shortfalls in their training and capacity building. Paper-based data collection has been argued to be more appropriate in several contexts, with automated digitization of the paper forms through OCR (Optical Character Recognition) and OMR (Optical Mark Recognition) techniques. We contribute a large dataset of handwritten digits, and deep learning based models and methods built using this data, that are effective in real-world environments. We demonstrate the deployment of these tools in the context of a maternal and child health and nutrition awareness project, which uses IVR (Interactive Voice Response) systems to provide awareness information to rural women SHG (Self Help Group) members in north India. Paper forms were used to collect phone numbers of the SHG members at scale, which were digitized using the OCR tools developed by us, and used to push almost 4 million phone calls. The data, model, and code have been released in the open-source domain.
Related papers
- Handwritten Digit Recognition: An Ensemble-Based Approach for Superior Performance [9.174021241188143]
This paper presents an ensemble-based approach that combines Convolutional Neural Networks (CNNs) with traditional machine learning techniques to improve recognition accuracy and robustness. We evaluate our method on the MNIST dataset, comprising 70,000 handwritten digit images. Our hybrid model, which uses CNNs for feature extraction and Support Vector Machines (SVMs) for classification, achieves an accuracy of 99.30%.
arXiv Detail & Related papers (2025-03-08T07:09:49Z)
- TelegramScrap: A comprehensive tool for scraping Telegram data [0.0]
TelegramScrap is a tool for extracting and analyzing data from Telegram channels and groups. This white paper outlines the tool's development, capabilities, and applications in academic and scientific research.
arXiv Detail & Related papers (2024-12-21T21:46:56Z)
- Deep Neural Network-Based Sign Language Recognition: A Comprehensive Approach Using Transfer Learning with Explainability [0.0]
We suggest a novel solution that uses a deep neural network to fully automate sign language recognition.
This methodology integrates sophisticated preprocessing methodologies to optimise the overall performance.
Our model's ability to provide informational clarity was assessed using the SHAP (SHapley Additive exPlanations) method.
arXiv Detail & Related papers (2024-09-11T17:17:44Z)
- Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts [91.3755431537592]
The massive collection of user posts across social media platforms is primarily untapped for artificial intelligence (AI) use cases.
Natural language processing (NLP) is a subfield of AI that leverages bodies of documents, known as corpora, to train computers in human-like language understanding.
This study demonstrates that the applied results of unsupervised analysis allow a computer to predict either negative, positive, or neutral user sentiment towards plastic surgery.
arXiv Detail & Related papers (2023-07-05T20:16:20Z)
- MyDigitalFootprint: an extensive context dataset for pervasive computing applications at the edge [7.310043452300736]
MyDigitalFootprint is a large-scale dataset comprising smartphone sensor data, physical proximity information, and Online Social Networks interactions.
It spans two months of measurements from 31 volunteer users in their natural environment, allowing for unrestricted behavior.
To demonstrate the dataset's effectiveness, we present three context-aware applications utilizing various machine learning tasks.
arXiv Detail & Related papers (2023-06-28T07:59:47Z)
- Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models [70.82705830137708]
We introduce Data-driven Instruction Augmentation for Language-conditioned control (DIAL).
We utilize semi-language labels leveraging the semantic understanding of CLIP to propagate knowledge onto large datasets of unlabelled demonstration data.
DIAL enables imitation learning policies to acquire new capabilities and generalize to 60 novel instructions unseen in the original dataset.
arXiv Detail & Related papers (2022-11-21T18:56:00Z)
- Semantic Segmentation of Vegetation in Remote Sensing Imagery Using Deep Learning [77.34726150561087]
We propose an approach for creating a multi-modal and large-temporal dataset comprised of publicly available Remote Sensing data.
We use Convolutional Neural Networks (CNN) models that are capable of separating different classes of vegetation.
arXiv Detail & Related papers (2022-09-28T18:51:59Z)
- Lexically Aware Semi-Supervised Learning for OCR Post-Correction [90.54336622024299]
Much of the existing linguistic data in many languages of the world is locked away in non-digitized books and documents.
Previous work has demonstrated the utility of neural post-correction methods on recognition of less-well-resourced languages.
We present a semi-supervised learning method that makes it possible to utilize raw images to improve performance.
arXiv Detail & Related papers (2021-11-04T04:39:02Z)
- Handwritten Digit Recognition using Machine and Deep Learning Algorithms [0.0]
We have performed handwritten digit recognition with the help of MNIST datasets using Support Vector Machines (SVM), Multi-Layer Perceptron (MLP) and Convolution Neural Network (CNN) models.
Our main objective is to compare the accuracy of the models stated above along with their execution time to get the best possible model for digit recognition.
arXiv Detail & Related papers (2021-06-23T18:23:01Z)
- Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition [54.92161571089808]
Cross-lingual NER transfers knowledge from rich-resource language to languages with low resources.
Existing cross-lingual NER methods do not make good use of rich unlabeled data in target languages.
We develop a novel approach based on the ideas of semi-supervised learning and reinforcement learning.
arXiv Detail & Related papers (2021-06-01T05:46:22Z)
- FedOCR: Communication-Efficient Federated Learning for Scene Text Recognition [76.26472513160425]
We study how to make use of decentralized datasets for training a robust scene text recognizer.
To make FedOCR fairly suitable to be deployed on end devices, we make two improvements including using lightweight models and hashing techniques.
arXiv Detail & Related papers (2020-07-22T14:30:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.