Base and Exponent Prediction in Mathematical Expressions using Multi-Output CNN
- URL: http://arxiv.org/abs/2407.14967v1
- Date: Sat, 20 Jul 2024 19:23:40 GMT
- Title: Base and Exponent Prediction in Mathematical Expressions using Multi-Output CNN
- Authors: Md Laraib Salam, Akash S Balsaraf, Gaurav Gupta,
- Abstract summary: This research presents a simplified yet effective approach to predicting both the base and exponent from images of mathematical expressions using a multi-output Convolutional Neural Network (CNN)
The model is trained on 10,900 synthetically generated images containing exponent expressions, incorporating random noise, font size variations, and blur intensity to simulate real-world conditions.
The experimental results indicate that the model achieves high accuracy in predicting the base and exponent values, proving the efficacy of this approach in handling noisy and varied input images.
- Score: 2.4366097951781795
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The use of neural networks and deep learning techniques in image processing has significantly advanced the field, enabling highly accurate recognition results. However, achieving high recognition rates often necessitates complex network models, which can be challenging to train and require substantial computational resources. This research presents a simplified yet effective approach to predicting both the base and exponent from images of mathematical expressions using a multi-output Convolutional Neural Network (CNN). The model is trained on 10,900 synthetically generated images containing exponent expressions, incorporating random noise, font size variations, and blur intensity to simulate real-world conditions. The proposed CNN model demonstrates robust performance with efficient training time. The experimental results indicate that the model achieves high accuracy in predicting the base and exponent values, proving the efficacy of this approach in handling noisy and varied input images.
Related papers
- Fast constrained sampling in pre-trained diffusion models [77.21486516041391]
Diffusion models have dominated the field of large, generative image models.
We propose an algorithm for fast-constrained sampling in large pre-trained diffusion models.
arXiv Detail & Related papers (2024-10-24T14:52:38Z) - Research on Optimization of Natural Language Processing Model Based on Multimodal Deep Learning [0.036651088217486416]
This project intends to study the image representation based on attention mechanism and multimodal data.
By adding multiple pattern layers to the model, the semantic and hidden layers of image content are integrated.
The word vector is quantified by the Word2Vec method and then evaluated by a word embedding convolutional neural network.
arXiv Detail & Related papers (2024-06-13T06:03:59Z) - Adversarial Masking Contrastive Learning for vein recognition [10.886119051977785]
Vein recognition has received increasing attention due to its high security and privacy.
Deep neural networks such as Convolutional neural networks (CNN) and Transformers have been introduced for vein recognition.
Despite the recent advances, existing solutions for finger-vein feature extraction are still not optimal due to scarce training image samples.
arXiv Detail & Related papers (2024-01-16T03:09:45Z) - Unsupervised Domain Transfer with Conditional Invertible Neural Networks [83.90291882730925]
We propose a domain transfer approach based on conditional invertible neural networks (cINNs)
Our method inherently guarantees cycle consistency through its invertible architecture, and network training can efficiently be conducted with maximum likelihood.
Our method enables the generation of realistic spectral data and outperforms the state of the art on two downstream classification tasks.
arXiv Detail & Related papers (2023-03-17T18:00:27Z) - Adaptive Convolutional Dictionary Network for CT Metal Artifact
Reduction [62.691996239590125]
We propose an adaptive convolutional dictionary network (ACDNet) for metal artifact reduction.
Our ACDNet can automatically learn the prior for artifact-free CT images via training data and adaptively adjust the representation kernels for each input CT image.
Our method inherits the clear interpretability of model-based methods and maintains the powerful representation ability of learning-based methods.
arXiv Detail & Related papers (2022-05-16T06:49:36Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - Meta Internal Learning [88.68276505511922]
Internal learning for single-image generation is a framework, where a generator is trained to produce novel images based on a single image.
We propose a meta-learning approach that enables training over a collection of images, in order to model the internal statistics of the sample image more effectively.
Our results show that the models obtained are as suitable as single-image GANs for many common image applications.
arXiv Detail & Related papers (2021-10-06T16:27:38Z) - Neural Knitworks: Patched Neural Implicit Representation Networks [1.0470286407954037]
We propose Knitwork, an architecture for neural implicit representation learning of natural images that achieves image synthesis.
To the best of our knowledge, this is the first implementation of a coordinate-based patch tailored for synthesis tasks such as image inpainting, super-resolution, and denoising.
The results show that modeling natural images using patches, rather than pixels, produces results of higher fidelity.
arXiv Detail & Related papers (2021-09-29T13:10:46Z) - Research on facial expression recognition based on Multimodal data
fusion and neural network [2.5431493111705943]
The algorithm is based on the multimodal data, and it takes the facial image, the histogram of oriented gradient of the image and the facial landmarks as the input.
Experimental results show that, benefiting by the complementarity of multimodal data, the algorithm has a great improvement in accuracy, robustness and detection speed.
arXiv Detail & Related papers (2021-09-26T23:45:40Z) - An application of a pseudo-parabolic modeling to texture image
recognition [0.0]
We present a novel methodology for texture image recognition using a partial differential equation modeling.
We employ the pseudo-parabolic Buckley-Leverett equation to provide a dynamics to the digital image representation and collect local descriptors from those images evolving in time.
arXiv Detail & Related papers (2021-02-09T18:08:42Z) - Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.