Review of coreference resolution in English and Persian
- URL: http://arxiv.org/abs/2211.04428v1
- Date: Tue, 8 Nov 2022 18:14:09 GMT
- Title: Review of coreference resolution in English and Persian
- Authors: Hassan Haji Mohammadi, Alireza Talebpour, Ahmad Mahmoudi Aznaveh,
Samaneh Yazdani
- Abstract summary: Coreference resolution (CR) is one of the most challenging areas of natural language processing.
This article reviews the existing corpora and evaluation metrics in this field.
Then, an overview of the coreference algorithms, from rule-based methods to the latest deep learning techniques are provided.
- Score: 5.419608513284392
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Coreference resolution (CR) is one of the most challenging areas of natural
language processing. This task seeks to identify all textual references to the
same real-world entity. Research in this field is divided into coreference
resolution and anaphora resolution. Due to its application in textual
comprehension and its utility in other tasks such as information extraction
systems, document summarization, and machine translation, this field has
attracted considerable interest. Consequently, it has a significant effect on
the quality of these systems. This article reviews the existing corpora and
evaluation metrics in this field. Then, an overview of the coreference
algorithms, from rule-based methods to the latest deep learning techniques, is
provided. Finally, coreference resolution and pronoun resolution systems in
Persian are investigated.
Related papers
- Towards Unified Multi-granularity Text Detection with Interactive Attention [56.79437272168507]
"Detect Any Text" is an advanced paradigm that unifies scene text detection, layout analysis, and document page detection into a cohesive, end-to-end model.
A pivotal innovation in DAT is the across-granularity interactive attention module, which significantly enhances the representation learning of text instances.
Tests demonstrate that DAT achieves state-of-the-art performances across a variety of text-related benchmarks.
arXiv Detail & Related papers (2024-05-30T07:25:23Z) - Understanding Cross-Lingual Alignment -- A Survey [52.572071017877704]
Cross-lingual alignment is the meaningful similarity of representations across languages in multilingual language models.
We survey the literature of techniques to improve cross-lingual alignment, providing a taxonomy of methods and summarising insights from throughout the field.
arXiv Detail & Related papers (2024-04-09T11:39:53Z) - Advancements and Challenges in Arabic Optical Character Recognition: A
Comprehensive Survey [0.6629765271909505]
This paper seeks to offer an exhaustive review of contemporary applications, methodologies, and challenges associated with Arabic Optical Character Recognition (OCR)
A thorough analysis is conducted on prevailing techniques utilized throughout the OCR process, with a dedicated effort to discern the most efficacious approaches that demonstrate enhanced outcomes.
In addition to presenting cutting-edge techniques and methods, this paper critically identifies research gaps within the realm of Arabic OCR.
arXiv Detail & Related papers (2023-12-19T03:01:31Z) - Language Model Decoding as Direct Metrics Optimization [87.68281625776282]
Current decoding methods struggle to generate texts that align with human texts across different aspects.
In this work, we frame decoding from a language model as an optimization problem with the goal of strictly matching the expected performance with human texts.
We prove that this induced distribution is guaranteed to improve the perplexity on human texts, which suggests a better approximation to the underlying distribution of human texts.
arXiv Detail & Related papers (2023-10-02T09:35:27Z) - X-PARADE: Cross-Lingual Textual Entailment and Information Divergence across Paragraphs [55.80189506270598]
X-PARADE is the first cross-lingual dataset of paragraph-level information divergences.
Annotators label a paragraph in a target language at the span level and evaluate it with respect to a corresponding paragraph in a source language.
Aligned paragraphs are sourced from Wikipedia pages in different languages.
arXiv Detail & Related papers (2023-09-16T04:34:55Z) - TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z) - Uzbek text summarization based on TF-IDF [0.0]
This article presents an experiment on summarization task for Uzbek language.
The methodology was based on text abstracting based on TF-IDF algorithm.
We summarize the given text by applying the n-gram method to important parts of the whole text.
arXiv Detail & Related papers (2023-03-01T12:39:46Z) - FDMT: A Benchmark Dataset for Fine-grained Domain Adaptation in Machine
Translation [53.87731008029645]
We present a real-world fine-grained domain adaptation task in machine translation (FDMT)
The FDMT dataset consists of four sub-domains of information technology: autonomous vehicles, AI education, real-time networks and smart phone.
We make quantitative experiments and deep analyses in this new setting, which benchmarks the fine-grained domain adaptation task.
arXiv Detail & Related papers (2020-12-31T17:15:09Z) - Method of the coherence evaluation of Ukrainian text [0.0]
Methods for text coherence measurements for Ukrainian language are analyzed.
Training and examination procedures are made on the corpus of Ukrainian texts.
Test procedure is implemented by performing of two typical tasks for the text coherence assessment.
arXiv Detail & Related papers (2020-10-31T16:48:55Z) - Natural language technology and query expansion: issues,
state-of-the-art and perspectives [0.0]
Linguistic characteristics that cause ambiguity and misinterpretation of queries as well as additional factors affect the users ability to accurately represent their information needs.
We lay down the anatomy of a generic linguistic based query expansion framework and propose its module-based decomposition.
For each of the modules we review the state-of-the-art solutions in the literature and categorized under the light of the techniques used.
arXiv Detail & Related papers (2020-04-23T11:39:07Z) - Matching Text with Deep Mutual Information Estimation [0.0]
We present a neural approach for general-purpose text matching with deep mutual information estimation incorporated.
Our approach, Text matching with Deep Info Max (TIM), is integrated with a procedure of unsupervised learning of representations.
We evaluate our text matching approach on several tasks including natural language inference, paraphrase identification, and answer selection.
arXiv Detail & Related papers (2020-03-09T15:25:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.