Towards an evolutionary-based approach for natural language processing
- URL: http://arxiv.org/abs/2004.13832v1
- Date: Thu, 23 Apr 2020 18:44:12 GMT
- Title: Towards an evolutionary-based approach for natural language processing
- Authors: Luca Manzoni, Domagoj Jakobovic, Luca Mariot, Stjepan Picek, Mauro
Castelli
- Abstract summary: We propose a first proof-of-concept that combines GP with the well-established NLP tool word2vec for the next-word prediction task.
The main idea is that, once words have been mapped into a vector space, traditional GP operators can successfully operate on vectors, thus producing meaningful words as output.
- Score: 14.760703384346984
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tasks related to Natural Language Processing (NLP) have recently been the
focus of a large research endeavor by the machine learning community. The
increased interest in this area is mainly due to the success of deep learning
methods. Genetic Programming (GP), however, has received far less attention
with respect to NLP tasks. Here, we propose a first proof-of-concept that
combines GP with the well-established NLP tool word2vec for the next-word
prediction task. The main idea is that, once words have been mapped into a
vector space, traditional GP operators can successfully operate on vectors,
thus producing meaningful words as output. To assess the suitability of this
approach, we
perform an experimental evaluation on a set of existing newspaper headlines.
Individuals resulting from this (pre-)training phase can be employed as the
initial population in other NLP tasks, like sentence generation, which will be
the focus of future investigations, possibly employing adversarial
co-evolutionary approaches.
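To make the core idea concrete, the following is a minimal sketch of GP-style
variation operators acting on word2vec embeddings. It is an illustration only,
not the authors' implementation: gensim, the toy headline corpus, and the
crossover/mutate operators below are all assumptions made for the example.

```python
# Minimal sketch (not the paper's code): embed words with word2vec, apply
# GP-style variation operators in vector space, and decode the resulting
# vector back to the nearest vocabulary word.
import random

import numpy as np
from gensim.models import Word2Vec

# Toy corpus standing in for the newspaper headlines used in the paper.
corpus = [
    ["stocks", "rise", "after", "fed", "decision"],
    ["markets", "fall", "on", "trade", "fears"],
    ["president", "signs", "new", "climate", "bill"],
]

model = Word2Vec(corpus, vector_size=50, min_count=1, seed=42)

def crossover(v1: np.ndarray, v2: np.ndarray) -> np.ndarray:
    """Blend two parent vectors: one plausible GP crossover on embeddings."""
    alpha = random.random()
    return alpha * v1 + (1.0 - alpha) * v2

def mutate(v: np.ndarray, sigma: float = 0.05) -> np.ndarray:
    """Perturb a vector with Gaussian noise: one plausible GP mutation."""
    return v + np.random.normal(0.0, sigma, size=v.shape)

# Combine two parent words in vector space...
child = mutate(crossover(model.wv["stocks"], model.wv["markets"]))

# ...and map the offspring vector back to a meaningful word.
word, similarity = model.wv.similar_by_vector(child, topn=1)[0]
print(word, similarity)
```

The decoding step (nearest neighbour in embedding space) is what allows an
arbitrary GP-evolved vector to be read out as an actual word.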
Related papers
- A survey of neural-network-based methods utilising comparable data for finding translation equivalents [0.0]
We present the most common approaches from NLP that endeavour to automatically induce one of the essential dictionary components: translation equivalents.
We analyse them from a lexicographic perspective, since lexicographers' viewpoints are crucial for improving the described methods.
This survey encourages a connection between the NLP and lexicography fields as the NLP field can benefit from lexicographic insights.
arXiv Detail & Related papers (2024-10-19T16:10:41Z)
- The Nature of NLP: Analyzing Contributions in NLP Papers [77.31665252336157]
We quantitatively investigate what constitutes NLP research by examining research papers.
Our findings reveal a rising involvement of machine learning in NLP since the early nineties.
After 2020, there has been a resurgence of focus on language and people.
arXiv Detail & Related papers (2024-09-29T01:29:28Z)
- Pre-Trained Language Models for Keyphrase Prediction: A Review [2.7869482272876622]
Keyphrase Prediction (KP) is essential for identifying keyphrases in a document that can summarize its content.
Recent advances in Natural Language Processing have produced more efficient KP models based on deep learning techniques.
This paper extensively examines the topic of pre-trained language models for keyphrase prediction (PLM-KP).
arXiv Detail & Related papers (2024-09-02T09:15:44Z)
- Natural Language Processing for Dialects of a Language: A Survey [56.93337350526933]
State-of-the-art natural language processing (NLP) models are trained on massive training corpora and report superlative performance on evaluation datasets.
This survey delves into an important attribute of these datasets: the dialect of a language.
Motivated by the performance degradation of NLP models on dialectal datasets and its implications for the equity of language technologies, we survey past research in NLP for dialects in terms of datasets and approaches.
arXiv Detail & Related papers (2024-01-11T03:04:38Z)
- Unsupervised Chunking with Hierarchical RNN [62.15060807493364]
This paper introduces an unsupervised approach to chunking, a syntactic task that involves grouping words in a non-hierarchical manner.
We present a two-layer Hierarchical Recurrent Neural Network (HRNN) designed to model word-to-chunk and chunk-to-sentence compositions.
Experiments on the CoNLL-2000 dataset reveal a notable improvement over existing unsupervised methods, enhancing phrase F1 score by up to 6 percentage points.
arXiv Detail & Related papers (2023-09-10T02:55:12Z)
- A Survey of Knowledge Enhanced Pre-trained Language Models [78.56931125512295]
We present a comprehensive review of Knowledge Enhanced Pre-trained Language Models (KE-PLMs).
For natural language understanding (NLU), we divide the types of knowledge into four categories: linguistic knowledge, text knowledge, knowledge graph (KG), and rule knowledge.
The KE-PLMs for natural language generation (NLG) are categorized into KG-based and retrieval-based methods.
arXiv Detail & Related papers (2022-11-11T04:29:02Z)
- Meta Learning for Natural Language Processing: A Survey [88.58260839196019]
Deep learning has been the mainstream technique in the natural language processing (NLP) area.
Deep learning requires large amounts of labeled data and is less generalizable across domains.
Meta-learning is an emerging field in machine learning that studies approaches to learning better algorithms.
arXiv Detail & Related papers (2022-05-03T13:58:38Z)
- Meta-Embeddings for Natural Language Inference and Semantic Similarity tasks [0.0]
Word Representations form the core component for almost all advanced Natural Language Processing (NLP) applications.
In this paper, we propose to use Meta Embeddings derived from a few State-of-the-Art (SOTA) models to efficiently tackle mainstream NLP tasks.
arXiv Detail & Related papers (2020-12-01T16:58:01Z)
- Heads-up! Unsupervised Constituency Parsing via Self-Attention Heads [27.578115452635625]
We propose a novel fully unsupervised parsing approach that extracts constituency trees from PLM attention heads.
We rank transformer attention heads based on their inherent properties, and create an ensemble of high-ranking heads to produce the final tree.
Our approach can also be used as a tool to analyze the grammars PLMs learn implicitly.
arXiv Detail & Related papers (2020-10-19T13:51:40Z)
- Task-specific Objectives of Pre-trained Language Models for Dialogue Adaptation [79.0866650271659]
The common process of utilizing pre-trained language models (PrLMs) is to first pre-train on large-scale general corpora with task-independent LM training objectives, and then fine-tune on task datasets with task-specific training objectives.
We introduce task-specific pre-training on in-domain task-related corpora with task-specific objectives.
This procedure is placed between the original two stages to enhance the model understanding capacity of specific tasks.
arXiv Detail & Related papers (2020-09-10T16:46:46Z)
- Probing the Natural Language Inference Task with Automated Reasoning Tools [6.445605125467574]
The Natural Language Inference (NLI) task is an important task in modern NLP, as it asks a broad question to which many other tasks may be reducible.
We use automated reasoning techniques to examine the logical structure of the NLI task.
We show how well a machine-oriented controlled natural language can be used to parse NLI sentences, and how well automated theorem provers can reason over the resulting formulae.
arXiv Detail & Related papers (2020-05-06T03:18:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.