A Comprehensive Survey of Sentence Representations: From the BERT Epoch
to the ChatGPT Era and Beyond
- URL: http://arxiv.org/abs/2305.12641v3
- Date: Fri, 2 Feb 2024 07:57:05 GMT
- Title: A Comprehensive Survey of Sentence Representations: From the BERT Epoch
to the ChatGPT Era and Beyond
- Authors: Abhinav Ramesh Kashyap, Thanh-Tung Nguyen, Viktor Schlegel, Stefan
Winkler, See-Kiong Ng, Soujanya Poria
- Abstract summary: Sentence representations are a critical component in NLP applications such as retrieval, question answering, and text classification.
They capture the meaning of a sentence, enabling machines to understand and reason over human language.
To date, there has been no literature review on sentence representations.
- Score: 45.455178613559006
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sentence representations are a critical component in NLP applications such as
retrieval, question answering, and text classification. They capture the
meaning of a sentence, enabling machines to understand and reason over human
language. In recent years, significant progress has been made in developing
methods for learning sentence representations, including unsupervised,
supervised, and transfer learning approaches. However, to date there has been
no literature review on sentence representations. In this paper, we provide an
overview of the different methods for sentence representation learning,
focusing mostly on deep learning models. We provide a systematic organization
of the literature, highlighting the key contributions and challenges in this
area. Overall, our review highlights the importance of this area in natural
language processing, the progress made in sentence representation learning, and
the challenges that remain. We conclude with directions for future research,
suggesting potential avenues for improving the quality and efficiency of
sentence representations.
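To make the object of study concrete, below is a minimal sketch of one common way to obtain sentence representations from a pretrained encoder: mean pooling over token states. The model choice and pooling strategy are illustrative baselines of the kind the survey covers, not a method prescribed by it.

```python
# A minimal sketch (illustrative, not prescribed by the survey) of producing
# sentence representations from a pretrained encoder via mean pooling.
# "bert-base-uncased" is an illustrative model choice.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(sentences):
    batch = tokenizer(sentences, padding=True, truncation=True,
                      return_tensors="pt")
    with torch.no_grad():
        token_states = model(**batch).last_hidden_state   # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()  # (B, T, 1)
    # Average over real tokens only; padded positions are zeroed out.
    return (token_states * mask).sum(dim=1) / mask.sum(dim=1)

embeddings = embed(["Sentence representations capture meaning.",
                    "Machines can reason over human language."])
print(embeddings.shape)  # torch.Size([2, 768])
```

Downstream applications such as retrieval or classification then operate on these fixed-size vectors, e.g. via cosine similarity or a linear classifier.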
Related papers
- CADS: A Systematic Literature Review on the Challenges of Abstractive Dialogue Summarization [7.234196390284036]
This article summarizes the research on Transformer-based abstractive summarization for English dialogues.
We cover the main challenges present in dialogue summarization (i.e., language, structure, comprehension, speaker, salience, and factuality).
We find that while some challenges, like language, have seen considerable progress, others, such as comprehension, factuality, and salience, remain difficult and hold significant research opportunities.
arXiv Detail & Related papers (2024-06-11T17:30:22Z)
- From Lengthy to Lucid: A Systematic Literature Review on NLP Techniques for Taming Long Sentences [3.4961473050660303]
Long sentences have been a persistent issue in written communication for many years.
This survey systematically reviews two main strategies for addressing the issue of long sentences.
We categorize and group the most representative methods into a comprehensive taxonomy.
arXiv Detail & Related papers (2023-12-08T16:51:29Z)
- Learning Disentangled Speech Representations [0.45060992929802207]
Disentangled representation learning from speech remains limited despite its importance in many application domains.
A key challenge is the lack of speech datasets with known generative factors against which methods can be evaluated.
This paper proposes SynSpeech, a novel synthetic speech dataset with ground-truth factors that enables research on disentangling speech representations.
arXiv Detail & Related papers (2023-11-04T04:54:17Z)
- A Neural-Symbolic Approach Towards Identifying Grammatically Correct Sentences [0.0]
Access to well-written text from valid sources is widely considered crucial for tackling challenges like text summarization, question answering, machine translation, and even pronoun resolution.
We present a simplified way to validate English sentences through a novel neural-symbolic approach.
arXiv Detail & Related papers (2023-07-16T13:21:44Z)
- RankCSE: Unsupervised Sentence Representations Learning via Learning to Rank [54.854714257687334]
We propose a novel approach, RankCSE, for unsupervised sentence representation learning.
It incorporates ranking consistency and ranking distillation with contrastive learning into a unified framework.
An extensive set of experiments is conducted on both semantic textual similarity (STS) and transfer (TR) tasks.
arXiv Detail & Related papers (2023-05-26T08:27:07Z)
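The general recipe behind entries like RankCSE can be sketched as a contrastive (InfoNCE) loss plus a ranking-distillation term. The snippet below is an illustrative reconstruction of that idea, not the authors' code; all tensors are stand-ins.

```python
# Illustrative reconstruction of the general idea, not the authors' code:
# an InfoNCE contrastive loss combined with ranking distillation, where a
# frozen teacher's similarity distribution supervises the student's ranking.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.05):
    # z1, z2: (B, H) embeddings of two views of the same sentences;
    # matching rows are positives, all other rows are in-batch negatives.
    sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / tau
    return F.cross_entropy(sim, torch.arange(z1.size(0)))

def rank_distill(student_sim, teacher_sim, tau=0.05):
    # KL divergence between teacher and student similarity distributions,
    # encouraging the student to preserve the teacher's ranking of pairs.
    p_teacher = F.softmax(teacher_sim / tau, dim=-1)
    log_p_student = F.log_softmax(student_sim / tau, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean")

B, H = 32, 768
z1, z2 = torch.randn(B, H), torch.randn(B, H)   # stand-in student embeddings
t1, t2 = torch.randn(B, H), torch.randn(B, H)   # stand-in teacher embeddings
student_sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1)
teacher_sim = F.cosine_similarity(t1.unsqueeze(1), t2.unsqueeze(0), dim=-1)
loss = info_nce(z1, z2) + rank_distill(student_sim, teacher_sim)
print(loss.item())
```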
- An Inclusive Notion of Text [69.36678873492373]
We argue that clarity on the notion of text is crucial for reproducible and generalizable NLP.
We introduce a two-tier taxonomy of linguistic and non-linguistic elements that are available in textual sources and can be used in NLP modeling.
arXiv Detail & Related papers (2022-11-10T14:26:43Z)
- Sentence Representation Learning with Generative Objective rather than Contrastive Objective [86.01683892956144]
We propose a novel generative self-supervised learning objective based on phrase reconstruction.
Our generative objective yields strong performance improvements and outperforms current state-of-the-art contrastive methods.
arXiv Detail & Related papers (2022-10-16T07:47:46Z)
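For intuition, a generative objective of this kind can be caricatured as: encode the sentence, then ask a decoder to reproduce a held-out phrase from the sentence embedding alone. The toy below is a hedged simplification with made-up sizes, not the paper's architecture.

```python
# Toy caricature (hedged, not the paper's architecture) of a phrase-
# reconstruction objective: a sentence embedding alone must let a decoder
# predict a held-out phrase. All sizes and the held-out span are made up,
# and the masking step is only simulated for brevity.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, hidden, phrase_len = 1000, 256, 4
encoder = nn.Embedding(vocab, hidden)            # stand-in sentence encoder
decoder = nn.Linear(hidden, phrase_len * vocab)  # predicts the held-out phrase

tokens = torch.randint(0, vocab, (8, 16))        # batch of token ids
sent_emb = encoder(tokens).mean(dim=1)           # (8, hidden) sentence vectors
target = tokens[:, 5:5 + phrase_len]             # span to be reconstructed

logits = decoder(sent_emb).view(8, phrase_len, vocab)
loss = F.cross_entropy(logits.transpose(1, 2), target)  # (N, C, L) vs (N, L)
loss.backward()
```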
- From Show to Tell: A Survey on Image Captioning [48.98681267347662]
Connecting Vision and Language plays an essential role in Generative Intelligence.
Research in image captioning has not yet converged on a definitive approach.
This work aims at providing a comprehensive overview and categorization of image captioning approaches.
arXiv Detail & Related papers (2021-07-14T18:00:54Z)
- Narrative Incoherence Detection [76.43894977558811]
We propose the task of narrative incoherence detection as a new arena for inter-sentential semantic understanding.
Given a multi-sentence narrative, the task is to decide whether any semantic discrepancies exist in the narrative flow.
arXiv Detail & Related papers (2020-12-21T07:18:08Z)
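Tasks like narrative incoherence detection illustrate how the sentence representations surveyed above can be reused for inter-sentential reasoning. The sketch below frames the task as binary classification over adjacent sentence embeddings; the classifier head and all sizes are illustrative assumptions, not the paper's method.

```python
# Hedged sketch, not the paper's method: framing narrative incoherence
# detection as binary classification over adjacent sentence embeddings.
import torch
import torch.nn as nn

hidden = 768
scorer = nn.Sequential(nn.Linear(2 * hidden, 256), nn.ReLU(),
                       nn.Linear(256, 2))       # illustrative classifier head

def incoherence_logits(sent_embs):
    # sent_embs: (num_sentences, hidden), e.g. from a pretrained encoder.
    pairs = torch.cat([sent_embs[:-1], sent_embs[1:]], dim=-1)
    return scorer(pairs)  # one coherent/incoherent logit pair per adjacency

narrative = torch.randn(5, hidden)          # stand-in sentence embeddings
print(incoherence_logits(narrative).shape)  # torch.Size([4, 2])
```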
This list is automatically generated from the titles and abstracts of the papers on this site.