Sentence Compression as Deletion with Contextual Embeddings
- URL: http://arxiv.org/abs/2006.03210v1
- Date: Fri, 5 Jun 2020 02:40:46 GMT
- Title: Sentence Compression as Deletion with Contextual Embeddings
- Authors: Minh-Tien Nguyen and Bui Cong Minh and Dung Tien Le and Le Thai Linh
- Abstract summary: We exploit contextual embeddings that enable our model to capture the context of inputs.
Experimental results on a benchmark Google dataset show that by utilizing contextual embeddings, our model achieves a new state-of-the-art F-score.
- Score: 3.3263205689999444
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sentence compression is the task of creating a shorter version of an input
sentence while keeping important information. In this paper, we extend the task
of compression by deletion with the use of contextual embeddings. Unlike
prior work, which typically uses non-contextual embeddings (GloVe or
Word2Vec), we exploit contextual embeddings that enable our model to capture
the context of its inputs. More precisely, we stack contextual embeddings
with a bidirectional Long Short-Term Memory (BiLSTM) network and a
Conditional Random Field (CRF) layer for sequence labeling. Experimental
results on a benchmark Google dataset show that by utilizing contextual
embeddings, our model achieves a new state-of-the-art F-score compared to
strong methods reported on the leaderboard.
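The deletion-based formulation reduces compression to per-token binary labeling (keep vs. delete), after which the shorter sentence is read off directly from the predicted labels. A minimal sketch of that read-off step, with illustrative names and labels (the paper's actual tagger, a contextual-embedding BiLSTM-CRF, is not reproduced here):

```python
# Sentence compression as deletion: given one KEEP/DEL label per token
# (as produced by a sequence-labeling model), keep only the KEEP tokens.
# Function name and label strings are illustrative, not from the paper.

def compress(tokens, labels):
    """Return the compressed sentence: the tokens whose label is 'KEEP'."""
    assert len(tokens) == len(labels), "one label per token"
    return [tok for tok, lab in zip(tokens, labels) if lab == "KEEP"]

tokens = ["The", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"]
labels = ["KEEP", "DEL", "DEL", "KEEP", "KEEP", "KEEP", "KEEP", "DEL", "KEEP"]
print(" ".join(compress(tokens, labels)))  # The fox jumps over the dog
```

In the model itself, the labels come from Viterbi decoding over the CRF layer on top of the BiLSTM states, so neighboring keep/delete decisions are scored jointly rather than independently.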
Related papers
- Fine-grained Controllable Text Generation through In-context Learning with Feedback [57.396980277089135]
We present a method for rewriting an input sentence to match specific values of nontrivial linguistic features, such as dependency depth.
In contrast to earlier work, our method uses in-context learning rather than finetuning, making it applicable in use cases where data is sparse.
arXiv Detail & Related papers (2024-06-17T08:55:48Z)
- Relational Sentence Embedding for Flexible Semantic Matching [86.21393054423355]
We present Relational Sentence Embedding (RSE), a new paradigm for further exploring the potential of sentence embeddings.
RSE is effective and flexible in modeling sentence relations and outperforms a series of state-of-the-art embedding methods.
arXiv Detail & Related papers (2022-12-17T05:25:17Z)
- Unsupervised Matching of Data and Text [6.2520079463149205]
We introduce a framework that supports matching textual content and structured data in an unsupervised setting.
Our method builds a fine-grained graph over the content of the corpora and derives word embeddings to represent the objects to match in a low dimensional space.
Experiments on real use cases and public datasets show that our framework produces embeddings that outperform word embeddings and fine-tuned language models.
arXiv Detail & Related papers (2021-12-16T10:40:48Z)
- Clustering and Network Analysis for the Embedding Spaces of Sentences and Sub-Sentences [69.3939291118954]
This paper reports research on a set of comprehensive clustering and network analyses targeting sentence and sub-sentence embedding spaces.
Results show that one method generates the most clusterable embeddings.
In general, the embeddings of span sub-sentences have better clustering properties than the original sentences.
arXiv Detail & Related papers (2021-10-02T00:47:35Z)
- Text Ranking and Classification using Data Compression [1.332560004325655]
We propose a language-agnostic approach to text categorization.
We use the Zstandard compressor and strengthen these ideas in several ways, calling the resulting technique Zest.
We show that Zest complements and can compete with language-specific multidimensional content embeddings in production, but cannot outperform other counting methods on public datasets.
arXiv Detail & Related papers (2021-09-23T18:13:17Z)
- A Condense-then-Select Strategy for Text Summarization [53.10242552203694]
We propose a novel condense-then-select framework for text summarization.
Our framework helps to avoid the loss of salient information, while preserving the high efficiency of sentence-level compression.
arXiv Detail & Related papers (2021-06-19T10:33:10Z)
- A Comparative Study on Structural and Semantic Properties of Sentence Embeddings [77.34726150561087]
We propose a set of experiments using a widely-used large-scale data set for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z)
- Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation [50.01708049531156]
We focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer.
In detail, the input is a set of structured records and a reference text for describing another recordset.
The output is a summary that accurately describes the partial content in the source recordset with the same writing style of the reference.
arXiv Detail & Related papers (2020-02-24T12:52:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.