Sentence Compression as Deletion with Contextual Embeddings
- URL: http://arxiv.org/abs/2006.03210v1
- Date: Fri, 5 Jun 2020 02:40:46 GMT
- Title: Sentence Compression as Deletion with Contextual Embeddings
- Authors: Minh-Tien Nguyen and Bui Cong Minh and Dung Tien Le and Le Thai Linh
- Abstract summary: We exploit contextual embeddings that enable our model to capture the context of inputs.
Experimental results on a benchmark Google dataset show that by utilizing contextual embeddings, our model achieves a new state-of-the-art F-score.
- Score: 3.3263205689999444
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sentence compression is the task of creating a shorter version of an input
sentence while keeping important information. In this paper, we extend the task
of compression by deletion with the use of contextual embeddings. Unlike prior
work, which typically uses non-contextual embeddings (GloVe or Word2Vec), we
exploit contextual embeddings that enable our model to capture the context of
inputs. More precisely, we stack contextual embeddings with a bidirectional
Long Short-Term Memory (BiLSTM) network and Conditional Random Fields (CRF) to
handle sequence labeling. Experimental results on a benchmark Google dataset
show that by utilizing contextual embeddings, our model achieves a new
state-of-the-art F-score compared to strong methods reported on the
leaderboard.
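As a rough illustration of the deletion-as-sequence-labeling setup described in the abstract, the sketch below (not the authors' code) feeds pre-computed contextual token embeddings through a BiLSTM and predicts a keep/delete tag per token; the paper's CRF output layer is simplified here to an independent per-token classifier to keep the example self-contained.

```python
# Minimal sketch (not the authors' implementation) of deletion-based
# compression as sequence labeling: contextual embeddings -> BiLSTM ->
# per-token keep/delete scores.
import torch
import torch.nn as nn

class DeletionTagger(nn.Module):
    def __init__(self, emb_dim=768, hidden=256, num_tags=2):
        super().__init__()
        # emb_dim matches the contextual encoder (e.g. 768 for BERT-base),
        # whose token vectors are computed upstream and passed to forward().
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hidden, num_tags)  # tag 0 = delete, tag 1 = keep

    def forward(self, embeddings):
        # embeddings: (batch, seq_len, emb_dim) contextual token vectors
        states, _ = self.lstm(embeddings)
        return self.proj(states)  # (batch, seq_len, num_tags) emission scores

# Toy usage: random vectors stand in for real contextual embeddings.
model = DeletionTagger()
sentence = torch.randn(1, 10, 768)      # one sentence of 10 tokens
logits = model(sentence)
keep = logits.argmax(dim=-1)            # 1 = token kept in the compression
print(keep)
```

Compression then amounts to outputting exactly the tokens tagged as "keep".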
Related papers
- Contextual Document Embeddings [77.22328616983417]
We propose two complementary methods for contextualized document embeddings.
First, an alternative contrastive learning objective that explicitly incorporates the document neighbors into the intra-batch contextual loss.
Second, a new contextual architecture that explicitly encodes neighbor document information into the encoded representation.
arXiv Detail & Related papers (2024-10-03T14:33:34Z)
- Late Chunking: Contextual Chunk Embeddings Using Long-Context Embedding Models [5.330795983408874]
We introduce a novel method called late chunking, which leverages long context embedding models to first embed all tokens of the long text.
The resulting chunk embeddings capture the full contextual information, leading to superior results across various retrieval tasks.
arXiv Detail & Related papers (2024-09-07T03:54:46Z)
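A minimal sketch of the late-chunking idea above, under the assumption that the long-context encoder's token embeddings are already available: chunk vectors are obtained by pooling the contextualized token embeddings over each chunk's span, rather than re-encoding each chunk in isolation.

```python
# Minimal sketch (assumed, not the paper's code) of late chunking: embed the
# whole document once, then pool the already-contextualized token embeddings
# inside each chunk's token span.
import torch

def late_chunk(token_embeddings: torch.Tensor, spans: list[tuple[int, int]]) -> torch.Tensor:
    """token_embeddings: (seq_len, dim) output of a long-context encoder.
    spans: [(start, end), ...] token-index ranges defining the chunks."""
    return torch.stack([token_embeddings[s:e].mean(dim=0) for s, e in spans])

# Toy usage: 1000 "tokens" from a 128-dim encoder, split into four chunks.
doc = torch.randn(1000, 128)
chunks = late_chunk(doc, [(0, 250), (250, 500), (500, 750), (750, 1000)])
print(chunks.shape)  # torch.Size([4, 128])
```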
- Fine-grained Controllable Text Generation through In-context Learning with Feedback [57.396980277089135]
We present a method for rewriting an input sentence to match specific values of nontrivial linguistic features, such as dependency depth.
In contrast to earlier work, our method uses in-context learning rather than finetuning, making it applicable in use cases where data is sparse.
arXiv Detail & Related papers (2024-06-17T08:55:48Z)
- Relational Sentence Embedding for Flexible Semantic Matching [86.21393054423355]
We present Relational Sentence Embedding (RSE), a new paradigm for further exploring the potential of sentence embeddings.
RSE is effective and flexible in modeling sentence relations and outperforms a series of state-of-the-art embedding methods.
arXiv Detail & Related papers (2022-12-17T05:25:17Z)
- Text Ranking and Classification using Data Compression [1.332560004325655]
We propose a language-agnostic approach to text categorization.
We use the Zstandard compressor and strengthen these ideas in several ways, calling the resulting technique Zest.
We show that Zest complements and can compete with language-specific multidimensional content embeddings in production, but cannot outperform other counting methods on public datasets.
arXiv Detail & Related papers (2021-09-23T18:13:17Z)
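To make the compression-based categorization idea above concrete, here is a hedged sketch of nearest-neighbor classification by normalized compression distance; it is not the Zest implementation, and Python's standard zlib compressor stands in for the Zstandard compressor the paper uses.

```python
# Minimal sketch (not Zest) of compression-based text categorization using
# normalized compression distance (NCD); zlib stands in for Zstandard.
import zlib

def clen(text: str) -> int:
    # Compressed length as a rough proxy for information content.
    return len(zlib.compress(text.encode("utf-8")))

def ncd(a: str, b: str) -> float:
    # Texts that share structure compress better together than apart.
    ca, cb, cab = clen(a), clen(b), clen(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)

def classify(query: str, labeled: list[tuple[str, str]]) -> str:
    # Nearest neighbor over (text, label) examples by compression distance.
    return min(labeled, key=lambda ex: ncd(query, ex[0]))[1]

examples = [("the match ended with a late goal by the striker", "sports"),
            ("the new phone ships with a faster chip and camera", "tech")]
print(classify("the keeper saved a penalty before the final goal", examples))
```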
- A Condense-then-Select Strategy for Text Summarization [53.10242552203694]
We propose a novel condense-then-select framework for text summarization.
Our framework helps to avoid the loss of salient information, while preserving the high efficiency of sentence-level compression.
arXiv Detail & Related papers (2021-06-19T10:33:10Z)
- A Comparative Study on Structural and Semantic Properties of Sentence Embeddings [77.34726150561087]
We propose a set of experiments using a widely-used large-scale data set for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z)
- Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation [50.01708049531156]
We focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer.
In detail, the input is a set of structured records and a reference text for describing another recordset.
The output is a summary that accurately describes the partial content in the source recordset with the same writing style of the reference.
arXiv Detail & Related papers (2020-02-24T12:52:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.