Hyperbolic sentence representations for solving Textual Entailment
- URL: http://arxiv.org/abs/2406.15472v1
- Date: Sat, 15 Jun 2024 15:39:43 GMT
- Title: Hyperbolic sentence representations for solving Textual Entailment
- Authors: Igor Petrovski
- Abstract summary: We use the Poincaré ball to embed sentences with the goal of proving how hyperbolic spaces can be used for solving Textual Entailment.
We evaluate against baselines of various backgrounds, including LSTMs, Order Embeddings and Euclidean Averaging.
We consistently outperform the baselines on the SICK dataset and are second only to Order Embeddings on the SNLI dataset.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hyperbolic spaces have proven to be suitable for modeling data of hierarchical nature. As such, we use the Poincaré ball to embed sentences with the goal of proving how hyperbolic spaces can be used for solving Textual Entailment. To this end, apart from the standard datasets used for evaluating textual entailment, we developed two additional datasets. We evaluate against baselines of various backgrounds, including LSTMs, Order Embeddings and Euclidean Averaging, which comes as a natural counterpart to representing sentences in Euclidean space. We consistently outperform the baselines on the SICK dataset and are second only to Order Embeddings on the SNLI dataset, for the binary classification version of the entailment task.
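To make the geometry mentioned in the abstract concrete, here is a minimal sketch of the standard geodesic distance on the unit Poincaré ball. It is not taken from the paper: the function name `poincare_distance` and the sample points are hypothetical, and the paper's actual sentence-embedding and scoring procedure is not described in this listing. The sketch only illustrates why the space suits hierarchies: distances grow rapidly as points approach the boundary of the ball.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincare ball.

    Uses d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2))).
    The function name and interface are illustrative assumptions.
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))


# Hypothetical sample points: two near the center, two near the boundary.
center_a, center_b = [0.1, 0.0], [0.0, 0.1]
edge_a, edge_b = [0.9, 0.0], [0.0, 0.9]

# The Euclidean gap between the edge pair is 9x the center pair's gap,
# but the hyperbolic gap is far larger than 9x.
print(poincare_distance(center_a, center_b))
print(poincare_distance(edge_a, edge_b))
```

For a point at Euclidean radius r on a line through the origin, this distance reduces to 2·artanh(r), which diverges as r approaches 1; that unbounded room near the boundary is what allows tree-like structures to be embedded with low distortion.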
Related papers
- Triples-to-isiXhosa (T2X): Addressing the Challenges of Low-Resource Agglutinative Data-to-Text Generation [9.80836683456026]
We tackle data-to-text for isiXhosa, which is low-resource and agglutinative.
We introduce Triples-to-isiXhosa (T2X), a new dataset based on a subset of WebNLG.
We develop an evaluation framework for T2X that measures how accurately generated text describes the data.
arXiv Detail & Related papers (2024-03-12T11:53:27Z)
- UFineBench: Towards Text-based Person Retrieval with Ultra-fine Granularity [50.91030850662369]
Existing text-based person retrieval datasets often have relatively coarse-grained text annotations.
This hinders the model's ability to comprehend the fine-grained semantics of query texts in real scenarios.
We contribute a new benchmark named UFineBench for text-based person retrieval with ultra-fine granularity.
arXiv Detail & Related papers (2023-12-06T11:50:14Z)
- Federated Classification in Hyperbolic Spaces via Secure Aggregation of Convex Hulls [35.327709607897944]
We develop distributed versions of convex SVM classifiers for Poincaré discs.
We compute the complexity of the convex hulls in hyperbolic spaces to assess the extent of data leakage.
We test our method on a collection of diverse data sets, including hierarchical single-cell RNA-seq data from different patients distributed across different repositories.
arXiv Detail & Related papers (2023-08-14T02:25:48Z)
- On the Use of Context for Predicting Citation Worthiness of Sentences in Scholarly Articles [10.28696219236292]
We formulate this problem as a sequence labeling task solved using a hierarchical BiLSTM model.
We contribute a new benchmark dataset containing over two million sentences and their corresponding labels.
Our results quantify the benefits of using context and contextual embeddings for citation worthiness.
arXiv Detail & Related papers (2021-04-18T21:47:30Z)
- Aligning Hyperbolic Representations: an Optimal Transport-based approach [0.0]
This work proposes a novel approach based on optimal transport (OT) of embeddings on the Poincaré model of hyperbolic spaces.
As a result of this formalism, we derive extensions to some existing Euclidean methods of OT-based domain adaptation to their hyperbolic counterparts.
arXiv Detail & Related papers (2020-12-02T11:22:19Z)
- GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing.
We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar.
To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z)
- A Comparative Study on Structural and Semantic Properties of Sentence Embeddings [77.34726150561087]
We propose a set of experiments using a widely-used large-scale data set for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z)
- Rethinking Positional Encoding in Language Pre-training [111.2320727291926]
We show that in absolute positional encoding, the addition operation applied on positional embeddings and word embeddings brings mixed correlations.
We propose a new positional encoding method called Transformer with Untied Positional Encoding (TUPE).
arXiv Detail & Related papers (2020-06-28T13:11:02Z)
- ToTTo: A Controlled Table-To-Text Generation Dataset [61.83159452483026]
ToTTo is an open-domain English table-to-text dataset with over 120,000 training examples.
We introduce a dataset construction process where annotators directly revise existing candidate sentences from Wikipedia.
While usually fluent, existing methods often hallucinate phrases that are not supported by the table.
arXiv Detail & Related papers (2020-04-29T17:53:45Z)
- Extractive Summarization as Text Matching [123.09816729675838]
This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems.
We formulate the extractive summarization task as a semantic text matching problem.
We have driven the state-of-the-art extractive result on CNN/DailyMail to a new level (44.41 in ROUGE-1).
arXiv Detail & Related papers (2020-04-19T08:27:57Z)