StrAE: Autoencoding for Pre-Trained Embeddings using Explicit Structure
- URL: http://arxiv.org/abs/2305.05588v2
- Date: Wed, 25 Oct 2023 15:30:59 GMT
- Title: StrAE: Autoencoding for Pre-Trained Embeddings using Explicit Structure
- Authors: Mattia Opper, Victor Prokhorov, N. Siddharth
- Abstract summary: StrAE is a Structured Autoencoder framework that, through strict adherence to explicit structure, enables effective learning of multi-level representations.
We show that our results are directly attributable to the informativeness of the structure provided as input, and that this is not the case for existing tree models.
We then extend StrAE to allow the model to define its own compositions using a simple localised-merge algorithm.
- Score: 5.2869308707704255
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work presents StrAE: a Structured Autoencoder framework that, through strict adherence to explicit structure and use of a novel contrastive objective over tree-structured representations, enables effective learning of multi-level representations. Through comparison over different forms of structure, we verify that our results are directly attributable to the informativeness of the structure provided as input, and show that this is not the case for existing tree models. We then further extend StrAE to allow the model to define its own compositions using a simple localised-merge algorithm. This variant, called Self-StrAE, outperforms baselines that don't involve explicit hierarchical compositions, and is comparable to models given informative structure (e.g. constituency parses). Our experiments are conducted in a data-constrained (circa 10M tokens) setting to help tease apart the contribution of the inductive bias to effective learning. However, we find that this framework can be robust to scale, and when extended to a much larger dataset (circa 100M tokens), our 430-parameter model performs comparably to a 6-layer RoBERTa many orders of magnitude larger in size. Our findings support the utility of incorporating explicit composition as an inductive bias for effective representation learning.
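The abstract's most concrete algorithmic detail is the Self-StrAE localised merge. Below is a minimal, hedged sketch of that mechanic in Python: adjacent nodes are greedily fused by similarity, and a composition function builds a parent embedding from each merged pair. The random composition weights, the cosine-similarity merge criterion, and the greedy rule are illustrative assumptions; in the paper the composition (and a mirrored decomposition for the autoencoding objective) is learned jointly with the contrastive loss over all tree nodes.

```python
# Hedged sketch of a Self-StrAE-style localised merge: repeatedly fuse the
# most similar pair of adjacent nodes, composing embeddings bottom-up. The
# composition weights here are random and untrained, for illustration only.
import numpy as np

rng = np.random.default_rng(0)
DIM = 16
W = rng.normal(scale=0.1, size=(DIM, 2 * DIM))  # hypothetical composition weights


def compose(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Merge two child embeddings into one parent embedding (sketch)."""
    parent = np.tanh(W @ np.concatenate([left, right]))
    return parent / np.linalg.norm(parent)


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def self_strae_merge(embeddings: list[np.ndarray]) -> tuple:
    """Greedy localised merge: always fuse the most similar adjacent pair,
    yielding a binary tree and a root embedding for the whole sequence."""
    nodes = [(e, i) for i, e in enumerate(embeddings)]  # (embedding, tree)
    while len(nodes) > 1:
        sims = [cosine(nodes[i][0], nodes[i + 1][0]) for i in range(len(nodes) - 1)]
        i = int(np.argmax(sims))  # most similar adjacent pair
        merged = (compose(nodes[i][0], nodes[i + 1][0]),
                  (nodes[i][1], nodes[i + 1][1]))
        nodes[i:i + 2] = [merged]
    return nodes[0]  # (root embedding, nested-tuple tree)


tokens = ["the", "cat", "sat", "down"]
embs = [rng.normal(size=DIM) for _ in tokens]
root, tree = self_strae_merge(embs)
print("induced tree:", tree)  # a nested tuple over token indices
```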
Related papers
- Learning to Model Graph Structural Information on MLPs via Graph Structure Self-Contrasting [50.181824673039436]
We propose a Graph Structure Self-Contrasting (GSSC) framework that learns graph structural information without message passing.
The proposed framework is based purely on Multi-Layer Perceptrons (MLPs), where the structural information is only implicitly incorporated as prior knowledge.
It first applies structural sparsification to remove potentially uninformative or noisy edges in the neighborhood, and then performs structural self-contrasting in the sparsified neighborhood to learn robust node representations.
arXiv Detail & Related papers (2024-09-09T12:56:02Z)
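A hedged sketch of the two GSSC steps just described, structural sparsification followed by structural self-contrasting, is below. The plain-MLP encoder, the similarity threshold, and the InfoNCE-style loss are illustrative assumptions rather than the paper's exact design.

```python
# Hedged sketch: (1) drop low-similarity edges, (2) contrast each node's MLP
# embedding against its surviving neighbours (positives) and random nodes
# (negatives). No message passing is used; structure enters only via the loss.
import numpy as np

rng = np.random.default_rng(1)
N, D, H = 6, 8, 4
X = rng.normal(size=(N, D))                       # node features
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]
W1, W2 = rng.normal(size=(D, H)), rng.normal(size=(H, H))


def mlp(x):  # message-passing-free encoder: a plain per-node MLP
    return np.maximum(x @ W1, 0) @ W2


def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)


Z = mlp(X)

# (1) structural sparsification: keep only edges with similar endpoints
kept = [(u, v) for u, v in edges if cos(Z[u], Z[v]) > 0.0]  # threshold assumed

# (2) structural self-contrasting: neighbour = positive, random node = negative
loss = 0.0
for u, v in kept:
    neg = rng.integers(N)
    loss += -np.log(np.exp(cos(Z[u], Z[v])) /
                    (np.exp(cos(Z[u], Z[v])) + np.exp(cos(Z[u], Z[neg]))))
print(f"kept {len(kept)}/{len(edges)} edges, contrastive loss {loss:.3f}")
```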
- Learning to Extract Structured Entities Using Language Models [52.281701191329]
Recent advances in machine learning have significantly impacted the field of information extraction.
We reformulate the task to be entity-centric, enabling the use of diverse metrics.
We contribute to the field by introducing Structured Entity Extraction and proposing the Approximate Entity Set OverlaP metric.
arXiv Detail & Related papers (2024-02-06T22:15:09Z)
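The Approximate Entity Set OverlaP (AESOP) metric compares a predicted set of structured entities against a reference set. Its exact definition lives in the paper; the sketch below assumes a simplified variant: greedy one-to-one matching of entities by Jaccard overlap of their (property, value) pairs, averaged over the larger set.

```python
# Hedged, simplified stand-in for an entity-set overlap metric: greedily match
# predicted and gold entities by property overlap; unmatched entities score 0.
def entity_sim(a: dict, b: dict) -> float:
    pa, pb = set(a.items()), set(b.items())
    return len(pa & pb) / len(pa | pb) if pa | pb else 0.0


def approx_set_overlap(pred: list[dict], gold: list[dict]) -> float:
    """Greedy one-to-one matching on descending pairwise similarity."""
    pairs = sorted(((entity_sim(p, g), i, j)
                    for i, p in enumerate(pred)
                    for j, g in enumerate(gold)), reverse=True)
    used_p, used_g, total = set(), set(), 0.0
    for s, i, j in pairs:
        if i not in used_p and j not in used_g:
            used_p.add(i); used_g.add(j); total += s
    return total / max(len(pred), len(gold), 1)


pred = [{"type": "person", "name": "Ada Lovelace"}]
gold = [{"type": "person", "name": "Ada Lovelace", "born": "1815"}]
print(approx_set_overlap(pred, gold))  # 2/3 under this simplified similarity
```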
- Promptly Predicting Structures: The Return of Inference [31.442123334313035]
We present a framework for constructing zero- and few-shot linguistic structure predictors.
Our results show that enforcing consistency not only constructs structurally valid outputs, but also improves performance.
arXiv Detail & Related papers (2024-01-12T20:08:39Z)
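To make the "return of inference" concrete, here is a minimal sketch of constrained decoding: candidate structures scored by a prompted LLM are filtered to those satisfying a structural constraint, and the highest-scoring valid one is returned. The span-labelling setting, the no-overlap constraint, and the scores are hypothetical examples, not the paper's exact setup.

```python
# Hedged sketch: decode a structurally valid output from scored candidates
# instead of trusting raw LLM labels.
def non_overlapping(spans):
    spans = sorted(spans)
    return all(a_end <= b_start for (_, a_end), (b_start, _) in zip(spans, spans[1:]))


def best_valid_structure(candidates):
    """Pick the highest-scoring candidate that satisfies the constraint."""
    valid = [c for c in candidates if non_overlapping(c["spans"])]
    return max(valid, key=lambda c: c["score"], default=None)


candidates = [
    {"spans": [(0, 3), (2, 5)], "score": 0.9},   # invalid: spans overlap
    {"spans": [(0, 2), (3, 5)], "score": 0.8},   # valid
]
print(best_valid_structure(candidates))  # the valid 0.8 candidate wins
```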
- StructGPT: A General Framework for Large Language Model to Reason over Structured Data [117.13986738340027]
We develop an Iterative Reading-then-Reasoning (IRR) approach for solving question-answering tasks based on structured data.
Our approach can significantly boost the performance of ChatGPT and achieve performance comparable to full-data supervised-tuning baselines.
arXiv Detail & Related papers (2023-05-16T17:45:23Z)
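A hedged sketch of an iterative reading-then-reasoning loop over a toy table follows. The llm() stub, the NEED/ANSWER protocol, and the keyword interface are hypothetical stand-ins; StructGPT's actual interfaces to tables, knowledge graphs, and databases are more elaborate.

```python
# Hedged sketch: alternately "read" (fetch rows through an interface) and
# "reason" (ask the model to answer or request more evidence).
TABLE = [
    {"country": "France", "capital": "Paris"},
    {"country": "Japan", "capital": "Tokyo"},
]


def read_interface(keyword: str) -> list[dict]:
    """Reading step: return rows mentioning the keyword."""
    return [row for row in TABLE if keyword.lower() in str(row).lower()]


def llm(prompt: str) -> str:
    """Hypothetical LLM stub; a real system would call a chat model here."""
    if "Paris" in prompt:
        return "ANSWER: Paris"
    return "NEED: France"


def iterative_read_reason(question: str, max_steps: int = 3) -> str:
    evidence: list[dict] = []
    for _ in range(max_steps):
        reply = llm(f"Q: {question}\nEvidence: {evidence}")
        if reply.startswith("ANSWER:"):
            return reply.removeprefix("ANSWER:").strip()
        evidence += read_interface(reply.removeprefix("NEED:").strip())
    return "no answer"


print(iterative_read_reason("What is the capital of France?"))  # Paris
```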
- Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations [70.41385310930846]
We present an end-to-end framework, Structure-CLIP, to enhance multi-modal structured representations.
We use scene graphs to guide the construction of semantic negative examples, which results in an increased emphasis on learning structured representations.
A Knowledge-Enhanced Encoder (KEE) is proposed to leverage scene graph knowledge (SGK) as input to further enhance structured representations.
arXiv Detail & Related papers (2023-05-06T03:57:05Z)
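One way to read "scene graphs guide the construction of semantic negative examples" is the sketch below: swapping the subject and object of a relation triple yields a caption with the same words but different structure, a hard negative for contrastive training. The caption template is an assumption for illustration; Structure-CLIP builds its negatives from parsed scene graphs of real captions.

```python
# Hedged sketch: same bag of words, broken structure -> a hard negative that
# forces the model to attend to composition rather than word co-occurrence.
def caption(triple: tuple[str, str, str]) -> str:
    subj, rel, obj = triple
    return f"a {subj} {rel} a {obj}"


def semantic_negative(triple: tuple[str, str, str]) -> str:
    """Swap subject and object of the scene-graph triple."""
    subj, rel, obj = triple
    return caption((obj, rel, subj))


t = ("dog", "chasing", "cat")
print(caption(t))            # a dog chasing a cat   (positive)
print(semantic_negative(t))  # a cat chasing a dog   (hard negative)
```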
- Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach to model structures as sequences of actions in an autoregressive manner with PLMs.
Our approach achieves the new state-of-the-art on all the structured prediction tasks we looked at.
arXiv Detail & Related papers (2022-10-26T13:27:26Z)
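A minimal sketch of "structures as sequences of actions": labelled spans are linearised into bracket actions that an autoregressive model could emit token by token, and parsed back into a structure afterwards. The action vocabulary here is an assumption; the paper defines task-specific action sets.

```python
# Hedged sketch: round-trip between labelled spans and an action sequence.
def to_actions(tokens: list[str], spans: list[tuple[int, int, str]]) -> list[str]:
    actions = []
    for i, tok in enumerate(tokens):
        for start, _, label in spans:
            if start == i:
                actions.append(f"[{label}")   # open-bracket action
        actions.append(tok)                   # emit the token itself
        for _, end, _ in spans:
            if end == i:
                actions.append("]")           # close-bracket action
    return actions


def from_actions(actions: list[str]) -> tuple[list[str], list[tuple[int, int, str]]]:
    tokens, stack, spans = [], [], []
    for a in actions:
        if a.startswith("["):
            stack.append((len(tokens), a[1:]))
        elif a == "]":
            start, label = stack.pop()
            spans.append((start, len(tokens) - 1, label))
        else:
            tokens.append(a)
    return tokens, spans


acts = to_actions(["Ada", "visited", "Paris"], [(0, 0, "PER"), (2, 2, "LOC")])
print(acts)                # ['[PER', 'Ada', ']', 'visited', '[LOC', 'Paris', ']']
print(from_actions(acts))  # round-trips to the original tokens and spans
```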
- Adaptive Attribute and Structure Subspace Clustering Network [49.040136530379094]
We propose a novel self-expressiveness-based subspace clustering network.
We first consider an auto-encoder to represent input data samples.
Then, we construct a mixed signed and symmetric structure matrix to capture the local geometric structure underlying data.
We perform self-expressiveness on the constructed attribute and structure matrices to learn their affinity graphs.
arXiv Detail & Related papers (2021-09-28T14:00:57Z)
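The self-expressiveness step can be sketched in closed form: each sample is reconstructed as a linear combination of the others (X ≈ CX), and |C| is symmetrised into an affinity graph for clustering. The network in the paper learns C (together with the attribute and structure matrices) end to end; the classical ridge solution below is a stand-in for illustration.

```python
# Hedged sketch of self-expressiveness-based affinity construction.
import numpy as np

# two tiny clusters, each lying on its own 1-D subspace of R^3
e1, e2 = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
X = np.stack([2 * e1, 3 * e1, -e2, 4 * e2])        # samples as rows, (n=4, d=3)

lam = 0.1
G = X @ X.T                                        # Gram matrix, (n, n)
# ridge solution of min ||X - C X||^2 + lam ||C||^2  ->  C = G (G + lam I)^-1
C = np.linalg.solve(G + lam * np.eye(len(X)), G)
np.fill_diagonal(C, 0.0)   # post-hoc stand-in for the usual diag(C) = 0 constraint

A = (np.abs(C) + np.abs(C).T) / 2                  # symmetric affinity graph
print(np.round(A, 2))      # block structure: nonzero within clusters, zero across
```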
- Structure by Architecture: Structured Representations without Regularization [31.75200752252397]
We study the problem of self-supervised structured representation learning using autoencoders for downstream tasks such as generative modeling.
We design a novel autoencoder architecture capable of learning a structured representation without the need for aggressive regularization.
We demonstrate how these models learn a representation that improves results in a variety of downstream tasks including generation, disentanglement, and extrapolation.
arXiv Detail & Related papers (2020-06-14T04:37:08Z)
- DRTS Parsing with Structure-Aware Encoding and Decoding [28.711318411470497]
State-of-the-art performance can be achieved by a neural sequence-to-sequence model.
We propose a structure-aware model at both the encoding and decoding phases to integrate structural information.
arXiv Detail & Related papers (2020-05-14T12:09:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.