An Expression Tree Decoding Strategy for Mathematical Equation
Generation
- URL: http://arxiv.org/abs/2310.09619v3
- Date: Wed, 18 Oct 2023 09:28:54 GMT
- Title: An Expression Tree Decoding Strategy for Mathematical Equation
Generation
- Authors: Wenqi Zhang, Yongliang Shen, Qingpeng Nong, Zeqi Tan, Yanna Ma,
Weiming Lu
- Abstract summary: Existing approaches can be broadly categorized into token-level and expression-level generation.
Expression-level methods generate each expression one by one.
Each expression represents a solving step, and there naturally exist parallel or dependent relations between these steps.
We integrate a tree structure into expression-level generation and advocate an expression tree decoding strategy.
- Score: 24.131972875875952
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating mathematical equations from natural language requires an accurate
understanding of the relations among math expressions. Existing approaches can
be broadly categorized into token-level and expression-level generation. The
former treats equations as a mathematical language, sequentially generating
math tokens. Expression-level methods generate each expression one by one.
However, each expression represents a solving step, and there naturally exist
parallel or dependent relations between these steps, which are ignored by
current sequential methods. Therefore, we integrate a tree structure into
expression-level generation and advocate an expression tree decoding
strategy. To generate a tree with expressions as its nodes, we employ a
layer-wise parallel decoding strategy: we decode multiple independent
expressions (leaf nodes) in parallel at each layer, then repeat parallel
decoding layer by layer to sequentially generate the parent-node expressions
that depend on earlier results. In addition, a bipartite matching algorithm
is adopted to align the multiple predictions with the annotations at each
layer. Experiments show our method outperforms other baselines, especially
on equations with complex structures.
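To make the scheme concrete, below is a minimal Python sketch of the two
ideas in the abstract. It is an illustration under assumptions, not the
authors' implementation: `decode_layer` and `expr_cost` are hypothetical
stand-ins for the paper's learned decoder and matching cost.

```python
# A minimal sketch of the decoding scheme, not the paper's implementation.
# `decode_layer` (a learned module returning (expression, result) pairs) and
# `expr_cost` (a learned mismatch cost) are hypothetical stand-ins.
import numpy as np
from scipy.optimize import linear_sum_assignment


def decode_expression_tree(quantities, decode_layer, max_layers=4):
    """Layer-wise parallel decoding of an expression tree.

    Each layer decodes several independent expressions in parallel from the
    operands available so far (problem quantities, then earlier results);
    parent expressions that depend on earlier steps emerge in later layers.
    """
    available = list(quantities)
    layers = []
    for _ in range(max_layers):
        exprs = decode_layer(available)        # parallel within one layer
        if not exprs:                          # no new expressions: finished
            break
        layers.append(exprs)
        available.extend(result for _, result in exprs)
    return layers


def match_layer(pred_exprs, gold_exprs, expr_cost):
    """Bipartite matching between one layer's predictions and annotations.

    Expressions within a layer are unordered, so the Hungarian algorithm
    assigns each prediction to a gold expression at minimum total cost
    before the per-expression loss is computed.
    """
    cost = np.array([[expr_cost(p, g) for g in gold_exprs]
                     for p in pred_exprs])
    rows, cols = linear_sum_assignment(cost)
    return list(zip(rows, cols))
```

In a full training loop, the matched pairs would feed a per-expression loss;
at inference, decoding stops once a layer yields no new expressions, and the
last result is the root of the equation tree.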
Related papers
- An Autoregressive Text-to-Graph Framework for Joint Entity and Relation
Extraction [4.194768796374315]
We propose a novel method for joint entity and relation extraction from unstructured text by framing it as a conditional sequence generation problem.
It generates a linearized graph where nodes represent text spans and edges represent relation triplets.
Our method employs a transformer encoder-decoder architecture with a pointing mechanism on a dynamic vocabulary of spans and relation types.
arXiv Detail & Related papers (2024-01-02T18:32:14Z)
- Explicit Syntactic Guidance for Neural Text Generation [45.60838824233036]
Generative Grammar suggests that humans generate natural language texts by learning language grammar.
We propose a syntax-guided generation schema, which generates the sequence guided by a constituency parse tree in a top-down direction.
Experiments on paraphrase generation and machine translation show that the proposed method outperforms autoregressive baselines.
arXiv Detail & Related papers (2023-06-20T12:16:31Z)
- Compositional Generalization without Trees using Multiset Tagging and
Latent Permutations [121.37328648951993]
We phrase semantic parsing as a two-step process: we first tag each input token with a multiset of output tokens.
Then we arrange the tokens into an output sequence using a new way of parameterizing and predicting permutations (a sketch of this tag-then-permute idea appears after this list).
Our model outperforms pretrained seq2seq models and prior work on realistic semantic parsing tasks.
arXiv Detail & Related papers (2023-05-26T14:09:35Z)
- Linear-Time Modeling of Linguistic Structure: An Order-Theoretic
Perspective [97.57162770792182]
Tasks that model the relation between pairs of tokens in a string are a vital part of understanding natural language.
We show that these exhaustive comparisons can be avoided, and, moreover, the complexity can be reduced to linear by casting the relation between tokens as a partial order over the string.
Our method predicts real numbers for each token in a string in parallel and sorts the tokens accordingly, resulting in total orders of the tokens in the string (a sketch of this score-and-sort idea also appears after this list).
arXiv Detail & Related papers (2023-05-24T11:47:35Z)
- Outline, Then Details: Syntactically Guided Coarse-To-Fine Code
Generation [61.50286000143233]
ChainCoder is a program synthesis language model that generates Python code progressively.
A tailored transformer architecture is leveraged to jointly encode the natural language descriptions and syntactically aligned I/O data samples.
arXiv Detail & Related papers (2023-04-28T01:47:09Z)
- Hierarchical Phrase-based Sequence-to-Sequence Learning [94.10257313923478]
We describe a neural transducer that maintains the flexibility of standard sequence-to-sequence (seq2seq) models while incorporating hierarchical phrases as a source of inductive bias during training and as explicit constraints during inference.
Our approach trains two models: a discriminative parser based on a bracketing grammar whose derivation tree hierarchically aligns source and target phrases, and a neural seq2seq model that learns to translate the aligned phrases one-by-one.
arXiv Detail & Related papers (2022-11-15T05:22:40Z)
- R2D2: Recursive Transformer based on Differentiable Tree for
Interpretable Hierarchical Language Modeling [36.61173494449218]
This paper proposes a model based on differentiable CKY-style binary trees to emulate the composition process.
We extend the bidirectional language model pre-training objective to this architecture, attempting to predict each word given its left and right abstraction nodes.
To scale up our approach, we also introduce an efficient pruned tree induction algorithm to enable encoding in just a linear number of composition steps.
arXiv Detail & Related papers (2021-07-02T11:00:46Z)
- Span-based Semantic Parsing for Compositional Generalization [53.24255235340056]
SpanBasedSP predicts a span tree over an input utterance, explicitly encoding how partial programs compose over spans in the input.
On GeoQuery, SCAN and CLOSURE, SpanBasedSP performs similarly to strong seq2seq baselines on random splits, but dramatically improves performance compared to baselines on splits that require compositional generalization.
arXiv Detail & Related papers (2020-09-13T16:42:18Z)
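Two of the entries above describe mechanisms concrete enough to sketch. For
"Compositional Generalization without Trees", here is a minimal sketch of
the tag-then-permute idea, with `tag_multiset` and `predict_permutation` as
hypothetical stand-ins for the paper's learned components:

```python
# Hypothetical sketch: tag each input token with a multiset of output
# tokens, then arrange them with a predicted permutation.
def tag_then_permute(tokens, tag_multiset, predict_permutation):
    # Step 1: every input token contributes a multiset of output tokens.
    outputs = [o for tok in tokens for o in tag_multiset(tok)]
    # Step 2: a predicted permutation orders them into the final sequence.
    order = predict_permutation(outputs)  # a reordering of 0..len(outputs)-1
    return [outputs[i] for i in order]
```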
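And for "Linear-Time Modeling of Linguistic Structure", a minimal sketch of
the score-and-sort idea, with `score_tokens` as a hypothetical stand-in for
the learned per-token scorer:

```python
# Hypothetical sketch: predict one real number per token in parallel and
# sort, rather than scoring all O(n^2) token pairs.
def order_tokens(tokens, score_tokens):
    scores = score_tokens(tokens)                 # one scalar per token
    order = sorted(range(len(tokens)), key=lambda i: scores[i])
    return [tokens[i] for i in order]             # the induced total order

# Toy usage with a dummy scorer (token length):
print(order_tokens(["bb", "a", "ccc"], lambda ts: [len(t) for t in ts]))
```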
This list is automatically generated from the titles and abstracts of the papers on this site.