Adaptive Convolution for Semantic Role Labeling
- URL: http://arxiv.org/abs/2012.13939v1
- Date: Sun, 27 Dec 2020 13:26:11 GMT
- Title: Adaptive Convolution for Semantic Role Labeling
- Authors: Kashif Munir, Hai Zhao, Zuchao Li
- Abstract summary: Semantic role labeling (SRL) aims at elaborating the meaning of a sentence by forming a predicate-argument structure.
Recent research has shown that the effective use of syntax can improve SRL performance.
This work effectively encodes syntax using adaptive convolution, which endows existing convolutional networks with strong flexibility.
Experiments on the CoNLL-2009 dataset confirm that the proposed model substantially outperforms most previous SRL systems for both English and Chinese.
- Score: 48.69930912510414
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Semantic role labeling (SRL) aims at elaborating the meaning of a sentence by forming a predicate-argument structure. Recent research has shown that the effective use of syntax can improve SRL performance. However, syntax is a complicated linguistic clue that is hard to apply effectively in a downstream task like SRL. This work effectively encodes syntax using adaptive convolution, which endows existing convolutional networks with strong flexibility. Existing CNNs may help in encoding a complicated structure like syntax for SRL, but they still have shortcomings. Contrary to traditional convolutional networks that use the same filters for different inputs, adaptive convolution uses adaptively generated filters conditioned on syntactically informed inputs. We achieve this by integrating a filter generation network that produces input-specific filters. This helps the model focus on the important syntactic features present in the input, thus enlarging the gap between syntax-aware and syntax-agnostic SRL systems. We further study a hashing technique to compress the size of the filter generation network for SRL in terms of trainable parameters. Experiments on the CoNLL-2009 dataset confirm that the proposed model substantially outperforms most previous SRL systems for both English and Chinese.
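As a concrete illustration of the idea, below is a minimal sketch of an adaptive convolution layer driven by a filter generation network, written against PyTorch. The mean-pooled input summary, layer sizes, and grouped-convolution trick are illustrative assumptions, not the authors' exact design.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveConv1d(nn.Module):
    """Conv1d whose filters are generated per input sequence."""
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        self.channels = channels
        self.kernel_size = kernel_size
        # Filter generation network (assumed architecture): maps a pooled
        # summary of the syntactically informed input to a full filter set.
        self.filter_gen = nn.Sequential(
            nn.Linear(channels, channels),
            nn.ReLU(),
            nn.Linear(channels, channels * channels * kernel_size),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, seq_len), e.g., syntax-enriched embeddings
        batch, c, _ = x.shape
        summary = x.mean(dim=2)                   # (batch, channels)
        filters = self.filter_gen(summary)        # (batch, c*c*k)
        filters = filters.view(batch * c, c, self.kernel_size)
        # Grouped-conv trick: fold the batch into channels so each
        # sequence is convolved with its own generated filters.
        out = F.conv1d(
            x.reshape(1, batch * c, -1),
            filters,
            padding=self.kernel_size // 2,
            groups=batch,
        )
        return out.view(batch, c, -1)

x = torch.randn(2, 64, 30)        # 2 sentences, 64-dim, 30 tokens
layer = AdaptiveConv1d(64)
print(layer(x).shape)             # torch.Size([2, 64, 30])
```
The hashing technique mentioned in the abstract would further reduce the trainable parameters of the `filter_gen` network, for instance by sharing weights across hashed buckets in the spirit of HashedNets-style compression.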
Related papers
- R-SFLLM: Jamming Resilient Framework for Split Federated Learning with Large Language Models [83.77114091471822]
Split federated learning (SFL) is a compute-efficient paradigm in distributed machine learning (ML).
A challenge in SFL, particularly when deployed over wireless channels, is the susceptibility of transmitted model parameters to adversarial jamming.
This is particularly pronounced for word embedding parameters in large language models (LLMs), which are crucial for language understanding.
A physical layer framework is developed for resilient SFL with LLMs (R-SFLLM) over wireless networks.
arXiv Detail & Related papers (2024-07-16T12:21:29Z)
- VQ-T: RNN Transducers using Vector-Quantized Prediction Network States [52.48566999668521]
We propose to use vector-quantized long short-term memory units in the prediction network of RNN transducers.
By training the discrete representation jointly with the ASR network, hypotheses can be actively merged for lattice generation.
Our experiments on the Switchboard corpus show that the proposed VQ RNN transducers improve ASR performance over transducers with regular prediction networks.
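The core operation can be pictured as quantizing each prediction-network state to its nearest codebook entry with a straight-through gradient. The codebook size, dimensions, and Euclidean nearest-neighbour lookup below are assumptions for illustration, not the paper's exact configuration.
```python
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    def __init__(self, num_codes: int, dim: int):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, dim) LSTM hidden state
        d = torch.cdist(h, self.codebook.weight)  # (batch, num_codes)
        idx = d.argmin(dim=1)                     # nearest codebook entry
        q = self.codebook(idx)
        # Straight-through: forward uses q, gradient flows through h.
        return h + (q - h).detach()

lstm = nn.LSTMCell(32, 32)
vq = VectorQuantizer(num_codes=256, dim=32)
h, c = torch.zeros(4, 32), torch.zeros(4, 32)
y = torch.randn(4, 32)            # previous-token embedding
h, c = lstm(y, (h, c))
h_q = vq(h)                       # discretized prediction-network state
```
Because equal quantized states compare identical, partial hypotheses that reach the same state can be merged during decoding, which is what enables lattice generation.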
arXiv Detail & Related papers (2022-08-03T02:45:52Z)
- Transition-based Semantic Role Labeling with Pointer Networks [0.40611352512781856]
We propose the first transition-based SRL approach that is capable of completely processing an input sentence in a single left-to-right pass.
Thanks to our implementation based on Pointer Networks, full SRL can be accurately and efficiently done in $O(n^2)$, achieving the best performance to date on the majority of languages from the CoNLL-2009 shared task.
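The quadratic bound follows from performing one pointer step per word, each attending over all n sentence positions. Below is a minimal sketch of such a step, assuming dot-product attention; the module and scoring choice are illustrative, not the paper's exact decoder.
```python
import torch
import torch.nn as nn

class PointerStep(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)

    def forward(self, state: torch.Tensor, enc: torch.Tensor) -> torch.Tensor:
        # state: (batch, dim) decoder state for the word being processed
        # enc:   (batch, seq, dim) encoder states for all words
        scores = torch.einsum('bd,bsd->bs', self.q(state), self.k(enc))
        return scores.argmax(dim=1)       # index of the pointed-to word

enc = torch.randn(1, 10, 64)              # encoder states for 10 words
state = torch.randn(1, 64)
print(PointerStep(64)(state, enc))        # e.g., tensor([7])
```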
arXiv Detail & Related papers (2022-05-20T08:38:44Z)
- Contrastive String Representation Learning using Synthetic Data [0.0]
The goal of String Representation Learning (SRL) is to learn dense and low-dimensional vectors for encoding character sequences.
We propose a new method to train an SRL model using only synthetic data.
We demonstrate the effectiveness of our approach by evaluating the learned representation on the task of string similarity matching.
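A hedged sketch of this setup: generate random strings, produce perturbed positives, and train a character encoder with an InfoNCE-style contrastive loss. The perturbation scheme, GRU encoder, and temperature are assumptions for illustration.
```python
import random, string
import torch
import torch.nn as nn
import torch.nn.functional as F

def synth(n: int = 12) -> str:
    return ''.join(random.choices(string.ascii_lowercase, k=n))

def perturb(s: str) -> str:
    # Length-preserving character substitution as a synthetic positive.
    i = random.randrange(len(s))
    return s[:i] + random.choice(string.ascii_lowercase) + s[i + 1:]

class CharEncoder(nn.Module):
    def __init__(self, dim: int = 64):
        super().__init__()
        self.emb = nn.Embedding(128, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)

    def forward(self, strs):
        x = torch.tensor([[ord(c) for c in s] for s in strs])
        _, h = self.rnn(self.emb(x))
        return F.normalize(h[-1], dim=1)       # unit-norm string vectors

enc = CharEncoder()
a = [synth() for _ in range(8)]
za, zb = enc(a), enc([perturb(s) for s in a])
logits = za @ zb.t() / 0.1                     # pairwise similarities
loss = F.cross_entropy(logits, torch.arange(8))  # InfoNCE: match i with i
```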
arXiv Detail & Related papers (2021-10-08T16:06:54Z)
- Confusion-based rank similarity filters for computationally-efficient machine learning on high dimensional data [0.0]
We introduce a novel type of computationally efficient artificial neural network (ANN) called the rank similarity filter (RSF).
RSFs can be used to transform and classify nonlinearly separable datasets with many data points and dimensions.
Open-source code for RST, RSC and RSPC was written in Python using the popular scikit-learn framework to make it easily accessible.
arXiv Detail & Related papers (2021-09-28T10:53:38Z)
- Physics-Based Deep Learning for Fiber-Optic Communication Systems [10.630021520220653]
We propose a new machine-learning approach for fiber-optic communication systems governed by the nonlinear Schrödinger equation (NLSE).
Our main observation is that the popular split-step method (SSM) for numerically solving the NLSE has essentially the same functional form as a deep multi-layer neural network.
We exploit this connection by parameterizing the SSM and viewing the linear steps as general linear functions, similar to the weight matrices in a neural network.
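A minimal sketch of that observation: alternate learnable linear steps (here diagonal frequency-domain filters standing in for general linear functions) with the fixed Kerr-nonlinearity phase rotation of the split-step method. The step count, gamma, and initialization are illustrative assumptions.
```python
import torch
import torch.nn as nn

class SSMNet(nn.Module):
    def __init__(self, n_steps: int, n_samples: int, gamma: float = 1.3):
        super().__init__()
        self.gamma = gamma
        # One learnable linear step (frequency response) per fiber segment,
        # playing the role of a weight matrix in an ordinary deep network.
        self.h = nn.Parameter(
            torch.randn(n_steps, n_samples, dtype=torch.cfloat))

    def forward(self, u: torch.Tensor, dz: float = 0.1) -> torch.Tensor:
        # u: (n_samples,) complex baseband signal
        for hk in self.h:
            u = torch.fft.ifft(hk * torch.fft.fft(u))   # linear step
            u = u * torch.exp(1j * self.gamma * u.abs() ** 2 * dz)  # Kerr step
        return u

net = SSMNet(n_steps=8, n_samples=256)
u = torch.randn(256, dtype=torch.cfloat)
print(net(u).shape)                    # torch.Size([256])
```
The alternation of linear and pointwise nonlinear operations is exactly the layer structure of a deep network, which is what makes the parameterized SSM trainable by backpropagation.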
arXiv Detail & Related papers (2020-10-27T12:55:23Z)
- Off-Policy Self-Critical Training for Transformer in Visual Paragraph Generation [20.755764654229047]
The Transformer is currently the state-of-the-art sequence-to-sequence model in language generation.
We propose an off-policy RL algorithm in which a behaviour policy represented by GRUs performs the sampling.
The proposed algorithm achieves state-of-the-art performance on the visual paragraph generation and improved results on image captioning.
arXiv Detail & Related papers (2020-06-21T05:10:17Z)
- Dependency Aware Filter Pruning [74.69495455411987]
Pruning a proportion of unimportant filters is an efficient way to mitigate the inference cost.
Previous work prunes filters according to their weight norms or the corresponding batch-norm scaling factors.
We propose a novel mechanism to dynamically control the sparsity-inducing regularization so as to achieve the desired sparsity.
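As a hedged sketch of the norm-based baseline criterion the summary refers to (not the paper's dynamic regularization mechanism), filters can be ranked by L1 norm and the lowest fraction masked out:
```python
import torch
import torch.nn as nn

def prune_by_norm(conv: nn.Conv2d, ratio: float = 0.3) -> torch.Tensor:
    # One L1 norm per output filter: (out_channels,)
    norms = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    n_prune = int(ratio * norms.numel())
    idx = norms.argsort()[:n_prune]        # least important filters
    with torch.no_grad():
        conv.weight[idx] = 0               # mask instead of removing
    return idx

conv = nn.Conv2d(16, 32, kernel_size=3)
pruned = prune_by_norm(conv)
print(pruned)                              # indices of the zeroed filters
```
The paper's contribution replaces such a fixed criterion with a sparsity-inducing regularization whose strength is adapted during training until the desired sparsity is reached.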
arXiv Detail & Related papers (2020-05-06T07:41:22Z)
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
When paired with a strong auto-regressive decoder, VAEs tend to ignore their latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
- Depth-Adaptive Graph Recurrent Network for Text Classification [71.20237659479703]
Sentence-State LSTM (S-LSTM) is a powerful and highly efficient graph recurrent network.
We propose a depth-adaptive mechanism for the S-LSTM, which allows the model to learn how many computational steps to conduct for different words as required.
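One hedged way to picture the mechanism is an ACT-style halting unit: each word accumulates a halting probability across recurrent steps and stops updating once it crosses a threshold. The GRU update cell and halting unit below are illustrative stand-ins, not the exact S-LSTM components.
```python
import torch
import torch.nn as nn

class AdaptiveDepth(nn.Module):
    def __init__(self, dim: int, max_steps: int = 6, eps: float = 0.01):
        super().__init__()
        self.cell = nn.GRUCell(dim, dim)
        self.halt = nn.Linear(dim, 1)
        self.max_steps, self.eps = max_steps, eps

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (n_words, dim) per-word states
        acc = torch.zeros(h.size(0))
        for _ in range(self.max_steps):
            alive = acc < 1 - self.eps           # words still computing
            if not alive.any():
                break
            p = torch.sigmoid(self.halt(h)).squeeze(1)
            acc = acc + p * alive                # only alive words accumulate
            h = torch.where(alive.unsqueeze(1), self.cell(h, h), h)
        return h

words = torch.randn(5, 32)
print(AdaptiveDepth(32)(words).shape)            # torch.Size([5, 32])
```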
arXiv Detail & Related papers (2020-02-29T03:09:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.