Capturing Structural Locality in Non-parametric Language Models
- URL: http://arxiv.org/abs/2110.02870v1
- Date: Wed, 6 Oct 2021 15:53:38 GMT
- Title: Capturing Structural Locality in Non-parametric Language Models
- Authors: Frank F. Xu, Junxian He, Graham Neubig, Vincent J. Hellendoorn
- Abstract summary: We propose a simple yet effective approach for adding locality information into non-parametric language models.
Experiments on two different domains, Java source code and Wikipedia text, demonstrate that locality features improve model efficacy.
- Score: 85.94669097485992
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Structural locality is a ubiquitous feature of real-world datasets, wherein
data points are organized into local hierarchies. Some examples include topical
clusters in text or project hierarchies in source code repositories. In this
paper, we explore utilizing this structural locality within non-parametric
language models, which generate sequences that reference retrieved examples
from an external source. We propose a simple yet effective approach for adding
locality information into such models by adding learned parameters that improve
the likelihood of retrieving examples from local neighborhoods. Experiments on
two different domains, Java source code and Wikipedia text, demonstrate that
locality features improve model efficacy over models without access to these
features, with interesting differences. We also perform an analysis of how and
where locality features contribute to improved performance and why the
traditionally used contextual similarity metrics alone are not enough to grasp
the locality structure.
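As a rough illustration of the idea in the abstract, the sketch below shows one way a learned locality term could shift retrieval weights toward neighbors from the same local neighborhood. It is a minimal sketch under stated assumptions, not the authors' formulation: the additive per-level bias, the function name `knn_token_distribution`, and the locality labels `same_dir` and `other` are all hypothetical. Each retrieved neighbor is scored by its negative contextual distance plus a bias for its locality level, and a softmax over those scores is aggregated per token.

```python
import math
from collections import defaultdict


def knn_token_distribution(neighbors, locality_bias):
    """Turn retrieved neighbors into a next-token distribution.

    neighbors: non-empty list of (token, distance, locality_level) tuples.
    locality_bias: dict mapping a locality level to an additive bias.
        This additive form is an assumption made for illustration; levels
        missing from the dict get a bias of 0.0.
    """
    # Score = negative distance plus the bias for the neighbor's locality level.
    scores = [-dist + locality_bias.get(level, 0.0) for _, dist, level in neighbors]

    # Numerically stable softmax over the neighbor scores.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)

    # Sum the normalized weights of neighbors that share the same target token.
    probs = defaultdict(float)
    for (token, _, _), weight in zip(neighbors, weights):
        probs[token] += weight / total
    return dict(probs)


# Hypothetical retrieved neighbors: (target token, distance, locality level).
neighbors = [
    ("getName", 1.0, "same_dir"),
    ("getId", 1.0, "other"),
    ("getName", 2.0, "other"),
]

no_bias = knn_token_distribution(neighbors, {})
with_bias = knn_token_distribution(neighbors, {"same_dir": 1.0, "other": 0.0})
```

With a positive bias on `same_dir`, the neighbor from the same directory receives more weight than an equally distant neighbor from elsewhere, so the probability of `getName` rises relative to the unbiased case. In a trained model the bias values would be learned rather than set by hand.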
Related papers
- Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems.
Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored.
We present a formal treatment of retrieval-based models to characterize their generalization ability.
arXiv Detail & Related papers (2022-10-06T00:33:01Z)
- Content-Based Landmark Retrieval Combining Global and Local Features using Siamese Neural Networks [3.785123406103385]
We present a method for landmark retrieval that utilizes global and local features.
A Siamese network is used for global feature extraction and metric learning, which gives an initial ranking of the landmark search.
We utilize the extracted feature maps from the Siamese architecture as local descriptors; the search results are then further refined using cosine similarity between local descriptors.
arXiv Detail & Related papers (2022-08-03T18:11:36Z)
- SemAffiNet: Semantic-Affine Transformation for Point Cloud Segmentation [94.11915008006483]
We propose SemAffiNet for point cloud semantic segmentation.
We conduct extensive experiments on the ScanNetV2 and NYUv2 datasets.
arXiv Detail & Related papers (2022-05-26T17:00:23Z)
- Unsupervised Keyphrase Extraction by Jointly Modeling Local and Global Context [25.3472693740778]
Embedding based methods are widely used for unsupervised keyphrase extraction (UKE) tasks.
In this paper, we propose a novel method for UKE, where local and global contexts are jointly modeled.
arXiv Detail & Related papers (2021-09-15T13:41:10Z)
- Clustered Federated Learning via Generalized Total Variation Minimization [83.26141667853057]
We study optimization methods to train local (or personalized) models for local datasets with a decentralized network structure.
Our main conceptual contribution is to formulate federated learning as generalized total variation (GTV) minimization.
Our main algorithmic contribution is a fully decentralized federated learning algorithm.
arXiv Detail & Related papers (2021-05-26T18:07:19Z)
- Adaptive Semiparametric Language Models [17.53604394786977]
We present a language model that combines a large parametric neural network (i.e., a transformer) with a non-parametric episodic memory component.
Experiments on word-based and character-based language modeling datasets demonstrate the efficacy of our proposed method.
arXiv Detail & Related papers (2021-02-04T11:47:03Z)
- Local Context Attention for Salient Object Segmentation [5.542044768017415]
We propose a novel Local Context Attention Network (LCANet) to generate locally reinforced feature maps in a uniform representational architecture.
The proposed network introduces an Attentional Correlation Filter (ACF) module to generate explicit local attention by calculating the correlation feature map between coarse prediction and global context.
Comprehensive experiments are conducted on several salient object segmentation datasets, demonstrating the superior performance of the proposed LCANet against the state-of-the-art methods.
arXiv Detail & Related papers (2020-09-24T09:20:06Z)
- On the use of local structural properties for improving the efficiency of hierarchical community detection methods [77.34726150561087]
We study how local structural network properties can be used as proxies to improve the efficiency of hierarchical community detection.
We also check the performance impact of network prunings as an ancillary tactic to make hierarchical community detection more efficient.
arXiv Detail & Related papers (2020-09-15T00:16:12Z)
- Learning Local Features with Context Aggregation for Visual Localization [24.167882373322957]
Keypoint detection and description is fundamental yet important in many vision applications.
Most existing methods use a detect-then-describe or detect-and-describe strategy to learn local features without considering their context information.
In this paper, we focus on the fusion of low-level textual information and high-level semantic context information to improve the discriminativeness of local features.
arXiv Detail & Related papers (2020-05-26T17:19:06Z)
- Pairwise Similarity Knowledge Transfer for Weakly Supervised Object Localization [53.99850033746663]
We study the problem of learning a localization model on target classes with weakly supervised image labels.
In this work, we argue that learning only an objectness function is a weak form of knowledge transfer.
Experiments on the COCO and ILSVRC 2013 detection datasets show that the performance of the localization model improves significantly with the inclusion of a pairwise similarity function.
arXiv Detail & Related papers (2020-03-18T17:53:33Z)