Which Syntactic Capabilities Are Statistically Learned by Masked
Language Models for Code?
- URL: http://arxiv.org/abs/2401.01512v2
- Date: Wed, 21 Feb 2024 16:22:22 GMT
- Title: Which Syntactic Capabilities Are Statistically Learned by Masked
Language Models for Code?
- Authors: Alejandro Velasco, David N. Palacio, Daniel Rodriguez-Cardenas and
Denys Poshyvanyk
- Abstract summary: We highlight that relying on accuracy-based measurements may lead to an overestimation of models' capabilities.
To address these issues, we introduce a technique called SyntaxEval, in which Syntactic Capabilities are used to enhance the evaluation of MLMs.
- Score: 51.29970742152668
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: This paper discusses the limitations of evaluating Masked Language Models
(MLMs) in code completion tasks. We highlight that relying on accuracy-based
measurements may lead to an overestimation of models' capabilities by
neglecting the syntax rules of programming languages. To address these issues,
we introduce a technique called SyntaxEval in which Syntactic Capabilities are
used to enhance the evaluation of MLMs. SyntaxEval automates the process of
masking elements in the model input based on their Abstract Syntax Trees
(ASTs). We conducted a case study on two popular MLMs using data from GitHub
repositories. Our results showed negative causal effects of node types on the
MLMs' accuracy. We conclude that the MLMs under study fail to predict some
syntactic capabilities.
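The core of SyntaxEval, selecting a token to mask by its AST node type and checking whether the MLM recovers it, can be sketched as follows. This is a minimal illustration assuming tree-sitter (>= 0.22) with the Python grammar and Hugging Face transformers; the model choice and helper function are illustrative, not the authors' implementation.

```python
# Minimal sketch of AST-driven masking in the spirit of SyntaxEval.
# Assumes: pip install tree-sitter tree-sitter-python transformers torch
import tree_sitter_python as tspython
from tree_sitter import Language, Parser
from transformers import pipeline

code = b"def add(a, b):\n    return a + b\n"

# Parse the snippet and collect every AST node of the node type under test.
tree = Parser(Language(tspython.language())).parse(code)

def nodes_of_type(node, node_type, out):
    if node.type == node_type:
        out.append(node)
    for child in node.children:
        nodes_of_type(child, node_type, out)
    return out

identifiers = nodes_of_type(tree.root_node, "identifier", [])

# Mask the first identifier and ask an MLM for code to recover it.
fill = pipeline("fill-mask", model="microsoft/codebert-base")  # one popular MLM for code
target = identifiers[0]
masked = (code[: target.start_byte]
          + fill.tokenizer.mask_token.encode()
          + code[target.end_byte :]).decode()

truth = code[target.start_byte : target.end_byte].decode()
top5 = [p["token_str"].strip() for p in fill(masked)]
print(f"node type=identifier truth={truth!r} predictions={top5}")
```

Repeating this over many files and node types, and comparing recovery accuracy per node type, is the gist of the evaluation the paper automates.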
Related papers
- Using Grammar Masking to Ensure Syntactic Validity in LLM-based Modeling Tasks [0.996023506058745]
Grammar masking is used to guide large language models toward producing models that are syntactically valid under a given context-free grammar.
We show that grammar masking can dramatically improve the modeling capabilities of several language models.
arXiv Detail & Related papers (2024-07-08T17:19:59Z)
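The mechanism behind grammar masking can be illustrated with a toy sketch: at each decoding step, any token that cannot extend a grammar-valid prefix has its logit set to minus infinity. The five-token vocabulary and parenthesis grammar below are invented for illustration; the paper's actual grammars and toolchain are not reproduced here.

```python
# Toy sketch of grammar masking over a tiny parenthesis grammar.
import math

VOCAB = ["(", ")", "x", "+", "<eos>"]

def valid_prefix(tokens):
    """True if tokens can still be extended to a balanced-paren expression."""
    depth = 0
    for t in tokens:
        if t == "(":
            depth += 1
        elif t == ")":
            depth -= 1
            if depth < 0:
                return False
        elif t == "<eos>":
            return depth == 0
    return True

def mask_logits(prefix, logits):
    # Ban any next token that would make the prefix syntactically invalid.
    return [logit if valid_prefix(prefix + [tok]) else -math.inf
            for tok, logit in zip(VOCAB, logits)]

print(mask_logits([], [0.1, 0.9, 0.2, 0.0, 0.3]))     # ")" is masked out
print(mask_logits(["("], [0.1, 0.9, 0.2, 0.0, 0.3]))  # ")" allowed, "<eos>" masked
```

Sampling from the masked logits can then only ever produce syntactically valid output, which is the property the paper exploits.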
- Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks [12.629516072317331]
Syntax-Aware Fill-In-the-Middle (SAFIM) is a new benchmark for evaluating Large Language Models (LLMs) on the code Fill-in-the-Middle (FIM) task.
This benchmark focuses on syntax-aware completions of program structures such as code blocks and conditional expressions.
arXiv Detail & Related papers (2024-03-07T05:05:56Z)
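A hedged sketch of how a syntax-aware FIM example might be constructed: cut a syntactic unit (here a conditional branch) out of a program and present prefix and suffix around sentinel tokens. The <PRE>/<SUF>/<MID> sentinels are generic placeholders; real models use their own FIM tokens, and SAFIM's exact construction may differ.

```python
# Sketch of building a syntax-aware fill-in-the-middle (FIM) prompt.
SOURCE = """def clamp(x, lo, hi):
    if x < lo:
        return lo
    return min(x, hi)
"""

def make_fim_prompt(source: str, span_start: int, span_end: int) -> tuple[str, str]:
    """Cut out source[span_start:span_end] (e.g., a conditional chosen from
    the AST) and return (prompt, ground_truth_middle)."""
    prefix = source[:span_start]
    middle = source[span_start:span_end]
    suffix = source[span_end:]
    return f"<PRE>{prefix}<SUF>{suffix}<MID>", middle

# Remove the `if` branch and keep it as the expected completion.
start = SOURCE.index("if x < lo")
end = SOURCE.index("return lo") + len("return lo")
prompt, truth = make_fim_prompt(SOURCE, start, end)
print(prompt)
print("expected middle:", truth)
```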
- CLOMO: Counterfactual Logical Modification with Large Language Models [109.60793869938534]
We introduce a novel task, Counterfactual Logical Modification (CLOMO), and a high-quality human-annotated benchmark.
In this task, LLMs must adeptly alter a given argumentative text to uphold a predetermined logical relationship.
We propose an innovative evaluation metric, the Self-Evaluation Score (SES), to directly evaluate the natural language output of LLMs.
arXiv Detail & Related papers (2023-11-29T08:29:54Z)
- InfiMM-Eval: Complex Open-Ended Reasoning Evaluation For Multi-Modal Large Language Models [50.03163753638256]
Multi-modal Large Language Models (MLLMs) are increasingly prominent in the field of artificial intelligence.
Our benchmark comprises three key reasoning categories: deductive, abductive, and analogical reasoning.
We evaluate a selection of representative MLLMs using this rigorously developed open-ended multi-step elaborate reasoning benchmark.
arXiv Detail & Related papers (2023-11-20T07:06:31Z)
- Evaluating and Explaining Large Language Models for Code Using Syntactic Structures [74.93762031957883]
This paper introduces ASTxplainer, an explainability method specific to Large Language Models for code.
At its core, ASTxplainer provides an automated method for aligning token predictions with AST nodes.
We perform an empirical evaluation on 12 popular LLMs for code using a curated dataset of the most popular GitHub projects.
arXiv Detail & Related papers (2023-08-07T18:50:57Z)
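ASTxplainer's central alignment step, mapping each predicted token to the AST node that spans it, can be approximated with byte offsets. A minimal sketch assuming tree-sitter with the Python grammar; the token spans are stand-ins for a real tokenizer's offset mapping, and this is not the authors' code.

```python
# Sketch of aligning model tokens with AST nodes by byte offset.
# Assumes: pip install tree-sitter tree-sitter-python
import tree_sitter_python as tspython
from tree_sitter import Language, Parser

code = b"total = price * qty\n"
tree = Parser(Language(tspython.language())).parse(code)

def covering_node(root, start, end):
    """Smallest AST node whose byte span covers [start, end)."""
    node = root
    while True:
        child = next((c for c in node.children
                      if c.start_byte <= start and end <= c.end_byte), None)
        if child is None:
            return node
        node = child

# Pretend these are a tokenizer's (start, end) byte offsets for each token.
token_spans = [(0, 5), (6, 7), (8, 13), (14, 15), (16, 19)]
for start, end in token_spans:
    node = covering_node(tree.root_node, start, end)
    print(code[start:end].decode(), "->", node.type)
```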
- Masked and Permuted Implicit Context Learning for Scene Text Recognition [8.742571493814326]
Scene Text Recognition (STR) is difficult because of variations in text styles, shapes, and backgrounds.
We propose a masked and permuted implicit context learning network for STR, within a single decoder.
arXiv Detail & Related papers (2023-05-25T15:31:02Z)
- Augmented Language Models: a Survey [55.965967655575454]
This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools.
We refer to them as Augmented Language Models (ALMs).
The missing token objective allows ALMs to learn to reason, use tools, and even act, while still performing standard natural language tasks.
arXiv Detail & Related papers (2023-02-15T18:25:52Z)
- Inconsistencies in Masked Language Models [20.320583166619528]
Masked language models (MLMs) can provide distributions of tokens in the masked positions in a sequence.
However, the distributions corresponding to different masking patterns can demonstrate considerable inconsistencies.
We propose an inference-time strategy for MLMs called Ensemble of Conditionals.
arXiv Detail & Related papers (2022-12-30T22:53:25Z)
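The idea of ensembling conditionals can be sketched by querying the same target position under several masking patterns and averaging the resulting distributions. The model, sentence, patterns, and probability averaging below are illustrative assumptions, not the paper's exact procedure.

```python
# Rough sketch of ensembling an MLM's conditionals across masking patterns.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

words = "The capital of France is Paris .".split()
TARGET = 5  # word index of "Paris"

def target_distribution(mask_idxs):
    """Distribution over the TARGET slot when the words in mask_idxs are masked."""
    sent = " ".join(tok.mask_token if i in mask_idxs else w
                    for i, w in enumerate(words))
    enc = tok(sent, return_tensors="pt")
    mask_pos = (enc.input_ids[0] == tok.mask_token_id).nonzero().squeeze(-1)
    k = sorted(mask_idxs).index(TARGET)  # TARGET is the k-th mask, left to right
    with torch.no_grad():
        logits = model(**enc).logits[0, mask_pos[k]]
    return logits.softmax(-1)

# Each pattern conditions on a different context, so single forward passes can
# disagree with one another; averaging the conditionals smooths this out.
patterns = [[TARGET], [TARGET, 1], [TARGET, 3]]
ensemble = torch.stack([target_distribution(p) for p in patterns]).mean(0)
print(tok.decode([ensemble.argmax().item()]))
```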
- Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little [74.49773960145681]
A possible explanation for the impressive performance of masked language model (MLM) pre-training is that such models have learned to represent the syntactic structures prevalent in NLP pipelines.
In this paper, we propose a different explanation: MLMs succeed on downstream tasks almost entirely due to their ability to model higher-order word co-occurrence statistics.
Our results show that purely distributional information largely explains the success of pre-training, and underscore the importance of curating challenging evaluation datasets that require deeper linguistic knowledge.
arXiv Detail & Related papers (2021-04-14T06:30:36Z)
- MLMLM: Link Prediction with Mean Likelihood Masked Language Model [14.672283581769774]
Knowledge Bases (KBs) are easy to query, verifiable, and interpretable.
Masked Language Models (MLMs), such as BERT, scale with computing power as well as raw text data.
We introduce the Mean Likelihood Masked Language Model (MLMLM), an approach that compares the mean likelihood of generating different entities to perform link prediction.
arXiv Detail & Related papers (2020-09-15T13:11:13Z)
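The mean-likelihood comparison at the heart of MLMLM can be sketched as scoring each candidate entity by the mean log-probability its tokens receive in masked slots. The template, model, and scoring details below are illustrative, not the paper's exact setup.

```python
# Sketch of mean-likelihood entity scoring for link prediction.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

def mean_log_likelihood(template: str, entity: str) -> float:
    """Mean log-prob of the entity's tokens filled into the [MASK] slots."""
    ids = tok(entity, add_special_tokens=False).input_ids
    sent = template.replace("[MASK]", " ".join(tok.mask_token for _ in ids))
    enc = tok(sent, return_tensors="pt")
    positions = (enc.input_ids[0] == tok.mask_token_id).nonzero().squeeze(-1)
    with torch.no_grad():
        log_probs = model(**enc).logits[0].log_softmax(-1)
    return sum(log_probs[p, i].item() for p, i in zip(positions, ids)) / len(ids)

# Rank candidate tail entities for a link-prediction query; normalizing by
# token count lets entities of different lengths compete fairly.
template = "Paris is the capital of [MASK]."
for cand in ["France", "Germany", "Japan"]:
    print(cand, round(mean_log_likelihood(template, cand), 3))
```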
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information above and is not responsible for any consequences of its use.