Do Language Models Learn about Legal Entity Types during Pretraining?
- URL: http://arxiv.org/abs/2310.13092v1
- Date: Thu, 19 Oct 2023 18:47:21 GMT
- Title: Do Language Models Learn about Legal Entity Types during Pretraining?
- Authors: Claire Barale, Michael Rovatsos, Nehal Bhuta
- Abstract summary: We show that Llama2 performs well on certain entities and exhibits potential for substantial improvement with optimized prompt templates.
Llama2 appears to frequently overlook syntactic cues, a shortcoming less present in BERT-based architectures.
- Score: 4.604003661048267
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Language Models (LMs) have proven their ability to acquire diverse linguistic
knowledge during the pretraining phase, potentially serving as a valuable
source of incidental supervision for downstream tasks. However, there has been
limited research conducted on the retrieval of domain-specific knowledge, and
specifically legal knowledge. We propose to explore the task of Entity Typing,
serving as a proxy for evaluating legal knowledge as an essential aspect of
text comprehension and a foundational task for numerous downstream legal NLP
applications. Through systematic evaluation and analysis with two types of
prompting (cloze sentences and QA-based templates), and to clarify the nature
of these acquired cues, we compare diverse types and lengths of entities (both
general and domain-specific), semantic or syntactic signals, and different LM
pretraining corpora (generic and legal-oriented) and architectures
(encoder-only BERT-based and decoder-only Llama2). We show that (1) Llama2
performs well on certain entities and exhibits potential for substantial
improvement with optimized prompt templates, (2) law-oriented LMs show
inconsistent performance, possibly due to variations in their training corpus,
(3) LMs demonstrate the ability to type entities even in the case of
multi-token entities, (4) all models struggle with entities belonging to
sub-domains of the law, and (5) Llama2 appears to frequently overlook syntactic
cues, a shortcoming less present in BERT-based architectures.
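For illustration, the sketch below shows how the two prompting styles compared in the paper (cloze sentences and QA-based templates) might be instantiated for entity typing with a BERT-style masked LM. The example entity, the candidate label set, and the template wording are assumptions made here for demonstration, not the authors' exact materials.

```python
# Minimal sketch of cloze-style vs. QA-style prompting for legal entity typing.
# The entity, label set, and templates are illustrative assumptions only.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
mask = fill_mask.tokenizer.mask_token  # "[MASK]" for BERT

entity = "the International Criminal Court"
labels = ["court", "statute", "judge", "treaty", "country"]

# Cloze-style template: the entity type is predicted in a masked slot.
cloze_prompt = f"{entity} is a {mask}."

# QA-style template: a question frames the typing decision, answer is masked.
qa_prompt = f"Question: What type of entity is {entity}? Answer: {mask}."

for prompt in (cloze_prompt, qa_prompt):
    # Score only the candidate labels and keep the highest-scoring one.
    predictions = fill_mask(prompt, targets=labels)
    best = max(predictions, key=lambda p: p["score"])
    print(f"{prompt}\n  -> predicted type: {best['token_str']} "
          f"(score {best['score']:.3f})")
```

For a decoder-only model such as Llama2, the same templates would instead be completed by generation, with the predicted type read from the model's continuation rather than from a filled mask token.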
Related papers
- Large Language Models for Judicial Entity Extraction: A Comparative Study [0.0]
This research investigates the application of Large Language Models in identifying domain-specific entities within case law documents.
The study evaluates the performance of state-of-the-art Large Language Model architectures, including Large Language Model Meta AI 3, Mistral, and Gemma.
arXiv Detail & Related papers (2024-07-08T09:49:03Z) - Can Large Language Models Identify Authorship? [16.35265384114857]
Large Language Models (LLMs) have demonstrated an exceptional capacity for reasoning and problem-solving.
This work seeks to address three research questions: (1) Can LLMs perform zero-shot, end-to-end authorship verification effectively?
(2) Are LLMs capable of accurately attributing authorship among multiple candidate authors (e.g., 10 and 20)?
arXiv Detail & Related papers (2024-03-13T03:22:02Z) - Knowledge Plugins: Enhancing Large Language Models for Domain-Specific
Recommendations [50.81844184210381]
We propose a general paradigm that augments large language models with DOmain-specific KnowledgE to enhance their performance on practical applications, namely DOKE.
This paradigm relies on a domain knowledge extractor, working in three steps: 1) preparing effective knowledge for the task; 2) selecting the knowledge for each specific sample; and 3) expressing the knowledge in an LLM-understandable way.
arXiv Detail & Related papers (2023-11-16T07:09:38Z) - Can Linguistic Knowledge Improve Multimodal Alignment in Vision-Language
Pretraining? [34.609984453754656]
We aim to elucidate the impact of comprehensive linguistic knowledge, including semantic expression and syntactic structure, on multimodal alignment.
Specifically, we design and release the SNARE, the first large-scale multimodal alignment probing benchmark.
arXiv Detail & Related papers (2023-08-24T16:17:40Z) - One Law, Many Languages: Benchmarking Multilingual Legal Reasoning for Judicial Support [18.810320088441678]
This work introduces a novel NLP benchmark for the legal domain.
It challenges LLMs in five key dimensions: processing long documents (up to 50K tokens), using domain-specific knowledge (embodied in legal texts), and multilingual understanding (covering five languages).
Our benchmark contains diverse datasets from the Swiss legal system, allowing for a comprehensive study of the underlying non-English, inherently multilingual legal system.
arXiv Detail & Related papers (2023-06-15T16:19:15Z) - SAILER: Structure-aware Pre-trained Language Model for Legal Case
Retrieval [75.05173891207214]
Legal case retrieval plays a core role in the intelligent legal system.
Most existing language models have difficulty understanding the long-distance dependencies between different structures.
We propose a new Structure-Aware pre-traIned language model for LEgal case Retrieval.
arXiv Detail & Related papers (2023-04-22T10:47:01Z) - Prompting Language Models for Linguistic Structure [73.11488464916668]
We present a structured prompting approach for linguistic structured prediction tasks.
We evaluate this approach on part-of-speech tagging, named entity recognition, and sentence chunking.
We find that while PLMs contain significant prior knowledge of task labels due to task leakage into the pretraining corpus, structured prompting can also retrieve linguistic structure with arbitrary labels.
arXiv Detail & Related papers (2022-11-15T01:13:39Z) - A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z) - AM2iCo: Evaluating Word Meaning in Context across Low-Resource Languages
with Adversarial Examples [51.048234591165155]
We present AM2iCo, Adversarial and Multilingual Meaning in Context.
It aims to faithfully assess the ability of state-of-the-art (SotA) representation models to understand the identity of word meaning in cross-lingual contexts.
Results reveal that current SotA pretrained encoders substantially lag behind human performance.
arXiv Detail & Related papers (2021-04-17T20:23:45Z) - ERICA: Improving Entity and Relation Understanding for Pre-trained
Language Models via Contrastive Learning [97.10875695679499]
We propose a novel contrastive learning framework named ERICA in the pre-training phase to obtain a deeper understanding of the entities and their relations in text.
Experimental results demonstrate that our proposed ERICA framework achieves consistent improvements on several document-level language understanding tasks.
arXiv Detail & Related papers (2020-12-30T03:35:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.