Toward Improving Attentive Neural Networks in Legal Text Processing
- URL: http://arxiv.org/abs/2203.08244v1
- Date: Tue, 15 Mar 2022 20:45:22 GMT
- Title: Toward Improving Attentive Neural Networks in Legal Text Processing
- Authors: Ha-Thanh Nguyen
- Abstract summary: In this dissertation, we present the main achievements in improving attentive neural networks in automatic legal document processing.
Language models tend to grow larger and larger; however, without expert knowledge, these models can still fail in domain adaptation.
- Score: 0.20305676256390934
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, thanks to breakthroughs in neural network techniques
especially attentive deep learning models, natural language processing has made
many impressive achievements. However, automated legal text processing is still
a difficult branch of natural language processing. Legal sentences are often
long and contain complicated legal terminologies. Hence, models that work well
on general documents still face challenges in dealing with legal documents. We
have verified the existence of this problem with our experiments in this work.
In this dissertation, we selectively present the main achievements in improving
attentive neural networks in automatic legal document processing. Language
models tend to grow larger and larger; however, without expert knowledge, these
models can still fail in domain adaptation, especially for specialized fields
like law.
Related papers
- Improving Legal Judgement Prediction in Romanian with Long Text Encoders [0.8933959485129375]
We investigate specialized and general models for predicting the final ruling of a legal case, known as Legal Judgment Prediction (LJP).
In this work we focus on methods to extend the sequence length of Transformer-based models to better handle the long documents present in legal corpora.
arXiv Detail & Related papers (2024-02-29T13:52:33Z)
- Anatomy of Neural Language Models [0.0]
Transformer-based Language Models (LMs) have led to new state-of-the-art results in a wide spectrum of applications.
Transformers pretrained on language-modeling-like tasks have been widely adopted in computer vision and time series applications.
arXiv Detail & Related papers (2024-01-08T10:27:25Z)
- LegalRelectra: Mixed-domain Language Modeling for Long-range Legal Text Comprehension [6.442209435258797]
LegalRelectra is a legal-domain language model trained on mixed-domain legal and medical corpora.
Our training architecture implements the Electra framework, but utilizes Reformer instead of BERT for its generator and discriminator.
arXiv Detail & Related papers (2022-12-16T00:15:14Z)
- NeuroCounterfactuals: Beyond Minimal-Edit Counterfactuals for Richer Data Augmentation [55.17069935305069]
We introduce NeuroCounterfactuals, designed as loose counterfactuals, allowing for larger edits which result in naturalistic generations containing linguistic diversity.
Our novel generative approach bridges the benefits of constrained decoding, with those of language model adaptation for sentiment steering.
arXiv Detail & Related papers (2022-10-22T06:29:21Z)
- Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages.
We infer this distribution from a sample of typologically diverse training languages.
We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z)
- Lawformer: A Pre-trained Language Model for Chinese Legal Long Documents [56.40163943394202]
We release Lawformer, a Longformer-based pre-trained language model for understanding long Chinese legal documents.
We evaluate Lawformer on a variety of LegalAI tasks, including judgment prediction, similar case retrieval, legal reading comprehension, and legal question answering.
arXiv Detail & Related papers (2021-05-09T09:39:25Z)
- Customizing Contextualized Language Models for Legal Document Reviews [0.22940141855172028]
We show how different language models trained on general-domain corpora can best be customized for legal document review tasks.
We compare their efficiencies with respect to task performances and present practical considerations.
arXiv Detail & Related papers (2021-02-10T22:14:15Z)
- Deep Learning for Text Style Transfer: A Survey [71.8870854396927]
Text style transfer is an important task in natural language generation, which aims to control certain attributes in the generated text.
We present a systematic survey of the research on neural text style transfer, spanning over 100 representative articles since the first neural text style transfer work in 2017.
We discuss the task formulation, existing datasets and subtasks, evaluation, as well as the rich methodologies in the presence of parallel and non-parallel data.
arXiv Detail & Related papers (2020-11-01T04:04:43Z)
- Continual Learning for Natural Language Generation in Task-oriented Dialog Systems [72.92029584113676]
Natural language generation (NLG) is an essential component of task-oriented dialog systems.
We study NLG in a "continual learning" setting to expand its knowledge to new domains or functionalities incrementally.
The major challenge towards this goal is catastrophic forgetting, meaning that a continually trained model tends to forget the knowledge it has learned before.
arXiv Detail & Related papers (2020-10-02T10:32:29Z)
- Knowledge-Aware Procedural Text Understanding with Multi-Stage Training [110.93934567725826]
We focus on the task of procedural text understanding, which aims to comprehend such documents and track entities' states and locations during a process.
Two challenges, the difficulty of commonsense reasoning and data insufficiency, still remain unsolved.
We propose a novel KnOwledge-Aware proceduraL text understAnding (KOALA) model, which effectively leverages multiple forms of external knowledge.
arXiv Detail & Related papers (2020-09-28T10:28:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.