PatentEval: Understanding Errors in Patent Generation
- URL: http://arxiv.org/abs/2406.06589v2
- Date: Tue, 25 Jun 2024 08:23:03 GMT
- Title: PatentEval: Understanding Errors in Patent Generation
- Authors: You Zuo, Kim Gerdes, Eric Villemonte de La Clergerie, BenoƮt Sagot,
- Abstract summary: We introduce a comprehensive error typology specifically designed for evaluating two distinct tasks in machine-generated patent texts.
We have also developed a benchmark, PatentEval, for systematically assessing language models in this context.
- Score: 9.981773213952994
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we introduce a comprehensive error typology specifically designed for evaluating two distinct tasks in machine-generated patent texts: claims-to-abstract generation, and the generation of the next claim given previous ones. We have also developed a benchmark, PatentEval, for systematically assessing language models in this context. Our study includes a comparative analysis, annotated by humans, of various models. These range from those specifically adapted during training for tasks within the patent domain to the latest general-purpose large language models (LLMs). Furthermore, we explored and evaluated some metrics to approximate human judgments in patent text evaluation, analyzing the extent to which these metrics align with expert assessments. These approaches provide valuable insights into the capabilities and limitations of current language models in the specialized field of patent text generation.
Related papers
- Prompting Encoder Models for Zero-Shot Classification: A Cross-Domain Study in Italian [75.94354349994576]
This paper explores the feasibility of employing smaller, domain-specific encoder LMs alongside prompting techniques to enhance performance in specialized contexts.
Our study concentrates on the Italian bureaucratic and legal language, experimenting with both general-purpose and further pre-trained encoder-only models.
The results indicate that while further pre-trained models may show diminished robustness in general knowledge, they exhibit superior adaptability for domain-specific tasks, even in a zero-shot setting.
arXiv Detail & Related papers (2024-07-30T08:50:16Z) - InstructPatentGPT: Training patent language models to follow instructions with human feedback [0.9790236766474201]
This research aims to increase the likelihood for a language model to generate patent claims that have a higher chance of being granted.
To showcase the controllability of the language model, the system learns from granted patents and pre-grant applications with different rewards.
arXiv Detail & Related papers (2024-05-25T11:48:50Z) - Benchmarking Knowledge Boundary for Large Language Models: A Different Perspective on Model Evaluation [42.17092078439132]
We argue that it is not reliable and comprehensive to evaluate language models with a fixed question or limited paraphrases as the query.
We introduce a novel concept named knowledge boundary to encompass both prompt-agnostic and prompt-sensitive knowledge.
arXiv Detail & Related papers (2024-02-18T07:48:15Z) - Unveiling Black-boxes: Explainable Deep Learning Models for Patent
Classification [48.5140223214582]
State-of-the-art methods for multi-label patent classification rely on deep opaque neural networks (DNNs)
We propose a novel deep explainable patent classification framework by introducing layer-wise relevance propagation (LRP)
Considering the relevance score, we then generate explanations by visualizing relevant words for the predicted patent class.
arXiv Detail & Related papers (2023-10-31T14:11:37Z) - L2CEval: Evaluating Language-to-Code Generation Capabilities of Large
Language Models [102.00201523306986]
We present L2CEval, a systematic evaluation of the language-to-code generation capabilities of large language models (LLMs)
We analyze the factors that potentially affect their performance, such as model size, pretraining data, instruction tuning, and different prompting methods.
In addition to assessing model performance, we measure confidence calibration for the models and conduct human evaluations of the output programs.
arXiv Detail & Related papers (2023-09-29T17:57:00Z) - RAVEN: In-Context Learning with Retrieval-Augmented Encoder-Decoder Language Models [57.12888828853409]
RAVEN is a model that combines retrieval-augmented masked language modeling and prefix language modeling.
Fusion-in-Context Learning enables the model to leverage more in-context examples without requiring additional training.
Our work underscores the potential of retrieval-augmented encoder-decoder language models for in-context learning.
arXiv Detail & Related papers (2023-08-15T17:59:18Z) - Large-Scale Text Analysis Using Generative Language Models: A Case Study
in Discovering Public Value Expressions in AI Patents [2.246222223318928]
This paper employs a novel approach using a generative language model (GPT-4) to produce labels and rationales for large-scale text analysis.
We collect a database comprising 154,934 patent documents using an advanced Boolean query submitted to InnovationQ+.
We design a framework for identifying and labeling public value expressions in these AI patent sentences.
arXiv Detail & Related papers (2023-05-17T17:18:26Z) - A Survey on Sentence Embedding Models Performance for Patent Analysis [0.0]
We propose a standard library and dataset for assessing the accuracy of embeddings models based on PatentSBERTa approach.
Results show PatentSBERTa, Bert-for-patents, and TF-IDF Weighted Word Embeddings have the best accuracy for computing sentence embeddings at the subclass level.
arXiv Detail & Related papers (2022-04-28T12:04:42Z) - A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z) - Linguistically Informed Masking for Representation Learning in the
Patent Domain [7.911344873839031]
We propose the empirically motivated Linguistically Informed Masking (LIM) method to focus domain-adaptative pre-training on the linguistic patterns of patents.
We quantify the relevant differences between patent, scientific and general-purpose language.
We demonstrate the impact of balancing the learning from different information sources during domain adaptation for the patent domain.
arXiv Detail & Related papers (2021-06-10T14:20:57Z) - A Diagnostic Study of Explainability Techniques for Text Classification [52.879658637466605]
We develop a list of diagnostic properties for evaluating existing explainability techniques.
We compare the saliency scores assigned by the explainability techniques with human annotations of salient input regions to find relations between a model's performance and the agreement of its rationales with human ones.
arXiv Detail & Related papers (2020-09-25T12:01:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.