Related papers: Towards Better Evaluation for Generated Patent Claims

Towards Better Evaluation for Generated Patent Claims

URL: http://arxiv.org/abs/2505.11095v1
Date: Fri, 16 May 2025 10:27:16 GMT
Title: Towards Better Evaluation for Generated Patent Claims
Authors: Lekang Jiang, Pascal A Scherz, Stephan Goetz,
Abstract summary: We introduce Patent-CE, the first comprehensive benchmark for evaluating patent claims.<n>We also propose PatClaimEval, a novel multi-dimensional evaluation method specifically designed for patent claims.<n>This research provides the groundwork for more accurate evaluations of automated patent claim generation systems.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Patent claims define the scope of protection and establish the legal boundaries of an invention. Drafting these claims is a complex and time-consuming process that usually requires the expertise of skilled patent attorneys, which can form a large access barrier for many small enterprises. To solve these challenges, researchers have investigated the use of large language models (LLMs) for automating patent claim generation. However, existing studies highlight inconsistencies between automated evaluation metrics and human expert assessments. To bridge this gap, we introduce Patent-CE, the first comprehensive benchmark for evaluating patent claims. Patent-CE includes comparative claim evaluations annotated by patent experts, focusing on five key criteria: feature completeness, conceptual clarity, terminology consistency, logical linkage, and overall quality. Additionally, we propose PatClaimEval, a novel multi-dimensional evaluation method specifically designed for patent claims. Our experiments demonstrate that PatClaimEval achieves the highest correlation with human expert evaluations across all assessment criteria among all tested metrics. This research provides the groundwork for more accurate evaluations of automated patent claim generation systems.

Related papers

PatentMind: A Multi-Aspect Reasoning Graph for Patent Similarity Evaluation [32.272839191711114]
We introduce PatentMind, a novel framework for patent similarity assessment based on a Multi-Aspect Reasoning Graph (MARG)<n>PatentMind decomposes patents into three core dimensions: technical feature, application domain, and claim scope, to compute dimension-specific similarity scores.<n>To support evaluation, we construct PatentSimBench, a human-annotated benchmark comprising 500 patent pairs.
arXiv Detail & Related papers (2025-05-25T22:28:27Z)
PatentScore: Multi-dimensional Evaluation of LLM-Generated Patent Claims [32.272839191711114]
We introduce PatentScore, a multi-dimensional evaluation framework for assessing LLM-generated patent claims.<n>Unlike general-purpose NLG metrics, PatentScore reflects patent-specific constraints and document structures, enabling evaluation beyond surface similarity.<n>We report a Pearson correlation of $r = 0.819$ with expert annotations, outperforming existing NLG metrics.
arXiv Detail & Related papers (2025-05-25T22:20:11Z)
Scoring Verifiers: Evaluating Synthetic Verification for Code and Reasoning [59.25951947621526]
We propose an approach which can transform existing coding benchmarks into scoring and ranking datasets to evaluate the effectiveness of synthetic verifiers.<n>We release four new benchmarks (HE-R, HE-R+, MBPP-R, and MBPP-R+), and analyzed synthetic verification methods with standard, reasoning-based, and reward-based LLMs.<n>Our experiments show that reasoning can significantly improve test case generation and that scaling the number of test cases enhances the verification accuracy.
arXiv Detail & Related papers (2025-02-19T15:32:11Z)
AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons [62.374792825813394]
This paper introduces AILuminate v1.0, the first comprehensive industry-standard benchmark for assessing AI-product risk and reliability.<n>The benchmark evaluates an AI system's resistance to prompts designed to elicit dangerous, illegal, or undesirable behavior in 12 hazard categories.
arXiv Detail & Related papers (2025-02-19T05:58:52Z)
Can AI Examine Novelty of Patents?: Novelty Evaluation Based on the Correspondence between Patent Claim and Prior Art [5.655276956391884]
This paper introduces a novel challenge by evaluating the ability of large language models (LLMs) to assess patent novelty.<n>We present the first dataset specifically designed for novelty evaluation, derived from real patent examination cases.<n>Our study reveals that while classification models struggle to effectively assess novelty, generative models make predictions with a reasonable level of accuracy.
arXiv Detail & Related papers (2025-02-10T10:09:29Z)
Intelligent System for Automated Molecular Patent Infringement Assessment [38.48937966447085]
PatentFinder is a novel multi-agent and tool-enhanced intelligence system that can accurately and comprehensively evaluate small molecules for patent infringement.<n>PatentFinder features five specialized agents that collaboratively analyze patent claims and molecular structures.<n>PatentFinder autonomously generates detailed and interpretable patent infringement reports, showcasing enhanced accuracy and improved interpretability.
arXiv Detail & Related papers (2024-12-10T12:14:38Z)
Patent-CR: A Dataset for Patent Claim Revision [0.0]
This paper presents Patent-CR, the first dataset created for the patent claim revision task in English.<n>It includes both initial patent applications rejected by patent examiners and the final granted versions.
arXiv Detail & Related papers (2024-12-03T16:43:42Z)
PatentEdits: Framing Patent Novelty as Textual Entailment [62.8514393375952]
We introduce the PatentEdits dataset, which contains 105K examples of successful revisions. We design algorithms to label edits sentence by sentence, then establish how well these edits can be predicted with large language models. We demonstrate that evaluating textual entailment between cited references and draft sentences is especially effective in predicting which inventive claims remained unchanged or are novel in relation to prior art.
arXiv Detail & Related papers (2024-11-20T17:23:40Z)
Towards Automated Patent Workflows: AI-Orchestrated Multi-Agent Framework for Intellectual Property Management and Analysis [0.0]
PatExpert is an autonomous multi-agent conversational framework designed to streamline and optimize patent-related tasks. The framework consists of a metaagent that coordinates task-specific expert agents for various patent-related tasks and a critique agent for error handling and feedback provision.
arXiv Detail & Related papers (2024-09-21T13:44:34Z)
Structural Representation Learning and Disentanglement for Evidential Chinese Patent Approval Prediction [19.287231890434718]
This paper presents the pioneering effort on this task using a retrieval-based classification approach. We propose a novel framework called DiSPat, which focuses on structural representation learning and disentanglement. Our framework surpasses state-of-the-art baselines on patent approval prediction, while also exhibiting enhanced evidentiality.
arXiv Detail & Related papers (2024-08-23T05:44:16Z)
Unveiling Black-boxes: Explainable Deep Learning Models for Patent Classification [48.5140223214582]
State-of-the-art methods for multi-label patent classification rely on deep opaque neural networks (DNNs) We propose a novel deep explainable patent classification framework by introducing layer-wise relevance propagation (LRP) Considering the relevance score, we then generate explanations by visualizing relevant words for the predicted patent class.
arXiv Detail & Related papers (2023-10-31T14:11:37Z)
Towards a Complete Metamorphic Testing Pipeline [56.75969180129005]
Metamorphic Testing (MT) addresses the test oracle problem by examining the relationships between input-output pairs in consecutive executions of the System Under Test (SUT) These relations, known as Metamorphic Relations (MRs), specify the expected output changes resulting from specific input changes. Our research aims to develop methods and tools that assist testers in generating MRs, defining constraints, and providing explainability for MR outcomes.
arXiv Detail & Related papers (2023-09-30T10:49:22Z)
Adaptive Taxonomy Learning and Historical Patterns Modelling for Patent Classification [26.85734804493925]
We propose an integrated framework that comprehensively considers the information on patents for patent classification. We first present an IPC codes correlations learning module to derive their semantic representations. Finally, we combine the contextual information of patent texts that contains the semantics of IPC codes, and assignees' sequential preferences to make predictions.
arXiv Detail & Related papers (2023-08-10T07:02:24Z)
A Survey on Sentence Embedding Models Performance for Patent Analysis [0.0]
We propose a standard library and dataset for assessing the accuracy of embeddings models based on PatentSBERTa approach. Results show PatentSBERTa, Bert-for-patents, and TF-IDF Weighted Word Embeddings have the best accuracy for computing sentence embeddings at the subclass level.
arXiv Detail & Related papers (2022-04-28T12:04:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.