PEDANTIC: A Dataset for the Automatic Examination of Definiteness in Patent Claims
- URL: http://arxiv.org/abs/2505.21342v3
- Date: Wed, 18 Jun 2025 12:03:10 GMT
- Title: PEDANTIC: A Dataset for the Automatic Examination of Definiteness in Patent Claims
- Authors: Valentin Knappich, Annemarie Friedrich, Anna Hätty, Simon Razniewski
- Abstract summary: PEDANTIC is a dataset of 14k US patent claims annotated with reasons for indefiniteness. A human validation study confirms the pipeline's accuracy in generating high-quality annotations. PEDANTIC provides a valuable resource for patent AI researchers, enabling the development of advanced examination models.
- Score: 13.242188189150987
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Patent claims define the scope of protection for an invention. If there are ambiguities in a claim, it is rejected by the patent office. In the US, this is referred to as indefiniteness (35 U.S.C. § 112(b)) and is among the most frequent reasons for patent application rejection. The development of automatic methods for patent definiteness examination has the potential to make patent drafting and examination more efficient, but no annotated dataset has been published to date. We introduce PEDANTIC (Patent Definiteness Examination Corpus), a novel dataset of 14k US patent claims from patent applications relating to Natural Language Processing (NLP), annotated with reasons for indefiniteness. We construct PEDANTIC using a fully automatic pipeline that retrieves office action documents from the USPTO and uses Large Language Models (LLMs) to extract the reasons for indefiniteness. A human validation study confirms the pipeline's accuracy in generating high-quality annotations. To gain insight beyond binary classification metrics, we implement an LLM-as-Judge evaluation that compares the free-form reasoning of every model-cited reason with every examiner-cited reason. We show that LLM agents based on Qwen 2.5 32B and 72B struggle to outperform logistic regression baselines on definiteness prediction, even though they often correctly identify the underlying reasons. PEDANTIC provides a valuable resource for patent AI researchers, enabling the development of advanced examination models. We will publicly release the dataset and code.
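The LLM-as-Judge evaluation described in the abstract compares every model-cited reason against every examiner-cited reason. The following is a minimal, hypothetical sketch of that pairwise-matching scheme; a simple token-overlap heuristic stands in for the LLM judge, and the example reasons are invented for illustration.

```python
# Hypothetical sketch: pairwise matching of model-cited vs. examiner-cited
# indefiniteness reasons. A Jaccard token-overlap heuristic stands in for the
# LLM judge used in the paper; threshold and examples are illustrative only.

def judge_match(model_reason: str, examiner_reason: str, threshold: float = 0.3) -> bool:
    """Stand-in judge: token-overlap similarity; the paper uses an LLM instead."""
    a = set(model_reason.lower().split())
    b = set(examiner_reason.lower().split())
    return len(a & b) / len(a | b) >= threshold if a | b else False

def reason_precision_recall(model_reasons, examiner_reasons):
    """Precision: model reasons confirmed by some examiner reason; recall: the reverse."""
    matched_model = sum(any(judge_match(m, e) for e in examiner_reasons) for m in model_reasons)
    matched_exam = sum(any(judge_match(m, e) for m in model_reasons) for e in examiner_reasons)
    precision = matched_model / len(model_reasons) if model_reasons else 0.0
    recall = matched_exam / len(examiner_reasons) if examiner_reasons else 0.0
    return precision, recall

model = ["the term 'substantially' is a relative term without a standard"]
examiner = [
    "claim uses the relative term 'substantially' which renders the claim indefinite",
    "antecedent basis for 'the module' is missing",
]
p, r = reason_precision_recall(model, examiner)  # model misses the antecedent-basis reason
```

In this toy case the model's single reason matches the examiner's first reason but not the second, so precision is perfect while recall is only 0.5, illustrating how the pairwise scheme separates "finds real reasons" from "finds all reasons".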
Related papers
- Verifying the Verifiers: Unveiling Pitfalls and Potentials in Fact Verifiers [59.168391398830515]
We evaluate 12 pre-trained LLMs and one specialized fact-verifier, using a collection of examples from 14 fact-checking benchmarks. We highlight the importance of addressing annotation errors and ambiguity in datasets. Frontier LLMs with few-shot in-context examples, often overlooked in previous works, achieve top-tier performance.
arXiv Detail & Related papers (2025-06-16T10:32:10Z) - Towards Better Evaluation for Generated Patent Claims [0.0]
We introduce Patent-CE, the first comprehensive benchmark for evaluating patent claims. We also propose PatClaimEval, a novel multi-dimensional evaluation method specifically designed for patent claims. This research provides the groundwork for more accurate evaluations of automated patent claim generation systems.
arXiv Detail & Related papers (2025-05-16T10:27:16Z) - Can AI Examine Novelty of Patents?: Novelty Evaluation Based on the Correspondence between Patent Claim and Prior Art [5.655276956391884]
This paper introduces a novel challenge by evaluating the ability of large language models (LLMs) to assess patent novelty. We present the first dataset specifically designed for novelty evaluation, derived from real patent examination cases. Our study reveals that while classification models struggle to effectively assess novelty, generative models make predictions with a reasonable level of accuracy.
arXiv Detail & Related papers (2025-02-10T10:09:29Z) - Towards Copyright Protection for Knowledge Bases of Retrieval-augmented Language Models via Reasoning [58.57194301645823]
Large language models (LLMs) are increasingly integrated into real-world personalized applications. The valuable and often proprietary nature of the knowledge bases used in RAG introduces the risk of unauthorized usage by adversaries. Existing methods that can be generalized as watermarking techniques to protect these knowledge bases typically involve poisoning or backdoor attacks. We propose name for 'harmless' copyright protection of knowledge bases.
arXiv Detail & Related papers (2025-02-10T09:15:56Z) - Patent-CR: A Dataset for Patent Claim Revision [0.0]
This paper presents Patent-CR, the first dataset created for the patent claim revision task in English. It includes both initial patent applications rejected by patent examiners and the final granted versions.
arXiv Detail & Related papers (2024-12-03T16:43:42Z) - PatentEdits: Framing Patent Novelty as Textual Entailment [62.8514393375952]
We introduce the PatentEdits dataset, which contains 105K examples of successful revisions.
We design algorithms to label edits sentence by sentence, then establish how well these edits can be predicted with large language models.
We demonstrate that evaluating textual entailment between cited references and draft sentences is especially effective in predicting which inventive claims remained unchanged or are novel in relation to prior art.
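The PatentEdits summary above frames novelty prediction as textual entailment between cited prior-art sentences and draft claim sentences. As a hypothetical illustration of that framing, the sketch below uses a crude lexical-containment heuristic in place of the actual entailment model; all sentences are invented examples.

```python
# Toy illustration of novelty-as-entailment: a draft claim sentence that is
# "entailed" by a prior-art sentence is likely anticipated/unchanged, while one
# that is not entailed may be novel. A lexical-containment heuristic stands in
# for the real NLI/LLM entailment model; threshold and sentences are illustrative.

def entails(premise: str, hypothesis: str, threshold: float = 0.8) -> bool:
    """Stand-in entailment check: the premise 'entails' the hypothesis when most
    hypothesis tokens already appear in the premise."""
    p = set(premise.lower().split())
    h = set(hypothesis.lower().split())
    return len(h & p) / len(h) >= threshold if h else True

prior_art = "a sensor measures temperature and transmits readings wirelessly"
draft_old = "a sensor measures temperature and transmits readings"
draft_new = "a sensor calibrates itself using historical drift data"

anticipated = entails(prior_art, draft_old)   # fully covered by prior art
novel = not entails(prior_art, draft_new)     # introduces new content
```

The design point is the direction of the test: the prior-art reference is the premise and the draft sentence is the hypothesis, so "entailed" maps to "already disclosed".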
arXiv Detail & Related papers (2024-11-20T17:23:40Z) - Contrastive Learning to Improve Retrieval for Real-world Fact Checking [84.57583869042791]
We present Contrastive Fact-Checking Reranker (CFR), an improved retriever for fact-checking complex claims.
We leverage the AVeriTeC dataset, which annotates subquestions for claims with human written answers from evidence documents.
We find a 6% improvement in veracity classification accuracy on the dataset.
arXiv Detail & Related papers (2024-10-07T00:09:50Z) - ClaimCompare: A Data Pipeline for Evaluation of Novelty Destroying Patent Pairs [2.60235825984014]
We introduce a novel data pipeline, ClaimCompare, designed to generate labeled patent claim datasets suitable for training IR and ML models.
To the best of our knowledge, ClaimCompare is the first pipeline that can generate multiple novelty destroying patent datasets.
arXiv Detail & Related papers (2024-07-16T21:38:45Z) - Query Performance Prediction using Relevance Judgments Generated by Large Language Models [53.97064615557883]
We propose a new query performance prediction (QPP) framework using automatically generated relevance judgments (QPP-GenRE). QPP-GenRE decomposes QPP into independent subtasks of predicting the relevance of each item in a ranked list to a given query. We predict an item's relevance by using open-source large language models (LLMs) to ensure scientific relevance.
arXiv Detail & Related papers (2024-04-01T09:33:05Z) - PaECTER: Patent-level Representation Learning using Citation-informed Transformers [0.16785092703248325]
PaECTER is a publicly available, open-source document-level encoder specific for patents.
We fine-tune BERT for Patents with examiner-added citation information to generate numerical representations for patent documents.
PaECTER performs better in similarity tasks than current state-of-the-art models used in the patent domain.
arXiv Detail & Related papers (2024-02-29T18:09:03Z) - Unveiling Black-boxes: Explainable Deep Learning Models for Patent Classification [48.5140223214582]
State-of-the-art methods for multi-label patent classification rely on deep, opaque neural networks (DNNs). We propose a novel deep explainable patent classification framework by introducing layer-wise relevance propagation (LRP).
Considering the relevance score, we then generate explanations by visualizing relevant words for the predicted patent class.
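The relevance scores mentioned above come from layer-wise relevance propagation, which redistributes a prediction's relevance backwards through the network, layer by layer. The following is a minimal sketch of the basic LRP epsilon rule through a single linear layer, with invented toy numbers; the paper's actual framework applies this through a full patent classifier.

```python
# Minimal sketch of layer-wise relevance propagation (epsilon rule) through one
# linear layer. A full LRP pass repeats this backwards through every layer so
# that relevance is conserved from output to input. Values are illustrative.

def lrp_linear(activations, weights, relevance_out, eps=1e-6):
    """Redistribute output relevance to inputs:
    R_j = sum_k (a_j * w[j][k] / (z_k + eps)) * R_k, with z_k the pre-activation."""
    n_in, n_out = len(activations), len(relevance_out)
    # Pre-activations z_k = sum_j a_j * w[j][k]
    z = [sum(activations[j] * weights[j][k] for j in range(n_in)) for k in range(n_out)]
    return [
        sum(activations[j] * weights[j][k] / (z[k] + eps) * relevance_out[k]
            for k in range(n_out))
        for j in range(n_in)
    ]

a = [1.0, 2.0]                  # input activations
w = [[0.5, 1.0], [1.5, 0.0]]    # weights w[j][k]
R_out = [1.0, 1.0]              # relevance at the layer's output
R_in = lrp_linear(a, w, R_out)  # relevance redistributed to the inputs
# Conservation property: sum(R_in) ≈ sum(R_out)
```

In a text classifier, the relevance that arrives at the input embeddings is what gets visualized as per-word highlighting for the predicted class.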
arXiv Detail & Related papers (2023-10-31T14:11:37Z) - Can AI-Generated Text be Reliably Detected? [50.95804851595018]
Large Language Models (LLMs) perform impressively well in various applications. The potential for misuse of these models in activities such as plagiarism, generating fake news, and spamming has raised concerns about their responsible use. We stress-test the robustness of these AI text detectors in the presence of an attacker.
arXiv Detail & Related papers (2023-03-17T17:53:19Z) - Evaluating Generative Patent Language Models [1.8275108630751844]
This manuscript aims to build generative language models in the patent domain.
Performance is measured as the ratio of keystrokes that can be saved by autocompletion. The largest model built in this manuscript has 6B parameters and is state-of-the-art in the patent domain.
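The keystroke-savings metric above can be made concrete with a small sketch: whenever the model's suggested continuation matches the true text, the user accepts it and skips those characters. The suggester below is a hard-coded toy stand-in for the generative patent language model; the text and numbers are illustrative only.

```python
# Hypothetical sketch of the keystroke-savings ratio for autocompletion: if the
# model offers a correct continuation after a prefix, the remaining characters
# count as saved. A hard-coded toy suggester stands in for the patent LM.

def keystrokes_saved(text: str, suggest) -> float:
    """Fraction of characters the user did not have to type.

    `suggest(prefix)` returns a predicted continuation or None. A prediction
    that matches the true continuation is accepted in one action; otherwise
    the user types the next character manually."""
    typed = 0
    i = 0
    while i < len(text):
        completion = suggest(text[:i])
        if completion and text[i:].startswith(completion):
            i += len(completion)   # suggestion accepted: characters saved
        else:
            typed += 1             # user types one character
            i += 1
    return 1 - typed / len(text)

claim = "the patent claim recites a module"

def toy_suggest(prefix):
    # Toy model: completes "patent claim" once the user has typed "patent c".
    return "laim" if prefix.endswith("patent c") else None

ratio = keystrokes_saved(claim, toy_suggest)  # 4 of 33 characters saved
```

A stronger model that completes longer spans more often drives this ratio up, which is exactly the quantity the manuscript uses to compare patent language models.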
arXiv Detail & Related papers (2022-06-23T08:58:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.