Research on Evaluation Methods for Patent Novelty Search Systems and Empirical Analysis
- URL: http://arxiv.org/abs/2508.17782v1
- Date: Mon, 25 Aug 2025 08:24:04 GMT
- Title: Research on Evaluation Methods for Patent Novelty Search Systems and Empirical Analysis
- Authors: Shu Zhang, LiSha Zhang, Kai Duan, XinKai Sun
- Abstract summary: We propose a comprehensive evaluation methodology that builds high-quality, reproducible datasets from examiner citations and X-type citations extracted from technically consistent family patents. Experiments show the method effectively exposes performance differences across scenarios and offers actionable evidence for system improvement. The framework is scalable and practical, providing a useful reference for development and optimization of patent novelty search systems.
- Score: 1.5611734619214201
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Patent novelty search systems are critical to IP protection and innovation assessment; their retrieval accuracy directly impacts patent quality. We propose a comprehensive evaluation methodology that builds high-quality, reproducible datasets from examiner citations and X-type citations extracted from technically consistent family patents, and evaluates systems using invention descriptions as inputs. Using Top-k Detection Rate and Recall as core metrics, we further conduct multi-dimensional analyses by language, technical field (IPC), and filing jurisdiction. Experiments show the method effectively exposes performance differences across scenarios and offers actionable evidence for system improvement. The framework is scalable and practical, providing a useful reference for development and optimization of patent novelty search systems.
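The abstract names Top-k Detection Rate and Recall as core metrics but does not define them here. The sketch below shows one plausible reading, which is an assumption rather than the authors' definition: a query counts as "detected" if at least one ground-truth citation appears in its top k results, and recall is the fraction of a query's ground-truth citations found in the top k, averaged over queries. The names `top_k_metrics` and `evaluate`, the record fields, and the sample data are all hypothetical.

```python
from collections import defaultdict


def top_k_metrics(ranked, relevant, k):
    """Return (detected, recall) for a single query.

    Assumed definitions (not taken from the paper):
    detected = 1.0 if any relevant document is in the top k, else 0.0
    recall   = share of relevant documents found in the top k
    """
    hits = len(set(ranked[:k]) & set(relevant))
    detected = 1.0 if hits > 0 else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return detected, recall


def evaluate(queries, k, group_by=None):
    """Average both metrics over all queries, or per value of `group_by`."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # detected sum, recall sum, count
    for query in queries:
        key = query[group_by] if group_by else "all"
        detected, recall = top_k_metrics(query["ranked"], query["relevant"], k)
        bucket = sums[key]
        bucket[0] += detected
        bucket[1] += recall
        bucket[2] += 1
    return {
        key: {"detection_rate": d / n, "recall": r / n}
        for key, (d, r, n) in sums.items()
    }


# Hypothetical toy data: ranked system output and ground-truth citations.
queries = [
    {"ipc": "G06", "ranked": ["A", "B", "C"], "relevant": ["B", "D"]},
    {"ipc": "G06", "ranked": ["E", "F", "G"], "relevant": ["H"]},
    {"ipc": "H04", "ranked": ["I", "J", "K"], "relevant": ["I", "J"]},
]

overall = evaluate(queries, k=2)
by_ipc = evaluate(queries, k=2, group_by="ipc")
print(overall)
print(by_ipc)
```

Passing a different `group_by` field (for example a language or jurisdiction key, if the records carried one) would give the kind of per-dimension breakdown the abstract describes.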
Related papers
- Revisiting Logit Distributions for Reliable Out-of-Distribution Detection [73.9121001113687]
Out-of-distribution (OOD) detection is critical for ensuring the reliability of deep learning models in open-world applications. LogitGap is a novel post-hoc OOD detection method that exploits the relationship between the maximum logit and the remaining logits. We show that LogitGap consistently achieves state-of-the-art performance across diverse OOD detection scenarios and benchmarks.
arXiv Detail & Related papers (2025-10-23T02:16:45Z)
- Medical Reasoning in the Era of LLMs: A Systematic Review of Enhancement Techniques and Applications [59.721265428780946]
Large Language Models (LLMs) in medicine have enabled impressive capabilities, yet a critical gap remains in their ability to perform systematic, transparent, and verifiable reasoning. This paper provides the first systematic review of this emerging field. We propose a taxonomy of reasoning enhancement techniques, categorized into training-time strategies and test-time mechanisms.
arXiv Detail & Related papers (2025-08-01T14:41:31Z)
- PatentMind: A Multi-Aspect Reasoning Graph for Patent Similarity Evaluation [35.13558856456741]
Patent similarity evaluation plays a critical role in intellectual property analysis. We introduce PatentMind, a novel framework for patent similarity assessment based on a Multi-Aspect Reasoning Graph (MARG). Our framework provides a structured and semantically grounded foundation for real-world decision-making.
arXiv Detail & Related papers (2025-05-25T22:28:27Z)
- Scoring Verifiers: Evaluating Synthetic Verification for Code and Reasoning [59.25951947621526]
We propose an approach that can transform existing coding benchmarks into scoring and ranking datasets to evaluate the effectiveness of synthetic verifiers. We release four new benchmarks (HE-R, HE-R+, MBPP-R, and MBPP-R+) and analyze synthetic verification methods with standard, reasoning-based, and reward-based LLMs. Our experiments show that reasoning can significantly improve test case generation and that scaling the number of test cases enhances the verification accuracy.
arXiv Detail & Related papers (2025-02-19T15:32:11Z)
- Exploring Information Retrieval Landscapes: An Investigation of a Novel Evaluation Techniques and Comparative Document Splitting Methods [0.0]
In this study, the structured nature of textbooks, the conciseness of articles, and the narrative complexity of novels are shown to require distinct retrieval strategies.
A novel evaluation technique is introduced, utilizing an open-source model to generate a comprehensive dataset of question-and-answer pairs.
The evaluation employs weighted scoring metrics, including SequenceMatcher, BLEU, METEOR, and BERT Score, to assess the system's accuracy and relevance.
arXiv Detail & Related papers (2024-09-13T02:08:47Z)
- A Literature Review of Literature Reviews in Pattern Analysis and Machine Intelligence [51.26815896167173]
We present a comprehensive tertiary analysis of PAMI reviews along three complementary dimensions. Our analyses reveal distinctive organizational patterns as well as persistent gaps in current review practices. Finally, our evaluation of state-of-the-art AI-generated reviews indicates encouraging advances in coherence and organization.
arXiv Detail & Related papers (2024-02-20T11:28:50Z)
- Evaluating Generative Ad Hoc Information Retrieval [58.800799175084286]
Generative retrieval systems often directly return a grounded generated text as a response to a query.
Quantifying the utility of the textual responses is essential for appropriately evaluating such generative ad hoc retrieval.
arXiv Detail & Related papers (2023-11-08T14:05:00Z)
- A Survey on Sentence Embedding Models Performance for Patent Analysis [0.0]
We propose a standard library and dataset for assessing the accuracy of embeddings models based on PatentSBERTa approach.
Results show PatentSBERTa, Bert-for-patents, and TF-IDF Weighted Word Embeddings have the best accuracy for computing sentence embeddings at the subclass level.
arXiv Detail & Related papers (2022-04-28T12:04:42Z)
- Deep learning-based citation recommendation system for patents [5.376388266200792]
We present a novel dataset called PatentNet that includes textual information and metadata for approximately 110,000 patents from the Google Big Query service.
Compared with existing recommendation methods, the proposed benchmark method achieved a mean reciprocal rank of 0.2377 on the test set.
arXiv Detail & Related papers (2020-10-21T12:18:21Z)
- PONE: A Novel Automatic Evaluation Metric for Open-Domain Generative Dialogue Systems [48.99561874529323]
There are three kinds of automatic methods to evaluate the open-domain generative dialogue systems.
Due to the lack of systematic comparison, it is not clear which kind of metrics are more effective.
We propose a novel and feasible learning-based metric that can significantly improve the correlation with human judgments.
arXiv Detail & Related papers (2020-04-06T04:36:33Z)
- Wrapper Feature Selection Algorithm for the Optimization of an Indicator System of Patent Value Assessment [1.52292571922932]
The limitations of previous research on patent value assessment were analyzed.
A wrapper-mode feature selection algorithm that is based on classifier prediction accuracy was developed.
arXiv Detail & Related papers (2020-01-21T06:04:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.