BERT based patent novelty search by training claims to their own description
- URL: http://arxiv.org/abs/2103.01126v3
- Date: Thu, 4 Mar 2021 13:39:23 GMT
- Title: BERT based patent novelty search by training claims to their own description
- Authors: Michael Freunek and André Bodmer
- Abstract summary: We introduce a new scoring scheme, relevance scoring or novelty scoring, to process the output of BERT in a meaningful way.
We tested the method on patent applications by training BERT on the first claims of patents and their corresponding descriptions.
BERT's output was processed according to the relevance score, and the results were compared with the X documents cited in the search reports.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we present a method to concatenate patent claims to
their own descriptions. By applying this method, BERT learns to associate
claims with suitable descriptions. Such a trained BERT
(claim-to-description-BERT) could be able to identify novelty-relevant
descriptions for patents. In addition, we introduce a new scoring scheme,
relevance scoring or novelty scoring, to process the output of BERT in a
meaningful way. We tested the method on patent applications by training BERT
on the first claims of patents and their corresponding descriptions. BERT's
output was processed according to the relevance score, and the results were
compared with the X documents cited in the search reports. The test showed
that BERT scored some of the cited X documents as highly relevant.
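The abstract leaves the exact training objective open, but claim-to-description pairing reads naturally as a sentence-pair relevance task. The sketch below shows only the scoring step under that assumption, using Hugging Face Transformers: a BERT pair classifier (imagined as fine-tuned on first claims paired with their own descriptions as positives and random pairings as negatives) assigns each candidate description a relevance score for a query claim. The base model, class index, and example texts are illustrative, not from the paper.

```python
# Hypothetical sketch of the scoring step, NOT the paper's released code.
# "bert-base-uncased" stands in for a checkpoint fine-tuned on
# (first claim, own description) positives and random-pairing negatives.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
model.eval()

def relevance_scores(claim: str, descriptions: list[str]) -> list[float]:
    """Score each description's relevance to the claim (class 1 = relevant)."""
    enc = tokenizer([claim] * len(descriptions), descriptions,
                    truncation=True, padding=True, max_length=512,
                    return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits
    return torch.softmax(logits, dim=-1)[:, 1].tolist()

claim = "A device comprising a sensor configured to ..."
candidates = ["Description of a prior-art sensor device ...",
              "Description of an unrelated apparatus ..."]
ranked = sorted(zip(relevance_scores(claim, candidates), candidates),
                reverse=True)  # most relevant descriptions first
```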
Related papers
- PaECTER: Patent-level Representation Learning using Citation-informed Transformers
PaECTER is a publicly available, open-source document-level encoder specific to patents.
We fine-tune BERT for Patents with examiner-added citation information to generate numerical representations for patent documents.
PaECTER performs better in similarity tasks than current state-of-the-art models used in the patent domain.
arXiv Detail & Related papers (2024-02-29T18:09:03Z) - Multi label classification of Artificial Intelligence related patents
- Multi label classification of Artificial Intelligence related patents using Modified D2SBERT and Sentence Attention mechanism
We present a method for classifying artificial intelligence-related patents published by the USPTO using natural language processing techniques and deep learning methodology.
Our experiments achieve the highest performance compared to other deep learning methods.
arXiv Detail & Related papers (2023-03-03T12:27:24Z) - A Systematic Review and Replicability Study of BERT4Rec for Sequential
Recommendation [91.02268704681124]
BERT4Rec is an effective model for sequential recommendation based on the Transformer architecture.
We show that BERT4Rec results are not consistent across the publications that compare against it.
We propose our own implementation of BERT4Rec based on the Hugging Face Transformers library.
arXiv Detail & Related papers (2022-07-15T14:09:04Z) - BiBERT: Accurate Fully Binarized BERT [69.35727280997617]
- BiBERT: Accurate Fully Binarized BERT
BiBERT is an accurate fully binarized BERT designed to eliminate performance bottlenecks.
Our method yields an impressive 56.3x saving in FLOPs and 31.2x saving in model size.
arXiv Detail & Related papers (2022-03-12T09:46:13Z) - BERT based freedom to operate patent analysis [0.0]
- BERT based freedom to operate patent analysis
We present a method to apply BERT to freedom-to-operate patent analysis and patent searches.
BERT is fine-tuned on patent descriptions paired with their independent claims.
arXiv Detail & Related papers (2021-04-12T18:30:46Z) - Hybrid Model for Patent Classification using Augmented SBERT and KNN [0.0]
This study provides a hybrid approach for patent claim classification with Sentence-BERT (SBERT) and K-Nearest Neighbours (KNN).
The proposed framework predicts the class and subclass of an input patent from its top-k semantically most similar patents.
arXiv Detail & Related papers (2021-03-22T15:23:19Z) - TernaryBERT: Distillation-aware Ultra-low Bit BERT [53.06741585060951]
- TernaryBERT: Distillation-aware Ultra-low Bit BERT
We propose TernaryBERT, which ternarizes the weights in a fine-tuned BERT model.
Experiments on the GLUE benchmark and SQuAD show that TernaryBERT outperforms other BERT quantization methods.
arXiv Detail & Related papers (2020-09-27T10:17:28Z) - Prior Art Search and Reranking for Generated Patent Text [1.8275108630751844]
- Prior Art Search and Reranking for Generated Patent Text
To our knowledge, this work is the first to implement a reranking system that retrospectively identifies the inputs most similar to a GPT model's output.
arXiv Detail & Related papers (2020-09-19T01:16:18Z) - Context-Based Quotation Recommendation [60.93257124507105]
- Context-Based Quotation Recommendation
We propose a novel context-aware quote recommendation system.
It generates a ranked list of quotable paragraphs and spans of tokens from a given source document.
We conduct experiments on a collection of speech transcripts and associated news articles.
arXiv Detail & Related papers (2020-05-17T17:49:53Z) - DC-BERT: Decoupling Question and Document for Efficient Contextual
Encoding [90.85913515409275]
Recent studies on open-domain question answering have achieved prominent performance improvement using pre-trained language models such as BERT.
We propose DC-BERT, a contextual encoding framework that has dual BERT models: an online BERT which encodes the question only once, and an offline BERT which pre-encodes all the documents and caches their encodings.
On SQuAD Open and Natural Questions Open datasets, DC-BERT achieves 10x speedup on document retrieval, while retaining most (about 98%) of the QA performance.
arXiv Detail & Related papers (2020-02-28T08:18:37Z) - Incorporating BERT into Neural Machine Translation [251.54280200353674]
- Incorporating BERT into Neural Machine Translation
We propose a new algorithm named BERT-fused model, in which we first use BERT to extract representations for an input sequence.
We conduct experiments on supervised (including sentence-level and document-level translations), semi-supervised and unsupervised machine translation, and achieve state-of-the-art results on seven benchmark datasets.
arXiv Detail & Related papers (2020-02-17T08:13:36Z)