Fugu-MT Paper Translation (Abstract): A Novel Patent Similarity Measurement Methodology: Semantic Distance and Technological Distance

Paper overview: A Novel Patent Similarity Measurement Methodology: Semantic Distance and Technological Distance

arxiv url: http://arxiv.org/abs/2303.16767v2
Date: Fri, 1 Dec 2023 04:29:49 GMT
Status: Translation complete
Last updated in system: 2023-12-04 18:54:56.462052
Title: A Novel Patent Similarity Measurement Methodology: Semantic Distance and Technological Distance
Title (reference translation): A New Patent Similarity Measurement Method: Semantic Distance and Technological Distance
Authors: Yongmin Yoo, Cheonkam Jeong, Sanguk Gim, Junwon Lee, Zachary Schimke, Deaho Seo
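Abstract summary: Patent similarity analysis plays a crucial role in evaluating the risk of patent infringement. Recent advances in natural language processing technology offer a promising avenue for automating this process. This paper proposes a hybrid methodology that measures the similarity between patents by considering their semantic similarity, their technical similarity, and their bibliographic information.
Reference score (independently computed attention score): 0.0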
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Patent similarity analysis plays a crucial role in evaluating the risk of patent infringement. Nonetheless, this analysis is predominantly conducted manually by legal experts, often resulting in a time-consuming process. Recent advances in natural language processing technology offer a promising avenue for automating this process. However, methods for measuring similarity between patents still rely on experts manually classifying patents. Due to the recent development of artificial intelligence technology, a lot of research is being conducted focusing on the semantic similarity of patents using natural language processing technology. However, it is difficult to accurately analyze patent data, which are legal documents representing complex technologies, using existing natural language processing technologies. To address these limitations, we propose a hybrid methodology that measures the similarity between patents by considering the semantic similarity of patents, the technical similarity between patents, and the bibliographic information of patents. Using natural language processing techniques, we measure semantic similarity based on patent text and calculate technical similarity through the degree of co-occurrence of International Patent Classification (IPC) codes. The similarity of bibliographic information of a patent is calculated using the special characteristics of the patent: citation information, inventor information, and assignee information. We propose a model that assigns reasonable weights to each similarity method considered. With the help of experts, we performed manual similarity evaluations on 420 pairs and evaluated the performance of our model based on this data. We have empirically shown that our method outperforms recent natural language processing techniques.
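Abstract (reference translation): Patent similarity analysis plays an important role in assessing the risk of patent infringement. Nevertheless, this analysis is mainly carried out by hand by legal experts and is often time-consuming. Recent advances in natural language processing technology offer a promising means of automating this process. However, methods for measuring similarity between patents still depend on experts who classify patents manually. With the recent development of artificial intelligence technology, much research has focused on the semantic similarity of patents using natural language processing. Still, it is difficult to accurately analyze patent data, which are legal documents describing complex technologies, with existing natural language processing techniques. To address these limitations, this work proposes a hybrid methodology that measures the similarity between patents by considering their semantic similarity, the technical similarity between them, and their bibliographic information. Semantic similarity is measured from the patent text using natural language processing techniques, and technical similarity is calculated from the degree of co-occurrence of International Patent Classification (IPC) codes. The similarity of bibliographic information is calculated from characteristics particular to patents: citation information, inventor information, and assignee information. The paper proposes a model that assigns appropriate weights to each similarity method. With the help of experts, similarity was evaluated manually for 420 pairs, and the model's performance was evaluated against this data. The results show empirically that the method outperforms recent natural language processing techniques.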
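The abstract describes a weighted combination of three signals (semantic, technical, bibliographic) but does not give the formulas. The following is a minimal sketch of that idea, not the authors' model. It rests on several assumptions: bag-of-words cosine similarity stands in for the NLP-based semantic measure, Jaccard overlap of IPC codes stands in for the "degree of co-occurrence", bibliographic similarity is the plain average of three Jaccard overlaps, and the weights `(0.5, 0.3, 0.2)` are arbitrary placeholders. The function names, record fields, and example patents are all hypothetical.

```python
import math
from collections import Counter


def cosine_text_similarity(text_a, text_b):
    """Bag-of-words cosine similarity (assumed stand-in for a learned text embedding)."""
    va = Counter(text_a.lower().split())
    vb = Counter(text_b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def jaccard(items_a, items_b):
    """Jaccard overlap of two collections; 0.0 when both are empty."""
    a, b = set(items_a), set(items_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hybrid_similarity(p, q, weights=(0.5, 0.3, 0.2)):
    """Weighted sum of semantic, technical, and bibliographic similarity.

    The weights are illustrative placeholders, not values from the paper.
    """
    w_sem, w_tech, w_bib = weights
    semantic = cosine_text_similarity(p["text"], q["text"])
    # Assumption: IPC co-occurrence approximated by Jaccard overlap of code sets.
    technical = jaccard(p["ipc"], q["ipc"])
    # Assumption: equal-weight average over citations, inventors, and assignees.
    bibliographic = (
        jaccard(p["citations"], q["citations"])
        + jaccard(p["inventors"], q["inventors"])
        + jaccard(p["assignees"], q["assignees"])
    ) / 3
    return w_sem * semantic + w_tech * technical + w_bib * bibliographic


# Hypothetical patent records, for illustration only.
patent_a = {
    "text": "battery cell cooling system with liquid coolant channels",
    "ipc": ["H01M10/613", "H01M10/625"],
    "citations": ["REF-1", "REF-2"],
    "inventors": ["Inventor A"],
    "assignees": ["Company X"],
}
patent_b = {
    "text": "cooling system for battery modules using air channels",
    "ipc": ["H01M10/613", "H01M10/6561"],
    "citations": ["REF-2", "REF-3"],
    "inventors": ["Inventor B"],
    "assignees": ["Company X"],
}

score = hybrid_similarity(patent_a, patent_b)
print(f"hybrid similarity: {score:.3f}")
```

Because each component lies in [0, 1] and the placeholder weights sum to 1, the combined score also lies in [0, 1]. In practice the weights would be fitted against expert judgments such as the 420 manually evaluated pairs mentioned in the abstract.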

Related papers

PANORAMA: A Dataset and Benchmarks Capturing Decision Trails and Rationales in Patent Examination [44.74519851862391]
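PANORAMA is a dataset of 8,143 U.S. patent examination records. We decompose the decision trails into sequential benchmarks that emulate the patent examination process followed by patent professionals. We argue that advancing NLP, including LLMs, in the patent domain requires a deeper understanding of real-world patent examination.
Reference translation of paper (metadata) (2025-10-25T03:24:13Z)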
PatentMind: A Multi-Aspect Reasoning Graph for Patent Similarity Evaluation [32.272839191711114]
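We introduce PatentMind, a novel framework for patent similarity evaluation based on a Multi-Aspect Reasoning Graph (MARG). PatentMind decomposes patents into three core dimensions (technical features, application domain, and claim scope) and computes dimension-specific similarity scores. To support evaluation, we construct PatentSimBench, a human-annotated benchmark of 500 patent pairs.
Reference translation of paper (metadata) (2025-05-25T22:28:27Z)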
PatentEdits: Framing Patent Novelty as Textual Entailment [62.8514393375952]
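The dataset contains 105K examples of successful revisions. We design algorithms that label edits sentence by sentence and establish how well these edits can be predicted with large language models. We show that evaluating textual entailment between cited references and draft sentences is especially effective in predicting which inventive claims remain unchanged and which are novel with respect to prior art.
Reference translation of paper (metadata) (2024-11-20T17:23:40Z)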
A comparative analysis of embedding models for patent similarity [0.0]
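This paper makes two contributions to the field of text-based patent similarity. It compares the performance of different kinds of patent-specific pretrained embedding models.
Reference translation of paper (metadata) (2024-03-25T11:20:23Z)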
Connecting the Dots: Inferring Patent Phrase Similarity with Retrieved Phrase Graphs [18.86788223751979]
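We study the patent phrase similarity inference task, which measures the semantic similarity between two patent phrases. We propose a graph-augmented approach that amplifies the global contextual information of patent phrases.
Reference translation of paper (metadata) (2024-03-24T18:59:38Z)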
Natural Language Processing in Patents: A Survey [0.0]
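Patents, which encapsulate crucial technical and legal information, offer a rich domain for natural language processing (NLP) applications. As NLP technologies have evolved, large language models (LLMs) have demonstrated outstanding capabilities in general text processing and generation tasks. This survey aims to equip NLP researchers with the knowledge needed to navigate this complex domain efficiently.
Reference translation of paper (metadata) (2024-03-06T23:17:16Z)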
Measuring Technological Convergence in Encryption Technologies with Proximity Indices: A Text Mining and Bibliometric Analysis using OpenAlex [46.3643544723237]
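This study identifies technological convergence among emerging technologies in cybersecurity. The proposed approach integrates text mining and bibliometric analysis to formulate and predict technological proximity indices. Our case study finds substantial convergence between blockchain and public-key cryptography, as evidenced by their proximity.
Reference translation of paper (metadata) (2024-03-03T20:03:03Z)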
PaECTER: Patent-level Representation Learning using Citation-informed Transformers [0.16785092703248325]
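PaECTER is an open-source document-level encoder specific to patents. We fine-tune BERT for Patents with examiner-added citation information to generate numerical representations of patent documents. PaECTER outperforms the current state-of-the-art models used in the patent domain on similarity tasks.
Reference translation of paper (metadata) (2024-02-29T18:09:03Z)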
Unveiling Black-boxes: Explainable Deep Learning Models for Patent Classification [48.5140223214582]
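We introduce layer-wise relevance propagation (LRP) into state-of-the-art multi-label patent classification methods, which rely on deep, opaque neural networks (DNNs), and propose a detailed patent classification method. Using the relevance scores, we generate explanations by visualizing the words relevant to the predicted patent class.
Reference translation of paper (metadata) (2023-10-31T14:11:37Z)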
Multi label classification of Artificial Intelligence related patents using Modified D2SBERT and Sentence Attention mechanism [0.0]
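This paper proposes a method for classifying artificial intelligence-related patents published by the USPTO using natural language processing techniques and deep learning methods. Experimental results show higher performance than other deep learning methods.
Reference translation of paper (metadata) (2023-03-03T12:27:24Z)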
A Survey on Sentence Embedding Models Performance for Patent Analysis [0.0]
This paper proposes a standard library and dataset for assessing the accuracy of embedding models based on the PatentSBERTa approach. PatentSBERTa, Bert-for-patents, and TF-IDF Weighted Word Embeddings have the best accuracy for computing sentence embeddings at the subclass level.
Reference translation of paper (metadata) (2022-04-28T12:04:42Z)
Counterfactual Explanations as Interventions in Latent Space [62.997667081978825]
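Counterfactual explanations aim to provide end users with a set of features that need to be changed in order to achieve a desired outcome. Current approaches rarely consider the feasibility of the actions needed to achieve the proposed explanations. We present Counterfactual Explanations as Interventions in Latent Space (CEILS), a methodology for generating counterfactual explanations.
Reference translation of paper (metadata) (2021-06-14T20:48:48Z)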
An interdisciplinary conceptual study of Artificial Intelligence (AI) for helping benefit-risk assessment practices: Towards a comprehensive qualification matrix of AI programs and devices (pre-print 2020) [55.41644538483948]
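This paper offers a comprehensive analysis of existing concepts from different disciplines that address the notion of intelligence. The aim is to identify shared concepts and differences that matter for evaluating AI systems.
Reference translation of paper (metadata) (2021-05-07T12:01:31Z)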
A Diagnostic Study of Explainability Techniques for Text Classification [52.879658637466605]
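We develop a list of diagnostic properties for evaluating existing explainability techniques. We then compare the saliency scores assigned by the explainability techniques with human annotations of salient input regions to find relations between a model's performance and the agreement of its rationales with human ones.
Reference translation of paper (metadata) (2020-09-25T12:01:53Z)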

The related papers list is generated automatically from the titles and abstracts of papers on this site.

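This page shows information on the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from the use of this site (including all information and translations).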