Deep learning-based citation recommendation system for patents
- URL: http://arxiv.org/abs/2010.10932v1
- Date: Wed, 21 Oct 2020 12:18:21 GMT
- Title: Deep learning-based citation recommendation system for patents
- Authors: Jaewoong Choi, Sion Jang, Jaeyoung Kim, Jiho Lee, Janghyeok Yoon,
Sungchul Choi
- Abstract summary: We present a novel dataset called PatentNet that includes textual information and metadata for approximately 110,000 patents from the Google Big Query service.
Compared with existing recommendation methods, the proposed benchmark method achieved a mean reciprocal rank of 0.2377 on the test set.
- Score: 5.376388266200792
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this study, we address the challenges in developing a deep learning-based
automatic patent citation recommendation system. Although deep learning-based
recommendation systems have exhibited outstanding performance in various
domains (such as movies, products, and paper citations), their validity in
patent citations has not been investigated, owing to the lack of a freely
available high-quality dataset and relevant benchmark model. To solve these
problems, we present a novel dataset called PatentNet that includes textual
information and metadata for approximately 110,000 patents from the Google Big
Query service. Further, we propose strong benchmark models considering the
similarity of textual information and metadata (such as cooperative patent
classification code). Compared with existing recommendation methods, the
proposed benchmark method achieved a mean reciprocal rank of 0.2377 on the test
set, whereas the existing state-of-the-art recommendation method achieved
0.2073.
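The mean reciprocal rank (MRR) quoted in the abstract can be illustrated with a minimal sketch. The abstract does not describe the paper's evaluation protocol, so this follows the standard MRR definition: for each query patent, take the reciprocal of the rank of the first correctly recommended citation, then average over all queries. The function names and the toy patent IDs are hypothetical and are not taken from PatentNet.

```python
from typing import Hashable, Sequence, Set


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1/rank of the first relevant item, or 0.0 if none is ranked."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]],
    relevant_sets: Sequence[Set[Hashable]],
) -> float:
    """Average the reciprocal rank over all queries (0.0 for no queries)."""
    if len(rankings) != len(relevant_sets):
        raise ValueError("rankings and relevant_sets must have the same length")
    if not rankings:
        return 0.0
    total = sum(reciprocal_rank(r, s) for r, s in zip(rankings, relevant_sets))
    return total / len(rankings)


# Hypothetical toy data: three query patents, each with a ranked list of
# recommended citations and the set of patents it actually cites.
rankings = [["P3", "P1", "P7"], ["P2", "P9", "P4"], ["P5", "P6", "P8"]]
cited = [{"P1"}, {"P2", "P4"}, {"P0"}]

mrr = mean_reciprocal_rank(rankings, cited)
print(round(mrr, 4))  # (1/2 + 1/1 + 0) / 3 = 0.5
```

Under this definition, an MRR of 0.2377 corresponds loosely to the first correct citation appearing around the fourth position on average, although the metric is an average of reciprocals rather than of ranks.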
Related papers
- Embedding in Recommender Systems: A Survey [67.67966158305603]
A crucial aspect is embedding techniques that convert high-dimensional discrete features, such as user and item IDs, into low-dimensional continuous vectors.
Applying embedding techniques captures complex entity relationships and has spurred substantial research.
This survey covers embedding methods like collaborative filtering, self-supervised learning, and graph-based techniques.
arXiv Detail & Related papers (2023-10-28T06:31:06Z)
- Impression-Aware Recommender Systems [57.38537491535016]
Novel data sources bring new opportunities to improve the quality of recommender systems.
Researchers may use impressions to refine user preferences and overcome the current limitations in recommender systems research.
We present a systematic literature review on recommender systems using impressions.
arXiv Detail & Related papers (2023-08-15T16:16:02Z)
- Improving Recommendation Relevance by simulating User Interest [77.34726150561087]
We observe that recommendation "recency" can be straightforwardly and transparently maintained by iterative reduction of ranks of inactive items.
The basic idea behind this work is patented in a context of online recommendation systems.
arXiv Detail & Related papers (2023-02-03T03:35:28Z)
- Estimating the Performance of Entity Resolution Algorithms: Lessons Learned Through PatentsView.org [3.8494315501944736]
This paper introduces a novel evaluation methodology for entity resolution algorithms.
It is motivated by PatentsView.org, a U.S. Patent and Trademark Office patent data exploration tool.
arXiv Detail & Related papers (2022-10-03T21:06:35Z)
- Tag-Aware Document Representation for Research Paper Recommendation [68.8204255655161]
We propose a hybrid approach that leverages deep semantic representation of research papers based on social tags assigned by users.
The proposed model is effective in recommending research papers even when the rating data is very sparse.
arXiv Detail & Related papers (2022-09-08T09:13:07Z)
- The Harvard USPTO Patent Dataset: A Large-Scale, Well-Structured, and Multi-Purpose Corpus of Patent Applications [8.110699646062384]
We introduce the Harvard USPTO Patent Dataset (HUPD).
With more than 4.5 million patent documents, HUPD is two to three times larger than comparable corpora.
By providing each application's metadata along with all of its text fields, the dataset enables researchers to perform new sets of NLP tasks.
arXiv Detail & Related papers (2022-07-08T17:57:15Z)
- A Survey on Sentence Embedding Models Performance for Patent Analysis [0.0]
We propose a standard library and dataset for assessing the accuracy of embedding models based on the PatentSBERTa approach.
Results show PatentSBERTa, Bert-for-patents, and TF-IDF Weighted Word Embeddings have the best accuracy for computing sentence embeddings at the subclass level.
arXiv Detail & Related papers (2022-04-28T12:04:42Z)
- Patent Sentiment Analysis to Highlight Patent Paragraphs [0.0]
Given a patent document, identifying distinct semantic annotations is an interesting research aspect.
In manual patent analysis, it is common practice to mark paragraphs according to their semantic information to improve readability.
This work assists patent practitioners in highlighting semantic information automatically and aids in creating a sustainable and efficient patent analysis process using machine learning.
arXiv Detail & Related papers (2021-11-06T13:28:29Z)
- TTRS: Tinkoff Transactions Recommender System benchmark [62.997667081978825]
We present TTRS, the Tinkoff Transactions Recommender System benchmark.
This financial transaction benchmark contains over 2 million interactions between almost 10,000 users and more than 1,000 merchant brands over 14 months.
We also present a comprehensive comparison of the current popular RecSys methods on the next-period recommendation task and conduct a detailed analysis of their performance against various metrics and recommendation goals.
arXiv Detail & Related papers (2021-10-11T20:04:07Z)
- Learning Neural Textual Representations for Citation Recommendation [7.227232362460348]
We propose a novel approach to citation recommendation using a deep sequential representation of the documents (Sentence-BERT) cascaded with Siamese and triplet networks in a submodular scoring function.
To the best of our knowledge, this is the first approach to combine deep representations and submodular selection for a task of citation recommendation.
arXiv Detail & Related papers (2020-07-08T12:38:50Z)
- PONE: A Novel Automatic Evaluation Metric for Open-Domain Generative Dialogue Systems [48.99561874529323]
There are three kinds of automatic methods for evaluating open-domain generative dialogue systems.
Due to the lack of systematic comparison, it is not clear which kind of metric is more effective.
We propose a novel and feasible learning-based metric that can significantly improve the correlation with human judgments.
arXiv Detail & Related papers (2020-04-06T04:36:33Z)