Predictive Patentomics: Forecasting Innovation Success and Valuation
with ChatGPT
- URL: http://arxiv.org/abs/2307.01202v1
- Date: Thu, 22 Jun 2023 13:21:20 GMT
- Authors: Stephen Yang
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Analysis of innovation has been fundamentally limited by conventional
approaches to broad, structural variables. This paper pushes the boundaries,
taking an LLM approach to patent analysis with the groundbreaking ChatGPT
technology. OpenAI's state-of-the-art textual embedding accesses complex
information about the quality and impact of each invention to power deep
learning predictive models. The nuanced embedding drives a 24% incremental
improvement in R-squared predicting patent value and clearly isolates the worst
and best applications. These models enable a revision of the contemporary
Kogan, Papanikolaou, Seru, and Stoffman (2017) valuation of patents by a median
deviation of 1.5 times, accounting for potential institutional predictions.
Furthermore, the market fails to incorporate timely information about
applications; a long-short portfolio based on predicted acceptance rates
achieves significant abnormal returns of 3.3% annually. The models provide an
opportunity to revolutionize startup and small-firm corporate policy vis-a-vis
patenting.
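The pipeline the abstract describes — embed each patent's text, then fit a predictive model of patent value on the embeddings and measure R-squared out of sample — can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the embeddings here are random stand-ins for OpenAI text embeddings, the target is a synthetic stand-in for patent value, and closed-form ridge regression stands in for the paper's deep learning models.

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    # Closed-form ridge regression: w = (X'X + lam*I)^-1 X'y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def r_squared(y, y_hat):
    # Out-of-sample R-squared: 1 - SS_res / SS_tot
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
n, d = 1000, 32                     # patents x embedding dimensions (synthetic)
X = rng.normal(size=(n, d))         # stand-in for textual embeddings
w_true = rng.normal(size=d)
y = X @ w_true + rng.normal(scale=0.5, size=n)  # stand-in for log patent value

w = ridge_fit(X[:800], y[:800])     # fit on the first 800 patents
print(round(r_squared(y[800:], X[800:] @ w), 3))  # holdout R-squared
```

The same fitted scores could rank applications from worst to best, as the paper does when isolating tail outcomes; the split sizes and regularization strength above are arbitrary.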
Related papers
- PatentGPT: A Large Language Model for Patent Drafting Using Knowledge-based Fine-tuning Method [1.4496326701907591]
Existing large language models (LLMs) often fall short in this IP creation domain due to their lack of specialized knowledge and context-awareness.
We propose a groundbreaking framework for Knowledge Fine-Tuning (KFT) of LLMs, designed to endow AI with the ability to autonomously mine, understand, and apply domain-specific knowledge.
Our model, PatentGPT, has demonstrated outstanding performance, scoring up to approximately 400% higher in patent-related benchmark tests compared to state-of-the-art models.
arXiv Detail & Related papers (2024-08-26T12:00:29Z)
- Editable Fairness: Fine-Grained Bias Mitigation in Language Models [52.66450426729818]
We propose a novel debiasing approach, Fairness Stamp (FAST), which enables fine-grained calibration of individual social biases.
FAST surpasses state-of-the-art baselines with superior debiasing performance.
This highlights the potential of fine-grained debiasing strategies to achieve fairness in large language models.
arXiv Detail & Related papers (2024-08-07T17:14:58Z)
- Early screening of potential breakthrough technologies with enhanced interpretability: A patent-specific hierarchical attention network model [4.779196219827507]
We propose an interpretable machine learning approach to predicting future citation counts from patent texts.
A case study of 35,376 pharmaceutical patents demonstrates the effectiveness of our approach.
It is expected that the proposed approach will enhance expert-machine collaboration in identifying breakthrough technologies.
arXiv Detail & Related papers (2024-07-24T02:17:10Z)
- Design of reliable technology valuation model with calibrated machine learning of patent indicators [14.31250748501038]
We propose an analytical framework for reliable technology valuation using calibrated ML models.
We extract quantitative patent indicators that represent various technology characteristics as input data.
arXiv Detail & Related papers (2024-06-08T11:52:37Z)
- On the Societal Impact of Open Foundation Models [93.67389739906561]
We focus on open foundation models, defined here as those with broadly available model weights.
We identify five distinctive properties of open foundation models that lead to both their benefits and risks.
arXiv Detail & Related papers (2024-02-27T16:49:53Z)
- Unveiling Black-boxes: Explainable Deep Learning Models for Patent Classification [48.5140223214582]
State-of-the-art methods for multi-label patent classification rely on opaque deep neural networks (DNNs).
We propose a novel deep explainable patent classification framework by introducing layer-wise relevance propagation (LRP).
Considering the relevance score, we then generate explanations by visualizing relevant words for the predicted patent class.
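The relevance-propagation step described above can be sketched for a single linear layer. This is a minimal epsilon-rule LRP illustration with synthetic weights and a hypothetical three-class output, not the framework's actual implementation: output relevance is redistributed to inputs in proportion to each input's contribution, so word-level relevance scores can be read off the inputs.

```python
import numpy as np

def lrp_linear(x, W, relevance_out, eps=1e-9):
    """Epsilon-rule LRP for one linear layer y = x @ W:
    redistribute output relevance to inputs via contributions z_jk = x_j * W_jk."""
    z = x[:, None] * W                         # per-input contributions, shape (in, out)
    z_sum = z.sum(axis=0)                      # pre-activations per output unit
    denom = z_sum + eps * np.sign(z_sum)       # stabilizer avoids division by zero
    return (z / denom) @ relevance_out         # R_j = sum_k (z_jk / z_k) * R_k

rng = np.random.default_rng(1)
x = rng.normal(size=5)                 # e.g. word-level features of a patent text
W = rng.normal(size=(5, 3))            # weights to 3 hypothetical patent classes
R_out = np.array([1.0, 0.0, 0.0])      # relevance placed on the predicted class
R_in = lrp_linear(x, W, R_out)
print(R_in.round(3))                   # input relevances, approximately conserved
```

Relevance is (approximately) conserved layer to layer, which is the property that lets per-word scores be visualized for the predicted class; a full model would chain this rule through every layer.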
arXiv Detail & Related papers (2023-10-31T14:11:37Z)
- Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models [51.3422222472898]
We document the capability of large language models (LLMs) like ChatGPT to predict stock price movements using news headlines.
We develop a theoretical model incorporating information capacity constraints, underreaction, limits-to-arbitrage, and LLMs.
arXiv Detail & Related papers (2023-04-15T19:22:37Z)
- Uncertainty-Aware Instance Reweighting for Off-Policy Learning [63.31923483172859]
We propose an Uncertainty-aware Inverse Propensity Score estimator (UIPS) for improved off-policy learning.
Experiment results on synthetic and three real-world recommendation datasets demonstrate the advantageous sample efficiency of the proposed UIPS estimator.
arXiv Detail & Related papers (2023-03-11T11:42:26Z) - Off-policy evaluation for learning-to-rank via interpolating the
item-position model and the position-based model [83.83064559894989]
A critical need for industrial recommender systems is the ability to evaluate recommendation policies offline, before deploying them to production.
We develop a new estimator that mitigates the problems of the two most popular off-policy estimators for rankings.
In particular, the new estimator, called INTERPOL, addresses the bias of a potentially misspecified position-based model.
arXiv Detail & Related papers (2022-10-15T17:22:30Z) - The Harvard USPTO Patent Dataset: A Large-Scale, Well-Structured, and
Multi-Purpose Corpus of Patent Applications [8.110699646062384]
We introduce the Harvard USPTO Patent dataset (HUPD)
With more than 4.5 million patent documents, HUPD is two to three times larger than comparable corpora.
By providing each application's metadata along with all of its text fields, the dataset enables researchers to perform new sets of NLP tasks.
arXiv Detail & Related papers (2022-07-08T17:57:15Z)
- A Survey on Sentence Embedding Models Performance for Patent Analysis [0.0]
We propose a standard library and dataset for assessing the accuracy of embeddings models based on PatentSBERTa approach.
Results show PatentSBERTa, Bert-for-patents, and TF-IDF Weighted Word Embeddings have the best accuracy for computing sentence embeddings at the subclass level.
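One of the compared approaches, TF-IDF weighted word embeddings, can be sketched as follows: each sentence vector is the TF-IDF-weighted average of its word vectors. The tokenized documents and random word vectors below are synthetic stand-ins (this is not the PatentSBERTa library's API), and the smoothed-IDF formula is one common variant.

```python
import math
import numpy as np

def tfidf_weights(docs):
    """Per-document TF-IDF weight for each token (smoothed IDF variant)."""
    n = len(docs)
    df = {}                                   # document frequency per token
    for doc in docs:
        for tok in set(doc):
            df[tok] = df.get(tok, 0) + 1
    out = []
    for doc in docs:
        tf = {tok: doc.count(tok) / len(doc) for tok in set(doc)}
        out.append({tok: tf[tok] * (math.log((1 + n) / (1 + df[tok])) + 1)
                    for tok in tf})
    return out

def sentence_embedding(doc, weights, word_vecs):
    """TF-IDF weighted average of word vectors for one tokenized document."""
    vecs = np.array([weights[tok] * word_vecs[tok] for tok in doc])
    return vecs.mean(axis=0)

docs = [["battery", "cell", "anode"], ["battery", "pack", "cooling"]]
rng = np.random.default_rng(2)
word_vecs = {tok: rng.normal(size=8) for doc in docs for tok in doc}

w = tfidf_weights(docs)
emb = [sentence_embedding(d, wi, word_vecs) for d, wi in zip(docs, w)]
print(emb[0].shape)   # (8,)
```

In the survey's setting, the resulting vectors would be compared (e.g. by cosine similarity) for subclass-level classification; pretrained word vectors would replace the random ones here.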
arXiv Detail & Related papers (2022-04-28T12:04:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.