Classification of Visualization Types and Perspectives in Patents
- URL: http://arxiv.org/abs/2307.10471v1
- Date: Wed, 19 Jul 2023 21:45:07 GMT
- Title: Classification of Visualization Types and Perspectives in Patents
- Authors: Junaid Ahmed Ghauri, Eric Müller-Budack, Ralph Ewerth
- Abstract summary: We adopt state-of-the-art deep learning methods for the classification of visualization types and perspectives in patent images.
We derive a set of hierarchical classes from a dataset that provides weakly-labeled data for image perspectives.
- Score: 9.123089032348311
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Due to the swift growth of patent applications each year, information and multimedia retrieval approaches that facilitate patent exploration and retrieval are of utmost importance. Different types of visualizations (e.g., graphs, technical drawings) and perspectives (e.g., side view, perspective view) are used to depict the details of innovations in patents. Classifying these images enables more efficient search and allows for further analysis. So far, datasets for image type classification lack some important visualization types for patents. Furthermore, related work does not make use of recent deep learning approaches, including transformers. In this paper, we adopt state-of-the-art deep learning methods for the classification of visualization types and perspectives in patent images. We extend the CLEF-IP dataset for image type classification in patents to ten classes and provide manual ground-truth annotations. In addition, we derive a set of hierarchical classes from a dataset that provides weakly-labeled data for image perspectives. Experimental results demonstrate the feasibility of the proposed approaches. Source code, models, and dataset will be made publicly available.
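The abstract does not include implementation details, but the recipe it describes, fine-tuning a pre-trained transformer for ten visualization-type classes, can be sketched as follows. The ViT backbone, hyperparameters, and input size are illustrative assumptions, not the authors' exact setup:
```python
# Minimal sketch: fine-tune a pre-trained Vision Transformer on the ten
# patent visualization-type classes (backbone and hyperparameters are
# assumptions, not the paper's exact configuration).
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 10  # extended CLEF-IP image type classes, per the abstract

model = models.vit_b_16(weights=models.ViT_B_16_Weights.IMAGENET1K_V1)
model.heads.head = nn.Linear(model.heads.head.in_features, NUM_CLASSES)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One supervised fine-tuning step on a batch of patent images
    shaped (N, 3, 224, 224) with integer class labels shaped (N,)."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```
The same loop applies to the perspective task; only the number of classes and the label source (the weakly-labeled hierarchical set) change.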
Related papers
- Large Language Model Informed Patent Image Retrieval [0.0]
We propose a language-informed, distribution-aware multimodal approach to patent image feature learning.
Our proposed method achieves state-of-the-art or comparable performance in image-based patent retrieval with mAP +53.3%, Recall@10 +41.8%, and MRR@10 +51.9%.
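As a point of reference for the reported numbers, Recall@10 and MRR@10 are typically computed as below; this is a generic sketch, not the paper's evaluation code:
```python
# Generic retrieval metrics. `ranked` holds, for each query, gallery ids
# ordered by decreasing similarity; `relevant` holds the set of correct
# gallery ids for each query.
def recall_at_k(ranked, relevant, k=10):
    hits = sum(bool(set(r[:k]) & rel) for r, rel in zip(ranked, relevant))
    return hits / len(ranked)

def mrr_at_k(ranked, relevant, k=10):
    total = 0.0
    for r, rel in zip(ranked, relevant):
        for rank, item in enumerate(r[:k], start=1):
            if item in rel:               # first correct hit determines the score
                total += 1.0 / rank
                break
    return total / len(ranked)

# One query whose first correct match appears at rank 2:
print(recall_at_k([[5, 3, 9]], [{3}]))  # 1.0
print(mrr_at_k([[5, 3, 9]], [{3}]))     # 0.5
```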
arXiv Detail & Related papers (2024-04-30T08:45:16Z)
- MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning [48.97640824497327]
We propose a novel framework leveraging domain-specific medical knowledge as guiding signals to integrate language information into the visual domain through image-text contrastive learning.
Our model includes global contrastive learning with our designed divergence encoder, local token-knowledge-patch alignment contrastive learning, and knowledge-guided category-level contrastive learning with expert knowledge.
Notably, MLIP surpasses state-of-the-art methods even with limited annotated data, highlighting the potential of multimodal pre-training in advancing medical representation learning.
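MLIP's full objective combines several losses, but the image-text contrastive core it builds on is commonly implemented as a symmetric InfoNCE loss over paired embeddings; a generic sketch, not the authors' code:
```python
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired (N, D) image/text
    embeddings; pair i is the positive for row i. Generic CLIP-style
    formulation, not MLIP's knowledge-guided variant."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature       # (N, N) similarities
    targets = torch.arange(len(img_emb), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)        # images -> texts
    loss_t2i = F.cross_entropy(logits.t(), targets)    # texts -> images
    return (loss_i2t + loss_t2i) / 2
```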
arXiv Detail & Related papers (2024-02-03T05:48:50Z)
- Knowledge-Aware Prompt Tuning for Generalizable Vision-Language Models [64.24227572048075]
We propose a Knowledge-Aware Prompt Tuning (KAPT) framework for vision-language models.
Our approach takes inspiration from human intelligence in which external knowledge is usually incorporated into recognizing novel categories of objects.
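Setting the knowledge-aware part aside, the basic mechanics of prompt tuning, learning a few continuous context vectors while the encoders stay frozen, can be sketched generically; the module below is a CoOp-style illustration, not KAPT itself:
```python
import torch
import torch.nn as nn

class LearnablePrompt(nn.Module):
    """Prepends trainable context vectors to the embedded tokens of a
    class name. KAPT additionally conditions prompts on external
    knowledge, which this sketch omits."""
    def __init__(self, n_ctx=4, dim=512):
        super().__init__()
        self.ctx = nn.Parameter(torch.randn(n_ctx, dim) * 0.02)

    def forward(self, class_token_emb):
        # class_token_emb: (L, dim) embedded tokens of one class name
        return torch.cat([self.ctx, class_token_emb], dim=0)  # (n_ctx + L, dim)

prompt = LearnablePrompt()
tokens = torch.randn(3, 512)   # stand-in for embedded class-name tokens
print(prompt(tokens).shape)    # torch.Size([7, 512])
```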
arXiv Detail & Related papers (2023-08-22T04:24:45Z)
- Event-based Dynamic Graph Representation Learning for Patent Application Trend Prediction [45.0907126466271]
We propose an event-based graph learning framework for patent application trend prediction.
In particular, our method is founded on the memorable representations of both companies and patent classification codes.
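The summary leaves the architecture open, but "memorable representations" updated by events are often implemented as a per-entity memory refreshed by a recurrent cell, as in temporal graph networks; the following is a generic sketch of that idea only, not the paper's model:
```python
import torch
import torch.nn as nn

class EventMemory(nn.Module):
    """Keeps one memory vector per entity (e.g., a company or a patent
    classification code) and refreshes it with a GRU cell whenever an
    event message involving that entity arrives."""
    def __init__(self, num_entities, dim=64):
        super().__init__()
        self.register_buffer("memory", torch.zeros(num_entities, dim))
        self.cell = nn.GRUCell(input_size=dim, hidden_size=dim)

    def update(self, entity_ids, messages):
        # entity_ids: (B,) long tensor; messages: (B, dim) event features
        new_mem = self.cell(messages, self.memory[entity_ids])
        self.memory[entity_ids] = new_mem.detach()   # persist without gradients
        return new_mem

mem = EventMemory(num_entities=100)
out = mem.update(torch.tensor([3, 7]), torch.randn(2, 64))  # (2, 64)
```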
arXiv Detail & Related papers (2023-08-04T05:43:32Z)
- Semantic Prompt for Few-Shot Image Recognition [76.68959583129335]
We propose a novel Semantic Prompt (SP) approach for few-shot learning.
The proposed approach achieves promising results, improving the 1-shot learning accuracy by 3.67% on average.
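At a high level, a semantic prompt injects class-name features into the visual feature extractor; one minimal way to illustrate the injection is to project a text embedding and prepend it as an extra token to the patch sequence (an assumption-laden simplification, not the paper's exact design):
```python
import torch
import torch.nn as nn

class SemanticTokenInjection(nn.Module):
    """Projects a semantic (class-name) embedding into the visual token
    space and prepends it to the patch tokens of a transformer backbone."""
    def __init__(self, text_dim=300, visual_dim=384):
        super().__init__()
        self.proj = nn.Linear(text_dim, visual_dim)

    def forward(self, patch_tokens, text_emb):
        # patch_tokens: (B, N, visual_dim); text_emb: (B, text_dim)
        prompt = self.proj(text_emb).unsqueeze(1)        # (B, 1, visual_dim)
        return torch.cat([prompt, patch_tokens], dim=1)  # (B, N + 1, visual_dim)
```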
arXiv Detail & Related papers (2023-03-24T16:32:19Z)
- SgVA-CLIP: Semantic-guided Visual Adapting of Vision-Language Models for Few-shot Image Classification [84.05253637260743]
We propose a new framework, named Semantic-guided Visual Adapting (SgVA), to extend vision-language pre-trained models.
SgVA produces discriminative task-specific visual features by comprehensively using a vision-specific contrastive loss, a cross-modal contrastive loss, and an implicit knowledge distillation.
State-of-the-art results on 13 datasets demonstrate that the adapted visual features can well complement the cross-modal features to improve few-shot image classification.
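Of the three losses, the implicit knowledge distillation is the simplest to illustrate in isolation: the adapted student predictions are kept close to the frozen vision-language model's zero-shot predictions. A generic distillation sketch (temperature and loss form are assumptions):
```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between softened teacher and student distributions,
    a standard knowledge-distillation term (not SgVA's exact loss)."""
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
```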
arXiv Detail & Related papers (2022-11-28T14:58:15Z)
- Facing the Void: Overcoming Missing Data in Multi-View Imagery [0.783788180051711]
We propose a novel technique for multi-view image classification robust to this problem.
The proposed method, based on state-of-the-art deep learning-based approaches and metric learning, can be easily adapted and exploited in other applications and domains.
Results show that the proposed algorithm provides improvements in multi-view image classification accuracy when compared to state-of-the-art methods.
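The summary does not spell out the mechanism, but a common baseline for tolerating absent views is mask-aware fusion of per-view embeddings, shown below purely as an illustration of the problem setting rather than the paper's metric-learning method:
```python
import torch

def fuse_views(view_embs, mask):
    """Average per-view embeddings over the views that are present.

    view_embs: (B, V, D) embeddings for V views (zeros where missing).
    mask:      (B, V) with 1.0 where a view exists, 0.0 where it is missing.
    """
    mask = mask.unsqueeze(-1)                  # (B, V, 1)
    summed = (view_embs * mask).sum(dim=1)     # missing views contribute nothing
    count = mask.sum(dim=1).clamp(min=1.0)     # avoid division by zero
    return summed / count                      # (B, D) fused embedding
```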
arXiv Detail & Related papers (2022-05-21T13:21:27Z)
- SEGA: Semantic Guided Attention on Visual Prototype for Few-Shot Learning [85.2093650907943]
We propose SEmantic Guided Attention (SEGA) to teach machines to recognize a new category.
SEGA uses semantic knowledge to guide visual perception in a top-down manner, indicating which visual features should be attended to.
We show that our semantic guided attention performs as intended and outperforms state-of-the-art results.
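The core idea, semantic features deciding which visual channels matter, can be sketched as channel-wise attention over a class prototype; the dimensions and gating network below are illustrative assumptions, not the published implementation:
```python
import torch
import torch.nn as nn

class SemanticChannelGate(nn.Module):
    """Uses a semantic (word) embedding to produce channel-wise attention
    over a visual class prototype, top-down."""
    def __init__(self, sem_dim=300, vis_dim=640):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(sem_dim, vis_dim),
            nn.Sigmoid(),   # per-channel weights in (0, 1)
        )

    def forward(self, prototype, semantic_emb):
        # prototype: (C, vis_dim) class prototypes; semantic_emb: (C, sem_dim)
        return prototype * self.gate(semantic_emb)
```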
arXiv Detail & Related papers (2021-11-08T08:03:44Z)
- A Step Toward More Inclusive People Annotations for Fairness [38.546190750434945]
We present a new set of annotations on a subset of the Open Images dataset called the MIAP subset.
The attributes and labeling methodology for the MIAP subset were designed to enable research into model fairness.
arXiv Detail & Related papers (2021-05-05T20:44:56Z)
- Saliency-driven Class Impressions for Feature Visualization of Deep Neural Networks [55.11806035788036]
It is advantageous to visualize the features considered to be essential for classification.
Existing visualization methods generate high-confidence images consisting of both background and foreground features.
In this work, we propose a saliency-driven approach to visualize discriminative features that are considered most important for a given task.
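Methods in this family start from activation maximization: synthesizing an input by gradient ascent on a class score. The bare-bones baseline looks as follows; the paper's saliency-driven restriction of the updates is omitted here:
```python
import torch

def class_impression(model, target_class, steps=200, lr=0.1, size=224):
    """Optimize a random image so the model's logit for `target_class`
    grows (plain activation maximization, without the saliency component)."""
    model.eval()
    img = torch.randn(1, 3, size, size, requires_grad=True)
    opt = torch.optim.Adam([img], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        score = model(img)[0, target_class]
        (-score).backward()   # gradient ascent on the class logit
        opt.step()
    return img.detach()
```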
arXiv Detail & Related papers (2020-07-31T06:11:06Z)
- A Convolutional Neural Network-based Patent Image Retrieval Method for Design Ideation [5.195924252155368]
We propose a convolutional neural network (CNN)-based patent image retrieval method.
The core of this approach is a novel neural network architecture named Dual-VGG.
The accuracy of both training tasks and the quality of the patent image embedding space are evaluated to demonstrate the performance of our model.
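A sketch of the generic retrieval pipeline such a model supports, embedding every patent image once and ranking by cosine similarity, is given below; a stock VGG backbone stands in for the paper's Dual-VGG architecture:
```python
import torch
import torch.nn.functional as F
from torchvision import models

# Stock VGG-16 as a stand-in embedding network; features come from the
# penultimate fully connected layer (4096-d).
backbone = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
backbone.classifier = backbone.classifier[:-1]   # drop the final class layer
backbone.eval()

@torch.no_grad()
def embed(images):                    # images: (N, 3, 224, 224)
    return F.normalize(backbone(images), dim=-1)

@torch.no_grad()
def retrieve(query, gallery_embs, k=5):
    sims = embed(query) @ gallery_embs.t()    # cosine similarities
    return sims.topk(k, dim=-1).indices       # indices of top-k matches
```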
arXiv Detail & Related papers (2020-03-10T13:32:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.