Related papers: Patent Representation Learning via Self-supervision

Patent Representation Learning via Self-supervision

URL: http://arxiv.org/abs/2511.10657v1
Date: Mon, 03 Nov 2025 09:58:56 GMT
Title: Patent Representation Learning via Self-supervision
Authors: You Zuo, Kim Gerdes, Eric Villemonte de La Clergerie, Benoît Sagot,
Abstract summary: This paper presents a contrastive learning framework for learning patent embeddings by leveraging multiple views from within the same document.<n>We first identify a patent-specific failure mode of SimCSE style dropout augmentation: it produces overly uniform embeddings that lose semantic cohesion.<n>This design introduces natural semantic and structural diversity, mitigating over-dispersion and yielding embeddings that better preserve both global structure and local continuity.
Score: 14.128643457340617
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper presents a simple yet effective contrastive learning framework for learning patent embeddings by leveraging multiple views from within the same document. We first identify a patent-specific failure mode of SimCSE style dropout augmentation: it produces overly uniform embeddings that lose semantic cohesion. To remedy this, we propose section-based augmentation, where different sections of a patent (e.g., abstract, claims, background) serve as complementary views. This design introduces natural semantic and structural diversity, mitigating over-dispersion and yielding embeddings that better preserve both global structure and local continuity. On large-scale benchmarks, our fully self-supervised method matches or surpasses citation-and IPC-supervised baselines in prior-art retrieval and classification, while avoiding reliance on brittle or incomplete annotations. Our analysis further shows that different sections specialize for different tasks-claims and summaries benefit retrieval, while background sections aid classification-highlighting the value of patents' inherent discourse structure for representation learning. These results highlight the value of exploiting intra-document views for scalable and generalizable patent understanding.

Related papers

HAMLET-FFD: Hierarchical Adaptive Multi-modal Learning Embeddings Transformation for Face Forgery Detection [6.060036926093259]
HAMLET-FFD is a cross-domain generalization framework for face forgery detection.<n>It integrates visual evidence with conceptual cues, emulating expert forensic analysis.<n>By design, HAMLET-FFD freezes all pretrained parameters, serving as an external plugin.
arXiv Detail & Related papers (2025-07-28T15:09:52Z)
Hierarchical Multi-Positive Contrastive Learning for Patent Image Retrieval [0.2970959580204573]
Patent images are technical drawings that convey information about a patent's innovation.<n>Current methods neglect patents' hierarchical relationships, such as those defined by the Locarno International Classification system.<n>We introduce a hierarchical multi-positive contrastive loss that leverages the LIC's taxonomy to induce such relations in the retrieval process.
arXiv Detail & Related papers (2025-06-16T13:53:02Z)
A Hybrid Architecture with Efficient Fine Tuning for Abstractive Patent Document Summarization [0.0]
This study proposes a system for efficiently creating abstractive summaries of patent records.<n>The procedure involves leveraging the LexRank graph-based algorithm to retrieve the important sentences from input parent texts.
arXiv Detail & Related papers (2025-03-13T13:30:54Z)
"Principal Components" Enable A New Language of Images [79.45806370905775]
We introduce a novel visual tokenization framework that embeds a provable PCA-like structure into the latent token space.<n>Our approach achieves state-of-the-art reconstruction performance and enables better interpretability to align with the human vision system.
arXiv Detail & Related papers (2025-03-11T17:59:41Z)
Unity in Diversity: Multi-expert Knowledge Confrontation and Collaboration for Generalizable Vehicle Re-identification [60.20318058777603]
Generalizable vehicle re-identification (ReID) seeks to develop models that can adapt to unknown target domains without the need for fine-tuning or retraining.<n>Previous works have mainly focused on extracting domain-invariant features by aligning data distributions between source domains.<n>We propose a two-stage Multi-expert Knowledge Confrontation and Collaboration (MiKeCoCo) method to solve this unique problem.
arXiv Detail & Related papers (2024-07-10T04:06:39Z)
Uncovering Prototypical Knowledge for Weakly Open-Vocabulary Semantic Segmentation [59.37587762543934]
This paper studies the problem of weakly open-vocabulary semantic segmentation (WOVSS) Existing methods suffer from a granularity inconsistency regarding the usage of group tokens. We propose the prototypical guidance network (PGSeg) that incorporates multi-modal regularization.
arXiv Detail & Related papers (2023-10-29T13:18:00Z)
Robust Saliency-Aware Distillation for Few-shot Fine-grained Visual Recognition [57.08108545219043]
Recognizing novel sub-categories with scarce samples is an essential and challenging research topic in computer vision. Existing literature addresses this challenge by employing local-based representation approaches. This article proposes a novel model, Robust Saliency-aware Distillation (RSaD), for few-shot fine-grained visual recognition.
arXiv Detail & Related papers (2023-05-12T00:13:17Z)
Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning [53.68371566336254]
We argue that the key to better performance lies in meaningful latent modality structures instead of perfect modality alignment. Specifically, we design 1) a deep feature separation loss for intra-modality regularization; 2) a Brownian-bridge loss for inter-modality regularization; and 3) a geometric consistency loss for both intra- and inter-modality regularization.
arXiv Detail & Related papers (2023-03-10T14:38:49Z)
TaCo: Textual Attribute Recognition via Contrastive Learning [9.042957048594825]
TaCo is a contrastive framework for textual attribute recognition tailored toward the most common document scenes. We design the learning paradigm from three perspectives: 1) generating attribute views, 2) extracting subtle but crucial details, and 3) exploiting valued view pairs for learning. Experiments show that TaCo surpasses the supervised counterparts and advances the state-of-the-art remarkably on multiple attribute recognition tasks.
arXiv Detail & Related papers (2022-08-22T09:45:34Z)
Deep Partial Multi-View Learning [94.39367390062831]
We propose a novel framework termed Cross Partial Multi-View Networks (CPM-Nets) We fifirst provide a formal defifinition of completeness and versatility for multi-view representation. We then theoretically prove the versatility of the learned latent representations.
arXiv Detail & Related papers (2020-11-12T02:29:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.