Related papers: Towards Writer Retrieval for Historical Datasets

URL: http://arxiv.org/abs/2305.05358v2
Date: Wed, 14 Jun 2023 07:04:39 GMT
Title: Towards Writer Retrieval for Historical Datasets
Authors: Marco Peer, Florian Kleber, Robert Sablatnig
Abstract summary: An unsupervised approach for writer retrieval based on clustering SIFT descriptors detected at keypoint locations. The resulting cluster labels are used to train a residual network followed by the proposed NetRVLAD, an encoding layer with reduced complexity compared to NetVLAD. The approach also achieves comparable performance on a modern dataset.
Score: 0.6445605125467572
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper presents an unsupervised approach for writer retrieval based on clustering SIFT descriptors detected at keypoint locations, resulting in pseudo-cluster labels. With those cluster labels, a residual network followed by our proposed NetRVLAD, an encoding layer with reduced complexity compared to NetVLAD, is trained on 32x32 patches at keypoint locations. Additionally, we suggest a graph-based reranking algorithm called SGR to exploit similarities of the page embeddings to boost the retrieval performance. Our approach is evaluated on two historical datasets (Historical-WI and HisIR19). We include an evaluation of different backbones and NetRVLAD. Our approach competes with related work on historical datasets without using explicit encodings. We set a new state of the art on both datasets by applying our reranking scheme and show that our approach achieves comparable performance on a modern dataset as well.
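The pseudo-labelling step described in the abstract can be illustrated with a minimal sketch. It assumes plain k-means (Lloyd's algorithm) as the clustering method, which the abstract does not specify; the function name `kmeans_pseudo_labels`, the number of clusters, and the random array standing in for SIFT descriptors are all hypothetical and chosen only for illustration.

```python
import numpy as np


def kmeans_pseudo_labels(descriptors, k, n_iter=20, seed=0):
    """Cluster descriptors with plain k-means and return (labels, centroids).

    Assumption: k-means stands in for whatever clustering the paper uses.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise the centroids from k distinct descriptors.
    init_idx = rng.choice(len(descriptors), size=k, replace=False)
    centroids = descriptors[init_idx].copy()

    def assign(points, centres):
        # Squared Euclidean distance from every point to every centroid.
        diffs = points[:, None, :] - centres[None, :, :]
        return (diffs ** 2).sum(axis=2).argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(descriptors, centroids)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)

    # Final assignment against the last centroid update.
    return assign(descriptors, centroids), centroids


if __name__ == "__main__":
    # Random 128-dimensional vectors stand in for real SIFT descriptors.
    fake_descriptors = np.random.default_rng(1).normal(size=(500, 128))
    labels, _ = kmeans_pseudo_labels(fake_descriptors, k=10)
    print(np.bincount(labels, minlength=10))  # pseudo-label counts per cluster
```

In a pipeline like the one the abstract describes, each pseudo-label would then serve as the training target for the patch extracted at the corresponding keypoint.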

Related papers

SubGCache: Accelerating Graph-based RAG with Subgraph-level KV Cache [20.26177496265456]
SubGCache aims to reduce inference latency by reusing computation across queries with similar structural prompts. Experiments on two new datasets demonstrate that SubGCache consistently reduces inference latency with comparable and even improved generation quality.
arXiv Detail & Related papers (2025-05-16T07:39:41Z)
Evaluating Retrieval Quality in Retrieval-Augmented Generation [21.115495457454365]
Traditional end-to-end evaluation methods are computationally expensive. We propose eRAG, where each document in the retrieval list is individually utilized by the large language model within the RAG system. eRAG offers significant computational advantages, improving runtime and consuming up to 50 times less GPU memory than end-to-end evaluation.
arXiv Detail & Related papers (2024-04-21T21:22:28Z)
Generalized Category Discovery with Clustering Assignment Consistency [56.92546133591019]
Generalized category discovery (GCD) is a recently proposed open-world task. We propose a co-training-based framework that encourages clustering consistency. Our method achieves state-of-the-art performance on three generic benchmarks and three fine-grained visual recognition datasets.
arXiv Detail & Related papers (2023-10-30T00:32:47Z)
Redundancy-Free Self-Supervised Relational Learning for Graph Clustering [13.176413653235311]
We propose a novel self-supervised deep graph clustering method named Redundancy-Free Graph Clustering (R$^2$FGC). It extracts the attribute- and structure-level relational information from both global and local views based on an autoencoder and a graph autoencoder. Our experiments are performed on widely used benchmark datasets to validate the superiority of our R$^2$FGC over state-of-the-art baselines.
arXiv Detail & Related papers (2023-09-09T06:18:50Z)
Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering. In our proposed method, cluster number determination and unsupervised representation learning are unified into a uniform framework. In order to conduct feedback actions, the clustering-oriented reward function is proposed to enhance the cohesion of the same clusters and separate the different clusters.
arXiv Detail & Related papers (2023-08-13T18:12:28Z)
Challenging the Myth of Graph Collaborative Filtering: a Reasoned and Reproducibility-driven Analysis [50.972595036856035]
We present code that successfully replicates results from six popular and recent graph recommendation models. We compare these graph models with traditional collaborative filtering models that historically performed well in offline evaluations. By investigating the information flow from users' neighborhoods, we aim to identify which models are influenced by intrinsic features in the dataset structure.
arXiv Detail & Related papers (2023-08-01T09:31:44Z)
EGRC-Net: Embedding-induced Graph Refinement Clustering Network [66.44293190793294]
We propose a novel graph clustering network called Embedding-Induced Graph Refinement Clustering Network (EGRC-Net). EGRC-Net effectively utilizes the learned embedding to adaptively refine the initial graph and enhance the clustering performance. Our proposed methods consistently outperform several state-of-the-art approaches.
arXiv Detail & Related papers (2022-11-19T09:08:43Z)
Self-supervised Contrastive Attributed Graph Clustering [110.52694943592974]
We propose a novel attributed graph clustering network, namely Self-supervised Contrastive Attributed Graph Clustering (SCAGC). In SCAGC, a self-supervised contrastive loss is designed for node representation learning by leveraging inaccurate clustering labels. For the OOS nodes, SCAGC can directly calculate their clustering labels.
arXiv Detail & Related papers (2021-10-15T03:25:28Z)
Open-Set Recognition: A Good Closed-Set Classifier is All You Need [146.6814176602689]
We show that the ability of a classifier to make the 'none-of-above' decision is highly correlated with its accuracy on the closed-set classes. We use this correlation to boost the performance of the cross-entropy OSR 'baseline' by improving its closed-set accuracy. We also construct new benchmarks which better respect the task of detecting semantic novelty.
arXiv Detail & Related papers (2021-10-12T17:58:59Z)
Variational Auto Encoder Gradient Clustering [0.0]
Clustering using deep neural network models has been extensively studied in recent years. This article investigates how probability function gradient ascent can be used to process data in order to achieve better clustering. We propose a simple yet effective method for investigating a suitable number of clusters for data, based on the DBSCAN clustering algorithm.
arXiv Detail & Related papers (2021-05-11T08:00:36Z)
Writer Identification and Writer Retrieval Based on NetVLAD with Re-ranking [0.0]
Writer identification and writer retrieval are considered challenging problems in the document analysis and recognition field. A novel pipeline is proposed for the problem by employing a unified neural network architecture consisting of the ResNet-20 as a feature extractor. A novel re-ranking strategy is introduced for the task of identification and retrieval based on $k$-reciprocal nearest neighbors.
arXiv Detail & Related papers (2020-12-11T08:22:28Z)
GuCNet: A Guided Clustering-based Network for Improved Classification [15.747227188672088]
We present a novel yet very simple classification technique that leverages the ease of classifiability of any existing well-separable dataset for guidance. Since the guide dataset may or may not have any semantic relationship with the experimental dataset, the proposed network tries to embed class-wise features of the challenging dataset into the distinct clusters of the guide set.
arXiv Detail & Related papers (2020-10-11T10:22:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.