Towards End-to-End Semi-Supervised Table Detection with Deformable
Transformer
- URL: http://arxiv.org/abs/2305.02769v2
- Date: Sun, 7 May 2023 20:06:18 GMT
- Title: Towards End-to-End Semi-Supervised Table Detection with Deformable
Transformer
- Authors: Tahira Shehzadi, Khurram Azeem Hashmi, Didier Stricker, Marcus Liwicki
and Muhammad Zeshan Afzal
- Abstract summary: Table detection is the task of classifying and localizing table objects within document images.
Many semi-supervised approaches have been introduced to mitigate the need for a substantial amount of labeled data.
This paper presents a novel end-to-end semi-supervised table detection method that employs the deformable transformer for detecting table objects.
- Score: 11.648151981111436
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Table detection is the task of classifying and localizing table objects
within document images. With the recent development in deep learning methods,
we observe remarkable success in table detection. However, a significant amount
of labeled data is required to train these models effectively. Many
semi-supervised approaches have been introduced to mitigate the need for a
substantial amount of labeled data. These approaches use CNN-based detectors that
rely on anchor proposals and post-processing stages such as NMS. To tackle
these limitations, this paper presents a novel end-to-end semi-supervised table
detection method that employs the deformable transformer for detecting table
objects. We evaluate our semi-supervised method on the PubLayNet, DocBank, ICDAR-19
and TableBank datasets, and it achieves superior performance compared to
previous methods. It outperforms the fully supervised method (Deformable
transformer) by +3.4 points on 10% labels of the TableBank-both dataset and the
previous CNN-based semi-supervised approach (Soft Teacher) by +1.8 points on
10% labels of the PubLayNet dataset. We hope this work opens new possibilities
towards semi-supervised and unsupervised table detection methods.
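The abstract notes that earlier CNN-based detectors depend on post-processing such as NMS (non-maximum suppression), which an end-to-end transformer detector is meant to avoid. As a rough illustration of what that step involves, here is a minimal greedy NMS sketch in plain Python. The function names, the `(x1, y1, x2, y2)` box format, the 0.5 IoU threshold, and the sample boxes are all assumptions made for this example; none of them come from the paper.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical table candidates: the first two are near-duplicates.
candidates = [(10, 10, 110, 60), (12, 12, 112, 62), (200, 200, 300, 260)]
confidences = [0.9, 0.8, 0.7]
kept = nms(candidates, confidences)
```

In this sketch the second candidate is dropped because it overlaps heavily with the higher-scoring first one. The threshold is a hand-tuned hyperparameter, which is one reason end-to-end detectors try to remove this stage.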
Related papers
- TrajSSL: Trajectory-Enhanced Semi-Supervised 3D Object Detection [59.498894868956306]
Pseudo-labeling approaches to semi-supervised learning adopt a teacher-student framework.
We leverage pre-trained motion-forecasting models to generate object trajectories on pseudo-labeled data.
Our approach improves pseudo-label quality in two distinct manners.
arXiv Detail & Related papers (2024-09-17T05:35:00Z)
- End-to-End Semi-Supervised approach with Modulated Object Queries for Table Detection in Documents [12.042768320132694]
This study presents an innovative transformer-based semi-supervised table detector.
It improves the quality of pseudo-labels through a novel matching strategy.
It achieves new state-of-the-art results, with mAPs of 95.7% and 97.9% on TableBank (word) and PubLayNet, respectively, with 30% labeled data.
arXiv Detail & Related papers (2024-05-08T11:24:57Z)
- Towards End-to-End Semi-Supervised Table Detection with Semantic Aligned Matching Transformer [12.042768320132694]
Table detection within document images is a crucial task in document processing, involving the identification and localization of tables.
Recent strides in deep learning have substantially improved the accuracy of this task, but it still relies on large labeled datasets for effective training.
We introduce a semi-supervised approach employing SAM-DETR, a novel approach for precise alignment between object queries and target features.
arXiv Detail & Related papers (2024-04-30T20:25:57Z)
- ClusterTabNet: Supervised clustering method for table detection and table structure recognition [0.0]
We present a novel deep-learning-based method to cluster words in documents which we apply to detect and recognize tables given the OCR output.
We interpret table structure bottom-up as a graph of relations between pairs of words and use a transformer encoder model to predict its adjacency matrix.
Compared to the current state-of-the-art detection methods such as DETR and Faster R-CNN, our method achieves similar or better accuracy, while requiring a significantly smaller model.
arXiv Detail & Related papers (2024-02-12T09:10:24Z)
- Table Detection in the Wild: A Novel Diverse Table Detection Dataset and Method [1.3814823347690746]
We introduce a diverse large-scale dataset for table detection with more than seven thousand samples.
We also present baseline results using a convolutional neural network-based method to detect table structure in documents.
arXiv Detail & Related papers (2022-08-31T14:20:30Z)
- W2N: Switching From Weak Supervision to Noisy Supervision for Object Detection [64.10643170523414]
We propose a novel WSOD framework with a new paradigm that switches from weak supervision to noisy supervision (W2N).
In the localization adaptation module, we propose a regularization loss to reduce the proportion of discriminative parts in original pseudo ground-truths.
Our W2N outperforms all existing pure WSOD methods and transfer learning methods.
arXiv Detail & Related papers (2022-07-25T12:13:48Z)
- Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection [85.11649974840758]
3D object detection networks tend to be biased towards the data they are trained on.
We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors.
arXiv Detail & Related papers (2021-11-30T18:42:42Z)
- Scientific evidence extraction [0.0]
We propose a new dataset, PubMed Tables One Million (PubTables-1M), and a new class of metric, grid table similarity (GriTS).
PubTables-1M is nearly twice as large as the previous largest comparable dataset.
We show that object detection models trained on PubTables-1M produce excellent results out-of-the-box for all three tasks of detection, structure recognition, and functional analysis.
arXiv Detail & Related papers (2021-09-30T19:42:07Z)
- WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection [75.80075054706079]
We propose a weakly- and semi-supervised object detection framework (WSSOD).
An agent detector is first trained on a joint dataset and then used to predict pseudo bounding boxes on weakly-annotated images.
The proposed framework demonstrates remarkable performance on the PASCAL-VOC and MSCOCO benchmarks, achieving performance comparable to that obtained in fully-supervised settings.
arXiv Detail & Related papers (2021-05-21T11:58:50Z)
- 3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection [76.42897462051067]
3DIoUMatch is a novel semi-supervised method for 3D object detection applicable to both indoor and outdoor scenes.
We leverage a teacher-student mutual learning framework to propagate information from the labeled to the unlabeled train set in the form of pseudo-labels.
Our method consistently improves state-of-the-art methods on both ScanNet and SUN-RGBD benchmarks by significant margins under all label ratios.
arXiv Detail & Related papers (2020-12-08T11:06:26Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.