The Solution for The PST-KDD-2024 OAG-Challenge
- URL: http://arxiv.org/abs/2407.12827v1
- Date: Tue, 2 Jul 2024 14:15:05 GMT
- Title: The Solution for The PST-KDD-2024 OAG-Challenge
- Authors: Shupeng Zhong, Xinger Li, Shushan Jin, Yang Yang,
- Abstract summary: We introduce the second-place solution in the KDD-2024 OAG-Challenge paper source tracing track.
Our solution is mainly based on two methods, BERT and GCN, and combines the reasoning results of BERT and GCN in the final submission.
In the end, our solution achieved a remarkable score of 0.47691 in the competition.
- Score: 3.0116058513816224
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce the second-place solution in the KDD-2024 OAG-Challenge paper source tracing track. Our solution is mainly based on two methods, BERT and GCN, and combines the reasoning results of BERT and GCN in the final submission to achieve complementary performance. In the BERT solution, we focus on processing the fragments that appear in the references of the paper, and use a variety of operations to reduce the redundant interference in the fragments, so that the information received by BERT is more refined. In the GCN solution, we map information such as paper fragments, abstracts, and titles to a high-dimensional semantic space through an embedding model, and try to build edges between titles, abstracts, and fragments to integrate contextual relationships for judgment. In the end, our solution achieved a remarkable score of 0.47691 in the competition.
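The abstract describes the GCN branch and the final BERT/GCN combination only at a high level, so the following is a minimal sketch under stated assumptions, not the authors' implementation. It shows one standard graph-convolution step over a tiny graph of title, abstract, and fragment nodes, followed by a weighted average of per-reference scores from the two models. The function names (`normalize_adjacency`, `gcn_layer`, `blend_scores`), the symmetric normalization, the toy features, and the 0.5 blend weight are all hypothetical choices made for illustration; the source does not state how the results were actually combined.

```python
import math


def normalize_adjacency(edges, num_nodes):
    """Return D^{-1/2} (A + I) D^{-1/2} as a dense list-of-lists matrix."""
    adj = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        adj[i][i] = 1.0  # self-loop so each node keeps its own features
    for u, v in edges:
        adj[u][v] = 1.0  # undirected edge
        adj[v][u] = 1.0
    deg = [sum(row) for row in adj]
    return [
        [adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_nodes)]
        for i in range(num_nodes)
    ]


def gcn_layer(adj_norm, features, weight):
    """One graph convolution: ReLU(adj_norm @ features @ weight)."""
    n, d_in, d_out = len(features), len(weight), len(weight[0])
    # Aggregate neighbor features using the normalized adjacency.
    agg = [
        [sum(adj_norm[i][k] * features[k][c] for k in range(n)) for c in range(d_in)]
        for i in range(n)
    ]
    # Linear projection followed by ReLU.
    return [
        [max(0.0, sum(agg[i][c] * weight[c][o] for c in range(d_in))) for o in range(d_out)]
        for i in range(n)
    ]


def blend_scores(bert_scores, gcn_scores, bert_weight=0.5):
    """Weighted average of two models' scores (assumed fusion rule)."""
    return {
        ref: bert_weight * bert_scores[ref] + (1.0 - bert_weight) * gcn_scores[ref]
        for ref in bert_scores
    }


if __name__ == "__main__":
    # Hypothetical toy graph: node 0 = title, 1 = abstract, 2 = reference fragment.
    edges = [(0, 1), (0, 2)]
    features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # stand-in embeddings
    identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in weight matrix
    hidden = gcn_layer(normalize_adjacency(edges, 3), features, identity)
    print(hidden)
    print(blend_scores({"ref_a": 0.8, "ref_b": 0.2}, {"ref_a": 0.4, "ref_b": 0.6}))
```

In practice the node features would come from an embedding model and the weight matrix would be learned; the blend weight would typically be tuned on a validation split.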
Related papers
- BERT-VBD: Vietnamese Multi-Document Summarization Framework [2.2526595080231857]
An emerging and promising strategy involves a synergistic fusion of extractive and abstractive summarization methods.
This paper presents a novel Vietnamese MDS framework leveraging a two-component pipeline architecture.
The proposed framework attains a ROUGE-2 score of 39.6% on the VN-MDS dataset, outperforming the state-of-the-art baselines.
arXiv Detail & Related papers (2024-09-18T16:56:06Z)
- TocBERT: Medical Document Structure Extraction Using Bidirectional Transformers [1.2343981093497332]
TocBERT represents a supervised solution trained on the detection of titles and sub-titles from semantic representations.
The solution has been applied on a medical text segmentation use-case where the Bio-ClinicalBERT model is fine-tuned to segment discharge summaries of the MIMIC-III dataset.
It achieved an F1-score of 84.6% when evaluated on a linear text segmentation problem and 72.8% on a hierarchical text segmentation problem.
arXiv Detail & Related papers (2024-06-27T20:56:57Z)
- Enhancing Weakly Supervised Semantic Segmentation with Multi-modal Foundation Models: An End-to-End Approach [7.012760526318993]
Weakly-Supervised Semantic Segmentation (WSSS) offers a cost-efficient workaround to extensive labeling.
Existing WSSS methods have difficulties in learning the boundaries of objects leading to poor segmentation results.
We propose a novel and effective framework that addresses these issues by leveraging visual foundation models inside the bounding box.
arXiv Detail & Related papers (2024-05-10T16:42:25Z)
- Theoretically Achieving Continuous Representation of Oriented Bounding Boxes [64.15627958879053]
This paper endeavors to completely solve the issue of discontinuity in Oriented Bounding Box representation.
We propose a novel representation method called Continuous OBB (COBB) which can be readily integrated into existing detectors.
For fairness and transparency of experiments, we have developed a modularized benchmark for oriented object detection (OOD) evaluation, based on JDet, the detection toolbox of the open-source deep learning framework Jittor.
arXiv Detail & Related papers (2024-02-29T09:27:40Z)
- Cross-Domain Few-Shot Segmentation via Iterative Support-Query Correspondence Mining [81.09446228688559]
Cross-Domain Few-Shot Segmentation (CD-FSS) poses the challenge of segmenting novel categories from a distinct domain using only limited exemplars.
We propose a novel cross-domain fine-tuning strategy that addresses the challenging CD-FSS tasks.
arXiv Detail & Related papers (2024-01-16T14:45:41Z)
- Modeling Uncertainty and Using Post-fusion as Fallback Improves Retrieval Augmented Generation with LLMs [80.74263278847063]
The integration of retrieved passages and large language models (LLMs) has significantly contributed to improving open-domain question answering.
This paper investigates different methods of combining retrieved passages with LLMs to enhance answer generation.
arXiv Detail & Related papers (2023-08-24T05:26:54Z)
- EFaR 2023: Efficient Face Recognition Competition [51.77649060180531]
The paper presents a summary of the Efficient Face Recognition Competition (EFaR) held at the 2023 International Joint Conference on Biometrics (IJCB 2023).
The competition received 17 submissions from 6 different teams.
The submitted solutions are ranked based on a weighted score of the achieved verification accuracies on a diverse set of benchmarks, as well as the deployability given by the number of floating-point operations and model size.
arXiv Detail & Related papers (2023-08-08T09:58:22Z)
- iMETRE: Incorporating Markers of Entity Types for Relation Extraction [0.0]
Sentence-level relation extraction aims to identify the relationship between two entities given a contextual sentence.
In this paper, we approach the task of relationship extraction in the financial dataset REFinD.
arXiv Detail & Related papers (2023-06-30T20:54:41Z)
- GDB: Gated convolutions-based Document Binarization [0.0]
We formulate text extraction as the learning of gating values and propose an end-to-end gated convolutions-based network (GDB) to solve the problem of imprecise stroke edge extraction.
Our proposed framework consists of two stages. Firstly, a coarse sub-network with an extra edge branch is trained to get more precise feature maps by feeding a priori mask and edge.
Secondly, a refinement sub-network is cascaded to refine the output of the first stage by gated convolutions based on the sharp edge.
arXiv Detail & Related papers (2023-02-04T02:56:40Z)
- DecoupleNet: Decoupled Network for Domain Adaptive Semantic Segmentation [78.30720731968135]
Unsupervised domain adaptation in semantic segmentation has been raised to alleviate the reliance on expensive pixel-wise annotations.
We propose DecoupleNet that alleviates source domain overfitting and enables the final model to focus more on the segmentation task.
We also put forward Self-Discrimination (SD) and introduce an auxiliary classifier to learn more discriminative target domain features with pseudo labels.
arXiv Detail & Related papers (2022-07-20T15:47:34Z)
- AlignSeg: Feature-Aligned Segmentation Networks [109.94809725745499]
We propose Feature-Aligned Networks (AlignSeg) to address misalignment issues during the feature aggregation process.
Our network achieves new state-of-the-art mIoU scores of 82.6% and 45.95% on Cityscapes and ADE20K, respectively.
arXiv Detail & Related papers (2020-02-24T10:00:58Z)