Related papers: Finding Black Cat in a Coal Cellar -- Keyphrase Extraction & Keyphrase-Rubric Relationship Classification from Complex Assignments

Finding Black Cat in a Coal Cellar -- Keyphrase Extraction & Keyphrase-Rubric Relationship Classification from Complex Assignments

URL: http://arxiv.org/abs/2004.01549v3
Date: Fri, 24 Apr 2020 13:17:19 GMT
Title: Finding Black Cat in a Coal Cellar -- Keyphrase Extraction & Keyphrase-Rubric Relationship Classification from Complex Assignments
Authors: Manikandan Ravikiran
Abstract summary: This paper aims to quantify the effectiveness of supervised and unsupervised approaches for the task for keyphrase extraction. We find that (i) unsupervised MultiPartiteRank produces the best result for keyphrase extraction. We also present a comprehensive analysis and derive useful observations for those interested in these tasks for the future.
Score: 5.067828201066184
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Diversity in content and open-ended questions are inherent in complex assignments across online graduate programs. The natural scale of these programs poses a variety of challenges across both peer and expert feedback including rogue reviews. While the identification of relevant content and associating it to predefined rubrics would simplify and improve the grading process, the research to date is still in a nascent stage. As such in this paper we aim to quantify the effectiveness of supervised and unsupervised approaches for the task for keyphrase extraction and generic/specific keyphrase-rubric relationship extraction. Through this study, we find that (i) unsupervised MultiPartiteRank produces the best result for keyphrase extraction (ii) supervised SVM classifier with BERT features that offer the best performance for both generic and specific keyphrase-rubric relationship classification. We finally present a comprehensive analysis and derive useful observations for those interested in these tasks for the future. The source code is released in \url{https://github.com/manikandan-ravikiran/cs6460-proj}.

Related papers

Reinforcing Compositional Retrieval: Retrieving Step-by-Step for Composing Informative Contexts [67.67746334493302]
Large Language Models (LLMs) have demonstrated remarkable capabilities across numerous tasks, yet they often rely on external context to handle complex tasks. We propose a tri-encoder sequential retriever that models this process as a Markov Decision Process (MDP) We show that our method consistently and significantly outperforms baselines, underscoring the importance of explicitly modeling inter-example dependencies.
arXiv Detail & Related papers (2025-04-15T17:35:56Z)
Training a Utility-based Retriever Through Shared Context Attribution for Retrieval-Augmented Language Models [51.608246558235166]
SCARLet is a framework for training utility-based retrievers in RALMs. It incorporates two key factors, multi-task generalization and inter-passage interaction. We evaluate our approach on ten datasets across various tasks, both in-domain and out-of-domain.
arXiv Detail & Related papers (2025-04-01T09:28:28Z)
Iterative NLP Query Refinement for Enhancing Domain-Specific Information Retrieval: A Case Study in Career Services [0.13980986259786224]
Retrieving semantically relevant documents in niche domains poses significant challenges for TF-IDF-based systems. This paper introduces an iterative and semi-automated query refinement methodology tailored to Humber College's career services webpages.
arXiv Detail & Related papers (2024-12-22T15:57:35Z)
Generative Retrieval Meets Multi-Graded Relevance [104.75244721442756]
We introduce a framework called GRaded Generative Retrieval (GR$2$) GR$2$ focuses on two key components: ensuring relevant and distinct identifiers, and implementing multi-graded constrained contrastive training. Experiments on datasets with both multi-graded and binary relevance demonstrate the effectiveness of GR$2$.
arXiv Detail & Related papers (2024-09-27T02:55:53Z)
KeyGen2Vec: Learning Document Embedding via Multi-label Keyword Generation in Question-Answering [20.03094433039241]
Current embedding models mainly rely on the availability of label supervision to increase the expressiveness of the resulting embeddings. Our study aims to loosen up the dependency on label supervision by learning document embeddings via Sequence-to-Sequence (Seq2Seq) text generator.
arXiv Detail & Related papers (2023-10-30T15:35:45Z)
Uncovering Prototypical Knowledge for Weakly Open-Vocabulary Semantic Segmentation [59.37587762543934]
This paper studies the problem of weakly open-vocabulary semantic segmentation (WOVSS) Existing methods suffer from a granularity inconsistency regarding the usage of group tokens. We propose the prototypical guidance network (PGSeg) that incorporates multi-modal regularization.
arXiv Detail & Related papers (2023-10-29T13:18:00Z)
BibRank: Automatic Keyphrase Extraction Platform Using~Metadata [0.0]
This paper introduces a platform that integrates keyphrase datasets and facilitates the evaluation of keyphrase extraction algorithms. The platform includes BibRank, an automatic keyphrase extraction algorithm that leverages a rich dataset obtained by parsing word in Bib format.
arXiv Detail & Related papers (2023-10-13T14:44:34Z)
Open-Vocabulary Animal Keypoint Detection with Semantic-feature Matching [74.75284453828017]
Open-Vocabulary Keypoint Detection (OVKD) task is innovatively designed to use text prompts for identifying arbitrary keypoints across any species. We have developed a novel framework named Open-Vocabulary Keypoint Detection with Semantic-feature Matching (KDSM) This framework combines vision and language models, creating an interplay between language features and local keypoint visual features.
arXiv Detail & Related papers (2023-10-08T07:42:41Z)
MatchVIE: Exploiting Match Relevancy between Entities for Visual Information Extraction [48.55908127994688]
We propose a novel key-value matching model based on a graph neural network for VIE (MatchVIE) Through key-value matching based on relevancy evaluation, the proposed MatchVIE can bypass the recognitions to various semantics. We introduce a simple but effective operation, Num2Vec, to tackle the instability of encoded values.
arXiv Detail & Related papers (2021-06-24T12:06:29Z)
Phraseformer: Multimodal Key-phrase Extraction using Transformer and Graph Embedding [3.7110020502717616]
We develop a multimodal Key-phrase extraction approach, namely Phraseformer, using transformer and graph embedding techniques. In Phraseformer, each keyword candidate is presented by a vector which is the concatenation of the text and structure learning representations. We analyze the performance of Phraseformer on three datasets including Inspec, SemEval2010 and SemEval 2017 by F1-score.
arXiv Detail & Related papers (2021-06-09T09:32:17Z)
R$^2$-Net: Relation of Relation Learning Network for Sentence Semantic Matching [58.72111690643359]
We propose a Relation of Relation Learning Network (R2-Net) for sentence semantic matching. We first employ BERT to encode the input sentences from a global perspective. Then a CNN-based encoder is designed to capture keywords and phrase information from a local perspective. To fully leverage labels for better relation information extraction, we introduce a self-supervised relation of relation classification task.
arXiv Detail & Related papers (2020-12-16T13:11:30Z)
Keyphrase Extraction with Dynamic Graph Convolutional Networks and Diversified Inference [50.768682650658384]
Keyphrase extraction (KE) aims to summarize a set of phrases that accurately express a concept or a topic covered in a given document. Recent Sequence-to-Sequence (Seq2Seq) based generative framework is widely used in KE task, and it has obtained competitive performance on various benchmarks. In this paper, we propose to adopt the Dynamic Graph Convolutional Networks (DGCN) to solve the above two problems simultaneously.
arXiv Detail & Related papers (2020-10-24T08:11:23Z)
Key Phrase Classification in Complex Assignments [5.067828201066184]
We show that the task of classification of key phrases is ambiguous at a human level producing Cohen's kappa of 0.77 on a new data set. Both pretrained language models and simple TFIDF SVM classifiers produce similar results with a former producing average of 0.6 F1 higher than the latter.
arXiv Detail & Related papers (2020-03-16T04:25:37Z)
Keyphrase Extraction with Span-based Feature Representations [13.790461555410747]
Keyphrases are capable of providing semantic metadata characterizing documents. Three approaches to address keyphrase extraction: (i) traditional two-step ranking method, (ii) sequence labeling and (iii) generation using neural networks. In this paper, we propose a novelty Span Keyphrase Extraction model that extracts span-based feature representation of keyphrase directly from all the content tokens.
arXiv Detail & Related papers (2020-02-13T09:48:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.