Joint Extraction and Classification of Danish Competences for Job Matching
- URL: http://arxiv.org/abs/2410.22103v1
- Date: Tue, 29 Oct 2024 15:00:40 GMT
- Title: Joint Extraction and Classification of Danish Competences for Job Matching
- Authors: Qiuchi Li, Christina Lioma
- Abstract summary: This work presents the first model that jointly extracts and classifies competences from Danish job postings.
As a single BERT-like architecture for joint extraction and classification, our model is lightweight and efficient at inference.
- Score: 13.364545674944825
- Abstract: The matching of competences, such as skills, occupations or knowledge, is a key desideratum for fitting candidates to jobs. Automatic extraction of competences from CVs and job postings can greatly boost recruiters' productivity in locating relevant candidates for job vacancies. This work presents the first model that jointly extracts and classifies competences from Danish job postings. Unlike existing work on skill extraction and skill classification, our model is trained on a large volume of annotated Danish corpora and can extract a wide range of Danish competences, including skills, occupations and knowledge of different categories. More importantly, as a single BERT-like architecture for joint extraction and classification, our model is lightweight and efficient at inference. On a real-scenario job matching dataset, our model beats the state-of-the-art models in overall Danish competence extraction and classification performance, and saves over 50% of inference time.
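The abstract describes a single BERT-like architecture that extracts and classifies competences in one pass, but no code accompanies this listing. As a rough, hypothetical sketch of how such joint extraction and classification could be framed, the snippet below casts the task as token classification with composite BIO labels (e.g. B-SKILL vs. B-OCCUPATION), so one forward pass both locates competence spans and assigns their category. The base model name, label inventory, and span-merging logic are illustrative assumptions, not the authors' released setup.

```python
# Hypothetical sketch: joint competence extraction + classification as a single
# token-classification head on a BERT-like encoder, using composite BIO labels.
# Model name and label set are illustrative assumptions, not the paper's artifacts.
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

LABELS = ["O",
          "B-SKILL", "I-SKILL",
          "B-OCCUPATION", "I-OCCUPATION",
          "B-KNOWLEDGE", "I-KNOWLEDGE"]
id2label = dict(enumerate(LABELS))

model_name = "bert-base-multilingual-cased"  # placeholder; a Danish encoder would be used in practice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(
    model_name,
    num_labels=len(LABELS),
    id2label=id2label,
    label2id={label: i for i, label in id2label.items()},
)

def extract_competences(text: str):
    """One forward pass: tag every token, then merge B-/I- runs into typed spans."""
    enc = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        pred_ids = model(**enc).logits.argmax(dim=-1)[0].tolist()
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    spans, current = [], None
    for tok, pid in zip(tokens, pred_ids):
        tag = id2label[pid]
        if tag.startswith("B-"):
            current = (tag[2:], [tok])
            spans.append(current)
        elif tag.startswith("I-") and current is not None and current[0] == tag[2:]:
            current[1].append(tok)
        else:
            current = None
    return [(category, tokenizer.convert_tokens_to_string(toks)) for category, toks in spans]

# The head here is untrained, so predictions are random; shown only to illustrate the interface.
print(extract_competences("Vi søger en sygeplejerske med erfaring i Python."))
```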
Related papers
- Extracting and Transferring Abilities For Building Multi-lingual Ability-enhanced Large Language Models [104.96990850774566]
We propose a Multi-lingual Ability Extraction and Transfer approach, named as MAET.
Our key idea is to decompose and extract language-agnostic ability-related weights from large language models.
Experiment results show MAET can effectively and efficiently extract and transfer the advanced abilities, and outperform training-based baseline methods.
arXiv Detail & Related papers (2024-10-10T11:23:18Z)
- Comparative Analysis of Encoder-Based NER and Large Language Models for Skill Extraction from Russian Job Vacancies [0.0]
This study compares Named Entity Recognition methods based on encoders with Large Language Models (LLMs) for extracting skills from Russian job vacancies.
Results indicate that traditional NER models, especially a fine-tuned DeepPavlov RuBERT NER model, outperform LLMs across metrics including accuracy, precision, recall, and inference time.
This research contributes to the field of natural language processing (NLP) and its application in the labor market, particularly in non-English contexts.
arXiv Detail & Related papers (2024-07-29T09:08:40Z)
- Rethinking Skill Extraction in the Job Market Domain using Large Language Models [20.256353240384133]
Skill Extraction involves identifying skills and qualifications mentioned in documents such as job postings and resumes.
The reliance on manually annotated data limits the generalizability of such approaches.
In this paper, we explore the use of in-context learning to overcome these challenges.
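The blurb only names in-context learning; below is a minimal sketch of what prompting an LLM with a few skill-extraction demonstrations might look like, assuming an OpenAI-style chat API. The prompt wording, demonstration bank, and model name are invented for illustration and are not taken from the paper.

```python
# Hypothetical sketch: few-shot (in-context) skill extraction with a chat LLM.
# Prompt wording, demonstrations, and model name are illustrative assumptions.
from openai import OpenAI

# Demonstrations shown in-context; no fine-tuning involved.
FEW_SHOT = [
    ("We need a data engineer fluent in SQL and Airflow.",
     '["SQL", "Airflow", "data engineering"]'),
    ("Seeking a nurse with patient-care experience.",
     '["patient care", "nursing"]'),
]

def build_messages(posting: str):
    messages = [{"role": "system",
                 "content": "Extract the skills mentioned in the job posting as a JSON list."}]
    for text, skills in FEW_SHOT:
        messages.append({"role": "user", "content": text})
        messages.append({"role": "assistant", "content": skills})
    messages.append({"role": "user", "content": posting})
    return messages

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=build_messages("Looking for a frontend developer with React and TypeScript."),
    temperature=0,
)
print(response.choices[0].message.content)
```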
arXiv Detail & Related papers (2024-02-06T09:23:26Z)
- NNOSE: Nearest Neighbor Occupational Skill Extraction [55.22292957778972]
We tackle the complexity in occupational skill datasets.
We employ an external datastore for retrieving similar skills in a dataset-unifying manner.
We observe a performance gain in predicting infrequent patterns, with substantial gains of up to 30% span-F1 in cross-dataset settings.
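The exact NNOSE formulation is not given in this summary; as a loose illustration of the retrieval idea, the sketch below pools skill spans from several (hypothetical) datasets into one embedding datastore and looks up nearest neighbours for a candidate span. The encoder choice and data are placeholders, not the paper's setup.

```python
# Loose illustration of nearest-neighbour skill retrieval from a dataset-unifying
# datastore (not the exact NNOSE formulation). Encoder and data are placeholders.
from sentence_transformers import SentenceTransformer
from sklearn.neighbors import NearestNeighbors

# Skill spans pooled from several (hypothetical) skill-extraction datasets.
datastore_skills = [
    "python programming", "project management", "patient care",
    "welding", "forklift driving", "machine learning", "budget forecasting",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder encoder
index = NearestNeighbors(n_neighbors=3, metric="cosine")
index.fit(encoder.encode(datastore_skills))

def nearest_skills(span: str):
    """Return the most similar stored skills (with cosine similarity) for a candidate span."""
    distances, ids = index.kneighbors(encoder.encode([span]))
    return [(datastore_skills[i], round(1 - d, 3)) for i, d in zip(ids[0], distances[0])]

print(nearest_skills("ML engineering"))
```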
arXiv Detail & Related papers (2024-01-30T15:18:29Z)
- Hierarchical Classification of Transversal Skills in Job Ads Based on Sentence Embeddings [0.0]
This paper aims to identify correlations between job ad requirements and skill sets using a deep learning model.
The approach involves data collection, preprocessing, and labeling using ESCO (European Skills, Competences, and Occupations) taxonomy.
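The summary names sentence embeddings and the ESCO taxonomy but gives no concrete procedure; the sketch below shows one plausible two-level (group, then label) classification of a job-ad sentence by embedding similarity. The toy taxonomy and encoder are assumptions, not real ESCO content or the paper's model.

```python
# Hypothetical sketch: hierarchical (group -> label) skill classification of a
# job-ad sentence via sentence-embedding similarity. Taxonomy is a toy stand-in.
from sentence_transformers import SentenceTransformer, util

taxonomy = {
    "transversal skills": ["working in teams", "adapting to change", "time management"],
    "ICT skills": ["using spreadsheets", "programming in Python", "database administration"],
}

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder encoder

def classify(sentence: str):
    """Pick the closest skill group first, then the closest label inside that group."""
    query = encoder.encode(sentence, convert_to_tensor=True)
    groups = list(taxonomy)
    group_emb = encoder.encode(groups, convert_to_tensor=True)
    best_group = groups[int(util.cos_sim(query, group_emb).argmax())]
    labels = taxonomy[best_group]
    label_emb = encoder.encode(labels, convert_to_tensor=True)
    best_label = labels[int(util.cos_sim(query, label_emb).argmax())]
    return best_group, best_label

print(classify("You will coordinate deadlines across several departments."))
```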
arXiv Detail & Related papers (2024-01-10T11:07:32Z)
- Cross-Lingual NER for Financial Transaction Data in Low-Resource Languages [70.25418443146435]
We propose an efficient modeling framework for cross-lingual named entity recognition in semi-structured text data.
We employ two independent datasets of SMSs in English and Arabic, each carrying semi-structured banking transaction information.
With access to only 30 labeled samples, our model can generalize the recognition of merchants, amounts, and other fields from English to Arabic.
arXiv Detail & Related papers (2023-07-16T00:45:42Z)
- Skill-Based Few-Shot Selection for In-Context Learning [123.26522773708683]
Skill-KNN is a skill-based few-shot selection method for in-context learning.
It does not require training or fine-tuning of any models, making it suitable for frequently expanding or changing example banks.
Experimental results across five cross-domain semantic parsing datasets and six backbone models show that Skill-KNN significantly outperforms existing methods.
arXiv Detail & Related papers (2023-05-23T16:28:29Z)
- Design of Negative Sampling Strategies for Distantly Supervised Skill Extraction [19.43668931500507]
We propose an end-to-end system for skill extraction, based on distant supervision through literal matching.
We observe that using the ESCO taxonomy to select negative examples from related skills yields the biggest improvements.
We release the benchmark dataset for research purposes to stimulate further research on the task.
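To make the distant-supervision idea concrete, here is a toy sketch in which positives come from literal matches against a skill vocabulary and hard negatives are sampled from skills related to the match, loosely mimicking the ESCO-based strategy the summary mentions. The mini-vocabulary and relation map are invented placeholders, not ESCO data or the authors' pipeline.

```python
# Toy sketch of distant supervision for skill extraction: positives from literal
# matches against a skill vocabulary, hard negatives sampled from related skills.
# Vocabulary and relation map are invented placeholders (standing in for ESCO).
import random
import re

SKILL_VOCAB = {"python", "project management", "welding"}
RELATED = {
    "python": ["java", "sql"],
    "project management": ["budgeting", "scrum"],
    "welding": ["soldering", "metal cutting"],
}

def distant_label(sentence: str, n_negatives: int = 1):
    """Return (sentence, skill, label) triples: 1 for literal matches, 0 for sampled negatives."""
    examples = []
    lowered = sentence.lower()
    for skill in SKILL_VOCAB:
        if re.search(rf"\b{re.escape(skill)}\b", lowered):
            examples.append((sentence, skill, 1))
            # Hard negatives: related skills that are NOT literally present.
            candidates = [s for s in RELATED[skill] if s not in lowered]
            for neg in random.sample(candidates, min(n_negatives, len(candidates))):
                examples.append((sentence, neg, 0))
    return examples

print(distant_label("We use Python for data pipelines and value project management skills."))
```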
arXiv Detail & Related papers (2022-09-13T13:37:06Z)
- Combining Modular Skills in Multitask Learning [149.8001096811708]
A modular design encourages neural models to disentangle and recombine different facets of knowledge to generalise more systematically to new tasks.
In this work, we assume each task is associated with a subset of latent discrete skills from a (potentially small) inventory.
We find that the modular design of a network significantly increases sample efficiency in reinforcement learning and few-shot generalisation in supervised learning.
arXiv Detail & Related papers (2022-02-28T16:07:19Z)
- DataOps for Societal Intelligence: a Data Pipeline for Labor Market Skills Extraction and Matching [5.842787579447653]
We formulate and solve the labor market skills extraction and matching problem using DataOps models.
We then focus on the critical task of skills extraction from resumes.
We showcase preliminary results with applied machine learning on real data.
arXiv Detail & Related papers (2021-04-05T15:37:25Z)
- Combining Deep Generative Models and Multi-lingual Pretraining for Semi-supervised Document Classification [49.47925519332164]
We combine semi-supervised deep generative models and multi-lingual pretraining into a pipeline for the document classification task.
Our framework is highly competitive and outperforms the state-of-the-art counterparts in low-resource settings across several languages.
arXiv Detail & Related papers (2021-01-26T11:26:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.