A practical method for occupational skills detection in Vietnamese job
listings
- URL: http://arxiv.org/abs/2210.14607v1
- Date: Wed, 26 Oct 2022 10:23:18 GMT
- Title: A practical method for occupational skills detection in Vietnamese job
listings
- Authors: Viet-Trung Tran, Hai-Nam Cao and Tuan-Dung Cao
- Abstract summary: Lack of accurate and timely labor market information leads to skill mismatches.
Traditional approaches rely on existing taxonomy and/or large annotated data.
We propose a practical methodology for skill detection in Vietnamese job listings.
- Score: 0.16114012813668932
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Vietnamese labor market has developed unevenly. The number
of university graduates is growing, but so is the unemployment rate. This
situation is often caused by a lack of accurate and timely labor market
information, which leads to skill mismatches between worker supply and
actual market demand. In building a data monitoring and analytics platform for
the labor market, one of the main challenges is automatically detecting
occupational skills in labor-related data, such as resumes and job listings.
Traditional approaches rely on an existing taxonomy and/or large annotated
datasets to build Named Entity Recognition (NER) models. They are expensive and
require substantial manual effort. In this paper, we propose a practical
methodology for skill detection in Vietnamese job listings. Rather than
treating the task as NER, we treat it as a ranking problem. We propose a
pipeline in which phrases are first extracted and ranked by semantic similarity
to their contexts. A final classification step then detects skill phrases. We
collected three datasets and conducted extensive experiments. The results show
that our methodology outperforms an NER model when data is scarce.
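The extract, rank, and classify pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: candidate phrases are plain n-grams, "semantic similarity" is approximated with bag-of-words cosine similarity where a real system would presumably use a learned embedding model, and the final classifier is replaced by a simple score threshold. All function names and parameters here are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokenizer (assumption: whitespace/punctuation-delimited text)."""
    return re.findall(r"\w+", text.lower())


def candidate_phrases(tokens, max_len=3):
    """Step 1: extract candidates as all contiguous n-grams up to max_len tokens."""
    phrases = set()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            phrases.add(" ".join(tokens[i:i + n]))
    return sorted(phrases)


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_phrases(listing, max_len=3):
    """Step 2: rank each candidate by similarity to its context (the whole listing).

    Bag-of-words cosine is a stand-in for an embedding-based similarity.
    """
    tokens = tokenize(listing)
    context = Counter(tokens)
    scored = [
        (phrase, cosine(Counter(phrase.split()), context))
        for phrase in candidate_phrases(tokens, max_len)
    ]
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


def detect_skills(listing, threshold=0.5, max_len=3):
    """Step 3: a threshold stands in for the paper's final classification step."""
    return [p for p, score in rank_phrases(listing, max_len) if score >= threshold]


if __name__ == "__main__":
    text = "Python developer with Python and SQL experience"
    for phrase, score in rank_phrases(text)[:5]:
        print(f"{score:.3f}  {phrase}")
```

With a bag-of-words stand-in the ranking mostly rewards frequent tokens, so the output is only useful for seeing the control flow; swapping `cosine` over Counters for a similarity over sentence embeddings, and the threshold for a trained classifier, would be the natural next steps.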
Related papers
- Comparative Analysis of Encoder-Based NER and Large Language Models for Skill Extraction from Russian Job Vacancies [0.0]
This study compares Named Entity Recognition methods based on encoders with Large Language Models (LLMs) for extracting skills from Russian job vacancies.
Results indicate that traditional NER models, especially DeepPavlov RuBERT NER tuned, outperform LLMs across various metrics including accuracy, precision, recall, and inference time.
This research contributes to the field of natural language processing (NLP) and its application in the labor market, particularly in non-English contexts.
arXiv Detail & Related papers (2024-07-29T09:08:40Z)
- Computational Job Market Analysis with Natural Language Processing [5.117211717291377]
This thesis investigates Natural Language Processing (NLP) technology for extracting relevant information from job descriptions.
We frame the problem, obtain annotated data, and introduce extraction methodologies.
Our contributions include job description datasets, a de-identification dataset, and a novel active learning algorithm for efficient model training.
arXiv Detail & Related papers (2024-04-29T14:52:38Z)
- NNOSE: Nearest Neighbor Occupational Skill Extraction [55.22292957778972]
We tackle the complexity in occupational skill datasets.
We employ an external datastore for retrieving similar skills in a dataset-unifying manner.
We observe a performance gain in predicting infrequent patterns, with substantial gains of up to 30% span-F1 in cross-dataset settings.
arXiv Detail & Related papers (2024-01-30T15:18:29Z)
- Making Pre-trained Language Models both Task-solvers and Self-calibrators [52.98858650625623]
Pre-trained language models (PLMs) serve as backbones for various real-world systems.
Previous work shows that introducing an extra calibration task can mitigate this issue.
We propose a training algorithm LM-TOAST to tackle the challenges.
arXiv Detail & Related papers (2023-07-21T02:51:41Z)
- Cross-Lingual NER for Financial Transaction Data in Low-Resource Languages [70.25418443146435]
We propose an efficient modeling framework for cross-lingual named entity recognition in semi-structured text data.
We employ two independent datasets of SMSs in English and Arabic, each carrying semi-structured banking transaction information.
With access to only 30 labeled samples, our model can generalize the recognition of merchants, amounts, and other fields from English to Arabic.
arXiv Detail & Related papers (2023-07-16T00:45:42Z)
- Detecting Fake Job Postings Using Bidirectional LSTM [0.0]
This study employs a Bidirectional Long Short-Term Memory (Bi-LSTM) model to identify fake job advertisements.
The proposed model demonstrates a superior performance, achieving a 0.91 ROC AUC score and a 98.71% accuracy rate.
The findings of this research contribute to the development of robust, automated tools that can help combat the proliferation of fake job postings.
arXiv Detail & Related papers (2023-04-03T20:05:27Z)
- Design of Negative Sampling Strategies for Distantly Supervised Skill Extraction [19.43668931500507]
We propose an end-to-end system for skill extraction, based on distant supervision through literal matching.
We observe that using the ESCO taxonomy to select negative examples from related skills yields the biggest improvements.
We release the benchmark dataset for research purposes to stimulate further research on the task.
arXiv Detail & Related papers (2022-09-13T13:37:06Z)
- MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning [65.52675802289775]
We show that an uncertainty aware classifier can solve challenging reinforcement learning problems.
We propose a novel method for computing the normalized maximum likelihood (NML) distribution.
We show that the resulting algorithm has a number of intriguing connections to both count-based exploration methods and prior algorithms for learning reward functions.
arXiv Detail & Related papers (2021-07-15T08:19:57Z)
- DataOps for Societal Intelligence: a Data Pipeline for Labor Market Skills Extraction and Matching [5.842787579447653]
We formulate and solve this problem using DataOps models.
We then focus on the critical task of skills extraction from resumes.
We showcase preliminary results with applied machine learning on real data.
arXiv Detail & Related papers (2021-04-05T15:37:25Z)
- Job2Vec: Job Title Benchmarking with Collective Multi-View Representation Learning [51.34011135329063]
Job Title Benchmarking (JTB) aims at matching job titles with similar expertise levels across various companies.
Traditional JTB approaches mainly rely on manual market surveys, which are expensive and labor-intensive.
We reformulate JTB as a link prediction task over the Job-Graph, in which matched job titles should be linked.
arXiv Detail & Related papers (2020-09-16T02:33:32Z)
- Mining Implicit Relevance Feedback from User Behavior for Web Question Answering [92.45607094299181]
We conduct the first study to explore the correlation between user behavior and passage relevance.
Our approach significantly improves the accuracy of passage ranking without extra human labeled data.
In practice, this work has proved effective in substantially reducing the human labeling cost for the QA service in a global commercial search engine.
arXiv Detail & Related papers (2020-06-13T07:02:08Z)