Related papers: A practical method for occupational skills detection in Vietnamese job listings

URL: http://arxiv.org/abs/2210.14607v1
Date: Wed, 26 Oct 2022 10:23:18 GMT
Title: A practical method for occupational skills detection in Vietnamese job listings
Authors: Viet-Trung Tran, Hai-Nam Cao and Tuan-Dung Cao
Abstract summary: Lack of accurate and timely labor market information leads to skill mismatches. Traditional approaches rely on an existing taxonomy and/or large annotated datasets. We propose a practical methodology for skill detection in Vietnamese job listings.
Score: 0.16114012813668932
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The Vietnamese labor market has been developing unevenly. The number of university graduates is growing, but so is the unemployment rate. This situation is often caused by a lack of accurate and timely labor market information, which leads to skill mismatches between worker supply and actual market demand. To build a data monitoring and analytics platform for the labor market, one of the main challenges is automatically detecting occupational skills from labor-related data, such as resumes and job listings. Traditional approaches rely on an existing taxonomy and/or large annotated datasets to build Named Entity Recognition (NER) models. They are expensive and require substantial manual effort. In this paper, we propose a practical methodology for skill detection in Vietnamese job listings. Rather than viewing the task as an NER task, we treat it as a ranking problem. We propose a pipeline in which phrases are first extracted and ranked by semantic similarity to their contexts. Then we apply a final classification step to detect skill phrases. We collected three datasets and conducted extensive experiments. The results demonstrate that our methodology outperforms an NER model on scarce datasets.
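
The extract, rank, classify pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the phrase extractor, the similarity model, or the classifier, so everything below is an assumption. Candidate phrases are plain word n-grams, `toy_embed` is a character-trigram count vector standing in for a real sentence-embedding model, and a fixed score threshold stands in for the trained final classifier. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def extract_candidates(tokens, max_len=3):
    """Step 1 (assumed): every contiguous n-gram up to max_len is a candidate phrase."""
    candidates = set()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            candidates.add(" ".join(tokens[i:i + n]))
    return candidates


def toy_embed(text):
    """Stand-in embedding: character-trigram counts (a real system would use a learned model)."""
    padded = f"  {text.lower()}  "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_phrases(context, embed=toy_embed, max_len=3):
    """Step 2: score each candidate against its context and sort by descending score."""
    context_vec = embed(context)
    candidates = extract_candidates(tokenize(context), max_len)
    scored = [(phrase, cosine(embed(phrase), context_vec)) for phrase in candidates]
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


def detect_skills(context, threshold=0.5, **kwargs):
    """Step 3 (assumed): a score threshold stands in for the final classifier."""
    return [phrase for phrase, score in rank_phrases(context, **kwargs) if score >= threshold]
```

Passing a different `embed` callable to `rank_phrases` swaps in another similarity model without changing the rest of the sketch; the threshold value of 0.5 is arbitrary and would need tuning on labeled data.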

Related papers

Can Online GenAI Discussion Serve as Bellwether for Labor Market Shifts? [62.386835769570006]
This paper examines whether online discussions about Large Language Models can function as early indicators of labor market shifts. We employ four distinct analytical approaches to identify the domains and timeframes in which public discourse serves as a leading signal for employment changes. Our findings reveal that discussion intensity predicts employment changes 1-7 months in advance across multiple indicators, including job postings, net hiring rates, tenure patterns, and unemployment duration.
arXiv Detail & Related papers (2025-11-20T04:18:25Z)
JobHop: A Large-Scale Dataset of Career Trajectories [48.881023210777585]
JobHop is a large-scale public dataset derived from anonymized resumes provided by VDAB, the public employment service in Flanders, Belgium. We process unstructured resume data to extract structured career information, which is then mapped to standardized ESCO occupation codes. This results in a rich dataset of over 2.3 million work experiences, extracted from and grouped into more than 391,000 user resumes.
arXiv Detail & Related papers (2025-05-12T15:22:29Z)
Likelihood as a Performance Gauge for Retrieval-Augmented Generation [78.28197013467157]
We show that likelihoods serve as an effective gauge for language model performance. We propose two methods that use question likelihood as a gauge for selecting and constructing prompts that lead to better performance.
arXiv Detail & Related papers (2024-11-12T13:14:09Z)
Comparative Analysis of Encoder-Based NER and Large Language Models for Skill Extraction from Russian Job Vacancies [0.0]
This study compares Named Entity Recognition methods based on encoders with Large Language Models (LLMs) for extracting skills from Russian job vacancies. Results indicate that traditional NER models, especially DeepPavlov RuBERT NER tuned, outperform LLMs across various metrics including accuracy, precision, recall, and inference time. This research contributes to the field of natural language processing (NLP) and its application in the labor market, particularly in non-English contexts.
arXiv Detail & Related papers (2024-07-29T09:08:40Z)
Computational Job Market Analysis with Natural Language Processing [5.117211717291377]
This thesis investigates Natural Language Processing (NLP) technology for extracting relevant information from job descriptions. We frame the problem, obtain annotated data, and introduce extraction methodologies. Our contributions include job description datasets, a de-identification dataset, and a novel active learning algorithm for efficient model training.
arXiv Detail & Related papers (2024-04-29T14:52:38Z)
NNOSE: Nearest Neighbor Occupational Skill Extraction [55.22292957778972]
We tackle the complexity in occupational skill datasets. We employ an external datastore for retrieving similar skills in a dataset-unifying manner. We observe a performance gain in predicting infrequent patterns, with substantial gains of up to 30% span-F1 in cross-dataset settings.
arXiv Detail & Related papers (2024-01-30T15:18:29Z)
Making Pre-trained Language Models both Task-solvers and Self-calibrators [52.98858650625623]
Pre-trained language models (PLMs) serve as backbones for various real-world systems. Previous work shows that introducing an extra calibration task can mitigate this issue. We propose a training algorithm LM-TOAST to tackle the challenges.
arXiv Detail & Related papers (2023-07-21T02:51:41Z)
Cross-Lingual NER for Financial Transaction Data in Low-Resource Languages [70.25418443146435]
We propose an efficient modeling framework for cross-lingual named entity recognition in semi-structured text data. We employ two independent datasets of SMSs in English and Arabic, each carrying semi-structured banking transaction information. With access to only 30 labeled samples, our model can generalize the recognition of merchants, amounts, and other fields from English to Arabic.
arXiv Detail & Related papers (2023-07-16T00:45:42Z)
Detecting Fake Job Postings Using Bidirectional LSTM [0.0]
This study employs a Bidirectional Long Short-Term Memory (Bi-LSTM) model to identify fake job advertisements. The proposed model demonstrates a superior performance, achieving a 0.91 ROC AUC score and a 98.71% accuracy rate. The findings of this research contribute to the development of robust, automated tools that can help combat the proliferation of fake job postings.
arXiv Detail & Related papers (2023-04-03T20:05:27Z)
Design of Negative Sampling Strategies for Distantly Supervised Skill Extraction [19.43668931500507]
We propose an end-to-end system for skill extraction, based on distant supervision through literal matching. We observe that using the ESCO taxonomy to select negative examples from related skills yields the biggest improvements. We release the benchmark dataset for research purposes to stimulate further research on the task.
arXiv Detail & Related papers (2022-09-13T13:37:06Z)
MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning [65.52675802289775]
We show that an uncertainty aware classifier can solve challenging reinforcement learning problems. We propose a novel method for computing the normalized maximum likelihood (NML) distribution. We show that the resulting algorithm has a number of intriguing connections to both count-based exploration methods and prior algorithms for learning reward functions.
arXiv Detail & Related papers (2021-07-15T08:19:57Z)
DataOps for Societal Intelligence: a Data Pipeline for Labor Market Skills Extraction and Matching [5.842787579447653]
We formulate and solve this problem using DataOps models. We then focus on the critical task of skills extraction from resumes. We showcase preliminary results with applied machine learning on real data.
arXiv Detail & Related papers (2021-04-05T15:37:25Z)
Job2Vec: Job Title Benchmarking with Collective Multi-View Representation Learning [51.34011135329063]
Job Title Benchmarking (JTB) aims at matching job titles with similar expertise levels across various companies. Traditional JTB approaches mainly rely on manual market surveys, which are expensive and labor-intensive. We reformulate JTB as a link prediction task over the Job-Graph, in which matched job titles should be linked.
arXiv Detail & Related papers (2020-09-16T02:33:32Z)
Mining Implicit Relevance Feedback from User Behavior for Web Question Answering [92.45607094299181]
We present the first study to explore the correlation between user behavior and passage relevance. Our approach significantly improves the accuracy of passage ranking without extra human-labeled data. In practice, this work has proved effective in substantially reducing the human labeling cost for the QA service in a global commercial search engine.
arXiv Detail & Related papers (2020-06-13T07:02:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.