A practical method for occupational skills detection in
Vietnamese job listings
Viet-Trung Tran, Hai-Nam Cao, and Tuan-Dung Cao
Hanoi University of Science and Technology, Vietnam
{trungtv,namch,dungct}@soict.hust.edu.vn
Abstract. The Vietnamese labor market has been developing in an imbalanced way.
The number of university graduates is growing, but so is the unemployment rate.
This situation is often caused by the lack of accurate and timely labor market
information, which leads to skill mismatches between worker supply and
actual market demand. To build a data monitoring and analytics platform for the
labor market, one of the main challenges is to automatically detect
occupational skills from labor-related data, such as resumes and job listings.
Traditional approaches rely on an existing taxonomy and/or large annotated datasets to
build Named Entity Recognition (NER) models. They are expensive and require
substantial manual effort. In this paper, we propose a practical methodology for skill
detection in Vietnamese job listings. Rather than viewing the task as a NER task,
we consider the task as a ranking problem. We propose a pipeline in which phrases
are first extracted and ranked in semantic similarity with the phrases’ contexts.
Then we employ a final classification to detect skill phrases. We collected three
datasets and conducted extensive experiments. The results demonstrated that our
methodology achieved better performance than a NER model when labeled data is scarce.
Keywords: skill extraction, named entity recognition, text embedding, text ranking
1 Introduction
The labor market is the foundation and a key driver of economic growth. To operate
efficiently, the labor market needs information. Policymakers rely on labor supply
and demand relationships to chart economic and social policies. Educators need
to align curriculum development with employers’ demand, especially in fast-
changing sectors. Job seekers need information on skill requirements, company
profiles, work conditions, and growth trajectories. In an increasingly digitized
economy, labor market policymakers need to continuously maintain an up-to-date
view, focusing on growing skills that are less likely to be replaced by
automation [6].
It is clear that regulators of the Vietnamese labor market lack up-to-date and relevant
information to make market-driven decisions. The labor force surveys,
which are conducted quarterly by the General Statistics Office of Vietnam (GSO), typically face
six key challenges: freshness, accuracy, coverage, analysis, usability, and cost. As a
consequence, the Vietnamese labor market has been developing in an imbalanced way
for many years: the number of university graduates is growing, but so
is the unemployment rate. In the first quarter of 2018 [10], the share of people
with intermediate and college degrees who found jobs was 79.1% and 72.9%,
respectively; meanwhile, only 55.6% of university graduates had jobs.
arXiv:2210.14607v1 [cs.CL] 26 Oct 2022
Vietnamese job portals, meanwhile, are considered an important bridge
between recruitment managers and job seekers. Over the years, these portals
have accumulated a growing amount of digital labor-related market data such as
job listings and applicants’ resumes. However, the exploitation of these data is
limited as these portals only provide job categories and keyword-based search
functionality.
To enable advanced analysis, it is imperative to have a model that can automatically
detect skills from labor market-related data. Such a model can benefit advanced labor
market analysis and ultimately help orient workforce training and
reskilling programs. Various approaches [18,13,11,17] treat this skill detection
task as a Named Entity Recognition (NER) task in natural language processing.
They share a common drawback: a large number of labeled sentences is needed to
train the NER models in a supervised setting. Other approaches detect skills in
a given document by directly matching n-gram sequences against
terms in a target taxonomy [9,2,7]. These approaches, however, do not work for
the Vietnamese language, as no such taxonomy exists yet.
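To make the taxonomy-matching idea concrete, the following is a minimal sketch of direct n-gram matching, not the implementation of any cited system. The function names, the tokenization, and the three-term toy taxonomy are all assumptions introduced for illustration.

```python
from typing import Iterator


def ngrams(tokens: list[str], max_n: int) -> Iterator[str]:
    """Yield every n-gram of length 1..max_n as a space-joined phrase."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def match_taxonomy(text: str, taxonomy: set[str], max_n: int = 3) -> list[str]:
    """Return taxonomy terms that occur verbatim as n-grams in the text."""
    # Naive tokenization (assumption): lowercase, drop commas and periods.
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    normalized = {term.lower() for term in taxonomy}
    found: list[str] = []
    for phrase in ngrams(tokens, max_n):
        if phrase in normalized and phrase not in found:
            found.append(phrase)
    return found


# Hypothetical toy taxonomy; a real one would come from a curated resource.
TOY_TAXONOMY = {"python", "machine learning", "project management"}

skills = match_taxonomy(
    "Experience with Python and machine learning, good communication.",
    TOY_TAXONOMY,
)
print(skills)
```

In this toy run, "good communication" is not detected because it is absent from the taxonomy, which illustrates why the approach depends entirely on taxonomy coverage.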
On Vietnamese job listing websites, a job opening usually follows a common
semi-structured format. Each job opening has the following sections:
– Title A short, one-sentence highlight of the job to attract job seekers.
The title often mentions the job position, job level, and salary range.
– Description One paragraph or a list that describes the job characteristics:
what the work is and how it will be carried out.
– Compensation One paragraph or a list that shows the salary range and benefits
paid to employees in exchange for the services they provide.
– Requirements One paragraph or a list that contains the experience,
qualifications, and skills necessary for candidates to be considered for a role.
– About the company A brief introduction to the company and its
environment.
– Contact point An email address and a phone number for submitting
applications and asking questions about them.
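As an illustration only, the six sections above could be held in a simple record type. The class name, field names, and helper method below are hypothetical and are not taken from any job portal's schema.

```python
from dataclasses import dataclass, fields


@dataclass
class JobListing:
    """One job opening, with one text field per section (hypothetical names)."""
    title: str = ""
    description: str = ""
    compensation: str = ""
    requirements: str = ""
    about_company: str = ""
    contact_point: str = ""

    def present_sections(self) -> list[str]:
        """Return the names of the sections that are non-empty in this listing."""
        return [f.name for f in fields(self) if getattr(self, f.name).strip()]


listing = JobListing(
    title="Senior Java Developer",
    requirements="3 years of experience with Java and SQL.",
)
print(listing.present_sections())
```

Storing sections by name rather than by position keeps the record usable when a portal presents the sections in a different order or omits some of them.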
The order of these sections may vary; however, most skill mentions appear in
the requirements section. In this paper, we present a practical approach for skill
detection in Vietnamese job listings. Rather than viewing the task as a NER task,
we model the task as a ranking problem. Our approach exploits the structural
property of a job description: any skill mention found in a requirement section
will have a high semantic similarity score with the section itself.
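The ranking idea can be sketched as follows, under strong simplifying assumptions. The `embed` function here is a bag-of-words stand-in chosen only so that the example runs without external models; it captures lexical overlap rather than true semantic similarity and is not the embedding used in this paper. The function names and the sample text are likewise hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Stand-in embedding (assumption): bag-of-words token counts."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_phrases(phrases: list[str], section: str) -> list[tuple[str, float]]:
    """Score each candidate phrase against its section, highest score first."""
    context = embed(section)
    scored = [(p, cosine(embed(p), context)) for p in phrases]
    return sorted(scored, key=lambda item: item[1], reverse=True)


requirements = (
    "Requirements: 2 years of experience with Java, "
    "knowledge of SQL databases, teamwork skills."
)
ranked = rank_phrases(["free lunch", "Java", "SQL databases"], requirements)
for phrase, score in ranked:
    print(f"{phrase}: {score:.3f}")
```

A phrase that shares no tokens with the section, such as "free lunch" here, receives a score of zero and falls to the bottom of the ranking. A learned text embedding would replace `embed` to compare meaning rather than surface tokens.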
The rest of this paper is organized as follows: we start in Section 2 by outlining
the main steps of the proposed method. In Section 3, we describe in detail the
implementation of the tasks in the previous section: embedding, phrase mining,
term ranking, and term classification. In Section 4, we carry out a comprehensive
experimental study to validate the proposed method. We conclude with a summary
of results and future work in Section 5.
2 Methodology
Our method is depicted in Figure 1. In comparison to the traditional NER approach,
our methodology is more practical and less expensive in terms of manual effort.
It is a pipeline composed of four layers: