

A practical method for occupational skills detection in Vietnamese job listings

Viet-Trung Tran, Hai-Nam Cao, and Tuan-Dung Cao

Hanoi University of Science and Technology, Vietnam

{trungtv,namch,dungct}@soict.hust.edu.vn

Abstract. The Vietnamese labor market has developed in an imbalanced way. The number of university graduates is growing, but so is the unemployment rate. This situation is often caused by a lack of accurate and timely labor market information, which leads to skill mismatches between worker supply and actual market demand. To build a data monitoring and analytics platform for the labor market, one of the main challenges is to automatically detect occupational skills in labor-related data such as resumes and job listings. Traditional approaches rely on an existing taxonomy and/or large annotated datasets to build Named Entity Recognition (NER) models. They are expensive and require huge manual effort. In this paper, we propose a practical methodology for skill detection in Vietnamese job listings. Rather than viewing the task as a NER task, we treat it as a ranking problem. We propose a pipeline in which phrases are first extracted and then ranked by their semantic similarity to their contexts. We then apply a final classification step to detect skill phrases. We collected three datasets and conducted extensive experiments. The results demonstrate that our methodology achieves better performance than a NER model on scarce datasets.

Keywords: skill extraction, named entity recognition, text embedding, text ranking

1 Introduction

The labor market is the foundation of, and a key driver for, economic growth. To operate efficiently, the labor market needs information. Policymakers rely on the relationship between labor supply and demand to chart economic and social policies. Educators need to align curriculum development with employers' demands, especially in fast-changing sectors. Job seekers need information on skill requirements, company profiles, working conditions, and growth trajectories. In an increasingly digitized economy, labor market policymakers need to continuously maintain an up-to-date vision, focusing on growing skills that are less likely to be replaced by automation [6].

Regulators of the Vietnamese labor market clearly lack up-to-date and relevant information for making market-driven decisions. The labor force surveys, which are conducted quarterly by the General Statistics Office of Vietnam (GSO), face six key challenges: freshness, accuracy, coverage, analysis, usability, and cost. As a consequence, the Vietnamese labor market has developed in an imbalanced way for many years. The number of university graduates is growing, but so is the unemployment rate. In the first quarter of 2018 [10], the share of people with intermediate and college degrees who found jobs was 79.1% and 72.9%, respectively; meanwhile, only 55.6% of university graduates had jobs.

arXiv:2210.14607v1 [cs.CL] 26 Oct 2022

Meanwhile, Vietnamese job portals have become an important bridge between recruitment managers and job seekers. Over the years, these portals have accumulated a growing amount of digital labor market data, such as job listings and applicants' resumes. However, the exploitation of these data is limited, as the portals only provide job categories and keyword-based search functionality.

To enable advanced analysis, it is imperative to have a model that can automatically detect skills in labor market data. Such a model can benefit advanced labor market analysis and ultimately help orient workforce training and re-skilling programs. Various approaches [18,13,11,17] treat skill detection as a Named Entity Recognition (NER) task in natural language processing. They share a common drawback: a large number of labeled sentences is needed to train the NER models in a supervised setting. Other approaches detect skills in a given document by directly matching n-gram sequences against terms in a target taxonomy [9,2,7]. These approaches, however, do not work for the Vietnamese language, as no such taxonomy exists yet.
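To make the taxonomy-matching idea concrete, the following is a minimal sketch of direct n-gram matching, not the implementation used in the cited works. The function names, the whitespace tokenizer, and the toy taxonomy are all assumptions introduced for illustration.

```python
from __future__ import annotations


def ngrams(tokens: list[str], max_n: int) -> list[str]:
    """Return every n-gram of 1..max_n tokens, shortest n first."""
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def match_taxonomy(text: str, taxonomy: set[str], max_n: int = 3) -> list[str]:
    """Return taxonomy terms that occur verbatim as n-grams in the text."""
    tokens = text.lower().split()  # naive whitespace tokenization (assumption)
    terms = {t.lower() for t in taxonomy}
    found: list[str] = []
    for gram in ngrams(tokens, max_n):
        if gram in terms and gram not in found:
            found.append(gram)
    return found


# Hypothetical toy taxonomy; a real one would be a curated skill vocabulary.
TOY_TAXONOMY = {"python", "machine learning", "project management"}

print(match_taxonomy("Experience with Python and machine learning required",
                     TOY_TAXONOMY))
```

The sketch also shows why this family of methods depends entirely on the taxonomy: a phrase that is absent from the term set can never be detected, however clearly it names a skill.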

On Vietnamese job listing websites, a job opening usually follows a common semi-structured format. Each job opening has the following sections:

– Title: A short, one-sentence highlight of the job to attract job seekers. The title often mentions the job position, job level, and salary range.

– Description: One paragraph or a list that describes the job characteristics: what the work is and how it will be carried out.

– Compensation: One paragraph or a list that shows the salary range and the benefits paid to employees in exchange for the services they provide.

– Requirements: One paragraph or a list that contains the experience, qualifications, and skills necessary for candidates to be considered for the role.

– About the company: A brief introduction to the company and its environment.

– Contact point: An email address and a phone number for submitting applications and asking questions about them.

The order of these sections may vary; however, most skill mentions appear in the requirements section. In this paper, we present a practical approach for skill detection in Vietnamese job listings. Rather than viewing the task as a NER task, we model it as a ranking problem. Our approach exploits a structural property of job descriptions: any skill mention found in a requirements section will have a high semantic similarity score with the section itself.
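The ranking idea can be illustrated with a small sketch: score each candidate phrase by its cosine similarity to the requirements section and sort by that score. This is only an illustration under stated assumptions. In particular, `toy_embed` is a hypothetical stand-in based on character trigram counts, not the embedding model used in this paper, and `rank_phrases` is a name introduced here.

```python
import math
from collections import Counter


def toy_embed(text: str) -> dict:
    """Stand-in embedding: character-trigram counts (NOT the paper's model)."""
    padded = f"  {text.lower()}  "
    return dict(Counter(padded[i:i + 3] for i in range(len(padded) - 2)))


def cosine(u: dict, v: dict) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_phrases(phrases, section, embed=toy_embed):
    """Rank candidate phrases by similarity to their enclosing section."""
    context = embed(section)
    scored = [(p, cosine(embed(p), context)) for p in phrases]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical requirements section and candidate phrases.
section = "Requirements: good knowledge of Python and SQL"
for phrase, score in rank_phrases(["free lunch", "python"], section):
    print(f"{phrase}: {score:.3f}")
```

Passing the embedding function as a parameter keeps the ranking logic independent of the model, so a learned sentence or phrase embedding could be swapped in without changing `rank_phrases`.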

The rest of this paper is organized as follows. Section 2 outlines the main steps of the proposed method. Section 3 describes in detail the implementation of those steps: embedding, phrase mining, term ranking, and term classification. Section 4 presents a comprehensive experimental study that validates the proposed method. Section 5 concludes with a summary of results and future work.

2 Methodology

Our method is depicted in Figure 1. Compared with the traditional NER approach, our methodology is more practical and less expensive in terms of manual effort. It is a pipeline composed of four layers:
