

OSS Mentor: A framework for improving developers’ contributions via

deep reinforcement learning

Jiakuan Fan

School of Data Science and Engineering

East China Normal University

Shanghai, China

jkfan@stu.ecnu.edu.cn

Haoyue Wang

School of Data Science and Engineering

East China Normal University

Shanghai, China

51195100024@stu.ecnu.edu.cn

Wei Wang

School of Data Science and Engineering

East China Normal University

Shanghai, China

wwang@dase.ecnu.edu.cn

Ming Gao

School of Data Science and Engineering

East China Normal University

Shanghai, China

mgao@dase.ecnu.edu.cn

Shengyu Zhao

College of Electronical and Information Engineering

Tongji University

Shanghai, China

frank_zsy@tongji.edu.cn

Abstract—In open source project governance, much attention has been paid to how to measure developers’ contributions. However, very little work has focused on enabling developers to improve their contributions, even though this is significant and valuable. In this paper, we introduce a deep reinforcement learning framework named Open Source Software (OSS) Mentor, which can be trained from empirical knowledge and then adaptively help developers improve their contributions. Extensive experiments demonstrate that OSS Mentor achieves excellent results. Moreover, this is the first time deep reinforcement learning techniques have been explored for managing open source software, which enables us to design a more robust framework to improve developers’ contributions.

Index Terms—open source software, contribution measurement, contribution enhancement, deep reinforcement learning

I. INTRODUCTION

In recent years, open source software, represented by Apache, has achieved great success [17]. Developers with different levels of development experience in different regions gather spontaneously because of their interests, prestige, and employment needs [18]. Developers of open source software generally contribute to a project by sharing experience [12], debugging code [1], and submitting functional patches [2]. Many studies use project contribution as an important indicator for evaluating the status of developers [7].

There are many use cases in which we need to compare and recognize different developers’ contributions. While traditional value-based software engineering [3], [4], [8] focuses on creating economic value as a way to prioritize resource allocation and scheduling, other measurements of value may be more relevant in some use cases. For example, instructors need a tool to evaluate individual students’ code contributions to group projects (in addition to non-code contributions); such a measurement of code contributions has nothing to do with economic returns. As a second example, an engineering manager may need a quantitative measurement of team members’ performance. Additionally, for open source software projects, developers’ contributions heavily influence collaboration, coordination, and leadership [11], [14].

Therefore, the main research significance of this topic lies in modeling the data of a project’s top contributors and quantitatively analyzing the correlations among all actions, so as to provide insights for improving developers’ contributions. Increasing developers’ contributions helps software engineering project managers set up project teams based on developers’ profiles, thereby improving the productivity and code quality of the teams. As far as we know, there is no guidance on how to improve developers’ contributions in open source projects; existing guidance mostly concerns how to participate in them. This is because developers can contribute in many ways, which makes a general guide difficult to produce.

At present, when developers want to participate in open source projects, the most common guidance is the contribution guidelines. Contribution guidelines are textual documentation files that embody a software project’s contribution process and document the contribution expectations of project maintainers. However, there has yet to be an exploration of what contribution guidelines contain and whether projects adhere to the workflows they prescribe. To our knowledge, almost no existing work aims to improve developers’ contributions in open source software, even though doing so is significant and valuable.

In this paper, we first define a model of contribution that reflects the relative importance of actions over time, considering all possible actions (both code and non-code) from the perspective of the whole project. Further, we propose the Open Source Software (OSS) Mentor framework, which helps developers maximize their contributions by translating the practical problem into a reinforcement learning problem. In addition, we significantly improve the performance of the algorithm by enhancing the utilization of parameters during training. The main contributions of our paper are as follows:

arXiv:2210.13990v1 [cs.SE] 24 Oct 2022

1. We propose a data-driven contribution evaluation model, which dynamically measures developers’ contributions based on changes in data.

2. We address the challenge of how to improve developers’ contributions, a significant problem that has rarely been studied.

3. This is the first time deep reinforcement learning techniques have been explored for managing open source software, which enables us to design a more robust framework to improve developers’ contributions.

4. We have performed extensive experiments that demonstrate the effectiveness of our proposed framework.

The remainder of the paper is organized as follows. Section II presents our proposed framework, OSS Mentor. Section III validates our model. Section IV reviews related work, and Sections V and VI present the discussion and summary.

II. OSS MENTOR FRAMEWORK

In this section, we first give a framework for contribution assessment and show the overall architecture of the proposed model. We then quantify the essential elements of reinforcement learning in the context of the practical problem. After that, we describe the algorithmic flow of the model. Finally, we describe the training process of the model.

A. Overview

Definition of contribution. Previous work on measuring developers’ contributions was mostly at the visualization level. Quantification typically meant directly counting the number of issues, issue comments, PRs, PR comments, and so on, and summing these counts with empirically assigned weights [16]. This method is unreliable because it does not analyze the project’s data to obtain results that conform to objective regularities, and it reflects neither the characteristics of individual projects nor changes over time. In our work, we first introduce the concept of entropy to measure how well an action distinguishes between developers, and then analyze project-wide data to determine the weighting relationship between actions from an objective perspective. The entropy is calculated as:

\[ H(X) = -\sum_{i=1}^{n} P(x_i) \log P(x_i) \tag{1} \]

Here $P(x_i)$ represents the probability of the event $x_i$. In information theory, entropy represents the degree of dispersion of information: the higher the dispersion, the greater the entropy and the greater the amount of information represented. Entropy is therefore greatest when the distribution is uniform. In our work, we select the action events actually executed by developers in open source projects and measure the dispersion of each action by calculating the entropy of its events. The higher the entropy $H(X)$ on the i-th action dimension, the greater the amount of information and the less the i-th action distinguishes between developers, which means that everyone is more inclined to perform it. Next, we use the entropy method to determine the importance of each action in an open source project.
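As a rough illustration of Equation 1, the sketch below computes the entropy of a single action dimension. It assumes that $P(x_i)$ is developer i's share of the total count for that action, which is the usual convention in entropy-based weighting but is not spelled out in the text; the function names and the counts are hypothetical.

```python
import math


def shannon_entropy(probabilities):
    """Return H(X) = -sum_i P(x_i) * log P(x_i), skipping zero-probability events."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def action_entropy(counts_per_developer):
    """Entropy of one action dimension, from per-developer counts of that action.

    Assumption: P(x_i) is developer i's share of the action's total count.
    """
    total = sum(counts_per_developer)
    if total == 0:
        return 0.0
    return shannon_entropy(c / total for c in counts_per_developer)


# Hypothetical counts of one action type across four developers.
evenly_spread = action_entropy([5, 5, 5, 5])   # everyone performs the action equally
concentrated = action_entropy([20, 0, 0, 0])   # only one developer performs it
```

Under these assumptions the evenly spread case reaches the maximum value log(4), matching the statement that entropy is greatest for a uniform distribution, while the concentrated case gives zero.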

However, information entropy presupposes independence between actions. In practice, actions share mutual information (e.g., there is a strong correlation between issue and issue comment), so they do not directly satisfy the requirements of entropy-based weighting. To solve this mutual-information problem, we replace information entropy with conditional entropy:

\[ H_i(Y \mid X) = -\sum_{x \in X,\, y \in Y} p_i(x, y) \log \frac{p_i(x, y)}{p_i(x)} \tag{2} \]
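The conditional entropy in Equation 2 can be estimated from observed pairs of action values. The following is a minimal sketch, assuming empirical frequencies are used for $p_i(x, y)$ and $p_i(x)$; the function name and the sample pairs are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def conditional_entropy(pairs):
    """Estimate H(Y|X) = -sum_{x,y} p(x, y) * log(p(x, y) / p(x)).

    `pairs` is a list of observed (x, y) tuples; probabilities are
    empirical frequencies (an assumption of this sketch).
    """
    joint = Counter(pairs)
    marginal_x = Counter(x for x, _ in pairs)
    n = len(pairs)
    h = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        p_x = marginal_x[x] / n
        h -= p_xy * math.log(p_xy / p_x)
    return h


# Hypothetical observations: x = "opened an issue", y = "wrote an issue comment".
fully_dependent = [(0, 0), (0, 0), (1, 1), (1, 1)]   # y is determined by x
independent = [(0, 0), (0, 1), (1, 0), (1, 1)]       # x says nothing about y

h_dependent = conditional_entropy(fully_dependent)
h_independent = conditional_entropy(independent)
```

When y is fully determined by x, no uncertainty remains and the estimate is zero; when the two are independent and uniform, the estimate equals the plain entropy of y, log(2).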

By introducing conditional probabilities, we remove the need to assume that actions in open source projects are independent of each other; issue and issue comment, for example, are not independent. With the above method, we can calculate the weight vector $W_i$ for each action dimension of the project. Finally, the contribution is calculated as:

\[ C_{(i,k)} = \sum_{t=1}^{T} W_{(i,k)} \ast A^{t}_{(i,k)} \tag{3} \]

where $A^{t}_{(i,k)}$ denotes how many times the action is executed at the t-th step of the k-th episode of the i-th project. $W_{(i,k)}$ is computed from the conditional entropy model and has been normalized, which means $\sum_{t=1}^{T} W_t = 1$.
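One way to read Equation 3 is as a weighted sum of action counts accumulated over the steps of an episode. The sketch below follows that reading, treating the weights as one normalized value per action type and each step as a mapping from action type to count; this interpretation, the function names, and all numbers are assumptions made for illustration only.

```python
def normalize(raw_weights):
    """Scale raw per-action weights so that they sum to 1."""
    total = sum(raw_weights.values())
    return {action: w / total for action, w in raw_weights.items()}


def contribution(weights, steps):
    """Sum weight * count over every action executed at every step of an episode."""
    return sum(
        weights[action] * count
        for step in steps
        for action, count in step.items()
    )


# Hypothetical raw weights, standing in for values derived from conditional entropy.
weights = normalize({"issue": 2.0, "issue_comment": 1.0, "pr": 5.0, "pr_comment": 2.0})

# Hypothetical episode with two steps; each step records action counts.
episode = [
    {"issue": 1, "issue_comment": 3},
    {"pr": 1, "pr_comment": 2},
]

score = contribution(weights, episode)  # 0.2*1 + 0.1*3 + 0.5*1 + 0.2*2
```

With these illustrative numbers the episode scores 1.4. Recomputing the weights from fresh project data would change the score for the same actions, which is the sense in which the measure is dynamic.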

Our approach to determining the weights has two important properties. First, the weights are determined entirely from data, unlike previous work [16], in which they are set manually from expert experience. Second, and most importantly, the weights are dynamic: changes in project data over time and changes in project status, among other factors, cause the weights to be updated. The weights we define therefore reflect not only the relative importance of actions over time, but also the relative importance of actions that is specific to each project.

Proposed model. This is the first time that deep reinforcement learning has been explored in the field of open source project governance. The goal of the model is to maximize the cumulative contribution of the developer after multiple executions of actions. The detailed flow of the model is given in Figure 1. First, the environment is a pre-trained model that uses the contribution quantification method (Equation 3) to pre-train the weight vector $W_i$, and contains the contributors’ action dataset $E_i = \{e_i^1, e_i^2, e_i^3, \cdots, e_i^n\}$. The developer, acting as the agent, selects action $a^t_{d,i}$ in state $s^t_i$ according to its own policy. At this point, $a^t_{d,i}$ matches the sequence of actions $a^t_{e,i}$ with the corresponding contributor that matches the current state of the developer’s ability from a project $P_i$
