Trust in Motion: Capturing Trust Ascendancy in
Open-Source Projects using Hybrid AI
Huascar Sanchez
Computer Science Laboratory
SRI International
huascar.sanchez@sri.com
Briland Hitaj
Computer Science Laboratory
SRI International
briland.hitaj@sri.com
Abstract—Open-source is frequently described as a driver for
unprecedented communication and collaboration, and the process
works best when projects support teamwork. Yet, open-source
cooperation processes in no way protect project contributors from
considerations of trust, power, and influence. Indeed, achieving
the level of trust necessary to contribute to a project and thus
influence its direction is a constant process of change, and
developers take many different routes over many communication
channels to achieve it. We refer to this process of influence-
seeking and trust-building as trust ascendancy.
This paper describes a methodology for understanding the
notion of trust ascendancy and introduces the capabilities that are
needed to localize trust ascendancy operations happening over
open-source projects. Much of the prior work in understanding
trust in open-source software development has focused on a static
view of the problem using different forms of quantity measures.
However, trust ascendancy is not static, but rather adapts to
changes in the open-source ecosystem in response to new input.
This paper is the first attempt to articulate and study these signals
from a dynamic view of the problem. In that respect, we identify
related work that may help illuminate research challenges,
implementation tradeoffs, and complementary solutions. Our
preliminary results show the effectiveness of our method at capturing
the trust ascendancy developed by individuals involved in a
well-documented 2020 social engineering attack. Our future plans
highlight research challenges and encourage cross-disciplinary
collaboration to create more automated, accurate, and efficient
ways to model and then track trust ascendancy in open-source
projects.
Index Terms—trust ascendancy modeling, dynamic developer
activity embeddings, influence pathway trajectories
I. INTRODUCTION
Achieving the level of trust necessary to contribute to a
project is a ubiquitous construct of how open-source software
development works [30], [35] and one of the most prevalent
objectives [26] in the general developer population in social
coding platforms like GitHub and Stack Overflow. Achieving
this trust is a dynamic process of change [11] that is inherently
political [13], and developers take many different routes over
many communication channels to influence its formation [2],
[9], [18], [35], [37]. We refer to this process of influence-seeking
and trust-building as trust ascendancy. Much of the
prior work in understanding trust and its ascendancy in open-source
projects has focused on a static view of the problem
using scale measurements (e.g., [3], [15], [34], [38]). However,
trust ascendancy is not static. Instead, it adapts to changes
in the ecosystem in response to developer role changes, new
functionality, new technologies, and so on. Automatically
tracking this socio-technically stimulated dynamism thus requires
dynamic developer behavior models. This paper is a
first attempt to articulate and study this issue.
This material is based upon work supported by the Defense Advanced
Research Projects Agency (DARPA) under Contract No. HR00112190086.
Any opinions, findings, and conclusions or recommendations expressed in this
material are those of the author(s) and do not necessarily reflect the views of
the United States Government or DARPA.
We consider the problem of capturing the motion dynamics
of trust ascendancy inside open-source software (OSS) projects
using dynamic developer activity models. These motion dynamics
are reflected in the way trust is periodically developed
inside projects in response to either socio-technical stimuli
(e.g., social influence, role changes, code contributions) or
to periodic changes in the context of individual activities,
such as reporting a bug, that are intended to help potential
contributors build a reputation and eventually become project
committers (see the case study of a “successful socialization”
in Ducheneaut [13]). Understanding the context in which
actions are performed as well as tracking when (e.g., time of
day) this context changes can give us a global picture of the
influence pathways formed inside a project. Here, an influence
pathway is a potential conduit for influence to flow inside a
project and a schedule. The context of an activity embodies the
semantic associations between the activity and other activities
that were performed around the same schedule.
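The notion of context described above can be illustrated with a minimal sketch. It assumes a hypothetical activity log of (timestamp, alias, action) records and an assumed schedule bucket of (weekday, hour of day); neither the log format, the bucket size, nor the function names come from this paper. Activities that fall in the same bucket are paired as (target, context) examples, in the style of skip-gram training data.

```python
from collections import defaultdict
from datetime import datetime
from itertools import permutations

# Hypothetical activity log: (ISO timestamp, developer alias, action label).
LOG = [
    ("2020-08-03T09:05:00", "dev_a", "report_bug"),
    ("2020-08-03T09:40:00", "dev_b", "review_patch"),
    ("2020-08-03T15:10:00", "dev_a", "submit_patch"),
    ("2020-08-04T09:20:00", "dev_c", "ack_patch"),
]


def schedule_key(timestamp):
    """Map a timestamp to an assumed schedule bucket: (ISO weekday, hour)."""
    t = datetime.fromisoformat(timestamp)
    return (t.isoweekday(), t.hour)


def context_pairs(log):
    """Pair each activity with every other activity in the same bucket."""
    buckets = defaultdict(list)
    for timestamp, alias, action in log:
        buckets[schedule_key(timestamp)].append(f"{alias}:{action}")
    pairs = []
    for activities in buckets.values():
        # Ordered pairs: each activity serves once as target, once as context.
        pairs.extend(permutations(activities, 2))
    return pairs


pairs = context_pairs(LOG)
for target, context in pairs:
    print(target, "->", context)
```

In this toy log, only the two Monday 09:00 activities share a bucket, so they are the only ones that become each other's context. A coarser or finer bucket would change which activities count as co-occurring, which is a design choice this sketch leaves open.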
Arguably, influence is the main driver for building trust
inside networked social environments [37] like OSS projects.
The structure of these networks is usually “black-boxed,” and
to exert any influence in them, potential contributors need to
progressively make this network structure more visible [13].
A goal of the current effort is to bridge the gap mentioned
earlier, namely, that research on understanding trust and its
ascendancy tends to be based on static accounts. Consistent
with this goal, we introduce a hybrid approach that combines
the strength of unsupervised machine learning with the flexibility
of self-supervised machine learning, and generalize it to
sequential data collected from real-world software projects.
This work complements existing work by providing better
mapping and understanding of the multiple influence pathways
taken by developers to progressively open this “black-box,”
thereby enabling them to contribute to the project.
arXiv:2210.02656v2 [cs.SE] 10 Oct 2022
When we look at existing work in language evolution
detection using word embeddings [1], [12], [17], [20], [29],
[41], we observe the power of word embeddings at capturing
evolving semantic associations between words across time and
also at allowing for cross-pollination between the NLP and
other application domains (e.g., [4], [8], [10], [24], [33]). In
general, these models accurately model the distribution of an
object (e.g., words, security alerts) based on the surrounding
objects in terms of low-dimensional vector representations.
In building on this prior work, we are interested in learning
dynamic developer activity embeddings for evolving influence
pathways recovery. These embeddings will capture, in
a fashion that resembles how dynamic word embeddings infer
embedding trajectories over time, the temporal dependence
between concrete developer actions performed within OSS
projects. This will give us a natural backtrace trajectory of the
influence pathways taken by potential contributors. Figure 1
shows a concrete example of how we can use the dynamic
developer activity embeddings to reveal distinctly novel influence
trajectories that summarize a 2020 social engineering
attack against the Linux Kernel [7].
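To make the idea of an embedding trajectory concrete, the following sketch measures how far one activity's vector moves between consecutive time slices. It is an illustration under stated assumptions, not the method used in this paper: the per-slice vectors are hypothetical, they are assumed to already live in a shared (aligned) vector space, and cosine distance is one common choice of drift measure among several.

```python
import math

# Hypothetical, already-aligned 3-d embeddings of a single developer
# activity, one vector per weekly time slice.
SLICES = {
    "2020-W32": [1.0, 0.0, 0.0],
    "2020-W33": [0.8, 0.6, 0.0],
    "2020-W34": [0.0, 1.0, 0.0],
}


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def trajectory_drift(slices):
    """Return (from_slice, to_slice, distance) for consecutive slices."""
    keys = sorted(slices)  # these week labels sort chronologically
    return [
        (a, b, cosine_distance(slices[a], slices[b]))
        for a, b in zip(keys, keys[1:])
    ]


drift = trajectory_drift(SLICES)
for start, end, distance in drift:
    print(f"{start} -> {end}: {distance:.2f}")
```

A large step between two slices suggests that the activity's context shifted in that interval. Reading the steps in order gives a simple backtrace of the trajectory.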
The revealed trajectories show the potentially influenced
maintainers (see footnote 1) and the attackers’ trust ascendancy
operations across both time and subsystems. The objective of each
operation was to build trust and then to poison the Linux Kernel
with vulnerabilities (the “hypocrite commits” patches [39]).
The sections that follow provide details of the case study we
explored to highlight the challenge, as well as the methods
and data we used to recover these dynamic trajectories.
II. CASE STUDY: HIGHLIGHTING CHARACTERISTICS OF
THE CHALLENGE
The techniques described in this paper should apply to
thousands of significant OSS projects. We considered many
projects but narrowed our case studies to a single one: the
Linux Kernel and its accepted patches, described below.
Case Study: The Linux Kernel and its accepted patches. The
Linux Kernel (LK) development process is well known, documented,
and researched [14], [19]. However, some actions
of contributors or maintainers often diverge from this process.
One of these actions is accepting patches. Studying why a
technical change happens is as important as modeling either
the rejection or the oversight of design changes (e.g., [27]),
particularly if the acceptance is rooted in both social and
technical influence. To provide a control comparison, we will
model the influence pathways that have led to the integration
of controversial changes such as the hypocrite commits [39].
Could this event be indicative of an intrinsic susceptibility to
socio-technical influence in general OSS projects?
We rely on the high level of transparency offered by the
kernel and the wealth of signals available in its development
process to learn a dynamic model of developer behavior.
Footnote 1: Manipulated directly or indirectly in response to other maintainers’ actions.
(a) Opportunistic trust ascendancy.
(b) Awry trust ascendancy.
(c) Hit or miss trust ascendancy.
Fig. 1: A 2D t-SNE projection of three trust ascendancy trajectories
across a 14-week time window (08/2020 to 11/2020).
Red dots represent attackers’ behavior, green dots with a red
ring represent maintainers’ behavior, purple dots represent the
behavior of other aliases used by the attackers, labels are
action abbreviations, and blue lines represent the trajectory
of attackers’ trust ascendancy. Maintainers’ distance to these
trajectories is an indicator of trust development with attackers’
actions.
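The distance indicator mentioned in the caption can be sketched as a point-to-polyline distance in the 2D projection. The coordinates, maintainer names, and function names below are hypothetical and chosen only for illustration. The sketch also assumes plain Euclidean distance in the projected plane, which should be read with care because t-SNE does not preserve global distances.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment from a to b (2D)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_trajectory(point, trajectory):
    """Smallest distance from a point to any segment of a polyline."""
    return min(
        point_segment_distance(point, a, b)
        for a, b in zip(trajectory, trajectory[1:])
    )


# Hypothetical projected coordinates: one attacker trajectory (blue line)
# and two maintainers (green dots with a red ring).
TRAJECTORY = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0)]
MAINTAINERS = {"maintainer_x": (2.0, 1.0), "maintainer_y": (7.0, 3.0)}

distances = {
    name: distance_to_trajectory(xy, TRAJECTORY)
    for name, xy in MAINTAINERS.items()
}
for name, d in sorted(distances.items(), key=lambda item: item[1]):
    print(f"{name}: {d:.2f}")
```

Ranking maintainers by this distance gives a rough ordering of who sits closest to a given trajectory. Computing the same distance in the original embedding space, rather than in the projection, would be a reasonable alternative.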