
Do Software Security Practices Yield Fewer
Vulnerabilities?
Nusrat Zahan, Shohanuzzaman Shohan, Dan Harris and Laurie Williams
North Carolina State University
Raleigh, USA
Email: [nzahan, sshohan, doharris, lawilli3]@ncsu.edu
Abstract—Due to the ever-increasing number of security
breaches, practitioners are motivated to produce more secure
software. In the United States, the White House Office released
a memorandum on Executive Order (EO) 14028 that mandates
organizations provide self-attestation of the use of secure software
development practices. The OpenSSF Scorecard project allows
practitioners to measure the use of software security practices au-
tomatically. However, little research has been done to determine
whether the use of security practices improves package security,
particularly which security practices have the biggest impact on
security outcomes. The goal of this study is to assist practitioners
and researchers in making informed decisions on which security
practices to adopt through the development of models between
software security practice scores and security vulnerability counts.
To that end, we developed five supervised machine learning
models for npm and PyPI packages using the OpenSSF Score-
card security practices scores and aggregate security scores as
predictors and the number of externally-reported vulnerabilities
as a target variable. Our models found that four security practices
(Maintained, Code Review, Branch Protection, and Security Policy)
were the most important practices influencing vulnerability
count. However, we had low R² (ranging from 9% to 12%) when
we tested the models to predict vulnerability counts. Additionally,
we observed that the number of reported vulnerabilities increased
rather than reduced as the aggregate security score of the
packages increased. Both findings indicate that additional factors
may influence the package vulnerability count. Other factors,
such as the scarcity of vulnerability data, the time to implement
security practices vs. the time to detect vulnerabilities, and the
need for more adequate scripts to detect security practices, may
prevent data-driven studies from showing that a practice aids
in the reduction of externally-reported vulnerabilities. We suggest
that vulnerability count and security score data be refined such
that these measures may be used to provide actionable guidance
on security practices.
I. INTRODUCTION
A 2022 report from Synopsys [1] assessed the reliance
of the software industry on open-source software (OSS) and
estimated that 97% of applications use OSS. The continuous
reliance on OSS comes with a risk of supply chain attack.
Sonatype has recorded an average 700% jump in supply chain
attacks [2], as measured by the number of newly-published
malicious packages in open-source repositories. Software de-
velopers largely did not anticipate how the software supply
chain would become a deliberate attack vector. Many stake-
holders have started to recognize this urgent concern; most
prominently, Executive Order 14028 [3], issued May 12, 2021,
calls for identifying practices that enhance software security
and mandating the use of software security practices.
As organizations seek to address escalating security risks
and comply with regulations, a myriad of activities are avail-
able to improve software security. The National Institute of
Standards and Technology (NIST) issued guidance [6], [7] on
software development practices that enhance the security of the
software supply chain. The Open Source Security Foundation
(OpenSSF), a cross-industry organization hosted at the Linux
Foundation, provides the vehicle for collaboration on tools,
services, training, infrastructure, and resources for securing
open-source projects. The OpenSSF Scorecard [8] project
automatically computes scores for a package’s security practices to
help developers make better decisions about security when
consuming open-source projects.
While NIST’s guidelines and OpenSSF projects provide
guidelines and comprehensive lists of security practices, chal-
lenges arise in validating whether these practices improve se-
curity. Organizations may not have the resources to adopt a full
suite of these practices. They would like to understand the key
drivers of success, especially which of many possible software
security practices to undertake first. A data-driven study on the
relationship between the use of software security practices and
vulnerability count could aid in this understanding. The goal
of this study is to assist practitioners and researchers in
making informed decisions on which security practices to
adopt through the development of models between software
security practice scores and security vulnerability counts.
To that end, this study is focused on the relationship between
publicly-available data of security practice use and externally-
reported vulnerability count. This relationship may be used
to provide actionable recommendations to practitioners on
security practices. Our work addresses the following research
questions:
RQ1 (Security Practices): Which Scorecard security
practices are most important to understand the relation-
ship between security practices and vulnerability counts
in regression models?
RQ2 (Security Outcome): Do packages with higher
aggregate security scores have fewer vulnerabilities?
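One way to read RQ2 is as a question about the sign of a regression slope: if higher aggregate scores went with fewer vulnerabilities, regressing vulnerability count on aggregate score would give a negative slope. The sketch below fits a one-variable least-squares line and reports the slope and R². The data points are made up for illustration and are not taken from the study, and the single-predictor model is a simplifying assumption rather than the model used in this paper.

```python
# Made-up (aggregate score, vulnerability count) pairs, for illustration only.
scores = [2.0, 3.5, 4.0, 5.5, 6.0, 7.5]
vulns = [0, 1, 0, 2, 1, 3]

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(ys, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    return 1.0 - ss_res / ss_tot

slope, intercept = fit_line(scores, vulns)
r2 = r_squared(vulns, [intercept + slope * s for s in scores])

# A negative slope would support "higher score, fewer vulnerabilities";
# a positive slope would point the other way.
print(f"slope={slope:.3f}, R^2={r2:.3f}")
```

A low R² in such a fit would mean that the score explains little of the variation in vulnerability counts, regardless of the sign of the slope.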
arXiv:2210.14884v2 [cs.CR] 15 Jun 2023
The OpenSSF Scorecard project [8] uses security practice
metrics and auto-generates a “security score” for each practice.
The tool also computes an aggregate “security score”, which
is a weighted average of the individual scores. In this study,
we leverage the security practice score data provided by
the OpenSSF Scorecard project and vulnerability information
from security advisory databases to evaluate the relationship
between security practices and vulnerability count. To answer
RQ1, we used the “feature importance” technique in four
regression models to understand the importance of each
security practice in interpreting the relationship between security
practices and vulnerability count. For RQ2, we built our fifth
regression model to evaluate the statistical association of a
package’s aggregate security score with its vulnerability
count and to understand whether implementing security
practices helps produce code with fewer vulnerabilities.
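To make the idea of “feature importance” concrete, the sketch below uses permutation importance: shuffle one predictor column at a time and record how much the model’s R² drops. This is one common variant and is swapped in here for illustration; the text above does not say which importance technique the regression models use. The data, the two practice columns, and the `predict` function (a stand-in for an already-fitted model) are all hypothetical.

```python
import random

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1.0 - ss_res / ss_tot

def permutation_importance(predict, rows, y_true, n_repeats=20, seed=0):
    """Mean drop in R^2 when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = r_squared(y_true, [predict(r) for r in rows])
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            # Rebuild each row with only this column replaced.
            permuted = [r[:col] + [v] + r[col + 1:]
                        for r, v in zip(rows, shuffled)]
            score = r_squared(y_true, [predict(r) for r in permuted])
            drops.append(baseline - score)
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical toy data: two practice scores (0-10) per package.
rows = [[0, 10], [2, 0], [4, 10], [6, 0], [8, 10], [10, 0]]
y = [5.0, 4.0, 3.0, 2.0, 1.0, 0.0]  # made-up vulnerability counts

def predict(row):
    # Hypothetical fitted model that only uses the first practice score.
    return 5.0 - 0.5 * row[0]

importances = permutation_importance(predict, rows, y)
print(importances)
```

Because the stand-in model ignores the second column, shuffling it leaves the predictions unchanged and its importance is zero, while shuffling the first column degrades R².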
This study can assist practitioners in making informed
choices regarding their security practices, particularly on
which practices have the most impact or whether specific
security practices have improved security outcomes. This study
makes the following contributions:
• A proposed model to identify important security practices
to enhance software security.
• A proposed model to understand how higher aggregate
security scores affect security outcomes.
• Evaluation of the proposed model and security practices
using open source datasets.
• The list of challenges impeding the model’s performance.
This paper is organized as follows: Section II introduces
the concepts used in our study, and Section III discusses
the steps taken by the software industry to improve software
supply chain security. Section IV describes the data collection
process, and Section V describes the data analysis to measure
model performance. We close with a discussion and limitations
of our findings (Section VI).
II. BACKGROUND
In this section, we discuss the “OpenSSF Scorecard” tool
and the two most frequently used terms in our study: “Security
Outcome” and “Package”.
A. OpenSSF Scorecard
The OpenSSF Scorecard [8] is an automated tool that runs
on source code hosted by GitHub to monitor the security
health of the packages. The Scorecard automatically computes
a normalized integer score from 0 to 10 for each of the 18
security practices. For each GitHub repository, an individual
score for each practice and an aggregate score for all practices
are returned by the tool. Each security practice is assigned
one of four risk levels: “Critical” (risk weight 10), “High”
(risk weight 7.5), “Medium” (risk weight 5), and “Low” (risk
weight 2.5). The aggregate score is an average of
the individual scores, weighted by risk.
Apart from scores between 0 and 10, the Scorecard tool
may also assign a score of -1 for a practice. A score of -1
indicates that Scorecard could not find conclusive evidence that
the practice is implemented, or that an internal error occurred
at runtime in Scorecard. Such an inconclusive outcome
is graded as -1 instead of 0: because a value of 0 would lower the
package’s aggregate score, Scorecard assigns a value of -1
to avoid the penalty of failing a check.
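The aggregation described above can be sketched as a risk-weighted average. The risk weights come from the text; the function name, the input layout, and the choice to drop checks scored -1 from both the numerator and the denominator are assumptions made for illustration, not a description of Scorecard’s actual code.

```python
# Risk weights as stated in the text.
RISK_WEIGHTS = {"Critical": 10.0, "High": 7.5, "Medium": 5.0, "Low": 2.5}

def aggregate_score(checks):
    """Risk-weighted average of per-practice scores.

    checks: list of (score, risk_level) pairs. A score of -1 marks an
    inconclusive check and is skipped entirely (an assumption here).
    Returns None when no check produced a conclusive score.
    """
    scored = [(s, RISK_WEIGHTS[level]) for s, level in checks if s >= 0]
    if not scored:
        return None
    total_weight = sum(w for _, w in scored)
    return sum(s * w for s, w in scored) / total_weight

# Hypothetical results for four checks of one repository.
checks = [(10, "Critical"), (5, "High"), (-1, "Medium"), (0, "Low")]
print(aggregate_score(checks))
```

With these hypothetical inputs, skipping the inconclusive check yields a higher aggregate than treating it as 0 would, which is the penalty the -1 convention is meant to avoid.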
B. Security Outcome
Security outcomes are indications of the failure
or achievement of security associated with the software
throughout the software’s life cycle. Counting vulnerabilities
is the most common means of measuring security in soft-
ware [9]–[11]. In this study, we used package vulnerability
counts as a security outcome.
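As a minimal sketch of how a per-package vulnerability count could be derived, the snippet below tallies advisory records by ecosystem and package. The record fields and identifiers are hypothetical and do not reflect the schema of any particular advisory database; deduplicating by advisory identifier is also an assumption.

```python
from collections import Counter

# Hypothetical advisory records; the field names are illustrative only.
advisories = [
    {"id": "ADV-0001", "ecosystem": "npm", "package": "pkg-a"},
    {"id": "ADV-0002", "ecosystem": "npm", "package": "pkg-a"},
    {"id": "ADV-0002", "ecosystem": "npm", "package": "pkg-a"},  # duplicate
    {"id": "ADV-0003", "ecosystem": "PyPI", "package": "pkg-b"},
]

def vulnerability_counts(records):
    """Count distinct advisories per (ecosystem, package)."""
    unique = {(r["ecosystem"], r["package"], r["id"]) for r in records}
    return Counter((eco, pkg) for eco, pkg, _ in unique)

counts = vulnerability_counts(advisories)
print(counts[("npm", "pkg-a")], counts[("PyPI", "pkg-b")])
```

Packages with no advisory simply do not appear in the tally, and a `Counter` returns 0 for them on lookup.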
C. Package
A Package Registry such as npm or PyPI stores packages,
the metadata associated with them, and the configurations
needed to install them, and keeps track of the versions of
packages [12]. The registry offers a way to upload packages
and APIs for the client to call packages. Anyone can download
and reuse the packages from the registry. A Package is a
reusable piece of software that can be called from a global
registry (npm, PyPI) as a dependency to build another pack-
age [12], [13]. Each package may or may not depend on other
packages. However, publishing a package in a registry allows
consumers to import and reuse these packages.
III. RELATED WORK
This section highlights prior related studies, different se-
curity guidelines, frameworks, and tools on package security
health and security practices. While we only studied the
Scorecard tool’s security metrics, we further discussed other
security frameworks and tools to encourage future research to
evaluate which of these recommended security practices help
to improve security outcomes. Many practices are
recommended for software security, but none of these guidelines
(Section III-B) demonstrates which practices improve
security or why we should adopt them. Practitioners will
comply with these guidelines if data-driven research studies
on security practice measurement and feedback loops validate
the improvement in security outcomes.
A. Research Studies
Zahan et al. [20] conducted a measurement study on the
OpenSSF Scorecard security practices in the context of the
npm and PyPI ecosystems to identify the ecosystem’s security
practices trends and gaps. The study analyzed Scorecard
scores for 767,389 npm packages and 191,158 PyPI packages.
The authors observed gaps in both ecosystems in practicing
Code-Review, Maintained, License, Branch-Protection, and
Security-Policy practices in the GitHub repository. In addition, the
rules specified by the Scorecard tool for the Dependency-Update-Tool,
Fuzzing, CII-Best-Practices, Signed-Releases, and Packaging
security checks were weakly adopted in npm and PyPI
due to Scorecard’s inherent reliance on other systems. The
study revealed that 13 Scorecard metrics could be mapped
back to the NIST SSDF framework’s [6] security practices.
Although this research showed a gap in both ecosystems to
adopt different security practices, the authors did not verify
whether adopting these practices would improve the npm and
PyPI ecosystems’ security outcomes.
On October 18, 2022, Sonatype released the 8th annual
State of the Software Supply Chain report [30]. As part
of project quality metrics identification, they built several
machine learning classification models on Scorecard security
practices to understand how well a model based on security
practices could correctly identify projects with known vul-
nerabilities. Their classification models were based on 12,786
Java projects, and they found Code-Review, Binary-Artifacts,
Pinned-Dependencies, and Branch-Protection to be important
practices in their classification model.
Our study focuses on understanding whether good secu-
rity practices improve security outcomes, whereas Sonatype
research built models to identify vulnerable projects using
Scorecard data. However, study [20] showed that not all
Scorecard metrics apply to npm and PyPI ecosystems. From
the Sonatype annual report, we could not verify whether all the
metrics applied to Java projects or whether the study removed
noisy metrics from the model’s independent variables.
B. Guidelines and Frameworks
The Software Component Verification Standard
(SCVS) [14] by OWASP, is a framework to develop a
common set of activities, controls, and security practices that
can help in identifying and reducing risk in a software supply
chain. There are 6 control families that contain 87 controls
for different aspects of security verification or processes.
The SCVS has three verification levels, where higher levels
include additional controls.
The Building Security In Maturity Model (BSIMM)
[15] is the result of a multiyear study of real-world software
security initiatives (SSIs). Each year, various firms in differ-
ent industry verticals use the BSIMM to manage their SSI
improvements because the BSIMM report provides a clear
picture of actual practices used by organizations across the
security landscape [16]. The BSIMM is organized as a set of
122 activities, grouped into 12 practices across four
domains.
The CNCF Technical Advisory Group (TAG) [17] pub-
lished a series of recommended best practices, tooling rec-
ommendations, and design considerations that can reduce the
likelihood and overall impact of a successful supply chain
attack. They discuss the five stages of software supply chain
security: securing code, materials, build pipelines, artifacts,
and deployments. The framework proposed 54 practices with
associated risk factors to provide a holistic, end-to-end guide
for organizations and teams.
In response to Section 4 of the President’s Execu-
tive Order (EO) on “Improving the Nation’s Cybersecurity
(14028)” [3], the U.S. National Institute of Standards and
Technology (NIST) updated the Secure Software Development
Framework (SSDF) [6]. The framework does not
introduce new practices or define new terminology. Instead,
the framework describes a set of high-level practices based
on established standards [15], [17]–[19] of secure software
development practices.
Supply Chain Levels for Software Artifacts (SLSA) [21]
is a security framework, established by industry consensus,
that comprises a set of standards and controls to prevent
tampering and enhance integrity. Its security rules may be
adopted incrementally. SLSA has four levels, with SLSA 4
representing the ideal end state. The lower levels represent
incremental milestones with corresponding incremental integrity
guarantees.
In August 2022, the Cybersecurity and Infrastructure Secu-
rity Agency (CISA) and the National Security Agency (NSA)
released a recommended practice guide on Securing the
Software Supply Chain for Developers [22]. The report
advocated specific frameworks like SLSA and SSDF. The
framework is a roadmap to building a healthy software devel-
opment environment that includes a Software Bill of Materials
(SBOM) for describing software components, active monitor-
ing for vulnerabilities and software supply chain attacks, and
a secure build environment and development team.
Microsoft released the Open Source Software (OSS)
Secure Supply Chain (SSC) Framework [23], which is
a security assurance and risk reduction process focused on
securing how developers consume open source software. The
framework provides security guidance and tools throughout
the developer’s inner-loop and outer-loop processes that have
played a critical role in defending and preventing supply chain
attacks through the consumption of open-source software
across Microsoft. In September 2022, the OpenSSF released
the npm Best Practices Guide [24] to help JavaScript and
TypeScript developers reduce the security risks associated with
using open-source dependencies.
C. Tools
Open Source Insights (OSI), or Deps.dev [25], is a Google-developed
and hosted tool that helps practitioners obtain
information about the source code location, package metadata,
licenses, releases, and vulnerabilities of open-source products.
OSI scans millions of open-source packages from different
ecosystems, constructs dependency graphs, and annotates the
metadata in a dashboard. Apart from the package’s metadata,
the OSI dashboard also shows statistics about a package’s
direct or transitive dependencies. On the OSI website, users
can view the vulnerability mapping of a package as well as
the vulnerability mapping with associated dependencies. In
addition to the package metadata, OSI has also incorporated
Scorecard security practices to help understand package secu-
rity practices.
in-toto is a framework that holistically enforces the integrity
of a software supply chain by gathering cryptographically
verifiable information about the chain itself [26], [27]. in-toto
grants the end user the ability to verify the software’s supply
chain from the project’s inception to its deployment. To
achieve this, in-toto requires a project owner to declare
and sign a layout of how the supply chain’s steps must
be carried out and by whom. The concerned parties will
document their actions and produce a cryptographically signed
declaration for each step as they are completed. The link