Oze Decentralized Graph-based Concurrency Control for Real-world Long Transactions on BoM Benchmark

2025-05-06 0 0 1.55MB 14 页 10玖币
侵权投诉
Oze: Decentralized Graph-based Concurrency Control for
Real-world Long Transactions on BoM Benchmark
Jun Nemoto
Scalar, Inc.
Tokyo, Japan
jun.nemoto@scalar-labs.com
Takashi Kambayashi
Nautilus Technologies, Inc.
Tokyo, Japan
kambayashi@nautilus-technologies.com
Takashi Hoshino
Cybozu Labs, Inc.
Tokyo, Japan
hoshino@labs.cybozu.co.jp
Hideyuki Kawashima
Keio University
Fujisawa, Japan
river@sfc.keio.ac.jp
ABSTRACT
In this paper, we propose Oze, a new concurrency control protocol
that handles heterogeneous workloads which include long-running
update transactions. Oze explores a large scheduling space using
a fully precise multi-version serialization graph to reduce false
positives. Oze manages the graph in a decentralized manner to
exploit many cores in modern servers. We also propose a new
OLTP benchmark, BoMB (Bill of Materials Benchmark), based on a
use case in an actual manufacturing company. BoMB consists of one
long-running update transaction and ve short transactions that
conict with each other. Experiments using BoMB show that Oze
keeps the abort rate of the long-running update transaction at zero
while reaching up to 1.7 Mtpm for short transactions with near-
linear scalability, whereas state-of-the-art protocols cannot commit
the long transaction or experience performance degradation in
short transaction throughput.
CCS CONCEPTS
Information systems Database transaction processing.
KEYWORDS
concurrency control, OLTP benchmark, long-running transaction
1 INTRODUCTION
Transaction processing has been used for applications and work-
loads in various industries. Concurrency control is the core of
transaction processing. Various concurrency control protocols have
been proposed to take advantage of the recent architectural evolu-
tion such as many-core and large memory capacity [
25
,
37
39
,
43
],
which have achieved high performance and scalability.
Existing concurrency control protocols do not assume a certain
type of heterogeneous workload in which a long-running update
transaction and multiple types of short transactions are mixed. Such
heterogeneous workloads exist, for example, in OLTP systems for
manufacturing industries. The system there runs a transaction that
builds up a tree structure based on an item master and an item
component master, referred to as a Bill of Materials, or BoM, and
calculates product costs and requirements [
11
]. This transaction
(referred to as the L1 transaction; see Section 2 for details) is a long-
running update transaction because it must read a large number of
Work done mostly while at Keio University.
0.990
0.995
1.000
0.000
0.200
0.400
0.600
0.800
20 30 40 50 60 70 80 90 100
Abortrate
#ofproducts
Oze(Proposal)
Silo
TicToc
MOCC
ERMIA
Cicada
Figure 1: Abort rate of long-running update transaction (L1)
in BoMB workload.
records and write the results. In addition to L1, the system runs a
short transaction (called the S1 transaction) that updates the raw
material cost referred to in the calculation of the product cost, and
a short transaction (called the S2 transaction) that uses the product
cost in other applications. Handling these concurrent transactions
that interfere with each other remains challenging.
Because existing concurrency control protocols are not designed
for such heterogeneous workloads, it is common for companies to
process long transactions at night when online short transactions
do not occur [
4
]. However, this workaround is sometimes infeasible.
The freshness and accuracy of the product costs, which are kept
by the long transactions, are essential since the product costs are
used as input for optimal production planning in manufacturing
resource planning (MRP) [
40
], especially budgeting and demand
planning. Generally, an accuracy of 98% in BoM composition and
95% in inventory is required to obtain accurate results in MRP be-
cause input errors accumulate from pile-up calculations [
11
,
33
].
Meanwhile, the cost of raw materials and the components of items,
which are the basis of product costing, can frequently change due
to supply chain disruptions caused by disasters and infectious dis-
eases
1
. Therefore, there is a need for on-demand product costing
not at night but during the day when online short transactions
occur [8, 17].
1
Supply chain disruptions and price uctuations have occurred due to the extensive
damage caused by the Great East Japan Earthquake and the resulting nuclear power
plant accident, as well as lockdowns to prevent the spread of COVID-19 [1, 27, 28].
arXiv:2210.04179v1 [cs.DB] 9 Oct 2022
Nemoto, et al.
What happens if modern concurrency control protocols try to
handle such heterogeneous workloads? The abort rate of the L1
transaction is shown in Figure 1 when the L1, S1, and S2 transactions
run concurrently using state-of-the-art protocols. The horizontal
axis is the number of products to be costed in one transaction,
which corresponds to the length of the L1 transaction. As shown
in the gure, none of the existing protocols can commit the long
transaction, or if they can, the success rate is less than one percent.
OCC protocols such as Silo [
37
] and TicToc [
43
] abort the L1 trans-
action because the S1 transaction updates the cost of raw materials
before L1 is completed. MOCC [
39
], which combines OCC and the
advantages of a lock-based scheme, rarely avoids aborts even with
pessimistic behavior. ERMIA [
38
] and Cicada [
25
] cannot commit
any L1 transactions because they are interfering with concurrent
S2 transactions. The details are described in Section 2.5.
Existing lock-based [
18
] and deterministic [
15
,
36
] approaches
can handle this workload under a certain condition where BoMs
do not change before and after the L1 transaction, as shown in
Figure 8. We call such a xed BoM a static BoM. Even though
these approaches can commit the L1, they suer from performance
degradation of the S1 that must wait for the L1. More importantly,
our target BoMs must often be updated by another transaction
which changes the composition of a product to dynamically re-
spond supply chain disruption. We call such a BoM a dynamic BoM.
Deterministic approaches cannot commit the L1 anymore when han-
dling a dynamic BoM. Even if they use reconnaissance queries [
36
]
to know BoM trees in advance, they cannot guarantee the trees
are not changed without additional application-level assistance,
e.g., stop transactions that modify BoMs. Such an application-level
workaround is not the direction of our goal.
In this paper, we rst propose Oze, a concurrency control proto-
col that can handle heterogeneous workloads which include long
and short transactions. Oze generates serializable schedules using
a multi-version serialization graph (MVSG) [
3
]. MVSGT [
20
] and
MVSGA [
19
] are conventional graph-based approaches that gener-
ate serializable schedules in large scheduling spaces: multi-version
conict serializability (MCSR) and multi-version view serializability
(MVSR). However, their protocols are rather theoretical and there
are no available implementations, to the best of our knowledge.
In addition, they assume centralized graph management, which
cannot benet from many cores and achieve high scalability. We
present a decentralized implementation of Oze that can take advan-
tage of many-core environments by using a logically single graph
on each record.
Second, we present a new benchmark, BoMB (Bill of Materials
Benchmark), which reproduces the heterogeneous workload de-
scribed above. TPC-C [
12
] and TPC-E [
13
] are widely used as de
facto standard benchmarks for OLTP systems; however, neither
includes long-running update transactions. In contrast, BoMB’s
target application is a cost management system for manufacturing,
which consists of six transactions: calculating product costs (L1),
updating raw material costs that are used by L1 (S1), posting jour-
nal vouchers based on the calculated product cost (S2), changing a
product (S3), changing a raw material of a product (S4), changing a
product quantity (S5). The transactions of BoMB are designed and
implemented on CCBench [
34
], which is a benchmarking platform
!"#$%&'(')*+*($
!"#$"%&'()*+,"+%-#.)/'+'0"1"+%
/2!
3&40"%5+0
,-
.-
67/
8"1'+4)!('++5+0
27
3-/
9:
/7
;'<("=;#'+='>%5-+= *+$&%=
?@A)7'(>&('%")$#-4&>%)>-=%)&=5+0)3-/)'+4)&$4'%")>-=%
6@A)B$4'%")#'C)1'%"#5'()>-=%)D)6EA)7F'+0")#'C)1'%"#5'()-G)$#-4&>%
6HA)*==&")I-&#+'(),-&>F"#))))))D)6JA)7F'+0")$#-4&>%)K&'+%5%.
6LA)7F'+0")$#-4&>%
MA)G'>%-#.)))*A)5%"1)))!A)$#-4&>%=
/7A)1'%"#5'(N>-=%)))27A)#"=&(%N>-=%)))9:A)I-&#+'(N,-&>F"#
2"'4
O#5%"P*+="#%P8"("%"
6>'+
!
M * ,/
,0
,1
,2
Figure 2: System and workload overview.
for concurrency control protocols for in-memory database systems,
to allow fair comparison and evaluation between various protocols.
Third, we evaluate Oze with modern concurrency control pro-
tocols using BoMB. Experimental results show that Oze keeps the
abort rate of the long-running update transactions at zero while
reaching up to 1.7 Mtpm for short transactions with near-linear
scalability, whereas state-of-the-art protocols cannot commit the
long transaction or experience performance degradation in the
throughput of short transactions.
The rest of this paper is organized as follows. First, in Section 2,
we dene the workload and its database in BoMB. Next, we describe
the design of the Oze protocol in Section 3 and the implementation
of Oze for multi-core systems in Section 4. In Section 5, we evaluate
several protocols using BoMB. In Section 6, we describe related
work. Finally, we conclude this paper in Section 7.
2 BOM BENCHMARK
This section describes an overview, database schema, and workload
of BoM Benchmark (BoMB). We also show how and why existing
protocols cannot eectively handle the BoMB workload.
2.1 Background and Overview
We use a food manufacturing company that produces bread na-
tionwide as our reference when designing the BoMB workload. In
the target food manufacturing industry, the supply chain can be
disrupted by various reasons such as climate change, and the prices
of raw materials often change as well. Therefore, it is necessary
to accurately gauge manufacturing costs and schedule an optimal
production plan and supply chain. In order to reect such needs,
the BoMB workload is congured assuming a system capable of on-
demand inventory control, cost control, and production planning.
Figure 2 shows an overview of the target system and workload. We
assume that the system consists of an MRP system that manages
products and resources needed for manufacturing and a perpetual
inventory management system that continuously manages the in-
ventory. The MRP consists of cost management, budgeting, demand
Oze: Decentralized Graph-based Concurrency Control for Real-world Long Transactions on BoM Benchmark
Table 1: BoMB parameters.
Parameters Default Description
factories 8Number of factories
product-types 72,000 Number of product types
material-types 198,000 Number of material types
raw-material-
types 75,000 Number of
raw material types
material-trees-
per-product 5Number of material trees
per product
material-tree-size 10 Number of materials
in a material tree
raw-materials-
per-leaf 3Number of raw materials
in a leaf material
target-products 100 Number of products
manufactured in a factory
target-materials 1Number of raw materials
for update
planning, and supply chain management (SCM) modules, and each
module accesses one database. The cost management generates
the most complicated workload among these modules due to the
long-running update transactions. Thus, for BoMB, we focus on
emulating the workload of the cost management module.
The BoMB workload has six transactions that are directly related
to product costing and its input/output: L1 and S1–S5 transactions.
L and S stand for "long" and "short," respectively. All of these trans-
actions generally occur in manufacturing industries [
29
,
41
] and
can be widely applied other than in bakeries.
2.2 Tables and Parameters
BoMB uses seven tables shown below. The underlined attribute is
the primary key. Note that
INT16
,
INT32
, and
INT64
are integers
of 16, 32, and 64 bits, respectively. Adjustable parameters for the
BoMB are shown in Table 1. The default values are set on the basis
of the actual values of the referenced bread manufacturer; these
would change depending on the industry.
factory(id INT32, name VARCHAR)
: The assuming company
operates multiple factories, and the
factory
table manages a list
of those factories. The number of factories is set by the parameter
factories.
item(id INT32, name VARCHAR, type INT16)
: The
item
table
manages the name of items with their type: product, material, or
raw material. The
item
table stores the total records of products
(
product-types
), materials (
material-types
), and raw materials
(raw-material-types).
product(factory_id INT32, item_id INT32, quantity
DOUBLE)
: The
product
table manages the manufactured products
and their quantity in each factory. When performing cost account-
ing, it is used to obtain the products currently in production at the
factory.
bom(parent_item_id INT32, child_item_id INT32, quan-
tity DOUBLE)
: The
bom
table manages a list of (intermediate and
raw) materials and the quantities of each needed to manufacture a
product. Specically, it stores the parent item ID, child item ID, and
!"#$%&"'
()'*+%,!&-.
/"0,1"#$%&"'
()'*+%,2.
/"0,1"#$%&"'
()'*+%,3.
/"0,1"#$%&"'
()'*+%,4.
!"#$%"&'!(")*$+'!$%"&'!(")
!"#$%&"'
(3$"5#,!&-.
!"#$%&"'
(6*+78.
9%*:+;#
(<"=:0&;8.
!"#$%&"'
/**#,1"#$%&"'
(>%$":.
/**#,1"#$%&"'
(?+=",<"'":. @@@
!,,&$%"&'!(")*$+'!$+!,-./&
%"&'!(")$&!''$*(0'
Figure 3: Example of BoM tree.
quantity in each record and hierarchically represents BoM trees.
Details of the structure of BoM trees and product costing using this
table are described in Section 2.3.
material-cost(factory_id INT32, item_id INT32,
stock_quantity DOUBLE, stock_amount DOUBLE)
: The
materi
al-cost
table manages the cost of raw materials, the stock quantity,
and the amount of the raw materials for each factory and item.
result-cost(factory_id INT32, item_id INT32, cost
DOUBLE)
: The
result-cost
table contains the latest cost calcu-
lation results for each product in each factory.
journal-voucher(voucher_id INT64, date DATE, debit
INT32, credit INT32, amount DOUBLE, description VARCHAR)
:
Because the cost calculation result is used for each module, e.g.,
budgeting, demand planning, and SCM, it is created as a journal
voucher as needed and stored in this table.
2.3 BoM Tree
The
bom
table is a list of intermediate and raw materials and the
quantities required to make a particular product and their quantities.
It can be logically expressed in a tree structure. An example of a
BoM tree is shown in Figure 3. The product consists of several major
materials (hereafter, "root materials" for convenience). For example,
in the production of sandwiches, the major materials correspond
to bread and the ingredients inside (e.g., tuna salad). Each root
material is made from multiple materials. In the example of bread,
the material is dough, and the raw materials are our, yeast, etc.
To support BoMs in other manufacturing industries [
29
,
41
] such
as in aircraft and robots industries, we introduce parameters for
BoM trees: the number of root materials, the number of materials
that make up each root material tree (
material_tree_size
), and
the number of raw materials in each leaf material (
raw-materials
-per-leaf).
When starting the benchmark, the BoM trees are initialized as
follows. (1) Select a set of materials with size
material_tree_size
.
(2) Select the root material from them. (3) Add the remaining mate-
rials as child nodes to random tree nodes. (4) Add
raw-materials-
per-leaf
raw materials to each leaf of the tree. Raw materials are
randomly selected from
raw-materials-types
. (5) After generat-
ing all trees until
materials-types
materials are exhausted, assign
material-trees-per-product
trees to each product. Though the
摘要:

Oze:DecentralizedGraph-basedConcurrencyControlforReal-worldLongTransactionsonBoMBenchmarkJunNemoto∗Scalar,Inc.Tokyo,Japanjun.nemoto@scalar-labs.comTakashiKambayashiNautilusTechnologies,Inc.Tokyo,Japankambayashi@nautilus-technologies.comTakashiHoshinoCybozuLabs,Inc.Tokyo,Japanhoshino@labs.cybozu.co.j...

展开>> 收起<<
Oze Decentralized Graph-based Concurrency Control for Real-world Long Transactions on BoM Benchmark.pdf

共14页,预览3页

还剩页未读, 继续阅读

声明:本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知玖贝云文库,我们立即给予删除!
分类:图书资源 价格:10玖币 属性:14 页 大小:1.55MB 格式:PDF 时间:2025-05-06

开通VIP享超值会员特权

  • 多端同步记录
  • 高速下载文档
  • 免费文档工具
  • 分享文档赚钱
  • 每日登录抽奖
  • 优质衍生服务
/ 14
客服
关注