Calibration A Simple Trick for Wide-table Delta Analytics_2

2025-04-27 0 0 6.47MB 22 页 10玖币
侵权投诉
Calibration: A Simple Trick for Wide-table Delta Analytics
Zezhou Huang
zh2408@columbia.edu
Columbia University
Eugene Wu
ewu@cs.columbia.edu
DSI, Columbia University
ABSTRACT
Data analytics over normalized databases typically requires com-
puting and materializing expensive joins (wide-tables). Factorized
query execution models execution as message passing between
relations in the join graph and pushes aggregations through joins
to reduce intermediate result sizes. Although this accelerates query
execution, it only optimizes a single wide-table query. In contrast,
wide-table analytics is usually interactive and users want to apply
delta to the initial query structure. For instance, users want to slice,
dice and drill-down dimensions, update part of the tables and join
with new tables for enrichment. Such Wide-table Delta Analytics
oers novel work-sharing opportunities.
This work shows that carefully materializing messages during
query execution can accelerate Wide-table Delta Analytics by
>
10
5×
as compared to factorized execution, and only incurs a constant
factor overhead. The key challenge is that messages are sensitive to
the message passing ordering. To address this challenge, we borrow
the concept of calibration in probabilistic graphical models to ma-
terialize sucient messages to support any ordering. We manifest
these ideas in the novel Calibrated Junction Hypertree (CJT) data
structure, which is fast to build, aggressively re-uses messages to
accelerate future queries, and is incrementally maintainable under
updates. We further show how
CJT
s benet applications such as
OLAP, query explanation, streaming data, and data augmentation
for ML. Our experiments evaluate three versions of the CJT that run
in a single-threaded custom engine, on cloud DBs, and in Pandas,
and show 30
×
– 10
5×
improvements over state-of-the-art factorized
execution algorithms on the above applications.
ACM Reference Format:
Zezhou Huang and Eugene Wu. 2022. Calibration: A Simple Trick for Wide-
table Delta Analytics. In Proceedings of ACM Conference (Conference’17).
ACM, New York, NY, USA, 22 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn
1 INTRODUCTION
Schema normalization is a foundational concept in databases, and
is used to minimize redundancy, potential data inconsistencies, and
storage costs. It is widely used in practice and taught in nearly
every database course. Unfortunately, normalized schemas present
a number of usability challenges in modern data analytics. First,
analyses often access data from disparate tables that necessitate
joining across the normalized schema (called join graph). These
massive joins are dicult to optimize [
8
,
55
,
65
], expensive to mate-
rialize, and dominate analytics costs. Second, joins are notoriously
confusing to students and programmers [18, 32, 47, 62].
As such, there is increasing advocacy for a wide-table abstrac-
tion [
16
,
64
,
64
], where users directly perform analytics over a fully
denormalized schema. Unfortunately, materializing a denormalized
Conference’17, July 2017, Washington, DC, USA
2022. ACM ISBN 978-x-xxxx-xxxx-x/YY/MM. . . $15.00
https://doi.org/10.1145/nnnnnnn.nnnnnnn
A
1
1
B
1
2
cnt
2
3
Original Annotated
Relations
R
A
1
1
C
1
2
cnt
3
5
S
A
1
1
D
1
2
cnt
4
2
T
Hypergraph
A
B C
DT
SR
Junction Hypertree
A B A C A D
Absorption
A B A C A D
Join Result
A
1
1
B
1
1
C
1
1
D
1
2
cnt
2×3×4=24
2×3×2=12
1
1
1
1
2
2
1
2
2×5×4=40
2×5×2=20
1
1
2
2
1
1
1
2
3×3×4=36
3×3×2=18
1
1
2
2
2
2
1
2
3×5×4=60
3×5×2=30
Marg Result
cnt
24+12+40+20+36
+18+60+30=240
A
1
cnt
5
A
1
cnt
40
Message Passing
A B A C A D
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A
1
1
D
1
2
cnt
4×40=160
2×40=80
Junction Hypertree As
Data Structure
A B A C A D
R S T
A
1
cnt
5
A
1
cnt
40
A
1
cnt
48
A
1
cnt
6
Downward Passing
A B A C A D
A
1
cnt
2+3=5
A
1
cnt
5×3+5×3=40
A
1
cnt
2+4=6
A
1
cnt
6×3+6×5=48
Cycle Hypergraph
A
B C
T
SRA B C
R S T
A B A B C B C
R S T
A C
A
1
1
B
1
2
cnt
3
3
R'
A
1
B
1
cnt
1
ΔR
A B A C A D
A
1
cnt
1
A
1
cnt
1×3+1×5=8
A
1
1
D
1
2
cnt
8×4=32
8×2=16
Factorised Incremental View Maintenance
D
1
2
E
1
2
cnt
7
1
UA B A C A D
D
1
E
1
cnt
7×160=1120
D E
D
1
cnt
160
A B C
A B AC BC
AB C
All
A B C
R S
A B AC BC
AC B
A B
R1, R2
A C
R3
A D
R4, R5
A B
R1, R2
A C
R3
C D
R6
A B
R1, R2
A C
R3
A D
R7
A B A C A D
A B A C A D
A B A C A D
A
B C
T (100)
S (100)
R (100)
DU (100)
A D
U
100
A B C
R S T
1000
Fractional Edge Cover: 10000
Hypertree Decomposition: 1000
Consider query COUNT(*) GROUP BY D,
Fractional Sub-Hypertree Width: 100
A D
U
100
A B C
R S T
1000
A
B C
S (100)R (1000)
A C
S
100
A B
R
1000
A C
S
100
A B
R
1000
A B
R
A C
S
B C
D E
D E
B F
B F
B F D E
A
1
1
B
1
2
p
0.1
0.3
P(A, B)
2
2
1
2
0.2
0.4
B
1
1
C
1
2
p
0.3
0.7
P(C|B)
2
2
1
2
0.2
0.8
P(A, B, C)
A
1
1
B
1
1
C
1
2
p
0.1×0.3=0.03
1
1
2
2
1
2
2
2
1
1
1
2
2
2
2
2
1
2
0.1×0.7=0.07
0.3×0.2=0.06
0.3×0.8=0.24
0.2×0.3=0.06
0.2×0.7=0.14
0.4×0.2=0.08
0.4×0.9=0.32
B
1
2
p
0.23
0.77
P(C)
A B B C
B
1
2
p
0.3
0.7
P(B)
B
1
1
C
1
2
P(B, C) = P(B)×P(C|B)
2
2
1
2
A B B C
p
0.3×0.3=0.09
0.7×0.3=0.21
0.2×0.7=0.14
0.8×0.7=0.56
A B
R, S
A B C
T
A B
R
A B C
S, T
D
D
1
cnt
160
2 80
A B
σA C A D
B C C D D E
D E
E F
B F
A B
Q2
A B
σA C A D
B C A C C D
D E
D E
B F
B F
Q2
B C
σC D D E E FA B
B C C D
σD E E FA B
B C C D D E E FA B
B C
σC D D E E FA B
B C C D D E E FA B
B C
σC D D E E FA B A C A DA B
A C
σA DA B
A C A DA B
A C
σA DA B
B C C D D EA B
B C
σC D D EA B
B C C D
σD EA B
A B A C A D
B
1
1
D
1
2
cnt
4×6=24
2×6=12
A
1
1
B
1
2
cnt
2
3
A
1
1
B
1
2
cnt
2×3=6
3×3=9
A
1
1
2
2
1
2
4×9=36
2×9=18
1
1
Store
Sales
Items
Stores
Time
Stores
Cust Time
B C A C C D D EB F
2 80 2 2 1×80=80
Store
Sales
Items
Stores
Cust Time
B C
A B B G B D D E E F
Cast
Info Movie Movie
Comp
Comp
Movie
Info
Info
Type
Aug
Comp
Person
Person
Info
Aug
Person
Movie
Key
Key
Type
Title
B C A C C D D E
B F
B C A C C D D EB F
with compensating
annotations
A B
R
A C
S
B C
A B B G B D D E E F
DE is the
augmentation
relation
Trans Sales ItemsStores
Dates
Aug
Stores
Aug
Dates
Aug
Items
A B A C A D
A
1
1
D
1
2
cnt
4×40=160
2×40=80
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A B A C A D
A
1
1
B
1
2
cnt
2
3
R
A
1
1
C
1
2
cnt
3
5
S
A
1
1
D
1
2
cnt
4
2
T
A
1
1
B
1
2
cnt
2
3
B
1
1
C
1
2
cnt
2×3=6
2×5=10
A
1
1
2
2
1
2
3×3=9
3×5=15
1
1
(a) Relations.
A
1
1
B
1
2
cnt
2
3
Original Annotated
Relations
R
A
1
1
C
1
2
cnt
3
5
S
A
1
1
D
1
2
cnt
4
2
T
Hypergraph
A
B C
DT
SR
Junction Hypertree
A B A C A D
Absorption
A B A C A D
Join Result
A
1
1
B
1
1
C
1
1
D
1
2
cnt
2×3×4=24
2×3×2=12
1
1
1
1
2
2
1
2
2×5×4=40
2×5×2=20
1
1
2
2
1
1
1
2
3×3×4=36
3×3×2=18
1
1
2
2
2
2
1
2
3×5×4=60
3×5×2=30
Marg Result
cnt
24+12+40+20+36
+18+60+30=240
A
1
cnt
5
A
1
cnt
40
Message Passing
A B A C A D
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A
1
1
D
1
2
cnt
4×40=160
2×40=80
Junction Hypertree As
Data Structure
A B A C A D
R S T
A
1
cnt
5
A
1
cnt
40
A
1
cnt
48
A
1
cnt
6
Downward Passing
A B A C A D
A
1
cnt
2+3=5
A
1
cnt
5×3+5×3=40
A
1
cnt
2+4=6
A
1
cnt
6×3+6×5=48
Cycle Hypergraph
A
B C
T
SRA B C
R S T
A B A B C B C
R S T
A C
A
1
1
B
1
2
cnt
1
3
R
A
1
B
1
cnt
-1
ΔR
A B A C A D
A
1
cnt
-1
A
1
cnt
-1×3+-1×5=-8
A
1
1
D
1
2
cnt
4×40=160
2×40=80
Factorised Incremental View Maintenance
D
1
1
E
1
2
cnt
7
1
UA B A C A D
A
1
cnt
40
A
1
D
1
cnt
4×40×8=1280
Augmentation
D E
D
1
cnt
1+7=8
A B C
A B AC BC
AB C
All
A B C
R S
A B AC BC
AC B
A B
R1, R2
A C
R3
A D
R4, R5
A B
R1, R2
A C
R3
C D
R6
A B
R1, R2
A C
R3
A D
R7
A B A C A D
A B A C A D
A B A C A D
A
B C
T (100)
S (100)
R (100)
DU (100)
A D
U
100
A B C
R S T
1000
Fractional Edge Cover: 10000
Hypertree Decomposition: 1000
Consider query COUNT(*) GROUP BY D,
Fractional Sub-Hypertree Width: 100
A D
U
100
A B C
R S T
1000
A
B C
S (100)R (1000)
A C
S
100
A B
R
1000
A C
S
100
A B
R
1000
A B
R
A C
S
B C
D E
D E
B F
B F
B F D E
(b) Join graph.
A
1
1
B
1
2
cnt
2
3
Original Annotated
Relations
R
A
1
1
C
1
2
cnt
3
5
S
A
1
1
D
1
2
cnt
4
2
T
Hypergraph
A
B C
DT
SR
Junction Hypertree
A B A C A D
Absorption
A B A C A D
Join Result
A
1
1
B
1
1
C
1
1
D
1
2
cnt
2×3×4=24
2×3×2=12
1
1
1
1
2
2
1
2
2×5×4=40
2×5×2=20
1
1
2
2
1
1
1
2
3×3×4=36
3×3×2=18
1
1
2
2
2
2
1
2
3×5×4=60
3×5×2=30
Marg Result
cnt
24+12+40+20+36
+18+60+30=240
A
1
cnt
5
A
1
cnt
40
Message Passing
A B A C A D
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A
1
1
D
1
2
cnt
4×40=160
2×40=80
Junction Hypertree As
Data Structure
A B A C A D
R S T
A
1
cnt
5
A
1
cnt
40
A
1
cnt
48
A
1
cnt
6
Downward Passing
A B A C A D
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A
1
cnt
2+4=6
A
1
cnt
6×3+6×5=48
Cycle Hypergraph
A
B C
T
SRA B C
R S T
A B A B C B C
R S T
A C
A
1
1
B
1
2
cnt
3
3
R'
A
1
B
1
cnt
1
ΔR
A B A C A D
A
1
cnt
1
A
1
cnt
1×3+1×5=8
A
1
1
D
1
2
cnt
8×4=32
8×2=16
Factorised Incremental View Maintenance
D
1
2
E
1
2
cnt
7
1
UA B A C A D
D
1
E
1
cnt
7×160=1120
D E
D
1
cnt
160
A B C
A B AC BC
AB C
All
A B C
R S
A B AC BC
AC B
A B
R1, R2
A C
R3
A D
R4, R5
A B
R1, R2
A C
R3
C D
R6
A B
R1, R2
A C
R3
A D
R7
A B A C A D
A B A C A D
A B A C A D
A
B C
T (100)
S (100)
R (100)
DU (100)
A D
U
100
A B C
R S T
1000
Fractional Edge Cover: 10000
Hypertree Decomposition: 1000
Consider query COUNT(*) GROUP BY D,
Fractional Sub-Hypertree Width: 100
A D
U
100
A B C
R S T
1000
A
B C
S (100)R (1000)
A C
S
100
A B
R
1000
A C
S
100
A B
R
1000
A B
R
A C
S
B C
D E
D E
B F
B F
B F D E
A
1
1
B
1
2
p
0.1
0.3
P(A, B)
2
2
1
2
0.2
0.4
B
1
1
C
1
2
p
0.3
0.7
P(C|B)
2
2
1
2
0.2
0.8
P(A, B, C)
A
1
1
B
1
1
C
1
2
p
0.1×0.3=0.03
1
1
2
2
1
2
2
2
1
1
1
2
2
2
2
2
1
2
0.1×0.7=0.07
0.3×0.2=0.06
0.3×0.8=0.24
0.2×0.3=0.06
0.2×0.7=0.14
0.4×0.2=0.08
0.4×0.9=0.32
B
1
2
p
0.23
0.77
P(C)
A B B C
B
1
2
p
0.3
0.7
P(B)
B
1
1
C
1
2
P(B, C) = P(B)×P(C|B)
2
2
1
2
A B B C
p
0.3×0.3=0.09
0.7×0.3=0.21
0.2×0.7=0.14
0.8×0.7=0.56
A B
R, S
A B C
T
A B
R
A B C
S, T
D
D
1
cnt
160
2 80
A B
σA C A D
B C C D D E
D E
E F
B F
A B
Q2
A B
σA C A D
B C A C C D
D E
D E
B F
B F
Q2
B C
σC D D E E FA B
B C C D
σD E E FA B
B C C D D E E FA B
B C
σC D D E E FA B
B C C D D E E FA B
B C
σC D D E E FA B A C A DA B
A C
σA DA B
A C A DA B
A C
σA DA B
B C C D D EA B
B C
σC D D EA B
B C C D
σD EA B
A B A C A D
B
1
1
D
1
2
cnt
4×6=24
2×6=12
A
1
1
B
1
2
cnt
2
3
A
1
1
B
1
2
cnt
2×3=6
3×3=9
A
1
1
2
2
1
2
4×9=36
2×9=18
1
1
Store
Sales
Items
Stores
Time
Stores
Cust Time
B C A C C D D EB F
2 80 2 2 1×80=80
Store
Sales
Items
Stores
Cust Time
B C
A B B G B D D E E F
Cast
Info Movie Movie
Comp
Comp
Movie
Info
Info
Type
Aug
Comp
Person
Person
Info
Aug
Person
Movie
Key
Key
Type
Title
B C A C C D D E
B F
B C A C C D D EB F
with compensating
annotations
A B
R
A C
S
B C
A B B G B D D E E F
DE is the
augmentation
relation
Trans Sales ItemsStores
Dates
Aug
Stores
Aug
Dates
Aug
Items
A B A C A D
A
1
1
D
1
2
cnt
4×40=160
2×40=80
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A B A C A D
A
1
1
B
1
2
cnt
2
3
R
A
1
1
C
1
2
cnt
3
5
S
A
1
1
D
1
2
cnt
4
2
T
A
1
1
B
1
2
cnt
2
3
B
1
1
C
1
2
cnt
2×3=6
2×5=10
A
1
1
2
2
1
2
3×3=9
3×5=15
1
1
(c) Naive query execution. The full join result is in green.
A
1
1
B
1
2
cnt
2
3
Original Annotated
Relations
R
A
1
1
C
1
2
cnt
3
5
S
A
1
1
D
1
2
cnt
4
2
T
Hypergraph
A
B C
DT
SR
Junction Hypertree
A B A C A D
Absorption
A B A C A D
Join Result
A
1
1
B
1
1
C
1
1
D
1
2
cnt
2×3×4=24
2×3×2=12
1
1
1
1
2
2
1
2
2×5×4=40
2×5×2=20
1
1
2
2
1
1
1
2
3×3×4=36
3×3×2=18
1
1
2
2
2
2
1
2
3×5×4=60
3×5×2=30
Marg Result
cnt
24+12+40+20+36
+18+60+30=240
A
1
cnt
5
A
1
cnt
40
Message Passing
A B A C A D
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A
1
1
D
1
2
cnt
4×40=160
2×40=80
Junction Hypertree As
Data Structure
A B A C A D
R S T
A
1
cnt
5
A
1
cnt
40
A
1
cnt
48
A
1
cnt
6
Downward Passing
A B A C A D
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A
1
cnt
2+4=6
A
1
cnt
6×3+6×5=48
Cycle Hypergraph
A
B C
T
SRA B C
R S T
A B A B C B C
R S T
A C
A
1
1
B
1
2
cnt
3
3
R'
A
1
B
1
cnt
1
ΔR
A B A C A D
A
1
cnt
1
A
1
cnt
1×3+1×5=8
A
1
1
D
1
2
cnt
8×4=32
8×2=16
Factorised Incremental View Maintenance
D
1
2
E
1
2
cnt
7
1
UA B A C A D
D
1
E
1
cnt
7×160=1120
D E
D
1
cnt
160
A B C
A B AC BC
AB C
All
A B C
R S
A B AC BC
AC B
A B
R1, R2
A C
R3
A D
R4, R5
A B
R1, R2
A C
R3
C D
R6
A B
R1, R2
A C
R3
A D
R7
A B A C A D
A B A C A D
A B A C A D
A
B C
T (100)
S (100)
R (100)
DU (100)
A D
U
100
A B C
R S T
1000
Fractional Edge Cover: 10000
Hypertree Decomposition: 1000
Consider query COUNT(*) GROUP BY D,
Fractional Sub-Hypertree Width: 100
A D
U
100
A B C
R S T
1000
A
B C
S (100)R (1000)
A C
S
100
A B
R
1000
A C
S
100
A B
R
1000
A B
R
A C
S
B C
D E
D E
B F
B F
B F D E
A
1
1
B
1
2
p
0.1
0.3
P(A, B)
2
2
1
2
0.2
0.4
B
1
1
C
1
2
p
0.3
0.7
P(C|B)
2
2
1
2
0.2
0.8
P(A, B, C)
A
1
1
B
1
1
C
1
2
p
0.1×0.3=0.03
1
1
2
2
1
2
2
2
1
1
1
2
2
2
2
2
1
2
0.1×0.7=0.07
0.3×0.2=0.06
0.3×0.8=0.24
0.2×0.3=0.06
0.2×0.7=0.14
0.4×0.2=0.08
0.4×0.9=0.32
B
1
2
p
0.23
0.77
P(C)
A B B C
B
1
2
p
0.3
0.7
P(B)
B
1
1
C
1
2
P(B, C) = P(B)×P(C|B)
2
2
1
2
A B B C
p
0.3×0.3=0.09
0.7×0.3=0.21
0.2×0.7=0.14
0.8×0.7=0.56
A B
R, S
A B C
T
A B
R
A B C
S, T
D
D
1
cnt
160
2 80
A B
σA C A D
B C C D D E
D E
E F
B F
A B
Q2
A B
σA C A D
B C A C C D
D E
D E
B F
B F
Q2
B C
σC D D E E FA B
B C C D
σD E E FA B
B C C D D E E FA B
B C
σC D D E E FA B
B C C D D E E FA B
B C
σC D D E E FA B A C A DA B
A C
σA DA B
A C A DA B
A C
σA DA B
B C C D D EA B
B C
σC D D EA B
B C C D
σD EA B
A B A C A D
B
1
1
D
1
2
cnt
4×6=24
2×6=12
A
1
1
B
1
2
cnt
2
3
A
1
1
B
1
2
cnt
2×3=6
3×3=9
A
1
1
2
2
1
2
4×9=36
2×9=18
1
1
Store
Sales
Items
Stores
Time
Stores
Cust Time
B C A C C D D EB F
2 80 2 2 1×80=80
Store
Sales
Items
Stores
Cust Time
B C
A B B G B D D E E F
Cast
Info Movie Movie
Comp
Comp
Movie
Info
Info
Type
Aug
Comp
Person
Person
Info
Aug
Person
Movie
Key
Key
Type
Title
B C A C C D D E
B F
B C A C C D D EB F
with compensating
annotations
A B
R
A C
S
B C
A B B G B D D E E F
DE is the
augmentation
relation
Trans Sales ItemsStores
Dates
Aug
Stores
Aug
Dates
Aug
Items
A B A C A D
A
1
1
D
1
2
cnt
4×40=160
2×40=80
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A B A C A D
A
1
1
B
1
2
cnt
2
3
R
A
1
1
C
1
2
cnt
3
5
S
A
1
1
D
1
2
cnt
4
2
T
A
1
1
B
1
2
cnt
2
3
B
1
1
C
1
2
cnt
2×3=6
2×5=10
A
1
1
2
2
1
2
3×3=9
3×5=15
1
1
(d) Upward message passing. The absorption result is in green.
Figure 1: Example database with three relations, its join
graph (also JT), naive query execution for total count, and
factorized query execution by upward message passing.
schema incurs exponential space overhead
O
(
n×fr
), where
f
is the
fanout along edges in a join graph with r relations each of size n.
To address this challenge, factorized query execution [
5
,
77
]
accelerates queries over a large join graph using early marginaliza-
tion. In the spirit of projection pushdown, early marginalization
pushes down aggregation through the joins to reduce intermediate
result sizes. Abo et al. [
6
] established the equivalence between early
marginalization and message passing in Probabilistic Graphical
Model [
49
] (PGM). Factorized query execution can then be mod-
eled as passing messages between relations in the join graph. The
messages are of size
O
(
n
), so the space overhead (for acyclic join)
is only linear: O(rn).
Example 1. Figure 1(a,b) list example relations (duplicates are
tracked with a
cnt
“annotation”) and the join graph, respectively.
Figure 1c naively computes the total count over the full join result
(wide-table) using message passing, where each message is the inter-
mediate result so far. For instance,
AB
sends itself to
AC
, which sends
the join result to
AD
, which computes the full join before summing the
counts. This clearly requires exponential space. In contrast, factorized
query execution distributes the summation through joins, so that each
node rst sums out (marginalizes) attributes irrelevant downstream,
and then emits a smaller message. In Figure 1d, AB marginalizes out
B
and
AC
marginalizes out
C
, so that the nal result has only 2 tuples.
arXiv:2210.03851v1 [cs.DB] 7 Oct 2022
This concept has been extended to handle arbitrary join graph
structures [
68
]; identify good message passing orders [
77
]; and
support complex aggregations over semi-ring structures such as
factorized learning [
21
,
78
], where the aggregation function trains
a ML model. These properties make factorized execution promising
for wide-table analytics.
However, common wide-table analytics starts with an initial
(pivot) query, and then applies deltas to the query structure (delta
queries) —these may modify selection or grouping clauses, update
or remove tables, or join new tables. Further, these use cases de-
mand interactive response times [
29
,
81
]. For instance, users may
incrementally slice, dice and drill down along dimensions [
34
]; ap-
ply predicate-based deletion interventions on the input tables to
understand their eects [
73
,
88
]; or join with new tables as part of
ML augmentation [
17
] or data enrichment [
24
]. Although factorized
execution reduces individual query latencies, it does not leverage
materialization during query processing to exploit work-sharing
across them (either from the pivot query, or between pre-computed
data structures).
Our experiments show that Wide-table Delta
Analytics can be >105×slower than need be.
Our core question is:
what intermediates should be materi-
alized for Wide-table Delta Analytics?
First, which messages
can be shared between the pivot query and delta queries? Delta
queries only dier from the pivot query in a subset of selection,
projection, join or aggregation clauses, and most messages from the
pivot query could be re-used. To this end, we introduce an ecient
method to check the equivalence of intermediate messages.
Second, given the pivot query, which of its messages are sucient
to support re-use for Wide-table Delta Analytics? The key challenge
is that simply caching the messages emitted when executing the
pivot query is insucient because message re-usability is sensitive
to the message-passing order. This would force future delta queries
to either use the same ordering or forgo message re-use. To address
this, we observe that query execution passes messages across all
edges in the join graph along a single direction, and show that
sending and materializing messages in reverse for the pivot query
are sucient to support arbitrary orderings in future delta queries.
We relate this to calibration in probabilistic graphical models [
79
],
which similarly shares computation between interactive queries
over posterior distributions.
Third, we bring these ideas together in the
Calibrated Junction
HyperTree
(
CJT
). Given a pivot query, the novel data structure
manages message materialization and re-use. Building the data
structure only takes up to twice the time as executing the pivot
query, but supports re-use for arbitrary message passing orders.
Finally, we apply
CJT
to three important classes of wide-table
applications.
CJT
accelerates
Data Cube
construction by re-using
messages from low-dimensional cuboids to answer higher dimen-
sional OLAP queries, and avoiding the exponential cost of con-
structing high dimensional cuboids directly.
CJT
enables interactive
Data Augmentation for ML
by reducing the time to add a new
relation (features) to the join graph and update an ML model by
> 10
2×
compared to prior approaches.
CJT
can directly use Factor-
ized IVM [
7
,
67
] to accelerate
Data Explanation and Streaming
applications, and we further reduce IVM maintenance overheads
by >92×by lazily maintaining messages.
To summarize, our contributions are as follows:
Conceptually, we expand the connection between factorized
queries and PGM by drawing on the idea of calibration.
Practically, we design the novel
CJT
data structure, which uses
calibration to enable work-sharing for Wide-table Delta Analytics.
The cost of materializing the data structure is within a constant
factor of the pivot query execution, but accelerates future queries
by multiple orders of magnitude.
We apply
CJT
to data cube, data augmentation for ML, streaming
and explanation applications, and describe additional application-
specic optimizations.
To illustrate the algorithmic benets of our ideas, we implement
and evaluate three versions of
CJT
: a custom single threaded
query engine and middleware compilers to SQL and Pandas data
frame operations. Our custom engine out-performs the state-
of-the-art LMFAO factorized engine by
30
×
on OLAP queries;
compared to factorized execution algorithms, our SQL experi-
ments on AWS Redshift reduce execution by up to 10
3×
on TPC-H
queries, while Pandas experiments accelerate data augmentation
for ML by >100×.
2 BACKGROUND
This section provides a brief overview of annotated relations, early
marginalization and variable elimination to accelerate join-aggregation
queries, and the junction hypertree join representation. Our goal in
this paper is to keep the content accessible. To this end, we avoid
technical concepts (e.g., hypergraphs) that are needed for deriv-
ing bounds but not needed for developing intuition, and limit the
discussion to COUNT queries. However, our work generalizes to
any commutative semi-ring aggregation query [
35
], and the full
technical details can be found in the technical report [39].
Data Model.
Let uppercase symbol
A
be an attribute,
dom
(
A
) is its
domain, and lowercase symbol
adom
(
A
) be a valid attribute value.
By default, we assume categorical attributes. Numerical attributes
are usually part of the semi-ring annotation discussed below. How-
ever, we can easily support numerical attributes by introducing a
domain with innite size. Given relation R, its schema
SR
is a set of
attributes, and its domain
dom
(
R
) =
×ASdom
(
A
) is the Cartesian
product of its attribute domains. An attribute is incident of R if
ASR. Given tuple t, let t[A] be its value of attribute A.
Annotated Relations.
Since relational algebra (rst-order logic)
does not support aggregation, it has been extended with the use of
commutative structures to support aggregation. The main idea is
that tuples are annotated with values from a semi-ring, and when
relational operators (e.g., join, project, group-by) concatenate or
combine tuples, they also multiply or add the tuple annotations,
such that the nal annotations correspond to the desired aggrega-
tion results.
A commutative semi-ring (
D
, +,
×
, 0, 1) denes a set
D
, binary
operators + and
×
closed over
D
where both are commutative,
and the zero 0 and unit 1 elements. For simplicity, the text will
be based on COUNT queries and the natural numbers semi-ring
(
N
, +,
×
, 0, 1), which operates as in grade school math. However,
our work extends to arbitrary commutative semi-ring structures
that support aggregation queries containing common statistical
2
functions (mean, min, max, std), as well as machine learning models
(e.g., linear regression, regression trees, etc). Our applications and
experiments illustrate these use cases, and the technical report
presents a full treatment [
39
]. Each relation
R
annotates each of
its tuples
tdom
(
R
) with a natural number, and
R
(
t
) refers to this
annotation for tuple
t
[
35
,
41
,
67
]. We will use the terms relation
and annotated relation interchangeably.
Semi-ring Aggregation Query.
Aggregation queries are dened
over annotated relations, and the relational operators are extended
to add or multiple tuple annotations together, so that the output
tuples’ annotations are the desired aggregated values1.
Consider the query
𝜑S’
=
γS–{A},COUNT
(
R11R2
...
1Rn
) that
joins
n
relations, groups by all attributes except
A
:
S’
=
S
– {
A
}, and
computes the COUNT. The operators that combine multiple tuples
are join and groupby (projection under set semantics corresponds
to groupby), and they compute the output tuple annotations as
follows:
(R 1T)(t) = R(πSR(t)) ×T(πST(t)) (1)
(
A
R)(t) = {R(t1)| t1DS, t = πSR\{A}(t1)} (2)
The rst statement states that given a join output tuple
t
, its anno-
tation is dened by multiplying the annotations of the pair of input
tuples, where
SR
and
ST
are
R
and
T
’s schemas. The second denes
the count for output tuple
t
, and
ÍAR
denotes that we marginalize
over
A
and remove it from the output schema. This corresponds to
summing the annotations for all input tuples that are in the same
group as t.
To summarize, join and groupby correspond to
×
and +, respec-
tively, and the query can be rewritten as
𝜑S’
=
ÍAS’
(
R11R2
...
1
Rn
). This lets us distribute summations across multiplications as in
simple algebra, as we discuss next.
Early Marginalization.
In simple algebra (as well as semi-rings),
multiply distributes over addition, and can allow us to push marginal-
ization through joins, in the spirit of projection push down [36].
Consider Figure 1, which computes
γA;COUNT (R1S1T)
. We
can rewrite it as marginalizing
B
,
C
, and
D
from the full join result
B
C
D
R[A, B] 1S[A, C] 1T[A, D].
Although the naive cost is
O
(
n3
) where
n
is the cardinality of rela-
tions, we can push down the marginalizations to derive the follow-
ing, where the largest intermediate result, and thus the join cost, is
O(n):
D
(
C
((
B
R[A, B])1S[A, C]) 1T[A, D])
Join Ordering and Variable Elimination.
Variable elimination
is a class of query execution plans that combines early marginaliza-
tion with join ordering. Early marginalization is applied to a given
join order, thus we may also reorder the joins to cluster relations
that involve a given attribute, so that it can be safely marginal-
ized. Consider the query
ÍAR
[
A
,
B
]
1S
[
B
,
D
]
1T
[
A
,
C
]. We can
1
Note that this means dierent aggregation functions are dened over dierent semi-
ring structures, and our examples will focus on COUNT queries.
reorder the joins so that A can be marginalized out earlier:
S[B, D] 1
A
(R[A, B] 1T[A, C]).
The above procedure, where for each marginalized attribute
A
,
we rst cluster and join relations incident to
A
, and then marginal-
ize
A
, is called variable elimination [
20
] and is widely used for
inference in Probabilistic Graphic Models [
49
]. The order in which
attribute(s) are marginalized out (by clustering and joining the in-
cident relations) is called the variable elimination order. Note that a
given order is simply an execution plan. The complexity of variable
elimination is dominated by the intermediate join result size of the
clustered relations (using worst-case optimal join [
66
]). It is well
known that nding the optimal order (with the minimum interme-
diate join size) is NP-hard [
28
]. Prior work [
6
] has shown that the
intermediate result size of optimal order is bound by the fractional
hypertree width of the join graph, which we introduce in Appen-
dix A. However, common database queries are over acyclic join
graph, whose optimal order could be found eciently as discussed
below.
Junction Hypertree.
The Junction Hypertree
2
(
JT
) is a represen-
tation of a join query that is amenable to complexity analysis [
6
,
43
]
and semi-ring aggregation query optimization [
5
]. In the next sec-
tion, we will show how
JT
can be materialized and maintained
to provide work sharing and optimization opportunities for join-
aggregation queries, and how to extend it to balance storage costs
and query benets. For now, we simply dene the structure.
Given a join graph
R11. . . 1Rn
using natural joins for simplic-
ity, a Junction Hypertree is a pair (
E
,
V
), where each vertex
vV
is a subset of attributes in the join graph, and the undirected edges
form a tree that spans the vertices. The join graph may be explicitly
dened by a query, or induced by the foreign key relationships in
a database schema. Following prior work [
6
], a
JT
vertex is also
called a bag. A JT must satisfy three properties:
Vertex Coverage:
the union of all bags in the tree must be equal
to the set of attributes in the join graph.
Edge Coverage:
for every relation
R
in the join graph, there
exists at least one bag that is a superset of R’s attributes.
Running intersection:
for any attribute in the join graph, the
bags containing the attribute must form a connected subtree. In
other words, if two bags both contain attribute
A
, all bags along
the path between them should also contain A.
The last property is important because
JT
s are related to variable
elimination and are used for query execution. Given an elimination
ordering, let each join cluster be a bag in the
JT
, and adjacent clus-
ters be connected by an edge. In this context, executing the variable
elimination order corresponds to traversing the tree (path); when
execution moves beyond an attribute’s connected subtree, then it
can be safely marginalized out. Note that since the
JT
is undirected,
it can induce many variable elimination orders (execution plans)
from a given JT, all with the same runtime complexity.
Finally, there are many valid
JT
for a given join graph, and the
complexity (query execution cost) of a
JT
is dominated by the largest
bag (the join size of the relations covered by the bag). Although
2JT
is also called Hypertree Decomposition [
6
,
43
], Join Tree, Join Forest [
40
,
77
] in
databases and Clique Hypertree in probalistic graphical models [49].
3
A
1
1
B
1
2
cnt
2
3
Original Annotated
Relations
R
A
1
1
C
1
2
cnt
3
5
S
A
1
1
D
1
2
cnt
4
2
T
Hypergraph
A
B C
DT
SR
Junction Hypertree
A B A C A D
Absorption
A B A C A D
Join Result
A
1
1
B
1
1
C
1
1
D
1
2
cnt
2×3×4=24
2×3×2=12
1
1
1
1
2
2
1
2
2×5×4=40
2×5×2=20
1
1
2
2
1
1
1
2
3×3×4=36
3×3×2=18
1
1
2
2
2
2
1
2
3×5×4=60
3×5×2=30
Marg Result
cnt
24+12+40+20+36
+18+60+30=240
A
1
cnt
5
A
1
cnt
40
Message Passing
A B A C A D
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A
1
1
D
1
2
cnt
4×40=160
2×40=80
Junction Hypertree As
Data Structure
A B A C A D
R S T
A
1
cnt
5
A
1
cnt
40
A
1
cnt
48
A
1
cnt
6
Downward Passing
A B A C A D
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A
1
cnt
2+4=6
A
1
cnt
6×3+6×5=48
Cycle Hypergraph
A
B C
T
SRA B C
R S T
A B A B C B C
R S T
A C
A
1
1
B
1
2
cnt
3
3
R'
A
1
B
1
cnt
1
ΔR
A B A C A D
A
1
cnt
1
A
1
cnt
1×3+1×5=8
A
1
1
D
1
2
cnt
8×4=32
8×2=16
Factorised Incremental View Maintenance
D
1
2
E
1
2
cnt
7
1
UA B A C A D
D
1
E
1
cnt
7×160=1120
D E
D
1
cnt
160
A B C
A B AC BC
AB C
All
A B C
R S
A B AC BC
AC B
A B
R1, R2
A C
R3
A D
R4, R5
A B
R1, R2
A C
R3
C D
R6
A B
R1, R2
A C
R3
A D
R7
A B A C A D
A B A C A D
A B A C A D
A
B C
T (100)
S (100)
R (100)
DU (100)
A D
U
100
A B C
R S T
1000
Fractional Edge Cover: 10000
Hypertree Decomposition: 1000
Consider query COUNT(*) GROUP BY D,
Fractional Sub-Hypertree Width: 100
A D
U
100
A B C
R S T
1000
A
B C
S (100)R (1000)
A C
S
100
A B
R
1000
A C
S
100
A B
R
1000
A B
R
A C
S
B C
D E
D E
B F
B F
B F D E
A
1
1
B
1
2
p
0.1
0.3
P(A, B)
2
2
1
2
0.2
0.4
B
1
1
C
1
2
p
0.3
0.7
P(C|B)
2
2
1
2
0.2
0.8
P(A, B, C)
A
1
1
B
1
1
C
1
2
p
0.1×0.3=0.03
1
1
2
2
1
2
2
2
1
1
1
2
2
2
2
2
1
2
0.1×0.7=0.07
0.3×0.2=0.06
0.3×0.8=0.24
0.2×0.3=0.06
0.2×0.7=0.14
0.4×0.2=0.08
0.4×0.9=0.32
B
1
2
p
0.23
0.77
P(C)
A B B C
B
1
2
p
0.3
0.7
P(B)
B
1
1
C
1
2
P(B, C) = P(B)×P(C|B)
2
2
1
2
A B B C
p
0.3×0.3=0.09
0.7×0.3=0.21
0.2×0.7=0.14
0.8×0.7=0.56
A B
R, S
A B C
T
A B
R
A B C
S, T
D
D
1
cnt
160
2 80
A B
σA C A D
B C C D D E
D E
E F
B F
A B
Q2
A B
σA C A D
B C A C C D
D E
D E
B F
B F
Q2
B C
σC D D E E FA B
B C C D
σD E E FA B
B C C D D E E FA B
B C
σC D D E E FA B
B C C D D E E FA B
B C
σC D D E E FA B A C A DA B
A C
σA DA B
A C A DA B
A C
σA DA B
B C C D D EA B
B C
σC D D EA B
B C C D
σD EA B
A B A C A D
B
1
1
D
1
2
cnt
4×6=24
2×6=12
A
1
1
B
1
2
cnt
2
3
A
1
1
B
1
2
cnt
2×3=6
3×3=9
A
1
1
2
2
1
2
4×9=36
2×9=18
1
1
Store
Sales
Items
Stores
Time
Stores
Cust Time
B C A C C D D EB F
2 80 2 2 1×80=80
Store
Sales
Items
Stores
Cust Time
B C
A B B G B D D E E F
Cast
Info Movie Movie
Comp
Comp
Movie
Info
Info
Type
Aug
Comp
Person
Person
Info
Aug
Person
Movie
Key
Key
Type
Title
B C A C C D D E
B F
B C A C C D D EB F
with compensating
annotations
A B
R
A C
S
B C
A B B G B D D E E F
DE is the
augmentation
relation
Trans Sales ItemsStores
Dates
Aug
Stores
Aug
Dates
Aug
Items
A B A C A D
A
1
1
D
1
2
cnt
4×40=160
2×40=80
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A B A C A D
A
1
1
B
1
2
cnt
2
3
R
A
1
1
C
1
2
cnt
3
5
S
A
1
1
D
1
2
cnt
4
2
T
A
1
1
B
1
2
cnt
2
3
B
1
1
C
1
2
cnt
2×3=6
2×5=10
A
1
1
2
2
1
2
3×3=9
3×5=15
1
1
(a) Message passing to root AD.
A
1
1
B
1
2
cnt
2
3
Original Annotated
Relations
R
A
1
1
C
1
2
cnt
3
5
S
A
1
1
D
1
2
cnt
4
2
T
Hypergraph
A
B C
DT
SR
Junction Hypertree
A B A C A D
Absorption
A B A C A D
Join Result
A
1
1
B
1
1
C
1
1
D
1
2
cnt
2×3×4=24
2×3×2=12
1
1
1
1
2
2
1
2
2×5×4=40
2×5×2=20
1
1
2
2
1
1
1
2
3×3×4=36
3×3×2=18
1
1
2
2
2
2
1
2
3×5×4=60
3×5×2=30
Marg Result
cnt
24+12+40+20+36
+18+60+30=240
A
1
cnt
5
A
1
cnt
40
Message Passing
A B A C A D
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A
1
1
D
1
2
cnt
4×40=160
2×40=80
Junction Hypertree As
Data Structure
A B A C A D
R S T
A
1
cnt
5
A
1
cnt
40
A
1
cnt
48
A
1
cnt
6
Downward Passing
A B A C A D
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A
1
cnt
2+4=6
A
1
cnt
6×3+6×5=48
Cycle Hypergraph
A
B C
T
SRA B C
R S T
A B A B C B C
R S T
A C
A
1
1
B
1
2
cnt
3
3
R'
A
1
B
1
cnt
1
ΔR
A B A C A D
A
1
cnt
1
A
1
cnt
1×3+1×5=8
A
1
1
D
1
2
cnt
8×4=32
8×2=16
Factorised Incremental View Maintenance
D
1
2
E
1
2
cnt
7
1
UA B A C A D
D
1
E
1
cnt
7×160=1120
D E
D
1
cnt
160
A B C
A B AC BC
AB C
All
A B C
R S
A B AC BC
AC B
A B
R1, R2
A C
R3
A D
R4, R5
A B
R1, R2
A C
R3
C D
R6
A B
R1, R2
A C
R3
A D
R7
A B A C A D
A B A C A D
A B A C A D
A
B C
T (100)
S (100)
R (100)
DU (100)
A D
U
100
A B C
R S T
1000
Fractional Edge Cover: 10000
Hypertree Decomposition: 1000
Consider query COUNT(*) GROUP BY D,
Fractional Sub-Hypertree Width: 100
A D
U
100
A B C
R S T
1000
A
B C
S (100)R (1000)
A C
S
100
A B
R
1000
A C
S
100
A B
R
1000
A B
R
A C
S
B C
D E
D E
B F
B F
B F D E
A
1
1
B
1
2
p
0.1
0.3
P(A, B)
2
2
1
2
0.2
0.4
B
1
1
C
1
2
p
0.3
0.7
P(C|B)
2
2
1
2
0.2
0.8
P(A, B, C)
A
1
1
B
1
1
C
1
2
p
0.1×0.3=0.03
1
1
2
2
1
2
2
2
1
1
1
2
2
2
2
2
1
2
0.1×0.7=0.07
0.3×0.2=0.06
0.3×0.8=0.24
0.2×0.3=0.06
0.2×0.7=0.14
0.4×0.2=0.08
0.4×0.9=0.32
B
1
2
p
0.23
0.77
P(C)
A B B C
B
1
2
p
0.3
0.7
P(B)
B
1
1
C
1
2
P(B, C) = P(B)×P(C|B)
2
2
1
2
A B B C
p
0.3×0.3=0.09
0.7×0.3=0.21
0.2×0.7=0.14
0.8×0.7=0.56
A B
R, S
A B C
T
A B
R
A B C
S, T
D
D
1
cnt
160
2 80
A B
σA C A D
B C C D D E
D E
E F
B F
A B
Q2
A B
σA C A D
B C A C C D
D E
D E
B F
B F
Q2
B C
σC D D E E FA B
B C C D
σD E E FA B
B C C D D E E FA B
B C
σC D D E E FA B
B C C D D E E FA B
B C
σC D D E E FA B A C A DA B
A C
σA DA B
A C A DA B
A C
σA DA B
B C C D D EA B
B C
σC D D EA B
B C C D
σD EA B
A B A C A D
B
1
1
D
1
2
cnt
4×6=24
2×6=12
A
1
1
B
1
2
cnt
2
3
A
1
1
B
1
2
cnt
2×3=6
3×3=9
A
1
1
2
2
1
2
4×9=36
2×9=18
1
1
Store
Sales
Items
Stores
Time
Stores
Cust Time
B C A C C D D EB F
2 80 2 2 1×80=80
Store
Sales
Items
Stores
Cust Time
B C
A B B G B D D E E F
Cast
Info Movie Movie
Comp
Comp
Movie
Info
Info
Type
Aug
Comp
Person
Person
Info
Aug
Person
Movie
Key
Key
Type
Title
B C A C C D D E
B F
B C A C C D D EB F
with compensating
annotations
A B
R
A C
S
B C
A B B G B D D E E F
DE is the
augmentation
relation
Trans Sales ItemsStores
Dates
Aug
Stores
Aug
Dates
Aug
Items
A B A C A D
A
1
1
D
1
2
cnt
4×40=160
2×40=80
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A B A C A D
A
1
1
B
1
2
cnt
2
3
R
A
1
1
C
1
2
cnt
3
5
S
A
1
1
D
1
2
cnt
4
2
T
A
1
1
B
1
2
cnt
2
3
B
1
1
C
1
2
cnt
2×3=6
2×5=10
A
1
1
2
2
1
2
3×3=9
3×5=15
1
1
(b) Moving the root increases
message reuse.
Figure 2: Work sharing opportunities between queries Q1(to-
tal count query) and Q2with additional predicate applied to
S(A,C). Dotted blue edges are reusable messages and solid
red edges are non-reusable edges.
nding the optimal
JT
for an arbitrary join graph is NP-hard [
28
],
we can trivially create the optimal
JT
for an acyclic join graph by
creating one bag for each relation (e.g., the
JT
is simply the join
graph) and the size of each bag is bounded by its corresponding
relation size. We refer readers to FAQ [
6
] for a complete description.
Message Passing for Query Execution.
Message Passing was
rst introduced by Judea Pearl in 1982 [
69
] (known as belief propa-
gation) in order to eciently perform inference (compute marginal
probability) over probabilistic graphical models. In database terms,
each probability table corresponds to a relation, the probabilistic
graphical model corresponds to the full join graph in a database
(as expressed by a
JT
), the joint probability over the model corre-
sponds to the full join result, and marginal probabilities correspond
to grouping over dierent sets of attributes. To further support
semi-ring aggregation, Abo et al. [
6
] established the equivalence
between variable elimination, factorized execution of a single query,
and (upward) message passing. Below, we illustrate how message
passing over a
JT
is used for query execution, and the next section
leverages the ability to reuse messages across queries.
The procedure rst determines a traversal order over the
JT
since the
JT
is undirected, we can arbitrarily choose any bag as
the root and create directed edges that point towards the root—and
then traverses from leaves to root. We rst compute the initial
contents of each bag by joining the necessary tables based on the
bag’s attributes. When we traverse an outgoing edge from a bag
l
to its parent
p
, we marginalize out all attributes that are not in
their intersection—the result is the Message between
l
and
p
. The
parent bag then joins the message with its contents. Each bag waits
until it has received messages from all incoming edges before it
emits along its outgoing edges, and once the root has received all
incoming messages, its updated contents correspond to the query
result.
Example 2 (Message Passing). Consider the relations in Fig-
ure 1a, and the
JT
in Figure 1b where each bag is a base relation. We
wish to execute
ÍABCD R
(
A
,
B
)
1S
(
A
,
C
)
1T
(
A
,
D
)by traversing
along the path
AB AC AD
(Figure 1d). We rst marginalize
out
B
from
AB
, so the message to
AC
is a single row with count 5. The
bag
AC
joins the row with its contents, and thus multiplies each of
its counts by 5. It then marginalizes out
C
, so its message to
AD
is a
single row with count (3 + 5)
×
5. Finally, bag
AD
absorbs the message
(Figure 1d) and marginalizes out
A
and
D
to compute the nal result.
Scope.
Following prior factorized query execution work [
67
,
68
,
77
],
we use the acyclic join graph as the
JT
, or assume a good
JT
has
already been determined from standard hypertree decomposition
[
6
,
43
] for cyclic join graphs. Although
CJT
supports any commu-
tative semi-ring, theta joins with arbitrary conditions, outer joins,
and any factorized machine learning model
3
, we try keep the ideas
accessible to a general database audience and base our examples on
natural numbers (a semi-ring), COUNT queries, with natural joins,
and the linear regression model.
3 CALIBRATED JUNCTION HYPERTREE
While message passing over
JT
exploits early marginalization to
accelerate query execution, it has traditionally been limited to use
in complexity analysis and single-query execution. This section in-
troduces the Calibrated Junction Tree (
CJT
) to enable work-sharing
for ecient Wide-table Delta Analytics. The idea is to materialize
messages over the
JT
for the wide-table pivot query, and reuse a
subset of its messages for future delta queries. This section will focus
on the basis for the
CJT
data structure and how it is used to execute
delta queries. The next section will describe how to customize
CJT
for a range of useful wide-table applications.
Our novelty is 1) to use
JT
s as a concrete data structure to support
message reuse across queries, and 2) to borrow Calibration [
79
] from
probabilistic graphical models to ensure that sucient messages are
materialized to eciently support arbitrary SPJA queries over the
full join graph. Although
CJT
is widely used across engineering [
72
,
92
], ML [
15
,
23
], and medicine [
53
,
70
], we are the rst to introduce
CJT
in the context of query execution and generalize it to semi-ring
aggregation queries.
3.1 Motivating Example
We rst illustrate work sharing examples between a pivot query
Q1
=
ÍABCD AB 1AC 1AD
(whose messages are materialized)
and a delta query
Q2
=
ÍABCD AB 1σc=1
(
AC
)
1AD
.
Q1
computes
the total count, and Q2applies an additional predicate C=1.
Example 3. The
JT
s in Figure 2a both assign
AD
as the root and
traverse along the path
AB AC AD
. Although the message
AB AC
will be identical (blue edges), the additional lter over
AC
means that its outgoing message (and all subsequent messages)
will dier from
Q1
’s and cannot be reused (red edges). In contrast,
Figure 2b uses
AC
as the root, so both messages can be reused and the
AC bag simply applies the lter after joining its incoming messages.
This example shows that message reuse depends on how the
root bag is chosen in the pivot query, and for dierent delta queries,
we may wish to choose dierent roots. Since we may not know the
exact join, grouping, and lter criteria of future delta queries, it is
not eective to simply materialize messages for a single root. The
CJT
data structure addresses these limitations, and the following
text describes the
CJT
data structure, message reuse, and query
execution given a CJT.
3.2 Junction Hypertree as Data Structure
A naive approach to re-use messages is to execute an aggregation
query over a
JT
, and store the messages; when a future query tra-
verses an edge in the
JT
, it simply reuses the corresponding message.
Unfortunately, this is 1) inaccurate, because messages generated
3
Including ridge regression, classication tree, regression tree [
77
], k-means (RK-
means), support vector machine [48] and factorization machine [77].
4
A
1
1
B
1
2
cnt
2
3
Original Annotated
Relations
R
A
1
1
C
1
2
cnt
3
5
S
A
1
1
D
1
2
cnt
4
2
T
Hypergraph
A
B C
DT
SR
Junction Hypertree
A B A C A D
Absorption
A B A C A D
Join Result
A
1
1
B
1
1
C
1
1
D
1
2
cnt
2×3×4=24
2×3×2=12
1
1
1
1
2
2
1
2
2×5×4=40
2×5×2=20
1
1
2
2
1
1
1
2
3×3×4=36
3×3×2=18
1
1
2
2
2
2
1
2
3×5×4=60
3×5×2=30
Marg Result
cnt
24+12+40+20+36
+18+60+30=240
A
1
cnt
5
A
1
cnt
40
Message Passing
A B A C A D
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A
1
1
D
1
2
cnt
4×40=160
2×40=80
Junction Hypertree As
Data Structure
A B A C A D
R S T
A
1
cnt
5
A
1
cnt
40
A
1
cnt
48
A
1
cnt
6
Downward Passing
A B A C A D
A
1
cnt
2+3=5
A
1
cnt
5×3+5×3=40
A
1
cnt
2+4=6
A
1
cnt
6×3+6×5=48
Cycle Hypergraph
A
B C
T
SRA B C
R S T
A B A B C B C
R S T
A C
A
1
1
B
1
2
cnt
3
3
R'
A
1
B
1
cnt
1
ΔR
A B A C A D
A
1
cnt
1
A
1
cnt
1×3+1×5=8
A
1
1
D
1
2
cnt
8×4=32
8×2=16
Factorised Incremental View Maintenance
D
1
2
E
1
2
cnt
7
1
UA B A C A D
D
1
E
1
cnt
7×160=1120
D E
D
1
cnt
160
A B C
A B AC BC
AB C
All
A B C
R S
A B AC BC
AC B
A B
R1, R2
A C
R3
A D
R4, R5
A B
R1, R2
A C
R3
C D
R6
A B
R1, R2
A C
R3
A D
R7
A B A C A D
A B A C A D
A B A C A D
A
B C
T (100)
S (100)
R (100)
DU (100)
A D
U
100
A B C
R S T
1000
Fractional Edge Cover: 10000
Hypertree Decomposition: 1000
Consider query COUNT(*) GROUP BY D,
Fractional Sub-Hypertree Width: 100
A D
U
100
A B C
R S T
1000
A
B C
S (100)R (1000)
A C
S
100
A B
R
1000
A C
S
100
A B
R
1000
A B
R
A C
S
B C
D E
D E
B F
B F
B F D E
A
1
1
B
1
2
p
0.1
0.3
P(A, B)
2
2
1
2
0.2
0.4
B
1
1
C
1
2
p
0.3
0.7
P(C|B)
2
2
1
2
0.2
0.8
P(A, B, C)
A
1
1
B
1
1
C
1
2
p
0.1×0.3=0.03
1
1
2
2
1
2
2
2
1
1
1
2
2
2
2
2
1
2
0.1×0.7=0.07
0.3×0.2=0.06
0.3×0.8=0.24
0.2×0.3=0.06
0.2×0.7=0.14
0.4×0.2=0.08
0.4×0.9=0.32
B
1
2
p
0.23
0.77
P(C)
A B B C
B
1
2
p
0.3
0.7
P(B)
B
1
1
C
1
2
P(B, C) = P(B)×P(C|B)
2
2
1
2
A B B C
p
0.3×0.3=0.09
0.7×0.3=0.21
0.2×0.7=0.14
0.8×0.7=0.56
A B
R, S
A B C
T
A B
R
A B C
S, T
D
D
1
cnt
160
2 80
A B
σA C A D
B C C D D E
D E
E F
B F
A B
Q2
A B
σA C A D
B C A C C D
D E
D E
B F
B F
Q2
B C
σC D D E E FA B
B C C D
σD E E FA B
B C C D D E E FA B
B C
σC D D E E FA B
B C C D D E E FA B
B C
σC D D E E FA B A C A DA B
A C
σA DA B
A C A DA B
A C
σA DA B
B C C D D EA B
B C
σC D D EA B
B C C D
σD EA B
A B A C A D
B
1
1
D
1
2
cnt
4×6=24
2×6=12
A
1
1
B
1
2
cnt
2
3
A
1
1
B
1
2
cnt
2×3=6
3×3=9
A
1
1
2
2
1
2
4×9=36
2×9=18
1
1
Store
Sales
Items
Stores
Time
Stores
Cust Time
B C A C C D D EB F
2 80 2 2 1×80=80
Store
Sales
Items
Stores
Cust Time
B C
A B B G B D D E E F
Cast
Info Movie Movie
Comp
Comp
Movie
Info
Info
Type
Aug
Comp
Person
Person
Info
Aug
Person
Movie
Key
Key
Type
Title
B C A C C D D E
B F
B C A C C D D EB F
with compensating
annotations
A B
R
A C
S
B C
A B B G B D D E E F
DE is the
augmentation
relation
Trans Sales ItemsStores
Dates
Aug
Stores
Aug
Dates
Aug
Items
A B A C A D
A
1
1
D
1
2
cnt
4×40=160
2×40=80
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A B A C A D
A
1
1
B
1
2
cnt
2
3
R
A
1
1
C
1
2
cnt
3
5
S
A
1
1
D
1
2
cnt
4
2
T
A
1
1
B
1
2
cnt
2
3
B
1
1
C
1
2
cnt
2×3=6
2×5=10
A
1
1
2
2
1
2
3×3=9
3×5=15
1
1
Figure 3: The same JT over relations R(A,B),S(A),T(B,C) can
have dierent relation mappings (X) and each mapping re-
sults in dierent messages (m). For each bag, its attributes
are at the top and mapped relations are at the bottom.
(a) TPC-DS join graph (also JT)
(b) Add empty bag (Time, Stores).
Figure 4: The join graph and JT of TPC-DS (simplied).
Adding an empty bag can accelerate queries group-by Time
and Stores.
along an edge are not symmetric, so that message contents depend
on the specic traversal order during message passing, 2) insu-
cient, because it cannot directly express all lter-group-by queries
over the
JT
, and 3) leaves performance on the table. To do so, we
extend the JT data structure as follows:
Directed Edges.
To support arbitrary traversal orders, we replace
each undirected edge with two directed edges, and use
Y
(
ij
) to
refer to the cached message for the directed edge i j.
Relation Mapping. X
(
R
) maps each base relation
R
to exactly one
bag containing R’s schema. Although dierent mappings can lead
to dierent messages (Figure 3), acyclic join graphs have a good de-
fault mapping where each relation maps to a single bag
4
. Relations
mapped to the same bag are joined during message passing.
Empty Bags.
To avoid large paths during message passing it can
help to add custom Empty bags to create “short cuts”. Empty bags
are not mapped from any relations and simply a mechanism to
materialize custom views for work sharing. They join incoming
messages, marginalize using standard rules, and materialize the
outgoing messages. Empty bags are a novel addition in this work:
previous works [
5
,
6
,
89
] focus on non-redundant
JT
without empty
bags in the context of single query optimization.
Example 4 (Empty Bag). Consider the simplied TPC-DS
JT
in
Figure 4a. Store Sales is a large fact table (2.68M rows at SF=1), while
the rest are much smaller. To accelerate a query that aggregates
sales
grouped by
(Store,Time)
, we can create the empty bag
Time
Stores
between
Store_Sales
,
Time
and
Stores
(Figure 4b). The
message from
Store_Sales
to the empty bag is sucient for the
query and is 17.3×smaller (154K rows) than the fact table.
Note that leaf empty bag may result in an empty output message;
we avoid this special case by mapping the identity relation
I5
to it,
such that
R1I
=
R
for any relation
R
. Essentially, the empty bag
is “pass-through” and doesn’t change the join results nor the query
4
Heuristics for the general case have been studied by the greedy variable elimination
in Probabilistic Graphical Model [49].
5The schema is the same as the bag and all tuples in its domain are annotated with 1
element in the semiring.
A
1
1
B
1
2
cnt
2
3
Original Annotated
Relations
R
A
1
1
C
1
2
cnt
3
5
S
A
1
1
D
1
2
cnt
4
2
T
Hypergraph
A
B C
DT
SR
Junction Hypertree
A B A C A D
Absorption
A B A C A D
Join Result
A
1
1
B
1
1
C
1
1
D
1
2
cnt
2×3×4=24
2×3×2=12
1
1
1
1
2
2
1
2
2×5×4=40
2×5×2=20
1
1
2
2
1
1
1
2
3×3×4=36
3×3×2=18
1
1
2
2
2
2
1
2
3×5×4=60
3×5×2=30
Marg Result
cnt
24+12+40+20+36
+18+60+30=240
A
1
cnt
5
A
1
cnt
40
Message Passing
A B A C A D
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A
1
1
D
1
2
cnt
4×40=160
2×40=80
Junction Hypertree As
Data Structure
A B A C A D
R S T
A
1
cnt
5
A
1
cnt
40
A
1
cnt
48
A
1
cnt
6
Downward Passing
A B A C A D
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A
1
cnt
2+4=6
A
1
cnt
6×3+6×5=48
Cycle Hypergraph
A
B C
T
SRA B C
R S T
A B A B C B C
R S T
A C
A
1
1
B
1
2
cnt
3
3
R'
A
1
B
1
cnt
1
ΔR
A B A C A D
A
1
cnt
1
A
1
cnt
1×3+1×5=8
A
1
1
D
1
2
cnt
8×4=32
8×2=16
Factorised Incremental View Maintenance
D
1
2
E
1
2
cnt
7
1
UA B A C A D
D
1
E
1
cnt
7×160=1120
D E
D
1
cnt
160
A B C
A B AC BC
AB C
All
A B C
R S
A B AC BC
AC B
A B
R1, R2
A C
R3
A D
R4, R5
A B
R1, R2
A C
R3
C D
R6
A B
R1, R2
A C
R3
A D
R7
A B A C A D
A B A C A D
A B A C A D
A
B C
T (100)
S (100)
R (100)
DU (100)
A D
U
100
A B C
R S T
1000
Fractional Edge Cover: 10000
Hypertree Decomposition: 1000
Consider query COUNT(*) GROUP BY D,
Fractional Sub-Hypertree Width: 100
A D
U
100
A B C
R S T
1000
A
B C
S (100)R (1000)
A C
S
100
A B
R
1000
A C
S
100
A B
R
1000
A B
R
A C
S
B C
D E
D E
B F
B F
B F D E
A
1
1
B
1
2
p
0.1
0.3
P(A, B)
2
2
1
2
0.2
0.4
B
1
1
C
1
2
p
0.3
0.7
P(C|B)
2
2
1
2
0.2
0.8
P(A, B, C)
A
1
1
B
1
1
C
1
2
p
0.1×0.3=0.03
1
1
2
2
1
2
2
2
1
1
1
2
2
2
2
2
1
2
0.1×0.7=0.07
0.3×0.2=0.06
0.3×0.8=0.24
0.2×0.3=0.06
0.2×0.7=0.14
0.4×0.2=0.08
0.4×0.9=0.32
B
1
2
p
0.23
0.77
P(C)
A B B C
B
1
2
p
0.3
0.7
P(B)
B
1
1
C
1
2
P(B, C) = P(B)×P(C|B)
2
2
1
2
A B B C
p
0.3×0.3=0.09
0.7×0.3=0.21
0.2×0.7=0.14
0.8×0.7=0.56
A B
R, S
A B C
T
A B
R
A B C
S, T
D
D
1
cnt
160
2 80
A B
σA C A D
B C C D D E
D E
E F
B F
A B
Q2
A B
σA C A D
B C A C C D
D E
D E
B F
B F
Q2
B C
σC D D E E FA B
B C C D
σD E E FA B
B C C D D E E FA B
B C
σC D D E E FA B
B C C D D E E FA B
B C
σC D D E E FA B A C A DA B
A C
σA DA B
A C A DA B
A C
σA DA B
B C C D D EA B
B C
σC D D EA B
B C C D
σD EA B
A B A C A D
B
1
1
D
1
2
cnt
4×6=24
2×6=12
A
1
1
B
1
2
cnt
2
3
A
1
1
B
1
2
cnt
2×3=6
3×3=9
A
1
1
2
2
1
2
4×9=36
2×9=18
1
1
Store
Sales
Items
Stores
Time
Stores
Cust Time
B C A C C D D EB F
2 80 2 2 1×80=80
Store
Sales
Items
Stores
Cust Time
B C
A B B G B D D E E F
Cast
Info Movie Movie
Comp
Comp
Movie
Info
Info
Type
Aug
Comp
Person
Person
Info
Aug
Person
Movie
Key
Key
Type
Title
B C A C C D D E
B F
B C A C C D D EB F
with compensating
annotations
A B
R
A C
S
B C
A B B G B D D E E F
DE is the
augmentation
relation
Trans Sales ItemsStores
Dates
Aug
Stores
Aug
Dates
Aug
Items
A B A C A D
A
1
1
D
1
2
cnt
4×40=160
2×40=80
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A B A C A D
A
1
1
B
1
2
cnt
2
3
R
A
1
1
C
1
2
cnt
3
5
S
A
1
1
D
1
2
cnt
4
2
T
A
1
1
B
1
2
cnt
2
3
B
1
1
C
1
2
cnt
2×3=6
2×5=10
A
1
1
2
2
1
2
3×3=9
3×5=15
1
1
(a) Upward and downward
message passing. Green rec-
tangle is the root.
A
1
1
B
1
2
cnt
2
3
Original Annotated
Relations
R
A
1
1
C
1
2
cnt
3
5
S
A
1
1
D
1
2
cnt
4
2
T
Hypergraph
A
B C
DT
SR
Junction Hypertree
A B A C A D
Absorption
A B A C A D
Join Result
A
1
1
B
1
1
C
1
1
D
1
2
cnt
2×3×4=24
2×3×2=12
1
1
1
1
2
2
1
2
2×5×4=40
2×5×2=20
1
1
2
2
1
1
1
2
3×3×4=36
3×3×2=18
1
1
2
2
2
2
1
2
3×5×4=60
3×5×2=30
Marg Result
cnt
24+12+40+20+36
+18+60+30=240
A
1
cnt
5
A
1
cnt
40
Message Passing
A B A C A D
A
1
cnt
2+3=5
A
1
cnt
5×3+5×5=40
A
1
1
D
1
2
cnt
4×40=160
2×40=80
Junction Hypertree As
Data Structure
A B A C A D
R S T
A
1
cnt
5
A
1
cnt
40
A
1
cnt
48
A
1
cnt
6
Downward Passing
A B A C A D
A
1
cnt
2+3=5
A
1
cnt
5×3+5×3=40
A
1
cnt
2+4=6
A
1
cnt
6×3+6×5=48
Cycle Hypergraph
A
B C
T
SRA B C
R S T
A B A B C B C
R S T
A C
A
1
1
B
1
2
cnt
1
3
R
A
1
B
1
cnt
-1
ΔR
A B A C A D
A
1
cnt
-1
A
1
cnt
-1×3+-1×5=-8
A
1
1
D
1
2
cnt
4×40=160
2×40=80
Factorised Incremental View Maintenance
D
1
1
E
1
2
cnt
7
1
UA B A C A D
A
1
cnt
40
A
1
D
1
cnt
4×40×8=1280
Augmentation
D E
D
1
cnt
1+7=8
A B C
A B AC BC
AB C
All
A B C
R S
A B AC BC
AC B
A B
R1, R2
A C
R3
A D
R4, R5
A B
R1, R2
A C
R3
C D
R6
A B
R1, R2
A C
R3
A D
R7
A B A C A D
A B A C A D
A B A C A D
A
B C
T (100)
S (100)
R (100)
DU (100)
A D
U
100
A B C
R S T
1000
Fractional Edge Cover: 10000
Hypertree Decomposition: 1000
Consider query COUNT(*) GROUP BY D,
Fractional Sub-Hypertree Width: 100
A D
U
100
A B C
R S T
1000
A
B C
S (100)R (1000)
A C
S
100
A B
R
1000
A C
S
100
A B
R
1000
A B
R
A C
S
B C
D E
D E
B F
B F
B F D E
A
1
1
B
1
2
p
0.1
0.3
P(A, B)
2
2
1
2
0.2
0.4
B
1
1
C
1
2
p
0.3
0.7
P(C|B)
2
2
1
2
0.2
0.8
P(A, B, C)
A
1
1
B
1
1
C
1
2
p
0.1×0.3=0.03
1
1
2
2
1
2
2
2
1
1
1
2
2
2
2
2
1
2
0.1×0.7=0.07
0.3×0.2=0.06
0.3×0.8=0.24
0.2×0.3=0.06
0.2×0.7=0.14
0.4×0.2=0.08
0.4×0.9=0.32
B
1
2
p
0.23
0.77
P(C)
A B B C
B
1
2
p
0.3
0.7
P(B)
B
1
1
C
1
2
P(B, C) = P(B)×P(C|B)
2
2
1
2
A B B C
p
0.3×0.3=0.09
0.7×0.3=0.21
0.2×0.7=0.14
0.8×0.7=0.56
A B
R, S
A B C
T
A B
R
A B C
S, T
D
D
1
cnt
160
2 80
(b) Calibrated Junction Hypertree
with empty bag (dotted). Iis the
identity relation.
Figure 5: Message Passing and Calibration
result. When the bag is a leaf node, its message is simply
I
. We do
not materialize the identity relation, as it’s evident from the JT.
Example 5 (
JT
Data Structure). Figure 5b illustrates the
JT
data structure for the example in Figure 1. Each relation maps to
exactly one bag (orange dotted arrows), and each directed edge between
bags (black arrows) stores its corresponding message (purple dashed
arrows). Bag D (dotted rectangle) is an empty bag and materializes
the view of "count group by D". Iis the identity relation.
3.3 Message Passing Over Annotated Bags
We now describe support for general SPJA queries over
JT
. Al-
though each query
JT
has the same structure, we annotate the
bags based on the query’s SPJA operations. We then modify mes-
sage passing rules to accommodate the bag annotations. These
annotations will come in handy when determining work sharing
opportunities for a new delta query given a pivot query.
Given the database
R
= {
R1
,
R2
, ...,
Rn
} and
JT
= ((
E
,
V
),
X
,
Y
),
we focus on queries of the following form, where any semi-ring
aggregation is acceptable:
SELECT G, COUNT(*) FROM J
WHERE [JOIN COND] AND PGROUP BY G
where
G
is the grouping attributes,
J ⊆ R
is the set of relations
joined in the
FROM
clause, and
P
is the set of single-attribute predi-
cates
6
. Query execution is based on message passing as in Section 2,
however the processing at each bag diers based on its annotations.
We propose 4 annotation types, summarized in Table 1:
GROUP BY G.
For each attribute
A∈ G
, we annotate exactly one
bag
u
that contains this attribute with
γA
. Messages emitted by
the annotated bag and all downstream bags do not marginalize
out
A
. Since all bags containing
A
form a connected subtree,
which bag we annotate does not aect correctness, however
we will later discuss the performance implications of dierent
choices when we use the annotated JT to execute delta queries.
Joined Relations J.
The query may not join all relations
in the join graph, or the joined relations are updated. For each
relation R not included (resp. updated) in the query, we annotate
6
Multi-attributes predicates have interesting optimization opportunities [
48
] but is not
our focus. We rewrite them into group-by those attributes followed by the predicate.
5
摘要:

Calibration:ASimpleTrickforWide-tableDeltaAnalyticsZezhouHuangzh2408@columbia.eduColumbiaUniversityEugeneWuewu@cs.columbia.eduDSI,ColumbiaUniversityABSTRACTDataanalyticsovernormalizeddatabasestypicallyrequirescom-putingandmaterializingexpensivejoins(wide-tables).Factorizedqueryexecutionmodelsexecuti...

展开>> 收起<<
Calibration A Simple Trick for Wide-table Delta Analytics_2.pdf

共22页,预览5页

还剩页未读, 继续阅读

声明:本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知玖贝云文库,我们立即给予删除!
分类:图书资源 价格:10玖币 属性:22 页 大小:6.47MB 格式:PDF 时间:2025-04-27

开通VIP享超值会员特权

  • 多端同步记录
  • 高速下载文档
  • 免费文档工具
  • 分享文档赚钱
  • 每日登录抽奖
  • 优质衍生服务
/ 22
客服
关注