Language-Integrated Query for Temporal Data

2025-05-03 0 0 3.38MB 31 页 10玖币
侵权投诉
Language-Integrated Query for Temporal Data
(Extended version)
Simon Fowler
University of Glasgow
UK
simon.fowler@glasgow.ac.uk
Vashti Galpin
University of Edinburgh
UK
vashti.galpin@ed.ac.uk
James Cheney
University of Edinburgh
UK
jcheney@inf.ed.ac.uk
Abstract
Modern applications often manage time-varying data. De-
spite decades of research on temporal databases, which cul-
minated in the addition of temporal data operations into
the SQL:2011 standard, temporal data query and manipula-
tion operations are unavailable in most mainstream database
management systems, leaving developers with the unenvi-
able task of implementing such functionality from scratch.
In this paper, we extend language-integrated query to sup-
port writing temporal queries and updates in a uniform
host language, with the language performing the required
rewriting to emulate temporal capabilities automatically on
any standard relational database. We introduce two core
languages,
𝜆TLINQ
and
𝜆VLINQ
, for manipulating transaction
time and valid time data respectively, and formalise exist-
ing implementation strategies by giving provably correct
semantics-preserving translations into a non-temporal core
language,
𝜆LINQ
. We show how existing work on query nor-
malisation supports a surprisingly simple implementation
strategy for sequenced joins. We implement our approach in
the Links programming language, and describe a non-trivial
case study based on curating COVID-19 statistics.
Keywords:
language-integrated query, temporal databases,
domain-specic languages, multi-tier programming
1 Introduction
Most interesting programs access or query data stored persis-
tently, often in a database. Relational database management
systems (RDBMSs) are the most popular option and provide
a standard domain-specic language, SQL, for querying and
modifying the data. Ideally, programmers can express the
desired queries or updates declaratively in SQL and leave
the database to decide how to answer queries or perform up-
dates eciently and safely (e.g. in the presence of concurrent
accesses), but there are many pitfalls arising from interfacing
with SQL from a general-purpose language, leading to the
well-known impedance mismatch problem [
10
]. These di-
culties range from run-time failures due to the generation
of queries as unchecked SQL strings at runtime, to serious
security vulnerabilities like SQL injection attacks [31].
Among the most successful approaches to overcome the
above challenges, and the approach we build upon in this pa-
per, is language-integrated query, exemplied by Microsoft’s
popular LINQ for .NET [
24
,
35
] and in a number of other
language designs such as Links [
9
,
22
] and libraries such as
Quill [
28
]. Within this design space we focus on a family of
techniques derived from foundational work by Buneman et
al. [
3
] on the nested relational calculus (NRC), a core query
language with monadic collection types; work by Wong [
36
]
on rewriting NRC expressions to normal forms that can be
translated to SQL, which forms the basis of the approach
taken in Links [8,22] and has been adapted to F# [5].
Many interesting database applications involve data that
changes over time. Perhaps inevitably, temporal data man-
agement [
18
] has a long history. Temporal databases pro-
vide powerful features for querying and modifying data that
changes over time, and are particularly suitable for mod-
eling time-varying phenomena, such as enterprise data or
evolving scientic knowledge, and supporting auditing and
transparency about how the current state of the data was
achieved, for example in nancial or scientic settings.
To illustrate how temporal databases work and why they
are useful, consider the following toy example: a temporal to-
do list. A temporal database can be conceptualised abstractly
as a function mapping each possible time instant (e.g. times
of day) to a database state [
19
]. For eciency in the common
case where most of the data is unchanged most of the time,
temporal databases are often represented by augmenting
each row with an interval timestamp indicating the time
period when the row is present. For technical reasons, closed-
open intervals
[𝑠𝑡𝑎𝑟𝑡, 𝑒𝑛𝑑)
representing the times
start 𝑡<
end are typically used [33].
In our temporal to-do list, the table at each time instant
has elds “task”, a string eld, and “done, a Boolean eld.
Additional elds “start” and “end” record the endpoints of
the time interval during which each row is to be considered
part of the table. An end time of
(“forever”) reects that
there is no (currently known) end time and in the absence of
other changes, the row is considered present from the start
time onwards. For example, the table:
task done start end
Go shopping true 11:00
Cook dinner false 11:00 17:30
Walk the dog false 11:00
Cook dinner true 17:30
Watch TV false 11:00 19:00
represents a temporal table where all four tasks were added
at 11:00, with “Go shopping” being complete and the others
incomplete; at 17:30 “Cook dinner” was marked “done” from
arXiv:2210.12077v1 [cs.PL] 21 Oct 2022
Conference’17, July 2017, Washington, DC, USA Simon Fowler, Vashti Galpin, and James Cheney
then onwards, and at 19:00 “Watch TV” was removed from
the table without being completed. Technically, note that
this example interprets the time annotations as transaction
time, that is, the times indicate when certain data was in the
database; there is another dimension, valid time, and we will
discuss both dimensions in greater detail later on.
The problems of querying and updating temporal databases
have been well-studied, leading to the landmark language de-
sign TSQL2 [
32
] extending SQL. However, despite decades of
eort, only limited elements of TSQL2 were eventually incor-
porated into the SQL:2011 standard [
21
] and these features
have not yet been widely adopted. Directly implementing
temporal queries in SQL is possible, but painful: a TSQL2-
style query or update operation may grow by a factor of 10
or more when translated to plain SQL, which leaves plenty of
scope for error, and thus these powerful capabilities remain
outside the grasp of non-experts. In this paper we take rst
steps towards reconciling temporal data management with
language-integrated query based on query normalisation.
We propose supporting temporal capabilities by translation
to ordinary language-integrated query and hypothesise that
this approach can make temporal data management safer,
easier to use and more accessible to non-experts than the
current state of aairs. As an initial test of this hypothesis
we present a high-level design, a working implementation,
and detail our experience with a nontrivial case study.
Although both language-integrated query and temporal
databases are now well-studied topics, we believe that their
combination has never been considered before. Doing so has
a number of potential benets, including making the power
of well-studied language designs such as TSQL2 more acces-
sible to non-expert programmers, and providing a high-level
abstraction that can be implemented eciently in dierent
ways. Our interest is particularly motivated by the needs
of scientic database development, where data versioning
for accountability and research integrity are very important
needs that are not well-supported by conventional database
systems [
4
]. Temporal data management has the potential
to become a “killer app” for language-integrated query, and
this paper takes a rst but signicant step towards this goal.
The overarching contribution of this paper is the rst ex-
tension of language-integrated query to support transaction
time and valid time temporal data.
Concretely, we make three main contributions:
1.
Based on
𝜆LINQ
2), a formalism based on the Nested
Relational Calculus (NRC) [
30
], we introduce typed
𝜆
-calculi to model queries and modications on trans-
action time (§3) and valid time (§4) databases. We give
semantics-preserving translations to 𝜆LINQ for both.
2.
We show how existing work on query normalisation
allows a surprisingly straightforward implementation
strategy for sequenced joins 5).
Types 𝐴, 𝐵 ::=𝐶|𝐴𝐸𝐵|Bag(𝐴) | ( g
:𝐴) | Table(𝐴)
Base types 𝐶::=String |Int |Bool |Time
Eects 𝑒::=read |write
Eect sets 𝐸
Terms 𝐿, 𝑀, 𝑁 ::=𝑥|𝑐|𝑡
|𝜆𝑥.𝑀 |𝑀 𝑁 | ⊙ {
𝑀}
|if 𝐿then 𝑀else 𝑁
|* + |*𝑀+|𝑀𝑁|for (𝑥𝑀)𝑁
| (
=𝑀) | 𝑀.ℓ |now
|query 𝑀|get 𝑀|insert 𝑀values 𝑁
|update (𝑥𝐿)where 𝑀set (
=𝑁)
|delete (𝑥𝑀)where 𝑁
Figure 1. Syntax of 𝜆LINQ
3.
We implement our constructs in the Links functional
web programming language, and describe a case study
based on curating COVID-19 data (§6).
Although the concepts behind translating temporal
queries and updates into non-temporal core languages are
well known [
33
], our core calculi
𝜆TLINQ
and
𝜆VLINQ
are novel
and aid us in showing (to the best of our knowledge) the rst
correctness results for the translations.
We relegate several details and all proofs to the appendices.
2 Background: Language-Integrated Query
We begin by introducing a basic
𝜆
-calculus, called
𝜆LINQ
, to
model language-integrated query in non-temporal databases.
Our calculus is based heavily on the Nested Relational Cal-
culus [
30
], with support for database modications heavily
inspired by the calculus of Fehrenbach and Cheney
[13]
. The
calculus uses a type-and-eect system to ensure database
accesses can occur in ‘safe’ places, i.e., that we do not attempt
to perform a modication operation in the middle of a query.
Eects include
read
(denoting a read from a database) and
write (denoting an update to the database).
Types
𝐴, 𝐵
include base types
𝐶
, eect-annotated function
types
𝐴𝐸𝐵
, unordered collection types
Bag(𝐴)
, record
types
(𝑖
:
𝐴𝑖)𝑖
denoting a record with labels
𝑖
and types
𝐴𝑖
,
and handles
Table(𝐴)
for tables containing records of type
𝐴
. A record is a base record if it contains only elds of base
type. We assume that the base types includes at least
Bool
and the Time type, which denotes (abstract) timestamps.
Basic terms include variables
𝑥
, constants
𝑐
, table handles
𝑡
, functions
𝜆𝑥.𝑀
, application
𝑀 𝑁
, n-ary operations
{
𝑀}
,
and conditionals
if 𝐿then 𝑀else 𝑁
. We assume that the
set of operations contains the usual unary and binary rela-
tion operators, as well as the
𝑛
-ary operations
greatest
and
least
on timestamps which return their largest and smallest
arguments respectively. We assume that the set of constants
contains timestamps
𝜄
of type
Time
, and two distinguished
timestamps
−∞
and
of type
Time
, which denote the mini-
mum and maximum timestamps respectively. The calculus
also includes the empty multiset constructor
* +
; the single-
ton multiset constructor
*𝑀+
; multiset union
𝑀𝑁
; and
comprehensions
for (𝑥𝑀)𝑁
. We write
*𝑀1, . . . , 𝑀𝑛+
as
Language-Integrated ery for Temporal Data Conference’17, July 2017, Washington, DC, USA
sugar for
*𝑀1+. . . *𝑀𝑛+
. We also have records
(𝑖=𝑀𝑖)𝑖
and projection 𝑀.. Term now retrieves the current time.
We write
let 𝑥=𝑀in 𝑁
as the usual syntactic sugar for
(𝜆𝑥.𝑁 )𝑀
, and
𝑀
;
𝑁
as sugar for
(𝜆𝑥.𝑁 )𝑀
for some fresh
𝑥
.
We also dene
where𝑀 𝑁
as sugar for
if 𝑀then 𝑁else * +
.
We denote unordered collections with a tilde (e.g.,
e
𝑀
), and
ordered sequences with an arrow (e.g.,
𝑀).
The
get 𝑀
term retrieves the contents of a table into
a bag;
insert 𝑀values 𝑁
inserts values
𝑁
into table
𝑀
;
update (𝑥𝐿)where 𝑀set (𝑖=𝑁𝑖)𝑖
updates table
𝐿
, updating the elds
𝑖
to
𝑁𝑖
of each record
𝑥
satisfying
predicate 𝑀. The delete (𝑥𝑀)where 𝑁term removes
those records 𝑥in table 𝑀satisfying predicate 𝑁.
2.1 Typing rules
Figure 2shows the typing rules for 𝜆LINQ; the typing judge-
ment has the shape
Γ𝑀:𝐴
!
𝐸
, which can be read, “Under
type environment
Γ
, term
𝑀
has type
𝐴
and produces ef-
fects
𝐸
. The rules are implicitly parameterised by a data-
base schema
Σ
mapping table names to types of the form
Bag((𝑖
:
𝐶𝑖)𝑖)
. Many rules are similar to those of the simply-
typed
𝜆
-calculus extended with monadic collection opera-
tions [
3
] and a set-based eect system [
23
], and such standard
rules are relegated to Appendix A.
Rule T-ery states that a term
query 𝑀
is well-typed if
𝑀
is of a query type: either a base type, a record type whose
elds are query types, or a bag whose elements are query
types. The term
𝑀
must also only have
read
eects. These
restrictions allow ecient compilation to SQL [6,8].
Rule T-Get states that
get 𝑀
has type
Bag(𝐴)
if
𝑀
has type
Table(𝐴)
, and produces the
read
eect. Rule T-
Table states that a table reference follows the type of
the table in the schema. T-Insert types a database insert
insert 𝑀values 𝑁
, requiring
𝑀
to be a table reference of
type
Table(𝐴)
, and the inserted values
𝑁
to be a bag of type
Bag(𝐴)
.T-Update ensures the predicate and update terms
are typable under an environment extended with the row
type, and ensures that all updated values match the type
expected by the row. Rule T-Delete is similar. All subterms
used as predicates or used to calculate updated terms must
be pure (that is, side-eect free), and all modications have
the write eect.
2.2 Semantics
Figure 3shows the syntax and typing rules of values, and
the big-step semantics of
𝜆LINQ
. Most rules are standard,
and presented in Appendix A. Values
𝑉 ,𝑊
include func-
tions, constants, tables, fully-evaluated records, and fully-
evaluated bags
*e
𝑉+
. Unlike the unary bag constructor
*𝑀+
,
fully-evaluated bags contain an unordered collection of val-
ues. All values are pure. We write
for record extension,
e.g., (1=𝑀)⊕(2=𝑁)=(1=𝑀, ℓ2=𝑁).
Since evaluation is eectful (as database operations can
update the database), the evaluation judgement has the shape
𝑀Δ,𝜄 (𝑉 , Δ)
; this can be read “term
𝑀
with current data-
base
Δ
at time
𝜄
evaluates to value
𝑉
with updated database
Δ
. A database is a mapping from table names to bags of
base records. To avoid additional complexity, we assume
evaluation is atomic and does not update the time; one could
straightforwardly update the semantics with a
tick
opera-
tion without aecting any results.
We use two further evaluation relations for terms which do
not write to the database:
𝑀
𝜄𝑉
states that a pure term
𝑀
(i.e., a term typable under an empty eect set) evaluates to
𝑉
.
Similarly,
𝑀
Δ,𝜄 𝑉
states that a term
𝑀
which only performs
the
read
eect evaluates to
𝑉
. We omit the denitions, which
are similar to the evaluation relation but do not propagate
database changes (since no changes can occur).
Rule E-Now returns the current timestamp. Rule E-ery
evaluates the body of a query using the read-only evaluation
relation. Rule E-Get evaluates its subject to a table reference,
and then returns the contents of a table. Rule E-Insert does
similar, evaluating the values to insert, and then appending
them to the contents of the table. Rule E-Update iterates
over a table, updating a record if the predicate matches, and
leaving it unmodied if not. Finally, E-Delete deletes those
rows satisfying the deletion predicate.
𝜆LINQ enjoys a standard type soundness property.
Proposition 2.1
(Type soundness)
.
If
· ⊢ 𝑀:𝐴
!
𝐸
then there
exists some
𝑉
and
Δ
such that
𝑀Δ,𝜄 (𝑉 , Δ)
and
· 𝑉:𝐴
!
.
More importantly, the type-and-eect system ensures that
query and update expressions in
𝜆LINQ
can be translated to
SQL equivalent, even in the presence of higher-order func-
tions and nested query results [
5
,
6
,
8
,
22
]. This alternative
implementation is equivalent to the semantics given here
but usually much more ecient since the database query
optimiser can takes advantage of any available integrity con-
straints or statistics about the data.
3 Transaction Time
The rst dimension of time we investigate is transaction
time [
34
], which records how the state of the database
changes over time. The key idea behind transaction time
databases is that update operations are non-destructive, so
we can always view a database as it stood at a particular
point in time.
Let us illustrate with the to-do list example from the in-
troduction. The original table is on the left. The table after
making some changes is shown on the right.
task done
Go shopping true
Cook dinner false
Walk the dog false
Watch TV false
task done
Go shopping true
Cook dinner true
Walk the dog false
Conference’17, July 2017, Washington, DC, USA Simon Fowler, Vashti Galpin, and James Cheney
Term typing
Γ𝑀:𝐴!𝐸
T-ery
Γ𝑀:Bag(𝐴)!𝐸 𝐴 :: QType 𝐸⊆ {read}
Γquery 𝑀:Bag(𝐴)!𝐸
T-Now
Γnow:Time !
T-Get
Γ𝑀:Table(𝐴)!𝐸
Γget 𝑀:Bag(𝐴)!{read} ∪ 𝐸
T-Insert
Γ𝑀:Table(𝐴)!𝐸Γ𝑁:Bag(𝐴)!
Γinsert 𝑀values 𝑁:() !{write} 𝐸
T-Update
Γ𝐿:Table(𝐴)!𝐸
𝐴=(𝑖:𝐵𝑖)𝑖𝐼Γ, 𝑥 :𝐴𝑀:Bool ! ( 𝑗𝐼Γ, 𝑥 :𝐴𝑁𝑗:𝐵𝑗!∅) 𝑗𝐽
Γupdate (𝑥𝐿)where 𝑀set (𝑗=𝑁𝑗)𝑗𝐽:() !{write} 𝐸
T-Delete
Γ𝑀:Table(𝐴)!𝐸Γ, 𝑥 :𝐴𝑁:Bool !
Γdelete (𝑥𝑀)where 𝑁:() !{write} 𝐸
Query typing
A :: QType
𝐶:: QType
(𝐴𝑖:: QType)𝑖
(𝑖:𝐴𝑖)𝑖:: QType
𝐴:: QType
Bag(𝐴):: QType
Figure 2. Typing rules for 𝜆LINQ (selected)
Syntax of values, operations on values, and value typing
Values 𝑉 , 𝑊 ::=𝜆𝑥 .𝑀 |𝑐|𝑡| (𝑖=𝑉𝑖)𝑖|*e
𝑉+
*e
𝑉+ˆ
*f
𝑊+*e
𝑉·f
𝑊+
( (𝑖=𝑉𝑖)𝑖𝐼with (𝑗=𝑊𝑗)) (𝑖=𝑉𝑖)𝑖𝐼\𝐽⊕ (𝑗=𝑊𝑗)𝑗𝐽
T-BagV
(Γ𝑉𝑖:𝐴!∅)𝑖
Γ*e
𝑉+:Bag(𝐴)!
Big-step reduction rules 𝑀Δ,𝜄 (𝑉 , Δ)
E-Now
now Δ,𝜄 (𝜄, Δ)
E-ery
𝑀
Δ,𝜄 𝑉
query 𝑀Δ,𝜄 (𝑉 , Δ)
E-Get
𝑀Δ,𝜄 (𝑡, Δ)
get 𝑀Δ,𝜄 (Δ(𝑡),Δ)
E-Insert
𝑀Δ,𝜄 (𝑡, Δ)𝑁
𝜄𝑉
insert 𝑀values 𝑁Δ,𝜄 ( (),Δ[𝑡↦→ Δ(𝑡)ˆ
𝑉])
E-Update
𝐿Δ,𝜄 (𝑡, Δ1)Δ2=Δ1[𝑡↦→ *upd(v) | vΔ1(𝑡)+]
upd(v)=((vwith
=𝑊)if 𝑀{v/𝑥} ⇓
𝜄true and (𝑁𝑖{v/𝑥} ⇓
𝜄𝑊𝑖)𝑖
vif 𝑀{v/𝑥} ⇓
𝜄false
update (𝑥𝐿)where 𝑀set (
=𝑁) ⇓Δ,𝜄 ( (),Δ2)
E-Delete
𝑀Δ,𝜄 (𝑡, Δ1)Δ2=Δ1[𝑡↦→ *vΔ(𝑡) | 𝑁{v/𝑥} ⇓
𝜄false+]
delete (𝑥𝑀)where 𝑁Δ,𝜄 ( (),Δ2)
Figure 3. Semantics of 𝜆LINQ (selected)
However, since updates and deletions in
𝜆LINQ
are destruc-
tive, we have lost the original data. Instead, let us see how
this could be handled by a transaction-time database:
task done start end
Go shopping true 11:00
Cook dinner false 11:00
Walk the dog false 11:00
Watch TV false 11:00
There are several methods by which we can maintain the
temporal information in the database: for example we could
maintain a tracking log which records each entry, or we
could use various temporal partitioning strategies [
33
]. In this
paper, we use a period-stamped representation, where each
record in the database is augmented with elds delimiting
the interval when the record was present in the database.
The time period is a closed-open representation, meaning
that each row is in the database from (and including) the
time stated in the start column, and is in the database up to
(but not including) the time stated in the end column. We
also assume that 𝑠𝑡𝑎𝑟𝑡 <𝑒𝑛𝑑 always holds.
Here, our database states that all four tasks were entered
into the database at 11:00. However, if we then decide to
check o “Cook dinner” at 17:30 and delete “Watch TV” at
19:00, we obtain the table shown in the introduction.
Since timestamps are either
or only refer to the past;
users do not modify period stamps directly; and the informa-
tion in the database grows monotonically, we can reconstruct
the state of the database at any given time.
3.1 Calculus
𝜆TLINQ
extends
𝜆LINQ
with native support for transaction time
operations; instead of performing destructive updates, we
adjust the end timestamp of aected rows and, if necessary,
insert updated rows.
𝜆TLINQ
database entries are therefore of
the form
𝑉[𝑉2,𝑉3)
1
, where
𝑉1
is the record data and
𝑉2
and
𝑉3
are the start and end timestamps.
Language-Integrated ery for Temporal Data Conference’17, July 2017, Washington, DC, USA
Additional Syntax for 𝜆TLINQ
Types 𝐴, 𝐵 ::=· · · | TransactionTime(𝐴)Timestamped rows 𝐷::=𝑉[𝑉2,𝑉3)
1
Terms 𝐿, 𝑀, 𝑁 ::=· · · | data 𝑀|start 𝑀|end 𝑀Values 𝑉 ,𝑊 ::=· · · | 𝐷
Modied Typing Rules for 𝜆TLINQ Γ𝑉:𝐴!𝐸Γ𝑀:𝐴!𝐸
T-Row
Γ𝑉1:𝐴!
Γ𝑉2:Time !Γ𝑉3:Time !
Γ𝑉[𝑉2,𝑉3)
1:TransactionTime(𝐴)!
T-Data
Γ𝑀:TransactionTime(𝐴)!𝐸
Γdata 𝑀:𝐴!𝐸
T-Start
Γ𝑀:TransactionTime(𝐴)!𝐸
Γstart 𝑀:Time !𝐸
T-End
Γ𝑀:TransactionTime(𝐴)!𝐸
Γend 𝑀:Time !𝐸
T-Get
Γ𝑀:Table(𝐴)!𝐸
Γget 𝑀:Bag(TransactionTime(𝐴)) !{read} 𝐸
Semantics for 𝜆TLINQ database operations 𝑀T
Δ,𝜄 (𝑉 , Δ)
ET-Data
𝑀T
Δ,𝜄 (𝑉[𝑉2,𝑉3)
1,Δ)
data 𝑀T
Δ,𝜄 (𝑉1,Δ)
ET-Start
𝑀T
Δ,𝜄 (𝑉[𝑉2,𝑉3)
1,Δ)
start 𝑀T
Δ,𝜄 (𝑉2,Δ)
ET-End
𝑀T
Δ,𝜄 (𝑉[𝑉2,𝑉3)
1,Δ)
end 𝑀T
Δ,𝜄 (𝑉3,Δ)
ET-Insert
𝑀T
Δ,𝜄 (𝑡, Δ1)𝑁
𝜄*e
𝑉+
vs =*v[𝜄,) |ve
𝑉+Δ2=Δ[𝑡↦→ Δ1(𝑡)ˆ
vs]
insert 𝑀values 𝑁T
Δ,𝜄 ( (),Δ2)
ET-Delete
𝑀T
Δ,𝜄 (𝑡, Δ1)Δ2=Δ1[𝑡↦→ *del(v) | vΔ1(𝑡)+]
del(data[start,end))=
(data[start,𝜄)if end =and 𝑁{data/𝑥} ⇓
𝜄true
data[start,end)otherwise
delete (𝑥𝑀)where 𝑁T
Δ,𝜄 ( (),Δ2)
ET-Update
𝐿T
Δ,𝜄 (𝑡, Δ1)Δ2=Δ1[𝑡↦→ ˆ
Ú*upd(v) | vΔ1(𝑡)+]
upd(data[start,end))=
*data[start,𝜄),(data with
=𝑉)[𝜄,) +
if 𝑀{data/𝑥} ⇓
𝜄true,(𝑁𝑖{data/𝑥} ⇓
𝜄𝑉𝑖)𝑖,and end =
*data[start,end)+otherwise
update (𝑥𝐿)where 𝑀set (
=𝑁) ⇓T
Δ,𝜄 ( (),Δ2)
Figure 4. Syntax, typing rules, and semantics of 𝜆TLINQ
Figure 4shows the syntax, typing rules, and semantics of
𝜆TLINQ
; for brevity, we show the main dierences to
𝜆LINQ
.
Period-stamped database rows are represented as triples
data[start,end)
with type
TransactionTime(𝐴)
, where
data
has
type
𝐴
(the type of each record), and both
start
and
end
have type
Time
. A row is currently present in the database
if its end value is
. We introduce three accessors:
data 𝑀
extracts the data record from a transaction-time row;
start𝑀
extracts the start time; and
end 𝑀
extracts the end time. The
get
construct has an updated type to show that it returns a
bag of
TransactionTime(𝐴)
values, rather than the records
directly. The typing rules for the other constructs remain the
same as in 𝜆LINQ.
The accessor rules ET-Data,ET-Start, and ET-End
project the expected component of the transaction-time row.
Rule ET-Insert period-stamps each record to begin at the
current time, and sets the end time to be
. Rule ET-Delete
records deletions for current rows satisfying the deletion
predicate. Instead of being removed from the database, the
end times of the aected rows are set to the current times-
tamp. Finally, rule ET-Update performs updates for current
rows satisfying the update predicate. Instead of changing a
record directly, the
upd
denition generates two records: the
previous record, closed o at the current timestamp, and the
new record with updated values, starting from the current
timestamp and with end eld
. Returning to our running
example, dene
at(tbl,time)
to return all records in
tbl
start-
ing before
time
and ending after
time
. We can then query
the database as it stood at 18:00:
at(t,time)
for (𝑥get t)
where (start 𝑥time time <end 𝑥)
*data 𝑥+
at(tbl,18:00)
task done
Go shopping true
Cook dinner true
Walk the dog false
Watch TV false
Let
current(
t
)=at(
t
,∞)
return the current snapshot of
𝑡
.
We can then query the current snapshot of the database:
current(tbl)=
task done
Go shopping true
Cook dinner true
Walk the dog false
3.2 Translation
We can implement the native transaction-time semantics
for
𝜆TLINQ
database operations by translation to
𝜆LINQ
. Our
translation adapts the SQL implementations of temporal
operations by Snodgrass
[33]
to a language-integrated query
setting. We prove correctness relative to the semantics.
𝜆TLINQ
has a native representation of period-stamped data,
whereas
𝜆LINQ
requires table types to be at. Consequently,
the translations require knowledge of the types of each
record. We therefore annotate each
𝜆TLINQ
database term
摘要:

Language-IntegratedQueryforTemporalData(Extendedversion)SimonFowlerUniversityofGlasgowUKsimon.fowler@glasgow.ac.ukVashtiGalpinUniversityofEdinburghUKvashti.galpin@ed.ac.ukJamesCheneyUniversityofEdinburghUKjcheney@inf.ed.ac.ukAbstractModernapplicationsoftenmanagetime-varyingdata.De-spitedecadesofrese...

展开>> 收起<<
Language-Integrated Query for Temporal Data.pdf

共31页,预览5页

还剩页未读, 继续阅读

声明:本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知玖贝云文库,我们立即给予删除!
分类:图书资源 价格:10玖币 属性:31 页 大小:3.38MB 格式:PDF 时间:2025-05-03

开通VIP享超值会员特权

  • 多端同步记录
  • 高速下载文档
  • 免费文档工具
  • 分享文档赚钱
  • 每日登录抽奖
  • 优质衍生服务
/ 31
客服
关注