Dierentially Private Stochastic Convex Optimization for Network Routing Applications Matthew Tsao

2025-05-06 0 0 2.54MB 31 页 10玖币
侵权投诉
Differentially Private Stochastic Convex Optimization for Network
Routing Applications
Matthew Tsao
Lyft Inc.
mtsao@lyft.com
Karthik Gopalakrishnan
Stanford University
gkarthik@stanford.edu
Kaidi Yang
National University of Singapore
kaidi.yang@nus.edu.sg
Marco Pavone
Stanford University
pavone@stanford.edu
October 27, 2022
Abstract
Network routing problems are common across many engineering applications. Computing
optimal routing policies requires knowledge about network demand, i.e., the origin and destina-
tion (OD) of all requests in the network. However, privacy considerations make it challenging to
share individual OD data that would be required for computing optimal policies. Privacy can
be particularly challenging in standard network routing problems because sources and sinks can
be easily identified from flow conservation constraints, making feasibility and privacy mutually
exclusive.
In this paper, we present a differentially private algorithm for network routing problems.
The main ingredient is a reformulation of network routing which moves all user data-dependent
parameters out of the constraint set and into the objective function. We then present an
algorithm for solving this formulation based on a differentially private variant of stochastic
gradient descent. In this algorithm, differential privacy is achieved by injecting noise, and one
may wonder if this noise injection compromises solution quality. We prove that our algorithm
is both differentially private and asymptotically optimal as the size of the training set goes
to infinity. We corroborate the theoretical results with numerical experiments on a road traffic
network which show that our algorithm provides differentially private and near-optimal solutions
in practice.
1 Introduction
Network routing problems appear in many important topics in engineering, including traffic routing
in transportation systems, power routing in electrical grids, and packet routing in distributed
computer systems. Network routing problems study settings where resources must be delivered to
customers through a network with limited bandwidth. The goal is typically to route resources to
their respective customers as efficiently as possible, or equivalently, with as little network congestion
as possible.
1
arXiv:2210.14489v1 [eess.SY] 26 Oct 2022
Private Convex Optimization for Network Routing Tsao, Gopalakrishnan, Yang, Pavone
One key challenge in network routing problems is that the requests (i.e., network demand) are
often not known in advance. Indeed, it is difficult to know exactly how much power a neighborhood
will need, or exactly how many visits a particular website will receive on any given day. Since
information about the demand is often necessary to develop optimal or near-optimal network routing
solutions, network routing algorithms often need a way of obtaining or estimating future demand.
With the advent of big data and internet-of-things systems, crowd-sourcing has gained popularity as
a demand forecasting approach for network routing systems. In crowd-sourcing, customers submit
their request history to the network operator. The network operator uses this historical data to
train a statistical or machine learning model to predict future demand from historical demand.
While crowd-sourcing provides a bountiful supply of data for training demand forecasting mod-
els, it can also introduce potential privacy concerns. Since crowd-sourcing uses individual-level
customer data to train demand forecasting models, the model’s outputs may reveal sensitive user
information, especially if it overfits to its training data [CTW+21]. Such privacy risks are problem-
atic because they may deter users from sharing their data with network operators, hence reducing
the supply of training data for demand forecasting models.
To address such concerns, the demand forecasting pipeline should be augmented with privacy-
preserving mechanisms. Differential privacy [DMNS06] is a principled and popular method to
occlude the influence a single user’s data on the result of a population study while also maintaining
the study’s accuracy. This is done by carefully injecting noise into the desired computation so that
data sets that differ by at most one data point will produce statistically indistinguishable results.
Providing differential privacy guarantees for the standard formulation of network routing is
difficult because the constraints contain user data, meaning that in general feasibility and privacy
become mutually exclusive. More specifically, in the standard network routing problem, the demand
sources and sinks can be identified by checking for conservation of flow, and as a result, the presence
of a user going to or from a very rare location can be detected from any feasible solution. Because
differential privacy requires that the presence of any single user’s data be difficult to detect from the
algorithm’s output, privacy and feasibility are at odds with one another in the standard formulation.
1.1 Statement of Contributions
In this paper we present a differentially private algorithm for network routing problems. The main
ingredient is a reformulation of network routing which moves all user data dependent parameters
out of the constraint set and into the objective function. We then present an algorithm for solving
this formulation based on differentially private variants of stochastic gradient descent. In this
algorithm, differential privacy is achieved by injecting noise, and one may wonder if this noise
injection compromises solution quality. We prove that our algorithm is differentially private and
under several reasonable regularity conditions, is also asymptotically optimal (as the size of the
training set goes to infinity). We note that in exchange for becoming compatible with differentially
private algorithms, this new formulation is more computationally expensive.
1.2 Related Work
Traffic assignment in transportation systems is one of the most well-known applications of network
routing. Herein we focus our literature review on privacy research in transportation networks.
Privacy works in transportation mainly focus on location privacy, where the objectives is to prevent
untrusted and/or external entities from learning geographic locations or location sequences of an
2
Private Convex Optimization for Network Routing Tsao, Gopalakrishnan, Yang, Pavone
individual [BS03]. Privacy-preserving approaches have been proposed for various location-based
applications, e.g., trajectory publishing, mobile crowdsensing, traffic control, etc. These techniques
are based on spatial cloaking [CML11] and differential privacy [Dwo08]. While not the setting of
interest for this paper, there are many works that use Secure Multi-Party Computation (MPC)
[GMW87] to achieve privacy in distributed mobility systems.
Spatial cloaking approaches aggregate users’ exact locations into coarse information. These
approaches are often based on k-anonymity [Swe02], where a mobility dataset is divided into equiv-
alence classes based on data attributes (e.g., geological regions, time, etc.) so that each class
contains at least krecords [GFCA14, HC20]. These k-anonymity-based approaches can guarantee
that every record in the dataset is indistinguishable from at least k1 other records. However,
k-anonymity is generally considered to be a weak privacy guarantee, especially when kis small.
Furthermore, very coarse data aggregation is required to address outliers or sparse data, and in
these cases spatial cloaking-based approaches provide low data accuracy.
Differential privacy provides a principled privacy guarantee by producing randomized responses
to queries, whereby two datasets that differ in only one entry produce statistically indistinguishable
responses [DMNS06]. In other words, differential privacy ensures that an adversary with arbitrary
background information (e.g., query responses, other entries) cannot infer individual entries with
high confidence. Within transportation research, [WHL+18, YLL+19] share noisy versions of lo-
cation data for mobile crowd-sensing applications. [GZFS15, GLTY18, AHFI+18, LYH20] use
differential privacy to publish noisy versions of trajectory data. [DKBS15] and [HTP17] apply
differential privacy to gradient descent algorithms for federated learning in mobility systems.
Many of the works mentioned in the previous paragraph establish differential privacy of their
proposed algorithms by using composition properties of differential privacy. Composition theorems
for differential privacy describe how well privacy is preserved when conducting several computations
one after another. In [DKBS15] and [HTP17], composition theorems are applied as black boxes
without considering the mathematical properties of the gradient descent algorithm. As a result,
the privacy guarantees are overly conservative, meaning that large amounts of noise are added to
the algorithm, leading to suboptimal behavior both in theory and in practice. Similarly, [GZFS15,
GLTY18, AHFI+18, LYH20] use composition rules as a black box, and while privacy is achieved in
this way, there are no accuracy guarantees for the algorithms presented in those works.
While blackbox applications of differential privacy can lead to impractical levels of noise in-
jection, specialized applications of differential privacy were discovered that could provide privacy
with much less noise. [WLK+17] show how a simple adjustment to stochastic gradient descent
can give rise to an algorithm which is both differentially private, and under reasonable regularity
assumptions, is also asymptotically optimal. [FMTT18] and [FKT20] refined this idea to develop
stochastic gradient descent algorithms that are differentially private, computationally efficient, and
have optimal convergence rates. These techniques cannot directly be used to solve the standard
formulation of network routing because they study unconstrained optimization problems or op-
timization problems with public constraint sets (i.e., constraints that do not include any private
data).
2 Model
In this section we define notations, network models, and assumptions that will allow us to formulate
network routing as a data-driven convex optimization problem.
3
Private Convex Optimization for Network Routing Tsao, Gopalakrishnan, Yang, Pavone
2.1 Preliminaries
Indicator Representation of Edge Sets: Let G= (V, E) be a graph with vertices Vand edges
E={e1, ..., em}. For any subset of edges E0E, we define the indicator representation of E0as
1[E0]as a boolean vector of length min the following way: The ith entry of 1[E0]is 1 if eiE0,
and is 0 otherwise.
Derivative Notation: For a scalar valued function x7→ f(x), we use f(x0) to denote the gradi-
ent of the fwith respect to xevaluated at the point x=x0. For a vector valued function x7→ g(x),
and a variable z, we use Dz[g](x0) to denote the derivative matrix of gwith respect to zevaluated
at the point x=x0.
Projections: For a convex set S Rd, we use ΠS:RdRdto denote the projection operator
for S. For any xRd, ΠS(x) := arg mins∈S ||xs||2to be the point in Sthat is closest to x.
2.2 Network, Demand, and Network Flow
In this section we will introduce a) a graph model of a network, b) a representation of network
demand, c) the standard formulation for network routing and d) privacy requirements. The notation
defined in this section is aggregated in table form in Section A for the reader’s convenience.
Definition 1 (Network Representation).We use a directed graph G= (V, E) to represent the
network, where Vand Erepresent the set of vertices and edges in the network, respectively. We
will use n:= |V|and m:= |E|to denote the number of vertices and edges in the graph, respectively.
For vertex pairs (o, d)V×Vwe will use PG(o, d) to denote the set of simple paths from oto d
in G.
Definition 2 (Operation Period).We use T:= [tstart, tend] to denote the operation period within
which a network operator wants to optimize its routing decisions. We will also use Tto denote the
number of minutes in the operation period. For example, tstart = 8 : 00am, tend = 9 : 00am could
represent a morning commute period where T= 60.
Definition 3 (Demand Representation).We study a stationary demand model where demand
within the operation period Tis specified by a matrix Λ Rn×n
+. For each ordered pair of vertices
(o, d)V×V, Λ(o, d) is the number of requests arriving per minute (i.e., the arrival rate) during
Tthat need routing from vertex oto vertex d.
Remark 1 (Estimating Λ from historical data).The arrival rates from historical demand are
computed empirically, i.e., if Λtrepresents the demand for day t, then Λt(o, d) is calculated by
counting the number of (o, d) requests appearing on day t, and then dividing it by Tto obtain
requests per minute.
Definition 4 (Link Latency Functions).To model congestion effects, each link eEhas a latency
function fe:R+R+which specifies the average travel time through the link as a function of the
total flow on the link.
In this paper we study a setting where a network operator wants to route demand while min-
imizing the total travel time for the requests. With these definitions, the standard formulation of
minimum travle time network routing is described in Definition 5.
4
Private Convex Optimization for Network Routing Tsao, Gopalakrishnan, Yang, Pavone
Definition 5 (Standard Formulation of Network Flow).For a network G= (V, E) with link latency
functions {fe}eEand demand Λ, the standard network flow problem is the following minimization
program:
minimize
xX
eE
yefe(ye) (1a)
subject to x=nx(o,d)o(o,d)V×V,(1b)
ye=X
(o,d)V×V
x(o,d)
efor all eE, (1c)
X
v:(u,v)E
x(o,d)
(u,v)X
w:(w,u)E
x(o,d)
(w,u)= Λ(o, d)1[u=o]Λ(o, d)1[u=d]for all uV, (o, d)V×V. (1d)
In (1), the decision variable xis a collection of flows x(o,d)(o,d)V×V, one for each non-zero
entry of Λ, as represented by constraint (1b). (1d) are flow conservation constraints to ensure that
x(o,d)sends Λ(o, d) units of flow from oto d. Constraints (1c) ensure that {ye}eErepresents the
total amount of flow on each edge. Finally, the objective function (1a) is to minimize the total
travel time as a function of total network flow.
In the next subsection we will describe the rigorous privacy requirements that we will mandate
while designing algorithms for network flow. We then describe in Section 2.4 why privacy and
feasibility are mutually exclusive in the standard network flow formulation.
2.3 Privacy Requirements
We will use differential privacy to reason about the privacy-preserving properties of our algorithms.
At a high level, changing one data point of the input to a differentially private algorithm should
lead to a statistically indistinguishable change in the output. To make this concept concrete we
will need to define data sets and adjacency.
Definition 6 (Data sets).Given a space of data points Z, a data set Lis any finite set of elements
from Z. In practice, each element of a data set is data collected from a single user, or data collected
during a specific time period. We will use Lto denote the set of all data sets.
Definition 7 (Data Set Adjacency).Given a space of data sets L, an adjacency relation Adj is
a mapping Adj : L × L {0,1}. Two data sets L1, L2are said to be adjacent if and only if
Adj(L1, L2) = 1.
While the exact definition can vary across applications, adjacent data sets are sets that are
very similar to one another. The most common definition of adjacency is the following: L, L0are
adjacent if and only if L0can be obtained by adding, deleting, or modifying at most one data point
from L, and vice versa. Thus comparing a function’s output on adjacent data sets measures how
sensitive the function is to changes in a single data point.
With these definitions in place, we are now ready to define differential privacy.
Definition 8 (Differential Privacy).For a given adjacency relation Adj, a function M:L→Xis
(, δ)-differentially private if for any L1, L2∈ L with Adj(L1, L2) = 1, the following inequality
P[M(L1)E]eP[M(L2)E] + δ
5
摘要:

Di erentiallyPrivateStochasticConvexOptimizationforNetworkRoutingApplicationsMatthewTsaoLyftInc.mtsao@lyft.comKarthikGopalakrishnanStanfordUniversitygkarthik@stanford.eduKaidiYangNationalUniversityofSingaporekaidi.yang@nus.edu.sgMarcoPavoneStanfordUniversitypavone@stanford.eduOctober27,2022AbstractN...

展开>> 收起<<
Dierentially Private Stochastic Convex Optimization for Network Routing Applications Matthew Tsao.pdf

共31页,预览5页

还剩页未读, 继续阅读

声明:本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知玖贝云文库,我们立即给予删除!
分类:图书资源 价格:10玖币 属性:31 页 大小:2.54MB 格式:PDF 时间:2025-05-06

开通VIP享超值会员特权

  • 多端同步记录
  • 高速下载文档
  • 免费文档工具
  • 分享文档赚钱
  • 每日登录抽奖
  • 优质衍生服务
/ 31
客服
关注