Dierentially Private Stochastic Convex Optimization for Network Routing Applications Matthew Tsao

2025-05-06 0 0 2.54MB 31 页 10玖币

侵权投诉

Diﬀerentially Private Stochastic Convex Optimization for Network

Routing Applications

Matthew Tsao

Lyft Inc.

mtsao@lyft.com

Karthik Gopalakrishnan

Stanford University

gkarthik@stanford.edu

Kaidi Yang

National University of Singapore

kaidi.yang@nus.edu.sg

Marco Pavone

Stanford University

pavone@stanford.edu

October 27, 2022

Abstract

Network routing problems are common across many engineering applications. Computing

optimal routing policies requires knowledge about network demand, i.e., the origin and destina-

tion (OD) of all requests in the network. However, privacy considerations make it challenging to

share individual OD data that would be required for computing optimal policies. Privacy can

be particularly challenging in standard network routing problems because sources and sinks can

be easily identiﬁed from ﬂow conservation constraints, making feasibility and privacy mutually

exclusive.

In this paper, we present a diﬀerentially private algorithm for network routing problems.

The main ingredient is a reformulation of network routing which moves all user data-dependent

parameters out of the constraint set and into the objective function. We then present an

algorithm for solving this formulation based on a diﬀerentially private variant of stochastic

gradient descent. In this algorithm, diﬀerential privacy is achieved by injecting noise, and one

may wonder if this noise injection compromises solution quality. We prove that our algorithm

is both diﬀerentially private and asymptotically optimal as the size of the training set goes

to inﬁnity. We corroborate the theoretical results with numerical experiments on a road traﬃc

network which show that our algorithm provides diﬀerentially private and near-optimal solutions

in practice.

1 Introduction

Network routing problems appear in many important topics in engineering, including traﬃc routing

in transportation systems, power routing in electrical grids, and packet routing in distributed

computer systems. Network routing problems study settings where resources must be delivered to

customers through a network with limited bandwidth. The goal is typically to route resources to

their respective customers as eﬃciently as possible, or equivalently, with as little network congestion

as possible.

arXiv:2210.14489v1 [eess.SY] 26 Oct 2022

Private Convex Optimization for Network Routing Tsao, Gopalakrishnan, Yang, Pavone

One key challenge in network routing problems is that the requests (i.e., network demand) are

often not known in advance. Indeed, it is diﬃcult to know exactly how much power a neighborhood

will need, or exactly how many visits a particular website will receive on any given day. Since

information about the demand is often necessary to develop optimal or near-optimal network routing

solutions, network routing algorithms often need a way of obtaining or estimating future demand.

With the advent of big data and internet-of-things systems, crowd-sourcing has gained popularity as

a demand forecasting approach for network routing systems. In crowd-sourcing, customers submit

their request history to the network operator. The network operator uses this historical data to

train a statistical or machine learning model to predict future demand from historical demand.

While crowd-sourcing provides a bountiful supply of data for training demand forecasting mod-

els, it can also introduce potential privacy concerns. Since crowd-sourcing uses individual-level

customer data to train demand forecasting models, the model’s outputs may reveal sensitive user

information, especially if it overﬁts to its training data [CTW+21]. Such privacy risks are problem-

atic because they may deter users from sharing their data with network operators, hence reducing

the supply of training data for demand forecasting models.

To address such concerns, the demand forecasting pipeline should be augmented with privacy-

preserving mechanisms. Diﬀerential privacy [DMNS06] is a principled and popular method to

occlude the inﬂuence a single user’s data on the result of a population study while also maintaining

the study’s accuracy. This is done by carefully injecting noise into the desired computation so that

data sets that diﬀer by at most one data point will produce statistically indistinguishable results.

Providing diﬀerential privacy guarantees for the standard formulation of network routing is

diﬃcult because the constraints contain user data, meaning that in general feasibility and privacy

become mutually exclusive. More speciﬁcally, in the standard network routing problem, the demand

sources and sinks can be identiﬁed by checking for conservation of ﬂow, and as a result, the presence

of a user going to or from a very rare location can be detected from any feasible solution. Because

diﬀerential privacy requires that the presence of any single user’s data be diﬃcult to detect from the

algorithm’s output, privacy and feasibility are at odds with one another in the standard formulation.

1.1 Statement of Contributions

In this paper we present a diﬀerentially private algorithm for network routing problems. The main

ingredient is a reformulation of network routing which moves all user data dependent parameters

out of the constraint set and into the objective function. We then present an algorithm for solving

this formulation based on diﬀerentially private variants of stochastic gradient descent. In this

algorithm, diﬀerential privacy is achieved by injecting noise, and one may wonder if this noise

injection compromises solution quality. We prove that our algorithm is diﬀerentially private and

under several reasonable regularity conditions, is also asymptotically optimal (as the size of the

training set goes to inﬁnity). We note that in exchange for becoming compatible with diﬀerentially

private algorithms, this new formulation is more computationally expensive.

1.2 Related Work

Traﬃc assignment in transportation systems is one of the most well-known applications of network

routing. Herein we focus our literature review on privacy research in transportation networks.

Privacy works in transportation mainly focus on location privacy, where the objectives is to prevent

untrusted and/or external entities from learning geographic locations or location sequences of an

Private Convex Optimization for Network Routing Tsao, Gopalakrishnan, Yang, Pavone

individual [BS03]. Privacy-preserving approaches have been proposed for various location-based

applications, e.g., trajectory publishing, mobile crowdsensing, traﬃc control, etc. These techniques

are based on spatial cloaking [CML11] and diﬀerential privacy [Dwo08]. While not the setting of

interest for this paper, there are many works that use Secure Multi-Party Computation (MPC)

[GMW87] to achieve privacy in distributed mobility systems.

Spatial cloaking approaches aggregate users’ exact locations into coarse information. These

approaches are often based on k-anonymity [Swe02], where a mobility dataset is divided into equiv-

alence classes based on data attributes (e.g., geological regions, time, etc.) so that each class

contains at least krecords [GFCA14, HC20]. These k-anonymity-based approaches can guarantee

that every record in the dataset is indistinguishable from at least k−1 other records. However,

k-anonymity is generally considered to be a weak privacy guarantee, especially when kis small.

Furthermore, very coarse data aggregation is required to address outliers or sparse data, and in

these cases spatial cloaking-based approaches provide low data accuracy.

Diﬀerential privacy provides a principled privacy guarantee by producing randomized responses

to queries, whereby two datasets that diﬀer in only one entry produce statistically indistinguishable

responses [DMNS06]. In other words, diﬀerential privacy ensures that an adversary with arbitrary

background information (e.g., query responses, other entries) cannot infer individual entries with

high conﬁdence. Within transportation research, [WHL+18, YLL+19] share noisy versions of lo-

cation data for mobile crowd-sensing applications. [GZFS15, GLTY18, AHFI+18, LYH20] use

diﬀerential privacy to publish noisy versions of trajectory data. [DKBS15] and [HTP17] apply

diﬀerential privacy to gradient descent algorithms for federated learning in mobility systems.

Many of the works mentioned in the previous paragraph establish diﬀerential privacy of their

proposed algorithms by using composition properties of diﬀerential privacy. Composition theorems

for diﬀerential privacy describe how well privacy is preserved when conducting several computations

one after another. In [DKBS15] and [HTP17], composition theorems are applied as black boxes

without considering the mathematical properties of the gradient descent algorithm. As a result,

the privacy guarantees are overly conservative, meaning that large amounts of noise are added to

the algorithm, leading to suboptimal behavior both in theory and in practice. Similarly, [GZFS15,

GLTY18, AHFI+18, LYH20] use composition rules as a black box, and while privacy is achieved in

this way, there are no accuracy guarantees for the algorithms presented in those works.

While blackbox applications of diﬀerential privacy can lead to impractical levels of noise in-

jection, specialized applications of diﬀerential privacy were discovered that could provide privacy

with much less noise. [WLK+17] show how a simple adjustment to stochastic gradient descent

can give rise to an algorithm which is both diﬀerentially private, and under reasonable regularity

assumptions, is also asymptotically optimal. [FMTT18] and [FKT20] reﬁned this idea to develop

stochastic gradient descent algorithms that are diﬀerentially private, computationally eﬃcient, and

have optimal convergence rates. These techniques cannot directly be used to solve the standard

formulation of network routing because they study unconstrained optimization problems or op-

timization problems with public constraint sets (i.e., constraints that do not include any private

data).

2 Model

In this section we deﬁne notations, network models, and assumptions that will allow us to formulate

network routing as a data-driven convex optimization problem.

Private Convex Optimization for Network Routing Tsao, Gopalakrishnan, Yang, Pavone

2.1 Preliminaries

Indicator Representation of Edge Sets: Let G= (V, E) be a graph with vertices Vand edges

E={e1, ..., em}. For any subset of edges E0⊂E, we deﬁne the indicator representation of E0as

1[E0]as a boolean vector of length min the following way: The ith entry of 1[E0]is 1 if ei∈E0,

and is 0 otherwise.

Derivative Notation: For a scalar valued function x7→ f(x), we use ∇f(x0) to denote the gradi-

ent of the fwith respect to xevaluated at the point x=x0. For a vector valued function x7→ g(x),

and a variable z, we use Dz[g](x0) to denote the derivative matrix of gwith respect to zevaluated

at the point x=x0.

Projections: For a convex set S ⊂ Rd, we use ΠS:Rd→Rdto denote the projection operator

for S. For any x∈Rd, ΠS(x) := arg mins∈S ||x−s||2to be the point in Sthat is closest to x.

2.2 Network, Demand, and Network Flow

In this section we will introduce a) a graph model of a network, b) a representation of network

demand, c) the standard formulation for network routing and d) privacy requirements. The notation

deﬁned in this section is aggregated in table form in Section A for the reader’s convenience.

Deﬁnition 1 (Network Representation).We use a directed graph G= (V, E) to represent the

network, where Vand Erepresent the set of vertices and edges in the network, respectively. We

will use n:= |V|and m:= |E|to denote the number of vertices and edges in the graph, respectively.

For vertex pairs (o, d)∈V×Vwe will use PG(o, d) to denote the set of simple paths from oto d

in G.

Deﬁnition 2 (Operation Period).We use T:= [tstart, tend] to denote the operation period within

which a network operator wants to optimize its routing decisions. We will also use Tto denote the

number of minutes in the operation period. For example, tstart = 8 : 00am, tend = 9 : 00am could

represent a morning commute period where T= 60.

Deﬁnition 3 (Demand Representation).We study a stationary demand model where demand

within the operation period Tis speciﬁed by a matrix Λ ∈Rn×n

+. For each ordered pair of vertices

(o, d)∈V×V, Λ(o, d) is the number of requests arriving per minute (i.e., the arrival rate) during

Tthat need routing from vertex oto vertex d.

Remark 1 (Estimating Λ from historical data).The arrival rates from historical demand are

computed empirically, i.e., if Λtrepresents the demand for day t, then Λt(o, d) is calculated by

counting the number of (o, d) requests appearing on day t, and then dividing it by Tto obtain

requests per minute.

Deﬁnition 4 (Link Latency Functions).To model congestion eﬀects, each link e∈Ehas a latency

function fe:R+→R+which speciﬁes the average travel time through the link as a function of the

total ﬂow on the link.

In this paper we study a setting where a network operator wants to route demand while min-

imizing the total travel time for the requests. With these deﬁnitions, the standard formulation of

minimum travle time network routing is described in Deﬁnition 5.

Private Convex Optimization for Network Routing Tsao, Gopalakrishnan, Yang, Pavone

Deﬁnition 5 (Standard Formulation of Network Flow).For a network G= (V, E) with link latency

functions {fe}e∈Eand demand Λ, the standard network ﬂow problem is the following minimization

program:

minimize

e∈E

yefe(ye) (1a)

subject to x=nx(o,d)o(o,d)∈V×V,(1b)

ye=X

(o,d)∈V×V

x(o,d)

efor all e∈E, (1c)

v:(u,v)∈E

x(o,d)

(u,v)−X

w:(w,u)∈E

x(o,d)

(w,u)= Λ(o, d)1[u=o]−Λ(o, d)1[u=d]for all u∈V, (o, d)∈V×V. (1d)

In (1), the decision variable xis a collection of ﬂows x(o,d)(o,d)∈V×V, one for each non-zero

entry of Λ, as represented by constraint (1b). (1d) are ﬂow conservation constraints to ensure that

x(o,d)sends Λ(o, d) units of ﬂow from oto d. Constraints (1c) ensure that {ye}e∈Erepresents the

total amount of ﬂow on each edge. Finally, the objective function (1a) is to minimize the total

travel time as a function of total network ﬂow.

In the next subsection we will describe the rigorous privacy requirements that we will mandate

while designing algorithms for network ﬂow. We then describe in Section 2.4 why privacy and

feasibility are mutually exclusive in the standard network ﬂow formulation.

2.3 Privacy Requirements

We will use diﬀerential privacy to reason about the privacy-preserving properties of our algorithms.

At a high level, changing one data point of the input to a diﬀerentially private algorithm should

lead to a statistically indistinguishable change in the output. To make this concept concrete we

will need to deﬁne data sets and adjacency.

Deﬁnition 6 (Data sets).Given a space of data points Z, a data set Lis any ﬁnite set of elements

from Z. In practice, each element of a data set is data collected from a single user, or data collected

during a speciﬁc time period. We will use Lto denote the set of all data sets.

Deﬁnition 7 (Data Set Adjacency).Given a space of data sets L, an adjacency relation Adj is

a mapping Adj : L × L → {0,1}. Two data sets L1, L2are said to be adjacent if and only if

Adj(L1, L2) = 1.

While the exact deﬁnition can vary across applications, adjacent data sets are sets that are

very similar to one another. The most common deﬁnition of adjacency is the following: L, L0are

adjacent if and only if L0can be obtained by adding, deleting, or modifying at most one data point

from L, and vice versa. Thus comparing a function’s output on adjacent data sets measures how

sensitive the function is to changes in a single data point.

With these deﬁnitions in place, we are now ready to deﬁne diﬀerential privacy.

Deﬁnition 8 (Diﬀerential Privacy).For a given adjacency relation Adj, a function M:L→Xis

(, δ)-diﬀerentially private if for any L1, L2∈ L with Adj(L1, L2) = 1, the following inequality

P[M(L1)∈E]≤eP[M(L2)∈E] + δ

文档加载中……请稍候！
如果长时间未打开，您也可以点击刷新试试。

下载文档到电脑，查找使用更方便

10 玖币 0人已下载

立即下载

摘要：

DierentiallyPrivateStochasticConvexOptimizationforNetworkRoutingApplicationsMatthewTsaoLyftInc.mtsao@lyft.comKarthikGopalakrishnanStanfordUniversitygkarthik@stanford.eduKaidiYangNationalUniversityofSingaporekaidi.yang@nus.edu.sgMarcoPavoneStanfordUniversitypavone@stanford.eduOctober27,2022AbstractN...

展开>> 收起<<

Dierentially Private Stochastic Convex Optimization for Network Routing Applications Matthew Tsao.pdf

共31页,预览5页

还剩页未读，继续阅读

声明：本站为文档C2C交易模式，即用户上传的文档直接被用户下载，本站只是中间服务平台，本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私，请立即通知玖贝云文库，我们立即给予删除！

Dierentially Private Stochastic Convex Optimization for Network Routing Applications Matthew Tsao

相关推荐

开通VIP享超值会员特权

作者详情

相关内容

热门标签

举报选择: