Online Pen Testing Mingda Qiao and Gregory Valiant fmqiaovaliantgstanford.edu

2025-05-02 0 0 557.72KB 31 页 10玖币
侵权投诉
Online Pen Testing
Mingda Qiao and Gregory Valiant
{mqiao,valiant}@stanford.edu
Stanford University
Abstract
We study a “pen testing” problem, in which we are given npens with unknown amounts of
ink X1, X2, . . . , Xn, and we want to choose a pen with the maximum amount of remaining ink
in it. The challenge is that we cannot access each Xidirectly; we only get to write with the i-th
pen until either a certain amount of ink is used, or the pen runs out of ink. In both cases, this
testing reduces the remaining ink in the pen and thus the utility of selecting it.
Despite this significant lack of information, we show that it is possible to approximately
maximize our utility up to an O(log n) factor. Formally, we consider two different setups: the
“prophet” setting, in which each Xiis independently drawn from some distribution Di, and the
“secretary” setting, in which (Xi)n
i=1 is a random permutation of arbitrary a1, a2, . . . , an. We
derive the optimal competitive ratios in both settings up to constant factors. Our algorithms
are surprisingly robust: (1) In the prophet setting, we only require one sample from each Di,
rather than a full description of the distribution; (2) In the secretary setting, the algorithm also
succeeds under an arbitrary permutation, if an estimate of the maximum aiis given.
Our techniques include a non-trivial online sampling scheme from a sequence with an un-
known length, as well as the construction of a hard, non-uniform distribution over permutations.
Both might be of independent interest. We also highlight some immediate open problems and
discuss several directions for future research.
1 Introduction
Suppose that we have a few whiteboard pens to choose from for an upcoming presentation. We
want to maximize the amount of remaining ink in the pen we pick, measured in writing time.
Naturally, before we make our decision, we write with each pen for a short while to check whether
the ink has almost run out. We face a dilemma regarding how long each pen should be tested. If
we use the pen for just five seconds, we could not distinguish whether it had ten seconds or twenty
minutes of writing time at the beginning. At the other extreme, too long a test period may exhaust
the ink in the pen, leaving us too little ink for the actual writing.
This toy problem models scenarios such as testing the service life of a flimsy spare part, and
more generally, other real-world decision-making in which obtaining information about each option
We would like to thank Ian Tullis, Petr Mitrichev, and the entire problem setting team of Google Code Jam 2020
for writing and preparing the problem titled Pen Testing [TM20], which inspired this work. We thank the anonymous
reviewers for their comments that have helped improve this paper. This work was supported by NSF awards 1813049,
1704417 and 1804222, and DOE award DE-SC0019205.
1
arXiv:2210.00655v2 [cs.DS] 22 Nov 2022
inevitably reduces the utility of the option. For example, we want to invest in one of nstart-ups
with unknown growth potentials. We could, of course, watch from the sidelines for a while, and see
whether the total value of each company has grown to a certain point (e.g., twice its initial value).
However, we would have a lower return since our investment in the company only starts at this
point, and the 2x increase does not count towards our profit.
To our knowledge, this “pen testing” problem first appeared as a competitive programming
problem, written by Ian Tullis and prepared by Petr Mitrichev, in Google Code Jam 2020 [TM20].
They considered the case that n= 15 pens hold 0,1,2, . . . , n 1 units of ink respectively, but are
presented to us after a random shuffling. We can test the pens in an arbitrary order, and possibly
go back to a pen that we tested earlier for further testing, if this is deemed necessary. Finally, we
are asked to choose two of the pens, and we win the game if the total units of remaining ink in
them is at least n.
If we randomly pick two pens without any testing, our winning probability is clearly below
50%. Surprisingly, it was shown by [TM20] that at n= 15, a better strategy wins the game with a
higher probability of 64.4%! This strategy is computed by dynamic programming, in which each
state simply consists of all the information that we obtain from testing: the amount of ink that has
been used from each pen, and whether each pen has run out or not. For larger n, however, this
approach would necessarily result in an exponential runtime. Furthermore, in the general case that
the amounts of ink in the pens are no longer a permutation of (0,1,2, . . . , n 1), it is difficult to
analyze how the optimal solution computed by this dynamic programming scales asymptotically.
In this work, we formulate and study an online version of this pen testing problem, in which
we select only one of the noptions, and both the testing and decision are subject to an additional
temporal restriction—we must review the noptions in the given order. For each option, we are
allowed to test it to some extent, during which the value of the option also decreases. Then, we
need to make an irrevocable decision on whether to accept the option—once we accept, the entire
game ends and we cannot explore the remaining options; once we reject an option, we can no longer
go back to it if the later options appear less ideal.
This online pen testing problem that we consider is closely related to the theory of optimal
stopping, in which the player is often assumed to observe the value of each option directly. In the
single-choice case that we focus on, two well-studied settings are the prophet inequality and the
secretary problem. In the former, the values are assumed to be drawn independently from ngiven
distributions. In the secretary problem, the options can have arbitrary values but the noptions are
assumed to arrive in a uniformly random order. We discuss the connection between our work and
this literature in Section 1.4. In this paper, we study the online pen testing problem in settings
similar to these two problems, and derive the optimal guarantee that the player can achieve under
minimal assumptions on the option values.
Awerbuch, Azar, Fiat, and Leighton [AAFL96] studied a closely related and more general
setting: A decision-maker may hold at most one of ncommodities on each day. At the end of the
day, each commodity issues a dividend of either 0 or 1 to its holders. The goal is to achieve a
total profit comparable to the dividend issued by the best commodity, by switching between the
commodities as few times as possible. In Remark 1.3 we discuss how pen testing can be realized
as a special case of this setting. We discuss this connection further when describing our results
(Section 1.2) and techniques (Section 1.3).
2
1.1 Problem Setup
We first define the online pen testing problem formally.
Definition 1 (Online pen testing).A problem instance is specified by X1, X2, . . . , Xn0. At each
step i[n], the player first tests Xiand then makes a decision:
(Testing) The player picks threshold θi[0,+]. If Xi> θi, the test passes; the test fails
if Xiθi, in which case the player observes Xi. The remaining utility of option ibecomes
X0
i= max{Xiθi,0}.
(Decision) After seeing whether the test passes, the player either accepts or rejects the i-th
option irrevocably. If the player accepts, the game ends and the player receives a score of X0
i.
Remark 1.1. The player may pick threshold θi= +, in which case the player gets to observe
Xiat the cost of leaving a remaining utility of X0
i= 0.
Remark 1.2. Our definition allows a more general testing procedure, in which the player performs
ttests sequentially at chosen thresholds θ(1)
i, θ(2)
i, . . . , θ(t)
i0. This is equivalent to running a single
test at threshold θi=θ(1)
i+θ(2)
i+··· +θ(t)
i.
Remark 1.3. The problem can be viewed as a special case of the setting studied by [AAFL96],
where no switching is allowed and the commodities issue their dividend sequentially. Assuming that
X1, X2, . . . , Xnare all integers, the online pen testing problem corresponds to an instance where the
i-th commodity issues a unit dividend on Xiconsecutive days starting from day number 1+Pi1
j=1 Xj,
and zero dividend on each of the other days.
This problem can be viewed as a variant of the well-studied optimal stopping problem in which
information is both limited and costly. For each option i, we either: (1) receive a single bit of
information (namely, that Xi> θiholds) at the cost of reducing the value of the option by θi; or:
(2) observe Xiexactly when Xiθi, at the cost of losing all the utility in option i.
Without any assumptions on X1, . . . , Xn, no non-trivial guarantee on the player’s score can
be made.1In this paper, we consider the following two setups: the “prophet” setting and the
“secretary” setting, both of which make some distributional assumption on the instance. In the
following, we formally define the settings and the notion of competitive ratio in each of them.
Definition 2 (Prophet setting).The player is given information about distributions D1,D2,...,Dn
over [0,+), from which the nvalues X1, X2, . . . , Xnare drawn independently.
The player is α-competitive if its expected score, over the randomness in the distributional
information, the generation of (Xi)n
i=1, and the player itself, is at least 1
α·EX∼D maxi[n]Xi.
Remark 1.4. Formally, each Diis defined by a cumulative distribution function Fi:R[0,1]
that is non-decreasing, right-continuous, and satisfies limx+Fi(x)=1and Fi(x)=0for x <
0. The resulting Disatisfies PrXi∼Di[Xix] = Fi(x). We assume that each Dihas a finite
expectation, i.e., the integral R+
0[1Fi(x)] dxconverges, which implies that the expected maximum,
EX∼D maxi[n]Xi, is also finite.
We will sometimes assume for simplicity that each Diis continuous, i.e., the corresponding Fi(x)
is continuous. In other words, Dihas no point masses. The general case can be handled using a
1This is true even if the player can observe the value Xidirectly.
3
simple reduction (see e.g., [RWW20]). The continuity of Figuarantees that for any α(0,1], we
may define the (1 α)-quantile of Dias the minimum number τthat satisfies PrXi∼Di[Xi> τ ] =
1Fi(τ) = α.
In the prophet setting, the values of different options are drawn independently from distributions
D1,D2,...,Dn. At the beginning of the game, the player is given certain information about the
distributions.2We consider both the case that the player receives a complete description of (Di)n
i=1,
and the case where the player sees one sample ˆ
Xidrawn from each Di. In the latter case, the
observed sample ˆ
Xiis independent from the actual value Xi, and the expected score of the player
is defined over the randomness in ( ˆ
Xi)n
i=1 as well. Finally, the player’s score is compared to that of
an omniscient prophet that knows the realization of X1, . . . , Xnand always picks the highest one.
Definition 3 (Secretary setting).The player is given information about a1, a2, . . . , an0. The
nvalues X1, X2, . . . , Xnare guaranteed to be a permutation, either uniformly random or arbitrary,
of a1, a2, . . . , an.
The player is α-competitive in the random order case if its expected score, over the randomness
in both the permutation and the player itself, is at least 1
α·maxi[n]ai. The player is α-competitive
in the arbitrary order case if for any permutation (Xi)n
i=1 of (ai)n
i=1, the player’s expected score,
over the randomness in the player itself, is at least 1
α·maxi[n]ai.
Note that an α-competitive player for the arbitrary order case is also α-competitive under
a random arrival order. We consider the following three forms of information provided to the
player, in decreasing order of helpfulness: (1) full information, the player is given a1, a2, . . . , an;
(2) optimum information, the player is given maxi[n]ai; (3) no information, the player is given
nothing.
With full or optimum information, if the player could observe each Xidirectly, it would be
easy to achieve a utiliy of maxi[n]ai—simply accept option ionly if Xiis equal to this maximum.
This is, however, not true for the online pen testing problem. For example, when (a1, a2, . . . , an) =
(1,2, . . . , n), the player can only ensure that Xi=nby setting the threshold θito n1, but this
would leave a remaining utility of merely 1.
1.2 Our Results
We obtain the optimal competitive ratios (modulo constant factors) for online pen testing, under
different variants of the prophet and secretary settings defined above.
A simple lower bound. The following example shows that even when the Xi’s are drawn
independently from the same “nice” distribution, our score can still be an Ω(log n) factor away
from the optimal outcome.
Fact 1.5. Suppose that X1, X2, . . . , Xnare drawn independently from the exponential distribu-
tion with parameter 1. The expected score of the player is at most 1, while the maximum among
X1, X2, . . . , Xnis Hn:=Pn
k=1 1
k= Ω(log n)in expectation.
Proof of Fact 1.5.Whenever the player accepts option iafter testing it at threshold θi, the expected
remaining utility is EX∼D [X|X > θi]θi= 1. The expected score of the player is thus at most 1.
The second claim follows from a straightforward calculation, which is deferred to Appendix A.
2Without information about (Di)n
i=1, this is as hard as the worst-case setting when every Diis degenerate.
4
Note that the instance above is a special case of the prophet setting, in which D1,D2,...,Dn
are the same distribution. Furthermore, the lower bound argument still goes through even in the
“offline” setting of [TM20], i.e., the player is allowed to: (1) test the pens in an arbitrary order;
(2) come back and test some pen that has been tested; (3) accept any pen after gathering all the
information.
Perhaps surprisingly, this Ω(log n) lower bound is the only obstacle against a competitive algo-
rithm: An O(log n)-competitive algorithm exists in almost all the variants, even though the player
is under an additional temporal restriction, and has far less information about (Xi)n
i=1.
The prophet setting. Our first result addresses the prophet setting, assuming that the player
is given full descriptions of the distributions.
Theorem 1. In the prophet setting, there is an algorithm that, given D1,D2,...,Dn, achieves a
competitive ratio of O(log n).
In light of Fact 1.5, the O(log n) competitive ratio is tight up to a constant factor. This positive
result can be strengthened to an O(log n)-competitive single-sample prophet inequality.
Theorem 2. In the prophet setting, there is an algorithm that, given samples ˆ
X1,ˆ
X2,..., ˆ
Xn
independently drawn from D1,D2,...,Dn, achieves a competitive ratio of O(log n).
The secretary setting. Our positive result for the secretary setting states that an O(log n)
competitive ratio is achievable even if we are given no information about (ai)n
i=1 (under a random
arrival order). Furthermore, if the maximum value maxi[n]aiis given, an O(log n)-competitive
algorithm exists even if the arrival order is arbitrary. The O(log n) upper bound for the latter case
also follows from a result of [AAFL96] and the reduction outlined in Remark 1.3. Interestingly, we
obtain this competitive ratio using a quite different approach; we compare these two algorithms in
more detail in Section 1.3.
Theorem 3 (Secretary setting, upper bounds; Theorem 2.3 of [AAFL96]).In the secretary setting,
an O(log n)-competitive algorithm exists in the following two cases: (1) the order is random and
the player is given no information; (2) the order is arbitrary and the player is given optimum
information.
We prove a matching Ω(log n) lower bound under settings that are even easier than those in
Theorem 3.
Theorem 4 (Secretary setting, lower bounds).In the secretary setting, any algorithm is at best
Ω(log n)-competitive in the following two cases: (1) the order is random and the player is given
optimum information; (2) the order is arbitrary and the player is given full information.
In the easiest combination among all secretary settings—that (Xi)n
i=1 is a random permutation
of known values (ai)n
i=1, the competitive ratio is slightly improved to Θ log n
log log n.
Theorem 5. In the secretary setting with random order and full information, there is an Olog n
log log n-
competitive algorithm. Furthermore, this is tight up to a constant factor.
5
摘要:

OnlinePenTesting*MingdaQiaoandGregoryValiantfmqiao,valiantg@stanford.eduStanfordUniversityAbstractWestudya\pentesting"problem,inwhichwearegivennpenswithunknownamountsofinkX1;X2;:::;Xn,andwewanttochooseapenwiththemaximumamountofremaininginkinit.ThechallengeisthatwecannotaccesseachXidirectly;weonlyget...

展开>> 收起<<
Online Pen Testing Mingda Qiao and Gregory Valiant fmqiaovaliantgstanford.edu.pdf

共31页,预览5页

还剩页未读, 继续阅读

声明:本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知玖贝云文库,我们立即给予删除!
分类:图书资源 价格:10玖币 属性:31 页 大小:557.72KB 格式:PDF 时间:2025-05-02

开通VIP享超值会员特权

  • 多端同步记录
  • 高速下载文档
  • 免费文档工具
  • 分享文档赚钱
  • 每日登录抽奖
  • 优质衍生服务
/ 31
客服
关注