Online Pen Testing Mingda Qiao and Gregory Valiant fmqiaovaliantgstanford.edu

2025-05-02 0 0 557.72KB 31 页 10玖币

侵权投诉

Online Pen Testing∗

Mingda Qiao and Gregory Valiant

{mqiao,valiant}@stanford.edu

Stanford University

Abstract

We study a “pen testing” problem, in which we are given npens with unknown amounts of

ink X1, X2, . . . , Xn, and we want to choose a pen with the maximum amount of remaining ink

in it. The challenge is that we cannot access each Xidirectly; we only get to write with the i-th

pen until either a certain amount of ink is used, or the pen runs out of ink. In both cases, this

testing reduces the remaining ink in the pen and thus the utility of selecting it.

Despite this signiﬁcant lack of information, we show that it is possible to approximately

maximize our utility up to an O(log n) factor. Formally, we consider two diﬀerent setups: the

“prophet” setting, in which each Xiis independently drawn from some distribution Di, and the

“secretary” setting, in which (Xi)n

i=1 is a random permutation of arbitrary a1, a2, . . . , an. We

derive the optimal competitive ratios in both settings up to constant factors. Our algorithms

are surprisingly robust: (1) In the prophet setting, we only require one sample from each Di,

rather than a full description of the distribution; (2) In the secretary setting, the algorithm also

succeeds under an arbitrary permutation, if an estimate of the maximum aiis given.

Our techniques include a non-trivial online sampling scheme from a sequence with an un-

known length, as well as the construction of a hard, non-uniform distribution over permutations.

Both might be of independent interest. We also highlight some immediate open problems and

discuss several directions for future research.

1 Introduction

Suppose that we have a few whiteboard pens to choose from for an upcoming presentation. We

want to maximize the amount of remaining ink in the pen we pick, measured in writing time.

Naturally, before we make our decision, we write with each pen for a short while to check whether

the ink has almost run out. We face a dilemma regarding how long each pen should be tested. If

we use the pen for just ﬁve seconds, we could not distinguish whether it had ten seconds or twenty

minutes of writing time at the beginning. At the other extreme, too long a test period may exhaust

the ink in the pen, leaving us too little ink for the actual writing.

This toy problem models scenarios such as testing the service life of a ﬂimsy spare part, and

more generally, other real-world decision-making in which obtaining information about each option

∗We would like to thank Ian Tullis, Petr Mitrichev, and the entire problem setting team of Google Code Jam 2020

for writing and preparing the problem titled Pen Testing [TM20], which inspired this work. We thank the anonymous

reviewers for their comments that have helped improve this paper. This work was supported by NSF awards 1813049,

1704417 and 1804222, and DOE award DE-SC0019205.

arXiv:2210.00655v2 [cs.DS] 22 Nov 2022

inevitably reduces the utility of the option. For example, we want to invest in one of nstart-ups

with unknown growth potentials. We could, of course, watch from the sidelines for a while, and see

whether the total value of each company has grown to a certain point (e.g., twice its initial value).

However, we would have a lower return since our investment in the company only starts at this

point, and the 2x increase does not count towards our proﬁt.

To our knowledge, this “pen testing” problem ﬁrst appeared as a competitive programming

problem, written by Ian Tullis and prepared by Petr Mitrichev, in Google Code Jam 2020 [TM20].

They considered the case that n= 15 pens hold 0,1,2, . . . , n −1 units of ink respectively, but are

presented to us after a random shuﬄing. We can test the pens in an arbitrary order, and possibly

go back to a pen that we tested earlier for further testing, if this is deemed necessary. Finally, we

are asked to choose two of the pens, and we win the game if the total units of remaining ink in

them is at least n.

If we randomly pick two pens without any testing, our winning probability is clearly below

50%. Surprisingly, it was shown by [TM20] that at n= 15, a better strategy wins the game with a

higher probability of ≈64.4%! This strategy is computed by dynamic programming, in which each

state simply consists of all the information that we obtain from testing: the amount of ink that has

been used from each pen, and whether each pen has run out or not. For larger n, however, this

approach would necessarily result in an exponential runtime. Furthermore, in the general case that

the amounts of ink in the pens are no longer a permutation of (0,1,2, . . . , n −1), it is diﬃcult to

analyze how the optimal solution computed by this dynamic programming scales asymptotically.

In this work, we formulate and study an online version of this pen testing problem, in which

we select only one of the noptions, and both the testing and decision are subject to an additional

temporal restriction—we must review the noptions in the given order. For each option, we are

allowed to test it to some extent, during which the value of the option also decreases. Then, we

need to make an irrevocable decision on whether to accept the option—once we accept, the entire

game ends and we cannot explore the remaining options; once we reject an option, we can no longer

go back to it if the later options appear less ideal.

This online pen testing problem that we consider is closely related to the theory of optimal

stopping, in which the player is often assumed to observe the value of each option directly. In the

single-choice case that we focus on, two well-studied settings are the prophet inequality and the

secretary problem. In the former, the values are assumed to be drawn independently from ngiven

distributions. In the secretary problem, the options can have arbitrary values but the noptions are

assumed to arrive in a uniformly random order. We discuss the connection between our work and

this literature in Section 1.4. In this paper, we study the online pen testing problem in settings

similar to these two problems, and derive the optimal guarantee that the player can achieve under

minimal assumptions on the option values.

Awerbuch, Azar, Fiat, and Leighton [AAFL96] studied a closely related and more general

setting: A decision-maker may hold at most one of ncommodities on each day. At the end of the

day, each commodity issues a dividend of either 0 or 1 to its holders. The goal is to achieve a

total proﬁt comparable to the dividend issued by the best commodity, by switching between the

commodities as few times as possible. In Remark 1.3 we discuss how pen testing can be realized

as a special case of this setting. We discuss this connection further when describing our results

(Section 1.2) and techniques (Section 1.3).

1.1 Problem Setup

We ﬁrst deﬁne the online pen testing problem formally.

Deﬁnition 1 (Online pen testing).A problem instance is speciﬁed by X1, X2, . . . , Xn≥0. At each

step i∈[n], the player ﬁrst tests Xiand then makes a decision:

•(Testing) The player picks threshold θi∈[0,+∞]. If Xi> θi, the test passes; the test fails

if Xi≤θi, in which case the player observes Xi. The remaining utility of option ibecomes

i= max{Xi−θi,0}.

•(Decision) After seeing whether the test passes, the player either accepts or rejects the i-th

option irrevocably. If the player accepts, the game ends and the player receives a score of X0

Remark 1.1. The player may pick threshold θi= +∞, in which case the player gets to observe

Xiat the cost of leaving a remaining utility of X0

i= 0.

Remark 1.2. Our deﬁnition allows a more general testing procedure, in which the player performs

ttests sequentially at chosen thresholds θ(1)

i, θ(2)

i, . . . , θ(t)

i≥0. This is equivalent to running a single

test at threshold θi=θ(1)

i+θ(2)

i+··· +θ(t)

Remark 1.3. The problem can be viewed as a special case of the setting studied by [AAFL96],

where no switching is allowed and the commodities issue their dividend sequentially. Assuming that

X1, X2, . . . , Xnare all integers, the online pen testing problem corresponds to an instance where the

i-th commodity issues a unit dividend on Xiconsecutive days starting from day number 1+Pi−1

j=1 Xj,

and zero dividend on each of the other days.

This problem can be viewed as a variant of the well-studied optimal stopping problem in which

information is both limited and costly. For each option i, we either: (1) receive a single bit of

information (namely, that Xi> θiholds) at the cost of reducing the value of the option by θi; or:

(2) observe Xiexactly when Xi≤θi, at the cost of losing all the utility in option i.

Without any assumptions on X1, . . . , Xn, no non-trivial guarantee on the player’s score can

be made.1In this paper, we consider the following two setups: the “prophet” setting and the

“secretary” setting, both of which make some distributional assumption on the instance. In the

following, we formally deﬁne the settings and the notion of competitive ratio in each of them.

Deﬁnition 2 (Prophet setting).The player is given information about distributions D1,D2,...,Dn

over [0,+∞), from which the nvalues X1, X2, . . . , Xnare drawn independently.

The player is α-competitive if its expected score, over the randomness in the distributional

information, the generation of (Xi)n

i=1, and the player itself, is at least 1

α·EX∼D maxi∈[n]Xi.

Remark 1.4. Formally, each Diis deﬁned by a cumulative distribution function Fi:R→[0,1]

that is non-decreasing, right-continuous, and satisﬁes limx→+∞Fi(x)=1and Fi(x)=0for x <

0. The resulting Disatisﬁes PrXi∼Di[Xi≤x] = Fi(x). We assume that each Dihas a ﬁnite

expectation, i.e., the integral R+∞

0[1−Fi(x)] dxconverges, which implies that the expected maximum,

EX∼D maxi∈[n]Xi, is also ﬁnite.

We will sometimes assume for simplicity that each Diis continuous, i.e., the corresponding Fi(x)

is continuous. In other words, Dihas no point masses. The general case can be handled using a

1This is true even if the player can observe the value Xidirectly.

simple reduction (see e.g., [RWW20]). The continuity of Figuarantees that for any α∈(0,1], we

may deﬁne the (1 −α)-quantile of Dias the minimum number τthat satisﬁes PrXi∼Di[Xi> τ ] =

1−Fi(τ) = α.

In the prophet setting, the values of diﬀerent options are drawn independently from distributions

D1,D2,...,Dn. At the beginning of the game, the player is given certain information about the

distributions.2We consider both the case that the player receives a complete description of (Di)n

i=1,

and the case where the player sees one sample ˆ

Xidrawn from each Di. In the latter case, the

observed sample ˆ

Xiis independent from the actual value Xi, and the expected score of the player

is deﬁned over the randomness in ( ˆ

Xi)n

i=1 as well. Finally, the player’s score is compared to that of

an omniscient prophet that knows the realization of X1, . . . , Xnand always picks the highest one.

Deﬁnition 3 (Secretary setting).The player is given information about a1, a2, . . . , an≥0. The

nvalues X1, X2, . . . , Xnare guaranteed to be a permutation, either uniformly random or arbitrary,

of a1, a2, . . . , an.

The player is α-competitive in the random order case if its expected score, over the randomness

in both the permutation and the player itself, is at least 1

α·maxi∈[n]ai. The player is α-competitive

in the arbitrary order case if for any permutation (Xi)n

i=1 of (ai)n

i=1, the player’s expected score,

over the randomness in the player itself, is at least 1

α·maxi∈[n]ai.

Note that an α-competitive player for the arbitrary order case is also α-competitive under

a random arrival order. We consider the following three forms of information provided to the

player, in decreasing order of helpfulness: (1) full information, the player is given a1, a2, . . . , an;

(2) optimum information, the player is given maxi∈[n]ai; (3) no information, the player is given

nothing.

With full or optimum information, if the player could observe each Xidirectly, it would be

easy to achieve a utiliy of maxi∈[n]ai—simply accept option ionly if Xiis equal to this maximum.

This is, however, not true for the online pen testing problem. For example, when (a1, a2, . . . , an) =

(1,2, . . . , n), the player can only ensure that Xi=nby setting the threshold θito n−1, but this

would leave a remaining utility of merely 1.

1.2 Our Results

We obtain the optimal competitive ratios (modulo constant factors) for online pen testing, under

diﬀerent variants of the prophet and secretary settings deﬁned above.

A simple lower bound. The following example shows that even when the Xi’s are drawn

independently from the same “nice” distribution, our score can still be an Ω(log n) factor away

from the optimal outcome.

Fact 1.5. Suppose that X1, X2, . . . , Xnare drawn independently from the exponential distribu-

tion with parameter 1. The expected score of the player is at most 1, while the maximum among

X1, X2, . . . , Xnis Hn:=Pn

k=1 1

k= Ω(log n)in expectation.

Proof of Fact 1.5.Whenever the player accepts option iafter testing it at threshold θi, the expected

remaining utility is EX∼D [X|X > θi]−θi= 1. The expected score of the player is thus at most 1.

The second claim follows from a straightforward calculation, which is deferred to Appendix A.

2Without information about (Di)n

i=1, this is as hard as the worst-case setting when every Diis degenerate.

Note that the instance above is a special case of the prophet setting, in which D1,D2,...,Dn

are the same distribution. Furthermore, the lower bound argument still goes through even in the

“oﬄine” setting of [TM20], i.e., the player is allowed to: (1) test the pens in an arbitrary order;

(2) come back and test some pen that has been tested; (3) accept any pen after gathering all the

information.

Perhaps surprisingly, this Ω(log n) lower bound is the only obstacle against a competitive algo-

rithm: An O(log n)-competitive algorithm exists in almost all the variants, even though the player

is under an additional temporal restriction, and has far less information about (Xi)n

i=1.

The prophet setting. Our ﬁrst result addresses the prophet setting, assuming that the player

is given full descriptions of the distributions.

Theorem 1. In the prophet setting, there is an algorithm that, given D1,D2,...,Dn, achieves a

competitive ratio of O(log n).

In light of Fact 1.5, the O(log n) competitive ratio is tight up to a constant factor. This positive

result can be strengthened to an O(log n)-competitive single-sample prophet inequality.

Theorem 2. In the prophet setting, there is an algorithm that, given samples ˆ

X1,ˆ

X2,..., ˆ

independently drawn from D1,D2,...,Dn, achieves a competitive ratio of O(log n).

The secretary setting. Our positive result for the secretary setting states that an O(log n)

competitive ratio is achievable even if we are given no information about (ai)n

i=1 (under a random

arrival order). Furthermore, if the maximum value maxi∈[n]aiis given, an O(log n)-competitive

algorithm exists even if the arrival order is arbitrary. The O(log n) upper bound for the latter case

also follows from a result of [AAFL96] and the reduction outlined in Remark 1.3. Interestingly, we

obtain this competitive ratio using a quite diﬀerent approach; we compare these two algorithms in

more detail in Section 1.3.

Theorem 3 (Secretary setting, upper bounds; Theorem 2.3 of [AAFL96]).In the secretary setting,

an O(log n)-competitive algorithm exists in the following two cases: (1) the order is random and

the player is given no information; (2) the order is arbitrary and the player is given optimum

information.

We prove a matching Ω(log n) lower bound under settings that are even easier than those in

Theorem 3.

Theorem 4 (Secretary setting, lower bounds).In the secretary setting, any algorithm is at best

Ω(log n)-competitive in the following two cases: (1) the order is random and the player is given

optimum information; (2) the order is arbitrary and the player is given full information.

In the easiest combination among all secretary settings—that (Xi)n

i=1 is a random permutation

of known values (ai)n

i=1, the competitive ratio is slightly improved to Θ log n

log log n.

Theorem 5. In the secretary setting with random order and full information, there is an Olog n

log log n-

competitive algorithm. Furthermore, this is tight up to a constant factor.

文档加载中……请稍候！
如果长时间未打开，您也可以点击刷新试试。

下载文档到电脑，查找使用更方便

10 玖币 0人已下载

立即下载

摘要：

OnlinePenTesting*MingdaQiaoandGregoryValiantfmqiao,valiantg@stanford.eduStanfordUniversityAbstractWestudya\pentesting"problem,inwhichwearegivennpenswithunknownamountsofinkX1;X2;:::;Xn,andwewanttochooseapenwiththemaximumamountofremaininginkinit.ThechallengeisthatwecannotaccesseachXidirectly;weonlyget...

展开>> 收起<<

Online Pen Testing Mingda Qiao and Gregory Valiant fmqiaovaliantgstanford.edu.pdf

共31页,预览5页

还剩页未读，继续阅读

声明：本站为文档C2C交易模式，即用户上传的文档直接被用户下载，本站只是中间服务平台，本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私，请立即通知玖贝云文库，我们立即给予删除！

Online Pen Testing Mingda Qiao and Gregory Valiant fmqiaovaliantgstanford.edu

相关推荐

开通VIP享超值会员特权

作者详情

相关内容

热门标签

举报选择: