
4Golrezaei, Jaillet, and Zhou: Online Resource Allocation with Samples
decision-maker. The result for the second scenario justifies our assumption about knowing
the sampling probability; see (Correa et al. 2021) for a similar assumption. Nonetheless,
this assumption can be also justified by the fact that the outreach program is designed
by the decision-maker herself, and hence pcan be well estimated using historical data
for similar outreach programs. See also our case study in Section 6 where we examine
the robustness of our proposed algorithm (which we will present shortly) to the lack of
knowledge of the sampling probability.3
Asymptotically optimal protection level algorithm. In light of our impossibility
results in Section 3, we design a simple, yet effective protection level algorithm that opti-
mally utilizes the sample information; see Algorithm 1 in Section 4. The algorithm uses the
sample information to obtain an estimate of the expected reward of each type of agents. If
the estimated reward of type 1 is greater than that of type 2, the algorithm protects type
1 agents, otherwise type 2 agents will be protected. In each of these cases, the algorithm
uses the sample information to estimate the protection level for the type that has a higher
estimated average reward.
In Theorem 1, we present the competitive ratio of our proposed algorithm for any finite
value of m. In addition, in Proposition 2, we show that our algorithm is asymptotically
optimal as mgoes to infinity. More precisely, we show that for any p=ω(1/√m)4(which
includes constant values for pthat does not scale with m), the asymptotic competitive ratio
of our algorithm is on the order of 1 −Θ(1/(p√m)).5As we show in Section 7.1, the same
asymptotic competitive ratio continues to hold even when there is a capacity constraint
during the test period. (There we show that when the number of resources during the test
period is in the order of ω(√m), the asymptotic competitive ratio of (a slightly modified
version of) Algorithm 1 is in the order of 1 −Θ(1/(p√m)).)
3In Appendix K, using numerical studies and under adversarial arrival processes, we further test the robustness of
our proposed algorithm to the lack of knowledge of p. There we show that when the estimation error in pis 10%, the
competitive ratio decreases by at most 6%. Similar results are obtained when the estimation error in pis 20%.
4We scale pwith mbecause of the cost of testing the market. With a large value of m, if the market size turns out
to be large too (e.g., proportional to m), the decision-maker will incur a large cost during the test period, justifying
our choice of p=ω(1/√m). Nonetheless, our results hold for constant p’s, and theoretically speaking, handling the
case of p=ω(1/√m) is more challenging than that of constant p’s. This is because, under constant p’s, the sample
information is more informative, easing the analysis of the algorithm.
5For the case of p=O(1
√m) where pgoes to zero very fast as mgrows, in Section J, we design another algorithm whose
competitive ratio is 1/2. We note that our upper bound in Section 5.2 confirms the challenges of such an extreme case
for the sampling probability. As we show there, when pgoes to zero very fast (i.e., p=O(1/m)), obtaining asymptotic
optimality is impossible.