2
download time. However, as will be explained in Section 3,
we find that this is not the case in general.
On the other hand, most delay-related works consider
average queuing delay of a large number of packets (fixed
size) following Poisson arrivals [24]–[28]. However, focusing
on individual file transfers is important for small files when
the SU needs to transmit each file as soon as possible. For
instance, IEEE 802.11p protocol requires each car to generate
and send safety messages continuously at 100 ms intervals
[18], making Poisson arrivals unsuitable to model this situ-
ation. In addition, the file size can vary depending on the
application [29]. Same channel data rate across all channels
is another implicit assumption in those delay-related works
[24]–[28],1but it doesn’t reflect the realistic heterogeneous
channel environment assumed in the throughput-oriented
studies [20]–[23]. Clearly, allowing the SU to switch over
such channels during instances of PU’s interruption can
further reduce the file transfer time, but to the best of our
knowledge, this issue has not been fully explored.
1.2 Related Works and Their Limitations
Throughput and delay are two performance metrics com-
monly used in the OSA literature to evaluate the quality
of service (QoS) in the wireless network. For throughput-
oriented works, the PU’s behavior can be modeled as a two-
state Markov chain (thus correlated over time), for which
partially observable Markov decision processes (POMDPs)
are typically employed to formulate the spectrum sensing
strategy in order to maximize the long-term throughput
[30]. These POMDPs do not possess known structured so-
lutions in general and they are known to be Polynomial-
Space-complete (PSPACE-complete) even if all the channel
statistics are known a priori [31]. To achieve near maximum
throughput, computationally efficient yet sub-optimal poli-
cies, such as myopic policy [32] and Whittle’s index policy
[8], have been proposed for the offline OSA setting (known
channel parameters). In particular, both [32] and [8] intro-
duced a concept of “belief vector” to guess the available
probability of each channel and updated the vector after
each observation of the channel state. In each time slot, the
myopic policy [32] selected the channel with the maximum
“guess” throughput, while the Whittle’s index policy [8]
selected the channel with the highest value according to the
Whittle’s index and the belief vector.
Recently, machine learning techniques have emerged
in the online setting (unknown channel parameters) that
the SU needs to learn the unknown channel environment
in order to find the max-throughput policy. For exam-
ple, model-based MAB techniques [33]–[35] and model-free
deep neural networks [14], [36], are utilized to obtain the
max-throughput policy over Markovian channels. Explic-
itly, single-channel online policies have been developed in
[33], [34] to find the channel with maximum long-term
throughput. [35] utilized thompson sampling to estimate
the parameters of each channel and employed the offline
policies from [8], [32]. With well-trained neural networks,
[36] showed better performance than the policies in [8], [32].
1. Channel data rate differs from the service rate in [24]–[27] because
service rate is related to the length of time the channel is available,
while data rate refers to the speed of data transfer through a channel.
[14] tackled the coordination problem among multiple SUs
with deep Q-learning. On the other hand, MAB techniques
have been extensively studied for heterogeneous channels,
each with i.i.d Bernoulli distribution, in order to find the
channel with maximum throughput. The widely used MAB
techniques include the Bayesian approach [37], upper con-
fidence bounds [23], thompson sampling [20] and its im-
provement from efficient sampling [21], and coordination
approach among multiple SUs [5], [6]. In both two chan-
nel models studied in the throughput-oriented works, i.e.,
Bernoulli channels and Markovian channels, we will later
show that “the max-throughput channel does not always
minimize the file transfer time”.
For delay-sensitive applications, packet delay in cog-
nitive radio networks has been extensively studied using
queuing theory to derive delay-efficient spectrum schedul-
ing strategies. In this setting, a stream of packet arrivals (of
the same size) modeled as a Poisson process with a constant
rate is a common assumption in delay related works [24]–
[28], and the goal is often to minimize the average packet
delay in the steady state. Specifically, a spectrum sensing
strategy (including queuing delay, packet priority and inter-
ruption by PU’s) was studied in the multimedia applications
[24]. A dynamic load-balancing spectrum decision scheme
was proposed in [25], where an R-learning algorithm was
introduced to deal with unknown channels and queuing
statistics. [26] controlled the transmission probability of each
SU in the random access network and proposed a random-
access strategy for two-and-three-SUs cases to minimize the
queuing delay. Later, a model-free reinforcement-learning
based strategy was studied in [27] to predict the channel
accessing schemes without information exchanges among
SUs and maximize the Quality-of-Service performance of
the target SU. [28] focused on the hybrid spectrum access
strategy with interweave and underlay spectrum access
techniques for multiple SUs in order to obtain both good
throughput and lower packet delay. However, as discussed
Section 1.1, fixed packet length with Poisson arrivals and
the same channel rate assumed in [24]–[28] is not realistic in
some delay-sensitive applications. The file transfer problem
considered in this paper allows for arbitrary file sizes and
heterogeneous channel rates, and we can tackle the file
transfer problem for each single file.
1.3 Our Contributions
In this paper, we study the OSA model with the aim of
minimizing the transfer time of a single file over the het-
erogeneous Bernoulli channels with different channel rates
and channel available probabilities. We also provide prac-
tical implementations that take into account computational
costs and unknown channel environments, as well as the
extension to Markovian channels. Our main contributions
can be summarized as follows:
Theoretical Analysis of the File Transfer Problem.
•We first analyze the expected file transfer time of a
single file for static policies.
•By using the analysis of the static policy as a stepping
stone, we interpret the file transfer problem under dy-
namic policies as a stochastic shortest path problem,
and obtain the dynamic optimal policy.