processes (MDP). The main ingredients of the MDP are states,
actions, rewards, and state transition probabilities. The state
could be a composite of the fading channel, energy arrival,
and battery condition. The action is the transmit power level
or the amount of energy to be consumed, and the reward is
a function of the states and the actions. The state transition
probability gives the probability of moving from the current
state to the next state under each action. The goal is
to find the optimal policy, which specifies the optimal action
in each state and maximizes the expected discounted
infinite-horizon reward starting from the initial state [18].
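As a rough illustration of these ingredients, the following sketch runs value iteration on a toy MDP. It is not the formulation developed in this paper: the two-state channel, its transition matrix, the Bernoulli energy arrival, the battery capacity, and the logarithmic reward are all hypothetical choices made only to keep the example small.

```python
import math

def value_iteration(B=3, p_arrival=0.5, gamma=0.9, tol=1e-8):
    """Value iteration on a toy energy-harvesting MDP (all numbers hypothetical).

    State  : (battery level b in 0..B, channel index g in {0, 1}).
    Action : number of energy units a spent in the current block, 0 <= a <= b.
    Reward : log(1 + gain[g] * a), an illustrative stand-in for a real metric.
    """
    gains = [0.2, 1.0]                   # assumed gains for the bad/good channel
    P_g = [[0.7, 0.3], [0.4, 0.6]]       # assumed channel transition matrix
    G = len(gains)
    V = [[0.0] * G for _ in range(B + 1)]
    while True:
        V_new = [[0.0] * G for _ in range(B + 1)]
        policy = [[0] * G for _ in range(B + 1)]
        for b in range(B + 1):
            for g in range(G):
                best_q, best_a = -math.inf, 0
                for a in range(b + 1):
                    reward = math.log1p(gains[g] * a)
                    # Next battery level without / with one arriving energy unit.
                    b_no, b_yes = b - a, min(b - a + 1, B)
                    expected_next = sum(
                        P_g[g][g2] * ((1 - p_arrival) * V[b_no][g2]
                                      + p_arrival * V[b_yes][g2])
                        for g2 in range(G))
                    q = reward + gamma * expected_next
                    if q > best_q:
                        best_q, best_a = q, a
                V_new[b][g] = best_q
                policy[b][g] = best_a
        delta = max(abs(V_new[b][g] - V[b][g])
                    for b in range(B + 1) for g in range(G))
        V = V_new
        if delta < tol:
            return V, policy

V, policy = value_iteration()
for b, row in enumerate(policy):
    print(f"battery={b}: energy spent on (bad, good) channel = {tuple(row)}")
```

The returned policy maps each (battery, channel) state to an energy expenditure, which is the sense in which a policy "specifies the action in each state".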
In the context of distributed detection in WSNs, there are
only a few studies that consider EH-powered sensors [21]–[27].
In the following, we provide a concise review of these works,
highlight how our present work fills the knowledge gap in the
literature, and explain how it differs from our previous works in
[25]–[27].
A. Related Works and Knowledge Gap
Considering an EH-powered node that is deployed to
monitor the change in its environment, the authors in [21]
formulated a quickest change detection problem, where the
goal is to detect the time at which the underlying distribution
of sensor observation changes. Considering an EH-WSN and
choosing deflection coefficient as the detection performance
metric, the authors in [24] formulated an adaptive transmit
power control strategy based on PHY-MAC cross-layer design.
Considering an EH-WSN and choosing error probability as the
detection performance metric, the authors in [22] proposed
ordered transmission schemes that can lead to a smaller
average number of transmitting sensors without compromising
the detection performance. Modeling the randomly arriving
energy units during a time slot as a Bernoulli process, the
battery state as a K-state Markov chain, and choosing the Bhattacharyya
distance as the detection performance metric, the
authors in [23] investigated the optimal local decision
thresholds at the sensors, such that the detection performance
is optimized. We note that the system model in [24] lacks a battery
to store the harvested energy. Further, the adopted energy
arrival model in [24] is deterministic. On the other hand,
[21], [22] assumed sensor-FC channels are error-free and [23]
considered a binary asymmetric channel model for sensor-FC
links. The high-level communication channel model, combined
with a simple stochastic model for random energy arrival, is
limiting. Specifically, it does not allow one to study channel-
dependent transmit power control strategies. Such a study
requires a more realistic communication channel model and
stochastic energy arrival model that match the energy needed
for a channel-dependent transmission. This is the knowledge
gap that we address in this work.
To highlight how our present work is different from our
previous works in [25]–[27], we briefly summarize them
in the following. Modeling the random energy arrival as a
Bernoulli process, the dynamics of the battery as a finite-state
Markov chain, and considering a fading channel model,
in [25] we adopted a channel-inversion transmit power control
policy, where the allocated power is inversely proportional to
the full-precision fading channel state information (CSI), and
we found the optimal decision thresholds at the sensors such that
the Kullback-Leibler (KL) distance detection metric at the FC
is maximized. Different from [25], in [26] we modeled the
random energy arrival as an exponential process and assumed
that each sensor only knows its quantized CSI and adapts its
transmit power according to its battery state and its quantized
CSI, such that the J-divergence-based detection metric at the FC is
maximized. Modeling the random energy arrival as a Poisson
process in [27], we proposed a novel transmit power control
strategy that is parameterized in terms of the channel gain
quantization thresholds and the scale factors corresponding
to the quantization intervals, and found the jointly optimal
quantization thresholds and scale factors such that the
J-divergence-based detection metric at the FC is maximized.
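The battery model mentioned above (Bernoulli energy arrivals driving a finite-state Markov chain) can be sketched as follows. This is only a minimal illustration, not the model of [25]–[27]: the capacity K, the arrival probability p, and in particular the rule that a non-empty battery spends one unit with a hypothetical probability q are assumptions introduced here.

```python
def battery_transition_matrix(K=4, p=0.3, q=0.5):
    """Transition matrix of a K-state battery chain (levels 0..K-1).

    Assumptions (illustrative only): in each block one energy unit arrives
    with probability p, and a non-empty battery spends one unit with
    probability q; energy beyond the capacity K-1 is discarded.
    """
    P = [[0.0] * K for _ in range(K)]
    for b in range(K):
        # An empty battery cannot spend; otherwise spend 0 or 1 unit.
        spend_options = ((0, 1.0),) if b == 0 else ((0, 1 - q), (1, q))
        for arrived, prob_arrival in ((0, 1 - p), (1, p)):
            for spent, prob_spend in spend_options:
                next_level = min(b - spent + arrived, K - 1)
                P[b][next_level] += prob_arrival * prob_spend
    return P

def stationary_distribution(P, iters=10000):
    """Approximate the stationary distribution by repeated multiplication."""
    K = len(P)
    pi = [1.0 / K] * K
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(K)) for j in range(K)]
    return pi

P = battery_transition_matrix()
pi = stationary_distribution(P)
print("steady-state battery distribution:", [round(x, 4) for x in pi])
```

The stationary distribution computed here is the kind of steady-state quantity on which an analysis that assumes a battery in equilibrium would rely.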
Our present work is different from our prior works in [25]–
[27] in several aspects. The transmit power control strategies in
these works are intrinsically different from our present work,
since in [25]–[27] we assumed that the battery operates
at steady state and that the energy arrival and channel models
are independent and identically distributed (i.i.d.) across transmission
blocks. Consequently, the power optimization problem
in [25]–[27] became a deterministic optimization problem in
terms of the optimization variables, and the obtained solutions
are different. In this work, the battery is not at steady state.
Also, both the channel and the energy arrival are modeled as
homogeneous finite-state Markov chains (FSMCs). Therefore,
the power control optimization problem at hand becomes a
multistage stochastic optimization problem, and can be solved
via the MDP framework. To the best of our knowledge, this
is the first work that develops an MDP-based channel-dependent
power control policy for distributed detection in EH-WSNs.
The MDP framework has been utilized before in [29], [30] to
address a quickest change detection problem.
B. Our Contribution
Given our adopted WSN model (see Fig. 1), we aim
at developing an adaptive channel-dependent transmit power
control policy for sensors such that a detection performance
metric is optimized. We choose the J-divergence between
the distributions of the detection statistics at the FC under
the two hypotheses as the detection performance metric. Our
choice is motivated by the fact that J-divergence is a widely
adopted metric for designing distributed detection systems
[12], [13], [27], [31]. We note that J-divergence and $P_e$ are
related through $P_e > \Pi_0 \Pi_1 e^{-J/2}$, where $\Pi_0$, $\Pi_1$ are the
a-priori probabilities of the null and the alternative hypotheses,
respectively [12], [13], [31]. Hence, maximizing the
J-divergence is equivalent to minimizing the lower bound on
$P_e$. Modeling the quantized fading channel, the energy arrival,
and the dynamics of the battery as homogeneous FSMCs, and
the network lifetime as a random variable with a geometric distribution,
we formulate the J-divergence-optimal transmit power
control problem, subject to a total transmit power constraint, as
a discounted infinite-horizon constrained MDP optimization
problem, where the control actions (i.e., transmit powers) are
functions of the battery state, quantized CSI, and the arrived