JOINT HUMAN ORIENTATION-ACTIVITY RECOGNITION USING WIFI SIGNALS FOR
HUMAN-MACHINE INTERACTION
Hojjat Salehinejad1,2, Member, IEEE, Navid Hasanzadeh2, Radomir Djogo2, and
Shahrokh Valaee2, Fellow, IEEE
1Kern Center for the Science of Health Care Delivery, Mayo Clinic, Rochester, MN, USA
2Department of Electrical & Computer Engineering, University of Toronto, Toronto, Canada
hojjat@ieee.org, {navid.hasanzadeh, radomir.djogo}@mail.utoronto.ca, valaee@ece.utoronto.ca
ABSTRACT
WiFi sensing, which can detect motion and measure distances,
is an important part of the new WiFi 802.11bf standard.
In recent years, some machine learning methods have been
proposed for human activity recognition from WiFi signals.
However, to the best of our knowledge, none of these methods
has explored predicting the orientation of the user from
WiFi signals. Orientation prediction is particularly critical for
human-machine interaction in an environment with multiple
smart devices. In this paper, we propose a data collection
setup and machine learning models for joint human orienta-
tion and activity recognition using WiFi signals from a single
access point (AP) or multiple APs. The results show the feasibility
of joint orientation-activity recognition with high accuracy in an
indoor environment.
Index Terms—Activity recognition, channel state infor-
mation, human-machine interaction, machine learning, WiFi.
1. INTRODUCTION
Human activity recognition (HAR) refers to detection and
recognition of human gestures and activities in an environment.
The major modalities for collecting data are wearable sensors
(e.g., gyroscopes and accelerometers), cameras (e.g., still images
and video), and radio frequency signals (e.g., WiFi signals) [1].
HAR with wireless signals has attracted attention due to its
privacy-preserving nature, broad
sensing coverage, and ability to sense the environment with-
out line-of-sight (LoS) [2]. This is particularly interesting
since the WiFi 802.11bf standard will enable remote moni-
toring and sensing [3].
Channel state information (CSI) in a wireless communication
system describes the properties of the wireless channel, i.e.,
how each subcarrier has been affected by the environment.
Human activities such as walking, falling, and sitting change
the environment and hence the CSI, which can therefore be
used for various sensing applications. CSI is measured in the
baseband and is a vector of complex values. A multiple-input
multiple-output (MIMO) wireless system provides spatial
diversity, which can be used for wider and more accurate
sensing and detection of activities. This property of wireless
signals can be very useful in designing systems for
human-machine interaction. Some examples are presence
detection [4], security systems [5], localization [6], and the
internet of things [7].
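Because each CSI entry is a complex number, a common first step (assumed here; this section does not prescribe it) is to split the vector into per-subcarrier amplitude and phase. The following is a minimal Python sketch; the helper name and the sample values are hypothetical and chosen only for illustration.

```python
import cmath

def csi_amplitude_phase(csi):
    """Split a complex CSI vector into per-subcarrier amplitude and phase (radians)."""
    amplitudes = [abs(h) for h in csi]
    phases = [cmath.phase(h) for h in csi]
    return amplitudes, phases

# Hypothetical CSI sample for four subcarriers (made-up values, not measured data).
csi_sample = [3 + 4j, 1 + 0j, 0 + 2j, -1 - 1j]
amplitudes, phases = csi_amplitude_phase(csi_sample)
```

With a MIMO link, the same split would apply to each transmit-receive antenna pair separately.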
Various approaches have been proposed for HAR using
machine learning. These include random forest (RF) [8],
hidden Markov model (HMM) [8], long short-term memory
(LSTM) [8], sparse auto-encoder (SAE) network [9],
attention-based bi-directional LSTM [10], and diversified
deep ensemble learning (WiARes) [11]. Most of the proposed
methods train a large number of parameters for feature
extraction from CSI measurements. This approach requires
large CSI training datasets and hyper-parameter tuning. In
addition, because of their high computational complexity, most
of these models may not be suitable for implementation on
resource-limited devices such as smartphones and edge devices
[12]. The LiteHAR [2] method instead uses a large number of
random convolution kernels, without training them [13], for
feature extraction, followed by a pool of Ridge regression
classifiers per frequency for activity recognition. This
approach enables fast and accurate HAR using CSI.
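The idea of untrained random convolution kernels followed by a ridge classifier can be sketched as below. This is a simplified illustration under stated assumptions, not the LiteHAR implementation: it uses a single univariate series per sample instead of one classifier per frequency, omits dilation and padding, extracts two features per kernel (maximum and proportion of positive values), and fits a closed-form ridge regression onto +/-1 targets. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def make_kernels(num_kernels, rng):
    """Draw random 1-D kernels (weights, bias); they are never trained."""
    kernels = []
    for _ in range(num_kernels):
        length = int(rng.choice([7, 9, 11]))
        weights = rng.normal(size=length)
        weights -= weights.mean()  # zero-mean weights
        bias = rng.uniform(-1.0, 1.0)
        kernels.append((weights, bias))
    return kernels

def transform(X, kernels):
    """Map each series in X (n_samples, n_timesteps) to 2 features per kernel."""
    feats = np.empty((X.shape[0], 2 * len(kernels)))
    for i, x in enumerate(X):
        for k, (w, b) in enumerate(kernels):
            # Reversing w turns np.convolve into a cross-correlation.
            out = np.convolve(x, w[::-1], mode="valid") + b
            feats[i, 2 * k] = out.max()             # maximum response
            feats[i, 2 * k + 1] = (out > 0).mean()  # proportion of positive values
    return feats

def fit_ridge(F, y, num_classes, alpha=1.0):
    """Closed-form ridge regression onto +/-1 one-vs-rest targets."""
    T = -np.ones((len(y), num_classes))
    T[np.arange(len(y)), y] = 1.0
    A = F.T @ F + alpha * np.eye(F.shape[1])
    return np.linalg.solve(A, F.T @ T)

def predict(F, W):
    """Pick the class with the largest regression score."""
    return np.argmax(F @ W, axis=1)

rng = np.random.default_rng(0)
t = np.arange(64)
# Hypothetical stand-in for CSI amplitude series: two classes of noisy sinusoids.
y_train = np.repeat([0, 1], 20)
freqs = np.where(y_train == 0, 0.05, 0.20)
X_train = np.sin(2 * np.pi * freqs[:, None] * t) + 0.1 * rng.normal(size=(40, 64))

kernels = make_kernels(50, rng)
F_train = transform(X_train, kernels)
W = fit_ridge(F_train, y_train, num_classes=2)
pred = predict(F_train, W)
```

Only the ridge weights are fitted; the kernels stay fixed after being drawn, which is what keeps this kind of pipeline cheap compared with training a deep feature extractor.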
To the best of our knowledge, none of the previous works
in HAR has explored the possibility of predicting both the
activity and the orientation of the user using CSI. In this paper,
machine learning models for joint prediction of user activity
and orientation are introduced. Orientation prediction is
particularly important for interaction with devices in smart
environments where multiple devices exist, since it indicates
which device the user is trying to interact with. We have built
an infrastructure for collecting CSI measurements from multiple
access points (APs). Building on our previous work on a
lightweight HAR solution [2], 1-dimensional random convolution
kernels [14] are used for feature extraction from CSI
measurements. Then, Ridge regression classifiers are used to
predict the activity and orientation of the user. The proposed
models are evaluated in single-AP and multiple-AP scenarios,
and the performance results are discussed.
arXiv:2210.05078v2 [cs.HC] 21 Oct 2022