2.1 Vision-based AMCs
Existing vision-based AMCs in the literature leverage facial or eye gaze features, collected from real-time video feeds using an eye tracker, webcam, or other imaging sensors, and map eye gaze to screen coordinates for cursor control. However, the user's eye gaze needs to be calibrated before use. For mouse click actuation, dwell-time-based mechanisms and gestures such as an eye wink, blink, or smile are among the most common. Among the studied works, researchers have developed Smyle Mouse [1, 48, 77], which uses a generic webcam to register users' head movement through nose tracking for cursor movement and smile gestures for registering mouse click events. The system offers a calibration phase where a user has to perform a series of gestures as instructed via a UI before actual usage. By default, the smile gesture actuates a left mouse click. However, a user can customize the event (left-click, right-click, double-click, drag, etc.) to be triggered with the gesture from a separate UI. Apart from the smile gesture, the system also offers a dwell-time-based click mechanism. Zhang et al. [92] have developed a software-based AMC that leverages eye gaze tracking with an eye tracker to control mouse cursor movement, with a dwell-time-based clicking method via a virtual UI for Mouse/Keyboard simulation. The authors have evaluated their work through two experiments, a searching task and a web browsing task, utilizing the Technology Acceptance Model (TAM) and the System Usability Scale (SUS). Apart from eye gaze tracking, researchers have also leveraged nose tracking for cursor control [38]. As opposed to eye trackers or webcams, optical mouse or imaging sensors have also been used to track eye gaze for cursor movement [7, 81]. However, a potential drawback of optical mouse sensors is the requirement of an additional light source to work properly, as demonstrated in [7, 81]. Concerning the methods used in the aforementioned works, virtual interface-based methods for making computers accessible to the physically disabled are popular in the literature [26, 36, 83]. Although common, in practice dwell-time-based click actuation suffers from unwanted actuation of mouse clicks due to eye gaze fixation, generally known as the Midas Touch problem [32]. To address this issue, researchers [93] proposed a muscle- or eyebrow-shrugging-based click actuation technique, implemented through the software packages Camera Mouse [6, 54] and ClickerAid [53]. Furthermore, Rajanna et al. [68, 69] have also developed a system for people with arm or hand impairment that uses eye gaze for pointing at a screen element while selection is actuated by exerting pressure on pressure sensor-based footwear.
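The gaze-to-screen mapping step shared by these systems is conceptually simple: during calibration, the user fixates on a set of known on-screen targets, and a regression is fit from the tracked gaze features to screen coordinates. The sketch below is a minimal illustration of this idea, assuming normalized pupil-center coordinates as the gaze feature and a second-order polynomial mapping; the function names and feature choice are illustrative assumptions, not the method of any cited system.

```python
import numpy as np

def poly_features(gaze):
    """Second-order polynomial features of an (N, 2) array of
    normalized pupil-center coordinates (x, y)."""
    gaze = np.atleast_2d(gaze)
    x, y = gaze[:, 0], gaze[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])

def calibrate(gaze_samples, screen_targets):
    """Fit one least-squares mapping per screen axis.

    gaze_samples:   (N, 2) gaze features recorded while the user
                    fixates on each calibration target.
    screen_targets: (N, 2) known pixel coordinates of those targets.
    Returns a (6, 2) coefficient matrix."""
    A = poly_features(gaze_samples)
    coeffs, *_ = np.linalg.lstsq(A, screen_targets, rcond=None)
    return coeffs

def gaze_to_screen(gaze, coeffs):
    """Map live gaze samples to estimated screen coordinates."""
    return poly_features(gaze) @ coeffs
```

In practice, deployed systems wrap this mapping in a guided calibration UI and smooth the estimated cursor position over time, which is one reason proper lighting during calibration matters so much for these AMCs.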
A critical requirement for vision-based AMCs to work properly is to ensure proper lighting conditions for calibration and accurate detection of facial [26, 36, 38, 83] or eye gaze [7, 81, 92] features. For eye gaze-based AMCs, gaze tracking may be challenging due to image resolution, different lighting conditions, the user's dependency on eyeglasses due to poor eyesight, and even the user's skin complexion. In addition, human eyes are not the most accurate pointing device [4]. Moreover, a human can only gaze at a single point at a given time, preventing the user from looking at another region of interest without moving the mouse cursor. Most importantly, since the eyes are used for both cursor movement and click actuation, users might find it difficult to execute both actions simultaneously when required, for example, when dragging an item. As stated earlier, a particular problem with dwell-time-based click actuation is the Midas Touch problem [32], resulting in unwanted selection of UI elements. Another disadvantage of vision-based AMCs is that existing eye trackers and webcams cannot detect and track a user's eye gaze beyond a particular distance, forcing the user to stay close to the PC or workstation.
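To make the dwell-time mechanism and its Midas Touch failure mode concrete, the sketch below shows one common formulation, assuming a fixed dwell threshold and a small fixation radius. The re-arm rule, which requires the gaze to leave the region before another click can fire, is one simple mitigation and is not the specific technique of [93]; all names and parameter values are illustrative.

```python
import math
import time

class DwellClicker:
    """Fire a click when the cursor stays within RADIUS pixels of a
    fixation point for at least DWELL_S seconds. After firing, the
    detector is disarmed until the gaze leaves the region, so a held
    fixation does not emit a stream of clicks."""

    DWELL_S = 0.8    # illustrative dwell threshold (seconds)
    RADIUS = 30.0    # illustrative fixation radius (pixels)

    def __init__(self):
        self.anchor = None   # (x, y) of the current fixation
        self.start = 0.0     # time the current fixation began
        self.armed = True    # False after a click, until gaze moves away

    def update(self, x, y, now=None):
        """Feed one cursor sample; returns True when a click should fire."""
        now = time.monotonic() if now is None else now
        if self.anchor is None or math.dist(self.anchor, (x, y)) > self.RADIUS:
            # Gaze moved: start a new fixation and re-arm the detector.
            self.anchor, self.start, self.armed = (x, y), now, True
            return False
        if self.armed and now - self.start >= self.DWELL_S:
            self.armed = False  # disarm until the gaze leaves this region
            return True
        return False
```

Even with such a guard, any sufficiently long fixation on an interactive element still triggers a selection, which is why the gesture-based alternatives surveyed above decouple pointing from click actuation.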
2.2 Electromyography (EMG)-based AMCs
Electromyography (EMG) signals refer to measurements of the very low electric potentials generated by muscle contractions, captured with electrodes placed noninvasively on the skin, where the signal amplitudes are proportional to the