
from each iteration are used in the design and control of the next iteration cycle—referred to as ‘phases’. To reduce and overcome the complexity of the iterative methodological process, and to progress towards a better understanding of the outcome of each iteration, a variety of complementary but different techniques are used. For example, the article intersects methodologies from engineering and computer science to address future concerns with the autonomous processing and analysis of real-time data from edge devices.
3 Design for autonomous AI—AutoAI
The design consists of four phases. In Phase 1, raw OSINT data are automatically prepared and ingested, and synthesised into training scenarios for automated feature engineering, teaching the AI how to categorise and use (analyse) new and emerging forms of OSINT data. In Phase 2, domain knowledge is applied to extract features from the raw data. A feature is considered valid if its attributes or properties are useful, or if its characteristics are helpful, to the model. For the automation of feature engineering, two approaches are considered: (1) multi-relational decision tree learning, a supervised algorithm based on decision trees, and (2) Deep Feature Synthesis, available as an open-source library named Featuretools.1
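For illustration, a minimal Deep Feature Synthesis sketch with Featuretools is shown below; the table and column names (sources, posts, and their fields) are hypothetical stand-ins for ingested OSINT records rather than part of the proposed design, and the Featuretools 1.x API is assumed.

```python
# Minimal Deep Feature Synthesis sketch (assumes Featuretools 1.x).
import pandas as pd
import featuretools as ft

# Hypothetical OSINT-style tables: sources and the posts they produce.
sources = pd.DataFrame({"source_id": [1, 2], "source_type": ["forum", "news"]})
posts = pd.DataFrame({
    "post_id": [10, 11, 12],
    "source_id": [1, 1, 2],
    "length": [120, 80, 300],
    "timestamp": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03"]),
})

es = ft.EntitySet(id="osint")
es = es.add_dataframe(dataframe_name="sources", dataframe=sources, index="source_id")
es = es.add_dataframe(dataframe_name="posts", dataframe=posts,
                      index="post_id", time_index="timestamp")
es = es.add_relationship("sources", "source_id", "posts", "source_id")

# Deep Feature Synthesis stacks aggregation and transform primitives across
# the related tables to generate candidate features automatically.
feature_matrix, feature_defs = ft.dfs(entityset=es,
                                      target_dataframe_name="sources",
                                      max_depth=2)
print(feature_defs)
```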
In Phase 3, selection algorithms [2] are used to identify hyperparameter values. Normal parameters are typically optimised during training, but hyperparameters are generally optimised manually, a task associated with the model designer. The automation scenario design will start by building upon knowledge from biological behaviours. Particle swarm optimisation and evolutionary algorithms can be used, both of which derive from biological behaviours: particle swarm optimisation emerges from studies of interactions in biological communities at the individual and social levels, and evolutionary algorithms emerge from studies of biological evolution.
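The following toy sketch (not the proposed implementation) shows how particle swarm optimisation could search a two-dimensional hyperparameter space; the objective function and search ranges are placeholders, whereas a real objective would train a candidate model and return its validation error.

```python
# Toy particle swarm optimisation over two hyperparameters.
import numpy as np

def validation_error(params):
    # Hypothetical objective: in practice, train a model with the candidate
    # hyperparameters and return its validation error.
    learning_rate, regularisation = params
    return (learning_rate - 0.01) ** 2 + (regularisation - 0.1) ** 2

rng = np.random.default_rng(0)
n_particles, n_iterations = 20, 50
bounds = np.array([[1e-4, 0.5], [1e-3, 1.0]])   # per-dimension [low, high]

positions = rng.uniform(bounds[:, 0], bounds[:, 1], (n_particles, 2))
velocities = np.zeros_like(positions)
personal_best = positions.copy()
personal_best_err = np.array([validation_error(p) for p in positions])
global_best = personal_best[personal_best_err.argmin()]

for _ in range(n_iterations):
    r1, r2 = rng.random((2, n_particles, 2))
    # Standard PSO update: inertia plus pulls towards personal and global bests.
    velocities = (0.7 * velocities
                  + 1.5 * r1 * (personal_best - positions)
                  + 1.5 * r2 * (global_best - positions))
    positions = np.clip(positions + velocities, bounds[:, 0], bounds[:, 1])
    errors = np.array([validation_error(p) for p in positions])
    improved = errors < personal_best_err
    personal_best[improved] = positions[improved]
    personal_best_err[improved] = errors[improved]
    global_best = personal_best[personal_best_err.argmin()]

print("best hyperparameters:", global_best)
```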
Second, the scenario design can apply Bayesian optimisation, the most widely used method for hyperparameter optimisation [3].
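A minimal Bayesian-optimisation sketch is shown below, using Optuna, whose default Tree-structured Parzen Estimator sampler is a Bayesian method; the objective and hyperparameter names are illustrative placeholders.

```python
# Minimal Bayesian-style hyperparameter optimisation with Optuna.
import optuna

def objective(trial):
    learning_rate = trial.suggest_float("learning_rate", 1e-4, 1e-1, log=True)
    num_layers = trial.suggest_int("num_layers", 1, 5)
    # Placeholder score; a real objective would train and validate a model.
    return (learning_rate - 0.01) ** 2 + (num_layers - 3) ** 2

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```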
The combination of these methodological approaches is considered for automatic hyperparameter optimisation, but the concern is that edge devices are characterised by a large number of data points, and new and emerging forms of data are characterised by a large configuration space and high dimensionality. In combination, these factors could make the time required to find the optimal hyperparameters longer than is adequate. An alternative method is combined algorithm selection and hyperparameter optimisation. The method selection includes testing for the most effective approach, starting from Bayesian optimisation, Bandit Search, Evolutionary Algorithms, Hierarchical Task Networks, Probabilistic Matrix Factorisation, Reinforcement Learning and Monte Carlo Tree Search.
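A simple sketch of combined algorithm selection and hyperparameter optimisation via random search with scikit-learn is given below; the candidate models, search spaces and dataset are illustrative and are not the configuration proposed in the article.

```python
# Combined algorithm selection and hyperparameter optimisation (CASH) via
# random search: each trial samples both a model class and its hyperparameters.
import random
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

search_space = {
    RandomForestClassifier: {"n_estimators": [50, 100, 200], "max_depth": [3, 5, None]},
    SVC: {"C": [0.1, 1.0, 10.0], "gamma": ["scale", "auto"]},
}

best_score, best_config = -1.0, None
for _ in range(20):
    model_cls = random.choice(list(search_space))
    params = {name: random.choice(values)
              for name, values in search_space[model_cls].items()}
    score = cross_val_score(model_cls(**params), X, y, cv=3).mean()
    if score > best_score:
        best_score, best_config = score, (model_cls.__name__, params)

print(best_config, best_score)
```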
In Phase 4, an automated pipeline optimisation is designed, comparable to the Tree-based Pipeline Optimization Tool (TPOT) [4], but for autonomously optimising feature pre-processors to maximise classification accuracy on an unsupervised classification task.
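For reference, the standard (supervised) usage of the classic TPOT API is sketched below; the proposed Phase 4 design would target an unsupervised analogue of this pipeline search, which TPOT itself does not provide.

```python
# Standard supervised TPOT usage (classic TPOT API), shown as the reference
# point the Phase 4 design is compared against.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from tpot import TPOTClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# TPOT evolves whole pipelines (pre-processors, model and hyperparameters)
# with genetic programming.
tpot = TPOTClassifier(generations=5, population_size=20,
                      random_state=42, verbosity=2)
tpot.fit(X_train, y_train)
print(tpot.score(X_test, y_test))
tpot.export("best_pipeline.py")
```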
The above analysis of current AI algorithms is designed for low-cost devices that contain substantially more memory than IoT sensors. In other words, this design could work for a ‘Raspberry Pi’ device, but it will not work for a low-memory, low-computation-power sensor. Given this, the functionality of the proposed AutoAI needs to be put in perspective. The proposed design could be applied in the metaverse, in mobile phones, or on edge devices with some memory and power. The design cannot be applied to sensors used to monitor water flow under a bridge, or air pollution, or smoke detector sensors in a forest.
3.1 Phase 1: automated data preparation
O1: develop an open-access autonomous data preparation method (for digital healthcare data) from edge devices—similar to the Oracle autonomous database,2 for autonomous ingestion of new and emerging forms of raw data, e.g., OSINT (big data). The first scientific milestone (M1) is to build a new autonomous data preparation method that can serve for training an AutoAI algorithm to: (1) become self-driving by automating data provisioning, tuning, and scaling; (2) become self-securing by automating data protection and security; and (3) become self-repairing by automating failure detection, failover, and repair.
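A minimal, purely illustrative skeleton of the ingest-validate-repair loop implied by M1 is sketched below; every class and function name is hypothetical, and the pollution check is a stand-in for learned detection of polluted OSINT records.

```python
# Hypothetical skeleton of an autonomous data preparation loop (all names
# are illustrative, not part of the proposed method).
from dataclasses import dataclass

@dataclass
class Record:
    source: str
    payload: dict

def looks_polluted(record: Record) -> bool:
    # Hypothetical rule: discard records that merely echo OSINT query results.
    return record.payload.get("origin") == "osint_query_echo"

def ingest(records: list[Record]) -> list[Record]:
    clean = []
    for record in records:
        try:
            if looks_polluted(record):   # self-securing: drop polluted data
                continue
            clean.append(record)
        except Exception:                # self-repairing: fail over and continue
            continue
    return clean                         # self-driving: no manual curation step
```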
The new method design includes learning how to identify, map and ignore patterns of data pollution (e.g., the use of direct references to results obtained from OSINT queries) and to become more efficient at autonomously building improved algorithms. To ensure the success of the autonomous data preparation method, a new scenario is constructed to teach the algorithm how adversarial systems pollute the training data and how to discard such data from the training scenarios. While constructing the scenario, the search for training data expands into new and emerging forms of data (NEFD), e.g., open data—Open Data Institute,3 Elgin,4 DataViva5;
1 https://www.featuretools.com/.
2 https://www.oracle.com/autonomous-database/.
3 https://theodi.org/.
4 https://www.elgintech.com/.
5 http://dataviva.info/en/.