
1 Introduction
With the advancement of data-collection technology, a wide range of industry and business
sectors are now able to collect functional data. According to Ramsay and Silverman [1], a
functional datum is not an individual value but rather a set of measurements/observations
along a continuum that, taken together, are to be regarded as a single entity. Functional
data come in many forms, but their defining quality is that they consist of functions – often,
but not always, curves. For example, spectroscopic techniques obtain spectral information
by probing each sample with electromagnetic radiation over a range of wavelengths,
so that the calculated absorption coefficient is a function of wavelength. The set of
absorption coefficients obtained for one sample across these wavelengths forms a single data unit. Another
example of functional data is an fMRI time series, that is, a sequence of 3D images
of the living human brain, where each 3D image consists of a large number of voxels (3D
pixels). The prevalent BOLD fMRI, for instance, detects the blood-oxygen-level-dependent
signal that reflects changes in deoxyhemoglobin, driven by localized changes in brain blood
flow and blood oxygenation. Each 3D image is a functional datum (or, equivalently, a
random field). Typical formats of functional data include time series, trajectories,
and spatio-temporal data. However, being functional is not the defining quality of time
series, trajectories, or spatio-temporal data. Ansari et al. [2] classified spatio-temporal data
into five types, some of which are not functional data. Apart from differences in the
definition of the data format, the main difference lies in the focus of statistical analysis:
functional data analysis focuses on relations among the random elements, rather than on
properties of individual random elements.
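To make the notion of a functional datum concrete, the following minimal Python sketch simulates a few absorption spectra on a common wavelength grid. The wavelength range, the Gaussian band shape, and the function name make_spectra are illustrative assumptions of ours and are not taken from any cited work; the only point is that each row, i.e., a whole curve, is treated as one observation.

```python
import numpy as np


def make_spectra(n_samples=5, n_wavelengths=200, seed=0):
    """Simulate absorption spectra; one row corresponds to one functional datum."""
    rng = np.random.default_rng(seed)
    # Hypothetical wavelength grid in nanometres (illustrative values only).
    wavelengths = np.linspace(400.0, 700.0, n_wavelengths)
    # Each sample gets its own band centre and band height.
    centers = rng.uniform(500.0, 600.0, size=n_samples)
    heights = rng.uniform(0.5, 1.5, size=n_samples)
    # Gaussian-shaped absorption band evaluated on the whole grid
    # (an assumed shape, chosen only to produce smooth curves).
    z = (wavelengths[None, :] - centers[:, None]) / 20.0
    curves = heights[:, None] * np.exp(-0.5 * z**2)
    return wavelengths, curves


wavelengths, spectra = make_spectra()
# The unit of analysis is the whole curve (a row), not a single coefficient.
print(spectra.shape)  # (5, 200)
```

In this discretized representation, the 200 absorption coefficients in a row are not 200 separate observations but a single sampled function of wavelength.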
While functional data analysis has received attention from statisticians since the 1980s,
there has been very little advancement in the area of functional data clustering. In two
databases, Scopus and Web of Science, we found only about 100 articles on developing
clustering methods for functional data.¹ Moreover, nearly all documented methods tackle
only the functional-data part of the problem, not the clustering part. For
example, many studies are mainly concerned with extracting a tabular-data proxy for functional data,
ignoring the synergy between the feature-learning (a.k.a. representation-learning) step and
the clustering step. The main objective of our review is to develop an overarching structure
of existing functional data clustering methods, which highlights the similarities and differ-
¹ In the appendix, we give the details of the literature identification and article selection
process. We also provide a table that classifies the reviewed articles according to our
taxonomy.