Using the conceptual model as a framework, we analyze the
current state of visualization research and contribute a structured
and systematic literature analysis of the full papers published in
IEEE Visualization, SciVis, InfoVis, and VAST from 1990 to 2020.
The literature search yielded 127 articles, for which we derived
a coding scheme and which we then coded and analyzed. Four of the
authors participated in multiple rounds of reviews of these relevant
papers followed by discussions to establish the conceptual model,
scenarios, and coding scheme. The two other authors coded the
complete set of papers after being introduced to the coding scheme.
Our goal was to learn about the current usage of the notion of
scalability in visualization research, as well as to assess how well
our conceptual model allows characterizing previous research on
scalability. We make the coding book and results publicly available
at the following repository: https://osf.io/xrvu7/.
Based on our conceptual model, general observations, and
the literature review, we arrive at recommendations to improve
the design and presentation of scalability-related research when
targeting an outside or mixed audience. We believe that this would
also help compare visualization techniques and systems, and foster
reproducibility.
2 RELATED WORK
The visualization research community has become more diverse
over the years, starting with statistics, algorithms, computer
graphics, and computational science in the early 1990s, and joined
by human-computer interaction (HCI), psychology, vision science,
design, cartography, and many more. The concept of scalability
varies from one community to the next, with different levels of
maturity. In this section, we review related work discussing and
defining scalability in different areas of computer science and in
the visualization community.
2.1 Definitions of Scalability
Weinstock and Goodenough [5] define the scalability problem as
“the inability of a system to accommodate an increased workload.”
Bondi [6] mentions several definitions of scalability in computer
science:
• “Scalability is the property of a system to handle a growing
amount of work by adding resources to the system.” Adding
resources may take the form of adding more nodes to a system
made of multiple small interconnected servers (scaling out
or horizontally) or adding more resources to a single node
(scaling up or vertically) [7].
• Load scalability is the “ability to function gracefully, i.e.,
without undue delay and without unproductive resource
consumption or resource contention at light, moderate, or
heavy loads while making good use of available resources.”
• Space scalability means that “memory requirements do not grow to
intolerable levels as the number of items it supports increase.”
• Space-time scalability: a system “continues to function gracefully
as the number of objects [. . . ] increases by orders of magnitude.”
• Structural scalability means that “implementation or standards
do not impede the growth of the number of objects it
encompasses, or at least will not do so within a chosen time
frame.”
Parallel systems and HPC mainly distinguish two types of
scalability:
• Strong scaling: “how the solution time varies with the number
of processors for a fixed total problem size.”
• Weak scaling: “how the solution time varies with the number
of processors for a fixed problem size per processor.”
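Although the quoted definitions do not prescribe a metric, the HPC
community commonly summarizes both notions as parallel efficiency;
as a minimal sketch in standard (not source-specific) notation,
\[
E_{\text{strong}}(N) = \frac{T_1}{N \, T_N}, \qquad
E_{\text{weak}}(N) = \frac{T_1}{T_N},
\]
where $T_1$ is the solution time on one processor and $T_N$ the time
on $N$ processors, with the total problem size held fixed for strong
scaling and the per-processor problem size held fixed for weak
scaling; values close to 1 indicate good scaling behavior.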
Hill [8] tries to define scalability for multiprocessor systems and
admits: “but I fail to find a useful, rigorous definition of it.” Duboc
et al. [9] define it as: “a quality of software systems characterized
by the causal impact that scaling aspects of the system environment
and design have on certain measured system qualities as these
aspects are varied over expected operational ranges. If the system
can accommodate this variation in a way that is acceptable to the
stakeholder, then it is a scalable system.”
All the definitions are specified as properties of systems
at an abstract level, focusing on “amount of work,” “delay,”
“resources,” “productive resource consumption,” “[work]loads,”
“memory,” “function gracefully,” “time frame,” “adding nodes,”
and “shared memory.” They rely on implicit domain knowledge to
be clearly understood and are not suitable for the wide audience of
visualization practitioners.
2.2 Scalability in Visualization and Visual Analytics
Visualization and visual analytics are concerned with general
computer science scalability when it comes to systems or algo-
rithms. In addition, they are concerned with more specific
issues. Robertson et al. [10] mention information scalability, visual
scalability, display scalability, and human scalability, in addition
to computational scalability. They also add other scalability issues:
software scalability, temporal scalability, cross-scale issues, privacy
and security issues (related to scale), and language issues. Yost and
North [11] also mention graphical scalability (“limits imposed
by the number of pixels”) and perceptual scalability (“When
the screen is not the limiting factor, just how much data can a
person effectively perceive?”). Eick and Karr [12] aim to quantify
visual scalability by modeling the dependence between responses,
factors, and data. They admit that this cannot be fully achieved because few
responses can be quantified or measured. Instead, they break down
the problem into subparts affecting the overall scalability, adding
“visual metaphors,” “interactivity,” and “aggregation” to the list of
factors affecting scalability.
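To make this dependence concrete, one abstract way to express it
(our illustrative notation, not Eick and Karr’s own formulation) is
as a response function
\[
r = f(d, x_1, \dots, x_k),
\]
where $r$ is a measured response (e.g., rendering time or task
accuracy), $d$ characterizes the data (e.g., number of records or
dimensions), and the $x_i$ are factors such as display resolution,
visual metaphor, interactivity, and aggregation; visual scalability
then concerns how far $d$ can grow before $r$ becomes unacceptable.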
Scalability is also related to evaluation, since assessing it relies on
measuring efficiency. Lam et al. [13] describe seven scenarios for
evaluation in visualization, some of them leading to quantitative
results and others to qualitative ones. Scalability is part of the
“Evaluating User Performance” and “Evaluating Visualization
Algorithms” scenarios. One area in which scalability evaluation is
well-established is the HPC/visualization community, where the
main focus is on algorithmic scalability with well-defined metrics
and definitions (e.g., strong scalability). However, the rest of the
visualization community may not be familiar with these definitions,
and it remains unclear whether they could be applied in contexts
broader than those with access to HPC resources.
2.3 Scalability in HCI, Psychology, and Vision Science
Scalability related to humans is different from scalability in
computer science. In their seminal book, Card et al. [14] describe
the human as a processor with numerous capabilities, some of them
ruled by laws or models expressible mathematically. Visualization
is concerned with several of these capabilities, in particular regard-
ing perceptual scalability, cognitive scalability, and movement. These
psychological laws and models often draw on information theory,
considering perception and action as communication through
capacity-limited channels.
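A well-known example of such a law is Fitts’s law for aimed
movements, often written in its Shannon formulation (a standard
form, not necessarily the exact variant given in [14]):
\[
MT = a + b \log_2\!\left(\frac{D}{W} + 1\right),
\]
where $MT$ is the movement time, $D$ the distance to the target, $W$
its width, and $a$, $b$ empirically fitted constants; the logarithmic
term plays the role of an index of difficulty measured in bits
transmitted over a capacity-limited motor channel.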