
Fahrbach et al. [9] develop a method for efficient sampling of ridge regression problems involving
Kronecker product design matrices. They use these techniques to achieve an efficient sampling-based
ALS method for regularized Tucker decomposition.
Efficient sketching of structured matrices is an active research area with applications beyond
tensor decomposition. The recent work by Ma and Solomonik [16] is particularly interesting since
it proposes structured sketching operators for general TN matrices. These operators take the
form of TNs made up of Gaussian tensors and are therefore dense. When used in ALS for tensor
decomposition, their sketching operators consequently need to access all entries of the data tensor in
each least squares solve, which leads to a per-iteration cost that is at least linear in the size of the
input tensor (and therefore exponential in N).
Due to space constraints, we have focused here on previous works that develop sampling-based ALS
methods. There are many works that develop randomized tensor decomposition methods using other
techniques; for a comprehensive list of such works, see [18, Sec. 2]. For an overview of work on
structured sketching, see [20, Sec. 1.2].
3 Preliminaries
3.1 General Notation
By tensor, we mean a multidimensional array containing real numbers. We refer to a tensor with
N indices as an N-way tensor, and we say that it is of order N or that it has N modes. We use
bold uppercase Euler script letters (e.g., X) to denote tensors of order 3 or greater, bold uppercase
letters (e.g., X) to denote matrices, bold lowercase letters (e.g., x) to denote vectors, and regular
lowercase letters (e.g., x) to denote scalars. We denote entries of an object either in parentheses
or with subscripts. For example, $X(i, j, k) = X_{ijk}$ is the entry at position $(i, j, k)$ in the 3-way
tensor X, and $x(i) = x_i$ is the $i$th entry of the vector x. We use a colon to denote all entries
along a particular mode. For example, X(i, :) denotes the ith row of the matrix X. Superscripts
in parentheses are used to denote a sequence of objects. For example, $A^{(1)}, \ldots, A^{(M)}$ is a
sequence of tensors and $e^{(i)}$ is the $i$th canonical basis vector. The values that a tensor's indices can
take are referred to as dimensions. For example, if $X = (X_{ijk})_{i=1, j=1, k=1}^{I, J, K}$, then the dimensions of
modes 1, 2 and 3 are $I$, $J$ and $K$, respectively. Note that dimension does not refer to the number of
modes, which is 3 for X. For a positive integer $I$, we use the notation $[I] \overset{\text{def}}{=} \{1, \ldots, I\}$. For indices
$i_1 \in [I_1], \ldots, i_N \in [I_N]$, the notation $\overline{i_1 \cdots i_N} \overset{\text{def}}{=} 1 + \sum_{n=1}^{N} (i_n - 1) \prod_{j=1}^{n-1} I_j$ will be used to denote
the linear index corresponding to the multi-index $(i_1, \ldots, i_N)$. The pseudoinverse of a matrix A is
denoted by $A^{+}$. We use $\otimes$, $\odot$ and $\circledast$ to denote the Kronecker, Khatri–Rao and Hadamard matrix
products, which are defined in Section A.
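As a concrete illustration of the linear index convention, the following sketch (in Python with NumPy; the helper name linear_index is ours and not part of the text) computes $\overline{i_1 \cdots i_N}$ for a small 3-way tensor and checks that it agrees with column-major (Fortran-order) vectorization.

```python
import numpy as np

def linear_index(multi_index, dims):
    """Return 1 + sum_n (i_n - 1) * prod_{j < n} I_j for a 1-based multi-index."""
    idx, stride = 1, 1
    for i_n, I_n in zip(multi_index, dims):
        idx += (i_n - 1) * stride
        stride *= I_n
    return idx

# Small 3-way example with dimensions I = 2, J = 3, K = 4.
dims = (2, 3, 4)
X = np.arange(np.prod(dims)).reshape(dims)

# Column-major (Fortran-order) vectorization matches this linear index convention.
x = X.flatten(order="F")
i, j, k = 2, 3, 1                    # a 1-based multi-index
lin = linear_index((i, j, k), dims)  # the corresponding 1-based linear index
assert x[lin - 1] == X[i - 1, j - 1, k - 1]
```

Here flatten(order="F") plays the role of the vectorization operator, so that entry $(i_1, \ldots, i_N)$ of the tensor appears at position $\overline{i_1 \cdots i_N}$ of the vectorized tensor.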
3.2 Tensor Networks
Atensor network (TN) consists of tensors and connections between them that indicate how they
should be contracted with each other. Since the mathematical notation easily gets unwieldy when
working with TNs, it is common practice to use graphical representations when discussing such
networks. Figure 1shows how scalars, vectors, matrices and tensors can be represented graphically.
A circle or node is used to represent a tensor, and dangling edges are used to indicate modes of the
tensor.
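To make the contraction rule concrete, here is a minimal sketch (Python/NumPy; the tensor names and sizes are illustrative, not taken from Figure 1) of contracting a small three-tensor network with einsum. Each index shared by two tensors corresponds to a connecting edge in the graphical representation, and each index appearing only once corresponds to a dangling edge of the resulting tensor.

```python
import numpy as np

# A small tensor network: A is I x R1, B is R1 x R2 x J, C is R2 x K.
# The indices r and s are shared (connecting edges); i, j, k are dangling edges.
I, J, K, R1, R2 = 4, 5, 6, 2, 3
A = np.random.randn(I, R1)
B = np.random.randn(R1, R2, J)
C = np.random.randn(R2, K)

# Contracting the network sums over the shared indices and leaves a 3-way
# tensor whose modes are the dangling edges (i, j, k).
T = np.einsum("ir,rsj,sk->ijk", A, B, C)
print(T.shape)  # (4, 5, 6)
```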