• One can also directly minimize a function of the form
\[
\mathcal{L}\colon \mathbb{K}^{\mu} \times \mathbb{K}^{n} \to \mathbb{R}, \qquad (h, b) \mapsto \mathcal{L}(h, b) = L\bigl(y,\, h * (Qb),\, h,\, b\bigr),
\]
defined in terms of a suitable loss function $L$. Let us call such methods direct.
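For concreteness, a minimal sketch of such a direct formulation, assuming a least-squares loss, real-valued data and a circular convolution (the function name and the optional $\ell_1$ penalty below are illustrative choices, not taken from the referenced works), could look as follows.

```python
import numpy as np

def direct_loss(h, b, y, Q, lam=0.0):
    """Illustrative direct loss L(h, b) = L(y, h * (Q b), h, b).

    Assumes real-valued data and a circular convolution; `lam` optionally
    adds an l1 penalty promoting sparsity of h and b (one possible choice).
    """
    conv = np.fft.ifft(np.fft.fft(h) * np.fft.fft(Q @ b)).real  # h * (Q b)
    misfit = 0.5 * np.linalg.norm(y - conv) ** 2
    return misfit + lam * (np.abs(h).sum() + np.abs(b).sum())
```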
We will give a brief overview of the relevant results from the literature about methods from both categories. Throughout the discussion, it is instructive to remember that the optimal sample complexity (i.e. the number of observations needed for injectivity) is given by $\mu + n$ in the dense setting and $s + \sigma$ in the sparse setting [20, 26].
Lifted approaches As for lifted approaches, it is crucial to understand that the lifted ground truth signal $h_0 \otimes b_0$, or equivalently its matrix form $h_0 b_0^*$, where $b_0^*$ denotes the Hermitian transpose of $b_0$, simultaneously enjoys two structures: it has rank one and it is sparse (in a structured manner). Each of these structures can individually be exploited using convex regularizers – if we take the low-rank property as the crucial one, the nuclear norm is the canonical regularizer [2, 28, 19], whereas if the sparsity is taken as the crucial property, the $\ell_1$-norm [27] or its relative, the $\ell_{1,2}$-norm [13], should be used. While the convex approaches are stable, mathematically pleasing and globally solvable by off-the-shelf algorithms, they are computationally intensive and thus often impractical. Furthermore, recovery guarantees are only known for generic subspace models. Under a similar assumption of known supports as above, the nuclear norm approach recovers the ground truth when $\mu \gtrsim \max(s, \sigma) \log(\mu)^2$ [28, 19, 9]. However, without assuming known supports, that guarantee degenerates to $\mu \gtrsim \max(\mu, n) \log(\mu)^2$. The sparse models ($\ell_1$ and $\ell_{1,2}$) recover the ground truth with high probability when $\mu \gtrsim s\sigma \log(sn) \log(\mu)^2$, under the assumption that the support of $h$ is known [27, 13]. If the support of $h$ is not known, the guarantee again degenerates, to $\mu \gtrsim \mu\sigma \log(n) \log(\mu)^2$.
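To illustrate the shape of these lifted programs, the following sketch formulates the $\ell_1$-variant in a generic convex modelling tool (cvxpy), assuming real-valued data and a circular convolution. It is meant as a readable transcription of the optimization problem, not as an efficient solver, and the function name is ours; swapping cp.norm1 for cp.normNuc yields the nuclear-norm variant.

```python
import numpy as np
import cvxpy as cp

def lifted_l1_recovery(y, Q):
    """Sketch of the lifted convex approach: recover X ~ h0 b0^T from
    y = h0 * (Q b0) (circular convolution, real data) via l1 minimization."""
    mu, n = Q.shape
    X = cp.Variable((mu, n))
    # Linear measurement map: y[l] = <A_l, X> with (A_l)[k, m] = Q[(l - k) % mu, m].
    constraints = []
    for l in range(mu):
        A_l = Q[(l - np.arange(mu)) % mu, :]
        constraints.append(cp.sum(cp.multiply(A_l, X)) == y[l])
    problem = cp.Problem(cp.Minimize(cp.norm1(X)), constraints)
    problem.solve()
    return X.value
```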
There is also a natural non-convex way to solve the lifted problem: gradient descent projected onto the set of (bi-)sparse and low-rank matrices. As is thoroughly discussed in [16], there is however no practical algorithm available to compute this projection. The bi-sparse unit-rank projection is known as the sparse PCA problem, which has been shown to be NP-hard not only in the worst case, but also under approximation and on average [29, 8, 7]. A canonical way to circumvent this obstacle is to alternate between projections onto the two sets. This approach is investigated, for instance, in [11]. There, a local convergence guarantee at optimal sample complexity is presented, albeit for the considerably simpler measurement model of a Gaussian linear map without the structure of the convolution. A global guarantee is not given.
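A minimal sketch of such an alternating scheme is given below; entrywise hard thresholding stands in for the structured (row- and column-)sparsity projection, and, as discussed above, the iteration only approximates the NP-hard joint bi-sparse unit-rank projection rather than computing it exactly. All names are illustrative.

```python
import numpy as np

def project_rank_one(X):
    """Best rank-one approximation via a truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return s[0] * np.outer(U[:, 0], Vt[0, :])

def project_sparse(X, k):
    """Keep the k largest entries of X in magnitude, zero out the rest."""
    thresh = np.partition(np.abs(X).ravel(), -k)[-k]
    return np.where(np.abs(X) >= thresh, X, 0.0)

def alternate_projections(X, k, n_iter=50):
    """Alternate between the sparse and the rank-one projection.

    This only approximates the joint projection onto bi-sparse unit-rank
    matrices; it is not guaranteed to find the closest such matrix.
    """
    for _ in range(n_iter):
        X = project_rank_one(project_sparse(X, k))
    return X
```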
In this context, [3] should also be mentioned – there, the authors obtain global convergence in just two alternation steps, albeit by assuming a nested measurement structure tailor-made for a jointly low-rank and sparse setting, which is often not given.
Finally, in [1], a different but related problem was considered. In that paper, the authors consider the case in which several convolutions $y_i = h * (Q_i b_i)$, $i \in [N]$, are observed independently. Importantly, the authors do not assume that the support of $h$ is known, only that $h$ is sparse in a basis $B$ with a certain incoherence. However, they instead assume that the supports of the $b_i$ are known, and that the $Q_i$ are independently Gaussian distributed. Under those assumptions, they achieve recovery (through a nuclear norm minimization approach) of both $h$ and all $b_i$ already when $\mu \gtrsim (\sigma + s \log(s)^2) \log(\mu N)^4$ and $N \gtrsim \log(N\mu)$. If we drop the assumption of known supports of the $b_i$, the guarantee again degenerates, to $\mu \gtrsim (\mu + s \log(s)) \log(\mu s)^2$.
Direct approaches Among direct approaches, alternating minimization, or sparse power factorization, is probably the most prominent. It was introduced in [31] as a means of solving the so-called phase-retrieval problem and later adapted to blind deconvolution in [22, 23]. Alternating minimization generally refers to taking turns in solving the following two problems,
\[
\underset{h \ (\text{sparse})}{\operatorname{minimize}}\ \mathcal{L}(h, b') \qquad \text{and} \qquad \underset{b \ (\text{sparse})}{\operatorname{minimize}}\ \mathcal{L}(h', b).
\]
Since the convolution is linear in each argument, each subproblem is effectively a classical compressed
sensing problem, and can be solved using a number of techniques, e.g. iterative hard thresholding [15] or
CoSaMP [30].
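As a rough illustration, the sketch below implements this scheme for a least-squares loss under a circular-convolution model with real-valued data, using plain iterative hard thresholding as the subproblem solver; the step size, initialization and stopping rule are deliberately naive, and all names are ours.

```python
import numpy as np

def circulant(c):
    """Circulant matrix C with C @ x equal to the circular convolution c * x."""
    mu = len(c)
    return np.stack([np.roll(c, k) for k in range(mu)], axis=1)

def iht(A, y, k, n_iter=200):
    """Plain iterative hard thresholding for y ~ A x with k-sparse x."""
    x = np.zeros(A.shape[1])
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    for _ in range(n_iter):
        x = x + step * A.T @ (y - A @ x)
        x[np.argsort(np.abs(x))[:-k]] = 0.0   # keep only the k largest entries
    return x

def alternating_minimization(y, Q, s, sigma, n_outer=20):
    """Sketch of alternating minimization for y = h * (Q b),
    with s-sparse h and sigma-sparse b (circular convolution, real data)."""
    mu, n = Q.shape
    b = np.random.default_rng(0).standard_normal(n)
    for _ in range(n_outer):
        h = iht(circulant(Q @ b), y, s)        # update h with b fixed
        b = iht(circulant(h) @ Q, y, sigma)    # update b with h fixed
    return h, b
```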
As for the question of success of alternating minimization, the authors of [22, 23] give a recovery guarantee in a special setting: First, they use a generic subspace model, where $h$ has a representation $h = Bg$ with some $s$-sparse vector $g$, rather than being sparse itself, and $B$ is assumed to be a matrix with random Gaussian entries. The measurement matrix $Q$ is also assumed to be Gaussian. An additional