system to perform consistent back-propagation update
[18]. However, the fine-tuning of the optical parameters
is challenging due to discrepancies between the response
of the real system and the physical model employed to
describe the architecture. This reality gap often reduces
the expected performance of the network [19, 20], requiring
additional corrections at the software level [12], training
enforcement via hybrid strategies [18], or employing NNs
to more accurately model the optical response of the sys-
tem [20].
In this rapidly evolving scenario, the random projections
realized by multi-mode fibers (MMFs) are promising
candidates for realizing ONNs. These devices scramble
the photons due to scattering events occurring during
the light-field propagation, leading to the formation
of speckle patterns that are, in fact, random
projections. Although the light transmission can be re-
garded as a linear process [21] in which input modes are
coupled with output modes via a complex transmission
rule, interference takes place when dealing with the mea-
surement of the light-field intensity. Since the detection is
nonlinear, MMFs can be used [22] to classify time-domain
waveforms (using saturation effects as a further
nonlinearity) [23], in pattern classification of 2-bit sequences [24],
or for binary (human, not human) facial recognition [25].
Furthermore, when dealing with more complex classifi-
cation tasks, high-power laser pulses were employed to
trigger the nonlinear response of the fiber itself [26]. Due
to the increasing interest in the employment of MMFs
as random projector computing devices, we decided to
study their behavior in carrying out classification tasks
in a linear, low-power continuum regime. Although our
MMF-based optical neural network does not employ
feedback, we will see that its classification performance is
considerable, as in reservoir computing systems [27–30].
We do this by comparing the performance of the phys-
ical neural network to that obtained with random Gaus-
sian linear projections and to that of a transmission ma-
trix approach, the model commonly used to describe light
propagation in disordered structures [21, 31]. We per-
form our study statistically, shuffling the training set to
assess the average behavior of the optical computing un-
der different training and initialization conditions. Re-
markably, a single MMF provides simultaneously two in-
dependent (though deterministically linked) projections
at both edges of the fiber, which we study separately
using different saturation regimes. Here, we show that
the real physical MMF leads to accuracy higher than its
corresponding transmission matrix model, highlighting
the reality gap between model theory and experimental
results. To identify the reason for this performance gap,
we study the characteristics of complex-valued random
projections in terms of the flatness of the local energy
landscape, proving that the MMF projection is more robust
than those provided by alternative datasets. Addition-
ally, we characterize the behavior of a hardware-based
neural network using optical fibers in terms of the
number of modes employed. We have set up our study
not for achieving the best performance in classification
tasks, but rather to deepen the understanding of physi-
cal neural networks against their physical model, giving
insights into the use of MMFs for optical
computation.
II. MATERIALS AND METHODS
In a low-power regime, a generic multi-mode fiber
transports the electromagnetic field via a linear process
[21] so that the light propagation can be described using
a simple multiplication of the input signal by a matrix
that encodes the transmission rule:
$$y = Tx. \tag{1}$$
In this descriptive model, x is the controlled input, T is
the (complex-valued and typically unknown) transmission
matrix of the medium, and y is the output field.
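As a minimal illustration of Eq. (1), the following Python sketch propagates an input through a stand-in transmission matrix and checks that the field (not the intensity) obeys superposition. The i.i.d. complex Gaussian matrix is only an assumption replacing the unknown T of a real fiber, and the function names `random_transmission_matrix` and `propagate` are hypothetical. Plain Python is used to keep the sketch dependency-free.

```python
import random

def random_transmission_matrix(n_out, n_in, seed=0):
    """Hypothetical stand-in for T: i.i.d. circular complex Gaussian entries."""
    rng = random.Random(seed)
    # Each entry has variance 1 / n_in, so E|y_j|^2 = ||x||^2 / n_in.
    scale = (2 * n_in) ** -0.5
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) * scale
             for _ in range(n_in)]
            for _ in range(n_out)]

def propagate(T, x):
    """Eq. (1): y = T x, a plain complex matrix-vector product."""
    return [sum(t * xi for t, xi in zip(row, x)) for row in T]

T = random_transmission_matrix(n_out=8, n_in=4)
x1 = [1, 0, 1j, 0]
x2 = [0, 0.5, 0, -1]

y1 = propagate(T, x1)
y2 = propagate(T, x2)
y12 = propagate(T, [a + b for a, b in zip(x1, x2)])

# Linearity of the field: T(x1 + x2) equals T x1 + T x2 up to rounding.
field_error = max(abs(c - (a + b)) for a, b, c in zip(y1, y2, y12))
```

Here `field_error` stays at the level of floating-point rounding, which is the sense in which the propagation itself is a linear process.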
Although the propagation is linear, the way we measure the
MMF output is not, for two reasons. First, photons carry
complex signals, i.e., the electromagnetic field associated
with each propagation mode is characterized by amplitude
and phase. Current electronic devices cannot follow the
rapid oscillation of the field, which makes it impossible
to measure the phase information directly. Allowing for
the possibility that the readout is also perturbed by an
additive noise ε, the camera only sees the noise-affected
intensity distribution:
$$|y|^2 = |Tx|^2 + \varepsilon. \tag{2}$$
Second, the camera has a well-defined sensitivity range
that depends on each pixel's capability to store intensity
change. If the signal reaching a given pixel falls outside
this range, the measurement gets clipped at the peak
(overexposure) or at the bottom (underexposure). In analogy
to machine learning terminology, the measuring process
can be described by a non-linear activation function σ(·)
that acts on the result of a complex-valued linear trans-
mission, Tx. For instance, the camera recording process
can be represented using the saturating linear transfer
function (SatLin):
$$\sigma(Tx) = \min\left(\max\left(d,\, |Tx|^2 + \varepsilon\right),\, 2^b - 1\right), \tag{3}$$
where the quantity d is the intensity threshold below
which the measurement is not recorded, and b is the bit depth
of the camera.
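The SatLin readout of Eq. (3) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the acquisition code used in the experiment: the function name `satlin_readout` is hypothetical, the noise ε is assumed Gaussian (the text only requires it to be additive), and the field is assumed to be already scaled to camera counts.

```python
import random

def satlin_readout(field, d=0.0, bit_depth=8, noise_std=0.0, seed=0):
    """Eq. (3) per pixel: min(max(d, |y|^2 + eps), 2^b - 1), with y = Tx."""
    rng = random.Random(seed)
    ceiling = 2 ** bit_depth - 1  # largest count an ideal b-bit pixel can store
    readout = []
    for y in field:
        # Assumed Gaussian noise; the text only says "additive".
        eps = rng.gauss(0.0, noise_std)
        readout.append(min(max(d, abs(y) ** 2 + eps), ceiling))
    return readout

# Hypothetical output fields, already scaled to camera counts.
field = [0.5 + 0.5j, 3 + 4j, 20 + 0j, 0.1j]
counts = satlin_readout(field, d=1.0, bit_depth=8)

# Intensity detection is not additive even without clipping or noise:
# doubling the field quadruples the reading.
single = satlin_readout([2 + 0j])[0]
doubled = satlin_readout([4 + 0j])[0]
```

With d = 1 and b = 8, the four sample pixels read 1, 25, 255, and 1: the first and last are raised to the lower threshold, the third saturates at 2^8 − 1, and only the second falls inside the sensitivity range. The last two lines give 4 and 16, showing that the quadratic detection alone already breaks linearity.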
These considerations make the readout of a coherent
field non-linear, and the same holds for the inverse
problem of recovering the transmission matrix [21, 31–36].
Such a matrix can be estimated by using the four-phases
method [21], Bayesian optimization [37], or iterative
Gerchberg-Saxton schemes [38, 39]. However, neither
characterizing the device in terms of its transmission
rule nor circumventing the limitations of the measuring
process is the main scope of this paper.
Instead, we want to study the multi-modal random-
projection nature of the fiber to perform optical com-
puting. In the neural network framework, the fiber can