Low-power multi-mode fiber projector overcomes shallow neural networks classifiers
Daniele Ancora1,3,5,∗, Matteo Negri1,3, Antonio Gianfrate2, Dimitris Trypogeorgos2,
Lorenzo Dominici2, Daniele Sanvitto2, Federico Ricci-Tersenghi1,3,4, and Luca Leuzzi3,1
1Department of Physics, Università di Roma la Sapienza, Piazzale Aldo Moro 5, I-00185 Rome, Italy
2Institute of Nanotechnology, Consiglio Nazionale delle Ricerche (CNR-NANOTEC), Via Monteroni, I-73100 Lecce, Italy
3Institute of Nanotechnology, Soft and Living Matter Laboratory,
Consiglio Nazionale delle Ricerche (CNR-NANOTEC), Piazzale Aldo Moro 5, I-00185 Rome, Italy
4Istituto Nazionale di Fisica Nucleare, sezione di Roma1, Piazzale Aldo Moro 5, I-00185 Rome, Italy and
5Epigenetics and Neurobiology Unit, European Molecular Biology
Laboratory (EMBL Rome), Via Ramarini 32, 00015 Monterotondo, Italy
(Dated: December 18, 2023)
In the domain of disordered photonics, the characterization of optically opaque materials for
light manipulation and imaging is a primary aim. Among various complex devices, multi-mode
optical fibers stand out as cost-effective and easy-to-handle tools, making them attractive for several
tasks. In this context, we cast these fibers into random hardware projectors, transforming an input
dataset into a higher dimensional speckled image set. The goal of our study is to demonstrate that
using such randomized data for classification by training a single logistic regression layer improves
accuracy compared to training on direct raw images. Interestingly, we found that the classification
accuracy achieved is higher than that obtained with the standard transmission matrix model, a
widely accepted tool for describing light transmission through disordered devices. We conjecture
that this improved performance arises because the hardware classifier
operates in a flatter region of the loss landscape when trained on fiber data, which is consistent with the
current theory of deep neural networks. These findings suggest that the class of random projections
performed by multi-mode fibers generalizes better to previously unseen data, positioning these fibers as
promising tools for optically-assisted neural networks. With this study, we aim to contribute
to the understanding and practical use of these versatile instruments, which may play
a significant role in shaping the future of neuromorphic machine learning.
Keywords: Multi-mode fibers, random projections, disordered photonics, neural networks,
classification, MNIST.
I. INTRODUCTION
A sound understanding of the enormous success of
Neural Networks (NNs) in learning processes and infer-
ence tasks is still lacking. The fundamental point is to
understand why such architectures, which can have even
billions of parameters, do not severely overfit data, as
predicted by statistical learning theory and the so-called
bias-variance tradeoff (see for example [1] or [2]). The
abundance of learnable parameters, in fact, is arguably
the most universal feature in the zoo of NN architectures.
Interestingly, it is known that, given a chosen NN archi-
tecture, most of the model parameters adapt little-to-
nothing during the learning procedure [3, 4], suggesting
that random projections may play an equally important
role in NNs. Recent works, in fact, have shown that it
is possible to train a simple two-layer model by learning
only the upper layer, interpreting the first one as a ran-
dom projection [5, 6]. These results were strengthened
further by Baldassi et al. [7], who proved that increas-
ing the dimension of the random projection leads to the
production of wide and flat regions in the loss landscape
(the function that is minimized during the training of
the model), which are related to the good generalization
properties of neural networks. The ability of a neural network
that has been trained on a given dataset (training dataset)
to generalize well is its ability to perform well
when applied to data on which it
was not trained (test dataset). In the framework of the
loss landscape description, an improvement in generalization
means that models that lie in flat regions make
fewer mistakes when they classify previously unseen data.
Finally, recent evidence [8] indicates that the way the
random projection is chosen is fundamental in determining
the generalization properties of the upper layer of
these simple models. This suggests that different classes
of random (possibly non-linear) projections affect
the performance of the models differently.
∗ daniele.ancora@uniroma1.it; daniele.ancora@cnr.it
In this context, we are interested in studying hard-
ware random projectors, such as those employed in the
field of photonic neuromorphic computing [9, 10]. The
advantage of using optical neural networks (ONN) is
that neurons can interact by exploiting light scattering
[11–13] and photon interference [14, 15] at the speed
of light. Tools for shaping and controlling the light-
field [16] are becoming so versatile that the field is un-
der constant development, aiming at high-speed, high-
throughput optical-based computing architectures. All-
optical neural networks [11, 17], in particular, have the
potential to be great tools for fast computation, though
they often require accurate modeling of the optical
system to perform consistent back-propagation updates
[18]. However, fine-tuning the optical parameters
is challenging due to discrepancies between the response
of the real system and the physical model employed to
describe the architecture. This reality gap often reduces
the expected performance of the network [19, 20], requiring
additional corrections at the software level [12], training
enforcement via hybrid strategies [18], or the use of NNs
to model the optical response of the system more accurately
[20].
arXiv:2210.04745v3 [physics.optics] 2 Apr 2025
In this rapidly evolving scenario, the class of random
projections realized by multi-mode fibers (MMFs)
is a promising candidate for realizing ONNs. These devices
scramble the photons through scattering events occurring
during the light-field propagation, leading to the
formation of speckle patterns that are, in fact, random
projections. Although the light transmission can be regarded
as a linear process [21] in which input modes are
coupled with output modes via a complex transmission
rule, interference takes place when the light-field
intensity is measured. Since the detection is
nonlinear, MMFs can be used [22] to classify time-domain
waveforms (using saturation effects as a further nonlinearity)
[23], for pattern classification of 2-bit sequences [24],
or for binary (human, not human) facial recognition [25].
Furthermore, for more complex classification
tasks, high-power laser pulses have been employed to
trigger the nonlinear response of the fiber itself [26]. Given
the increasing interest in using MMFs
as random-projector computing devices, we decided to
study their behavior in carrying out classification tasks
in a linear, low-power continuum regime. Although our
MMF-based optical neural network does not employ feedback,
we will see that its classification performance is
considerable, as in reservoir computing systems [27–30].
We do this by comparing the performance of the physical
neural network to that obtained with random Gaussian
linear projections and to that of a transmission matrix
approach, the model commonly used to describe light
propagation in disordered structures [21, 31]. We perform
our study statistically, shuffling the training set to
assess the average behavior of the optical computing under
different training and initialization conditions. Remarkably,
a single MMF simultaneously provides two independent
(though deterministically linked) projections
at the two edges of the fiber, which we study separately
using different saturation regimes. Here, we show that
the real physical MMF leads to higher accuracy than its
corresponding transmission matrix model, highlighting
the reality gap between model theory and experimental
results. To assess the reason for this performance gap,
we study the characteristics of complex-valued random
projections in terms of the flatness of the local energy landscape,
proving that the MMF projection is more robust
than those provided by alternative datasets. Additionally,
we characterize the behavior of a hardware-based
neural network using optical fibers in terms of the number
of modes employed. We have set up our study
not to achieve the best performance in classification
tasks, but rather to deepen the understanding of physical
neural networks compared with their physical model, giving
insights into the use of MMFs for optical computation.
II. MATERIALS AND METHODS
In a low-power regime, a generic multi-mode fiber
transports the electromagnetic field via a linear process
[21] so that the light propagation can be described using
a simple multiplication of the input signal by a matrix
that encodes the transmission rule:
$$y = Tx. \qquad (1)$$
In this descriptive model, x is the controlled input, T is
the (complex-valued and typically unknown) transmission
matrix of the medium, and y is the output field.
Although the propagation is linear, the way we measure the MMF
output is not, for two reasons. First, photons carry
complex signals, i.e., the electromagnetic field associated
with each propagation mode is characterized by amplitude
and phase. Current electronic devices cannot follow the
rapid oscillation of the field, which makes it impossible to
measure the phase information. Assuming that
the readout may also be perturbed by additive
noise ε, the camera only sees the noise-affected intensity
distribution:
$$|y|^2 = |Tx|^2 + \varepsilon. \qquad (2)$$
Second, the camera has a well-defined sensitivity range
that depends on each pixel's capability to store intensity
changes. If the signal reaching a given pixel falls outside this
range, the measurement gets clipped at the peak (overexposure)
or at the bottom (underexposure). In analogy
to machine learning terminology, the measuring process
can be described by a non-linear activation function σ(·)
that acts on the result of a complex-valued linear transmission,
Tx. For instance, the camera recording process
can be represented using the saturating linear transfer
function (SatLin):
$$\sigma(Tx) = \min\{\max\{d, |Tx|^2 + \varepsilon\}, 2^b - 1\}, \qquad (3)$$
where the quantity d is the intensity threshold under
which the measurement is not recorded, and b is the bit depth
of the camera.
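The readout model of Eqs. (1)-(3) can be sketched numerically as follows. This is only an illustration under stated assumptions: the transmission matrix is replaced by a complex Gaussian random matrix (the real T of a fiber is unknown), the additive noise is taken to be Gaussian, and the function name `satlin_readout` and all sizes are hypothetical choices, not taken from the paper.

```python
import numpy as np

def satlin_readout(T, x, d=0.0, bit_depth=8, noise_std=0.0, rng=None):
    """Sketch of Eq. (3): noisy intensity of T @ x, clipped to [d, 2**b - 1]."""
    rng = np.random.default_rng() if rng is None else rng
    field = T @ x                              # Eq. (1): linear, complex-valued transmission
    intensity = np.abs(field) ** 2             # the camera records intensity, phase is lost
    # Additive readout noise; the Gaussian form is an assumption of this sketch.
    noise = noise_std * rng.standard_normal(intensity.shape)
    return np.minimum(np.maximum(d, intensity + noise), 2 ** bit_depth - 1)

rng = np.random.default_rng(0)
n_in, n_out = 16, 64                           # hypothetical mode counts (few-to-many)
# Complex Gaussian stand-in for the unknown transmission matrix T (assumption).
T = (rng.standard_normal((n_out, n_in))
     + 1j * rng.standard_normal((n_out, n_in))) / np.sqrt(2)
x1, x2 = rng.random(n_in), rng.random(n_in)

# Without noise or saturation, the readout already violates superposition:
# |T(x1 + x2)|^2 differs from |T x1|^2 + |T x2|^2 by an interference term.
joint = satlin_readout(T, x1 + x2, bit_depth=16)
separate = satlin_readout(T, x1, bit_depth=16) + satlin_readout(T, x2, bit_depth=16)
interference_gap = np.max(np.abs(joint - separate))

# A strong input saturates an 8-bit camera at 2**8 - 1.
saturated = satlin_readout(T, 100.0 * x1, bit_depth=8)
print(interference_gap, saturated.max())
```

The first comparison isolates the non-linearity coming from intensity detection alone, while the second shows the additional clipping introduced by the finite bit depth.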
These considerations make the readout of a coherent
field non-linear, as well as its inverse transmission recovery
problem [21, 31–36]. Such a matrix can be estimated
by using the four-phases method [21], Bayesian optimization
[37], or iterative Gerchberg-Saxton schemes [38, 39].
However, neither the characterization of the device in terms of
its transmission rule nor the circumvention of the limitations
of the measuring process is the main scope of this paper.
Instead, we want to study the multi-modal random-
projection nature of the fiber to perform optical computing.
In the neural network framework, the fiber can
be seen as an optical analogue of a densely-connected
network composed of a single “hidden layer” with fixed
weights [40]. In this shallow architecture, the MMF layer
already contains a particular realization of static weights
(the transmission matrix T), which depends upon the
physical state of the optical fiber. This property allows one
to perform random (but deterministic) projections at the
speed of light using a fixed transmission rule, which can
be read out by the camera. Given these considerations,
the MMF is a good candidate for performing non-linear optical
computation with a continuous laser source, even when using
inexpensive and large (thus easier to handle in a setup)
optical multi-mode fibers. In particular, if we let just a
few modes propagate into the input facet of an MMF that
supports many more, all the output modes will be activated,
implying a few-to-many mapping. In
this latter case, the optical hidden layer (i.e., MMF and
camera) can perform densely-connected random projections
onto a higher-dimensional space.
FIG. 1. Schematic of the shallow optical neural network with the MMF. Panel a), simplified scheme of light transport through
multi-mode fibers. The MNIST dataset, modulated by a spatial light modulator, enters the MMF at the input facet. During
propagation, the light gets scrambled by a random but deterministic process, giving rise to the speckle pattern measured by
the camera. Panel b), corresponding neural network interpretation of the light propagation scheme. The MNIST set constitutes
the input vector of a linear complex layer with static weights. The non-linear operation is determined by the camera, which reads
the intensity of a complex field. Subsequently, a linear classification layer is trained using the output of the fiber. Panel c),
scheme of the imaging setup.
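The shallow architecture described above (a fixed few-to-many complex projection, an intensity readout, and a trained linear classifier on top) can be sketched in software as follows. This is a minimal sketch and not the experimental pipeline: a complex Gaussian matrix stands in for the fiber, a small synthetic dataset stands in for MNIST, and the sizes, learning rate, and number of steps are hypothetical choices made only to keep the example self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a labeled dataset (assumption; the paper uses MNIST).
n_samples, n_in, n_out, n_classes = 300, 16, 128, 3
labels = rng.integers(0, n_classes, size=n_samples)
Y = np.eye(n_classes)[labels]                        # one-hot targets
X = rng.random((n_samples, n_in)) + 0.5 * Y @ rng.random((n_classes, n_in))

# Fixed "hidden layer": complex Gaussian stand-in for the static matrix T
# (assumption), mapping n_in input modes onto n_out > n_in output pixels.
T = (rng.standard_normal((n_out, n_in))
     + 1j * rng.standard_normal((n_out, n_in))) / np.sqrt(2 * n_in)
H = np.abs(X @ T.T) ** 2                             # camera-like intensity readout
H = (H - H.mean(axis=0)) / (H.std(axis=0) + 1e-12)   # standardize each feature

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)             # numerical stabilization
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(W, b):
    p = softmax(H @ W + b)
    return -np.mean(np.log(p[np.arange(n_samples), labels] + 1e-12))

# Only the logistic (softmax) regression layer is trained; T is never updated.
W = np.zeros((n_out, n_classes))
b = np.zeros(n_classes)
loss_start = cross_entropy(W, b)
learning_rate = 0.01                                 # hypothetical, chosen small for stability
for _ in range(200):
    P = softmax(H @ W + b)
    W -= learning_rate * H.T @ (P - Y) / n_samples   # gradient of the mean cross-entropy
    b -= learning_rate * (P - Y).mean(axis=0)
loss_end = cross_entropy(W, b)

train_accuracy = np.mean(np.argmax(H @ W + b, axis=1) == labels)
print(loss_start, loss_end, train_accuracy)
```

In the experiment the matrix product and the squared modulus are carried out by the fiber and the camera, so only the final regression step would run in software.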
The goal of this study is to carry out image classification
by concatenating a software-trained linear layer to
the measured output of an MMF, produced by sending a
given image from the dataset into the input edge of the
fiber. We choose to approach the MNIST classification
problem in order to carry out a widely studied non-linear
task. The only parameters that we train are those of a
simple logistic regression layer, which is known to perform
poorly on the standard MNIST set, reaching
a maximum classification accuracy of 92.7% [29]. Exploiting
the random projection provided by the MMF, an
optical device that is known to be linear, we compare its
performance with that obtained using reference datasets.
In this study, we train the parameters of the logistic classifier
using six different input datasets:
1. Original MNIST. The standard MNIST dataset,
constituted by images of l × l pixels. The accuracy
obtained on this set is the baseline of our
study.
2. Upscaled MNIST. Each image at the original resolution
is expanded by a factor L/l using a linear
spline interpolation to reach the target size of L × L.
3. Randomized MNIST. The MNIST set is
multiplied by a Gaussian random matrix with
positive entries. This maps the dataset into a
higher-dimensional space, producing images with a
side of L > l pixels.
4. MMF α-cam. The speckled output of the MMF is
recorded with a resolution of L × L pixels. Each
speckle pattern is the result of sending an MNIST
image into the input edge of the fiber and recording the
output after disordered propagation. The patterns