
successfully employed to enhance the directionality of a white LED while maintaining the desired color temperature [18]. Designing multilayer thin-films [19, 20, 21] has been a task in the nanophotonics community for a long time, and sophisticated techniques for the synthesis of thin-films that exhibit desired optical characteristics have been developed in open-source or commercially available software [22, 23, 24, 25, 26]. Methods such as the Fourier method
[27, 28] or the needle method [29, 28, 30, 21] compute the position inside the thin-film at which the introduction of a new layer is most beneficial. The software then continues with a refinement process, often based on a gradient-based optimization such as the Levenberg-Marquardt algorithm [31, 32], until it reaches a local minimum, at which point another layer is introduced. Although the software often converges to a satisfying solution with respect to the given target, the presented solutions frequently use an excessive number of layers, and the optimization remains limited by the parameters selected at the beginning of the optimization. The problem of converging to local optima was tackled in the past by the development of numerous global optimization techniques, which have been introduced and tested in the field of thin-film optimization [33, 34, 35, 36, 37]. Recently, the innovations of machine learning attracted much
interest in the thin-film community and resulted in interesting new ways to create thin-films [5, 38]. In particular, deep reinforcement learning and Q-learning showed promising results in designing new and efficient multilayer thin-films while penalizing complicated designs that employ many layers [39, 40], and require targets that are difficult to achieve with conventional optimization.
In this work, we employ so-called conditional Invertible Neural Networks (cINNs) [41] to directly infer the loss landscape of all thin-film configurations with a fixed number of layers and material choice. The cINN learns to map the thin-film configuration to a latent space, conditional on the optical properties, i.e., the reflectivity of a thin-film. During
inference, due to the invertibility of the architecture, the cINN maps selected points from the latent space to their most
likely thin-film configurations, conditional on a chosen target. This results in requiring only a single application of the
cINN to obtain the most likely thin-film configuration given an optical target. Additionally, the log-likelihood training
makes the occurrence of mode collapse [42] almost impossible. For thin-films, many different configurations lead to
similar optical properties. In conventional optimization, this can cause convergence to unfavorable local minima. A cINN circumvents this due to the properties of its latent space: by varying the points in the latent space, a perfectly trained cINN is able to predict any possible thin-film configuration that satisfies the desired optical properties. In this work, we investigate how well the generative capabilities of a cINN are suited to finding suitable thin-film configurations in a real-world application. We present an optimization algorithm designed to improve the thin-film predictions of the cINN and compare its optimization results to state-of-the-art software. Finally, we discuss the limitations of the approach and give a guideline for when the application of a cINN is advantageous.
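As a rough illustration of this sampling-based inference, the following minimal Python/NumPy sketch shows the mechanics of a single conditional affine coupling block, the kind of building block such architectures are typically assembled from; an actual cINN stacks many of these blocks and trains their subnetworks by maximum likelihood. All dimensions, the subnetwork, and the weights below are hypothetical placeholders rather than the model used in this work; the point is only that the block is exactly invertible, so latent samples drawn from a Gaussian can be mapped to candidate thin-film configurations for one fixed optical target.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a stack described by 6 layer thicknesses, conditioned
# on a reflectivity target sampled at 32 wavelengths.
DIM_X, DIM_COND, HIDDEN = 6, 32, 16
SPLIT = DIM_X // 2

# Randomly initialised weights stand in for a trained subnetwork.
W1 = 0.1 * rng.normal(size=(SPLIT + DIM_COND, HIDDEN))
B1 = np.zeros(HIDDEN)
W2 = 0.1 * rng.normal(size=(HIDDEN, 2 * (DIM_X - SPLIT)))
B2 = np.zeros(2 * (DIM_X - SPLIT))

def subnet(inp):
    """Small fully connected network predicting log-scale and shift."""
    return np.tanh(inp @ W1 + B1) @ W2 + B2

def coupling_forward(x, cond):
    """x -> z for one conditional affine coupling block."""
    x1, x2 = x[:SPLIT], x[SPLIT:]
    log_s, t = np.split(subnet(np.concatenate([x1, cond])), 2)
    return np.concatenate([x1, x2 * np.exp(log_s) + t])

def coupling_inverse(z, cond):
    """z -> x, the exact inverse of coupling_forward."""
    z1, z2 = z[:SPLIT], z[SPLIT:]
    log_s, t = np.split(subnet(np.concatenate([z1, cond])), 2)
    return np.concatenate([z1, (z2 - t) * np.exp(-log_s)])

# "Inference": draw latent samples and map each of them to a candidate
# configuration for one and the same optical target.
target = rng.uniform(size=DIM_COND)      # placeholder reflectivity target
for _ in range(3):
    z = rng.standard_normal(DIM_X)       # z ~ N(0, I)
    x = coupling_inverse(z, target)      # candidate thin-film parameters
    assert np.allclose(coupling_forward(x, target), z)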
2 Normalizing flows and conditional invertible neural networks
Invertible neural networks are closely related to normalizing flows, which were first popularized by Dinh et al. [43]. A normalizing flow is an architecture that connects two probability distributions by a series of invertible transformations. The idea is to map a complex probability distribution to a known and simple distribution such as a Gaussian distribution. This can be used both for density estimation and for sampling, since points can easily be sampled from a Gaussian distribution and mapped to the complex distribution via the normalizing flow. The architecture of a normalizing flow is constructed as follows. Assume two probability distributions: a known distribution $\pi$ with $z \sim \pi(z)$, and the complex, unknown distribution $p$. The mapping between both is given by the change-of-variables formula
\[
p(x) = \pi(z) \left| \det \frac{\partial z}{\partial x} \right| . \tag{1}
\]
Consider a transformation $f$ which maps $f(x) = z$. Then the change-of-variables formula can be written as
\[
p(x) = \pi(z) \left| \det \frac{\partial f(x)}{\partial x} \right| . \tag{2}
\]
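For intuition, a one-dimensional sanity check of Eq. (2) can be written in a few lines of Python (NumPy/SciPy); the affine map and its parameters are chosen arbitrarily for illustration.

import numpy as np
from scipy.stats import norm

# One-dimensional check of Eq. (2): with z = f(x) = (x - mu) / sigma,
# x follows N(mu, sigma^2) whenever z follows the standard normal pi(z).
mu, sigma = 2.0, 0.5                 # arbitrary example parameters
x = np.linspace(0.0, 4.0, 9)

z = (x - mu) / sigma                 # f(x)
jac = 1.0 / sigma                    # |det df/dx| in one dimension
p_x = norm.pdf(z) * jac              # right-hand side of Eq. (2)

# Eq. (2) reproduces the density of N(mu, sigma^2) evaluated directly.
assert np.allclose(p_x, norm.pdf(x, loc=mu, scale=sigma))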
The transformation $f$ can be given by a series of invertible transformations $f = f_K \circ f_{K-1} \circ \ldots \circ f_1$ with $x = z_K = f(z_0) = (f_K \circ \ldots \circ f_1)(z_0)$. Then, the probability density at any intermediate point $z_i = f_i(z_{i-1})$ is given by $p_i(z_i)$.
By rewriting the change-of-variables formula and taking the logarithm, one obtains
\[
\log p(x) = \log\!\left( \pi(z_0) \prod_{i=1}^{K} \left| \det \frac{\partial f_i(z_{i-1})}{\partial z_{i-1}} \right|^{-1} \right) = \log \pi(z_0) - \sum_{i=1}^{K} \log \left| \det \frac{\partial f_i(z_{i-1})}{\partial z_{i-1}} \right| . \tag{3}
\]
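The role of the summed log-determinants in Eq. (3) can be illustrated with a toy chain of one-dimensional affine maps, for which the resulting density is known in closed form; the number of maps and their parameters below are arbitrary.

import numpy as np
from scipy.stats import norm

# Eq. (3) for a chain of K one-dimensional affine maps f_i(z) = a_i * z + b_i.
# (Affine maps are used only to keep the example analytically checkable.)
rng = np.random.default_rng(1)
a = rng.uniform(0.5, 2.0, size=4)        # scales of f_1 ... f_K
b = rng.normal(size=4)                   # shifts of f_1 ... f_K

def push_forward(z0):
    """x = (f_K o ... o f_1)(z0), accumulating the log-determinant terms."""
    z, log_det_sum = z0, 0.0
    for ai, bi in zip(a, b):
        z = ai * z + bi
        log_det_sum += np.log(abs(ai))   # log|det df_i/dz_{i-1}|
    return z, log_det_sum

z0 = 0.7                                  # latent sample with known pi(z0)
x, log_det_sum = push_forward(z0)
log_p_x = norm.logpdf(z0) - log_det_sum   # Eq. (3)

# The composition of affine maps keeps x Gaussian, so the result can be
# verified against the analytic density of x.
scale = np.prod(a)
loc, _ = push_forward(0.0)
assert np.isclose(log_p_x, norm.logpdf(x, loc=loc, scale=scale))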
To be practical, a key requirement on the transformations of a normalizing flow is that the Jacobian determinants of the individual transformations are easy to compute. A suitable invertible transformation, which is sufficiently