introduced to design ultra-compact, highly efficient and
complementary metal-oxide-semiconductor (CMOS)-compatible
integrated photonic devices. For instance, an approach based
on convex optimization has been used to design several
wavelength demultiplexers [7–10]. In addition, a direct-binary
search algorithm has been applied to design a polarization
beamsplitter and a polarization rotator [11, 12].
More recently, machine learning and artificial neural networks
have attracted great interest from researchers and have enabled
a different route to the design of photonic devices. For
example, deep learning has accelerated the design of
all-dielectric metasurface structures [13, 14], and Bragg
grating devices have been obtained by training artificial
neural networks [15]. In addition, a machine learning algorithm
has been proposed for focusing and optical coupling of light
[16], and reinforcement learning has been applied to design
optical coupler and asymmetric light transmitter devices [17].
In the present study, we demonstrate the design of wavelength
demultiplexers and a polarization beamsplitter operating at
near-infrared wavelengths via the attractor selection (AttSel)
algorithm, which is regarded as a reinforcement learning method
and is based on artificial neural networks [18, 19]. For this
purpose, we combine the algorithm with the three-dimensional
(3D) finite-difference time-domain (FDTD) method [20]. The
designed structures exhibit high optical performance in an
ultra-compact area, and their numerical investigations are
presented in detail. Fabrication constraints are considered
throughout the design process [11], so that the devices can be
fabricated in the future. We believe that the proposed approach
is not restricted to the design of integrated photonic devices
and may lead to advances in other photonic designs.
 
2. Design approach and numerical investigation of 
integrated photonic devices 
Even though photonic integrated circuits have superior features
compared to integrated electronic circuits, their main
deficiency is a lower integration density [21, 22]. Moreover,
there is a trade-off between the optical performance and the
structural dimensions of a photonic device. For this reason,
designing a highly efficient photonic structure with a small
footprint is a challenge, which can be addressed by applying
advanced search algorithms or machine learning.
In general, machine learning is commonly divided into branches
such as supervised learning and unsupervised learning. In these
branches, put simply, a data set is used to train and test the
learning algorithm, and the data are collected independently,
i.e., the algorithm has no effect on the data collection
process. Artificial neural networks (ANNs), in turn, are a
subclass of machine learning methods that is powerful for
modelling nonlinear relationships.
In a very recent study, ANNs were applied to both “forward” and
“inverse” design [15]. An ANN was used to map four structural
parameters of Bragg gratings to two optical characteristics. In
forward modelling, the four selected structural parameters are
the inputs to the ANN and the two optical characteristics are
the outputs. In inverse modelling, the inputs and outputs are
switched: the two optical values are the inputs to the ANN and
the four structural values are the outputs. As expected, the
ANNs easily found the nonlinear relationship between the small
numbers of input and output parameters. In such cases, ANNs are
applied to a regression problem, which is a branch of
supervised learning.
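The forward and inverse regression set-up described above can be
sketched in a few lines. This is only an illustrative sketch and
not the model of [15]: the function toy_response is a
hypothetical stand-in for a simulator, and the network size,
learning rate and number of epochs are assumed values. The
inverse model is the same regression with the inputs and outputs
switched.

```python
import numpy as np

def toy_response(x):
    """Hypothetical stand-in for a simulator (NOT the model of [15]):
    maps 4 structural parameters to 2 optical characteristics."""
    y1 = np.sin(x[:, 0] + x[:, 1]) * x[:, 2]
    y2 = np.cos(x[:, 3]) + 0.5 * x[:, 0] * x[:, 1]
    return np.stack([y1, y2], axis=1)

def init_mlp(n_in, n_hidden, n_out, rng):
    """One tanh hidden layer with small random initial weights."""
    return {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(n_hidden), (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def predict(p, x):
    h = np.tanh(x @ p["W1"] + p["b1"])
    return h @ p["W2"] + p["b2"]

def mse(p, x, y):
    return float(np.mean((predict(p, x) - y) ** 2))

def train(p, x, y, lr=0.05, epochs=500):
    """Full-batch gradient descent on the squared error (assumed settings)."""
    n = len(x)
    for _ in range(epochs):
        h = np.tanh(x @ p["W1"] + p["b1"])
        err = (h @ p["W2"] + p["b2"] - y) / n       # d(loss)/d(prediction)
        dh = (err @ p["W2"].T) * (1.0 - h ** 2)     # back-propagate through tanh
        grads = {"W1": x.T @ dh, "b1": dh.sum(axis=0),
                 "W2": h.T @ err, "b2": err.sum(axis=0)}
        for k in p:
            p[k] -= lr * grads[k]
    return p

rng = np.random.default_rng(0)
structure = rng.uniform(-1.0, 1.0, (200, 4))  # four structural parameters
optics = toy_response(structure)              # two optical characteristics

# Forward model: structural parameters -> optical characteristics.
fwd = init_mlp(4, 16, 2, rng)
fwd_before = mse(fwd, structure, optics)
train(fwd, structure, optics)
fwd_after = mse(fwd, structure, optics)

# Inverse model: the same regression with inputs and outputs switched.
inv = train(init_mlp(2, 16, 4, rng), optics, structure)
```

Note that the data set is generated once, before training, and
the network has no influence on which samples it contains.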
However, in studies of the inverse design of photonic devices,
hundreds of structural parameters are optimized to obtain only
one or a small number of optical characteristics [1–12]. In
this case, a simple ANN-based regression method for inverse
modelling would not be able to find the relationship between
the large number of structural parameters and the small number
of optical values. The reason for this possible failure is that
the small number of input parameters (the optical values) may
not be informative enough for ANNs to map them to the large
number of output parameters (the structural values). Moreover,
in inverse modelling, if the data set does not contain any
results with good optical characteristics, ANNs would not
predict structural parameters for a desired optical
performance, since ANNs do not operate as search/optimization
algorithms.
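One way to illustrate why few optical values may not be
informative enough is the following toy example. It is an
assumption made for illustration only: toy_fom is a hypothetical
figure of merit, not an optical simulation. Two different
structures reach the same best value, and a regressor trained
with a squared-error loss tends towards the average of the
targets that share an input. That average is not a good
structure.

```python
import itertools
import numpy as np

def toy_fom(x):
    """Hypothetical figure of merit: mean agreement of neighbouring pixels."""
    x = np.asarray(x, dtype=float)
    return float(np.mean(x[:-1] * x[1:]))

n_pixels = 8
# Enumerate every binary structure with pixel values -1 or +1.
designs = np.array(list(itertools.product([-1.0, 1.0], repeat=n_pixels)))
foms = np.array([toy_fom(d) for d in designs])

# Structures that reach the best figure of merit.
best = designs[np.isclose(foms, foms.max())]

# A squared-error regressor asked for the best value would tend
# towards the average of these structures.
mean_design = best.mean(axis=0)
mean_fom = toy_fom(mean_design)
```

Here the two best structures are the all +1 and the all -1
patterns, so their average is the all-zero pattern, whose figure
of merit is zero.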
To overcome this issue, it is reasonable to apply another
branch of machine learning, known as reinforcement learning, to
photonic design. In reinforcement learning, the algorithm
operates while the data set is being sampled, which differs
from supervised and unsupervised learning. Therefore, a
reinforcement learning algorithm can find the values of a large
number of structural parameters even for a small number of
desired optical characteristics. For this reason, in our recent
work, we applied the AttSel algorithm, which is a reinforcement
learning method, to design an optical coupler and an asymmetric
light transmitter [17].
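The difference from learning on a fixed data set can be
illustrated with a minimal feedback loop. The sketch below is a
plain stochastic hill climb. It is neither the AttSel algorithm
nor a complete reinforcement learning method, and toy_fom is a
hypothetical figure of merit rather than an FDTD evaluation. It
only shows that each new sample depends on the evaluations
obtained so far.

```python
import numpy as np

def toy_fom(x):
    """Hypothetical figure of merit: mean agreement of neighbouring pixels."""
    return float(np.mean(x[:-1] * x[1:]))

def feedback_search(n_pixels=40, n_steps=2000, seed=0):
    """Stochastic hill climb: the next sample depends on earlier feedback."""
    rng = np.random.default_rng(seed)
    x = rng.choice([-1.0, 1.0], size=n_pixels)   # random initial structure
    best = toy_fom(x)
    history = [best]
    for _ in range(n_steps):
        i = rng.integers(n_pixels)
        x[i] = -x[i]                 # propose: flip one pixel
        score = toy_fom(x)           # feedback from the evaluation
        if score >= best:
            best = score             # keep the change
        else:
            x[i] = -x[i]             # undo the change
        history.append(best)
    return x, history

x_final, history = feedback_search()
```

Every evaluated structure is generated from the currently kept
one, so the set of samples is shaped by the algorithm itself
instead of being collected independently in advance.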
AttSel models the interaction between the metabolic reaction
network and the gene regulatory network in a cell [18]. Cell
growth requires that the nutrients in the environment be
converted, through metabolic reactions of proteins, into the
substances necessary for growth. The proteins that carry out
this conversion are produced by the gene regulatory network,
where each gene has an expression level that controls the
protein production level. The expression levels change with the
rate of substance production. If the substance production rate
is high, the conditions are favorable, so the state is saved as
an attractor