
4
The model we use to construct an MSM for this system was
previously developed to study equilibrium folding of colloidal
chains20. To summarize the model, a state space is defined by
enumerating all possible adjacency matrices describing bonds
between the six particles. For each adjacency matrix, the rate
of forming a new bond and the probability distribution for
which bond forms first are estimated using Brownian Dynam-
ics simulations. This specifies all the forward rates of a rate
matrix, which are independent of interaction strength. For a
given set of interaction strengths, the equilibrium probabilities
of each adjacency matrix are measured using a Monte Carlo
sampler. The backward rates are then set from the forward
rates by imposing detailed balance with respect to the esti-
mated equilibrium measure. A re-weighting procedure can
then be used to evaluate the backward rates for other values of
the interaction strengths. The resulting rate matrix can be con-
verted into a probability transition matrix by exponentiation,
at which point the optimization can be performed as described
above.
These simulations are performed in non-dimensionalized
form. Positions, energies, and time are respectively scaled
by particle diameter σ,kBT, and a reference time tc. The re-
sulting equations have a non-dimensional diffusion coefficient
=Dtc/σ2, where Dis the dimensional diffusion coeffi-
cient. We set = 1, and use experimentally measured values
σ= 1.3µm and D= 0.065µm2/s for DNA coated colloids77
to obtain an order-of-magnitude estimate of tc≈17s.
2. Conical Subunit Assembly On a Nanoparticle
We study a system adapted from Ref.78, consisting of two
types of conical subunits that assemble into capsids on the sur-
face of a spherical nanoparticle. The subunits are rigid bodies
comprised of six beads of increasing radius, consistent with a
cone angle, α. Each of the four interior beads interacts attrac-
tively with the corresponding bead on other subunits, through
a Morse potential. The innermost bead has an attractive Morse
interaction with the nanoparticle.
The two subunit types have relative diameters such that they
correspond to pentamers (P) and hexamers (H) in the lowest
energy icosahedral capsid structure79. There are two types of
interactions between the subunits; an H-H attraction and an H-
P attraction. The strength of these interactions can be tuned by
varying the well-depth of the corresponding Morse potential,
EHH and EHP, respectively. Interaction strengths are chosen
in the range 1.2kBTto 1.8kBT, which can facilitate assembly
on the nanoparticle surface, but typically is too weak to drive
nucleation in the bulk. The strength of the attraction to the
nanoparticle is held constant at ENH = 7kBTfor hexamers
and ENP = 6.3kBTfor pentamers. The system contains one
nanoparticle, 86 pentamer subunits and 214 hexamer subunits,
so that subunits are in excess and the chemical potential of
free subunits remains nearly constant throughout the assem-
bly process. We consider a cubic box with side length 120l0,
where l0= 1nm is set as the unit length scale for the system.
Assembly simulations are performed using the Brownian dy-
namics algorithm in HOOMD80 with periodic boundary con-
ditions. Times are measured in terms of the HOOMD derived
unit, t0=pm/kBT l0, where mis the subunit mass. We also
construct an MSM for the system by estimating the transition
matrix elements from these Brownian dynamics simulations
(section II D). Further details on the simulations are provided
in the SM.
The structures with the lowest potential energy (per parti-
cle) for this system are capsids with either T= 4 icosahe-
dral symmetry or D5symmetry, which both contain 30 hex-
amer and 12 pentamer subunits with the same number of H-H
and H-P bonds81. However, in finite-time dynamical simula-
tions assembly usually results in asymmetric structures that,
although containing about 42 subunits, have one or more de-
fects. Figure 2 depicts the five most probable end structures
after Tf= 8 ×105t0, for the parameters that maximized the
yield of symmetric capsids (T= 4 and D5). The observation
time Tfwas chosen as the timescale after which the yield only
grows logarithmically for most parameter sets that we focus
on. We also compare the finite-time yields to the equilibrium
yields, which we estimated using the MSM. This comparison
shows that the system is far from reaching equilibrium at Tf,
and from the MSM we estimate that approaching equilibrium
would require a timescale of approximately 60Tf.
These observations are consistent with the competition
among thermodynamic and kinetic effects that arise for con-
stant assembly driving forces. For very low subunit-subunit
interactions or subunit concentrations, assembly is either ther-
modynamically unfavorable or does not nucleate on relevant
timescales, while overly high interactions or concentrations
cause the system to become trapped in metastable structures.
Since subunit-subunit interactions must be broken to exit a
metastable state, reconfiguration timescales increase exponen-
tially with interaction strengths. In our system, while the
T= 4 and D5structures are energetically favored, the other
metastable states that we observe are more kinetically acces-
sible under interaction strengths that enable nucleation on rel-
evant timescales and thus occur in most assembly trajectories.
These competing effects suggest that assembly yields and
fidelity could be enhanced by time-dependent sequences of
interactions, which can drive rapid nucleation while also
avoiding kinetic traps by facilitating reconfiguration from
metastable states into the ground state(s). To investigate
this possibility, we apply our optimization algorithm to the
cones system. Our objective is to compute an optimal time-
dependent protocol for EHH and EHP interactions that maxi-
mizes the yield of the symmetric T= 4 and D5capsids at a
specified observation time.
D. Constructing Markov State Models
To apply the protocol optimization algorithm in Eq. (5),
transition matrices need to be evaluated for the possible pro-
tocol values that are encountered during maximization of the
objective function. While this analysis is computationally ef-
ficient for the colloidal polymer system because we have a
semi-analytic Markov model, we are not so fortunate for the
cones model. Similarly, for most relevant systems it will not