Optimization of Non-Equilibrium Self-Assembly Protocols Using Markov State Models Anthony Trubiano1aand Michael F . Hagan1b

2025-04-26 0 0 2.87MB 20 页 10玖币
侵权投诉
Optimization of Non-Equilibrium Self-Assembly Protocols Using Markov
State Models
Anthony Trubiano1, a) and Michael F. Hagan1, b)
Martin Fisher School of Physics, Brandeis University, Waltham, Massachusetts 02454,
USA
(Dated: 13 October 2022)
The promise of self-assembly to enable bottom-up formation of materials with prescribed architectures and functions
has driven intensive efforts to uncover rational design principles for maximizing the yield of a target structure. Yet,
despite many successful examples of self-assembly, ensuring kinetic accessibility of the target structure remains an
unsolved problem in many systems. In particular, long-lived kinetic traps can result in assembly times that vastly exceed
experimentally accessible timescales. One proposed solution is to design non-equilibrium assembly protocols in which
system parameters change over time to avoid such kinetic traps. Here, we develop a framework to combine Markov
state model (MSM) analysis with optimal control theory to compute a time-dependent protocol that maximizes the yield
of the target structure at a finite-time. We present an adjoint-based gradient descent method that, in conjunction with
MSMs for a system as a function of its control parameters, enables efficiently optimizing the assembly protocol. We also
describe an interpolation approach to significantly reduce the number of simulations required to construct the MSMs.
We demonstrate our approach on two examples; a simple semi-analytic model for the folding of a polymer of colloidal
particles, and a more complex model for capsid assembly. Our results show that optimizing time-dependent protocols
can achieve significant improvements in yields of selected structures, including equilibrium free energy minima, long-
lived metastable structures, and transient states.
I. INTRODUCTION
Designing building blocks that are pre-programmed to self-
assemble into a target structure has enabled creating micro-
scopic and nano-scale materials with desirable properties and
promising applications1–7. However, achieving high yields
and selective assembly of the target structures remains a crit-
ical unsolved problem in most systems. Thus, there has been
intense research aimed at discovering the governing princi-
ples of self-assembly that would indicate how to optimize the
yield of complex targets, particularly in multifarious systems,
which are capable of assembling different target structures de-
pending on the experimental conditions. Many of these theo-
retical studies investigate self-assembly within the framework
of equilibrium statistical mechanics, in which the thermody-
namic stability of a desired target state is maximized. Suc-
cessful self-assembly has been achieved in various model sys-
tems by optimizing, according to this criterion, the subunit
concentrations8, interaction strengths9,10, particle shape11,12,
and bond specificity13.
Despite the successes of this approach, it suffers from a fun-
damental limitation: the thermodynamic stability of a struc-
ture does not guarantee its kinetic accessibility in the finite
timescales available to experiments, due to the presence of
long-lived intermediates along self-assembly pathways14–19.
Finite-time assembly yields depend on a competition between
thermodynamic and kinetic effects — thermodynamic sta-
bility of a target structure and rapid nucleation are favored
by strong interactions, whereas correction of misassembled
subunits requires weak interactions. Identifying the optimal
a)trubiano@brandeis.edu
b)hagan@brandeis.edu
trade-off between these requirements can be difficult, both
computationally and experimentally20.
As a result, a different route to achieve self-assembly has
been proposed: driving the system out of equilibrium21,22 by
using a time-varying protocol for system parameters23–27. For
example, temperature protocols with heating and cooling cy-
cles have been shown to enhance control over crystal size
distribution in crystallization experiments28,29 as well as se-
lectively assemble a structure distinct from the global free-
energy minimum in lattice simulations of a multi-component
mixture30. There is a broad range of experimental systems
for which this approach can and has been applied; strand dis-
placement reactions can tune interaction strengths between
DNA-coated colloids31,32, light-activated interactions can be
tuned via spatiotemporal intensity protocols33, and system
properties such as temperature, pressure, and concentrations
can be controlled within microfluidic devices34–36. While
the technology to implement such protocols is available, the
field lacks an efficient framework to rationally design and
optimize them. ‘On-the-fly’ methods37, such as the sta-
tistical physics design engine38, and machine learning ap-
proaches, such as automatic differentiation of molecular dy-
namics trajectories39, have been successful for select systems,
but the high computational cost of sampling a large parameter
space under experimentally relevant conditions is prohibitive
for most systems. To bridge this gap, we seek to develop a
method to efficiently compute optimal time-dependent proto-
cols to maximize the yield of a chosen target state, even for
challenging systems such as those exhibiting long assembly
timescales, kinetic trapping, or competing metastable states.
Our approach relies on the construction of Markov State
Models (MSMs)40,41. MSMs are a powerful tool for coarse-
graining the dynamics of complex systems into a reduced-
order form that is tractable to analysis and allows charac-
terizing the system dynamics on timescales that are orders
arXiv:2210.05749v1 [cond-mat.soft] 11 Oct 2022
2
of magnitude longer than those accessible to straightforward
simulations40,42–45. We show that, by taking advantage of
properties satisfied by general MSMs46,47, we can analytically
and efficiently evaluate derivatives of state probabilities with
respect to tunable system parameters for use in gradient-based
optimization. Although our approach has features in com-
mon with previously developed methods to compute feedback
control policies for self-assembly with real-time system sen-
sors (e.g.48–51), it avoids the potentially expensive dynamic
programming calculations in those methods by directly con-
sidering the final time target state probability in the objec-
tive function. We first demonstrate the method on a semi-
analytic Markov model for the assembly of a short polymer of
six colloidal particles in two dimensions. For this system, we
can optimize the time-dependent interactions between differ-
ent particle types to achieve selective, high yield assembly of
multiple structures, including the stable equilibrium structure,
metastable rigid structures, and floppy structures that are only
transient for most constant parameter sets. Our results pose
an interesting choice of tradeoffs for experimental systems;
we show that tuning a time-dependent protocol and increasing
subunit complexity (i.e. the number of distinct subunit types)
can have similar results on yield, but one of these choices may
be easier to implement experimentally.
To investigate generality, we also demonstrate our algo-
rithm on a very different system — a two-parameter model
for capsid assembly on a spherical nanoparticle. For this ex-
ample, there is no analytical form for the transition matrix.
We show how simulations and interpolation can be used to
make such complex problems suitable for our optimization
algorithm. By combining our algorithm with the use of radial
basis function interpolation within parameter space, we de-
velop a framework to dramatically reduce the computational
requirements for protocol optimization. The method is also
robust and reusable in comparison to existing methods; simu-
lations do not need to be rerun to change the initial or target
states in the optimization, as would be the case for ‘on-the-fly’
methods. Using this procedure, we compute time-dependent
protocols to assemble a target state that is highly kinetically
inaccessible; there is approximately a 60% gap between the
estimated equilibrium and dynamical yields. For timescales
we are able to simulate by brute force dynamics, our com-
puted time-dependent protocols increase the target yield by
greater than twofold over the best constant protocol, in the
same amount of assembly time. The MSMs also allow for
probing of longer timescales, for which we predict optimal
time-dependent protocols that achieve yields within 1% of
equilibrium, in orders of magnitude less time than required
to approach equilibrium under a constant protocol.
Importantly, the method is generally applicable to any sys-
tem that can be approximated by an MSM and optimized by
an objective function involving state probabilities, including
multifarious52–57 and reconfigurable assembly systems22,58–66.
Moreover, it can be used as a highly efficient parameter esti-
mation tool for both constant and time-dependent protocols.
II. METHODS
A. Theoretical Setup
We begin by assuming the process of interest is described
by a Markov State Model over the discrete state space, S. For
a given lag time, τ, we define a temporal discretization with
tn=for n= 0, . . . , N. The time-dependent protocol
will be piece-wise constant, represented as {θn}N1
n=0 , where
θnis a constant parameter value from time tnto tn+1. The
transition matrices are then given by Pn=P(θn), which
propagate the state probabilities from tnto tn+1. Extension
to multiple control parameters, ~
θn, as we use in the second
example below, is straightforward.
Let the row vector ~p ndenote the probability distribution
over the system states at time tn. The evolution of this proba-
bility is governed by the forward Kolmogorov equation,
~p n+1 =~p nP(θn), ~p 0=~p0.(1)
For a set of target states, B∈ S, we are interested in optimiz-
ing PN
B=PiBpN
i, subject to the constraint that the proto-
col does not change too rapidly. To this end, we maximize the
objective function
Φ[θ] = PN
Bλ
2
N1
X
n=0 θn+1 θn
τ2
,(2)
where the second term is a smoothing penalty function whose
strength is controlled by λ > 0. We will report this smooth-
ing parameter as λ=λ/(Nτ2), which is normalized with
respect to the timescales of the system. It is straightforward
to add additional terms to Eq. (2), for example to limit the
magnitude of the control parameter or maximize the speed of
assembly.
B. Protocol Optimization
We compute the gradient of PN
Bvia an adjoint method67,68.
The adjoint equation is the backward Kolmogorov equation,
~
Fn=P(θn)~
Fn+1,~
FN=~
1B,(3)
where Fnis the adjoint variable at time tn, and ~
1Bis the
indicator vector for the set B. This equation is prescribed at
a final condition and solved backward in time. By solving
Equations (1) and (3) for all time, the gradient components
can be computed (see the SM for details) as
P N
B
θk
=~p kPk
θk
~
Fk+1.(4)
The derivative of the penalty term with respect to θkcan
be interpreted as a discretization of the second time derivative
of the protocol. The gradient descent update step can then
be seen as solving a discretized diffusion equation, where the
3
0
20
40
60
80
100
Yield (%)
0
20
40
60
80
100
Yield (%)
Parameter Set Parameter Set
F1 F2 F3 F4&5 TD1 TD2 TD3 TD4 TD5
1
2
3
4
5
(a) (b) (c)
FIG. 1. Visualization and statistics for the colloidal polymer system. (a) Depiction of the initial colloidal chain with alternating red-blue
particle types and fixed backbone (shown in white), as well as example configurations of five states for which we optimize yields. Note
that only a single representative permutation is shown for each state, but reported yields are combined over all consistent permutations. (b)
Histograms of the yield for each of the five states in (a) (color-coded) for four sets of fixed parameter values. The number in the parameter set
label indicates which state’s yield is being maximized. Yields are computed by averaging over 600 Brownian Dynamics trajectories with the
given parameter values. The final time is Tf= 5 for state 1,Tf= 7.5for state 3, and Tf= 10 for the others, reported as a non-dimensional
simulation time. (c) Same as (b), but the parameter sets are now the optimal time-dependent protocols.
gradient of PN
Bacts as a source term. For stability reasons,
we solve this equation with the IMEX69 scheme
θj+1
k=θj
k+hP N,j
B
θk
+λh
τ2θj+1
k+1 2θj+1
k+θj+1
k1,(5)
where jindexes the iteration number, PN,j
B=PN
B[θj], and h
is a step-size to be chosen. See the SM for details.
C. Model Systems
We demonstrate our optimization algorithm on two sys-
tems, which involve different levels of complexity to con-
struct the MSM. The first example is a simple model for short
polymer chains constructed from colloidal particles with pro-
grammable interactions, which enables analytical construc-
tion of an MSM. The second example is a model for viral
capsid assembly in which two types of conical subunits as-
semble on the surface of a spherical nanoparticle. In this case
we construct an MSM by estimating transition rate matrix el-
ements from an ensemble of short unbiased simulations.
1. Colloidal Chain Folding
The first model describes the folding of an initially linear
polymer made up of six colloidal particles in two dimensions.
The folding problem has many features in common with as-
sembly, including analogous thermodynamic and kinetic ef-
fects, and provides a simple illustration of the optimal control
method. The interactions between each particle in the chain
can be programmed, for example by coating the surface of
the particles with strands of DNA70. Further, these interac-
tions can be varied in time by modulating the melting temper-
atures of different strand types71 and applying a temperature
protocol. Depending on the choice of interaction strengths,
the system can fold into one of three ground states as well
as a number of floppy states, which have been characterized
both experimentally72,73 and via theory and simulation20. It
has been shown for this system, as well as many others9,74–76,
that increasing the number of subunit types allows for more
efficient assembly of a target structure. However, increasing
the number of subunit types generally makes systems more
susceptible to kinetic traps and thus more sensitive to parame-
ter variations. Further, additional species require greater costs
in materials development and synthesis. Here we aim to in-
vestigate how time-dependent interaction strengths can boost
folding yields, as well as identify tradeoffs between subunit
complexity (number of different subunit types) and protocol
complexity.
To study the minimal subunit complexity case that allows
for multiple free energy minima, we allow for only two types
of particles, which we will distinguish as red and blue. We
assume the initial chain has an alternating ordering of these
particles, i.e. red, blue, red, blue, and so on. Figure 1(a)
shows this initial chain as well as five potential folded struc-
tures. Structures 1through 3are the rigid ground states for
this system, while structures 4and 5are floppy structures that
are typically folding intermediates, but can be stabilized by
setting some of the interactions strengths to zero. Figure 1(b)
shows histograms of the yields of each of these structures un-
der a number of parameter sets, which consist of fixed values
for each of the three interaction strengths; ERR,ERB, and EBB,
which are the well-depths for a short-ranged Morse potential.
We denote parameter set ias the one that maximizes the finite-
time yield of structure i. We find that relatively large yields
are already possible for most structures using fixed protocols,
except for structure 3, the triangle. While it is possible to fur-
ther boost yields of some of these structures by changing the
particle type distribution, we investigate what can be achieved
using this fixed ordering along with time-dependent protocols.
4
The model we use to construct an MSM for this system was
previously developed to study equilibrium folding of colloidal
chains20. To summarize the model, a state space is defined by
enumerating all possible adjacency matrices describing bonds
between the six particles. For each adjacency matrix, the rate
of forming a new bond and the probability distribution for
which bond forms first are estimated using Brownian Dynam-
ics simulations. This specifies all the forward rates of a rate
matrix, which are independent of interaction strength. For a
given set of interaction strengths, the equilibrium probabilities
of each adjacency matrix are measured using a Monte Carlo
sampler. The backward rates are then set from the forward
rates by imposing detailed balance with respect to the esti-
mated equilibrium measure. A re-weighting procedure can
then be used to evaluate the backward rates for other values of
the interaction strengths. The resulting rate matrix can be con-
verted into a probability transition matrix by exponentiation,
at which point the optimization can be performed as described
above.
These simulations are performed in non-dimensionalized
form. Positions, energies, and time are respectively scaled
by particle diameter σ,kBT, and a reference time tc. The re-
sulting equations have a non-dimensional diffusion coefficient
=Dtc2, where Dis the dimensional diffusion coeffi-
cient. We set = 1, and use experimentally measured values
σ= 1.3µm and D= 0.065µm2/s for DNA coated colloids77
to obtain an order-of-magnitude estimate of tc17s.
2. Conical Subunit Assembly On a Nanoparticle
We study a system adapted from Ref.78, consisting of two
types of conical subunits that assemble into capsids on the sur-
face of a spherical nanoparticle. The subunits are rigid bodies
comprised of six beads of increasing radius, consistent with a
cone angle, α. Each of the four interior beads interacts attrac-
tively with the corresponding bead on other subunits, through
a Morse potential. The innermost bead has an attractive Morse
interaction with the nanoparticle.
The two subunit types have relative diameters such that they
correspond to pentamers (P) and hexamers (H) in the lowest
energy icosahedral capsid structure79. There are two types of
interactions between the subunits; an H-H attraction and an H-
P attraction. The strength of these interactions can be tuned by
varying the well-depth of the corresponding Morse potential,
EHH and EHP, respectively. Interaction strengths are chosen
in the range 1.2kBTto 1.8kBT, which can facilitate assembly
on the nanoparticle surface, but typically is too weak to drive
nucleation in the bulk. The strength of the attraction to the
nanoparticle is held constant at ENH = 7kBTfor hexamers
and ENP = 6.3kBTfor pentamers. The system contains one
nanoparticle, 86 pentamer subunits and 214 hexamer subunits,
so that subunits are in excess and the chemical potential of
free subunits remains nearly constant throughout the assem-
bly process. We consider a cubic box with side length 120l0,
where l0= 1nm is set as the unit length scale for the system.
Assembly simulations are performed using the Brownian dy-
namics algorithm in HOOMD80 with periodic boundary con-
ditions. Times are measured in terms of the HOOMD derived
unit, t0=pm/kBT l0, where mis the subunit mass. We also
construct an MSM for the system by estimating the transition
matrix elements from these Brownian dynamics simulations
(section II D). Further details on the simulations are provided
in the SM.
The structures with the lowest potential energy (per parti-
cle) for this system are capsids with either T= 4 icosahe-
dral symmetry or D5symmetry, which both contain 30 hex-
amer and 12 pentamer subunits with the same number of H-H
and H-P bonds81. However, in finite-time dynamical simula-
tions assembly usually results in asymmetric structures that,
although containing about 42 subunits, have one or more de-
fects. Figure 2 depicts the five most probable end structures
after Tf= 8 ×105t0, for the parameters that maximized the
yield of symmetric capsids (T= 4 and D5). The observation
time Tfwas chosen as the timescale after which the yield only
grows logarithmically for most parameter sets that we focus
on. We also compare the finite-time yields to the equilibrium
yields, which we estimated using the MSM. This comparison
shows that the system is far from reaching equilibrium at Tf,
and from the MSM we estimate that approaching equilibrium
would require a timescale of approximately 60Tf.
These observations are consistent with the competition
among thermodynamic and kinetic effects that arise for con-
stant assembly driving forces. For very low subunit-subunit
interactions or subunit concentrations, assembly is either ther-
modynamically unfavorable or does not nucleate on relevant
timescales, while overly high interactions or concentrations
cause the system to become trapped in metastable structures.
Since subunit-subunit interactions must be broken to exit a
metastable state, reconfiguration timescales increase exponen-
tially with interaction strengths. In our system, while the
T= 4 and D5structures are energetically favored, the other
metastable states that we observe are more kinetically acces-
sible under interaction strengths that enable nucleation on rel-
evant timescales and thus occur in most assembly trajectories.
These competing effects suggest that assembly yields and
fidelity could be enhanced by time-dependent sequences of
interactions, which can drive rapid nucleation while also
avoiding kinetic traps by facilitating reconfiguration from
metastable states into the ground state(s). To investigate
this possibility, we apply our optimization algorithm to the
cones system. Our objective is to compute an optimal time-
dependent protocol for EHH and EHP interactions that maxi-
mizes the yield of the symmetric T= 4 and D5capsids at a
specified observation time.
D. Constructing Markov State Models
To apply the protocol optimization algorithm in Eq. (5),
transition matrices need to be evaluated for the possible pro-
tocol values that are encountered during maximization of the
objective function. While this analysis is computationally ef-
ficient for the colloidal polymer system because we have a
semi-analytic Markov model, we are not so fortunate for the
cones model. Similarly, for most relevant systems it will not
摘要:

OptimizationofNon-EquilibriumSelf-AssemblyProtocolsUsingMarkovStateModelsAnthonyTrubiano1,a)andMichaelF.Hagan1,b)MartinFisherSchoolofPhysics,BrandeisUniversity,Waltham,Massachusetts02454,USA(Dated:13October2022)Thepromiseofself-assemblytoenablebottom-upformationofmaterialswithprescribedarchitectures...

展开>> 收起<<
Optimization of Non-Equilibrium Self-Assembly Protocols Using Markov State Models Anthony Trubiano1aand Michael F . Hagan1b.pdf

共20页,预览4页

还剩页未读, 继续阅读

声明:本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知玖贝云文库,我们立即给予删除!
分类:图书资源 价格:10玖币 属性:20 页 大小:2.87MB 格式:PDF 时间:2025-04-26

开通VIP享超值会员特权

  • 多端同步记录
  • 高速下载文档
  • 免费文档工具
  • 分享文档赚钱
  • 每日登录抽奖
  • 优质衍生服务
/ 20
客服
关注