Optimization of Non-Equilibrium Self-Assembly Protocols Using Markov State Models Anthony Trubiano1aand Michael F . Hagan1b

2025-04-26 0 0 2.87MB 20 页 10玖币

侵权投诉

Optimization of Non-Equilibrium Self-Assembly Protocols Using Markov

State Models

Anthony Trubiano1, a) and Michael F. Hagan1, b)

Martin Fisher School of Physics, Brandeis University, Waltham, Massachusetts 02454,

USA

(Dated: 13 October 2022)

The promise of self-assembly to enable bottom-up formation of materials with prescribed architectures and functions

has driven intensive efforts to uncover rational design principles for maximizing the yield of a target structure. Yet,

despite many successful examples of self-assembly, ensuring kinetic accessibility of the target structure remains an

unsolved problem in many systems. In particular, long-lived kinetic traps can result in assembly times that vastly exceed

experimentally accessible timescales. One proposed solution is to design non-equilibrium assembly protocols in which

system parameters change over time to avoid such kinetic traps. Here, we develop a framework to combine Markov

state model (MSM) analysis with optimal control theory to compute a time-dependent protocol that maximizes the yield

of the target structure at a ﬁnite-time. We present an adjoint-based gradient descent method that, in conjunction with

MSMs for a system as a function of its control parameters, enables efﬁciently optimizing the assembly protocol. We also

describe an interpolation approach to signiﬁcantly reduce the number of simulations required to construct the MSMs.

We demonstrate our approach on two examples; a simple semi-analytic model for the folding of a polymer of colloidal

particles, and a more complex model for capsid assembly. Our results show that optimizing time-dependent protocols

can achieve signiﬁcant improvements in yields of selected structures, including equilibrium free energy minima, long-

lived metastable structures, and transient states.

I. INTRODUCTION

Designing building blocks that are pre-programmed to self-

assemble into a target structure has enabled creating micro-

scopic and nano-scale materials with desirable properties and

promising applications1–7. However, achieving high yields

and selective assembly of the target structures remains a crit-

ical unsolved problem in most systems. Thus, there has been

intense research aimed at discovering the governing princi-

ples of self-assembly that would indicate how to optimize the

yield of complex targets, particularly in multifarious systems,

which are capable of assembling different target structures de-

pending on the experimental conditions. Many of these theo-

retical studies investigate self-assembly within the framework

of equilibrium statistical mechanics, in which the thermody-

namic stability of a desired target state is maximized. Suc-

cessful self-assembly has been achieved in various model sys-

tems by optimizing, according to this criterion, the subunit

concentrations8, interaction strengths9,10, particle shape11,12,

and bond speciﬁcity13.

Despite the successes of this approach, it suffers from a fun-

damental limitation: the thermodynamic stability of a struc-

ture does not guarantee its kinetic accessibility in the ﬁnite

timescales available to experiments, due to the presence of

long-lived intermediates along self-assembly pathways14–19.

Finite-time assembly yields depend on a competition between

thermodynamic and kinetic effects — thermodynamic sta-

bility of a target structure and rapid nucleation are favored

by strong interactions, whereas correction of misassembled

subunits requires weak interactions. Identifying the optimal

a)trubiano@brandeis.edu

b)hagan@brandeis.edu

trade-off between these requirements can be difﬁcult, both

computationally and experimentally20.

As a result, a different route to achieve self-assembly has

been proposed: driving the system out of equilibrium21,22 by

using a time-varying protocol for system parameters23–27. For

example, temperature protocols with heating and cooling cy-

cles have been shown to enhance control over crystal size

distribution in crystallization experiments28,29 as well as se-

lectively assemble a structure distinct from the global free-

energy minimum in lattice simulations of a multi-component

mixture30. There is a broad range of experimental systems

for which this approach can and has been applied; strand dis-

placement reactions can tune interaction strengths between

DNA-coated colloids31,32, light-activated interactions can be

tuned via spatiotemporal intensity protocols33, and system

properties such as temperature, pressure, and concentrations

can be controlled within microﬂuidic devices34–36. While

the technology to implement such protocols is available, the

ﬁeld lacks an efﬁcient framework to rationally design and

optimize them. ‘On-the-ﬂy’ methods37, such as the sta-

tistical physics design engine38, and machine learning ap-

proaches, such as automatic differentiation of molecular dy-

namics trajectories39, have been successful for select systems,

but the high computational cost of sampling a large parameter

space under experimentally relevant conditions is prohibitive

for most systems. To bridge this gap, we seek to develop a

method to efﬁciently compute optimal time-dependent proto-

cols to maximize the yield of a chosen target state, even for

challenging systems such as those exhibiting long assembly

timescales, kinetic trapping, or competing metastable states.

Our approach relies on the construction of Markov State

Models (MSMs)40,41. MSMs are a powerful tool for coarse-

graining the dynamics of complex systems into a reduced-

order form that is tractable to analysis and allows charac-

terizing the system dynamics on timescales that are orders

arXiv:2210.05749v1 [cond-mat.soft] 11 Oct 2022

of magnitude longer than those accessible to straightforward

simulations40,42–45. We show that, by taking advantage of

properties satisﬁed by general MSMs46,47, we can analytically

and efﬁciently evaluate derivatives of state probabilities with

respect to tunable system parameters for use in gradient-based

optimization. Although our approach has features in com-

mon with previously developed methods to compute feedback

control policies for self-assembly with real-time system sen-

sors (e.g.48–51), it avoids the potentially expensive dynamic

programming calculations in those methods by directly con-

sidering the ﬁnal time target state probability in the objec-

tive function. We ﬁrst demonstrate the method on a semi-

analytic Markov model for the assembly of a short polymer of

six colloidal particles in two dimensions. For this system, we

can optimize the time-dependent interactions between differ-

ent particle types to achieve selective, high yield assembly of

multiple structures, including the stable equilibrium structure,

metastable rigid structures, and ﬂoppy structures that are only

transient for most constant parameter sets. Our results pose

an interesting choice of tradeoffs for experimental systems;

we show that tuning a time-dependent protocol and increasing

subunit complexity (i.e. the number of distinct subunit types)

can have similar results on yield, but one of these choices may

be easier to implement experimentally.

To investigate generality, we also demonstrate our algo-

rithm on a very different system — a two-parameter model

for capsid assembly on a spherical nanoparticle. For this ex-

ample, there is no analytical form for the transition matrix.

We show how simulations and interpolation can be used to

make such complex problems suitable for our optimization

algorithm. By combining our algorithm with the use of radial

basis function interpolation within parameter space, we de-

velop a framework to dramatically reduce the computational

requirements for protocol optimization. The method is also

robust and reusable in comparison to existing methods; simu-

lations do not need to be rerun to change the initial or target

states in the optimization, as would be the case for ‘on-the-ﬂy’

methods. Using this procedure, we compute time-dependent

protocols to assemble a target state that is highly kinetically

inaccessible; there is approximately a 60% gap between the

estimated equilibrium and dynamical yields. For timescales

we are able to simulate by brute force dynamics, our com-

puted time-dependent protocols increase the target yield by

greater than twofold over the best constant protocol, in the

same amount of assembly time. The MSMs also allow for

probing of longer timescales, for which we predict optimal

time-dependent protocols that achieve yields within 1% of

equilibrium, in orders of magnitude less time than required

to approach equilibrium under a constant protocol.

Importantly, the method is generally applicable to any sys-

tem that can be approximated by an MSM and optimized by

an objective function involving state probabilities, including

multifarious52–57 and reconﬁgurable assembly systems22,58–66.

Moreover, it can be used as a highly efﬁcient parameter esti-

mation tool for both constant and time-dependent protocols.

II. METHODS

A. Theoretical Setup

We begin by assuming the process of interest is described

by a Markov State Model over the discrete state space, S. For

a given lag time, τ, we deﬁne a temporal discretization with

tn=nτ for n= 0, . . . , N. The time-dependent protocol

will be piece-wise constant, represented as {θn}N−1

n=0 , where

θnis a constant parameter value from time tnto tn+1. The

transition matrices are then given by Pn=P(θn), which

propagate the state probabilities from tnto tn+1. Extension

to multiple control parameters, ~

θn, as we use in the second

example below, is straightforward.

Let the row vector ~p ndenote the probability distribution

over the system states at time tn. The evolution of this proba-

bility is governed by the forward Kolmogorov equation,

~p n+1 =~p nP(θn), ~p 0=~p0.(1)

For a set of target states, B∈ S, we are interested in optimiz-

ing PN

B=Pi∈BpN

i, subject to the constraint that the proto-

col does not change too rapidly. To this end, we maximize the

objective function

Φ[θ] = PN

B−λ

N−1

n=0 θn+1 −θn

τ2

,(2)

where the second term is a smoothing penalty function whose

strength is controlled by λ > 0. We will report this smooth-

ing parameter as λ∗=λ/(Nτ2), which is normalized with

respect to the timescales of the system. It is straightforward

to add additional terms to Eq. (2), for example to limit the

magnitude of the control parameter or maximize the speed of

assembly.

B. Protocol Optimization

We compute the gradient of PN

Bvia an adjoint method67,68.

The adjoint equation is the backward Kolmogorov equation,

Fn=P(θn)~

Fn+1,~

FN=~

1B,(3)

where Fnis the adjoint variable at time tn, and ~

1Bis the

indicator vector for the set B. This equation is prescribed at

a ﬁnal condition and solved backward in time. By solving

Equations (1) and (3) for all time, the gradient components

can be computed (see the SM for details) as

∂P N

∂θk

=~p k∂Pk

∂θk

Fk+1.(4)

The derivative of the penalty term with respect to θkcan

be interpreted as a discretization of the second time derivative

of the protocol. The gradient descent update step can then

be seen as solving a discretized diffusion equation, where the

100

Yield (%)

100

Yield (%)

Parameter Set Parameter Set

F1 F2 F3 F4&5 TD1 TD2 TD3 TD4 TD5

(a) (b) (c)

FIG. 1. Visualization and statistics for the colloidal polymer system. (a) Depiction of the initial colloidal chain with alternating red-blue

particle types and ﬁxed backbone (shown in white), as well as example conﬁgurations of ﬁve states for which we optimize yields. Note

that only a single representative permutation is shown for each state, but reported yields are combined over all consistent permutations. (b)

Histograms of the yield for each of the ﬁve states in (a) (color-coded) for four sets of ﬁxed parameter values. The number in the parameter set

label indicates which state’s yield is being maximized. Yields are computed by averaging over 600 Brownian Dynamics trajectories with the

given parameter values. The ﬁnal time is Tf= 5 for state 1,Tf= 7.5for state 3, and Tf= 10 for the others, reported as a non-dimensional

simulation time. (c) Same as (b), but the parameter sets are now the optimal time-dependent protocols.

gradient of PN

Bacts as a source term. For stability reasons,

we solve this equation with the IMEX69 scheme

θj+1

k=θj

k+h∂P N,j

∂θk

+λh

τ2θj+1

k+1 −2θj+1

k+θj+1

k−1,(5)

where jindexes the iteration number, PN,j

B=PN

B[θj], and h

is a step-size to be chosen. See the SM for details.

C. Model Systems

We demonstrate our optimization algorithm on two sys-

tems, which involve different levels of complexity to con-

struct the MSM. The ﬁrst example is a simple model for short

polymer chains constructed from colloidal particles with pro-

grammable interactions, which enables analytical construc-

tion of an MSM. The second example is a model for viral

capsid assembly in which two types of conical subunits as-

semble on the surface of a spherical nanoparticle. In this case

we construct an MSM by estimating transition rate matrix el-

ements from an ensemble of short unbiased simulations.

1. Colloidal Chain Folding

The ﬁrst model describes the folding of an initially linear

polymer made up of six colloidal particles in two dimensions.

The folding problem has many features in common with as-

sembly, including analogous thermodynamic and kinetic ef-

fects, and provides a simple illustration of the optimal control

method. The interactions between each particle in the chain

can be programmed, for example by coating the surface of

the particles with strands of DNA70. Further, these interac-

tions can be varied in time by modulating the melting temper-

atures of different strand types71 and applying a temperature

protocol. Depending on the choice of interaction strengths,

the system can fold into one of three ground states as well

as a number of ﬂoppy states, which have been characterized

both experimentally72,73 and via theory and simulation20. It

has been shown for this system, as well as many others9,74–76,

that increasing the number of subunit types allows for more

efﬁcient assembly of a target structure. However, increasing

the number of subunit types generally makes systems more

susceptible to kinetic traps and thus more sensitive to parame-

ter variations. Further, additional species require greater costs

in materials development and synthesis. Here we aim to in-

vestigate how time-dependent interaction strengths can boost

folding yields, as well as identify tradeoffs between subunit

complexity (number of different subunit types) and protocol

complexity.

To study the minimal subunit complexity case that allows

for multiple free energy minima, we allow for only two types

of particles, which we will distinguish as red and blue. We

assume the initial chain has an alternating ordering of these

particles, i.e. red, blue, red, blue, and so on. Figure 1(a)

shows this initial chain as well as ﬁve potential folded struc-

tures. Structures 1through 3are the rigid ground states for

this system, while structures 4and 5are ﬂoppy structures that

are typically folding intermediates, but can be stabilized by

setting some of the interactions strengths to zero. Figure 1(b)

shows histograms of the yields of each of these structures un-

der a number of parameter sets, which consist of ﬁxed values

for each of the three interaction strengths; ERR,ERB, and EBB,

which are the well-depths for a short-ranged Morse potential.

We denote parameter set ias the one that maximizes the ﬁnite-

time yield of structure i. We ﬁnd that relatively large yields

are already possible for most structures using ﬁxed protocols,

except for structure 3, the triangle. While it is possible to fur-

ther boost yields of some of these structures by changing the

particle type distribution, we investigate what can be achieved

using this ﬁxed ordering along with time-dependent protocols.

The model we use to construct an MSM for this system was

previously developed to study equilibrium folding of colloidal

chains20. To summarize the model, a state space is deﬁned by

enumerating all possible adjacency matrices describing bonds

between the six particles. For each adjacency matrix, the rate

of forming a new bond and the probability distribution for

which bond forms ﬁrst are estimated using Brownian Dynam-

ics simulations. This speciﬁes all the forward rates of a rate

matrix, which are independent of interaction strength. For a

given set of interaction strengths, the equilibrium probabilities

of each adjacency matrix are measured using a Monte Carlo

sampler. The backward rates are then set from the forward

rates by imposing detailed balance with respect to the esti-

mated equilibrium measure. A re-weighting procedure can

then be used to evaluate the backward rates for other values of

the interaction strengths. The resulting rate matrix can be con-

verted into a probability transition matrix by exponentiation,

at which point the optimization can be performed as described

above.

These simulations are performed in non-dimensionalized

form. Positions, energies, and time are respectively scaled

by particle diameter σ,kBT, and a reference time tc. The re-

sulting equations have a non-dimensional diffusion coefﬁcient

=Dtc/σ2, where Dis the dimensional diffusion coefﬁ-

cient. We set = 1, and use experimentally measured values

σ= 1.3µm and D= 0.065µm2/s for DNA coated colloids77

to obtain an order-of-magnitude estimate of tc≈17s.

2. Conical Subunit Assembly On a Nanoparticle

We study a system adapted from Ref.78, consisting of two

types of conical subunits that assemble into capsids on the sur-

face of a spherical nanoparticle. The subunits are rigid bodies

comprised of six beads of increasing radius, consistent with a

cone angle, α. Each of the four interior beads interacts attrac-

tively with the corresponding bead on other subunits, through

a Morse potential. The innermost bead has an attractive Morse

interaction with the nanoparticle.

The two subunit types have relative diameters such that they

correspond to pentamers (P) and hexamers (H) in the lowest

energy icosahedral capsid structure79. There are two types of

interactions between the subunits; an H-H attraction and an H-

P attraction. The strength of these interactions can be tuned by

varying the well-depth of the corresponding Morse potential,

EHH and EHP, respectively. Interaction strengths are chosen

in the range 1.2kBTto 1.8kBT, which can facilitate assembly

on the nanoparticle surface, but typically is too weak to drive

nucleation in the bulk. The strength of the attraction to the

nanoparticle is held constant at ENH = 7kBTfor hexamers

and ENP = 6.3kBTfor pentamers. The system contains one

nanoparticle, 86 pentamer subunits and 214 hexamer subunits,

so that subunits are in excess and the chemical potential of

free subunits remains nearly constant throughout the assem-

bly process. We consider a cubic box with side length 120l0,

where l0= 1nm is set as the unit length scale for the system.

Assembly simulations are performed using the Brownian dy-

namics algorithm in HOOMD80 with periodic boundary con-

ditions. Times are measured in terms of the HOOMD derived

unit, t0=pm/kBT l0, where mis the subunit mass. We also

construct an MSM for the system by estimating the transition

matrix elements from these Brownian dynamics simulations

(section II D). Further details on the simulations are provided

in the SM.

The structures with the lowest potential energy (per parti-

cle) for this system are capsids with either T= 4 icosahe-

dral symmetry or D5symmetry, which both contain 30 hex-

amer and 12 pentamer subunits with the same number of H-H

and H-P bonds81. However, in ﬁnite-time dynamical simula-

tions assembly usually results in asymmetric structures that,

although containing about 42 subunits, have one or more de-

fects. Figure 2 depicts the ﬁve most probable end structures

after Tf= 8 ×105t0, for the parameters that maximized the

yield of symmetric capsids (T= 4 and D5). The observation

time Tfwas chosen as the timescale after which the yield only

grows logarithmically for most parameter sets that we focus

on. We also compare the ﬁnite-time yields to the equilibrium

yields, which we estimated using the MSM. This comparison

shows that the system is far from reaching equilibrium at Tf,

and from the MSM we estimate that approaching equilibrium

would require a timescale of approximately 60Tf.

These observations are consistent with the competition

among thermodynamic and kinetic effects that arise for con-

stant assembly driving forces. For very low subunit-subunit

interactions or subunit concentrations, assembly is either ther-

modynamically unfavorable or does not nucleate on relevant

timescales, while overly high interactions or concentrations

cause the system to become trapped in metastable structures.

Since subunit-subunit interactions must be broken to exit a

metastable state, reconﬁguration timescales increase exponen-

tially with interaction strengths. In our system, while the

T= 4 and D5structures are energetically favored, the other

metastable states that we observe are more kinetically acces-

sible under interaction strengths that enable nucleation on rel-

evant timescales and thus occur in most assembly trajectories.

These competing effects suggest that assembly yields and

ﬁdelity could be enhanced by time-dependent sequences of

interactions, which can drive rapid nucleation while also

avoiding kinetic traps by facilitating reconﬁguration from

metastable states into the ground state(s). To investigate

this possibility, we apply our optimization algorithm to the

cones system. Our objective is to compute an optimal time-

dependent protocol for EHH and EHP interactions that maxi-

mizes the yield of the symmetric T= 4 and D5capsids at a

speciﬁed observation time.

D. Constructing Markov State Models

To apply the protocol optimization algorithm in Eq. (5),

transition matrices need to be evaluated for the possible pro-

tocol values that are encountered during maximization of the

objective function. While this analysis is computationally ef-

ﬁcient for the colloidal polymer system because we have a

semi-analytic Markov model, we are not so fortunate for the

cones model. Similarly, for most relevant systems it will not

文档加载中……请稍候！
如果长时间未打开，您也可以点击刷新试试。

下载文档到电脑，查找使用更方便

10 玖币 0人已下载

立即下载

摘要：

OptimizationofNon-EquilibriumSelf-AssemblyProtocolsUsingMarkovStateModelsAnthonyTrubiano1,a)andMichaelF.Hagan1,b)MartinFisherSchoolofPhysics,BrandeisUniversity,Waltham,Massachusetts02454,USA(Dated:13October2022)Thepromiseofself-assemblytoenablebottom-upformationofmaterialswithprescribedarchitectures...

展开>> 收起<<

Optimization of Non-Equilibrium Self-Assembly Protocols Using Markov State Models Anthony Trubiano1aand Michael F . Hagan1b.pdf

共20页,预览4页

还剩页未读，继续阅读

声明：本站为文档C2C交易模式，即用户上传的文档直接被用户下载，本站只是中间服务平台，本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私，请立即通知玖贝云文库，我们立即给予删除！

Optimization of Non-Equilibrium Self-Assembly Protocols Using Markov State Models Anthony Trubiano1aand Michael F . Hagan1b

相关推荐

开通VIP享超值会员特权

作者详情

相关内容

热门标签

举报选择: