When does deep learning fail and how to tackle it A critical analysis on polymer sequence -property surrogate models Himanshu and Tarak K Patra


When does deep learning fail and how to tackle it? A critical analysis on

polymer sequence-property surrogate models

Himanshu and Tarak K Patra*

Department of Chemical Engineering, Center for Atomistic Modeling and Materials Design

and Center for Carbon Capture Utilization and Storage, Indian Institute of Technology

Madras, Chennai TN 600036, India

Abstract:

Deep learning models are gaining popularity and potency in predicting polymer properties.

These models can be built using pre-existing data and are useful for the rapid prediction of

polymer properties. However, the performance of a deep learning model is intricately

connected to its topology and the volume of training data. There is no facile protocol available

to select a deep learning architecture, and there is a lack of a large volume of homogeneous

sequence-property data of polymers. These two factors are the primary bottleneck for the

efficient development of deep learning models. Here we assess the severity of these factors and

propose new algorithms to address them. We show that a linear layer-by-layer expansion of a

neural network can help in identifying the best neural network topology for a given problem.

Moreover, we map the discrete sequence space of a polymer to a continuous one-dimensional

latent space using a machine learning pipeline to identify minimal data points for building a

universal deep learning model. We implement these approaches for three representative cases

of building sequence-property surrogate models, viz., the single-molecule radius of gyration

of a copolymer, adhesive free energy of a copolymer, and copolymer compatibilizer,

demonstrating the generality of the proposed strategies. This work establishes efficient

methods for building universal deep learning models with minimal data and hyperparameters

for predicting sequence-defined properties of polymers.

Keywords: Deep Learning, Structure-Property Correlations, Polymer Genome, Materials

Design

*Corresponding author, e-mail: tpatra@iitm.ac.in

I. Introduction

Traditional experiments and computer simulations are limited by their inability to rapidly

measure polymer properties and are thus inadequate to screen the astronomically large chemical

and conformational space of a polymer. Recent advances in machine learning (ML) and

the increasing availability of data and software can address this problem and accelerate

polymer design.1–5 Significant progress has been made to create data-driven models that predict

polymer properties. These models are built by collecting candidate polymers and labeling them

by their properties, which are calculated using physics-based methods. A large variety of

machine-readable fingerprints and chemical descriptors are developed to represent polymers

for ML models.6–9 The fingerprint-property data is utilized for training and building an ML

model. Such an ML model serves as a cheaper, albeit low-fidelity, surrogate for the expensive

high-fidelity first-principles-based simulations and experiments. There exist a

large number of numerical frameworks, such as support vector regression, random forest, and

deep neural networks (DNNs), to build these ML models. Among these, the DNN appears to be more

versatile and transferable and provides a flexible mathematical framework to model structure-

property correlation. DNNs have been progressively used to build structure-property models

of a wide range of materials including polymers.10–19 They consist of a large number of nodes

arranged in several intermediate layers between the input and output layers. Some of the

important factors that impact the performance of a DNN are weight initialization, activation

function of its nodes, learning rate, network topology, stopping criteria, and loss optimization

algorithm. Among these, the number of nodes and their arrangement in the intermediate layers

play a key role in determining the accuracy and efficiency of the model. However, there is no

systematic guideline to build DNNs that are computationally efficient yet make good-quality

predictions. The connection between a DNN topology and the quality of its predictions is not

well-established. Moreover, there is no comprehensive understanding of the amount of training

data required for building a DNN model that can predict a wide variation in a material's

property.
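The layer-by-layer expansion mentioned in the abstract can be illustrated with a minimal sketch. It is one plausible reading, not the authors' implementation: the fixed layer width, the patience-based stopping rule, and the callback name `score_fn` are all assumptions introduced here. The helper `count_parameters` uses the standard weight-plus-bias count for a fully connected network.

```python
from typing import Callable, List, Tuple


def count_parameters(n_inputs: int, hidden: List[int], n_outputs: int = 1) -> int:
    """Weights plus biases of a fully connected network with the given layer widths."""
    sizes = [n_inputs] + list(hidden) + [n_outputs]
    return sum((fan_in + 1) * fan_out for fan_in, fan_out in zip(sizes[:-1], sizes[1:]))


def expand_layer_by_layer(
    score_fn: Callable[[List[int]], float],
    width: int,
    max_layers: int,
    patience: int = 2,
) -> Tuple[List[int], float]:
    """Add one hidden layer at a time and keep the topology with the best score.

    score_fn is a hypothetical callback: it is expected to train a network with
    the given hidden-layer widths and return a validation score (higher is better).
    The search stops after `patience` consecutive expansions without improvement.
    """
    best_topology: List[int] = []
    best_score = float("-inf")
    stalled = 0
    for depth in range(1, max_layers + 1):
        topology = [width] * depth  # assumed: every hidden layer has the same width
        score = score_fn(topology)
        if score > best_score:
            best_topology, best_score = topology, score
            stalled = 0
        else:
            stalled += 1
            if stalled >= patience:
                break
    return best_topology, best_score
```

The number of trainings grows linearly with the maximum depth, so no dedicated optimizer over topologies is needed; the trade-off is that only a single family of architectures is explored.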

We address the above problems of DNN model development for a representative case

of materials properties prediction, viz., sequence-property surrogate model of polymers. The

sequence of a polymer appreciably impacts its bulk and single-molecule properties. Glass

transition, ion transport, thermal conductivity, the single-molecule radius of gyration, and

multimolecular aggregation are all impacted by monomer-to-monomer sequence details of a

polymer.20–26 This sequence-property correlation of a polymer is poorly understood due to its

enormous sequence and composition space, and DNNs have been recently used to address this

problem and predict sequence-defined properties of polymers.8,27–29 However, no agreed-upon

strategy has emerged to decide the minimum sequence-property data required to build these

models. Also, it is not clear what would be the most efficient neural network topology for the

sequence-property metamodel of a polymer. The primary bottleneck in building a universal

model is the astronomically large number of sequences that are possible for a copolymer, and

the sequence-specificity is so profound that a subtle change in the copolymer sequence results

in a significant change in the properties of interest.30–32 Oftentimes, the optimal property is

present in a non-intuitive, seemingly arbitrary polymer sequence, the sequence-specificity of

which is unknown.20,21,23 Learning and predicting these variations in structure-property

relations of a polymer are challenging. There are no analytical methods that can estimate the

extremum of a property and the corresponding sequences. It is also challenging to establish the

sequence space as a function of a few coordinates. Therefore, building a transferable model

remains a substantially complex task.
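To give a sense of scale for the sequence space, the short sketch below counts the sequences of a binary copolymer and encodes a sequence as a vector of 0s and 1s. The monomer labels "A" and "B" and the function names are assumptions made for illustration, and a sequence and its mirror image are counted as distinct.

```python
from math import comb
from typing import List


def encode_sequence(sequence: str, a_label: str = "A") -> List[int]:
    """Map a binary copolymer sequence such as 'AABAB' to a list of 1s (A) and 0s (B)."""
    return [1 if monomer == a_label else 0 for monomer in sequence]


def n_sequences(chain_length: int) -> int:
    """Number of ordered binary sequences of the given length, with free composition."""
    return 2 ** chain_length


def n_sequences_fixed_composition(chain_length: int, n_a: int) -> int:
    """Number of ordered binary sequences containing exactly n_a monomers of type A."""
    return comb(chain_length, n_a)
```

For a chain length of 100, `n_sequences(100)` is 2^100, about 1.27 x 10^30, and fixing the composition at 50:50 still leaves `n_sequences_fixed_composition(100, 50)`, about 1.01 x 10^29, candidates.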

While ML predictive models such as DNNs are very promising, they are

interpolative and, therefore, it is not always clear how one should go about training a neural

network to exhaustively fit the entire configurational space of a given system. Currently, DNNs

are trained by generating a large quantity of training data in hopes that they have adequately

sampled the configurational space of a molecular system. This can, however, be an increasingly

prohibitive task when it comes to generating data using computationally expensive physics-

based methods. As such, it is desirable to train a model using the absolute minimal data set

possible, especially when the costs of high-fidelity calculations are high. In the recent past, we

have proposed active learning methods to sample configurational space for collecting DNN

training data in the context of neural network potential development.33–35 Several other active

learning strategies, such as QBC (query by committee)36, DP-GEN (deep potential generator)37,

and adaptive Bayesian inference38, have been proposed for data selection and for building transferable neural network

models. Moreover, there have been other attempts, such as transfer learning, to build models

with minimal training data. In transfer learning, a model trained on a different property with a

given abundant data set is reused and transferred to build another model for a target task with

considerably small data.39,40 All of these require physics-based property calculation while

selecting the training data. Therefore, selecting the minimal amount of candidate structures

without knowing their properties a priori remains an elusive and attractive goal of ML model

development.
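The idea behind query by committee can be sketched generically as follows. This is a textbook-style illustration, not the implementation used in the cited works: the function name, the variance-based disagreement measure, and the input layout are assumptions. Candidates on which an ensemble of models disagrees most are selected, and only those would then be labeled with the expensive physics-based calculation.

```python
from statistics import pvariance
from typing import List, Sequence


def select_by_committee(predictions: Sequence[Sequence[float]], n_select: int) -> List[int]:
    """Return indices of the candidates with the largest committee disagreement.

    predictions[m][i] is the prediction of committee member m for candidate i.
    Disagreement is measured here as the population variance across members.
    """
    n_candidates = len(predictions[0])
    disagreement = [
        pvariance([member[i] for member in predictions]) for i in range(n_candidates)
    ]
    ranked = sorted(range(n_candidates), key=lambda i: disagreement[i], reverse=True)
    return ranked[:n_select]
```

Note that the committee itself must first be trained on labeled data, which is why such schemes still depend on property calculations during data selection.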

The objectives of this work are to build an algorithm to identify the hyperparameters of

a DNN, estimate the limitation of a DNN, and, finally, establish a framework to build DNN

models that are transferable across the sequence space without the need to generate a large

volume of sequence-property data. To accomplish these objectives, we consider three

representative problems – the radius of gyration of a copolymer in an infinitely dilute solution,

copolymer compatibilizer, and copolymer adsorption on a surface. We propose a systematic

linear expansion of DNN architecture to identify the best surrogate models for all three cases.

This approach does not require any special optimization algorithm to explore enormously large

possibilities of a DNN topology. We use this protocol to develop DNN models that predict

sequence-defined properties of polymers with more than 95% accuracy. Secondly, we build a

DNN model using training data that represent a specific range of property and test this model's

ability to predict property values that lie outside the training range. We show that the performance

of a DNN declines when the target property lies outside the range covered by the training data. We propose

a new framework to tackle the transferability problem of ML by leveraging the power of

a convolutional DNN autoencoder that automatically extracts features of a molecular system. We

construct a one-dimensional sequence space and sample the sequences uniformly covering the

entire space. This collection of points serves as the training data for our DNN model. We show

that a model based on ~500 data points, which are selected intelligently, can predict the

properties of ~40000 sequences very accurately. We expect this model to predict the properties

of all possible sequences of a copolymer, which is ~10^30 for a binary copolymer of chain length

100. Although the current study focuses on sequence-property ML models, these methods are

extensible for other classes of properties and materials. We expect that these new approaches

to data and hyperparameter selections will accelerate the progress of ML model development.
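The uniform sampling of a one-dimensional sequence space can be sketched as below. The sketch assumes that each sequence has already been mapped to a scalar latent coordinate (for instance by an autoencoder, which is not implemented here); the function name and the nearest-to-grid selection rule are illustrative assumptions rather than the authors' procedure.

```python
from typing import List, Sequence


def sample_uniform_latent(latent: Sequence[float], n_samples: int) -> List[int]:
    """Pick indices whose 1D latent coordinates lie closest to evenly spaced targets.

    latent[i] is the (assumed precomputed) latent coordinate of sequence i.
    Each index is selected at most once.
    """
    if n_samples < 1 or n_samples > len(latent):
        raise ValueError("n_samples must be between 1 and the number of sequences")
    lo, hi = min(latent), max(latent)
    if n_samples == 1:
        targets = [(lo + hi) / 2.0]
    else:
        step = (hi - lo) / (n_samples - 1)
        targets = [lo + k * step for k in range(n_samples)]
    chosen: List[int] = []
    taken = set()
    for target in targets:
        # Nearest not-yet-selected point to the current grid target.
        best = min(
            (i for i in range(len(latent)) if i not in taken),
            key=lambda i: abs(latent[i] - target),
        )
        taken.add(best)
        chosen.append(best)
    return chosen
```

Because the selection uses only latent coordinates, no property needs to be computed before the training set is chosen; the properties are evaluated afterwards for the selected sequences only.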

II. Polymer Sequence-Property Data

In this study, we focus on three sequence-defined properties of a binary copolymer, viz., the

radius of gyration in an infinitely dilute solution, compatibilization of a polymer blend, and the

adsorption free energy on a patterned surface. The data are collected from recent molecular

simulation studies that use the Kremer-Grest bead-spring phenomenological model41,42 to

investigate sequence-property correlations. In this phenomenological model, two chemical

moieties are linearly connected to form a copolymer. The interaction parameters of the moieties

are adjusted to represent their chemical affinity in a given system. It is a standard and popular

model for studying generic polymer properties in molecular simulations without considering

specific polymer chemistry and conditions. This simple model is computationally very efficient

and can be mapped to real polymers by tuning its parameters.43 The schematic representations

of the systems and the distribution of data are shown in Figure 1. The radius of gyration of a

copolymer in an implicit solvent is taken from our recent study.22 In this study, a polymer of

chain length N=100 with an equal composition of both moieties is simulated in an implicit

solvent condition, as schematically shown in Figure 1A. A large number of sequences are

sampled using a molecular dynamics simulation-based evolutionary algorithm. The data set

consists of ~40000 sequences and their radius of gyration. The second data set (cf. Figure 1B

and E) corresponds to a copolymer compatibilizer.24 Copolymer compatibilizers are surfactant

molecules designed to improve the stability of an interface. They are deployed to enhance

material properties in settings ranging from emulsions to polymer blends. A major

compatibilization strategy employs block or random copolymers composed of distinct repeat

units with preferential affinity for each of the two phases forming the interface, as shown in

Figure 1B. In recent studies, we have shown that the surface tension of the interface is very

Figure 1: Sequence-property polymer data. Three systems are shown schematically: (A) folding of a polymer chain in

an implicit solvent, (B) a copolymer compatibilizer at the interface between two immiscible homopolymers, and (C) adsorption of a

copolymer on a substrate. The corresponding histograms of the available

data for the three cases are shown in D, E and F, respectively. Reduced units are used in all three studies, wherein

σ and ε are the units of length and energy, respectively. Also,

k_B and T are the Boltzmann constant and temperature of a system,

respectively.
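For reference, the radius of gyration that serves as the first target property can be computed from bead coordinates with the standard definition, Rg^2 = (1/N) sum_i |r_i - r_cm|^2. The sketch below assumes beads of equal mass and coordinates given in reduced length units; the function name is introduced here for illustration.

```python
from math import sqrt
from typing import Sequence


def radius_of_gyration(coords: Sequence[Sequence[float]]) -> float:
    """Radius of gyration of a chain of equal-mass beads with 3D coordinates."""
    n = len(coords)
    # Center of mass (equal masses assumed, so it is the plain average position).
    center = [sum(bead[d] for bead in coords) / n for d in range(3)]
    # Mean squared distance of the beads from the center of mass.
    rg_squared = sum(
        sum((bead[d] - center[d]) ** 2 for d in range(3)) for bead in coords
    ) / n
    return sqrt(rg_squared)
```

In a simulation this quantity would typically be averaged over many configurations of the same sequence before being used as a training label.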
