applied, these mesh-based graph models can operate on very large graphs; however, graph reduction
is important for efficiency and can promote learning [14]. While physical problems with short-range
(e.g. interface) and long-range (e.g. elastic) interactions are ubiquitous in engineering and materials
science, treating the longer scales with convolutions on the discretization graph can be inefficient
or ineffective.
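As a minimal illustration of this scale mismatch (an illustrative sketch, not drawn from the works cited above): each graph-convolution layer propagates information one hop, so resolving an interaction that spans a mesh of graph diameter d requires on the order of d stacked layers. A NumPy sketch on a 1-D path-graph "mesh" makes the slow growth of the receptive field explicit:

```python
import numpy as np

# A 1-D "mesh" of n nodes as a path graph: node i connects to i-1 and i+1.
n = 100
A = np.zeros((n, n), dtype=int)
idx = np.arange(n - 1)
A[idx, idx + 1] = 1
A[idx + 1, idx] = 1

# One graph-convolution layer propagates information one hop, so k stacked
# layers have a k-hop receptive field: node 0 "sees" node j only where
# ((A + I)^k)[0, j] > 0.
k = 5
reach = np.linalg.matrix_power(A + np.eye(n, dtype=int), k)
receptive = int((reach[0] > 0).sum())
print(receptive)
```

Here five layers reach only six of the hundred nodes; coupling the two ends of the mesh by convolution alone would require roughly n layers, which motivates pooling to coarser graphs for long-range physics.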
Treating aggregated data via graph pooling based on data clustering is a long-standing field of
research with broad applications, e.g. image classification [15], mesh decomposition [16], and chemical
structure [17]. Since clustered data rarely has a regular topology, graph-based networks are
natural representations. In particular, compared to a CNN operating on a pixelized image, a GCNN
operating on an unstructured mesh has less well-defined spatial locality and inherits the varying
neighborhood sizes of the source data. Akin to topologically localized convolutions supplanting
spectral-based convolutions [7, 18] and spectral pre-processing [19] in the literature, spectral clustering
based on the leading eigenvectors of the graph Laplacian has been largely superseded by
less expensive, more easily differentiable techniques, some of which connect back to spectral clustering.
Dhillon et al. [20] showed the equivalence of spectral clustering and kernel k-means clustering,
which allowed expensive eigenvalue problems to be reframed in terms of trace-maximization objectives.
These objectives include maximizing in-cluster links/edges and minimizing cuts, i.e. the number
of links between any cluster and the remainder of the graph. Typically these objectives are normalized
relative to cluster size (number of nodes) or degree (sum of non-self connections/adjacency) but are
agnostic to the data on the graph. More recently, graph-based neural networks, such as DiffPool [21]
and MinCutPool [14], have been designed to take the data on the graph into account in soft-clustering,
trainable pooling operations. Ying et al. [21] developed DiffPool to enable a hierarchical
treatment of graph structure. DiffPool uses one GNN for the ultimate classification task and another,
with a softmax output, for the intermediary pooling task. The separate GNNs learn a soft (in the
sense of not binary and disjoint) assignment of nodes in the input graph to those in a smaller graph,
as well as derived features on the smaller embedded graph. Due to the non-convexity of the clustering
objective, an auxiliary entropy loss is used to regularize the training. Bianchi, Grattarola
and Alippi [14] developed MinCutPool based on a degree-normalized objective of minimizing the edges
between any cluster and the remainder of the graph. They relaxed the objective of finding a binary,
disjoint cluster assignment to a continuous analog, reducing the computational complexity.
Ultimately they proposed a network similar to DiffPool, albeit with the softmax
assignment GNN replaced by a softmax multilayer perceptron (MLP) and a different loss augmentation
designed to promote orthogonality of the clustering matrix mapping graph nodes to clusters.
Grattarola et al. [22] also generalized the myriad approaches to pooling and clustering on graphs
with their select-reduce-connect abstraction of these operations.
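The two MinCutPool loss terms can be sketched in a few lines of NumPy. The following is an illustrative toy, not the authors' implementation: the assignment MLP is replaced by random logits, and the graph is two small cliques joined by a single edge:

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 8, 2  # n graph nodes pooled into K clusters

# Toy adjacency: two 4-node cliques joined by one bridging edge.
A = np.zeros((n, n))
A[:4, :4] = 1
A[4:, 4:] = 1
np.fill_diagonal(A, 0)
A[3, 4] = A[4, 3] = 1
D = np.diag(A.sum(axis=1))  # degree matrix

# Soft cluster assignment S (rows sum to 1), here a softmax over
# stand-in random logits where MinCutPool would use an MLP.
logits = rng.normal(size=(n, K))
S = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Degree-normalized cut loss: minimizing it maximizes in-cluster edges
# relative to cluster degree (a relaxed normalized cut).
cut_loss = -np.trace(S.T @ A @ S) / np.trace(S.T @ D @ S)

# Orthogonality loss: pushes the columns of S toward orthogonality and
# similar cluster sizes, penalizing degenerate all-in-one assignments.
StS = S.T @ S
ortho_loss = np.linalg.norm(StS / np.linalg.norm(StS) - np.eye(K) / np.sqrt(K))
print(cut_loss, ortho_loss)
```

The cut loss lies in [-1, 0] and reaches its minimum for disjoint, well-separated clusters; the orthogonality term supplies the regularization that the relaxed cut objective alone lacks.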
Graph convolutional neural networks can be particularly opaque in how they achieve accurate
models. To interpret what convolutional networks are learning through their representations, filter
visualization and activation [23–25], saliency maps [26], sensitivity, attention-based, and related
techniques [27, 28] have been developed. These techniques have largely been applied to the ubiquitous
commercial classification tasks; for instance, Yosinski et al. [25] demonstrated how activations
at the deepest layers of a CNN correspond to obvious features, e.g. faces, in the original image used in
classification. Some of these techniques have enabled a degree of filter interpretation by
translating the filter output so that the learned features are obvious by inspection. For pixel-based
CNNs, deconvolution and inversion of the input-output map [24], as well as guided/regularized
optimization [25], have produced some insights. Some other methods rely on clustering; for example,
Local Interpretable Model-agnostic Explanations (LIME) [29] is based on segmentation and pertur-
bation of the cluster values. In general, clustering simplifies the input-output map created by the