Deep learning and multi-level featurization of
graph representations of microstructural data
Reese Jones, Cosmin Safta, Ari Frankel
Sandia National Laboratories, Livermore, CA 94551
Abstract
Many material response functions depend strongly on microstructure, such as inhomo-
geneities in phase or orientation. Homogenization presents the task of predicting the mean
response of a sample of the microstructure to external loading for use in subgrid models and
structure-property explorations. Although many microstructural fields have obvious segmen-
tations, learning directly from the graph induced by the segmentation can be difficult because
this representation does not encode all the information of the full field. We develop a means
of deep learning of hidden features on the reduced graph given the native discretization and a
segmentation of the initial input field. The features are associated with regions represented as
nodes on the reduced graph. This reduced representation is then the basis for the subsequent
multi-level/scale graph convolutional network model. There are a number of advantages of
reducing the graph before fully processing it with convolutional layers, such as interpretable
features and efficiency on large meshes. We demonstrate the performance of the proposed
network relative to convolutional neural networks operating directly on the native discretization
of the data using three physical exemplars.
1 Introduction
Newly developed graph neural networks (GNNs) [1–3], in particular convolutional graph neural
networks, have been shown to be effective in a variety of classification and regression tasks. Recently
they have been applied to physical problems [4, 5] where they can accommodate unstructured
and hierarchical data naturally. Analogous to pixel-based convolutional neural networks (CNNs),
“message passing” [6] graph convolutional neural networks (GCNNs) [7, 8] employ convolutional
operations to achieve a compact parameter space by exploiting correlations in the data through
connectivity defined by adjacency on the source discretization. Frankel et al. [9] and others [10–13]
derive the information transmission graph directly from the connectivity of the discretization
(computational grid or mesh) based on the assumption that the physical interactions are local. Some
obvious advantages of applying convolutions to the discretization graph are that: general mesh data
can be handled without interpolation to a structured grid, the discretization can be conformal to
the microstructure, periodic boundary conditions can be handled without padding, and topological
irregularities can be accommodated without approximations. In this approach the kernels and
number of parameters are similar for a pre-selected reduction of the representation, e.g. based on the
grains in a polycrystal [5], but the size of the adjacency can be prohibitive. Naively constructed and
Corresponding author: rjones@sandia.gov
arXiv:2210.00854v1 [cs.LG] 29 Sep 2022
applied, these mesh-based graph models can operate on very large graphs; however, graph reduction
is important for efficiency and can promote learning [14]. While physical problems with short-
(e.g. interface) and long-range (e.g. elastic) interactions are ubiquitous in engineering and materials
science, treating the longer scales with convolutions on the discretization graph can be inefficient
or ineffective.
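To make the message-passing idea concrete, here is a minimal NumPy sketch of one graph-convolution layer over a mesh-derived adjacency. It uses the common symmetric normalization of Kipf and Welling rather than the specific layer of the works cited; the toy adjacency, features, and weights are all illustrative.

```python
import numpy as np

def gcn_layer(X, A, W):
    """One graph-convolution step: symmetric-normalized neighborhood
    averaging followed by a learned linear map and ReLU."""
    A_hat = A + np.eye(A.shape[0])          # add self-connections
    d = A_hat.sum(axis=1)                   # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt
    return np.maximum(A_norm @ X @ W, 0.0)

# toy "mesh": 4 cells in a line, adjacency from shared faces
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.random.rand(4, 2)   # per-cell inputs (e.g. orientation, volume)
W = np.random.rand(2, 8)   # trainable weights
H = gcn_layer(X, A, W)     # (4, 8) hidden features
```

Because the kernel `W` is shared across all nodes, the parameter count is independent of mesh size, which is what makes convolution on the discretization graph compact.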
Treating aggregated data via graph pooling based on data clustering is a long-standing field of
research with broad applications, e.g. image classification [15], mesh decomposition [16], and chem-
ical structure [17]. Since clustered data rarely has a regular topology, graph based networks are
natural representations. In particular, compared to a CNN operating on a pixelized image, a GCNN
operating on an unstructured mesh has less well-defined spatial locality and inherits the varying
neighborhood size of the source data. Akin to topologically localized convolutions supplanting
spectral-based convolutions [7, 18] and spectral pre-processing [19] in the literature, spectral clus-
tering based on the primary eigenvectors of the graph Laplacian has been largely superseded by
less expensive, more easily differentiable techniques, some of which connect to spectral clustering.
Dhillon et al. [20] showed the equivalence of spectral clustering and kernel k-means clustering,
which allowed expensive eigenvalue problems to be reframed in terms of trace maximization objectives.
These objectives include maximizing in-cluster links/edges and minimizing cuts, i.e. the number
of links between any cluster and the remainder of the graph. Typically these objectives are normalized
relative to cluster size (number of nodes) or degree (sum of non-self connections/adjacency), but they are
agnostic to the data on the graph. More recently, graph-based neural nets, such as DiffPool [21]
and MinCutPool [14], have been designed to take into account the data on the graph in soft clus-
tering, trainable pooling operations. Ying et al. [21] developed DiffPool to enable a hierarchical
treatment of graph structure. DiffPool uses a GNN for the ultimate classification task and another
with a softmax output for the intermediary pooling task. The separate GNNs learn a soft (i.e. not
binary and disjoint) assignment of nodes in the input graph to those in a smaller graph
as well as derivative features on the smaller embedded graph. Due to non-convexity of the clus-
tering objective, an auxiliary entropy loss is used to regularize the training. Bianchi, Grattarola
and Alippi [14] developed MinCutPool based on a degree normalized objective of minimizing edges
between any cluster and the remainder of the graph. They relaxed the objective of finding a binary,
disjoint cluster assignment to reduce the computational complexity by recasting the problem to a
continuous analog. Ultimately they proposed a network similar to DiffPool albeit with the softmax
assignment GNN replaced by a softmax multilayer perceptron (MLP) and different loss augmenta-
tion designed to promote orthogonality of the clustering matrix mapping graph nodes to clusters.
Grattarola et al. [22] also generalized the myriad approaches to pooling and clustering on graphs
with their select-reduce-connect abstraction of these operations.
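A minimal sketch of the soft-assignment pooling common to DiffPool- and MinCutPool-style layers (not a reproduction of either method): a softmax assignment matrix S maps N fine nodes to K clusters, and the pooled features and adjacency follow as S^T X and S^T A S. The single linear layer standing in for the assignment MLP is illustrative.

```python
import numpy as np

def softmax(Z, axis=-1):
    E = np.exp(Z - Z.max(axis=axis, keepdims=True))
    return E / E.sum(axis=axis, keepdims=True)

def soft_pool(X, A, W):
    """Soft cluster assignment S = softmax(X W); pooled features
    X' = S^T X and coarsened adjacency A' = S^T A S."""
    S = softmax(X @ W)        # (N, K): each row sums to 1
    X_pool = S.T @ X          # (K, F) cluster features
    A_pool = S.T @ A @ S      # (K, K) cluster connectivity
    return S, X_pool, A_pool

rng = np.random.default_rng(0)
N, F, K = 6, 3, 2
X = rng.random((N, F))
U = np.triu((rng.random((N, N)) > 0.5).astype(float), 1)
A = U + U.T                   # random symmetric adjacency
S, Xp, Ap = soft_pool(X, A, rng.random((F, K)))
```

The auxiliary losses mentioned above (entropy for DiffPool, orthogonality for MinCutPool) would be added on top of S to push the soft assignment toward a near-disjoint clustering.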
Graph convolutional neural networks can be particularly opaque in how they achieve accurate
models. To interpret what convolution networks are learning through their representations, filter
visualization and activation [23–25], saliency maps [26], sensitivity, attention based and related
techniques [27, 28] have been developed. These techniques have been largely applied to ubiquitous
commercial classification tasks; for instance, Yosinski et al. [25] demonstrated how activations
at the deepest layers of a CNN correspond to obvious features in the original image used in clas-
sification, e.g. faces. Some of these techniques have enabled a degree of filter interpretation by
translating the filter output so that the learned features are obvious by inspection. For pixel-based
CNNs, deconvolution and inversion of the input-output map [24], as well as guided/regularized
optimization [25], have produced some insights. Some other methods rely on clustering; for example,
Local Interpretable Model-agnostic Explanations (LIME) [29] is based on segmentation and pertur-
bation of the cluster values. In general, clustering simplifies the input-output map created by the
convolution filter and therefore the complexity of the image-to-image convolutional transformation.
Given that in many physical homogenization problems the segmentation of the domain based
on specific features, e.g. phase or orientation, of the constituents is readily accessible but the
informative features on the sub-domains are not, we propose a means of constructing a graph-based
representation where the features on the nodes representing the sub-domains are learned rather than
pre-selected, while the clustered topology is known. The concept resembles the reduction-prolongation
operations of algebraic multi-grid approaches [30], the architecture of graph U-nets [31] and the
combination of multigrid with pixel-based CNNs [32]. Beyond developing a GCNN architecture
to efficiently reduce data on the native discretization graph to logical clustering without manual
feature engineering, a secondary goal is to provide some insight into how the resulting models learn
an accurate representation of the data.
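The reduction step with a known segmentation can be sketched as a hard (binary, disjoint) assignment matrix built from the segmentation labels, through which fine-scale cell features are aggregated to region nodes. The mean aggregation and the feature values below are illustrative, not the paper's learned featurization.

```python
import numpy as np

def segmentation_pool(X, labels, n_regions):
    """Aggregate fine-scale (cell) features to region nodes using a
    hard assignment matrix built from a known segmentation."""
    N = X.shape[0]
    S = np.zeros((N, n_regions))
    S[np.arange(N), labels] = 1.0       # binary, disjoint assignment
    counts = S.sum(axis=0, keepdims=True)
    X_regions = (S.T @ X) / counts.T    # mean feature per region
    return S, X_regions

# 5 cells segmented into 2 regions (e.g. grains)
X = np.array([[1.0], [1.0], [3.0], [3.0], [3.0]])
labels = np.array([0, 0, 1, 1, 1])
S, Xr = segmentation_pool(X, labels, 2)
# Xr -> [[1.0], [3.0]], one feature row per region
```

In the proposed architecture the per-region features would be hidden features learned by convolutional layers on the fine graph, rather than the raw means shown here.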
This paper is organized as follows. In Sec. 2 we describe the homogenization problems and in
Sec. 3 we present the three physical exemplars we model with GCNNs. Sec. 4 describes the proposed
architecture and contrasts it with traditional CNN and GCNN-based architectures. Sec. 5 outlines the
techniques we use to interpret the flow of information in the networks. Then in Sec. 6 we demonstrate
the performance and interpretation of the proposed framework. With the idea that the simplest
accurate model is likely to be the most interpretable, we compare the proposed deep reduced GCNN
to analogous architectures based on previous GCNN [9] and CNN [33] representations. We conclude
in Sec. 7 with a discussion of the proposed architecture in the context of soft, learned clustering,
and extensions left for future work.
2 Homogenization
Homogenization of the physical response of a sample with complex structure represents a class
of problems with broad physical applications, such as creating sub-grid models of materials with
multiple phases, orientations or other microstructural aspects. For each particular physical problem
the input is a sample of the detailed structure in a regular domain and the output is the average
or total response of the sample subject to external loading. Samples that are of a size that is
representative of the large sample limit are called representative volume elements (RVEs) and those
that are smaller, where statistical variations are apparent in the output, are called statistical volume
elements (SVEs) [34, 35]; this distinction is clearly a matter of degree.
For example, the volume average of the stress evolution σ̄(t) as a function of the (boundary loading)
strain ε̄(t) is often of interest in developing material models. The boundary value problem over the
sample gives a field, in this example the stress σ(X, t), as a function of position X and time t:
$$\sigma(X, t) = \sigma(\phi(X), \bar{\epsilon}(t)), \qquad (1)$$
where φ(X) is a field describing the initial microstructure, and ε̄(t) is an external loading applied
through the boundary conditions. Homogenization aims to represent the average response over the
sample domain Ω:
$$\bar{\sigma}(t) = \frac{1}{V} \int_\Omega \sigma(X, t) \, \mathrm{d}V \qquad (2)$$
as a function of φ(X) and ε̄(t). Here V is the volume of the sample domain Ω. The field φ(X)
can be interpreted as a (potentially multichannel) image resulting from, for instance, computed
tomography and/or electron backscatter diffraction (refer to Fig. 1). For this work we assume that
φ(X) over the sample domain Ω can be readily segmented into disjoint regions Ω_K that each have a
uniform value φ_K, as in Fig. 1. The discretization necessary to resolve the fields leads to the number
of discretization cells N_cells being much greater than the number of regions N_clusters in general.
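The discrete analogs of the volume average, Eq. (2), and of the volume fractions χ_K used below can be sketched as follows; the cell volumes, stresses, and segmentation labels are toy values.

```python
import numpy as np

def volume_average(sigma_cells, vol_cells):
    """Discrete form of Eq. (2): volume-weighted mean over cells."""
    V = vol_cells.sum()
    return (vol_cells[:, None] * sigma_cells).sum(axis=0) / V

def volume_fractions(vol_cells, labels, n_regions):
    """chi_K = V_K / V from a segmentation of the cells into regions."""
    V = vol_cells.sum()
    return np.array([vol_cells[labels == K].sum()
                     for K in range(n_regions)]) / V

vols = np.array([1.0, 1.0, 2.0])      # unstructured meshes vary in cell volume
sig = np.array([[1.0], [2.0], [4.0]])  # one stress component per cell
labels = np.array([0, 0, 1])           # cell -> region assignment
sbar = volume_average(sig, vols)       # -> [2.75]
chi = volume_fractions(vols, labels, 2)  # -> [0.5, 0.5]
```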
Homogenization traditionally has relied on analytical models based on simplifying assumptions.
Mixture rules, where the individual contributions of the constituents of the microstructure are summed
and weighted by their volume fractions χ_K = V_K/V, have generally been utilized in approximate
analytical homogenization. For instance, the approximation
$$\bar{\sigma} = \frac{1}{V} \sum_K \int_{\Omega_K} \sigma \, \mathrm{d}V \approx \sum_K \chi_K \sigma_K, \qquad (3)$$
where σ_K = σ(φ_K, ε_K), is widely employed. Note that mixture rules such as these typically pair
extensive partitioning variables, such as χ_K, with intensive variables, such as φ_K. The approximation
Eq. (3) leads to an estimate of the elastic moduli C̄:
$$\bar{C} \approx \sum_K \chi_K C_K, \qquad (4)$$
assuming the strain is homogeneous in the sample. Homogeneous flux or homogeneous field
gradient are typical simplifying assumptions for homogenizing gradient-driven fluxes such as stress
and heat flux. These complementary limiting cases can be combined, as in the classic Hill average
estimator [36] for the elastic modulus tensor:
$$\bar{C} = \frac{1}{2}\left[\sum_K \chi_K C_K + \left(\sum_K \chi_K C_K^{-1}\right)^{-1}\right]. \qquad (5)$$
Note that assuming uniform stress or strain in all Ω_K in the domain Ω omits compatibility requirements
that make the actual field dependent on differences in φ_K along interfaces (e.g. orientation
differences (misorientation) along inter-crystal boundaries [37]).
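As an illustration, the Hill estimator of Eq. (5) can be sketched in a few lines; the two scalar "stiffnesses" below are toy values standing in for the moduli tensors C_K.

```python
import numpy as np

def hill_average(C_list, chi):
    """Hill estimate, Eq. (5): mean of the homogeneous-strain (Voigt)
    and homogeneous-stress (Reuss) mixture rules for stiffness
    matrices C_K with volume fractions chi_K."""
    voigt = sum(x * C for x, C in zip(chi, C_list))
    reuss = np.linalg.inv(sum(x * np.linalg.inv(C)
                              for x, C in zip(chi, C_list)))
    return 0.5 * (voigt + reuss)

# two phases, 1x1 "tensors" for illustration
C_list = [np.array([[2.0]]), np.array([[8.0]])]
chi = [0.5, 0.5]
C_bar = hill_average(C_list, chi)
# Voigt = 5.0, Reuss = 1/(0.5/2 + 0.5/8) = 3.2, Hill = 4.1
```

The same code applies unchanged to full (matrix-form) moduli tensors, since only sums and matrix inverses are used.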
Figure 1: Electron backscatter diffraction (EBSD) image of a polycrystalline metal (courtesy of
Brad Boyce, Sandia). Colors indicate the crystal orientations.
3 Datasets
To demonstrate the efficacy of the proposed architecture we will use a few exemplars: (a) heat
conduction in polycrystals, (b) stress evolution in polycrystalline materials undergoing plastic de-
formation, and (c) stress evolution in viscoelastic composite materials. The aspects of the data
germane to machine learning are described in the following section and the remaining simulation
details are given in App. A. With these exemplars we generate both 2D and 3D datasets. For each
exemplar only the apparent orientation or phase information and the discretization cell volumes
are provided to the GCNN models. The inclusion of cell volumes is motivated by the classical
mixture formulas since the unstructured meshes have a distribution of element volumes. The 3D
datasets demonstrate the full generality of the neural network architectures described in Sec. 4 and
test how they perform on samples with large discretizations, while the 2D datasets facilitate
exploration through visualization. Some of the datasets involve data pixelated on structured grids,
which are treated as graphs by the graph convolutional neural networks. These datasets enable
direct comparison with pixel-based CNNs.
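A sketch of how a pixelated dataset can be recast as a graph: build a 4-neighbor adjacency for the grid, optionally with periodic wrap-around so that, unlike a pixel CNN, no padding is needed. The 4×4 grid size is illustrative.

```python
import numpy as np

def grid_adjacency(nx, ny, periodic=True):
    """4-neighbor adjacency for an nx-by-ny pixel grid, optionally
    with periodic wrap-around (no padding needed, unlike a CNN)."""
    idx = lambda i, j: (i % nx) * ny + (j % ny)  # flatten with wrap
    A = np.zeros((nx * ny, nx * ny))
    for i in range(nx):
        for j in range(ny):
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ii, jj = i + di, j + dj
                if periodic or (0 <= ii < nx and 0 <= jj < ny):
                    A[idx(i, j), idx(ii, jj)] = 1.0
    return A

A = grid_adjacency(4, 4)
# under periodic boundaries every pixel has exactly 4 neighbors
```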
3.1 Polycrystal heat conduction
The simplest dataset was generated with two-dimensional simulations of steady heat conduction
through ersatz polycrystals represented on a triangular mesh. The problem has the complexity
that the conductivity tensor κ in each crystal depends on the orientation φ; however, the governing
partial differential equation is linear in the conductivity and temperature fields. Fig. 2 shows a
few of the 10,000 realizations generated, which had 211 to 333 elements (mean 235.8) and 14
to 18 crystals (mean 15.4). Also apparent is the deviation of the temperature fields from homogeneous
gradients due to the anisotropy of the crystal components of the samples.
As Fig. 3a shows, the volume-averaged angle φ̄ for a realization is not particularly correlated
with the output. However, if the crystal conductivities are known, mixture rules analogous to Eq. (5)
for a homogeneous gradient,
$$\bar{\kappa} = \sum_K \chi_K \kappa_K, \qquad (6)$$
or a homogeneous flux,
$$\bar{\kappa} = \left[\sum_K \chi_K \kappa_K^{-1}\right]^{-1}, \qquad (7)$$
give reasonably well-correlated estimates of the effective conductivity κ̄. These estimates are shown
in Fig. 3b for each of the realizations in the ensemble. Clearly the mixture estimates are biased
since the assumptions of homogeneous gradient or homogeneous flux are limiting cases.
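These two estimates can be sketched directly: an orientation-dependent conductivity κ(φ) built by rotating principal values, then the homogeneous-gradient, Eq. (6), and homogeneous-flux, Eq. (7), mixtures. The principal conductivities, orientations, and fractions are assumed values, not those of the simulations.

```python
import numpy as np

def kappa(phi, k1=2.0, k2=1.0):
    """Anisotropic crystal conductivity: illustrative principal
    values k1, k2 rotated by the orientation angle phi."""
    c, s = np.cos(phi), np.sin(phi)
    R = np.array([[c, -s], [s, c]])
    return R @ np.diag([k1, k2]) @ R.T

def mixture_estimates(phis, chi):
    """Homogeneous-gradient, Eq. (6), and homogeneous-flux, Eq. (7),
    estimates of the effective conductivity."""
    ks = [kappa(p) for p in phis]
    grad = sum(x * k for x, k in zip(chi, ks))
    flux = np.linalg.inv(sum(x * np.linalg.inv(k)
                             for x, k in zip(chi, ks)))
    return grad, flux

# two crystals with equal volume fractions
grad, flux = mixture_estimates([0.0, np.pi / 4], [0.5, 0.5])
```

The homogeneous-flux estimate never exceeds the homogeneous-gradient estimate (harmonic vs. arithmetic mean of positive-definite tensors), consistent with the two being limiting bounds.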
3.2 Polycrystals with crystal plasticity
Crystal plasticity (CP) is another common example where the homogenized response of a repre-
sentative sample of grain structure characterized by the orientation angles φ(X) is of interest. As
detailed in App. A.2, the response of the polycrystalline metal samples is complex and non-linear
in the loading and properties. A set of 12,000 2D realizations on a 32×32 grid and another 10,000 3D
realizations on a 25×25×25 grid were simulated independently; Fig. 4 shows a few of the 2D
realizations. Fig. 4 also illustrates the inhomogeneities in the stress response (at 0.3% strain in tension), which are particularly
marked at grain boundaries. The 2D realizations had 5 to 30 grains (with a mean at 13.9) and the
3D realizations had 1 to 34 grains (with a mean at 13.0).