KENCH et al. MICROLIB PREPRINT
In this paper, we aim to address the disparity between the availability of 2D and 3D micrographs. A
number of previous approaches have been developed to address this problem through dimensionality expansion,
which commonly entails generating 3D micrographs from the statistics of a 2D training image.
These are typically physics-based and require the extraction of particular metrics from the training data for
comparison. Examples include sphere packing models using 2D particle size distributions [14], poly-crystalline
grain growth algorithms [15], and data fusion approaches [16].
In this work, we use SliceGAN, a recently developed convolutional machine learning algorithm for dimensionality
expansion [17]. A typical generative adversarial network (GAN) uses two convolutional networks (a generator and a
discriminator) to learn to mimic dataset distributions. The generator synthesises fake examples, and the
discriminator identifies differences between these fake samples and the true training data distribution. Through
iterative learning, the discriminator informs the generator how to make increasingly realistic samples that match
the real training data. Importantly, in a typical set-up, the dimensionality of the generated images and the
training data match. To accommodate different dimensionalities, SliceGAN uses a simple modification: a 3D
generator network produces a sample volume, then a 2D discriminator checks the fidelity of one slice at a time,
so that the dimensionality of each slice matches that of the 2D training images. The algorithm is described in
full in the original manuscript [17]. SliceGAN is particularly well suited to the task at hand due to a number of
key features. First,
broad applicability means that the same algorithm and hyper-parameters can be used for a very diverse set of
microstructures, as demonstrated in this dataset. Second, high-speed training (typically 3 hours on an RTX6000
GPU) and generation (<3 seconds for a 500³ voxel volume) enable the synthesis of hundreds of large samples
for statistical experiments, as well as the generation of volumes far larger than it is currently possible to obtain
directly through imaging (>2000³ voxels). Third, complete automation of the 2D to 3D algorithm is possible,
with no user-defined inputs, such as statistical features, being required. This combination of strengths makes
SliceGAN an excellent candidate for building the first large-scale 3D microstructural database from existing
open-source 2D data.
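The slicing step described above, in which a 3D generator output is checked one 2D slice at a time, can be illustrated with a minimal sketch. This is not the SliceGAN implementation: the random array is only a stand-in for a generated volume, and the helper name `slices_along_axes` is hypothetical. It shows how a cubic volume can be cut into 2D planes along each of its three axes, giving images with the same dimensionality as the 2D training data.

```python
import numpy as np


def slices_along_axes(volume):
    """Return every 2D slice of a cubic 3D volume, cut along x, y and z."""
    if volume.ndim != 3 or len(set(volume.shape)) != 1:
        raise ValueError("expected a cubic 3D array")
    slices = []
    for axis in range(3):
        # Move the slicing axis to the front; iterating then yields 2D planes.
        planes = np.moveaxis(volume, axis, 0)
        slices.extend(planes)
    # Stack into a batch of 2D images: (3 * edge_length, edge_length, edge_length).
    return np.stack(slices)


rng = np.random.default_rng(0)
fake_volume = rng.random((8, 8, 8))  # stand-in for a generator output
batch = slices_along_axes(fake_volume)
print(batch.shape)  # (24, 8, 8)
```

In a training loop, a batch like this would be passed to a 2D discriminator alongside real 2D micrograph crops, so that every orientation of the generated volume is compared against the training data.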
The benefits of this database are twofold. First, we provide a diverse 3D microstructural dataset which can be
used by the materials science community for modelling purposes. Crucially, users are not limited to the single
example cube we provide, as each data entry also has an associated trained generator neural network (45 MB
in size) available to download. This can be used to synthesise datasets of arbitrary size by cloning the SliceGAN
repository and running the relevant scripts (see Methods). The second important function of this database is as a
demonstration to the materials science community of the strengths of SliceGAN. The entries we provide are
diverse in nature and hosted on an easily searchable website. Interested researchers can thus use this
website to check whether SliceGAN works on materials in their research field, and see examples of generated
outputs. This encourages the submission of more entries to the database, and the further use of SliceGAN in the
field of computational materials. The key data processing steps and datasets are presented in Figure 1.
Methods
As shown in Figure 1, the database construction required several distinct steps. First, a subset of micrographs
was selected using a set of exclusion criteria. A number of simple pre-processing operations were then applied
to ensure suitability for the SliceGAN workflow. An automated in-painting method was used to remove scale
bars from the micrographs; compared to a cropping approach, this preserves crucial data in an already extremely
data-scarce setting. Finally, the resulting micrographs were used to train SliceGAN generators, each of which was
used to generate an example 320³ voxel cubic volume. Each of these steps can be reproduced by cloning the MicroLib
repository and running main.py in the relevant modes, as described in the repository README.
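To make the data saving from in-painting concrete, the short calculation below compares it with cropping for a single micrograph. All numbers are hypothetical and chosen only for illustration (a 512 × 512 pixel image with a 40 × 160 pixel scale bar on its bottom edge); they are not taken from the MicroLib dataset, and the function names are likewise illustrative.

```python
# Hypothetical example values; none of these come from the MicroLib paper.
HEIGHT, WIDTH = 512, 512          # micrograph size in pixels
BAR_HEIGHT, BAR_WIDTH = 40, 160   # bounding box of the scale bar (bottom edge)


def pixels_lost_by_cropping(width, bar_height):
    """Cropping removes every full image row that overlaps the scale bar."""
    return width * bar_height


def pixels_replaced_by_inpainting(bar_height, bar_width):
    """In-painting only re-synthesises the pixels underneath the scale bar."""
    return bar_height * bar_width


total = HEIGHT * WIDTH
cropped = pixels_lost_by_cropping(WIDTH, BAR_HEIGHT)
inpainted = pixels_replaced_by_inpainting(BAR_HEIGHT, BAR_WIDTH)
saved = cropped - inpainted  # real pixels that cropping would have discarded

print(f"cropping discards  {cropped} px ({cropped / total:.1%} of the image)")
print(f"in-painting alters {inpainted} px ({inpainted / total:.1%} of the image)")
print(f"real pixels kept by in-painting rather than cropping: {saved}")
```

Under these assumed dimensions, cropping would discard 20480 pixels, whereas in-painting replaces only the 6400 pixels under the bar and keeps the remaining 14080 real pixels in those rows.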
Exclusion criteria
DoITPoMS includes 818 diverse micrographs, which can easily be downloaded directly from its website.
However, not all are suitable for SliceGAN, which has a number of limitations. As such, the following exclusion
criteria were applied to leave 87 feasible microstructures:
1. Microstructure isotropy – SliceGAN can be used for some anisotropic microstructures, but this mode
requires multiple perpendicular micrographs which are not available from DoITPoMS.
2