Benchmarking GPU and TPU Performance with
Graph Neural Networks
Xiangyang Ju,1 Yunsong Wang,2 Daniel Murnane,3 Nicholas Choma,3 Steven Farrell2
and Paolo Calafiura3
1Physics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
2NERSC, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
3Scientific Data Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
E-mail: xju@lbl.gov, yunsongwang@lbl.gov, dtmurnane@lbl.gov,
njchoma@lbl.gov, sfarrell@lbl.gov, pcalafiura@lbl.gov
Abstract: Many artificial intelligence (AI) devices have been developed to accelerate the
training and inference of neural network models. The most common ones are the Graphics
Processing Unit (GPU) and Tensor Processing Unit (TPU). They are highly optimized for
dense data representations. However, sparse representations such as graphs are prevalent in
many domains, including science. It is therefore important to characterize the performance
of available AI accelerators on sparse data. This work analyzes and compares GPU
and TPU performance when training a Graph Neural Network (GNN) developed to solve a
real-life pattern recognition problem. Characterizing this new class of models acting on
sparse data may prove helpful in optimizing the design of deep learning libraries and
future AI accelerators.
arXiv:2210.12247v1 [cs.LG] 21 Oct 2022
Contents
1 Introduction 1
2 Graph Neural Network Benchmark 2
3 Hardware and Software 3
4 Comparison of TPU and GPU 4
5 Profiling and Roofline analysis 5
6 Conclusions 8
1 Introduction
Modern machine learning (ML) plays a critical role in numerous domains, including
computer vision, language processing, and speech and image recognition. Much of this
success is driven by three factors: novel deep learning models, large-scale datasets, and
massive computing power. ML models are getting wider and deeper, now reaching a trillion
trainable parameters [1]. To keep up with the demands of deep learning at the end of Moore's Law,
novel specialized computing devices, collectively known as AI accelerators, are necessary.
The most popular AI accelerator is the Graphics Processing Unit (GPU). GPUs are
optimized for massively parallel execution of simple code blocks, and well-suited for linear
algebra. Similarly, the Tensor Processing Unit (TPU) is optimized for matrix operations [2].
Both GPU and TPU are optimized to operate on dense matrices.
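The dense-versus-sparse distinction can be made concrete with a toy example. The same graph aggregation can be written either as a dense adjacency-matrix multiplication (the regular, matmul-shaped workload GPUs and TPUs excel at) or as a gather/scatter over an edge list (the irregular memory-access pattern typical of GNN workloads). The sketch below, using NumPy for illustration only (the benchmark itself is not NumPy-based), shows both formulations producing identical results:

```python
import numpy as np

# Toy graph: 4 nodes with 2 features each, 4 directed edges as an edge list.
edge_src = np.array([0, 1, 2, 3])
edge_dst = np.array([1, 2, 3, 0])
x = np.arange(8.0).reshape(4, 2)  # node feature matrix

# Dense formulation: materialize the full adjacency matrix and aggregate
# neighbor features with a matmul -- the operation accelerators optimize for.
A = np.zeros((4, 4))
A[edge_dst, edge_src] = 1.0
dense_out = A @ x

# Sparse formulation: gather features along edges, then scatter-add into the
# destination nodes -- irregular, data-dependent memory access, no matmul.
sparse_out = np.zeros_like(x)
np.add.at(sparse_out, edge_dst, x[edge_src])

assert np.allclose(dense_out, sparse_out)  # same aggregation, different cost profile
```

For realistic graphs the adjacency matrix is overwhelmingly zero, so the dense form wastes compute and memory, while the sparse form trades matmul throughput for scatter/gather bandwidth; this tension is what the benchmark probes.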
MLPerf [3] is a machine learning benchmark suite that has gained industry-wide sup-
port and recognition. It includes computer vision, language processing, recommendation
systems, and gaming applications. Other ML benchmarks have been proposed, such as
ParaDNN [4], which focuses on fully connected, convolutional, and recurrent neural
networks. These benchmarks, while essential for fair comparisons among different archi-
tectures, do not capture the performance characteristics of models (such as Graph Neural
Networks) that operate on sparse and irregular datasets. Sparse datasets are common
in science applications such as molecular dynamics, genomics, and High Energy Physics
(HEP). Graph representations and GNNs have seen rapidly growing adoption in these
science domains; Ref. [5] reviews GNN applications in HEP. This work uses as a benchmark
a GNN model that solves a combinatorially hard pattern recognition problem on Large
Hadron Collider data.
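To give a flavor of this kind of pattern recognition task (the actual benchmark model is described in Section 2): detector hits become graph nodes, candidate hit pairs become edges, and the model scores each edge as belonging to the same particle track or not. The sketch below is a hypothetical, heavily simplified stand-in, not the authors' architecture; the edge-scoring function and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 5 detector hits with 2 features each; 4 candidate
# edges (hit pairs) to classify as same-track (score near 1) or not.
hits = rng.normal(size=(5, 2))
edge_src = np.array([0, 1, 2, 0])
edge_dst = np.array([1, 2, 3, 4])

def edge_scores(h, src, dst, w):
    """Score each candidate edge from the concatenated endpoint features."""
    pair = np.concatenate([h[src], h[dst]], axis=1)  # (n_edges, 4)
    return 1.0 / (1.0 + np.exp(-pair @ w))           # sigmoid, one score per edge

w = rng.normal(size=4)  # stand-in for learned weights
scores = edge_scores(hits, edge_src, edge_dst, w)
print(scores.shape)  # one score per candidate edge: (4,)
```

Thresholding such edge scores partitions the hit graph into track candidates; the combinatorial hardness comes from the number of candidate edges growing rapidly with detector occupancy.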
This paper is organized as follows. Section 2 describes the benchmark GNN model
and the dataset used for training it. Section 3 introduces the hardware platforms used for
this study. Section 4 provides a thorough comparison of TPU and GPU. The performance
analysis is described in Section 5. Outlook and conclusions are in Section 6.