Triangle Counting Through Cover-Edges David A. Bader Fuhuan Li Anya Ganeshan Ahmet Gundogdu Jason Lew Oliver Alvarado Rodriguez Zhihui Du Department of Data Science

2025-05-06 0 0 359.92KB 7 页 10玖币
侵权投诉
Triangle Counting Through Cover-Edges
David A. Bader, Fuhuan Li, Anya Ganeshan+, Ahmet Gundogdu#, Jason Lew, Oliver Alvarado Rodriguez, Zhihui Du
Department of Data Science
New Jersey Institute of Technology
Newark, New Jersey, USA
{bader,fl28,jl247,oaa9,zhihui.du}@njit.edu
Abstract—Counting and finding triangles in graphs is often
used in real-world analytics to characterize cohesiveness and
identify communities in graphs. In this paper, we propose the
novel concept of a cover-edge set that can be used to find
triangles more efficiently. We use a breadth-first search (BFS)
to quickly generate a compact cover-edge set. Novel sequential
and parallel triangle counting algorithms are presented that
employ cover-edge sets. The sequential algorithm avoids unnec-
essary triangle-checking operations, and the parallel algorithm is
communication-efficient. The parallel algorithm can asymptoti-
cally reduce communication on massive graphs such as from real
social networks and synthetic graphs from the Graph500 Bench-
mark. In our estimate from massive-scale Graph500 graphs,
our new parallel algorithm can reduce the communication on
a scale 36 graph by 1156x and on a scale 42 graph by 2368x.
Index Terms—Graph Algorithms, Triangle Counting, Parallel
Algorithms, High Performance Data Analytics
I. INTRODUCTION
Triangle counting [1] is a fundamental problem in graph an-
alytics, which involves finding the number of unique triangles
in a graph. It plays a crucial role in various graph analysis
techniques such as clustering coefficients [2], k-truss [3], and
triangle centrality [4]. The significance of triangle counting
is evident in its application in high-performance computing
benchmarks like Graph500 [5] and the MIT/Amazon/IEEE
Graph Challenge [6], as well as in the design of future
architecture systems (e.g., IARPA AGILE [7]).
Both sequential and parallel triangle counting algorithms
have been studied extensively since 1977 [8]. Latapy [9]
provides a comprehensive overview of sequential triangle
counting and various finding algorithms. Existing techniques,
including list intersection, matrix multiplication, and subgraph
matching [10], are techniques used to count triangles.
To enhance the performance of triangle counting, Cohen
[11] introduced a novel map-reduce parallelization technique
that generates open wedges between triples of vertices in the
graph. It determines whether a closing edge exists to complete
a triangle, thus avoiding the redundant counting of the same
triangle while maintaining load balancing. Many parallel ap-
proaches for triangle counting [12], [13] partition the sparse
+Anya Ganeshan attends Bergen County Academies, Hackensack, NJ
#Ahmet Gundogdu attends Paramus High School, Paramus, NJ
This research was partially supported by NSF grant number CCF-
2109988.
graph data structure across multiple compute nodes and adopt
the strategy of generating open wedges, which are sent to other
compute nodes to determine the presence of a closing edge.
Consequently, the communication time for these open wedges
often dominates the running time of parallel triangle counting.
In traditional edge-based triangle counting methods, all tri-
angles are identified by accumulating the sizes of intersections
between pairs of endpoints for each edge. Direction-oriented
approaches can avoid counting the same triangle multiple
times. However, in this paper, we propose a novel approach
that efficiently identifies all triangles using a reduced set of
edges known as a cover-edge set. By leveraging the cover-
edge-based triangle counting method, unnecessary edge checks
can be skipped while ensuring that no triangles are missed.
This significantly reduces the number of computational op-
erations compared to existing methods. Furthermore, for dis-
tributed parallel algorithms, the cover-edge-based method can
greatly reduce overall communication requirements. As a
result, our proposed method offers improved efficiency and
scalability for triangle counting.
Our contributions include:
A novel concept, Cover-Edge Set, is proposed to support
efficient triangle counting. The essential idea is that we
can identify all triangles from a significantly reduced
cover-edge set instead of the complete edge set. A simple
breadth-first search (BFS) is used to orient the graph’s
vertices into levels and to generate the cover-edge set.
A novel triangle counting and finding algorithm, CETC,
is developed based on the concept of Cover-Edge Set.
CETC runs in O(m·dmax)time and O(n+m)space,
where dmax is the maximal degree of a vertex vV.
A novel communication-efficient distributed parallel al-
gorithm for triangle counting and finding, Comm-CETC,
is also developed based on the concept of Cover-Edge
Set.Comm-CETC can asymptotically reduce the commu-
nication to improve total performance.
The remainder of the paper is organized as follows. Sec-
tion II presents our new approach for triangle counting. In
Section III, we employ our idea in distributed parallel triangle
counting to reduce communication. Section V discusses related
work. Lastly, in Section VI, we conclude the paper.
arXiv:2210.00389v4 [cs.DC] 16 Sep 2023
s
Fig. 1. Example BFS tree of Graph G. The tree-edges are black, strut-edges
are blue, and horizontal-edges are red.
II. COVER-EDGE SET BASED TRIANGLE COUNTING
A. Notations and Basic Idea
Let G= (V, E)be an undirected graph with n=|V|
vertices and m=|E|edges. A triangle in the graph
is a set of three vertices {va, vb, vc} ⊆ Vsuch that
{(va, vb),(va, vc),(vb, vc)} ⊆ E. We will use N(v) = {u|u
V((v, u)E)}to denote the neighbor set of vertex vV.
The degree of vertex vVis d(v) = |N(v)|, and dmax is the
maximal degree of a vertex in graph G.
Definition 1 (Cover-Edge and Cover-Edge Set).For any edge
eof a triangle in graph G,eis referred to as a cover-edge
of . For a given graph G, an edge set SEis called a
cover-edge set if it contains at least one cover-edge for every
triangle in G.
Based on the given definition, it is evident that the entire
edge set Ecan serve as a cover-edge set Sfor graph G.
However, our proposed method aims to efficiently count all
triangles using a smaller subset of edges instead of E. Thus,
the primary challenge lies in generating a compact cover-edge
set, which forms the initial problem to be addressed in our
approach. Let k=|S|/|E|. Our goal is to identify cover-
edge sets with the smallest k. In this paper, we propose using
breadth-first search (BFS) to generate a compact cover-edge
set.
Definition 2 (BFS-Edge).Let rbe the root vertex of an
undirected graph G. The level L(v)of a vertex vis defined as
the shortest distance from rto vobtained through a breadth-
first search (BFS). From the BFS, we classify the edges into
three types:
Tree-Edges: These edges belong to the BFS tree.
Strut-Edges: These are non-tree edges with endpoints on
two adjacent levels in the BFS traversal.
Horizontal-Edges: These are non-tree edges with end-
points on the same level in the BFS traversal.
Fig. 1 gives an example of these different edge types.
Lemma 1. Each triangle {u, v, w}in a graph contains at
least one horizontal-edge.
Proof. (Proof by contradiction) A triangle is a path of length
3 that starts and ends at the same vertex. Suppose there are
no horizontal-edges in the triangle. In that case, every edge
in the path (i.e., a tree-edge or strut-edge) either increases or
decreases the level by one.
Since the path must end on the same level as the starting
vertex, the number of edges in the path that decrease the level
must be equal to the number of edges that increase the level.
Consequently, the length of the path must be even to maintain
level parity. However, this contradicts the fact that a triangle
has an odd path length of 3.
Therefore, we conclude that there must be at least one
horizontal-edge in every triangle.
Theorem 3 (Cover-Edge Set Generation).All horizontal-
edges form a valid cover-edge set.
Proof. According to Definition 1, for any triangle in graph
G, we can always find at least one horizontal-edge that serves
as a cover-edge for . Thus, the set of all horizontal-edges
constitutes a cover-edge set.
Therefore, we can construct a cover-edge set, denoted as
BFS-CES, by selecting all the horizontal-edges obtained during
a breadth-first search (BFS). It is evident that BFS-CES is a
subset of Eand is typically much smaller than the complete
edge set E.
B. Cover-Edge based Triangle Counting
In this subsection, we provide a comprehensive description
of the process involved in identifying all triangles using a
cover-edge set generated through a breadth-first search.
Lemma 2. Each triangle {u, v, w}must contain either one
or three horizontal-edges.
Proof. By referring to the proof of Lemma 1, we know that
the path corresponding to the triangle’s three edges consists
of an even number of tree-edges and strut-edges. This implies
that there can be either 0 or 2 tree- or strut-edges within each
triangle.
In the case where there are 0 tree- or strut-edges, all three
edges of the triangle must be horizontal-edges. This is because
the absence of tree- or strut-edges implies that the entire path
is composed of horizontal-edges.
In the case where there are 2 tree- or strut-edges, the triangle
contains exactly one horizontal-edge. This is because having
two tree- or strut-edges in the path means that there is one
horizontal-edge connecting the remaining two vertices.
Therefore, we conclude that each triangle {u, v, w}must
contain either one or three horizontal-edges.
Our triangle counting approach, described in Alg. 1, effi-
ciently counts triangles using a cover-edge set. In line 1, we
initialize the counter Tto 0, which will store the total number
of triangles. To generate the cover-edge set, we perform a
摘要:

TriangleCountingThroughCover-EdgesDavidA.Bader∗,FuhuanLi,AnyaGaneshan+,AhmetGundogdu#,JasonLew,OliverAlvaradoRodriguez,ZhihuiDuDepartmentofDataScienceNewJerseyInstituteofTechnologyNewark,NewJersey,USA{bader,fl28,jl247,oaa9,zhihui.du}@njit.eduAbstract—Countingandfindingtrianglesingraphsisoftenusedinr...

展开>> 收起<<
Triangle Counting Through Cover-Edges David A. Bader Fuhuan Li Anya Ganeshan Ahmet Gundogdu Jason Lew Oliver Alvarado Rodriguez Zhihui Du Department of Data Science.pdf

共7页,预览2页

还剩页未读, 继续阅读

声明:本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知玖贝云文库,我们立即给予删除!
分类:图书资源 价格:10玖币 属性:7 页 大小:359.92KB 格式:PDF 时间:2025-05-06

开通VIP享超值会员特权

  • 多端同步记录
  • 高速下载文档
  • 免费文档工具
  • 分享文档赚钱
  • 每日登录抽奖
  • 优质衍生服务
/ 7
客服
关注