Triangle Counting Through Cover-Edges
David A. Bader∗, Fuhuan Li, Anya Ganeshan+, Ahmet Gundogdu#, Jason Lew, Oliver Alvarado Rodriguez, Zhihui Du
Department of Data Science
New Jersey Institute of Technology
Newark, New Jersey, USA
{bader,fl28,jl247,oaa9,zhihui.du}@njit.edu
Abstract—Counting and finding triangles in graphs is often
used in real-world analytics to characterize cohesiveness and
identify communities in graphs. In this paper, we propose the
novel concept of a cover-edge set that can be used to find
triangles more efficiently. We use a breadth-first search (BFS)
to quickly generate a compact cover-edge set. Novel sequential
and parallel triangle counting algorithms are presented that
employ cover-edge sets. The sequential algorithm avoids unnec-
essary triangle-checking operations, and the parallel algorithm is
communication-efficient. The parallel algorithm can asymptoti-
cally reduce communication on massive graphs such as from real
social networks and synthetic graphs from the Graph500 Bench-
mark. In our estimate from massive-scale Graph500 graphs,
our new parallel algorithm can reduce the communication on
a scale 36 graph by 1156x and on a scale 42 graph by 2368x.
Index Terms—Graph Algorithms, Triangle Counting, Parallel
Algorithms, High Performance Data Analytics
I. INTRODUCTION
Triangle counting [1] is a fundamental problem in graph an-
alytics, which involves finding the number of unique triangles
in a graph. It plays a crucial role in various graph analysis
techniques such as clustering coefficients [2], k-truss [3], and
triangle centrality [4]. The significance of triangle counting
is evident in its application in high-performance computing
benchmarks like Graph500 [5] and the MIT/Amazon/IEEE
Graph Challenge [6], as well as in the design of future
architecture systems (e.g., IARPA AGILE [7]).
Both sequential and parallel triangle counting algorithms
have been studied extensively since 1977 [8]. Latapy [9]
provides a comprehensive overview of sequential triangle
counting and various finding algorithms. Existing techniques,
including list intersection, matrix multiplication, and subgraph
matching [10], are techniques used to count triangles.
To enhance the performance of triangle counting, Cohen
[11] introduced a novel map-reduce parallelization technique
that generates open wedges between triples of vertices in the
graph. It determines whether a closing edge exists to complete
a triangle, thus avoiding the redundant counting of the same
triangle while maintaining load balancing. Many parallel ap-
proaches for triangle counting [12], [13] partition the sparse
+Anya Ganeshan attends Bergen County Academies, Hackensack, NJ
#Ahmet Gundogdu attends Paramus High School, Paramus, NJ
∗
This research was partially supported by NSF grant number CCF-
2109988.
graph data structure across multiple compute nodes and adopt
the strategy of generating open wedges, which are sent to other
compute nodes to determine the presence of a closing edge.
Consequently, the communication time for these open wedges
often dominates the running time of parallel triangle counting.
In traditional edge-based triangle counting methods, all tri-
angles are identified by accumulating the sizes of intersections
between pairs of endpoints for each edge. Direction-oriented
approaches can avoid counting the same triangle multiple
times. However, in this paper, we propose a novel approach
that efficiently identifies all triangles using a reduced set of
edges known as a cover-edge set. By leveraging the cover-
edge-based triangle counting method, unnecessary edge checks
can be skipped while ensuring that no triangles are missed.
This significantly reduces the number of computational op-
erations compared to existing methods. Furthermore, for dis-
tributed parallel algorithms, the cover-edge-based method can
greatly reduce overall communication requirements. As a
result, our proposed method offers improved efficiency and
scalability for triangle counting.
Our contributions include:
•A novel concept, Cover-Edge Set, is proposed to support
efficient triangle counting. The essential idea is that we
can identify all triangles from a significantly reduced
cover-edge set instead of the complete edge set. A simple
breadth-first search (BFS) is used to orient the graph’s
vertices into levels and to generate the cover-edge set.
•A novel triangle counting and finding algorithm, CETC,
is developed based on the concept of Cover-Edge Set.
CETC runs in O(m·dmax)time and O(n+m)space,
where dmax is the maximal degree of a vertex v∈V.
•A novel communication-efficient distributed parallel al-
gorithm for triangle counting and finding, Comm-CETC,
is also developed based on the concept of Cover-Edge
Set.Comm-CETC can asymptotically reduce the commu-
nication to improve total performance.
The remainder of the paper is organized as follows. Sec-
tion II presents our new approach for triangle counting. In
Section III, we employ our idea in distributed parallel triangle
counting to reduce communication. Section V discusses related
work. Lastly, in Section VI, we conclude the paper.
arXiv:2210.00389v4 [cs.DC] 16 Sep 2023