In the weighted version, input also includes a weight function W:C→Rand the output of the query
is the weighted sum of the distinct colors in R, i.e., the value Pc∈CRW(c).
Colored range searching assumes that the colors are completely unstructured and that each point
receives exactly one color. However, these assumptions can be inadequate as hierarchical categories are
quite common. For example, biological classification of living organisms are done through a tree where
inclusion in a category implies inclusion in all the ancestor categories. In fact, similar phenomena happen
with respect to most notions of classification (e.g., classification of industries). In other scenarios, a point
may have multiple categories (e.g., a car can have a “brand”, a “color”, “fuel type” and so on). While
it is possible to view the set of categories assigned to a point as one single category, doing so ignores a
lot of structure. For example, “a blue diesel car” is both a “blue car” and also a “diesel car” but by
considering “blue diesel” as a single category, this information is lost. We believe it is worthwhile to
study the notion of structures within categories from a theoretical point of view. We are not in fact the
first in trying to do so and we will shortly discuss some of the previous attempts.
A natural way to represent hierarchical categories is to assume that vertices of a DAG Grepresent
the set of categories where an edge ~e = (u, v) from (a category) uto (a category) vmeans that uis a
sub-category of v. We call Gacategory DAG (or a category tree if it is a tree). For a vertex v∈ G, we
define G≤(v) as the subset of vertices of Gthat can reach v(i.e., “sub-categories” of v), G≥(v) as the
subset of vertices of Gthat vcan reach (i.e., “super-categories” of v). Similarly, for a subset H⊂V(G)
we define G≥(H) = ∪v∈HG≥(v), and G≤(H) = ∪v∈HG≤(v).
Category trees allows us to represent hierarchical categories. Category DAGs allow us to capture cases
where points can have multiple categories. Consider the car example again. We can define a category
DAG Gwhere a vertex u∈ G represents the compound category {diesel,blue}with edges to vertices
dand bthat represent “diesel” and “blue” categories respectively. Thus, the set G≤(d) represents all
the “diesel” cars and it includes the category u, the “blue diesel” category, and similarly, the set G≤(b)
represents all the “blue” cars which also includes the category u, the “blue diesel” category. Likewise,
G≥(u) includes both band dsince a “blue diesel” car is both a “blue car” and a “diesel car”.
We revisit colored range searching problems, using concepts of category DAGs or trees.
Problem 2 (sub-category range counting (SCRC)).Consider an input point set P⊂Rdof npoints, a
DAG Gwith O(n)edges, and a function C:P→ G. The goal is to store the input in a data structure,
such that given a query that consists of a query range R ⊂ Rdand a query vertex vq∈ G it can output
|CR∩ G≤(vq)|where CR={C(p)|p∈P∩ R}.
Problem 3 (hierarchical color counting (HCC)).Consider an input point set P⊂Rdof npoints, a
DAG Gwith O(n)edges, and a function C:P→ G. The goal is to store the input in a data structure,
such that given a query range R ⊂ Rdone can output |GR|where GRis the set of colors in R, i.e.,
GR=Sp∈R∩PG≥(C(p)).
In the weighted version of the problem, each vertex v(i.e., category) of Gis associated with a weight
w(v)and given the query R, the goal is to compute Pv∈GRw(v)instead.
Related problems. Very recently, there have been other attempts to consider the structure of
“colors” within the computational geometry community, e.g., He and Kazi [12] consider a problem very
similar to SCRC on a tree G; the only difference is that instead of |CR∩ G≤(v)|, the query outputs
|CR∩π(v, w)|where π(v, w) is a path between two query vertices v, w ∈ G. There are also other variants
available. See [12] for further references.
1.2 Previous and Other Related Results
The study of colored range counting and its variants began in 1993 [14] and since then it has received
considerable attention. See the survey on colored range searching and its variants [9]. The problem has
at least three main variants: color range counting (CRC), color range reporting, and “type 2” color range
counting (for every distinct color, count how many times it appears). Here, we only review colored range
counting results.
In 1D, one can solve the CRC problem using O(n) space and O(log n) query time by an elegant and
simple transformation [10] which turns the problem into the unweighted 2D orthogonal range counting
2