
SPABERT: A Pretrained Language Model from Geographic Data for
Geo-Entity Representation
Zekun Li1, Jina Kim1, Yao-Yi Chiang1, Muhao Chen2
1Department of Computer Science and Engineering, University of Minnesota, Twin Cities
2Department of Computer Science, University of Southern California
{li002666,kim01479,yaoyi}@umn.edu, muhaoche@usc.edu
Abstract
Named geographic entities (geo-entities for short) are the building blocks of many geographic datasets. Characterizing geo-entities is integral to various application domains, such as geo-intelligence and map comprehension, while a key challenge is to capture the spatially varying context of an entity. We hypothesize that we shall know the characteristics of a geo-entity by its surrounding entities, similar to knowing word meanings by their linguistic context. Accordingly, we propose a novel spatial language model, SPABERT, which provides a general-purpose geo-entity representation based on neighboring entities in geospatial data. SPABERT extends BERT to capture linearized spatial context, while incorporating a spatial coordinate embedding mechanism to preserve spatial relations of entities in the 2-dimensional space. SPABERT is pretrained with masked language modeling and masked entity prediction tasks to learn spatial dependencies. We apply SPABERT to two downstream tasks: geo-entity typing and geo-entity linking. Compared with existing language models that do not use spatial context, SPABERT shows significant performance improvement on both tasks. We also analyze the entity representation from SPABERT in various settings and the effect of spatial coordinate embedding.
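The spatial coordinate embedding mentioned above can be illustrated with a small, hedged sketch. The function below is hypothetical: the name, dimensionality, and the sinusoidal form are our assumptions for illustration, not SPABERT's exact formulation. It maps a continuous distance value to a fixed-size vector, so that spatial relations between a geo-entity and its neighbors can be encoded alongside token embeddings.

```python
import math

def distance_embedding(dist, dim=8, base=10000.0):
    """Map a scalar distance to a `dim`-dimensional sinusoidal vector.

    Hypothetical sketch: continuous sinusoidal features (in the spirit of
    Transformer position encodings) let arbitrarily close or distant
    neighbors receive smoothly varying embeddings, unlike the discrete
    position indices of vanilla BERT.
    """
    emb = []
    for i in range(dim // 2):
        # Lower i -> higher frequency; mirrors the standard sinusoidal scheme.
        freq = dist / (base ** (2 * i / dim))
        emb.extend([math.sin(freq), math.cos(freq)])
    return emb

vec = distance_embedding(3.5)  # 8-dimensional vector for distance 3.5
```

A continuous encoding of this kind is one natural design choice because geo-entity distances are real-valued, whereas word positions in a sentence are integers.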
1 Introduction
Interpreting human behaviors requires considering human activities and their surrounding environment. Looking at a stopping location, [Speedway, 📍],¹ from a person's trajectory, we might assume that this person needs to use the location's amenities if Speedway implies a gas station and 📍 is near a highway exit. We might predict a meetup at [Speedway, 📍] if the trajectory travels through many other locations, 📍, 📍, ..., of the same name, Speedway, to arrive at [Speedway, 📍] in the middle of farmlands. As humans, we are able to make such inferences using the name of a geographic entity (geo-entity) and other entities in a spatial neighborhood. Specifically, we contextualize a geo-entity by a reasonable surrounding neighborhood learned from experience and, from the neighborhood, relate other relevant geo-entities based on their names and spatial relations (e.g., distance) to the geo-entity. This way, even if two gas stations have the same name (e.g., Speedway) and entity type (e.g., 'gas station'), we can still reason about their spatially varying semantics and use the semantics for prediction.

¹A geographic entity name, Speedway, and its location 📍 (e.g., latitude and longitude). Best viewed in color.
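The neighborhood-based contextualization described above can be sketched in a few lines. This is a minimal illustration under our own assumptions (simple (name, x, y) tuples and Euclidean distance; the function name is ours), not the paper's implementation, which operates on geodesic coordinates and subword tokens.

```python
import math

def linearize(center, neighbors):
    """Order a geo-entity's neighbors by distance to form a pseudo-sentence.

    `center` and each neighbor are (name, x, y) tuples. Sorting neighbors
    near-to-far linearizes the 2-D spatial context into a sequence that a
    BERT-style model can consume.
    """
    _, cx, cy = center
    ranked = sorted(neighbors, key=lambda n: math.hypot(n[1] - cx, n[2] - cy))
    # The pivot entity comes first, followed by neighbors near-to-far.
    return [center[0]] + [name for name, _, _ in ranked]

tokens = linearize(("Speedway", 0.0, 0.0),
                   [("Farmland", 5.0, 0.0), ("Highway Exit", 1.0, 1.0)])
# -> ['Speedway', 'Highway Exit', 'Farmland']
```

With such a sequence, two identically named gas stations end up with different contexts (one flanked by a highway exit, the other by farmland), which is exactly the spatially varying semantics the model aims to capture.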
Capturing these spatially varying location semantics can help recognize and resolve geospatial concepts (e.g., toponym detection, typing, and linking) and ground geo-entities in documents, scanned historical maps, and a variety of knowledge bases, such as Wikidata, OpenStreetMap, and GeoNames. The location semantics can also support effective use of spatial textual information (geo-entity names) in many spatial computing tasks, including moving-behavior detection from visited locations in trajectories (Yue et al., 2021, 2019), point-of-interest recommendation (Yin et al., 2017; Zhao et al., 2022), and air quality (Lin et al., 2017, 2018, 2020; Jiang et al., 2019) and traffic prediction (Yuan and Li, 2021; Gao et al., 2019) using location context.
Recently, the research community has seen rapid advancement in pretrained language models (PLMs) (Devlin et al., 2019; Liu et al., 2019; Lewis et al., 2020; Sanh et al., 2019), which support strong contextualized language representation abilities (Lan et al., 2020) and serve as the backbones of various NLP systems (Rothe et al., 2020; Yang et al., 2019). Extensions of these PLMs help NLP tasks in different data domains (e.g., biomedicine (Lee et al., 2020; Phan et al., 2021), software engineering (Tabassum et al., 2020), fi-
arXiv:2210.12213v1 [cs.CL] 21 Oct 2022