
based on local properties. This makes the task more
challenging and hampers the possibility of taking a
completely supervised approach.
We propose an unsupervised alternative for seg-
mentation, based on two assumptions: (1) segment
boundaries correspond to places with low mutual
information between sentences over the boundary;
(2) neural language models can serve as reliable
sentence probability estimators. Based on these
assumptions, we propose a simple approach to seg-
mentation and offer extensions involving dynamic
programming. The proposed models outperform
existing methods by a substantial margin in terms
of segmentation performance. In order to adapt
the model to jointly segment and classify, we in-
corporate into the model a supervised topic clas-
sifier, trained over manually indexed one-minute
testimony segments, provided by the USC Shoah
Foundation (SF, https://sfi.usc.edu/).
Inspired by Misra et al. (2011), we also incorporate
topical coherence, as estimated by the topic
classifier, into the segmentation model.
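As a minimal illustration of assumptions (1) and (2), a boundary score can be approximated as the pointwise mutual information between a sentence and its predecessor, estimated from language-model probabilities. The sketch below uses toy log-probabilities in place of a real language model; the function names and the fixed number of boundaries k are illustrative choices, not the paper's exact procedure:

```python
def pmi_boundary_scores(logp_cond, logp_marg):
    """PMI(s_i; s_{i-1}) ~= log p(s_i | s_{i-1}) - log p(s_i).

    Low PMI means the sentence gains little from conditioning on
    its predecessor, i.e., a likely segment boundary before it.
    """
    return [c - m for c, m in zip(logp_cond, logp_marg)]

def pick_boundaries(scores, k):
    """Place k boundaries at the positions with the lowest PMI."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(order[:k])

# Toy log-probabilities for sentences s_1..s_4 (s_0 opens the text):
# conditional log p(s_i | s_{i-1}) and marginal log p(s_i).
logp_cond = [-20.0, -35.0, -18.0, -22.0]
logp_marg = [-30.0, -36.0, -29.0, -31.0]

scores = pmi_boundary_scores(logp_cond, logp_marg)
print(pick_boundaries(scores, 1))  # prints [1]
```

In this toy example, sentence s_2 gains almost nothing from conditioning on s_1 (PMI of 1.0 versus 9.0 or more elsewhere), so the single boundary is placed before it.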
Our contributions are the following: (1) we
present the task of topical segmentation for run-
ning, unedited text; (2) we propose novel algorith-
mic methods for tackling the task without any man-
ual segmentation supervision, building on recent
advances in language modeling; (3) compared with
previous work, we find substantial improvements
over existing methods; (4) we compile a test set for
evaluation in the case of Holocaust testimonies; (5)
we develop domain-specific topical classifiers to
extract lists of topics for long texts.
Typically, narrative research faces a tradeoff be-
tween the number of narrative texts, which is im-
portant for computational methods, and the speci-
ficity of the narrative context, which is essential for
qualitative narrative research (Sultana et al., 2022).
Holocaust testimonies provide a unique case of a
large corpus with a specific context. Our work also
communicates with Holocaust research, seeking
methods to better access testimonies as the survivor
generation is slowly passing away (Artstein et al.,
2016). We expect our methods to promote schema-
based analysis and browsing of testimonies, en-
abling better access and understanding.
2 Previous work
Text Segmentation.
Considerable previous work
addressed the task of text segmentation, using both
supervised and unsupervised approaches. Proposed
methods for unsupervised text segmentation can
be divided into linear segmentation algorithms and
dynamic graph-based segmentation algorithms.
Linear segmentation, i.e., segmentation that is
performed on the fly, dates back to the TextTiling
algorithm (Hearst, 1997), which detects boundaries
using window-based vocabulary changes. Recently,
He et al. (2020) proposed an improvement to the
algorithm, which, unlike TextTiling, uses the vo-
cabulary of the entire dataset and not only of the
currently considered segment. TopicTiling (Riedl
and Biemann, 2012) takes a similar approach, using
LDA-based topical coherence instead of vocabu-
lary alone. This method produces topics as well as
segments. Another linear model, BATS (Wu et al.,
2020), uses combined spectral and agglomerative
clustering for topics and segments.
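The window-based idea behind TextTiling can be sketched as follows: score each gap between sentences by the cosine similarity of term-count vectors on either side, and place boundaries where similarity is lowest. This is a simplified sketch of the lexical-cohesion principle, not Hearst's full algorithm (which adds smoothing and depth scoring); the function names are our own:

```python
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def gap_similarities(sentences, w=2):
    """Similarity between the w sentences before and after each gap.

    A low value at gap i suggests a boundary after sentence i.
    """
    toks = [s.lower().split() for s in sentences]
    sims = []
    for i in range(1, len(toks)):
        left = Counter(sum(toks[max(0, i - w):i], []))
        right = Counter(sum(toks[i:i + w], []))
        sims.append(cosine(left, right))
    return sims

sents = ["the cat sat", "the cat slept",
         "stocks fell today", "markets fell again"]
sims = gap_similarities(sents)
print(min(range(len(sims)), key=lambda i: sims[i]))  # prints 1
```

Here the gap between the cat sentences and the finance sentences has zero lexical overlap within the window, so it is selected as the boundary.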
In contrast to the linear approach, several mod-
els follow a Bayesian sequence modeling approach,
using dynamic programming for inference. This
approach allows making a global prediction of the
segmentation, at the expense of higher complex-
ity. Implementation details vary, and include using
pretrained LDA models (Misra et al., 2011), online
topic estimation (Eisenstein and Barzilay, 2008;
Mota et al., 2019), shared topics (Jeong and Titov,
2010), ordering-based topics (Du et al., 2015), and
context-aware LDA (Li et al., 2020b).
Following recent advances, neural models have
also been used for the task of supervised text seg-
mentation. Pethe et al. (2020) introduced Chapter-
Captor, which relies on two methods.
The first method performs chapter break prediction
based on Next Sentence Prediction (NSP) scores.
The second method uses dynamic programming to
regularize the segment lengths towards the aver-
age. The method uses supervision to finetune the
boundary-scoring model, but can also be applied
in a completely unsupervised fashion. They experi-
ment with segmenting books into chapters, which
offers natural incidental supervision.
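The length-regularization step can be illustrated with a small dynamic program: given a cost for placing a break at each gap, choose boundaries that trade off break cost against deviation from the average segment length. This is a hypothetical simplification of the dynamic programming described above; the quadratic penalty and the `alpha` weight are illustrative choices:

```python
def segment_dp(break_cost, n_sents, k, alpha=1.0):
    """Split n_sents sentences into k segments, minimizing the sum
    of break costs plus a quadratic penalty on deviation from the
    average segment length. break_cost[i] is the cost of a boundary
    after sentence i+1; the boundary at the text end is free.
    """
    avg = n_sents / k
    INF = float("inf")
    # dp[j][i]: best cost of covering the first i sentences with j segments
    dp = [[INF] * (n_sents + 1) for _ in range(k + 1)]
    back = [[None] * (n_sents + 1) for _ in range(k + 1)]
    dp[0][0] = 0.0
    for j in range(1, k + 1):
        for i in range(1, n_sents + 1):
            for p in range(i):  # previous boundary after sentence p
                if dp[j - 1][p] == INF:
                    continue
                cost = dp[j - 1][p] + alpha * (i - p - avg) ** 2
                if i < n_sents:  # internal boundary after sentence i
                    cost += break_cost[i - 1]
                if cost < dp[j][i]:
                    dp[j][i] = cost
                    back[j][i] = p
    # Recover internal boundary positions by backtracking.
    bounds, i = [], n_sents
    for j in range(k, 0, -1):
        i = back[j][i]
        if i:
            bounds.append(i)
    return sorted(bounds)

# Six sentences, two segments; the cheap break sits after sentence 3,
# which also matches the average length exactly.
print(segment_dp([5.0, 5.0, 0.0, 5.0, 5.0], 6, 2))  # prints [3]
```

With k = 2 and an average length of 3, the boundary after sentence 3 incurs zero length penalty and zero break cost, so the dynamic program selects it over the expensive alternatives.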
Another approach performs the segmentation
task in a completely supervised manner, similar to
supervised labeled span extraction tasks. At first,
the models were LSTM-based (Koshorek et al.,
2018;Arnold et al.,2019), and later on, Trans-
former based (Somasundaran et al.,2020;Lukasik
et al.,2020). Unlike finetuning, this approach re-
quires a large amount of segmented data.
All of these works were designed and evaluated
on structured written text, such as book chapters,