embeddings have been observed to surpass the performance of bag-of-words representations on sufficiently large data sets (Rudkowsky et al., 2018), because bag-of-words models often suffer from problems such as disregarding the grammatical structure of the text, large vocabulary dimensions, and sparse representations (Le and Mikolov, 2014; El-Din, 2016). Word embeddings can be used to tackle these challenges. Since word embeddings capture the similarities among the sentiments ingrained in words and represent them in a vector space, they tend to increase the accuracy of classification models (Goldberg, 2016).
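As a minimal illustration of similarity in the embedding vector space (with made-up, low-dimensional vectors standing in for learned embeddings), semantically related words have a high cosine similarity, whereas one-hot bag-of-words vectors of distinct words are always orthogonal:

```python
import numpy as np

# Hypothetical 4-dimensional embeddings for illustration only;
# real models (e.g. Word2vec, GloVe) learn 100-300 dimensional
# vectors from large corpora.
embeddings = {
    "good":  np.array([0.8, 0.1, 0.3, 0.2]),
    "great": np.array([0.7, 0.2, 0.4, 0.1]),
    "awful": np.array([-0.6, 0.9, -0.2, 0.5]),
}

def cosine(u, v):
    """Cosine similarity: high for words used in similar contexts."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(embeddings["good"], embeddings["great"]))  # close to 1
print(cosine(embeddings["good"], embeddings["awful"]))  # negative
```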
However, one of the major weaknesses of word embedding models is that they fail to capture syntax and polysemy, i.e., the presence of multiple possible meanings for a given word or phrase (Mu et al., 2016). To overcome these obstacles, and also to achieve finer granularity in the embedding, sentence embeddings are used. The idea is to test common Euclidean-space word embedding techniques, such as fastText (Bojanowski et al., 2017; Joulin et al., 2016), Word2vec (Mikolov et al., 2013), and GloVe (Pennington et al., 2014), in combination with sentence embedding techniques. Pooling methods (i.e., max, min, and average pooling; sketched below) will be considered the baseline methods for this test. More advanced models, such as the sequence-to-sequence (seq2seq) model (Sutskever et al., 2014) and the modified version of the seq2seq model introduced by Cho et al. (2014), with GRU (Chung et al., 2014) and LSTM (Hochreiter and Schmidhuber, 1997) recurrent neural network units, will be tested against the pooling baselines. Furthermore, the addition of an attention mechanism (Vaswani et al., 2017) to the seq2seq model will also be tested.
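A minimal sketch of the pooling baselines follows; the toy vectors stand in for the outputs of a pre-trained word embedding such as fastText, Word2vec, or GloVe:

```python
import numpy as np

def sentence_embedding(word_vectors, mode="avg"):
    """Baseline sentence embedding: element-wise pooling over the
    word vectors of one sentence (shape: [n_words, dim])."""
    word_vectors = np.asarray(word_vectors)
    if mode == "max":
        return word_vectors.max(axis=0)
    if mode == "min":
        return word_vectors.min(axis=0)
    if mode == "avg":
        return word_vectors.mean(axis=0)
    raise ValueError(f"unknown pooling mode: {mode}")

# Toy example: a three-word sentence with 4-dimensional vectors.
words = [[0.1, 0.5, -0.2, 0.3],
         [0.4, -0.1, 0.0, 0.2],
         [-0.3, 0.2, 0.6, -0.1]]
for mode in ("max", "min", "avg"):
    print(mode, sentence_embedding(words, mode))
```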
Most models built on word and sentence embeddings are based on Euclidean space. Though this vector space is commonly used, it poses significant limitations when representing complex structures (Nickel and Kiela, 2017). The hyperbolic space provides a plausible solution in such cases. Hyperbolic space is a negatively curved, non-Euclidean space; it is advantageous for embedding trees because the circumference of a circle grows exponentially with its radius, just as the number of nodes in a tree grows exponentially with its depth. The use of hyperbolic embeddings is still a novel research area, introduced only recently through the work of Nickel and Kiela (2017), Chamberlain et al. (2017), and Sala et al. (2018). The work of Lu et al. (2019, 2020) highlights the importance of using hyperbolic space to improve the quality of embeddings in a practical context within the medical domain. However, research on the applicability of hyperbolic embeddings in other arenas is highly limited, so the full potential of hyperbolic space is yet to be uncovered.
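For intuition, a standard fact of hyperbolic geometry: in the hyperbolic plane of constant curvature $-1$, a circle of radius $r$ has circumference
\[
C_{\mathrm{hyp}}(r) = 2\pi \sinh r = \pi\left(e^{r} - e^{-r}\right) \approx \pi e^{r} \quad \text{for large } r,
\]
whereas the Euclidean circumference $C_{\mathrm{euc}}(r) = 2\pi r$ grows only linearly. This exponential growth in available space is what allows trees, whose node counts grow exponentially with depth, to be embedded with low distortion.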
In this paper, we test the effectiveness of a set of two-tiered representation models that combine various word embeddings as the lower tier with sentence embeddings as the upper tier.
2 Related Work
The sequence-to-sequence model introduced by Sutskever et al. (2014) is vital to this research, as it is one of the core models for developing sentence embeddings. Though originally developed for translation purposes, the model has undergone multiple modifications depending on the context, such as description generation for images (Karpathy and Fei-Fei, 2015), phrase representation (Cho et al., 2014), attention models (Vaswani et al., 2017), and BERT (Devlin et al., 2018), thus proving the potential it holds in machine learning.
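For concreteness, a minimal sketch (in PyTorch, with hypothetical dimensions; not the exact architecture of any cited paper) of how a seq2seq-style encoder yields a fixed-length sentence embedding from its final hidden state:

```python
import torch
import torch.nn as nn

class Seq2SeqEncoder(nn.Module):
    """Minimal GRU encoder; the final hidden state serves as a
    fixed-length sentence embedding, as in encoder-decoder models."""
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)

    def forward(self, token_ids):
        # token_ids: [batch, seq_len] integer word indices
        _, h_n = self.gru(self.embed(token_ids))
        return h_n.squeeze(0)  # [batch, hidden_dim] sentence embedding

encoder = Seq2SeqEncoder(vocab_size=10_000)
sentence = torch.randint(0, 10_000, (1, 7))  # one 7-token sentence
print(encoder(sentence).shape)  # torch.Size([1, 128])
```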
The work of Nickel and Kiela (2017) introduces and explores the potential of hyperbolic embeddings using an n-dimensional Poincaré ball. It compares hyperbolic and Euclidean embeddings of a complex latent data structure and concludes that hyperbolic embeddings surpass their Euclidean counterparts in effectiveness. Inspired by these results, both Leimeister and Wilson (2018) and Dhingra et al. (2018) have extended the methodology introduced by Nickel and Kiela (2017). Leimeister and Wilson (2018) developed a hyperbolic word embedding using the skip-gram negative-sampling architecture from Word2vec; in lower embedding dimensions, their model performs better than its Euclidean counterpart. The work of Dhingra et al. (2018) uses re-parameterization to extend the Poincaré embedding in order to learn embeddings of arbitrarily parameterized objects, and the resulting framework is used to develop word and sentence embeddings. In our research, we follow in the footsteps of the above papers.
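For reference, the geodesic distance on the Poincaré ball used by Nickel and Kiela (2017) can be sketched in a few lines (points must have norm strictly below 1):

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance in the Poincaré ball (Nickel and Kiela, 2017):
    d(u, v) = arcosh(1 + 2*||u - v||^2 / ((1 - ||u||^2)(1 - ||v||^2)))."""
    u, v = np.asarray(u), np.asarray(v)
    sq_dist = np.sum((u - v) ** 2)
    denom = (1 - np.sum(u ** 2)) * (1 - np.sum(v ** 2))
    return float(np.arccosh(1 + 2 * sq_dist / max(denom, eps)))

# Points near the boundary of the unit ball are far apart even when
# their Euclidean distance is small, leaving exponentially growing
# room for tree-like hierarchies.
print(poincare_distance([0.0, 0.0], [0.5, 0.0]))  # ~1.10
print(poincare_distance([0.9, 0.0], [0.0, 0.9]))  # ~5.20
```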