
Applying FrameNet to Chinese (Poetry)
Zirong Chen
Department of Computer Science
Georgetown University
Washington, DC, USA
zc157@georgetown.edu
Abstract
FrameNet (Fillmore and Baker [2009]) is
widely used for knowledge representation
in the form of inheritance-based
ontologies and lexica (Trott et al.
[2020]). Although
FrameNet is usually applied to languages
such as English, Spanish and Italian,
FrameNet data sets are also available for
other languages such as Chinese, which
differs significantly from languages
written in the Latin alphabet. In this
paper, ancient Chinese poetry is first
translated into modern Chinese so that the
Chinese FrameNet (CFN, provided by
Shanxi University) can be applied. The
modern Chinese text is then translated
into English so that the applications of
CFN and English FrameNet can be
compared. Finally, an overall comparison
is drawn between CFN applied to modern
Chinese and English FrameNet.
1. Introduction
The detailed definition of FrameNet will not be
discussed in this paper.
CFN is a lexical database comprising frames,
lexical units and annotated sentences. It is
based on the theory of frame semantics, draws
on the Berkeley FrameNet project (Fillmore and
Baker [2009]) for English, and is supported by
evidence from a large Chinese corpus. CFN
currently contains 323 semantic frames, 3,947
lexical units, and more than 18,000 sentences
annotated with syntactic and frame-semantic
information, covering the common core of the
language and more specialized fields such as
travel, online book sales and law, as well as 200
annotated discourses. In addition to constructing
the CFN database, the CFN developers also study
frame semantic theory as it relates to Chinese
and the technology for building CFN-based
applications. They have developed a frame
semantic role labeling system for single
sentences and discourse.
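To make the three components named above (frames, lexical units and annotated sentences) concrete, the following Python sketch models them as simple data classes. It is only an illustration under my own assumptions: the class names, field names and the example frame are hypothetical and are not taken from the CFN release or its file format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LexicalUnit:
    lemma: str  # the word itself, e.g. a Chinese verb
    pos: str    # part-of-speech tag (tag set is an assumption)


@dataclass
class AnnotatedSentence:
    text: str
    target: str            # the frame-evoking word in the sentence
    roles: Dict[str, str]  # frame element name -> text span filling it


@dataclass
class Frame:
    name: str
    elements: List[str]
    lexical_units: List[LexicalUnit] = field(default_factory=list)
    sentences: List[AnnotatedSentence] = field(default_factory=list)

    def add_sentence(self, sentence: AnnotatedSentence) -> None:
        """Attach a sentence after checking its roles and target word."""
        unknown = set(sentence.roles) - set(self.elements)
        if unknown:
            raise ValueError(f"unknown frame elements: {sorted(unknown)}")
        if sentence.target not in sentence.text:
            raise ValueError("target word does not occur in the sentence")
        self.sentences.append(sentence)


# Hypothetical example; frame and element names are illustrative only.
# The sentence means "He arrived in Beijing".
arriving = Frame(name="Arriving", elements=["Theme", "Goal"])
arriving.lexical_units.append(LexicalUnit(lemma="到", pos="v"))
arriving.add_sentence(
    AnnotatedSentence(
        text="他到了北京",
        target="到",
        roles={"Theme": "他", "Goal": "北京"},
    )
)
print(len(arriving.sentences))
```

Keeping the role check inside `add_sentence` means an annotation that uses an element the frame does not define is rejected when it is added, rather than surfacing later during analysis.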
Tangshi (the Shi form of Chinese verse in the
Tang Dynasty, 618 to 907 AD), Songci (the Ci
form of Chinese verse in the Song Dynasty,
960 to 1279 AD) and Yuanqu (the Qu form of
Chinese verse in the Yuan Dynasty, 1271 to
1368 AD) are widely considered the crowning
achievements of Chinese literature. Among them,
Songci is chosen for analysis in this paper, for
the reasons given below.
Generally, there are two genres of Tangshi:
seven-feet rhyme and five-feet rhyme. The
seven-feet rhyme has 56 characters in total, in
8 lines of 7 characters each. Each character in
Chinese is exactly 1 syllable, so each line is 7
syllables long, hence the name seven-feet rhyme.
The five-feet rhyme works in a similar fashion,
with 5 characters per line. Grammatical
well-formedness, especially by modern standards,
was often sacrificed so that the meter and length
of the poem could fit this strict form, so
Tangshi is not well suited to the application of
CFN. Songci and Yuanqu, by contrast, were
created as lyrics meant to fit common tunes
(there were more than 1,000 types of tunes) and
do not share the same constraints as Tangshi,
which makes them better candidates for CFN. Of
the two, I prefer Songci for historical reasons,
so Songci is the form to which CFN is applied in
the rest of this paper.
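The fixed shape of Tangshi described above (8 lines of 7 or 5 characters each) can be checked mechanically. The Python sketch below is a minimal illustration, not part of the method of this paper: the function names are my own, the punctuation set used for splitting is an assumption, and the sample text is Du Fu's "Deng Gao" (登高), chosen only as a well-known seven-character example.

```python
import re

# Du Fu, "Deng Gao" (登高): 8 lines of 7 characters each.
DENG_GAO = (
    "风急天高猿啸哀，渚清沙白鸟飞回。"
    "无边落木萧萧下，不尽长江滚滚来。"
    "万里悲秋常作客，百年多病独登台。"
    "艰难苦恨繁霜鬓，潦倒新停浊酒杯。"
)


def split_lines(poem: str) -> list:
    """Split a poem into lines on Chinese punctuation and whitespace."""
    parts = re.split(r"[，。？！；、\s]+", poem)
    return [p for p in parts if p]


def matches_form(poem: str, chars_per_line: int, line_count: int = 8) -> bool:
    """Return True if the poem has exactly line_count lines of equal length."""
    lines = split_lines(poem)
    return len(lines) == line_count and all(
        len(line) == chars_per_line for line in lines
    )


print(matches_form(DENG_GAO, 7))  # seven-feet rhyme
print(matches_form(DENG_GAO, 5))  # five-feet rhyme
print(sum(len(line) for line in split_lines(DENG_GAO)))  # total characters
```

Because each Chinese character is one syllable, counting characters per line is the same as counting syllables here. Songci lines vary in length with the tune, so a check of this kind would not apply to them.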
2. Related Work
Previous work based on the FrameNet (Fillmore
and Baker [2009]) project, such as the online
annotation tool provided by the Brazil Research
Lab, has made it more convenient than ever to
do research on the different applications of
FrameNet.
In addition, the Chinese FrameNet (CFN) is a
currently active research topic at the Semantic
Computing & Chinese FrameNet Research
Center, Shanxi University, China. CFN currently
contains 1,152 semantic frames, 12,153 lexical
units (LUs), and more than 18,000 sentences. After