DAGKT: Difficulty and Attempts Boosted Graph-based Knowledge Tracing
are not considered, models cannot discriminate between questions with the same
KCs, or between answers with different numbers of attempts.
Furthermore, the question embedding is produced by a GNN that aggregates the
information of neighboring nodes in the question-KC graph, so the structure of
this graph is critical. Most existing graph-based KT models perform
convolution on bipartite graphs that contain no question-question relationships
(shown in III of Figure 1). Gao et al. [5] hold the view that there
are two kinds of relationships between questions: prerequisite relationships and
similarity relationships. In GNN-based KT, few studies incorporate the
relationships between questions into the convolution process (most use
question similarities only in the prediction process, such as [18]). Tong et al. [16]
designed a method of constructing prior support relationships between questions
from students’ answer results, illustrating the effectiveness of constructing
relations from such results. However, most existing studies construct
the question similarity relationship from question text information or from
problem embedding distances, without using students’ answer results. There is
still a need for a method that uses students’ answer results to build similarity
relationships.
To address these two problems, we propose the DAGKT model. First,
to solve the first problem, we design a fusion module that fuses two types of
information: difficulty and attempts. We obtain the difficulties of the
questions and the students’ numbers of attempts from the datasets and encode
them into embeddings through an encoder. We then feed them, together with the
question embeddings and answer embeddings, into the fusion module to obtain
exercise embeddings that carry richer information. Second, to address the
second problem and obtain a good question embedding, we design a
relationship-building module that enriches the question-KC graph so that the
GCN can generate question embeddings that incorporate the relationships
between questions. We use statistical
information combined with the calculation method of the F1 score to compute
similarity relationships between questions. We assume that two questions
are likely to be closely related when students consistently obtain similar
answering results (correct/incorrect) on both. The F1 score is an indicator
used in statistics to measure the accuracy of binary classification models. In
other words, the F1 score reflects the degree of similarity between predicted
and target values [10]. Therefore, the similarity of questions in this study is
calculated according to the F1 score.
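The idea of scoring question similarity with the F1 formula can be illustrated with a minimal sketch. It is not the exact procedure of DAGKT, whose formula is not given in this passage. It assumes that one question's results play the role of "target" values and the other's the role of "predicted" values, and that only students who answered both questions are counted. The names f1_similarity and similar_pairs, the data layout, and the 0.7 threshold are hypothetical choices made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical layout: responses[question_id][student_id] = 1 (correct) or 0 (incorrect).
Responses = Dict[str, Dict[str, int]]


def f1_similarity(a: Dict[str, int], b: Dict[str, int]) -> Optional[float]:
    """F1 score between two questions' answer results.

    Results on question `a` act as targets and results on `b` as predictions,
    restricted to students who answered both (an assumption of this sketch).
    Returns None when the score is undefined.
    """
    shared = a.keys() & b.keys()
    tp = sum(1 for s in shared if a[s] == 1 and b[s] == 1)
    fp = sum(1 for s in shared if a[s] == 0 and b[s] == 1)
    fn = sum(1 for s in shared if a[s] == 1 and b[s] == 0)
    denom = 2 * tp + fp + fn
    if denom == 0:
        # No shared students, or nobody answered either question correctly.
        return None
    return 2 * tp / denom


def similar_pairs(
    responses: Responses, threshold: float = 0.7
) -> List[Tuple[str, str, float]]:
    """Question pairs whose F1 similarity reaches a (hypothetical) threshold."""
    questions = sorted(responses)
    edges = []
    for i, q1 in enumerate(questions):
        for q2 in questions[i + 1:]:
            score = f1_similarity(responses[q1], responses[q2])
            if score is not None and score >= threshold:
                edges.append((q1, q2, score))
    return edges


# Toy data, invented for illustration only.
responses: Responses = {
    "q1": {"s1": 1, "s2": 1, "s3": 0, "s4": 1},
    "q2": {"s1": 1, "s2": 0, "s3": 0, "s4": 1, "s5": 1},
    "q3": {"s1": 0, "s2": 0, "s3": 1, "s4": 0},
}
edges = similar_pairs(responses)
```

In this toy data only q1 and q2 are linked, since most shared students answer both correctly. The F1 formula is symmetric in the two questions, because swapping them only exchanges false positives and false negatives. It also ignores cases where a student answers both questions incorrectly, so a real implementation would need to decide how to treat such agreement.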
Finally, extensive experiments on real-world datasets demonstrate the
effectiveness of DAGKT and of each module. In summary, our main contributions
are as follows:
– To address the problem that most graph-based KT models cannot clearly
discriminate between questions with the same KCs, or between answers with
different numbers of attempts, DAGKT is proposed with a fusion module. In this
module, the question and answer embeddings are fused with difficulty and attempts.
– Furthermore, the relationship-building module is designed to construct the
similarity relationship between questions, inspired by the F1 score. The con-