
first count the number of relational patterns associated with it; then this value is normalised by the total number of patterns in PATTY. The penalty function W is defined as follows:

\[
W = 1 - \begin{pmatrix}
\frac{count(P_{r,1})}{count(P_{all})} \\
\vdots \\
\frac{count(P_{r,n})}{count(P_{all})}
\end{pmatrix}
\]

P_{r,1}, ..., P_{r,n} are the numbers of patterns for each relation, and P_{all} is the total number of relational patterns in PATTY. This step changes the ranking of the relations retrieved in Step 6.2.1. Therefore, potentialRels′(pattern(Q′), G′) is now turned into the ranked relations RankedRel′(pattern(Q′), G′), which is the output of this pipeline step. In our example, the ranked list of relevant relations is updated from (dbo:parent, dbo:spouse, dbo:deathPlace, dbo:predecessor, dbo:birthPlace, dbo:relation) to (dbo:parent, dbo:spouse, dbo:birthPlace, dbo:deathPlace, dbo:predecessor, dbo:relation), i.e., the DBpedia predicate dbo:birthPlace is ranked at a higher position.
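For intuition, a minimal sketch of this penalty step in Python follows. How W is combined with the similarity scores of the previous step is an assumption on our part (the text only states that W changes the ranking); all function and variable names are illustrative.

```python
from typing import Dict, List, Tuple

def penalty_weight(patterns_for_relation: int, total_patterns: int) -> float:
    """W = 1 - count(P_r)/count(P_all): relations backed by many (often
    generic) PATTY patterns receive a smaller weight, i.e. a larger penalty."""
    return 1.0 - patterns_for_relation / float(total_patterns)

def apply_penalty(scored_relations: List[Tuple[str, float]],
                  pattern_counts: Dict[str, int],
                  total_patterns: int) -> List[Tuple[str, float]]:
    # Multiplying the cosine-similarity score by the penalty weight is an
    # assumed combination; the text specifies only the penalty function itself.
    rescored = [(rel, sim * penalty_weight(pattern_counts.get(rel, 0),
                                           total_patterns))
                for rel, sim in scored_relations]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)
```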

Extending the set of relevant natural language relations for the input question Often, an irrelevant pattern appearing in a question achieves a high match count when calculating cosine similarities in the previous step. For example, the word ‘where’ appears 1,498 times in PATTY, which negatively impacts the overall results. To overcome this problem, we extract NL relations from the input question. In DBpedia, the associated predicate is very likely to have a name similar to the NL predicate. For example, the NL relation ‘was born’ is associated with dbo:birthPlace, the relation ‘president of’ is associated with dbo:President, the relation ‘wife of’ is associated with dbo:spouse in the ranked list of DBpedia properties, and so on. Therefore, we extract Predicate(Pr) from the question Q; furthermore, we expand this list with synonyms from WordNet. We then create a vector representation of each of the relations in extendedPredicate(Pr′) using the GloVe model. In our running example, the relation ‘was born’ is expanded to the list (born, birth, bear, deliver); it is then converted into its vector representation.
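A minimal sketch of this expansion step, assuming the NLTK WordNet interface for synonyms and a plain-text GloVe file (one word followed by its vector components per line) for the vector representations. The predicate extraction itself (done with TextRazor in our implementation) is not covered here, and file paths and function names are illustrative.

```python
import numpy as np
from nltk.corpus import wordnet  # requires a one-time nltk.download('wordnet')

def expand_predicate(nl_relation: str) -> set:
    """Expand an NL relation such as 'was born' with WordNet synonyms,
    yielding a term set like {'born', 'birth', 'bear', 'deliver', ...}."""
    terms = set(nl_relation.lower().split())
    for word in list(terms):
        for synset in wordnet.synsets(word):
            for lemma in synset.lemmas():
                terms.add(lemma.name().replace('_', ' ').lower())
    return terms

def load_glove(path: str = 'glove.6B.300d.txt') -> dict:
    """Parse a whitespace-separated GloVe file: word dim_1 ... dim_n."""
    vectors = {}
    with open(path, encoding='utf-8') as handle:
        for line in handle:
            parts = line.rstrip().split(' ')
            vectors[parts[0]] = np.asarray(parts[1:], dtype='float32')
    return vectors
```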

Re-ranking the relevant relations In the last step of the pipeline, we take the outputs of the second and third steps, which correspond to the vector representations of the ranked potential relations (RankedRel′(pattern(Q′), G′)) and the extended predicate patterns (extendedPredicate(Pr′)). We again calculate cosine similarities between them to re-rank the list of relations obtained in RankedRel′(pattern(Q′), G′).

In our example, the extended question predicate list from the third step is (born, birth, bear, deliver) and the ranked list of potential relations from the second step of the pipeline is (dbo:parent, dbo:spouse, dbo:birthPlace, dbo:predecessor, dbo:relation, dbo:deathPlace).

After this step, the relation dbo:birthPlace has the highest similarity with ‘birth’, changing its position in the ranked list of relations. Therefore, our final re-ranked list of relations associated with the pattern ‘was born’ is the following: (dbo:birthPlace, dbo:parent, dbo:spouse, dbo:deathPlace, dbo:predecessor, dbo:relation). The DBpedia predicate dbo:birthPlace is ranked top-1.
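The re-ranking step can be sketched as follows. Splitting a predicate’s local name on camelCase to obtain tokens comparable against the GloVe vocabulary is our assumption about how the DBpedia labels are matched; the helper names are illustrative.

```python
import re
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def relation_tokens(relation: str) -> list:
    """'dbo:birthPlace' -> ['birth', 'place'] via a camelCase split."""
    local_name = relation.split(':')[-1]
    return [t.lower() for t in re.findall(r'[A-Za-z][a-z]*', local_name)]

def rerank(candidates: list, expanded_terms: set, glove: dict) -> list:
    """Order candidate relations by the best cosine similarity between any
    of their label tokens and any expanded question predicate term."""
    def score(relation: str) -> float:
        sims = [cosine(glove[t], glove[w])
                for t in relation_tokens(relation) if t in glove
                for w in expanded_terms if w in glove]
        return max(sims, default=-1.0)
    return sorted(candidates, key=score, reverse=True)
```

With the candidates (dbo:parent, dbo:spouse, dbo:birthPlace, ...) and the term set {born, birth, bear, deliver}, dbo:birthPlace wins through its near-identical ‘birth’ token, matching the re-ranked list above.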

6.3 Experimental Study

In this section, we assess the following research questions: RQi1) What is the impact of using an SIBKB on a relation linking task? RQi2) What is the impact of using an SIBKB on the relation linking execution time? RQi3) What is the impact of an SIBKB on the size of a collection of semantically-typed relational patterns?

The experimental configuration is as follows:

Relation Linking Benchmark In Section 5.2, we created a benchmark for the entity linking task based on the QALD (Question Answering over Linked Data) benchmark used for evaluating complete QA systems.

We devised a similar approach for the relation linking benchmark using the QALD-7 training set³ that contains 215 questions.

Metrics:
i) Execution Time: Elapsed time between the submission of a question to an engine and the delivery of the relevant DBpedia relations. The timeout is set to 300 seconds.
ii) Inv.Time: Calculated as 1 − (average execution time for BaseLine / average execution time for SIBKB).
iii) In-Memory Size: The total size of the PATTY knowledge base and the size of its corresponding SIBKB.
iv) Inv.Memory: Calculated as 1 − (Memory Size of PATTY / Memory Size of SIBKB).
v) Global Precision: The number of questions for which a correct relation is retrieved at the first rank of the retrieved list, out of the total number of questions.
vi) Global Recall: The number of questions answered at any position (in our case, up to the 5th position of a relation's occurrence in the retrieved list), out of the total number of questions.
vii) F-Score: The harmonic mean of global precision and global recall.
viii) Precision @ K: The cumulative precision at position K.
ix) Recall @ K: The number of correct relations for questions recommended within the top K positions, out of the total number of questions.
x) F-Score @ K: The harmonic mean of precision and recall at position K.
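Under one reading of these definitions (a single gold-standard relation per question), the rank-based metrics reduce to a minimal sketch like the following; the function names are ours.

```python
from typing import List, Sequence

def hits_at_k(ranked_lists: Sequence[List[str]],
              gold: Sequence[str], k: int) -> float:
    """Fraction of questions whose gold relation occurs within the top k
    retrieved relations; k=1 yields the global precision, k=5 the global recall."""
    hits = sum(1 for ranked, g in zip(ranked_lists, gold) if g in ranked[:k])
    return hits / len(gold)

def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```

As a sanity check against Table 6.2 below: 2 · 0.17 · 0.37 / (0.17 + 0.37) ≈ 23%, the reported Baseline F-score.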

Implementation: The pipeline for relation linking has been implemented in Python 2.7.12. Experiments were executed on a laptop with a quad-core 1.50 GHz Intel i7-4550U processor and 8 GB RAM, running Fedora Linux 25. The word-to-vector conversion was done using GloVe [104]. Furthermore, for extracting NL predicates from the input question in the third step of the pipeline in Section 6.2.1, we used the TextRazor API⁴. The source code and evaluation results can be downloaded from https://github.com/WDAqua/ReMatch for independent use.

Num Properties  Total  Rank#1  Rank#2  Rank#3  Rank#4  Rank#5  Precision@1  Precision@5  Recall@5  F-Score@5
1 Property       116     55      71      78      84      87      47.4%        57.6%       75.0%     63.2%
2 Properties      21     10      11      13      13      13      47.6%        53.1%       61.9%     57.2%
                          5      10      14      14      15      23.80%       44.9%       71.4%     52.6%
3 Properties       6      1       1       1       1       1      16.6%        16.6%       16.6%     16.6%
                          1       1       2       4       4      16.6%        30.5%       66.6%     41.8%
                          1       1       2       2       2      16.6%        22.2%       33.3%     26.64%

Table 6.1: SIBKB Performance. Rank#1 to Rank#5 give the cumulative frequency of correctly linked relations at rank positions 1 to 5; Precision, Recall, and F-Score are reported at Top-1 and Top-5. Questions with two or three properties contribute one row per property. Accuracy of the SIBKB-based relation linking method is enhanced whenever the Top-5 results are considered.

3https://github.com/ag-sc/QALD/blob/master/7/data/qald-7-train-multilingual.json

4https://www.textrazor.com/


           Global Precision  Global Recall  Global F-Score
Baseline         17%              37%            23%
SIBKB            51%              73%            60%

Table 6.2: Comparison of SIBKB and Baseline. Top-1 predicates are considered; SIBKB enhances the accuracy of the proposed relation linking method.

6.3.1 Experiment 1: Performance Evaluation Using Relation Linking Benchmark

Evaluation of Relation Linking Task Using SIBKB

To evaluate the impact of the SIBKB on the relation linking task, we first calculate the performance of PATTY using a similarity measurement between question patterns and PATTY relational patterns based on cosine similarity [104]; we call this the ‘BaseLine’ approach. In the BaseLine approach, PATTY is used directly, without indices. In our approach, by contrast, we use the SIBKB, i.e., PATTY with indices, along with the pipeline described in Section 6.2.1. Out of the 215 questions of QALD-7, 143 can be answered using PATTY patterns. The remaining 72 questions have no associated relational patterns in PATTY and are therefore out of the scope of this evaluation. Table 6.2 illustrates the results. Using our approach, the global precision increases from 17% to 51% compared to the BaseLine, a three-fold improvement. The same trend holds for the global recall and F-score.
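For intuition, the BaseLine approach amounts to a linear scan over all PATTY pattern vectors for every question, with no index to narrow the candidate set; a minimal sketch under that assumption (names are illustrative):

```python
import numpy as np

def baseline_link(question_vec: np.ndarray, patty_entries) -> list:
    """patty_entries: iterable of (pattern_vector, relation) pairs from PATTY.
    Every pattern is compared against the question pattern, so the cost per
    question grows with the full size of PATTY."""
    def cosine(u, v):
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
    best_score = {}
    for pattern_vec, relation in patty_entries:
        score = cosine(question_vec, pattern_vec)
        if score > best_score.get(relation, -1.0):
            best_score[relation] = score
    return sorted(best_score, key=best_score.get, reverse=True)
```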

We further observed the impact of our approach on capturing knowledge from the knowledge base by calculating the precision and recall values up to the first five positions in the obtained list of relations. We divided questions with two or three properties into different groups, as shown in Table 6.1. For example, the question ‘Which professional surfers were born in Australia?’ contains two DBpedia properties, namely dbo:occupation and dbo:birthPlace.

Table 6.1 has two or three rows per group, depending on the number of relations in a question. Using our SIBKB approach for relation linking, precision and recall at the first position are high enough to show that our implementation can readily be used as a relation linking tool in modular question answering frameworks such as OKBQA [13], which would significantly improve the overall performance of QA systems in general. For example, the QA system CASIA, which uses PATTY, reports an average precision of 0.35 on QALD-3 [57].

If its relation linking tool were replaced by our approach, the overall performance of the CASIA system would improve. Furthermore, we excluded a performance comparison with the state-of-the-art relation linking tool presented in [19] because that work does not use the background knowledge base PATTY; it relies on modelling natural language relations with their underlying parts of speech, which are then enhanced with WordNet and dependency parsing. In contrast, our approach focuses on efficiently capturing knowledge from knowledge bases for relation linking, and it can be extended to other knowledge bases similar to PATTY. Nevertheless, combining both approaches would likely result in better relation linking performance, since the relational patterns in PATTY are limited.

6.3.2 Experiment 2: Trade-offs between Different Metrics

We illustrate the trade-offs between different dimensions of the performance metrics for the SIBKB-based approach compared to the baseline. We choose global precision, global recall, F-score, in-memory size, and execution time as five dimensions. The in-memory size of the PATTY knowledge base increased from 7.34 MB to 22.44 MB because we converted PATTY (a two-column corpus of relational
