
3 An Extensible Architecture for RTE


Figure 3.2: The proposed RTE system architecture

paraphrase, and so on. However, these cases cannot be trivially differentiated by a machine, nor is such a categorization well suited to the data collected from different NLP applications.

In fact, the criteria for a good subset depend heavily on the module that will deal with it. It is therefore easier to split the data according to the specialized modules we have. For instance, if we have an inferencer dealing with temporal expressions, we should find those entailment cases in the dataset which need temporal reasoning. If we have an accurate person name normalization system, we should find those cases that need pronoun resolution. In the more general sense, we need to discover those T-H pairs which the available systems can handle well.

Therefore, we prefer a system with high precision over one with high recall (if we cannot achieve both at the same time), both in the splitting of the data and in the processing with the specialized modules. In particular, the criteria for such an architecture are:

A good split using basic linguistic processing to choose a subset of the whole dataset;

Good, precision-oriented modules, preferring accuracy over coverage of the dataset.

In the following, we briefly introduce our RTE system based on this architecture (Figure 3.2); the details are given in Chapter 4.

For preprocessing, we utilize several linguistic processing components, such as a POS tagger, a dependency parser, and a named-entity (NE) recognizer, to annotate the original plain texts from the RTE corpus.
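As a concrete illustration of this preprocessing step, the sketch below annotates a text with POS tags, dependency arcs, and named entities. The use of spaCy here is an assumption for illustration only; the dissertation does not prescribe particular components.

# Hypothetical preprocessing sketch; the actual system uses its own
# POS tagger, dependency parser, and NE recognizer. spaCy stands in here.
import spacy

nlp = spacy.load("en_core_web_sm")

def annotate(text):
    """Annotate plain text with POS tags, dependency arcs, and named entities."""
    doc = nlp(text)
    tokens = [(tok.text, tok.pos_, tok.dep_, tok.head.text) for tok in doc]
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    return tokens, entities

t_tokens, t_entities = annotate("Mozart was born in Salzburg in 1756.")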

We then apply several specialized RTE modules. Since all the modules aim at high precision, they do not necessarily cover all the T-H pairs.

The cases which cannot be covered by any specialized RTE module are passed to the high-coverage, but probably less accurate backup modules.

In the final stage, we join the results of all specialized and backup modules. Different confidence values are assigned to the different modules according to their performance on the development data. To deal with possible overlapping cases (i.e., T-H pairs covered by more than one module), a voting mechanism that takes these confidence values into account is applied.
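The sketch below illustrates this joining stage under assumed interfaces: each module either abstains on a pair outside its subset (returning None) or casts a yes/no vote, and votes are weighted by per-module confidence values tuned on the development data. All names and weights here are hypothetical.

# Hypothetical sketch of the final joining stage: specialized modules vote
# with confidences estimated on development data; backups catch the rest.

def judge(pair, specialized, backups, confidence):
    """Return True (entailment) or False for one T-H pair.

    `specialized` and `backups` map module names to functions that return
    True/False, or None when the module does not cover the pair.
    `confidence` maps module names to weights from the development data.
    """
    votes = {name: mod(pair) for name, mod in specialized.items()}
    votes = {n: v for n, v in votes.items() if v is not None}
    if not votes:  # pair covered by no specialized module: fall back
        votes = {n: m(pair) for n, m in backups.items()}
        votes = {n: v for n, v in votes.items() if v is not None}
    # Confidence-weighted vote over all modules that covered the pair
    score = sum(confidence[n] * (1 if v else -1) for n, v in votes.items())
    return score > 0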

For the specialized modules, we have developed and implemented the following three, each dealing with a different case of entailment:

Temporal anchored pairs Extract temporal expressions and corresponding events from the dependency trees, and apply entailment rules between the extracted time-event pairs (a toy illustration of such a rule follows this list);

Named entity pairs Extract other named entities (NEs) and corresponding events, and apply entailment rules between the extracted entity-event pairs;

Noun phrase anchored pairs For pairs with no NEs but containing two NPs, determine the subtree alignment, and apply a kernel-based classifier.
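As a toy illustration of the first module's rules (the actual rules are given in Chapter 4), one plausible temporal entailment rule states that a more specific time anchor entails a compatible, less specific one, e.g. "in July 1990" entails "in 1990". The encoding below is an assumption for illustration.

# Hypothetical temporal entailment rule: T's time anchor entails H's
# anchor if it agrees on every field H specifies (None = unspecified).

def time_entails(t_anchor, h_anchor):
    """Anchors are (year, month, day) tuples; None marks an unspecified field."""
    for t_field, h_field in zip(t_anchor, h_anchor):
        if h_field is not None and t_field != h_field:
            return False
    return True

assert time_entails((1990, 7, None), (1990, None, None))      # July 1990 -> 1990
assert not time_entails((1990, None, None), (1990, 7, None))  # 1990 does not entail July 1990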

In addition to the precision-oriented RTE modules, we also consider two robust but not necessarily precise backup strategies to deal with those cases which cannot be covered by any specialized module. The chief requirements for a backup strategy are robustness and simplicity. We therefore considered two backup modules, the Triple backup and the Bag-of-Words (BoW) backup (Wang and Neumann, 2007a).

The Triple backup module is based on the Triple similarity function, which operates on two sets of triples (dependency structures represented in the form <head, relation, modifier>) and determines how many triples of H are contained in T. The core assumption is that the more triple elements match, the more similar the two sets are, and the more likely it is that T entails H. The function uses approximate matching: different cases (i.e., ignoring either the parent node, the child node, or the relation between the nodes) may provide different indications of the similarity of T and H. We sum these up using different weights and divide the result by the cardinality of H for normalization.
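A minimal sketch of this function follows; the particular match cases and weights are illustrative assumptions, not the system's actual values.

# Sketch of the Triple similarity: count how many H triples
# (head, relation, modifier) are contained in T, giving partial credit
# for approximate matches. Weights are illustrative.

FULL, NO_REL, NO_HEAD, NO_MOD = 1.0, 0.7, 0.4, 0.4  # assumed weights

def triple_similarity(t_triples, h_triples):
    """Score in [0, 1]: weighted matches of H's triples in T, normalized by |H|."""
    score = 0.0
    for head, rel, mod in h_triples:
        best = 0.0
        for t_head, t_rel, t_mod in t_triples:
            if (head, rel, mod) == (t_head, t_rel, t_mod):
                best = max(best, FULL)        # exact match
            elif (head, mod) == (t_head, t_mod):
                best = max(best, NO_REL)      # ignore the relation
            elif (rel, mod) == (t_rel, t_mod):
                best = max(best, NO_HEAD)     # ignore the parent node
            elif (head, rel) == (t_head, t_rel):
                best = max(best, NO_MOD)      # ignore the child node
        score += best
    return score / len(h_triples) if h_triples else 0.0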

The BoW backup module is based on the BoW similarity score, which is calculated by dividing the number of words overlapping between T and H by the total number of words in H, after a simple tokenization on whitespace.
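In code, the score reduces to a few lines (a sketch under the whitespace-tokenization assumption above):

# Sketch of the BoW similarity: fraction of H's tokens that also occur in T.

def bow_similarity(t, h):
    t_words = set(t.split())
    h_words = h.split()
    if not h_words:
        return 0.0
    return sum(w in t_words for w in h_words) / len(h_words)

bow_similarity("Mozart was born in Salzburg", "Mozart was born in Austria")  # 0.8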

There is one more issue we have not yet addressed, namely the application of external knowledge; Chapter 5 focuses mainly on this. In particular, we consider using a collection of textual inference rules for the RTE task. The rules were obtained separately, using an acquisition method based on the Distributional Hypothesis (Harris, 1954). The system itself can be viewed as an extended version of the third specialized module mentioned above. The original module extracts Tree Skeletons (Section 5.4.2) from the dependency trees and applies a subsequence-kernel-based classifier that learns to decide whether the entailment relation holds between two texts. The extended system replaces the learning part with rule application. Thus, whether an inference rule triggers defines the subset of the data that the specialized module deals with.
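To make the triggering criterion concrete, the sketch below assumes DIRT-style rules pairing two dependency paths with shared anchor slots X and Y; only the triggering check is shown, and every name here is illustrative rather than taken from the actual system.

# Hypothetical sketch of inference-rule triggering: a rule pairs two
# dependency paths with shared anchor slots X and Y. It "triggers" on a
# T-H pair when its left side matches T's tree skeleton and its right
# side matches H's, with the same anchor fillers.

def rule_triggers(rule, t_paths, h_paths):
    """rule = (lhs_path, rhs_path); *_paths map a path pattern to the
    (X, Y) anchor fillers extracted from the dependency tree skeleton."""
    lhs, rhs = rule
    return lhs in t_paths and rhs in h_paths and t_paths[lhs] == h_paths[rhs]

rule = ("X write Y", "X be author of Y")
t_paths = {"X write Y": ("Mozart", "operas")}
h_paths = {"X be author of Y": ("Mozart", "operas")}
rule_triggers(rule, t_paths, h_paths)  # True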

3.3 Summary

In summary, this chapter provides an overview of the extensible architecture of our RTE system. The system contains multiple specialized modules which deal with different types of entailment separately, instead of tackling them all together. We present three such modules in Chapter 4 and one extended module, with an external inference rule collection, in Chapter 5.

Bobrow et al. (2007) also had the idea of developing a precision-oriented RTE system, although their system had very limited coverage of the dataset. Bentivogli et al. (2010) built specialized datasets of monothematic T-H pairs, i.e., pairs in which a single phenomenon relevant to the entailment relation is highlighted and isolated. Recent work by Mirkin et al. (2010a) focused on data involving discourse information. All this related work confirms the "specialized" strategy of tackling the RTE task.

Naturally, more specialized modules (including those mentioned in the related work above) can be added to our extensible architecture. For example, one could enhance entailment recognition with logical inference to deal with quantifiers, modal verbs, etc. The integration of generic and specialized modules is also outside the scope of this dissertation. In the long run, we will explore different combination strategies as well. We leave these issues for future work (Chapter 10).
