8.4 Chapter Summary

In this chapter, we presented a first prototype of a concept map–based exploratory search system. In line with the requirements laid out in Chapter 2, the system supports exploratory search by providing a concise, structured overview of the content of a document collection and by allowing a user to navigate to specific details in the documents.

The summary concept map, which is a core part of the system, can be automatically generated with the computational models presented in Chapters 5, 6 and 7 as well as future improvements of them. The prototype thus serves as a demonstration of how those models can be used for practical purposes. The positive initial user feedback on the system showed that the application scenario is realistic and that the system is intuitive to use. This is in line with the previous, more extensive studies reviewed in Section 2.2.2, which analyzed the use of concept maps in practice and observed benefits over other types of representations for supporting document exploration.

Beyond its purpose as a demonstrator of a specific application scenario, the system can also be used to conduct user studies that compare how useful concept maps created with different computational models are in practice. We suggested this as the third possible type of evaluation for CM-MDS in Section 3.5.2. Using different computational models with the same exploration system can ensure a controlled experiment with comparable settings across conditions. Further, the logging capabilities of the system allow detailed analysis of such experiments. Similar user studies have been carried out by Carnot et al. (2001) and Valerio et al. (2012) to compare concept maps with alternative representations.

As the work on CM-MDS is still at its beginning and very few models covering the whole task exist, there is currently no need to perform such comparisons. In this thesis, we presented two pipeline-based models (see Section 4.3.3 and Section 6.5) and several neural network-based end-to-end models (see Chapter 7). Since both the neural models and the corpus baseline produce concept maps of rather low quality and have already been found to be substantially weaker in automatic and manual evaluations, comparing them in a user study against the improved pipeline would add only very limited insights. However, as more well-performing methods are developed in future work, such experiments will become more interesting and the system presented here can be used to carry them out.

Chapter 8. Exploratory Search with Concept Maps

Chapter 9

Conclusions

"Science never solves a problem without creating ten more."

– George Bernard Shaw

This final chapter summarizes the findings of the thesis and outlines promising directions for future research on CM-MDS and related tasks.

9.1 Summary of Findings and Contributions

This thesis has shown that the automatic creation of multi-document summaries in the form of concept maps is an important task and that many challenges for computational models arise that have not yet been adequately addressed in previous work. We therefore proposed new models for several subtasks of CM-MDS as well as different approaches to model the task as a whole. With several newly created corpora and suggested evaluation protocols, we conducted extensive experimental evaluations. And finally, we demonstrated the practical use of concept maps for exploratory search in a demo application.

Chapter 3 introduced the central problem studied in this thesis. Based on the review of user requirements, we argued that concept maps are a very useful text representation to support users during exploratory search. Given the limited amount of existing work on extracting them from text, we proposed to study the task of concept map–based multi-document summarization (CM-MDS). We presented all of its subtasks in detail and outlined the challenges that computational models for it have to face. We further proposed two automatic metrics based on METEOR and ROUGE to evaluate automatically created concept maps against manually created references. Because these metrics, as in traditional text summarization, can only evaluate some aspects of the task, we additionally proposed manual evaluation protocols that complement the metrics.
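To illustrate the general idea of scoring a system concept map against a reference map, the following is a minimal sketch and not the metric definitions from Chapter 3. It assumes that each proposition is a (concept, relation, concept) triple, and it uses plain unigram overlap as a stand-in for METEOR or ROUGE scoring. The function names `tokens`, `overlap_f1` and `map_scores` and the best-match averaging scheme are assumptions made for this example.

```python
from collections import Counter


def tokens(proposition):
    """Render a (concept, relation, concept) triple as lowercase tokens."""
    return " ".join(proposition).lower().split()


def overlap_f1(candidate, reference):
    """Unigram precision, recall and F1 between two propositions.

    Assumption: unigram overlap replaces METEOR/ROUGE scoring here.
    """
    cand, ref = Counter(tokens(candidate)), Counter(tokens(reference))
    matched = sum((cand & ref).values())
    if matched == 0:
        return 0.0, 0.0, 0.0
    precision = matched / sum(cand.values())
    recall = matched / sum(ref.values())
    return precision, recall, 2 * precision * recall / (precision + recall)


def map_scores(system_map, reference_map):
    """Score a system concept map against a reference concept map.

    Precision averages, over system propositions, the best F1 against any
    reference proposition; recall does the same in the other direction.
    """
    if not system_map or not reference_map:
        return 0.0, 0.0, 0.0
    precision = sum(
        max(overlap_f1(s, r)[2] for r in reference_map) for s in system_map
    ) / len(system_map)
    recall = sum(
        max(overlap_f1(r, s)[2] for s in system_map) for r in reference_map
    ) / len(reference_map)
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)
```

Such a score only captures lexical agreement between propositions, which is one reason why complementary manual evaluation remains necessary.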

Chapter 4 addressed the lack of suitable corpora to train and evaluate computational models for CM-MDS and the fact that the annotation is particularly time-consuming and complex. We explored two different directions: the automatic extension of existing partial annotations and the use of a new scalable annotation process relying on crowdsourcing.

With regard to the second strategy, we developed a new corpus creation method that effectively combines automatic preprocessing, scalable crowdsourcing and high-quality expert annotations. Its crucial component is a novel crowdsourcing scheme called low-context importance annotation. Using it, we created a new corpus of 30 document sets on educational topics, each with around 40 source documents and a summarizing concept map, that served as the main evaluation dataset for experiments in the thesis.

Chapter 5 focused on the concept and relation extraction subtasks of CM-MDS. As the first contribution, we addressed the lack of a clearly established state of the art among previously proposed extraction methods by carrying out a first direct comparison of such methods. The most interesting finding of these experiments is that previously proposed methods for relation mention extraction performed particularly poorly on our datasets. In addition, we proposed to extract concept and relation mentions from predicate-argument structures instead of syntactic representations. We found that this alternative approach substantially improves relation extraction performance while performing comparably to previous work for concept mention extraction and being substantially easier to implement. As a second experiment, we performed a case study of porting a rule-based predicate-argument analysis tool from English to German. Since most previous work focused on English, it is important to know how challenging and laborious it is to obtain such systems for other languages. We found that with roughly 10% of the effort that went into the English system, we could build a variant for German covering 89% of the rules, and we provided an extensive discussion of the cases that we found to be more difficult to transfer.

Chapter 6 looked at the CM-MDS subtasks of concept mention grouping, importance estimation and concept map construction. For concept mention grouping, we proposed a novel solution based on pairwise mention classification and a subsequent partitioning step. Compared to previous work on concept map mining, this approach can capture more types of coreferences and successfully improved the quality of summary concept maps on our benchmark corpus. With regard to importance estimation, we studied a broad set of features and different machine learning approaches. While we could not observe clear advantages of modeling the problem as regression, classification or ranking, we did design supervised models that clearly outperform the exclusively unsupervised methods suggested in previous work. For concept map construction, we proposed an ILP formulation that allows us to find an optimal solution to the subgraph selection problem of CM-MDS. These optimal subgraphs are superior to heuristically selected ones on our evaluation corpus. We finally presented a pipeline covering the whole CM-MDS task that incorporates the new models we developed for the different subtasks. We performed automatic and manual evaluations on two corpora and observed that the pipeline improves upon a range of methods proposed in previous work, defining a new state of the art for the task.
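The subgraph selection problem mentioned above can be illustrated with a small sketch. It is not the ILP formulation of Chapter 6: for brevity it replaces the ILP solver with an exhaustive search that is only feasible for tiny graphs. It further assumes that the selected concepts must form a connected subgraph, that the objective is the sum of concept importance scores, and that the size limit counts concepts. The names `is_connected` and `best_subgraph` are hypothetical.

```python
from itertools import combinations


def is_connected(nodes, edges):
    """Check whether `nodes` form a connected subgraph using only edges among them."""
    nodes = set(nodes)
    if not nodes:
        return False
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        if a in nodes and b in nodes:
            adjacency[a].add(b)
            adjacency[b].add(a)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for neighbor in adjacency[stack.pop()]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen == nodes


def best_subgraph(scores, edges, size_limit):
    """Return the connected concept set with at most `size_limit` concepts
    and maximal total importance (exhaustive search, not an ILP)."""
    best, best_score = (), float("-inf")
    for k in range(1, size_limit + 1):
        for subset in combinations(sorted(scores), k):
            total = sum(scores[c] for c in subset)
            if total > best_score and is_connected(subset, edges):
                best, best_score = subset, total
    return set(best), best_score


# Toy example with made-up importance scores.
scores = {"concept maps": 9, "users": 8, "CM-MDS": 7, "corpus": 6}
edges = [("concept maps", "CM-MDS"), ("CM-MDS", "corpus"), ("users", "corpus")]
selected, total = best_subgraph(scores, edges, size_limit=2)
```

In this toy graph, the two highest-scoring concepts ("concept maps" and "users") are not connected, so simply picking the top-scoring concepts would not yield a valid map. The search instead returns the connected pair "concept maps" and "CM-MDS" with a total of 16, which illustrates why an exact method can beat simple heuristics.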

Chapter 7 explored approaches to model CM-MDS end-to-end as a single machine learning problem with neural networks. Several challenges make such a modeling approach difficult: only little training data is available, very large inputs have to be processed and no neural architectures to predict labeled graphs exist. First, we proposed a set of techniques that allow us to reduce CM-MDS to a sequence transduction problem and approach it with existing models for such problems. Second, we proposed sequence-to-graph networks, a novel neural network architecture that can directly predict labeled graphs. Third, we carried out a set of experiments that for the first time study the performance of neural end-to-end models on CM-MDS. While the overall performance of these neural models is not yet competitive with the pipeline-based approaches of the previous chapter, the experiments are an important first step towards developing such models and pointed out the remaining challenges that have to be addressed by future work. One of them is to obtain high-quality training datasets of sufficient size. The second part of this chapter outlines specific steps that could be taken in this direction.
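One ingredient of treating graph prediction as sequence transduction is a way to write a concept map as a flat token sequence and to read it back. The following sketch shows one possible linearization and is not necessarily the scheme used in Chapter 7. The separator tokens `<sep>` and `<rel>` and the function names `linearize` and `delinearize` are assumptions made for illustration.

```python
SEP = "<sep>"  # hypothetical marker between propositions
REL = "<rel>"  # hypothetical marker between the parts of one proposition


def linearize(propositions):
    """Turn (concept, relation, concept) triples into one flat token list."""
    sequence = []
    for i, (source, relation, target) in enumerate(propositions):
        if i > 0:
            sequence.append(SEP)
        sequence += source.split() + [REL] + relation.split() + [REL] + target.split()
    return sequence


def delinearize(sequence):
    """Recover triples from a token list, dropping malformed propositions."""
    propositions, current, part = [], [[], [], []], 0
    for token in list(sequence) + [SEP]:
        if token == SEP:
            # Keep only propositions with exactly three non-empty parts.
            if part == 2 and all(current):
                propositions.append(tuple(" ".join(p) for p in current))
            current, part = [[], [], []], 0
        elif token == REL:
            part += 1
        elif part <= 2:
            current[part].append(token)
    return propositions


propositions = [
    ("concept maps", "can support", "users"),
    ("summary concept maps", "consist of", "propositions"),
]
assert delinearize(linearize(propositions)) == propositions
```

Because a sequence model can emit ill-formed output, the decoder in this sketch silently discards propositions that do not have exactly three non-empty parts. A real system would need a more careful policy for such cases.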

Chapter 8 demonstrated how the computational models developed in the previous chapters can be used for practical purposes. We presented a first prototype of an exploratory search system that, using summary concept maps, provides a concise and structured overview of a document collection and allows a user to navigate to details in order to explore the content. Besides its function as a demonstrator, the system can also facilitate user studies as an extrinsic evaluation for CM-MDS in the future.

Figure 9.1 shows a summary concept map that was automatically created based on the content of this thesis.[76] Similar to the examples shown in Section 6.5 and the errors discussed there, this summary also reveals some open issues of the current methods: concepts are sometimes not grouped as much as would be desired (e.g. concept maps and maps) and some extracted relations are not very clearly labeled (e.g. tries to keep). However, we think it also demonstrates that even at the current level of performance, the pipeline already produces summaries that can be of use to a user. As Figure 9.1 shows, the summary does in fact contain many of the central concepts discussed in this thesis and it also presents several important relations between them. In the following and final part of the thesis, we point out how the remaining issues could be addressed in the future.

[76] We used the pipeline presented in Section 6.5 with models trained on Educ and a size limit of 10 concepts. The input is a collection with one document per chapter, excluding this paragraph describing the result.

Figure 9.1: Summary concept map automatically created for the content of this thesis using the pipeline presented in Section 6.5 trained on Educ with a size limit of ℒ𝐶 = 10.