
Chapter 6. Pipeline-based Approaches

Conclusion Based on our experiments, we observe that there is no significant difference between modeling importance estimation as regression, classification or ranking on our data. While one could probably find significant differences using larger datasets, the fact that they can only be observed — if at all — with more data shows that they are presumably rather small and will thus have only a small impact on the overall task-level performance.

With regard to features, the graph-based measures that we included in addition to traditional summarization features seem to be particularly useful for our task. Nevertheless, performance on importance estimation is still limited and should be further improved. We suspect that adding more external knowledge on what people generally consider to be important can be particularly helpful. In the task-level experiments in Section 6.5, we also assess how well the supervised model with the current features performs against the unsupervised methods suggested in previous work.

6.4. Concept Map Construction

Let $x_i$ be a binary decision variable that represents whether concept $c_i \in C$ is part of the selected subgraph. Then, the objective function can be written as [67]

$$\max \sum_{i=1}^{|C|} x_i \, \nu(c_i) \tag{6.13}$$

$$x_i \in \{0, 1\} \quad \forall\, i \in C \tag{6.14}$$

while the following constraint ensures that the subgraph obeys the size limit:

$$\sum_{i=1}^{|C|} x_i \le L \tag{6.15}$$

Ensuring that the selected subgraph is also connected is a bit more intricate. A common way to express such a constraint in an ILP is to use so-called commodity flow variables and, more specifically, the single commodity flow formulation for the minimum spanning tree problem proposed in the operations research community by Magnanti and Wolsey (1994).

It has been successfully used in ILPs addressing dependency parsing (Martins et al., 2009), sentence compression (Thadani and McKeown, 2013) and abstractive summarization (Liu et al., 2015; Li et al., 2016a). Let $f_{ij}$ be a non-negative integer variable capturing the flow from concept $c_i$ to $c_j$. We introduce flow variables for concept pairs with a relation in $R$.

The constraints

$$f_{ij} \le x_i \cdot |C| \quad \forall\, (i, j) \in R \tag{6.16}$$

$$f_{ij} \le x_j \cdot |C| \quad \forall\, (i, j) \in R \tag{6.17}$$

$$\sum_{i} f_{ij} - \sum_{k} f_{jk} - x_j = 0 \quad \forall\, j \in C \tag{6.18}$$

$$f_{ij} \in \mathbb{N} \quad \forall\, (i, j) \in R \tag{6.19}$$

enforce that flow can only move between concepts that are selected (6.16 and 6.17) and that a selected concept consumes one unit of flow (6.18). Further, let $i = 0$ be a virtual root node and $e_{0i}$ a virtual edge from the root to each concept. The additional constraints

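$$|C| \cdot e_{0i} - f_{0i} \ge 0 \quad \forall\, i \in C \tag{6.20}$$

$$\sum_{i=1}^{|C|} e_{0i} = 1 \tag{6.21}$$

$$\sum_{i=1}^{|C|} f_{0i} - \sum_{i=1}^{|C|} x_i = 0 \tag{6.22}$$

$$e_{0i} \in \{0, 1\} \quad \forall\, i \in C \tag{6.23}$$

$$f_{0i} \in \mathbb{N}_0 \quad \forall\, i \in C \tag{6.24}$$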

ensure that only one virtual edge can be active (6.21), that the virtual node can only send flow over this active edge (6.20) and that the total amount of flow sent from the root cannot exceed the number of selected concepts (6.22). As a consequence, if $n$ concepts are selected, $n$ units of flow are sent from the virtual root over the edges of the graph and each selected concept consumes one of them. This is only possible if the selected subgraph is connected. Equivalently, one can think of it as the edges with flow larger than zero forming a spanning tree of the selected subgraph that is rooted in the additional virtual node.

[67] To simplify the notation, we write $i \in C$ instead of $i \in \{1, \dots, |C|\}$ and correspondingly for $R$.
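To make the formulation (6.13) to (6.24) more tangible, the following sketch renders it as text in the LP file format that solvers such as CPLEX can read. It is an illustration under assumptions, not the implementation used in the experiments: the function name `build_selection_ilp`, the variable names (`x_i`, `f_i_j`, `root_i` for the virtual edges $e_{0i}$) and the constraint labels are hypothetical. The sketch also assumes that the incoming flow in (6.18) includes the flow from the virtual root, and it creates flow variables exactly for the ordered pairs passed as `relations`, so whether a relation allows flow in one or both directions is left to the caller.

```python
def build_selection_ilp(scores, relations, limit):
    """Render constraints (6.13)-(6.24) as LP-format text.

    scores:    importance scores; concept i (1-based) has score scores[i - 1]
    relations: unique ordered pairs (i, j) of concept indices, i.e. the set R
    limit:     size limit L
    Assumes at least one concept. All names below are illustrative.
    """
    n = len(scores)
    concepts = range(1, n + 1)

    def term(coef, var):
        # Explicit sign so that negative scores are rendered as "- 0.5 x_3".
        return f"{'-' if coef < 0 else '+'} {abs(coef)!r} {var}"

    def plus(names):
        return " + ".join(names)

    def minus(names):
        return "".join(f" - {name}" for name in names)

    x = [f"x_{i}" for i in concepts]          # selection variables x_i
    root = [f"root_{i}" for i in concepts]    # virtual edges e_0i
    f0 = [f"f_0_{i}" for i in concepts]       # flow sent from the virtual root
    flow = [f"f_{i}_{j}" for i, j in relations]

    lines = ["Maximize"]
    lines.append(" obj: " + " ".join(term(s, v) for s, v in zip(scores, x)))  # (6.13)
    lines.append("Subject To")
    lines.append(f" size: {plus(x)} <= {limit}")                       # (6.15)
    for i, j in relations:
        lines.append(f" cap_src_{i}_{j}: f_{i}_{j} - {n} x_{i} <= 0")  # (6.16)
        lines.append(f" cap_dst_{i}_{j}: f_{i}_{j} - {n} x_{j} <= 0")  # (6.17)
    for j in concepts:
        # (6.18); assumption: incoming flow includes the virtual root edge.
        inflow = [f"f_0_{j}"] + [f"f_{i}_{k}" for i, k in relations if k == j]
        outflow = [f"f_{i}_{k}" for i, k in relations if i == j]
        lines.append(f" consume_{j}: {plus(inflow)}{minus(outflow)} - x_{j} = 0")
    for i in concepts:
        lines.append(f" root_cap_{i}: {n} root_{i} - f_0_{i} >= 0")    # (6.20)
    lines.append(f" one_root: {plus(root)} = 1")                      # (6.21)
    lines.append(f" root_flow: {plus(f0)}{minus(x)} = 0")             # (6.22)
    lines.append("Binary")                                            # (6.14), (6.23)
    lines.append(" " + " ".join(x + root))
    lines.append("General")   # (6.19), (6.24); integer with default lower bound 0
    lines.append(" " + " ".join(f0 + flow))
    lines.append("End")
    return "\n".join(lines)


# Three concepts in a chain 1 -> 2 -> 3, at most two may be selected.
print(build_selection_ilp([0.9, 0.2, 0.7], [(1, 2), (2, 3)], limit=2))
```

The number of emitted variables and constraints grows linearly in $|C| + |R|$. A real exporter would probably also wrap long sums, since some LP readers limit the line length.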

An important detail for the optimization is the range of the importance estimates. If some concepts receive negative scores, the objective can be improved by excluding them from the subgraph. As a result, some part of the size budget might remain unused although additional connected concepts would be available. To avoid that, we can simply shift all importance scores into the non-negative range, formally, by deriving $\nu'$ as

$$\nu'(c_i) = \nu(c_i) - \min \{\, \nu(c_j) \mid c_j \in C \,\} \tag{6.25}$$

and then using $\nu'$ in the ILP. However, if negative scores are only assigned to concepts that should in no case be part of the summary, the default behavior might actually be desired.
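The shift in (6.25) can be sketched in a few lines. The function name `shift_scores` and the representation of scores as a dictionary are assumptions made for this illustration only.

```python
def shift_scores(scores):
    """Apply (6.25): subtract the minimum so that the lowest score becomes 0."""
    lowest = min(scores.values())
    return {concept: score - lowest for concept, score in scores.items()}


# The concept with the lowest estimate ends up at 0, all others above it.
print(shift_scores({"a": -2.0, "b": 0.5, "c": 3.0}))
```

Since every score is moved by the same amount, the ranking of the concepts stays unchanged, and adding a further connected concept can no longer decrease the objective.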

We take several measures to ensure that the ILP can be efficiently solved for the problem instances of CM-MDS. First, the above ILP formulation is already much more efficient than the one proposed by Li et al. (2016a) for MDS, which is the most similar ILP in related work.

While ours requires $\mathcal{O}(|C| + |R|)$ variables and constraints, their formulation uses two variables per pair of nodes for the connectivity constraint, resulting in $\mathcal{O}(|C|^2)$ variables and constraints. For sparse graphs, where $|R| \ll |C|^2$, this leads to much smaller ILPs.

Second, we leverage the fact that $G$ is typically disconnected. Since a connected subgraph has to lie completely within one of the connected components of $G$, we first identify these components and solve a separate ILP for each of them. These smaller ILPs can usually be solved faster than a single large one. And third, processing $G$ component by component also allows us to skip some components completely. Starting with the biggest component, we keep track of the best objective function value found so far. If the next component has a total concept score less than that value, none of its subgraphs can be a better solution. And if a component consists of fewer concepts than the limit, we can directly use the whole component instead of selecting a subset. With these measures, as we show in the experiments, the ILP can be efficiently solved for the problem sizes in the Educ corpus.
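The component-wise strategy can be sketched as follows. This is a simplified illustration, not the code behind the experiments: `solve_component` is a hypothetical stand-in that enumerates subsets exhaustively where the text uses an ILP, so it is only viable for tiny components. The sketch assumes undirected connectivity, non-negative scores (for example after the shift in (6.25)), "biggest" meaning the largest number of concepts, and it also takes a whole component when its size equals the limit.

```python
from itertools import combinations


def connected_components(nodes, edges):
    """Return the connected components and the adjacency map of the graph."""
    adj = {v: set() for v in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        components.append(comp)
    return components, adj


def is_connected(subset, adj):
    """Check whether the subgraph induced by subset is connected."""
    subset = set(subset)
    start = next(iter(subset))
    reached, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adj[v] & subset:
            if w not in reached:
                reached.add(w)
                stack.append(w)
    return reached == subset


def solve_component(comp, adj, scores, limit):
    """Stand-in for the ILP: exhaustive search over connected subsets."""
    best, best_value = None, float("-inf")
    for size in range(1, limit + 1):
        for subset in combinations(sorted(comp), size):
            value = sum(scores[c] for c in subset)
            if value > best_value and is_connected(subset, adj):
                best, best_value = set(subset), value
    return best, best_value


def select_subgraph(scores, edges, limit):
    """Process components from biggest to smallest, skipping hopeless ones."""
    components, adj = connected_components(list(scores), edges)
    best, best_value = set(), float("-inf")
    for comp in sorted(components, key=len, reverse=True):
        total = sum(scores[c] for c in comp)
        if total < best_value:
            continue  # no subgraph of this component can beat the best so far
        if len(comp) <= limit:
            candidate, value = set(comp), total  # whole component fits
        else:
            candidate, value = solve_component(comp, adj, scores, limit)
        if value > best_value:
            best, best_value = candidate, value
    return best, best_value


scores = {"a": 3, "b": 1, "c": 4, "d": 2, "e": 5}
edges = [("a", "b"), ("b", "c"), ("c", "d")]  # "e" forms its own component
print(select_subgraph(scores, edges, limit=2))
```

In this example the isolated concept "e" is skipped without any search, because its total score of 5 is below the value 6 already found in the larger component.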

6.4.2 Experiments

To verify the effectiveness and efficiency of our proposed subgraph selection, we conduct an experiment that compares it against heuristic selection and alternative ILP formulations.

Experimental Setup We use the same data as for the concept importance estimation experiment (see Section 6.3.3), namely concepts extracted and grouped from the training topics of Educ. To evaluate subgraph selection independent of importance estimation, we do not use a trained model but the gold scores derived as training labels in Section 6.3.3. We create a second dataset based on Wiki with the same approach. On both datasets, we evaluate the selected subgraphs by comparing them against the reference concept maps with the metrics proposed for CM-MDS in Section 3.5.2. ILPs are solved with CPLEX [68] on a compute server with 500 GB of memory and 24 Intel Xeon E5-2620 2.1 GHz cores.

                     METEOR                          ROUGE
             Pr     Re     F1     p          Pr     Re     F1     p
Educ
  ILP        23.32  27.52  25.16             26.09  23.93  24.74
  Heuristic  18.28  25.15  21.13  .0003      17.52  21.97  19.34  .0014
Wiki
  ILP        29.04  26.76  27.73             29.08  18.79  22.54
  Heuristic  24.45  24.46  24.83  .0051      24.06  17.39  19.57  .0093

Table 6.7: Evaluation of summary concept maps obtained with the proposed ILP and heuristic selection. Inputs are graphs created by automatic extraction and grouping in combination with gold importance scores. P-values are computed with a permutation test comparing F1-scores.
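The text only states that p-values come from a permutation test on F1-scores. As one possible reading, the sketch below implements an exact paired permutation (sign-flip) test on per-topic score differences. The pairing by topic, the two-sided formulation and the name `paired_permutation_test` are assumptions of this illustration and may differ from the procedure actually used.

```python
from itertools import product


def paired_permutation_test(scores_a, scores_b):
    """Two-sided exact paired permutation test on the summed difference.

    Under the null hypothesis, the two systems are exchangeable per topic,
    so the sign of each per-topic difference can be flipped freely.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    hits = total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed:
            hits += 1
    return hits / total


# Hypothetical per-topic scores of two systems on three topics.
print(paired_permutation_test([3, 4, 5], [1, 1, 1]))
```

The enumeration visits all $2^n$ sign assignments and is therefore only practical for a small number of topics; for larger samples, one would typically draw random sign assignments instead.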

As a baseline for our proposed approach, we implement a greedy heuristic similar to Zubrinic et al. (2015): Given the graph of scored concepts, it starts with the most important one and selects the best neighbor (by score, breaking ties by the node’s degree) until the size limit is reached. While this procedure ensures that the selected subgraph is valid, i.e. not too big and connected, it is not necessarily, in contrast to the ILP, the best subgraph with regard to our objective function. As a second baseline, we include an alternative formulation of the subgraph selection ILP obtained by transferring the ILP of Li et al. (2016a) for MDS to our task.

The main difference is that it uses a quadratic number of variables to represent the presence or absence of all possible edges and the flow along them. While that has implications for its efficiency, it does of course also find an optimal subgraph.
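The greedy baseline described above can be sketched as follows. The details are assumptions made for illustration: candidates are the neighbors of any already selected concept, ties in score are broken in favor of the higher degree, remaining ties are resolved by name to keep the result deterministic, and `greedy_select` is a hypothetical name.

```python
def greedy_select(scores, edges, limit):
    """Grow a connected subgraph by repeatedly adding the best-scoring neighbor."""
    adj = {c: set() for c in scores}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    def rank(concept):
        # Higher score first, then higher degree as tie-breaker.
        return (scores[concept], len(adj[concept]))

    selected = [max(sorted(scores), key=rank)]  # start with the top concept
    while len(selected) < limit:
        frontier = set().union(*(adj[c] for c in selected)) - set(selected)
        if not frontier:
            break  # the component is exhausted before the limit is reached
        selected.append(max(sorted(frontier), key=rank))
    return selected


scores = {"a": 5, "b": 1, "d": 4, "e": 4}
edges = [("a", "b"), ("b", "d"), ("d", "e")]
print(greedy_select(scores, edges, limit=2))
```

On this small chain, the heuristic picks "a" and then its only neighbor "b" for a total score of 6, although the connected pair "d" and "e" would reach 8. This is the kind of gap to the optimal subgraph that the ILP closes.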

Results Table 6.7 shows the results of this experiment. As expected, our ILP approach selects better subgraphs as summaries, and the results on both datasets and in both metrics show that the differences between them and the summaries obtained with the heuristic are substantial and significant. Note that while the ILP finds the best solution to the optimization problem by definition and is in that sense already known to be superior to the heuristic, this experiment verifies that the best solution to the optimization problem is indeed also a good solution for the CM-MDS task in terms of being closer to the reference map.

[68] Version 12.7, available at https://www.ibm.com/analytics/cplex-optimizer.


Method                       ILP Size               Runtime
                      Variables   Constraints         (sec)
(Li et al., 2016a)   37,273,062    74,530,095       2670.61
  by component       25,810,465    51,607,172        999.25
Our ILP                  21,596        31,129          7.31
  by component           17,973        26,484          5.61

Table 6.8: Comparison of ILP sizes and runtimes on average per topic for subgraph selection on Educ with our ILP and the alternative formulation of Li et al. (2016a).

In Table 6.8, we compare ILP sizes and the time required to solve them. Although the differences between our ILP formulation and the one by Li et al. (2016a) are small, they have a large effect in practice, resulting in problems that are orders of magnitude smaller and in much faster runtimes. Identifying connected components and selecting subgraphs for each of them separately further improves the efficiency of both ILP approaches. On the document sets of Educ, with on average over 100,000 tokens, that allows us to select a summary subgraph in just a few seconds, which is not possible with the formulation of Li et al. (2016a).

Conclusion Based on these experimental results, we conclude that selecting summary subgraphs with our proposed ILP is effective and can also be done efficiently on our corpora.

We will therefore include it in our CM-MDS pipeline described in the next section and assess it in an end-to-end task-level evaluation.