MiTemP: Mining Temporal Patterns

4.5 Pattern Mining

4.5.2 Application of Knowledge

equivalent pattern $p_{spec} = (cp, \mathcal{TR}', \mathcal{CR}')$. The (potentially) specialized temporal restriction is defined as $\mathcal{TR}' = prop_{spec}(\mathcal{TR})$ (see Def. 4.32 on page 96).

The specialized concept restriction takes, for each term $t_i$ in the conjunctive pattern, the most specialized concept among the corresponding predicate templates, the previous concept restriction, and the direct concept of the instance. Let $t_1, \ldots, t_n$ be the list of terms in the conjunctive pattern and let $C_m = \{ci_{m,1}, \ldots, ci_{m,k}\} \cup \{ci_{m,prev}\}$ be the set of concept identifiers of the corresponding positions in the predicate templates, with $\mathcal{CR} = \{(t_1, ci_{1,prev}), \ldots, (t_n, ci_{n,prev})\}$. If $t_m$ is atomic, i.e., $t_m \in \mathcal{O}$, the corresponding most specialized concept is $ci_{m,spec}$ with direct-instance-of$(t_m, ci_{m,spec})$. Otherwise, $ci_{m,spec}$ is the most special concept of $C_m$, i.e., $ci_{m,spec} \in C_m$ and $\forall ci_m \in C_m : ci_m \neq ci_{m,spec} \implies$ is-a$(ci_m, ci_{m,spec})$. $\mathcal{CR}'$ is defined as $\mathcal{CR}' = \{(t_1, ci_{1,spec}), \ldots, (t_n, ci_{n,spec})\}$.

If a pattern is inconsistent, the function maps to the inconsistent pattern (cf. Def. 4.27).

Definition 4.54 (Temporal Pattern Equivalence Relation). Two patterns $p_1$ and $p_2$ are equivalent, denoted by $p_1 \sim p_2$, iff their most special equivalent patterns are identical: $spec(p_1) = spec(p_2)$.

The equivalence class of a pattern $p$ is defined as the set of all patterns that are equivalent to $p$: $[p] = \{q \mid q \in \dot{\mathcal{L}}_{tp} \wedge p \sim q\} \subseteq \dot{\mathcal{L}}_{tp}$.

In order to avoid checking equivalent patterns multiple times and to avoid checking inconsistent patterns (which cannot have any match and thus must have a support and frequency of zero), the available knowledge should be used to create the most special pattern of a pattern's equivalence class. In order to create this most special pattern, the temporal restriction and the concept restriction must be examined. For the temporal restriction, the composition table must be used to remove all temporal relations that are no longer possible. This can be done by constraint propagation as shown in Section 4.2.
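The pruning of temporal relations by means of a composition table can be illustrated with a minimal sketch. It is not the propagation procedure of Section 4.2: as a simplifying assumption it uses the three relations of the point algebra (`<`, `=`, `>`) instead of interval relations, and the names `COMPOSE`, `compose`, and `propagate` are hypothetical.

```python
from itertools import product

# Assumption: the point algebra {"<", "=", ">"} stands in for the
# interval relations used in the source; only the mechanism is shown.
ALL = frozenset("<=>")
COMPOSE = {
    ("<", "<"): frozenset("<"), ("<", "="): frozenset("<"), ("<", ">"): ALL,
    ("=", "<"): frozenset("<"), ("=", "="): frozenset("="), ("=", ">"): frozenset(">"),
    (">", "<"): ALL, (">", "="): frozenset(">"), (">", ">"): frozenset(">"),
}
INVERSE = {"<": ">", "=": "=", ">": "<"}


def compose(r1, r2):
    """Union of the table entries for all pairs of base relations."""
    out = set()
    for a, b in product(r1, r2):
        out |= COMPOSE[(a, b)]
    return frozenset(out)


def invert(r):
    """Relation set seen from the opposite direction."""
    return frozenset(INVERSE[x] for x in r)


def propagate(n, restriction):
    """Tighten relation sets between n predicates via the composition table.

    restriction maps pairs (i, j) with i < j to a set of allowed relations;
    missing pairs are unrestricted. Returns the tightened sets, or None as
    soon as one set becomes empty (the pattern is inconsistent).
    """
    rel = {}
    for i in range(n):
        for j in range(i + 1, n):
            rel[(i, j)] = frozenset(restriction.get((i, j), ALL))
            rel[(j, i)] = invert(rel[(i, j)])
    changed = True
    while changed:
        changed = False
        for i, k, j in product(range(n), repeat=3):
            if len({i, k, j}) < 3:
                continue
            # Keep only relations between i and j that are possible via k.
            tightened = rel[(i, j)] & compose(rel[(i, k)], rel[(k, j)])
            if tightened != rel[(i, j)]:
                if not tightened:
                    return None
                rel[(i, j)] = tightened
                rel[(j, i)] = invert(tightened)
                changed = True
    return {(i, j): rel[(i, j)] for i in range(n) for j in range(i + 1, n)}


chain = propagate(3, {(0, 1): {"<"}, (1, 2): {"<"}})
clash = propagate(3, {(0, 1): {"<"}, (1, 2): {"<"}, (0, 2): {">"}})
```

In this toy setting, restricting the pairs (0, 1) and (1, 2) to `<` leaves only `<` for the pair (0, 2), and additionally demanding `>` for (0, 2) yields an empty relation set, which corresponds to an inconsistent pattern.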

For the concept restriction, the most restricted concept must be identified for each position, taking into account the predicate templates and instantiation. A pattern is inconsistent if during this process an empty temporal restriction was created for a pattern pair or if the set of concepts for some argument of the conjunctive pattern cannot be mutually subsumed (i.e., if there is a contradiction that a variable should be an instance of two different concepts which do not lie on the same path in the concept hierarchy).
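The selection of the most restricted concept and the detection of a contradiction can be sketched as follows. The sketch assumes that the concept hierarchy is a tree given as a child-to-parent map; the hierarchy `HIERARCHY`, the helper names, and the argument order of `is_a` (more special concept first) are assumptions made for illustration and are not taken from the source.

```python
# Hypothetical concept hierarchy: child -> parent (the root has no entry).
HIERARCHY = {"player": "object", "ball": "object", "goalkeeper": "player"}


def is_a(concept, ancestor, parent):
    """True if concept equals ancestor or lies below it in the tree."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = parent.get(concept)
    return False


def most_special(concepts, parent):
    """Most special concept of a set, or None if the concepts do not
    all lie on one path of the hierarchy (contradiction)."""
    best = None
    for c in concepts:
        if best is None or is_a(c, best, parent):
            best = c  # c is at least as special as everything seen so far
        elif not is_a(best, c, parent):
            return None  # neither concept subsumes the other
    return best


def specialize_concepts(candidates, parent):
    """Map each term to its most special concept.

    candidates maps a term to the set of concepts collected for it
    (predicate templates, previous restriction). Returns None if any
    term is contradictory, i.e., the pattern is inconsistent.
    """
    result = {}
    for term, concepts in candidates.items():
        spec = most_special(concepts, parent)
        if spec is None:
            return None
        result[term] = spec
    return result


ok = specialize_concepts({"X": {"object", "player"}, "Y": {"ball"}}, HIERARCHY)
bad = specialize_concepts({"X": {"goalkeeper", "ball"}}, HIERARCHY)
```

Here `ok` assigns `player` to `X` and `ball` to `Y`, whereas `bad` is `None` because `goalkeeper` and `ball` do not lie on a common path.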

With these new definitions it is possible to re-define the temporal pattern mining task: The goal is to find the set of all most special equivalent patterns of frequent patterns: $\mathcal{P}_{freq} = \{p' \mid p' = spec(p) \wedge p \in \dot{\mathcal{L}}_{tp} \wedge freq(p') = freq(p) \geq minfreq\}$.

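The mining process can be defined as:

$$
\begin{aligned}
\mathcal{P}_{freq} &= \mathcal{P}_0 \cup \ldots \cup \mathcal{P}_{\infty} \\
\mathcal{P}_0 &= \{\} \\
C_{i+1} &= \{p \mid p \in \dot\rho(p') \wedge p' \in \mathcal{P}_i\} \\
\mathcal{P}_{i+1} &= \{p \mid spec(p) = p \wedge p \in C_1 \cup \ldots \cup C_{i+1} \wedge freq(p) \geq minfreq \wedge |v(p)| = (i+1)\}
\end{aligned}
$$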
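The level-wise definition above can be sketched generically. In the following illustration, `refine`, `spec`, `freq`, and `level` are hypothetical callbacks standing in for the refinement operator, the specialization function, the frequency, and $|v(p)|$; the seed pattern `root` is an assumption as well. The toy instantiation is plain itemset mining, not temporal pattern mining, and was chosen only because it fits in a few lines.

```python
def mine(root, refine, spec, freq, level, minfreq, max_level):
    """Level-wise search for frequent, maximally special patterns."""
    previous = {root}            # plays the role of P_0 (assumed seed)
    frequent = set(previous)     # P_0 is part of the union
    candidates = set()           # C_1 ∪ ... ∪ C_{i+1}
    for i in range(max_level):
        for p in previous:
            candidates.update(refine(p))   # refinements of P_i
        previous = {
            p for p in candidates
            if spec(p) == p and level(p) == i + 1 and freq(p) >= minfreq
        }                                  # P_{i+1}
        frequent |= previous
    return frequent


# Toy instantiation (assumption): patterns are item sets.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}]
items = sorted(set().union(*transactions))


def refine(p):
    """Add one item that sorts after all items already in p."""
    start = max(p) if p else ""
    return {p | {x} for x in items if x > start}


def freq(p):
    """Fraction of transactions that contain all items of p."""
    return sum(p <= t for t in transactions) / len(transactions)


result = mine(frozenset(), refine, lambda p: p, freq, len, 0.5, 3)
```

Because only patterns from the previous level are refined, an infrequent pattern such as `{"b", "c"}` is never extended, which is the effect of the minimal frequency condition.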

Theorem 4.7 (Optimality of the equivalence-based mining operator). The refinement operator $\dot\rho$ with the minimal frequency condition and the maximal specificity condition is optimal for $\mathcal{P}_{freq}$.

Proof 4.7. The proof must show that all changes comply with the completeness and optimality properties. The new pattern space is restricted to those patterns that are frequent and maximally specialized within their equivalence class.

1. Completeness

As only those patterns whose frequency is above the $minfreq$ threshold are taken into account for generating patterns of the next refinement level, it is clear that fewer patterns are generated under this additional condition. At the same time, it is only required to find those patterns that exceed the $minfreq$ threshold. Due to the anti-monotonicity property of the frequency (Def. 4.3), for any given pattern $p$ with $freq(p) < minfreq$ none of the patterns subsumed by $p$ can be relevant for the mining process: whenever $p$ subsumes a pattern $p'$, it holds that $freq(p') \leq freq(p)$ and thus $freq(p') < minfreq$. It follows that the minimal frequency condition does not interfere with completeness.

In order to show that no relevant pattern is missed after applying the specialization function, the different refinements must be examined:

Lengthening: Specialization has no effect, i.e., $\forall p' \in \dot\rho_L(p) : p' = spec(p')$, i.e., no pattern is left out.

Temporal refinement: After restricting the set of possible temporal relations between a predicate pair, it is possible that the temporal constraint propagation procedure further specializes the pattern. The following cases have to be distinguished:

Case 1: Nothing is specialized, i.e., p=spec(p).

Case 2: Any of the temporal relation sets is empty after specialization, i.e., $(i, j, \emptyset) \in \mathcal{TR}'$ for the specialized temporal restriction $\mathcal{TR}'$. In this case, the pattern is inconsistent and has a frequency of zero; it is thus not relevant for the mining process, as it is defined that $minfreq > 0$.

Case 3: $n > 0$ of the directly succeeding temporal relation sets $\mathcal{TR}_{m+1}, \ldots, \mathcal{TR}_{m+n}$ of the recently restricted temporal relation $\mathcal{TR}_m$ have been restricted such that they all consist of only one element, i.e., $\forall k : |\mathcal{TR}_k| = 1$ with $1 \leq k \leq (m+n)$. This can be seen as $n$ temporal refinement operations without taking into account the remaining options that had been available in the temporal relation sets before. However, all these pruned options would have led to the inconsistent pattern anyway. This specialization only keeps the relevant combination but skips the intermediate refinement operations.

Case 4: If the directly succeeding temporal relation set is not completely restricted, i.e., $|\mathcal{TR}_{m+1}| > 1$, then the refinement level of the pattern does not change. All temporal relations that have been removed by the constraint propagation algorithm would have led to an inconsistent pattern, i.e., the search space (of upcoming refinement steps) is only pruned at the irrelevant parts of the search space.

Unification: The unification of two variables can lead to a specialization in the concept restriction. Let $c_1$ and $c_2$ be the corresponding concepts in the concept restriction of the unified variables. The following cases can be distinguished:

Case 1: Nothing is specialized, i.e., $p = spec(p)$.

Case 2: Neither is-a$(c_1, c_2)$ nor is-a$(c_2, c_1)$. In this case, the pattern is inconsistent, as it is not possible that an object is an instance of two different concepts without a common path in the concept hierarchy. This pattern and the ones subsumed by it are infrequent and thus irrelevant for the mining task.

Case 3: One concept subsumes the other one, i.e., is-a$(c_g, c_s)$ with $c_g = c_1 \wedge c_s = c_2$ or $c_g = c_2 \wedge c_s = c_1$. This prunes all patterns where the more general concept would have been specialized to any $c$ with is-a$(c, c_g) \wedge \neg$is-a$(c_s, c)$. The set of these concepts $C_{intermediate} = \{c \mid \text{is-a}(c, c_g) \wedge \neg\text{is-a}(c_s, c)\}$ can be divided into those that lie on the path from $c_g$ to $c_s$, i.e., $C_{path} = \{c_p \mid c_p \in C_{intermediate} \wedge \text{is-a}(c_p, c_s)\}$, and those that are outside this path, $C_{inc} = \{c_{inc} \mid c_{inc} \in C_{intermediate} \wedge \neg\text{is-a}(c_{inc}, c_s)\}$. For the concepts in $C_{inc}$ it is clear that they would have led to a contradiction, i.e., pruning them away does not affect any relevant pattern. The other concepts in $C_{path}$ lie on the path, and therefore these patterns are also elements of the equivalence class $[p']$ with $p' = spec(p)$. Leaving these patterns out only prunes patterns with the same equivalent maximal specialization pattern $p'$ that has been found already.

Concept restriction: The concept restriction just restricts the concept of a term in the conjunctive pattern. The patterns where the concept had not been set to this concept would still be members of the equivalence class and thus, leaving them out does not interfere with completeness, as the most special equivalent pattern for this equivalence class has been found already.

Instantiation: The instantiation operation is only allowed for direct instances of the concept in the concept restriction. Thus, no specialization of the concept restriction is performed, i.e., $p = spec(p)$.

2. Non-redundancy

For non-redundancy, the additional condition leaving out the patterns with a frequency below $minfreq$ is irrelevant, as patterns are just left out and the valid refinements are not affected. It has to be shown that the use of the specialization function does not affect non-redundancy for the different refinements:

Lengthening: This operation is never called after any of the other refinements was called and is not affected by the specialization function.

Temporal refinement: This refinement operator restricts the temporal relations consecutively from left to right without leaving out any predicate pair. The specialization operation only prunes options for future refinements; if any of the previously processed predicate pairs is further restricted (from one temporal relation to zero temporal relations), the pattern must be inconsistent and is not relevant for the mining task.

Unification: After applying the lengthening and temporal refinement operations, all resulting patterns are mutually different (lengthening results in unique sequences and for each unique sequence different completely constrained temporal restrictions are generated). The unification process itself also creates only distinct unification patterns. Redundancy could thus only occur w.r.t. the concept restriction. As the concept restriction is immediately specialized after unification, the most special equivalent pattern is only created once (and the intermediate patterns are left out as described above in the completeness part of the proof).

Concept refinement, instantiation: As none of the influencing operations (that can be affected by the specialization function) can be applied after these operations, the non-redundancy is not affected.

As it has been shown that none of the relevant patterns is left out by the changes and non-redundancy is still given, it follows that $\dot\rho$ with the minimal frequency condition and the maximal specificity condition is optimal for $\mathcal{P}_{freq}$.

Algorithm 7 MiTemP-main (Pattern Generation)

Input: $ds = (\mathcal{P}, \mathcal{O}, \text{direct-instance-of}, dss)$, $winsize$, $size_{min}$, $size_{max}$, $minfreq$, $maxlevel$ /* dynamic scene, window size, minimal and maximal pattern size, min. frequency, max. refinement level */

Output: All frequent patterns $\mathcal{P}_{freq}$ with $size_{min} \leq size \leq size_{max}$

1: Init $\mathcal{P}_{freq} = \emptyset$, $i = 1$

2: $C_i \leftarrow$ create_single_predicate_patterns() /* Create candidates for each pred. template */

3: while $C_i \neq \emptyset \wedge i \leq maxlevel$ do

4: $support[|C_i|] \leftarrow$ MiTemP-support($ds$, $winsize$, $C_i$)

5: $L_i = \{c_j \in C_i \mid \frac{support(c_j)}{max\_supp} \geq minfreq\}$

6: $\mathcal{P}_{freq} \leftarrow \mathcal{P}_{freq} \cup \{l \in L_i \mid size_{min} \leq size(l) \leq size_{max} \wedge v(l) = (v_l, v_t, v_u, v_c, v_i) \wedge (v_t = \frac{v_l \cdot (v_l - 1)}{2} \vee v_u = v_c = v_i = 0)\}$

7: $i \leftarrow i + 1$

8: $C_i =$ MiTemP-gen-lengthening($L_{i-1}$) $\cup$ MiTemP-gen-temp-refinement($L_{i-1}$) $\cup$ MiTemP-gen-unification($L_{i-1}$) $\cup$ MiTemP-gen-concept-refinement($L_{i-1}$) $\cup$ MiTemP-gen-instantiation($L_{i-1}$)

9: end while