Generalizations and Specializations of Patterns

MiTemP : Mining Temporal Patterns

4.4 Generalizations and Specializations of Patterns

Similar to Dehaspe [Deh98], we define a quasi-order “more general than” on the set of patterns. Following Dehaspe [Deh98, p. 80], for a given set $L$ and a binary relation $\preceq$ on $L$, $(L, \preceq)$ is a quasi-order if and only if $\preceq$ is reflexive and transitive:

$\forall x \in L: x \preceq x$ (Reflexivity)

$\forall x, y, z \in L: x \preceq y \text{ and } y \preceq z \Rightarrow x \preceq z$ (Transitivity)
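The two conditions can be checked mechanically on a finite set. The following is a minimal sketch, assuming a finite collection and a Python predicate standing in for the relation; the names `is_quasi_order`, `leq`, and `subsets` are illustrative and not part of the original text, and sets ordered by inclusion serve only as a toy stand-in for patterns.

```python
from itertools import product

def is_quasi_order(elements, leq):
    """Check reflexivity and transitivity of `leq` on a finite collection."""
    elements = list(elements)
    reflexive = all(leq(x, x) for x in elements)
    transitive = all(
        leq(x, z)
        for x, y, z in product(elements, repeat=3)
        if leq(x, y) and leq(y, z)
    )
    return reflexive and transitive

# Toy stand-in for patterns (assumption): sets ordered by inclusion.
subsets = [frozenset(s) for s in ("", "a", "b", "ab")]

print(is_quasi_order(subsets, lambda x, y: x <= y))  # inclusion
print(is_quasi_order(subsets, lambda x, y: x < y))   # proper inclusion is not reflexive
```

Note that a quasi-order need not be antisymmetric, so two distinct patterns may subsume each other.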

Our pattern description consists of the basic pattern, the temporal restriction, and the concept restriction. All these parts must be taken into account in the definition of the generalization (sub-pattern) relation:

Definition 4.37 (Generalization Relation). Let $p_1 = (cp_1, \mathcal{TR}_1, \mathcal{CR}_1)$ and $p_2 = (cp_2, \mathcal{TR}_2, \mathcal{CR}_2)$ be two temporal patterns with conjunctive pattern sizes $s_1 = size(cp_1)$ and $s_2 = size(cp_2)$, and let $cp_1 = ap_{1,1} \wedge \ldots \wedge ap_{1,s_1}$ and $cp_2 = ap_{2,1} \wedge \ldots \wedge ap_{2,s_2}$. $p_1$ subsumes $p_2$, written $p_1 \preceq p_2$, if and only if $s_1 \le s_2$ and there is an injective mapping $\mu: \{1, \ldots, s_1\} \to \{1, \ldots, s_2\}$ such that $\forall i, j: i < j \Rightarrow \mu(i) < \mu(j)$ (the order must be preserved).

Furthermore, $\forall i \in \{1, \ldots, s_1\}: pi_{1,i} = pi_{2,\mu(i)}$ with $ap_{1,i} = (pi_{1,i}, args_{1,i})$ and $ap_{2,\mu(i)} = (pi_{2,\mu(i)}, args_{2,\mu(i)})$. There must exist a substitution $\theta$ such that $(ap_{1,1}, \ldots, ap_{1,s_1})\theta = (ap_{2,\mu(1)}, \ldots, ap_{2,\mu(s_1)})$. It is also required that the temporal relation sets of $p_1$ are supersets of the mapped ones in $p_2$: $\forall i, j \in \{1, \ldots, s_1\}: \mathcal{TR}_{i,j} \supseteq \mathcal{TR}_{\mu(i),\mu(j)}$ with $i < j$, $(i, j, \mathcal{TR}_{i,j}) \in \mathcal{TR}_1$, and $(\mu(i), \mu(j), \mathcal{TR}_{\mu(i),\mu(j)}) \in \mathcal{TR}_2$.

Let $t_1 = arg(cp_1) = (t_{1,1}, \ldots, t_{1,l_1})$ and $t_2 = arg((ap_{2,\mu(1)}, \ldots, ap_{2,\mu(s_1)})) = (t_{2,1}, \ldots, t_{2,l_2})$ be the tuples of terms of the conjunctive patterns. Then $\forall i: (t_{1,i}, ci_{1,i}) \in \mathcal{CR}_1 \Rightarrow (t_{2,i}, ci_{2,i}) \in \mathcal{CR}_2 \wedge (\text{is-a}(ci_{2,i}, ci_{1,i}) \vee ci_{2,i} = ci_{1,i})$.

In other words, a pattern $p_1$ subsumes another pattern $p_2$ if there exists an order-preserving mapping from the elements of the conjunctive pattern of the general pattern to those of the special pattern such that the conjunctive pattern subsumes the conjunction of the mapped elements. Additionally, the temporal restrictions for all predicate pairs of the general pattern must be supersets of the corresponding temporal restrictions in the special pattern, and the concept restriction for each argument of the conjunctive pattern must subsume the corresponding concept in the concept restriction of the special pattern. The proper “more general than” relation $\prec$ is defined by:

$p_1 \prec p_2$ iff $p_1 \preceq p_2 \wedge p_2 \not\preceq p_1$.
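The structural part of the subsumption test can be sketched in code. This is a deliberately simplified sketch, not the full Definition 4.37: it checks only the order-preserving mapping, the equality of predicate identifiers, and the superset condition on the temporal relation sets, and it omits the substitution and the concept restriction. The pattern encoding (a list of predicate identifiers plus a dictionary of 0-based index pairs) and the function names are assumptions made for illustration.

```python
from itertools import combinations

def subsumes(p1, p2):
    """Simplified subsumption: mapping, predicate identifiers, temporal sets."""
    preds1, tr1 = p1
    preds2, tr2 = p2
    s1, s2 = len(preds1), len(preds2)
    if s1 > s2:
        return False
    # Each increasing index tuple is one order-preserving injective mapping mu.
    for mu in combinations(range(s2), s1):
        if any(preds1[i] != preds2[mu[i]] for i in range(s1)):
            continue
        # Temporal relation sets of p1 must be supersets of the mapped ones.
        if all(tr1[(i, j)] >= tr2[(mu[i], mu[j])]
               for i in range(s1) for j in range(i + 1, s1)):
            return True
    return False

def properly_more_general(p1, p2):
    """p1 subsumes p2, but p2 does not subsume p1."""
    return subsumes(p1, p2) and not subsumes(p2, p1)

both = frozenset({"older", "head-to-head"})
general = (["approaches", "inBallControl"], {(0, 1): both})
special = (["approaches", "inBallControl"], {(0, 1): frozenset({"older"})})
longer = (["approaches", "inBallControl", "inBallControl"],
          {(0, 1): frozenset({"older"}), (0, 2): both, (1, 2): both})

print(properly_more_general(general, special))
print(subsumes(general, longer))
```

Enumerating index combinations is exponential in the worst case; the sketch favors readability over efficiency.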

We also define the direct, proper “more general than” relation and denote it by $\prec_1$:

$tp_1 \prec_1 tp_2 \iff tp_1 \prec tp_2 \wedge \nexists\, tp_3: tp_1 \prec tp_3 \wedge tp_3 \prec tp_2$

The $\prec_1$ relation can be used for enumerating the pattern space. It allows for systematically searching for frequent patterns, starting from the most general pattern.

The pattern mining algorithm has to perform a search through the pattern space.

We define five different refinement operations in order to specialize a pattern: lengthening, temporal refinement, variable unification, concept refinement, and instantiation. If applied to a pattern $p$, each of these operations leads to a set of specialized patterns which are subsumed by $p$. These operations are similar to those defined by Lee [Lee06], with two differences: in our case predicates have a temporal extent, so interval relations are used for temporal restrictions, and Lee's operations do not include the concept refinement as defined here. We follow Lee's notation w.r.t. the refinement operations [Lee06].

Definition 4.38 (Refinement Operator). A refinement operator $\rho: L_{tp} \to 2^{L_{tp}}$ maps a pattern $p$ of the pattern language to a set of patterns such that $\rho(p) \subseteq \{p' \mid p \prec_1 p'\}$.

Definition 4.39 (Lengthening). Let $tp = ((ap_1, \ldots, ap_m), \mathcal{TR}, \{(t_1, ci_1), \ldots, (t_n, ci_n)\})$ be a temporal pattern. The lengthening operator $\rho_L$ adds an atomic pattern to the conjunctive pattern: $\rho_L(((ap_1, \ldots, ap_m), \mathcal{TR}, \{(t_1, ci_1), \ldots, (t_n, ci_n)\})) = \{((ap_1, \ldots, ap_{i-1}, ap', ap_i, \ldots, ap_m), \mathcal{TR}', \{(t_1, ci_1), \ldots, (t_n, ci_n), (V_1, ci'_1), \ldots, (V_{arity}, ci'_{arity})\})\}$ with $ap' = (pi, (V_1, \ldots, V_{arity}))$, $pt = (pi, (ci'_1, \ldots, ci'_{arity}))$, and $pi \in PI$. Variables $V_1, \ldots, V_{arity}$ must not occur in any of the previously existing atomic patterns and must be mutually unequal. The new temporal restriction is defined as

$\mathcal{TR}' = \{(1, 2, \mathcal{TR}'_{1,2}), \ldots, (m, m+1, \mathcal{TR}'_{m,m+1})\}$ with:

\[
\mathcal{TR}'_{k,l} =
\begin{cases}
\mathcal{TR}_{k,l} & \text{if } k, l < i \\
\mathcal{TR}_{k,l-1} & \text{if } k < i,\ l > i \\
\mathcal{TR}_{k-1,l-1} & \text{if } k, l > i \\
IR_{older} & \text{if } (k = i \vee l = i) \wedge ap_{new,k} >_{lex} ap_{new,l} \\
IR_{\le} & \text{if } (k = i \vee l = i) \wedge ap_{new,k} \le_{lex} ap_{new,l}
\end{cases}
\]

with $1 \le k < (m+1)$ and $1 < l \le (m+1)$, $k < l$.

The new temporal restriction in the definition above keeps the temporal relations for the existing pairs of atomic patterns and introduces new interval relation sets for the new atomic pattern in combination with all existing ones. Depending on the lexicographic order, the new interval relation set is set to $IR_{older}$ or $IR_{\le}$.
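The index shifting in the lengthening operator is easy to get wrong, so a small sketch may help. It assumes a temporal restriction stored as a dictionary from 1-based index pairs to relation sets, uses Python's tuple ordering as a stand-in for the lexicographic order on atomic patterns, and takes the contents of the two interval relation sets as parameters, since they are defined elsewhere. The function name `lengthen_tr` and the placeholder sets are hypothetical.

```python
def lengthen_tr(tr, m, i, new_atoms, ir_older, ir_leq):
    """Build TR' after inserting a new atomic pattern at 1-based position i.

    tr        : {(k, l): relation set} for 1 <= k < l <= m
    new_atoms : the m + 1 atomic patterns after insertion
    """
    new_tr = {}
    for k in range(1, m + 1):
        for l in range(k + 1, m + 2):
            if k == i or l == i:
                # Pair involving the new atomic pattern.
                if new_atoms[k - 1] > new_atoms[l - 1]:
                    new_tr[(k, l)] = ir_older
                else:
                    new_tr[(k, l)] = ir_leq
            elif l < i:
                new_tr[(k, l)] = tr[(k, l)]          # both before position i
            elif k < i:
                new_tr[(k, l)] = tr[(k, l - 1)]      # l was shifted by one
            else:
                new_tr[(k, l)] = tr[(k - 1, l - 1)]  # both were shifted by one
    return new_tr

# Placeholder relation sets (assumption): the real sets are defined elsewhere.
IR_OLDER = frozenset({"older"})
IR_LEQ = frozenset({"older", "head-to-head"})

old_tr = {(1, 2): frozenset({"older"})}
atoms = [("approaches", ("A", "B")),
         ("inBallControl", ("C", "D")),
         ("inBallControl", ("E", "F"))]  # new atomic pattern at position 3
print(lengthen_tr(old_tr, 2, 3, atoms, IR_OLDER, IR_LEQ))
```

Inserting at the end leaves all existing pairs untouched, while inserting at position 1 shifts every existing pair by one in both indices.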

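Definition 4.40 (Temporal Refinement). Let $p = (cp, \mathcal{TR}, \mathcal{CR})$ be a temporal pattern with $n = size(cp)$ atomic patterns. The set of temporal refinements for $p$ is defined as $\rho_T(p) = \{(cp, \mathcal{TR}', \mathcal{CR}) \mid \exists! i, j: (i, j, \mathcal{TR}_{i,j}) \in \mathcal{TR} \wedge |\mathcal{TR}'_{i,j}| + 1 = |\mathcal{TR}_{i,j}| \wedge \mathcal{TR}'_{i,j} \subset \mathcal{TR}_{i,j} \wedge \forall k, l: (k \ne i \vee l \ne j) \Rightarrow (k, l, \mathcal{TR}_{k,l}) \in \mathcal{TR}'\}$ with $i < j$ and $k < l$.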
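Read operationally, a temporal refinement removes exactly one interval relation from exactly one relation set and leaves all other sets unchanged. A minimal sketch under that reading follows; the dictionary encoding and the name `temporal_refinements` are assumptions, and skipping refinements that would produce an empty relation set is an extra assumption not stated in the definition.

```python
def temporal_refinements(tr):
    """Yield each TR' obtained by removing one relation from one set of `tr`."""
    for pair, relations in tr.items():
        if len(relations) < 2:
            continue  # assumption: do not generate empty relation sets
        for removed in sorted(relations):
            refined = dict(tr)  # all other pairs stay unchanged
            refined[pair] = relations - {removed}
            yield refined

tr = {(1, 2): frozenset({"older", "head-to-head"})}
refinements = list(temporal_refinements(tr))
for refined in refinements:
    print(refined)
```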

Definition 4.41 (Variable Unification). Let $tp = (cp, \mathcal{TR}, \mathcal{CR})$ be a temporal pattern and $V_1, V_2$ be variables occurring in $cp$. The variable unification operator unifies two variables $V_1$ and $V_2$ with $V_1 \ne V_2$: $\rho_U((cp, \mathcal{TR}, \mathcal{CR})) = \{(cp\theta, \mathcal{TR}, \mathcal{CR}') \mid \theta = \{V_1/V_2\}\}$ where $V_1$ and $V_2$ occur in $cp$.

In the concept restriction, the entry of $V_1$ must be removed and the entry of $V_2$ must be updated according to the previously defined concepts of $V_1$ and $V_2$: $\mathcal{CR}' = \mathcal{CR} \setminus \{(V_1, ci_1), (V_2, ci_2)\} \cup \{(V_2, ci'_2)\}$ with $ci'_2 = ci_2 \iff \text{is-a}(ci_2, ci_1)$ and $ci'_2 = ci_1 \iff \text{is-a}(ci_1, ci_2)$. The variables to be unified must be compatible: $\text{is-a}(ci_1, ci_2) \vee \text{is-a}(ci_2, ci_1)$.
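The update of the concept restriction can be illustrated with a short sketch. It assumes atomic patterns encoded as `(predicate, args)` tuples, a concept restriction stored as a dictionary, and a hypothetical two-level concept hierarchy `PARENT`; none of these names come from the original text. Treating two identical concepts as compatible is an additional assumption, since the stated compatibility condition only mentions the is-a relation.

```python
PARENT = {"player": "object", "ball": "object"}  # hypothetical hierarchy

def is_a(sub, sup):
    """True if `sub` is a proper (possibly indirect) sub-concept of `sup`."""
    while sub in PARENT:
        sub = PARENT[sub]
        if sub == sup:
            return True
    return False

def unify(cp, cr, v1, v2):
    """Replace v1 by v2 in cp and keep the more special concept for v2."""
    c1, c2 = cr[v1], cr[v2]
    if c1 == c2 or is_a(c2, c1):  # equality allowed by assumption
        merged = c2
    elif is_a(c1, c2):
        merged = c1
    else:
        return None  # incompatible concepts, no refinement
    new_cp = tuple(
        (pred, tuple(v2 if term == v1 else term for term in args))
        for pred, args in cp
    )
    new_cr = {term: concept for term, concept in cr.items() if term != v1}
    new_cr[v2] = merged
    return new_cp, new_cr

cp = (("approaches", ("A", "B")), ("inBallControl", ("C", "D")))
cr = {"A": "object", "B": "object", "C": "player", "D": "ball"}
print(unify(cp, cr, "C", "B"))  # C is replaced by B, B becomes a player
print(unify(cp, cr, "D", "C"))  # ball and player are incompatible
```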

Definition 4.42 (Concept Refinement). Let $(cp, \mathcal{TR}, \{(t_1, ci_1), \ldots, (t_n, ci_n)\})$ be a temporal pattern. A concept refinement replaces one of the $n$ concepts in the concept restriction by one of its direct sub-concepts: $\rho_C((cp, \mathcal{TR}, \{(t_1, ci_1), \ldots, (t_n, ci_n)\})) = \{(cp, \mathcal{TR}, \mathcal{CR}') \mid \mathcal{CR}' = \{(t_1, ci_1), \ldots, (t_{i-1}, ci_{i-1}), (t_i, ci'_i), (t_{i+1}, ci_{i+1}), \ldots, (t_n, ci_n)\}, \text{is-a}(ci'_i, ci_i)\}$ with $1 \le i \le n$.
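A concept refinement step can be sketched as a generator over all terms and their direct sub-concepts. The dictionary `DIRECT_SUBCONCEPTS` is a hypothetical toy hierarchy, and the function name and the dictionary encoding of the concept restriction are likewise assumptions for illustration.

```python
DIRECT_SUBCONCEPTS = {"object": ["ball", "player"]}  # hypothetical hierarchy

def concept_refinements(cr, direct_subconcepts):
    """Yield each CR' that replaces one concept by a direct sub-concept."""
    for term, concept in cr.items():
        for sub in direct_subconcepts.get(concept, ()):
            refined = dict(cr)  # all other entries stay unchanged
            refined[term] = sub
            yield refined

cr = {"A": "object", "B": "object", "C": "player", "D": "ball"}
refined_crs = list(concept_refinements(cr, DIRECT_SUBCONCEPTS))
for refined in refined_crs:
    print(refined)
```

Concepts without sub-concepts (here `player` and `ball`) yield no refinements, so the generator terminates once every term has reached a leaf of the hierarchy.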

It is possible to calculate the number of concept refinements for a term in a temporal pattern by computing the distance of the current concept to the most special concept of the predicate templates of the pattern:

Definition 4.43 (Concept Refinement Level). Let $tp = ((ap_1, \ldots, ap_n), \mathcal{TR}, \mathcal{CR})$ be a temporal pattern with atomic patterns $ap_i = (pt_i, (t_{i,1}, \ldots, t_{i,arity}))$ and $pt_i = (pi_i, (ci_{i,1}, \ldots, ci_{i,arity})) \in \mathcal{PT}$. The set of concept identifiers for a term $t$ in the temporal pattern $tp$ is then defined as $CI_t = \{ci_{j,k} \mid t_{j,k} = t\}$. The most special concept is $ci_{spec}$ if and only if $\forall ci \in CI_t: \text{is-a}(ci_{spec}, ci)$. As it is not allowed that two concepts mutually subsume each other (Def. 4.15), the distance between two concepts in the concept hierarchy is defined as: $dist(ci_1, ci_2) := |\{ci \in CI \mid \text{is-a}(ci_1, ci) \wedge ci = ci_2\}|$. The concept refinement level for a concept with identifier $ci$ w.r.t. a temporal pattern $tp$ is defined as: $crl(t, ci, tp) := dist(ci, ci_{spec})$.

Definition 4.44 (Instantiation). Let $tp = (cp, \mathcal{TR}, \mathcal{CR})$ be a temporal pattern, $V$ be a variable occurring in $cp$, and $o \in O$ be an object occurring in the dynamic scene. The binding of a variable $V$ to an object $o$ is denoted as instantiation. Similar to a variable unification, the set of refined patterns is defined by a substitution:

$\rho_I((cp, \mathcal{TR}, \mathcal{CR})) = \{(cp\theta, \mathcal{TR}, \mathcal{CR}\theta) \mid \theta = \{V/o\}\}$ where $V$ must occur in $cp$ and $o \in O$.
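Instantiation applies the same substitution to the conjunctive pattern and to the concept restriction. The sketch below assumes atomic patterns encoded as `(predicate, args)` tuples and a dictionary for the concept restriction; the function name `instantiate` is hypothetical, and the sketch does not check that the object is actually an instance of the required concept.

```python
def instantiate(cp, cr, var, obj):
    """Apply the substitution {var/obj} to cp and to the concept restriction."""
    new_cp = tuple(
        (pred, tuple(obj if term == var else term for term in args))
        for pred, args in cp
    )
    new_cr = {(obj if term == var else term): concept
              for term, concept in cr.items()}
    return new_cp, new_cr

cp = (("approaches", ("A", "B")), ("inBallControl", ("C", "D")))
cr = {"A": "object", "B": "object", "C": "player", "D": "ball"}
print(instantiate(cp, cr, "D", "b"))  # bind variable D to the object b
```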

Example 4.18. Let $p$ be a temporal pattern $p$ = (approaches(A, B) ∧ inBallControl(C, D), {(1, 2, {older, head-to-head})}, {(A, object), (B, object), (C, player), (D, ball)}). Then these are examples of valid refinements:

• Lengthening: $p_L$ = (approaches(A, B) ∧ inBallControl(C, D) ∧ inBallControl(E, F), {(1, 2, {older, head-to-head}), (1, 3, {older, head-to-head}), (2, 3, {older, head-to-head})}, {(A, object), (B, object), (C, player), (D, ball), (E, player), (F, ball)})

• Temporal refinement: $p_T$ = (approaches(A, B) ∧ inBallControl(C, D), {(1, 2, {older})}, {(A, object), (B, object), (C, player), (D, ball)})

• Unification: $p_U$ = (approaches(A, B) ∧ inBallControl(B, D), {(1, 2, {older, head-to-head})}, {(A, object), (B, player), (D, ball)})

• Concept refinement: $p_C$ = (approaches(A, B) ∧ inBallControl(C, D), {(1, 2, {older, head-to-head})}, {(A, ball), (B, object), (C, player), (D, ball)})

• Instantiation: $p_I$ = (approaches(A, B) ∧ inBallControl(C, b), {(1, 2, {older, head-to-head})}, {(A, object), (B, object), (C, player), (b, ball)})

The combination of the five refinement operations forms the refinement operator on the pattern language $L_{tp}$: $\rho(p) = \rho_L(p) \cup \rho_T(p) \cup \rho_U(p) \cup \rho_C(p) \cup \rho_I(p)$.

One important property in pattern mining approaches is the anti-monotonicity of the support w.r.t. specialization operators. It is used, e.g., in Apriori in order to prune the search space [AS94]: if an itemset is found to be not frequent, none of its specializations can be frequent due to the anti-monotonicity property.
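The pruning idea can be illustrated on plain itemsets. The following is a minimal levelwise sketch in the spirit of Apriori, not the algorithm of [AS94] in full detail and not the temporal pattern miner of this chapter; the function name, the absolute support threshold, and the toy transactions are assumptions for illustration.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Levelwise search that prunes candidates with an infrequent subset."""
    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    items = sorted({item for t in transactions for item in t})
    level = [frozenset([i]) for i in items
             if support(frozenset([i])) >= min_support]
    frequent = list(level)
    while level:
        known = set(level)
        size = len(level[0]) + 1
        candidates = {a | b for a in level for b in level
                      if len(a | b) == size}
        # Anti-monotonicity: a candidate with an infrequent subset
        # cannot be frequent, so it is dropped without counting.
        candidates = [c for c in candidates
                      if all(frozenset(s) in known
                             for s in combinations(c, size - 1))]
        level = [c for c in candidates if support(c) >= min_support]
        frequent.extend(level)
    return frequent

transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}]
for itemset in frequent_itemsets(transactions, min_support=2):
    print(sorted(itemset))
```

In this toy data the candidate {a, b, c} is never counted, because its subset {b, c} is already known to be infrequent.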

Theorem 4.3 (Anti-monotonicity of the refinement operator). All refinements of a pattern have the same or a smaller support than the pattern itself, i.e., $\forall sp \in \rho(p): supp(p) \ge supp(sp)$.

Proof 4.3. In order to prove the anti-monotonicity property, it must be shown that for each of the five refinement operations the support of the resulting patterns cannot increase in comparison to the support of the original pattern $p$. A pattern can only have a match where a generalization of the pattern has a match: for each atomic pattern in the conjunctive pattern we must find a matching predicate instance so that a valid substitution of the conjunctive pattern can be found and the temporal restriction and concept restriction are satisfied. There is only one operation which changes the size of the conjunctive pattern, namely the lengthening operation. In all other cases, the size of the conjunctive pattern does not change.

We show for both cases that the support cannot increase.

1. Temporal refinement, variable unification, concept refinement, instantiation:

In all these refinements, the size of the pattern does not change. A refined pattern $p'$ can only have a match where $p$ also has a match, i.e., $matches(p') \subseteq matches(p)$. For each match $m \in matches(p)$ there are only two possibilities: it either also satisfies the additional refinement, i.e., $m \in matches(p')$, or it does not, i.e., $m \notin matches(p')$. In particular, these cases are possible:

(a) Temporal refinement: Let $\mathcal{TR}'_{i,j} \subset \mathcal{TR}_{i,j}$ be the refined temporal restriction with $\mathcal{TR}'_{i,j} = \mathcal{TR}_{i,j} \setminus \{tr\}$ and match $M = \{(1, pred_1), \ldots, (n, pred_n)\}$ with $pred_i = (pi, o_1, \ldots, o_m, s_i, e_i, b)$. If $ir(s_i, e_i, s_j, e_j) \ne tr$ then $M \in matches(p')$, otherwise $M \notin matches(p')$.

(b) Variable unification: Let $V_1$ and $V_2$ be the two variables in $p$ before unification. For each match $m$ of the pattern $p$ a substitution $\theta$ with $\{V_1/o_1, V_2/o_2\} \subseteq \theta$ has been performed. After unification, only those matches of $p$ where both variables are substituted by the same object can also be matches of $p'$, i.e., $m \in matches(p')$ if and only if $o_1 = o_2$.

(c) Concept refinement: Let $\mathcal{CR} = \{(o_1, ci_1), \ldots, (o_i, ci_i), \ldots, (o_n, ci_n)\}$ and $\mathcal{CR}' = \{(o_1, ci_1), \ldots, (o_i, ci'_i), \ldots, (o_n, ci_n)\}$ be the concept restrictions of $p$ and $p'$, respectively, with $\text{is-a}(ci'_i, ci_i)$. If the corresponding object $o_i$ is still an instance of the sub-concept $ci'_i$, i.e., $\text{instance-of}(o_i, ci'_i)$, then $M \in matches(p')$; otherwise $M \notin matches(p')$.

(d) Instantiation: Let $V$ be the variable in $p$ that has been instantiated by $o_1$ and let $o_2$ be the instance that has been used in the substitution $\theta$ with $\{V/o_2\} \subseteq \theta$ in order to match $p$. Then $M \in matches(p')$ if and only if $o_1 = o_2$.

2. Lengthening: As lengthening increases the size of the conjunctive pattern, the matches of $p$ cannot be directly used as matches for $p'$. The reason is that $p'$ has one more atomic pattern and there is no corresponding predicate in the existing matches of $p$. Nevertheless, $p'$ can only have matches that extend a match $M \in matches(p)$ as, of course, the conjunctive pattern without the added atomic pattern $ap_{new}$ must also match. Thus, $matches(p)$ can be seen as the relevant set of sub-matches where matches of $p'$ can occur. Ignoring the temporal restriction and concept restriction for simplicity, a match $M'$ can only be a valid match of $p'$ if there is a match $M \in matches(p)$ with

$\forall i: (i, pred_i) \in M \Rightarrow \exists j: (j, pred_i) \in M'$ with $i \le j$ and there exists a predicate $(k, pred_{new}) \in M'$ with $pred_{new} = (pi, o_1, \ldots, o_n, s, e, true)$ which occurs concurrently to the match interval of $M$. It is required that $e \ge (s_{max} - w)$ and $s < e_{min} + w$.

If a predicate satisfying these conditions exists, it must be shown that it cannot extend the validity interval of match $M$. The validity interval of a match $M$ is $v_i = (s_{max_i}, e_{min_i} + w]$ by definition. The following cases can occur:

• $s \le s_{max_i}$: Then $s'_{max_i} = s_{max_i}$, i.e., the lower bound of the validity interval does not change.

• $s > s_{max_i}$: Then $s'_{max_i} > s_{max_i}$, i.e., the interval length decreases.

• $e \ge e_{min_i}$: Then $e'_{min_i} = e_{min_i}$, i.e., the upper bound of the validity interval does not change.

• $e < e_{min_i}$: Then $e'_{min_i} < e_{min_i}$, i.e., the interval length decreases.

Thus, it has been shown that the validity intervals of the matches of $p'$ can only be within the validity intervals of the matches of $p$, and it follows that $supp(p) \ge supp(p')$.
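The four cases of the lengthening argument reduce to one maximum and one minimum. As a minimal sketch, assuming that the latest start and the earliest end of a match are plain numbers and that an added predicate is valid on an interval from `s` to `e`, the new bounds can be computed as follows; the function name `narrowed_bounds` is hypothetical.

```python
def narrowed_bounds(s_max, e_min, s, e):
    """Latest start and earliest end after adding a predicate valid on [s, e]."""
    return max(s_max, s), min(e_min, e)

# One existing match with s_max = 3 and e_min = 8, and four added predicates.
for s, e in [(2, 9), (5, 9), (2, 6), (5, 6)]:
    new_s_max, new_e_min = narrowed_bounds(3, 8, s, e)
    # The bounds can only move inwards, never outwards.
    assert new_s_max >= 3 and new_e_min <= 8
    print((s, e), "->", (new_s_max, new_e_min))
```

Since the lower bound never decreases and the upper bound never increases, the validity interval of the extended match is always contained in the one of the original match.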

Fig. 4.10 illustrates an existing match ($A$ and $B$) and three cases of an additional predicate $C$. $C_1$ and $C_3$ are not relevant as they do not occur concurrently to the previous match interval. $C_2$ illustrates the only situation where the new interval lies within the previous match interval. As shown in the proof above, it does not matter if $C_2$ starts before or ends after the match interval.

Figure 4.10: Different cases for a match applying the lengthening operation

From Def. 4.36 it follows directly that $\forall sp \in \rho(p): freq(p) \ge freq(sp)$. It also holds that any pattern that can be generated by more than one application of the refinement operator must have a frequency equal to or less than the original pattern's frequency:

$q \preceq p \Rightarrow freq(p) \le freq(q)$

From the document: Temporal Pattern Mining in Dynamic Environments (pages 117-123)