

In the document Identification of nodes and Networks (pages 89-93)

3.5 Evolutionary Framework for the Identification of Influential Nodes

3.5.2 Pruning an existing method

We first consider (APc). From Section 3.4.4 we know that ARRS keeps or eliminates a new sequence S based on the global goal function ξg(·) (see also Algorithm 3.2). This strategy is, to some extent, inefficient since it always evaluates the whole sequence. That is, during an iteration, one part of S may lead to a better result while another part makes it worse, and if the worse part weighs more on F, then S is eliminated. In fact, such conflicts become more and more frequent as T increases^24. Indeed, the combination of ns and ru can mitigate this and give ARRS the ability to optimize a sequence locally (for instance, a large ns in tandem with a small ru), but that ability is always limited.

Now assume that there is a sequence S regarding a given network G(N, M) where each element of S corresponds to a unique node in N. We define a slice of S as

S_p(t_1, t'_1) = S[t_1 : t'_1], (3.35)

24 This is also the main reason that ARRS gets stuck in local optima when it is initialized by shallow strategies.

where t_1 < t'_1 are two given integers (refer to Section 2.1.2 for the definition of S[t_1 : t'_1]). The corresponding local average of Gp(q) follows

F(S_p(t_1, t'_1)) = \sum_{q = t_1/n}^{t'_1/n} G_p(q). (3.36)

Then, one can easily observe that Sp(t1, t′1) is independent of Sp(t2, t′2) if t′1 < t2 or t′2 < t1; that is, F(Sp(t1, t′1)) remains unchanged no matter which permutation Sp(t2, t′2) takes, and vice versa. Note that here the elements in both Sp(t1, t′1) and Sp(t2, t′2) are fixed, namely, they only change their orders internally.
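This independence can be seen in a small sketch. Here Gp(q) is stood in for by the giant-component fraction of a toy graph after node removal, so each term of the local sum in Eq. (3.36) depends only on the set of removed nodes; the names giant_fraction and local_F are illustrative, not the thesis implementation:

```python
from collections import Counter

def giant_fraction(edges, n, removed):
    """Toy order parameter Gp: fraction of nodes in the largest connected
    component after deleting the nodes in `removed` (union-find)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    alive = set(range(n)) - set(removed)
    for u, v in edges:
        if u in alive and v in alive:
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
    sizes = Counter(find(i) for i in alive)
    return max(sizes.values(), default=0) / n

def local_F(S, edges, n, t1, t1p):
    """Local sum of Eq. (3.36): Gp over the prefix lengths t1 .. t1'."""
    return sum(giant_fraction(edges, n, S[:t]) for t in range(t1, t1p + 1))

# A path graph 0-1-...-7 and a removal sequence S.
edges = [(i, i + 1) for i in range(7)]
S  = [3, 1, 4, 0, 5, 2, 6, 7]
S2 = [3, 1, 4, 0, 6, 5, 2, 7]   # S with the disjoint slice S[4:7] permuted

# F(Sp(1,3)) only sees the prefix sets S[:1], S[:2], S[:3], so it is identical.
assert local_F(S, edges, 8, 1, 3) == local_F(S2, edges, 8, 1, 3)
```

Permuting the later slice never changes any prefix *set* seen by the local sum, which is exactly the property exploited below.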

Holding the above property, we can try the following process to prune a given sequence S, i.e., locally optimize an existing method: i) randomly pick two integers and assign the smaller one to t1 and the other to t′1; ii) run ARRS on Sp(t1, t′1); and iii) repeat these two steps a number of times. Now another problem arises: how should we choose those two integers?
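The three-step process above can be sketched as a loop in which ARRS is replaced by a placeholder callable (arrs_local, a hypothetical stand-in that returns a reordering of the slice it receives):

```python
import random

def prune(S, arrs_local, rounds=1000, seed=0):
    """Skeleton of the pruning loop: repeatedly re-optimize a random slice.

    `arrs_local` stands in for running ARRS restricted to a slice; it must
    return a permutation of the slice it is given.
    """
    rng = random.Random(seed)
    S = list(S)
    n = len(S)
    for _ in range(rounds):
        # i) pick two integers; the smaller becomes t1, the larger t1'
        t1, t1p = sorted(rng.sample(range(n + 1), 2))
        # ii) re-optimize only Sp(t1, t1'); everything outside stays fixed
        S[t1:t1p] = arrs_local(S[t1:t1p])
    return S
```

The strategies PruOrd, PruGri, PruRan and PruRang below differ only in how t1 and t′1 are chosen inside this loop.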

3.5.2.1 PruOrd

[Figure 3.23 appears here: four panels (a)-(d), each plotting F for the initial sequences RanS, HubS, AHubS, APagS, ACIS and ABetS under the variants (0,0), (1,0), (0,1) and (1,1) of the respective pruning strategy.]

Figure 3.23: Performance of PruOrd, PruGri, PruRan and PruRang regarding F of different initial sequences compared to ARRS and ABetS (dashed lines). (a) PruOrd. (b) PruGri. (c) PruRan with uniformly random selection of t1. (d) PruRang considering Eq. (3.38). The difference of F between every two ticks is 0.002. Each result is the mean of 20 IIs.


The first strategy still follows a routine similar to ARRS and we call it prune orderly (PruOrd). Specifically, letting T̂p be the total number of pruning rounds and Tp be the current pruning iteration, PruOrd takes the following steps to achieve one round of pruning^25:

1) set a(Tp) = ⌊a(0) n Tp⌋, b1(Tp) = ⌊b1(0)(1 − δb1)^Tp⌋ and b2(Tp) = ⌊b2(0)(1 − δb2)^Tp⌋;

2) let t1 = a(Tp) and t′1 = t1 + b2(Tp);

3) run ARRS on Sp(t1, t′1);

4) let a(Tp) = a(Tp) + b1(Tp);

5) repeat 2), 3), and 4) until the termination is reached.
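The steps above amount to locking a growing prefix and sliding an overlapping window over the rest; a minimal sketch, with arrs_local again a hypothetical stand-in for ARRS restricted to a slice:

```python
import math

def pruord_round(S, arrs_local, Tp, a0, b1_0, db1, b2_0, db2):
    """One PruOrd round (sketch): lock a prefix a(Tp) that grows with Tp,
    then slide a window of length b2 over S with stride b1 < b2 (overlap)."""
    n = len(S)
    a = math.floor(a0 * n * Tp)                      # step 1: locked prefix
    b1 = max(1, math.floor(b1_0 * (1 - db1) ** Tp))  # stride, decays over rounds
    b2 = math.floor(b2_0 * (1 - db2) ** Tp)          # window, decays over rounds
    while a + b2 <= n:                               # step 5: sweep to the end
        t1, t1p = a, a + b2                          # step 2
        S[t1:t1p] = arrs_local(S[t1:t1p])            # step 3: ARRS on the slice
        a += b1                                      # step 4: advance by stride
    return S
```

The `max(1, …)` guard is an added safety net so a fully decayed stride cannot stall the sweep.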

Now let us look into the reason that we have those control parameters. From Sections 3.1 and 3.4 we know that the order parameter Gp(q) decreases as q increases and approaches 0 after the critical threshold qc, i.e., for q > qc. Hence, as S becomes more and more orderly, the effect on F of a node from the subcritical regime becomes smaller and smaller, especially for those at the beginning of S^26. Therefore, we use a(Tp) to lock those nodes, and the number of locked nodes increases with the rise of Tp. With respect to b1(Tp) and b2(Tp), which satisfy b1(Tp) < b2(Tp), their combination ensures interactions among different groups, whose significance has been demonstrated in Section 3.4. And similar to ARRS, we let both of them decrease as Tp increases.

To validate the effectiveness of PruOrd, T̂p = 10^4, a(0) = 0.9/T̂p, b1(0) = 0.1n, δb1 = 0.0001, b2(0) = 0.3n and δb2 = 0.0005 are used. Besides, since ARRS only runs on part of S, the following changes of its configuration are also considered: T̂ = 20, ru(0) = (t′1 − t1)/n and δru = δns = 0.5. Fig. 3.23a shows the corresponding results, where different combinations of ru(T) and ns(T) are studied as well. That is, for PruOrd(α1, α2), α1 = 1 corresponds to ru(T) = ru(0) and α2 = 1 represents that ns(T) is randomly chosen from [1, ns(0)]; otherwise, they follow the same strategies as ARRS. Moreover, if α2 = 1, we also associate ns(0) with Tp, e.g., here we let ns(0) = ns(0) + 1 if F shows no improvement for 10 rounds.

As we can see from Fig. 3.23a, PruOrd truly works, in particular PruOrd(0,1), and its performance relies on both α1 and α2. Comparing PruOrd(0,1) or PruOrd(1,0) with PruOrd(1,1), one can easily find that either α1 = 0 or α2 = 1 facilitates better results, and PruOrd(0,1) performs best. Better performance could also be achieved by tuning those control parameters, but it is usually difficult to find the optimal configuration, and different networks might need different ones. Nevertheless, PruOrd converges very fast and can reach a better result within 10 iterations than the compared methods mentioned in Section 3.4.

25 Note again that a(Tp), b1(Tp) and b2(Tp) are temporary parameters that we employ to help explain a method or strategy. Hence, one should refer to the specific place to check their associated meanings.

26Note that here a percolation process is considered instead of an attack process.

3.5.2.2 PruGri

The difficulty in managing those control parameters motivates us to simplify PruOrd.

Hence, we have another strategy which prunes S based on a grid search (PruGri). In detail, PruGri conducts a round of pruning in the following way:

1) give boundaries bl and bu satisfying bu − bl > 0, bl > 0 and bu < n;

2) let a(Tp) = 0 and obtain b2(Tp) by randomly picking an integer from [bl, bu];

3) set t1 = a(Tp) and t′1 = t1 + b2(Tp);

4) run ARRS on Sp(t1, t′1);

5) let a(Tp) = a(Tp) + b2(Tp);

6) repeat 3), 4) and 5) until the termination is reached.
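A sketch of one PruGri round under the same assumptions (arrs_local is a hypothetical stand-in for slice-restricted ARRS):

```python
import random

def prugri_round(S, arrs_local, bl, bu, seed=0):
    """One PruGri round (sketch): draw one window length b2 per round, then
    sweep disjoint windows [a, a+b2) across S. Because the windows never
    overlap within a round, they could be optimized in parallel."""
    rng = random.Random(seed)
    n = len(S)
    a = 0                                         # step 2: a(Tp) = 0
    b2 = rng.randint(bl, bu)                      # step 2: fixed for the round
    while a + b2 <= n:                            # step 6
        S[a:a + b2] = arrs_local(S[a:a + b2])     # steps 3-4
        a += b2                                   # step 5: grid step = window
    return S
```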

In this manner, we only need to tune bl and bu, excluding those of ARRS. Note that the random selection of b2(Tp) achieves a goal similar to that of the combination of b1(Tp) and b2(Tp) in PruOrd. One can of course let a(Tp) follow the same scheme as in PruOrd, but here our purpose is to make PruGri as simple as possible. Besides, another advantage of PruGri is that it can be parallelized, since b2(Tp) is fixed for each round and a(Tp) increases in steps of b2(Tp). As illustrated in Fig. 3.23b, where bl = 100 and bu = 0.2n are employed, PruGri(1,1) has the best performance on average. And compared to PruOrd (Fig. 3.23a), PruGri benefits more from existing strategies, except for a random sequence (i.e., RanS). Besides, considering ABetS, all four variants can find a smaller F than ARRS. One can also compare PruOrd and PruGri directly, that is, count the number of ticks regarding F, since the difference between every pair of ticks is the same.

3.5.2.3 PruRan and PruRang

Though PruGri can find a really good result, it is sometimes time-inefficient because it takes a grid search. Thus, we reach the third strategy, which prunes a sequence by repeatedly conducting ARRS on a random slice of S (PruRan). Specifically,

1) give the boundary bu satisfying 0 < bu < n;

2) assign a random integer a(Tp) drawn from [1, n] to t1 and another one b2(Tp) from [1, bu] for t′1 = t1 + b2(Tp), where t′1 < n must hold;

3) run ARRS on Sp(t1, t′1);

4) repeat 2) and 3) until the termination is reached.
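A minimal sketch of PruRan, using 0-based indices and the same hypothetical arrs_local stand-in; the window length is clipped instead of resampled, a simplification of step 2):

```python
import random

def pruran_round(S, arrs_local, bu, repeats=200, seed=0):
    """PruRan (sketch): repeatedly run ARRS on a uniformly random slice
    Sp(t1, t1') with a uniform start and a window length of at most bu."""
    rng = random.Random(seed)
    n = len(S)
    for _ in range(repeats):
        t1 = rng.randint(0, n - 2)                  # step 2: random start
        b2 = rng.randint(1, min(bu, n - 1 - t1))    # keep t1' < n
        t1p = t1 + b2
        S[t1:t1p] = arrs_local(S[t1:t1p])           # step 3
    return S
```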

Indeed, PruRan is similar to PruGri, but it gives us the freedom to control t1. For instance, rather than choosing t1 uniformly at random, we here consider a strategy similar to Eq. (3.27), that is, repeatedly sample a random integer a(Tp) ∈ [1, n] until it is successfully assigned to t1, which follows

t_1 = a(T_p), \quad \text{if } a(T_p) > b_1(T_p) \text{ or with a probability } A_p = \exp\left(-\frac{[a(T_p) - b_1(T_p)]^2}{2[b_2(T_p)]^2}\right). (3.37)


As we mentioned during the introduction of PruOrd, the supercritical region makes the main contribution to F, which is also the reason that we have Eq. (3.37). For example, letting b1(Tp) = qc n, Eq. (3.37) is less likely to consider nodes at the beginning of S. Further, if b2(Tp) → ∞, then Ap → 1 and Eq. (3.37) degenerates to the original PruRan. Here we consider the settings of

b_1(T_p) = q_c n \quad \text{and} \quad b_2(T_p) = \frac{q_c n}{b_2(0)\,\alpha^{T_p}}, (3.38)

with α > 1 to ensure that a(Tp) gradually converges to qc n from the subcritical regime as Tp increases. Note that qc also changes as Tp rises.
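The acceptance rule of Eq. (3.37) with the schedules of Eq. (3.38) amounts to a rejection sampler for t1. A sketch under stated assumptions: sample_t1 is an illustrative name, and qc is passed in as a fixed constant even though the text notes it changes with Tp:

```python
import math
import random

def sample_t1(n, qc, Tp, b2_0=0.01, alpha=1.001, seed=0):
    """Rejection sampler for the PruRang start index (Eqs. (3.37)-(3.38)):
    candidates beyond b1 = qc*n are always accepted; earlier candidates are
    accepted only with the Gaussian probability A_p, whose width b2 starts
    large (near-uniform sampling) and shrinks as Tp grows."""
    rng = random.Random(seed)
    b1 = qc * n                              # Eq. (3.38)
    b2 = qc * n / (b2_0 * alpha ** Tp)       # decays toward 0 with Tp
    while True:
        a = rng.randint(1, n)                # candidate a(Tp) from [1, n]
        if a > b1:
            return a                         # always accepted
        Ap = math.exp(-((a - b1) ** 2) / (2 * b2 ** 2))
        if rng.random() < Ap:
            return a                         # accepted with probability A_p
```

At Tp = 0 the width b2 is large and nearly every candidate is accepted; at large Tp only starts at or beyond qc·n survive, concentrating the pruning near the critical threshold.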

To distinguish the variant considering Eq. (3.37) from the original PruRan, we mark it with a 'g', namely, PruRang. Figs. 3.23c and 3.23d illustrate the comparisons of PruRan (with bu = 0.2n), PruRang (with bu = 0.2n and using Eq. (3.38) with b2(0) = 0.01 and α = 1.001) and other methods. As we can see, both PruRan and PruRang show the best performance on average when α1 = α2 = 1, and PruRang is more capable of approaching PruGri.

3.5.2.4 Summary

To sum up, considering the performance based on different initial sequences, those strategies can indeed narrow the gap between ABetS and other shallow methods, such as PruGri(1,1) initialized with HubS. Besides, it is worth mentioning that all four tested strategies have the ability to surpass ABetS, even though they are based on a shallow initial sequence. For instance, PruGri(1,1) performs better than ABetS in 10 out of 20 results for the case of the initial sequence drawn based on HubS. In addition, in what follows, if there is no specific explanation, all four strategies are considered with α1 = α2 = 1, where PruGri, PruRan, and PruRang have their best performance on average.

Moreover, in the following sections, since we only aim to detect whether a new strategy works, ns(0) = ns(0) + 1 is applied if 10 rounds of pruning result in no change of F, and a method terminates if either ns(0) > 50 or T̂p is reached. These two changes speed up those strategies but certainly reduce their effectiveness.
