The Randomness and Robustness of Triadic, Tetradic, and Pentadic Evolution Patterns in a Large‑Scale Mobile Phone Network


ORIGINAL RESEARCH


Cheng Wang¹

Received: 4 December 2020 / Accepted: 24 August 2021 / Published online: 1 September 2021

Abstract

Network evolution has been one of the most active topics in social network analysis in recent years. This paper investigates the randomness and robustness of triadic, tetradic, and pentadic evolution patterns in a communication network of over 10 million subscribers of a Spanish mobile phone company over a 65-day period in 2008. The data set is split into two time periods of 28 days, or 4 weeks, each. By fixing key parameters at the nodal and dyadic levels as in the observed network transition, randomly simulated networks are generated from the observed network at the first time period to examine how random the real transition was at the triadic level. We find that neither triangle (a ring of 3 nodes) decay nor triangle formation followed a random process. In other words, they might be explained by factors other than the nodal and dyadic evolution processes. Moreover, by fixing key parameters at the triadic level as well, randomly simulated networks are generated from the observed network at the first time period to examine how random the configuration decay and formation patterns were at the tetradic and pentadic levels. We find that the decay of squares (rings of 4 nodes) and pentagons (rings of 5 nodes) followed a stochastic random process, but the formation of these two configurations did not.

Keywords Triadic evolution · Tetradic evolution · Pentadic evolution · Randomness · Robustness · Simulation

Introduction

Faust conducted a systematic analysis of 159 social networks and found that properties at nodal and dyadic levels accounted for 90% of variability in triad census. However, the network size only ranged from 4 to 217 in this study [1].

Will this conclusion still hold in large-scale social networks with millions of nodes and tens of millions of edges that change over time? To what extent is configuration evolution at the tetradic and pentadic levels also predicted by lower-level network features?

To answer these research questions, in this study a large-scale network dataset from a Spanish mobile phone company is separated into two time periods to investigate the network evolution patterns at the triadic, tetradic, and pentadic levels.

More specifically, we randomly remove and add the same number of nodes and edges as in the real transition and generate simulated networks from the observed network at the first time period to examine how random and robust the real transition at the triadic level was. In addition, by also randomly removing and adding the same number of triangles as in the real transition, more simulated networks are generated from the real network of the first time period to check how random and robust the configuration decay and formation processes were at the tetradic and pentadic levels.

In so doing, this study has three main contributions. First, it uses a large-scale dataset, or “big data”, to revisit the classic network theory advanced by Faust [1]. Second, it pushes the network motif research to the tetradic and pentadic levels, while most previous work stops at the dyadic and triadic level. Third, it provides detailed and feasible simulation designs that future research can adopt and replicate in other network evolution studies.

* Cheng Wang

chengwang@wayne.edu

1 Department of Sociology, Wayne State University, Detroit, MI, USA


Dataset

The dataset consists of communication events from over 10 million subscribers of a Spanish mobile phone company¹ over a 65-day period in 2008. The network data from this mobile phone company have been used in many publications showing its face and criterion validity [2–10].

The dataset provides information on each dyad, including the originating phone number, call type (voice call or SMS which refers to short message service), destination phone number,² total number of voice calls or short messages, and total durations of all voice calls (there is no duration for short messages). We focus only on the data of voice calls³ between in-system subscribers who were customers of the mobile phone company. Any communication events outside the system, i.e., voice calls to or from a person who was not a customer of the company, are excluded.

The dataset is split into two time periods. As shown in Table 1, the first time period (t₁) is from August 3, 2008 (Sunday), to August 30, 2008 (Saturday), and the second time period (t₂) immediately follows it, running from August 31, 2008 (Sunday), to September 27, 2008 (Saturday). Each time period covers 28 days, or 4 weeks.⁴

For each time period, directed edges are aggregated from voice calls in the original communication records and screened by the customer list to extract those between subscribers only. Then the edges between subscribers are symmetrized using double counting.⁵ Subscribers who took part in at least one voice call enter the active node list. The distribution of nodal degree, i.e., the number of contacts for each subscriber, is similar across the two time periods.
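The symmetrization and degree count described above can be sketched as follows. This is a minimal Python illustration, not the author's code; the function names `symmetrize` and `degrees` and the toy call list are hypothetical, and dropping self-calls is an assumption not stated in the text.

```python
from collections import defaultdict

def symmetrize(directed_edges):
    """Double counting: store both (i, j) and (j, i) for every observed call pair."""
    sym = set()
    for i, j in directed_edges:
        if i != j:  # assumption: self-calls are ignored
            sym.add((i, j))
            sym.add((j, i))
    return sym

def degrees(sym_edges):
    """Number of distinct contacts for each active node."""
    deg = defaultdict(int)
    for i, _ in sym_edges:
        deg[i] += 1
    return dict(deg)

# Toy directed call records between subscribers (hypothetical data).
calls = [("a", "b"), ("b", "a"), ("a", "c"), ("d", "c")]
edges = symmetrize(calls)
print(len(edges))      # 6 ordered pairs, i.e. 3 undirected edges
print(degrees(edges))  # active nodes and their number of contacts
```

Because every undirected edge appears in both orientations, the neighbours of a node can be read off by filtering on the first element alone, which is the convenience footnote 5 alludes to.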

Transition from t₁ to t₂

Transition at the Nodal Level

An undirected network (or graph) is generally represented as G(V, E), with the number of nodes (or vertices) |V| at the nodal level and the number of edges |E| at the dyadic level. There were about 6.7 million nodes active at t₁ (i.e., |V_t₁|) and nearly 6.8 million nodes active at t₂ (i.e., |V_t₂|).

As shown in Fig. 1, over these two time periods about 90% of the nodes persisted, 10% of the nodes became inactive in the network, and another 10% of the nodes became newly active in the network.
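The three-way split into persisting, decayed, and new elements amounts to plain set operations. A minimal Python sketch, assuming node identifiers are hashable; the helper name `transition_sets` and the toy sets are hypothetical:

```python
def transition_sets(before, after):
    """Split two snapshots into persisting, decayed, and new elements."""
    before, after = set(before), set(after)
    return {
        "persisted": before & after,  # present in both periods
        "decayed": before - after,    # present only in the first period
        "new": after - before,        # present only in the second period
    }

# Toy node lists for the two periods (hypothetical data).
nodes_t1 = {"a", "b", "c", "d"}
nodes_t2 = {"b", "c", "d", "e"}
parts = transition_sets(nodes_t1, nodes_t2)
share_persisted = len(parts["persisted"]) / len(nodes_t1)
print(parts, share_persisted)
```

With the counts reported in Tables 1 and 2, 6,062,982 of the 6,719,330 nodes at t₁ persisted (90.23%), which leaves 6,787,249 − 6,062,982 = 724,267 newly active nodes at t₂. The same helper applies to edge sets if each undirected edge is stored in one canonical form.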

Transition at the Dyadic Level

There were nearly 32 million edges at t₁ (i.e., |E_t₁|) and over 32 million edges at t₂ (i.e., |E_t₂|). As shown in Fig. 2, over these two time periods about 55% of the edges persisted, about 45% of the edges decayed, and about 45% of the edges were newly formed. This was a great deal of turnover at the dyadic level compared with the transition at the nodal level.

Table 1 Descriptive statistics of the two time periods

Statistic                                              08/03/08–08/30/08    08/31/08–09/27/08
# Records                                              446,149,945          451,627,693
# Voice calls                                          300,373,516          307,886,967
# Edges                                                79,347,825           81,944,649
# Edges between subscribers                            21,300,357           21,596,448
# Edges between subscribers when ignoring directions   31,823,222           32,480,540
# Nodes                                                6,719,330            6,787,249
Degrees: Minimum                                       1                    1
Degrees: Maximum                                       600                  645
Degrees: 95%                                           13                   13

1 This mobile phone company held about 30 percent of the market in 2008.

2 Both the originating and destination phone numbers were encrypted and hashed for security reasons.

3 There are several reasons why we only use voice calls as indicators of social ties but exclude SMS. First, for SMS we only have one measure of tie strength (frequency), while for voice calls we have two measures (frequency and duration). Second, SMS usage was not universal but very age related in 2008, i.e., youths were disproportionately heavy users of SMS. Third, only four percent of social ties were purely SMS-based ties, and most subscribers used both voice calls and SMS for communication.

4 We attempted to split the data into weekly or biweekly periods, but the communication pattern was not stable on a weekly or biweekly basis. As a result, we chose four-week periods for the network evolution analysis.

5 Double counting means that the edge list contains both e_ij and e_ji, which are the same edge in an undirected network. Technically, double counting makes it easier to generate the list of 2-paths/structural holes as well as the list of triangles.

Among the decayed edges, most disappeared while both nodes involved stayed active at t₂, a few disappeared because both nodes involved became inactive in the network, and some disappeared because one of the nodes involved became inactive while the other stayed active.

Among the new edges, most formed between nodes that stayed active over time, a few formed between two new nodes, and some formed between one node that stayed active and one new node.
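The breakdown of decayed edges by the status of their endpoints can be sketched as below. This is an illustrative Python snippet under the assumption that each undirected edge is stored once as a sorted pair; the function name and labels are hypothetical.

```python
def classify_decayed_edges(edges_t1, edges_t2, nodes_t2):
    """Count decayed edges by how many of their endpoints are still active at t2."""
    labels = {2: "both active", 1: "one active", 0: "neither active"}
    counts = {label: 0 for label in labels.values()}
    for u, v in set(edges_t1) - set(edges_t2):
        still_active = (u in nodes_t2) + (v in nodes_t2)
        counts[labels[still_active]] += 1
    return counts

# Toy example (hypothetical data): edges as sorted node pairs.
edges_t1 = {(1, 2), (2, 3), (3, 4), (5, 6)}
edges_t2 = {(1, 2), (1, 3)}
nodes_t2 = {n for e in edges_t2 for n in e}  # nodes active at t2
decay_counts = classify_decayed_edges(edges_t1, edges_t2, nodes_t2)
print(decay_counts)
```

Swapping the roles of the two periods (new edges against the node list at t₁) gives the corresponding breakdown for edge formation.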

Transition at the Triadic Level

We use T to indicate triangles at the triadic level. As shown in Fig. 3, there were 5.6 million isomorphic triangles at t₁ (i.e., |T_t₁|) and about 41% of them survived to t₂. The disintegrated triangles were more likely to lose one of their edges and turn into structural holes⁶ than to lose two edges or turn into null triads.

When a triangle persisted or turned into a structural hole, the three nodes involved in it stayed active. But when it lost two or more edges, the loss might be due to the dormancy of node(s).

When a triangle lost two edges and only one of its edges survived to t₂, it is more likely that the third node, the one not in the surviving edge, stopped acting as a common contact than that it became inactive.

When a triangle lost all three edges and turned into a null triad, in most cases all three nodes persisted to t₂, in some cases one node was lost, and in very few cases two or all three nodes were lost.
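The decay outcomes described above (triangle, structural hole, single dyad, null triad) depend only on how many of a triangle's three edges are still present at t₂. A small Python sketch of that classification follows; it is an illustration under the assumption of sortable node identifiers, and the helper names are hypothetical rather than taken from the paper's scripts.

```python
from itertools import combinations

def adjacency(edges):
    """Build an undirected adjacency map from a list of node pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def triangles(adj):
    """Enumerate each triangle once as a sorted node triple."""
    found = set()
    for u in adj:
        for v, w in combinations(sorted(adj[u]), 2):
            if w in adj[v]:
                found.add(tuple(sorted((u, v, w))))
    return found

FATE = {3: "triangle", 2: "structural hole", 1: "single dyad", 0: "null triad"}

def triangle_fates(edges_t1, edges_t2):
    """Classify every triangle at t1 by the number of its edges that survive at t2."""
    adj1, adj2 = adjacency(edges_t1), adjacency(edges_t2)
    counts = {name: 0 for name in FATE.values()}
    for a, b, c in triangles(adj1):
        kept = sum(1 for u, v in ((a, b), (a, c), (b, c)) if v in adj2.get(u, ()))
        counts[FATE[kept]] += 1
    return counts

# Toy example (hypothetical data): three triangles at t1, each with a different fate.
e1 = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (3, 5), (5, 6), (6, 7), (5, 7)]
e2 = [(1, 2), (2, 3), (1, 3), (3, 4), (5, 6), (6, 7)]
fates = triangle_fates(e1, e2)
print(fates)
```

Calling `triangle_fates(e2, e1)` with the periods swapped gives the mirror-image view used for triangle formation, where the counts describe what each triangle at t₂ looked like at t₁.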

Turning to triangle formation, as shown in Fig. 4, there were 5.3 million isomorphic triangles at t₂ (i.e., |T_t₂|) and about 43% of them had survived from t₁. The new triangles were more likely to form by closing the third edges of structural holes than to develop from single dyads or null triads.

Fig. 1 Transition over time at the nodal level

Fig. 2 Transition over time at the dyadic level

6 A structural hole is a network motif of two nodes sharing one or more common contacts but lacking a direct edge between them.

When a triangle survived over time or formed from a structural hole, all three nodes involved in it stayed active.

Among the triangles that developed from single dyads, it is more likely that a surviving node started to act as a common contact than that two contacts gained a new node as a common contact.

Among the triangles that originated from null triads, most emerged among three persisting nodes, some emerged among two persisting nodes and one new node, and very few emerged among one persisting node and two new nodes or among three new nodes.

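Fig. 3 Triangle decay over time. Outcomes at t₂ of the 5,597,411 triangles observed at t₁:

Survived as triangles: 2,290,687 (40.92%)
Turned into structural holes (one edge lost): 1,917,322 (34.25%)
Turned into single dyads (two edges lost): 1,134,861 (20.27%), of which 990,799 (17.70%) with all three nodes still active and 144,062 (2.57%) with the third node inactive
Turned into null triads (three edges lost): 254,541 (4.55%), of which 182,492 (3.26%) with all three nodes still active, 58,416 (1.04%) with one node lost, 11,126 (0.20%) with two nodes lost, and 2,507 (0.04%) with all three nodes lost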

Simulated Transitions from t₁ to t′₂ When Specifying Characteristics at Nodal and Dyadic Level

By fixing key characteristics at the nodal and dyadic levels, we simulate networks at t′₂ from the observed network at t₁. Then we compare the real transition with the simulated ones to check how random and robust the real transition pattern was.

Process 1: randomly removing nodes and edges from the observed network at t₁.

The simulation consists of two processes. First, we randomly remove about 10% of the nodes as in Fig. 1 and about 45% of the edges as in Fig. 2 from the observed network at t₁. There are three options we can use to remove the nodes:

1. Randomly remove about 10% of the nodes from the active nodes at t₁;

2. Based on the degree distribution of the active nodes at t₁, randomly take about 10% of the nodes out of each degree group; and

3. Based on both the degree distribution of the active nodes at t₁ and the survival rate of nodes in each degree group from t₁ to t₂, randomly choose the persisting nodes from each degree group.

Fig. 4 Triangle formation over time. Origins at t₁ of the 5,312,242 triangles observed at t₂:

Survived from t₁ as triangles: 2,290,687 (43.12%)
Formed from structural holes (one edge gained): 1,720,099 (32.38%)
Formed from single dyads (two edges gained): 1,044,879 (19.67%), of which 850,105 (16.00%) among three persisting nodes and 194,774 (3.67%) with one new node
Formed from null triads (three edges gained): 256,577 (4.83%), of which 153,401 (2.89%) among three persisting nodes, 74,088 (1.39%) with one new node, 22,763 (0.43%) with two new nodes, and 6,325 (0.12%) among three new nodes

We check the nodal degrees of the active nodes at t₁: the distribution is right-skewed, with most nodes at low degrees and a long tail to the right, meaning there were many more low-degree nodes than high-degree nodes, or hubs. The proportion of nodes that persisted in each degree group is shown in Table 2. The survival rate was lowest for nodes of degree one and came close to 100% as the nodal degree increased.

After running each of the three options 20 times,⁷ we find that the common problem with the first two options is that too many edges are removed as a side effect of node removal: in the real transition from t₁ to t₂ about 2.1 million edges decayed because one or both of the nodes participating in them became dormant, but this number rises to about 5.9 million under the first two options.

For the third option, on average about 2.2 million edges are removed due to the disappearance of one or both nodes participating in them, which is much closer to the number in the real transition.

By adopting the third option, we remove about 10% of the nodes and 2.2 million edges from the observed network at t₁. Since about 14.3 million ties decayed from t₁ to t₂, we continue to remove 12.1 (i.e., 14.3 − 2.2) million ties between the persisting nodes.

Note that we cannot match exactly the number of decayed nodes and edges in this step, but the numbers in the simulated networks are very close.
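The removal process under the third option can be sketched in Python as follows. This is a simplified illustration, not the published R script: the function names are hypothetical, `survival_rate` is assumed to be keyed by exact degree (the pooled bins of Table 2 such as 31–35 are left out), and edges are assumed to be stored once as node pairs.

```python
import random
from collections import defaultdict

def remove_nodes_by_degree_group(degree, survival_rate, rng):
    """Within each degree group, keep a random share of nodes equal to its survival rate."""
    groups = defaultdict(list)
    for node, d in degree.items():
        groups[d].append(node)
    survivors = set()
    for d, members in groups.items():
        n_keep = round(len(members) * survival_rate[d])
        survivors.update(rng.sample(members, n_keep))
    return survivors

def remove_edges(edges, survivors, n_decay_total, rng):
    """Drop edges that touch removed nodes, then drop random edges between
    survivors until n_decay_total edges are gone in total."""
    kept = [e for e in edges if e[0] in survivors and e[1] in survivors]
    lost_with_nodes = len(edges) - len(kept)
    extra = min(max(0, n_decay_total - lost_with_nodes), len(kept))
    rng.shuffle(kept)
    return kept[extra:]

rng = random.Random(0)
# Toy degree table (hypothetical): 100 degree-1 nodes, 50 degree-5 nodes.
degree = {i: 1 for i in range(100)}
degree.update({100 + i: 5 for i in range(50)})
survivors = remove_nodes_by_degree_group(degree, {1: 0.7, 5: 1.0}, rng)

# Toy edge list (hypothetical): a path 0-1-...-10 in which node 10 is removed.
path = [(i, i + 1) for i in range(10)]
remaining = remove_edges(path, set(range(10)), 4, rng)
print(len(survivors), len(remaining))
```

In the paper's numbers, the edges lost together with removed nodes correspond to the roughly 2.2 million, and the `extra` edges removed between survivors correspond to the remaining 12.1 million.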

Process 2: randomly adding new nodes and edges to the simulated networks

After removing nodes and edges in process 1, we continue to add new nodes at the nodal level. Since in the simulated networks the number of nodes removed in process 1 is not exactly the same as in the real transition from t₁ to t₂ , we add about 724 thousand new nodes.

At the dyadic level, we first randomly link the new nodes with persisting nodes so that each new node participates in at least one edge. Then we randomly add edges until we reach the expected number of edges in the observed network at t₂. Since in the simulated networks the numbers of edges removed in process 1 are not exactly the same as in the real transition from t₁ to t₂, we add about 12.2 million new edges between the persisting nodes.
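A minimal Python sketch of this addition process is given below. It is an illustration only: the function name is hypothetical, node identifiers are assumed to be sortable, each undirected edge is stored once as a sorted pair, and the guard against unreachable targets is an added safety check rather than part of the described procedure.

```python
import random

def add_nodes_and_edges(edges, persisting, new_nodes, target_edges, rng):
    """Attach each new node to one random persisting node, then add random
    edges between persisting nodes until target_edges is reached."""
    edges = set(edges)
    pool = sorted(persisting)
    for node in new_nodes:
        partner = rng.choice(pool)
        edges.add((min(node, partner), max(node, partner)))
    # Safety check (assumption): the target must be reachable.
    max_edges = len(pool) * (len(pool) - 1) // 2 + len(new_nodes)
    if target_edges > max_edges:
        raise ValueError("target_edges exceeds the number of possible edges")
    while len(edges) < target_edges:
        u, v = rng.sample(pool, 2)
        edges.add((min(u, v), max(u, v)))  # duplicates are absorbed by the set
    return edges

rng = random.Random(7)
start = {(0, 1), (1, 2)}
result = add_nodes_and_edges(start, range(10), [10, 11], 8, rng)
print(sorted(result))
```

Because the second loop draws only from persisting nodes, each new node ends up with exactly one edge in this sketch.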

The algorithm of the above-mentioned simulation processes can be programmed as:

1. Read and symmetrize E_t₁
2. Extract V_t₁
3. Read and symmetrize E_t₂
4. Extract V_t₂
5. Extract the persisting, decayed, and new nodes from V_t₁ and V_t₂
6. Extract the persisting, decayed, and new edges from E_t₁ and E_t₂
7. Randomly remove nodes by survival rate of each degree group
8. Randomly remove edges between persisting nodes
9. Add new nodes and randomly link them to persisting nodes
10. Randomly add edges between nodes
11. Output the simulated edge list

The R script that can be used to replicate this algorithm is publicly available at https://github.com/socnetfan/Tsim/blob/main/algorithm1.R.

Table 2 Nodal survival patterns in the observed network

Degree    Frequency at t₁    Survival frequency at t₂    Survival rate at t₂ (%)
1         1,361,155          944,294                     69.37
2         1,109,930          976,629                     87.99
3         924,887            872,636                     94.35
4         743,662            720,242                     96.85
5         583,993            572,406                     98.02
6         450,306            443,887                     98.57
7         345,238            341,447                     98.90
8         263,848            261,462                     99.10
9         201,633            200,050                     99.21
10        154,478            153,341                     99.26
11        118,996            118,186                     99.32
12        92,748             92,152                      99.36
13        72,048             71,591                      99.37
14        56,839             56,480                      99.37
15        44,703             44,473                      99.49
16        35,355             35,151                      99.42
17        28,205             28,044                      99.43
18        22,878             22,769                      99.52
19        18,420             18,327                      99.50
20        14,853             14,774                      99.47
21        12,111             12,037                      99.39
22        9,740              9,678                       99.36
23        8,176              8,125                       99.38
24        6,788              6,746                       99.38
25        5,691              5,654                       99.35
26        4,569              4,531                       99.17
27        3,845              3,821                       99.38
28        3,317              3,294                       99.31
29        2,739              2,722                       99.38
30        2,374              2,351                       99.03
31–35     7,499              7,447                       99.31
36–40     3,630              3,599                       99.15
41–75     4,374              4,337                       99.15
≥ 76      302                299                         99.01
Total     6,719,330          6,062,982                   90.23

7 We ran the simulation 5, 10, 20, 40, 60, 80, and 100 times. The simulation results stabilize after 20 runs and there is no significant improvement after that. The simulation confidence level is set at 95%.

Triangle Decay in the Simulated Networks from t₁ to t′₂

We generate 20 simulated networks. As shown in Table 3, the transitional pattern is very stable across the simulated networks but quite different from the real transition: in the observed networks, the survival rate of triangles was much higher than in the simulated ones; conversely, the proportions of triangles at t₁ that turned into structural holes, decayed into single dyads, or became null triads at t₂ were lower in the observed networks than in the simulated networks.

As shown in Table 3 and Fig. 5, the observed survival rate of triangles is about 2.8 times that in the simulated networks, and the dormant nodes and disappeared ties were less likely to be involved in triangles in the observed networks than in the simulated networks.
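The ratio quoted above can be checked directly from the triangle column of Table 3. The short Python snippet below is only a worked illustration; the variable names are hypothetical and the 20 values are the simulated triangle survival rates (in percent) as listed in the table.

```python
from statistics import mean, stdev

# Triangle survival rates (%) of the 20 simulated transitions in Table 3.
simulated = [14.66, 14.66, 14.66, 14.66, 14.65, 14.70, 14.65, 14.66, 14.67, 14.64,
             14.67, 14.66, 14.67, 14.64, 14.68, 14.68, 14.65, 14.62, 14.66, 14.65]
real = 40.92  # triangle survival rate (%) in the real transition

sim_mean = mean(simulated)
sim_sd = stdev(simulated)
ratio = real / sim_mean
# Distance of the real rate from the simulated mean, in simulated standard deviations.
z = (real - sim_mean) / sim_sd

print(round(sim_mean, 2), round(ratio, 1))  # 14.66 2.8
print(round(z))
```

The simulated rates vary by only a few hundredths of a percentage point, so the observed rate lies far outside the range produced by the random process.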

Among decayed triangles in the observed networks, most lost one of their edges and turned into structural holes. However, the decayed triangles in the simulated networks have about the same likelihood to disorganize into structural holes or single dyads.

For those turning into single dyads, the proportion in which all three nodes of the triangle survived to t₂ was lower in the observed networks than in the simulated networks.

For those decaying into null triads, the proportion in which all three nodes of the triangle survived to t₂ was also lower in the observed networks than in the simulated networks.

The change rates of other configurations are not as big as those we discuss above, as shown in Fig. 5.

To sum up, first, in the observed network, there was a strong triadic preservation effect: triangles were substantially more likely to survive over time and much less likely to disintegrate into lower-level motifs. This is consistent with traditional theoretical accounts regarding the protective effect of embeddedness [11–16].

Table 3 Statistics of the real transition from t₁ to t₂ and 20 simulated transitions

Transition                 Triangle (%)    Structural hole (%)    Single dyad (%)    Null triad (%)
Real transition            40.92           34.25                  20.27              4.55
Simulated transition 1     14.66           36.77                  36.17              12.40
Simulated transition 2     14.66           36.77                  36.16              12.40
Simulated transition 3     14.66           36.78                  36.18              12.38
Simulated transition 4     14.66           36.78                  36.18              12.38
Simulated transition 5     14.65           36.81                  36.15              12.39
Simulated transition 6     14.70           36.78                  36.13              12.39
Simulated transition 7     14.65           36.74                  36.23              12.39
Simulated transition 8     14.66           36.78                  36.16              12.40
Simulated transition 9     14.67           36.77                  36.18              12.39
Simulated transition 10    14.64           36.83                  36.14              12.38
Simulated transition 11    14.67           36.77                  36.18              12.39
Simulated transition 12    14.66           36.78                  36.16              12.40
Simulated transition 13    14.67           36.77                  36.16              12.41
Simulated transition 14    14.64           36.82                  36.15              12.39
Simulated transition 15    14.68           36.76                  36.17              12.38
Simulated transition 16    14.68           36.75                  36.16              12.40
Simulated transition 17    14.65           36.82                  36.13              12.40
Simulated transition 18    14.62           36.78                  36.20              12.40
Simulated transition 19    14.66           36.81                  36.17              12.37
Simulated transition 20    14.65           36.85                  36.13              12.37
Mean                       14.66           36.79                  36.16              12.39
SD                         0.01            0.02                   0.02               0.01

Fig. 5 Triangle decay in the simulated networks. Outcomes of the 5,597,411 triangles observed at t₁, in the simulated networks and in the real transition:

Outcome                                   Simulated              Real
Triangle                                  820,548 (14.66%)       2,290,687 (40.92%)
Structural hole                           2,059,083 (36.79%)     1,917,322 (34.25%)
Single dyad                               2,024,271 (36.16%)     1,134,861 (20.27%)
Single dyad, all three nodes active       1,849,770 (33.05%)     990,799 (17.70%)
Single dyad, third node inactive          174,501 (3.11%)        144,062 (2.57%)
Null triad                                693,509 (12.39%)       254,541 (4.55%)
Null triad, all three nodes active        554,659 (9.91%)        182,492 (3.26%)
Null triad, one node lost                 132,067 (2.36%)        58,416 (1.04%)
Null triad, two nodes lost                5,976 (0.11%)          11,126 (0.20%)
Null triad, three nodes lost              807 (0.01%)            2,507 (0.04%)

Second, although we detect a strong triadic preservation effect, there was still a great deal of triadic turnover: in the observed networks almost 60% of the triangles disappeared.

As shown in Fig. 6, when a triangle dissolved in the observed network, it was most likely to have just one edge removed. For the simulated transitions, by contrast, the line is symmetric about the middle and declines towards each end.

Third, the fact that triangles in the real network were much more likely to decay into the “forbidden triad” [17] than into a single dyad (whereas in the simulated networks these two outcomes are about equally likely) tells us something about the role of a common contact. In the simulated networks, the probability of tie decay appears to be independent of its embeddedness in a triadic configuration. In the observed network, the fact that two edges shared a common contact protected either one of them from decaying. This may indicate that even when two persons did not get along, their common contact would attempt to maintain a relationship with both of them.

Triangle Formation in the Simulated Networks from t₁ to t′₂

Turning to triangle formation, as shown in Fig. 7, the simulation process generates almost no new triangles and destroys many more than we see in the real world (Fig. 4).

Almost all triangles in the simulated networks are surviving ones, which means that if a random process were at work, we would not only lose a majority of triangles but also gain almost no new ones.

We then randomly select one simulated network and repeat the simulation process of removing about 10% of the nodes and 45% of the edges and adding 10% new nodes and 45% new edges another 19 times. As shown in Fig. 8, the number of triangles continues to decrease at first and then stabilizes after the 8th simulation, which suggests that random processes at the nodal and dyadic levels degenerate a clustered network into a large-scale random graph with no local structure beyond the dyadic level.
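The degeneration of local structure under repeated random turnover can be illustrated on a toy network. The Python sketch below is a deliberate simplification and not the paper's procedure: it rewires 45% of the edges per round but leaves the node set fixed, and the starting graph (50 disjoint 4-cliques), the seed, and the function names are all hypothetical.

```python
import random
from itertools import combinations

def count_triangles(edges):
    """Count triangles in an undirected graph given as a set of (u, v) pairs with u < v."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Each triangle is seen once from each of its three edges.
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3

def rewire(edges, nodes, share, rng):
    """Remove a random share of edges, then add random edges back to the same total."""
    ordered = sorted(edges)
    n_remove = round(len(ordered) * share)
    kept = set(rng.sample(ordered, len(ordered) - n_remove))
    while len(kept) < len(ordered):
        u, v = rng.sample(nodes, 2)
        kept.add((min(u, v), max(u, v)))
    return kept

rng = random.Random(42)
nodes = list(range(200))
# Highly clustered start: 50 disjoint 4-cliques (300 edges, 200 triangles).
edges = {(a, b) for c in range(50) for a, b in combinations(range(4 * c, 4 * c + 4), 2)}
history = [count_triangles(edges)]
for _ in range(10):
    edges = rewire(edges, nodes, 0.45, rng)
    history.append(count_triangles(edges))
print(history)
```

In this toy setting the triangle count falls quickly and then hovers at the low level expected for a random graph of the same size and density, which mirrors the qualitative pattern reported for Fig. 8.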

Simulation of the Transition from t₁ to t″₂ When Fixing Characteristics at Nodal, Dyadic, and Triadic Level

In this section, by fixing key parameters at the triadic level in addition to those at the nodal and dyadic levels, we generate simulated networks at t″₂ from the observed network at t₁, which helps us check the randomness and robustness of network evolution patterns at the tetradic and pentadic levels.

Process 1: randomly removing nodes, edges, and triangles from the real network at t₁.

The simulation in process 1 consists of two steps, with triangles being removed first and nodes and edges being removed next.⁸

Fig. 6 Triadic evolution patterns in real versus simulated transitions

8 We experimented with many other approaches, including removing nodes and edges first and triangles next. With that ordering, more than 80% of the triangles are lost in the first step, which is far from the number in the real network.

In the first step, we randomly remove about 59% of the triangles. To do this, we divide the nodes whose nodal degree is at least 2 into three groups (low, medium, and high), each accounting for roughly one third of these nodes, as Table 4 shows.

As shown in Table 5, the triangles at t₁ are divided into ten groups by their nodal-degree combinations, and the survival frequency and survival rate of each group are listed in the third and fourth columns.

We randomly select triangles to be removed based on the survival rate in each group as shown in Table 5. The surviving triangles yield a list of surviving nodes and edges.

The edges in the remaining triangles are first screened against the surviving edge list, and those not on the list are then randomly removed according to the triadic evolution pattern in the real transition shown in Fig. 3. Removing an edge might destroy more than one triangle, so we do not get exactly 2.3 million surviving triangles, but the number is very close.

Fig. 7 Triangle formation in the simulated networks. Of the 820,548 triangles in the simulated network, 820,537 (100.00%) are surviving ones; 5 formed from structural holes, 5 from single dyads, and 1 from a null triad.

In the second step, we continue to remove the nodes and edges from what are left from step one. After being screened by the survived node and edge list, the remaining process is the same as we have done before—randomly remove the nodes based on the survival rates of degree distribution groups as shown in Table 2 and continue to remove the edges between persisting nodes.
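The group-wise triangle selection in the first step can be sketched in Python as follows. The degree bins follow Table 4 and the survival rates follow Table 5; everything else (function names, the hyphenated group keys, the toy triangles) is hypothetical, and the subsequent edge-level removal is not shown.

```python
import random

ORDER = {"L": 0, "M": 1, "H": 2}

def degree_group(d):
    """Table 4 bins: low 2-3, medium 4-6, high >= 7."""
    if d < 2:
        raise ValueError("a node in a triangle has degree of at least 2")
    return "L" if d <= 3 else "M" if d <= 6 else "H"

def triangle_group(tri, degree):
    """Label a triangle by the degree groups of its nodes, e.g. 'L-M-H'."""
    labels = sorted((degree_group(degree[n]) for n in tri), key=ORDER.get)
    return "-".join(labels)

# Survival rates at t2 by group, from Table 5 (as proportions).
SURVIVAL = {"L-L-L": 0.4971, "L-L-M": 0.4780, "L-L-H": 0.4434, "L-M-M": 0.4615,
            "L-M-H": 0.4288, "L-H-H": 0.3773, "M-M-M": 0.4763, "M-M-H": 0.4481,
            "M-H-H": 0.4118, "H-H-H": 0.3856}

def sample_surviving_triangles(tris, degree, rng):
    """Within each group, keep a random share of triangles equal to its survival rate."""
    by_group = {}
    for tri in tris:
        by_group.setdefault(triangle_group(tri, degree), []).append(tri)
    survivors = []
    for group, members in by_group.items():
        survivors.extend(rng.sample(members, round(len(members) * SURVIVAL[group])))
    return survivors

# Toy example (hypothetical): 100 triangles whose nodes all have degree 7.
tris = [(i, i + 1000, i + 2000) for i in range(100)]
degree = {n: 7 for tri in tris for n in tri}
kept = sample_surviving_triangles(tris, degree, random.Random(1))
print(triangle_group((1, 2, 3), {1: 8, 2: 2, 3: 5}), len(kept))
```

Stratifying by group keeps the simulated survival counts close to the per-group frequencies in Table 5 instead of applying the overall 40.92% rate uniformly.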

Process 2: randomly adding new nodes, edges, and triangles to the simulated networks

This process consists of three steps. In the first step, we add new nodes and randomly link each of them to one persisting node. In the second step, we add new triangles based on the transitional pattern of triangle formation as shown in Fig. 4. In the final step, we keep adding edges in the same way as when generating the simulated networks at t′₂.

The algorithm of the above-mentioned simulation processes can be programmed as:

1. Read and symmetrize E_t₁
2. Extract V_t₁
3. Extract T_t₁
4. Read and symmetrize E_t₂
5. Extract V_t₂
6. Extract T_t₂
7. Extract the persisting, decayed, and new nodes from V_t₁ and V_t₂
8. Extract the persisting, decayed, and new edges from E_t₁ and E_t₂
9. Extract the persisting, decayed, and new triangles from T_t₁ and T_t₂
10. Extract the number of triangles losing one edge, two edges, and all three edges among the decayed triangles
11. Extract the number of triangles gaining one edge, two edges, and all three edges among the new triangles
12. Randomly select triangles to be removed by their survival rate in each degree group
13. Randomly remove edges in the selected triangles by the triadic evolution pattern in the real transition
14. Keep removing nodes randomly by survival rate of each degree group
15. Keep removing edges randomly between persisting nodes
16. Add new nodes and randomly link them to persisting nodes
17. Randomly add edges to form new triangles by the triadic evolution pattern in the real transition
18. Keep adding edges randomly between nodes
19. Output the simulated edge list

The R script that can be used to replicate this algorithm is publicly available at https://github.com/socnetfan/Tsim/blob/main/algorithm2.R.

Fig. 8 Number of triangles in the simulated networks following sequential random processes

Table 4 Nodal-degree groups (d_i ≥ 2) at t₁

Group     Nodal degree    Frequency    Percentage
Low       2–3             2,034,817    37.98
Medium    4–6             1,777,961    33.18
High      ≥ 7             1,545,397    28.84

Table 5 Triangle survival patterns by nodal-degree groups in the observed network

Group    Frequency at t₁    Survival frequency at t₂    Survival rate at t₂ (%)
L–L–L    33,798             16,801                      49.71
L–L–M    104,730            50,057                      47.80
L–L–H    62,261             27,608                      44.34
L–M–M    188,612            87,047                      46.15
L–M–H    312,235            133,879                     42.88
L–H–H    262,359            98,980                      37.73
M–M–M    184,379            87,818                      47.63
M–M–H    607,687            272,276                     44.81
M–H–H    1,337,152          550,698                     41.18
H–H–H    2,504,198          965,523                     38.56
Total    5,597,411          2,290,687                   40.92

L represents low, M represents medium, and H represents high

Square Decay from t₁ to t₂/t′₂/t″₂

We generate 20 simulated networks by following the foregoing algorithm. A square S is a ring of four nodes i, j, k, and l among which there are four edges e_ij, e_jk, e_kl, and e_il, but the two diagonal edges e_ik and e_jl do not exist; otherwise the configuration is identified as triangle(s). As shown in Fig. 9, in the real transition squares were more likely to lose two edges than to lose one or three edges, and they were relatively less likely to survive to t₂ or to turn null.
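The ring definition above (all ring edges present, no edge inside the ring) can be checked mechanically. The Python sketch below is an illustration with hypothetical function names: `is_chordless_ring` works for a ring of any length, so it also covers pentagons, and `count_squares` uses a brute-force scan over node pairs that is only suitable for small toy graphs.

```python
from itertools import combinations

def build_adj(edges):
    """Undirected adjacency map from a list of node pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def is_chordless_ring(adj, ring):
    """True if consecutive nodes of `ring` (in cyclic order) are linked and no other pair is."""
    k = len(ring)
    for a in range(k):
        for b in range(a + 1, k):
            consecutive = (b - a == 1) or (a == 0 and b == k - 1)
            linked = ring[b] in adj.get(ring[a], set())
            if consecutive != linked:
                return False
    return True

def count_squares(adj):
    """Count squares via their diagonals: two unlinked nodes sharing two unlinked contacts."""
    total = 0
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue
        common = sorted(adj[u] & adj[v])
        total += sum(1 for a, b in combinations(common, 2) if b not in adj[a])
    return total // 2  # each square is found once from each of its two diagonals

square = build_adj([(1, 2), (2, 3), (3, 4), (1, 4)])
with_chord = build_adj([(1, 2), (2, 3), (3, 4), (1, 4), (1, 3)])
pentagon = build_adj([(1, 2), (2, 3), (3, 4), (4, 5), (1, 5)])
print(count_squares(square), count_squares(with_chord), is_chordless_ring(pentagon, (1, 2, 3, 4, 5)))
```

Adding the diagonal (1, 3) turns the square into two triangles, so it is no longer counted, which matches the rule that such configurations are identified as triangle(s).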

The square decay pattern of the simulated networks at t′₂, which fix characteristics at the nodal and dyadic levels, shows some differences from the real transition, but none of the discrepancies is greater than 5 percentage points. If we also fix characteristics at the triadic level, as in the simulated networks at t″₂, the square decay pattern is closer to that of the real network, suggesting that most of the square census at the second time period is the result of the nodal and dyadic processes, and some is the result of the triadic processes.

Figure 10 shows that the square decay patterns are similar across the real transition and simulated ones. The lines are symmetric in the middle and gradually decline towards each end on the left and right.

Fig. 9 Square decay in the real and simulated networks. Counts and shares of the 2,293,301 squares at t₁, with rows in the order shown in the figure:

Real transition       Simulated t′₂         Simulated t″₂
246,221 (10.74%)      152,880 (6.67%)       223,101 (9.73%)
573,333 (25.00%)      566,921 (24.72%)      629,570 (27.45%)
533,682 (23.27%)      569,042 (24.81%)      545,603 (23.79%)
204,919 (8.94%)       272,104 (11.87%)      232,303 (10.13%)
486,524 (21.22%)      578,626 (25.23%)      490,304 (21.38%)
248,622 (10.84%)      153,728 (6.70%)       172,420 (7.52%)

Pentagon Decay from t₁ to t₂/t′₂/t″₂

A pentagon P is a ring of five nodes i, j, k, l, and m among which there are five edges e_ij, e_jk, e_kl, e_lm, and e_im, but none of the edges inside the ring (e_ik, e_il, e_jl, e_jm, and e_km) exists; otherwise the configuration is identified either as triangle(s) or as square(s). As shown in Fig. 11, in the real transition, pentagons were more likely to lose two or three edges than to lose four edges or one edge, and they were less likely to turn null or survive to t₂.

Fig. 10 Percentages of square decay in the real and simulated networks

Fig. 11 Pentagon decay in the real and simulated networks. Counts and shares of the 3,950,882 pentagons at t₁, with rows in the order shown in the figure:

Real transition       Simulated t′₂         Simulated t″₂
140,787 (3.56%)       113,574 (2.87%)       267,213 (6.76%)
584,099 (14.78%)      575,164 (14.56%)      860,709 (21.79%)
579,040 (14.66%)      607,229 (15.37%)      648,446 (16.41%)
517,395 (13.10%)      583,200 (14.76%)      598,300 (15.14%)
607,644 (15.38%)      638,463 (16.16%)      529,707 (13.41%)
542,033 (13.72%)      618,030 (15.64%)      477,978 (12.10%)
721,327 (18.26%)      670,106 (16.96%)      464,307 (11.75%)
258,557 (6.54%)       145,116 (3.67%)       104,222 (2.64%)

The pentagon decay pattern of the simulated networks at t′₂, which fix characteristics at the nodal and dyadic levels, shows some differences from that in the real transition, but none of the discrepancies is greater than 3 percentage points. If we also fix characteristics at the triadic level, as in the simulated networks at t″₂, the pentagon decay pattern varies a little more, with a maximum discrepancy of about 7 percentage points.
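The discrepancies quoted here are differences in percentage points between matching categories. The Python snippet below reproduces them from the shares printed in Fig. 11, under the assumption that the three value columns of the figure are, in order, the real transition, the simulated networks at t′₂, and the simulated networks at t″₂; the variable and function names are hypothetical.

```python
# Shares (%) of pentagon decay outcomes as printed in Fig. 11, one entry per category.
real    = [3.56, 14.78, 14.66, 13.10, 15.38, 13.72, 18.26, 6.54]
sim_t1p = [2.87, 14.56, 15.37, 14.76, 16.16, 15.64, 16.96, 3.67]   # assumed: t'2 column
sim_t2p = [6.76, 21.79, 16.41, 15.14, 13.41, 12.10, 11.75, 2.64]   # assumed: t''2 column

def max_gap(a, b):
    """Largest absolute difference, in percentage points, between matching categories."""
    return max(abs(x - y) for x, y in zip(a, b))

print(round(max_gap(real, sim_t1p), 2))  # 2.87
print(round(max_gap(real, sim_t2p), 2))  # 7.01
```

Both values agree with the bounds stated in the text, which supports the assumed column order.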

Fig. 12 Percentages of pentagon decay in the real and simulated networks

Fig. 13 Square formation in the real and simulated networks. Counts and shares of the squares at the end of the transition (2,284,117 at t₂, 467,684 at t′₂, and 373,749 at t″₂), with rows in the order shown in the figure:

End with t₂           End with t′₂          End with t″₂
266,068 (11.65%)      467,575 (99.98%)      281,259 (75.25%)
558,250 (24.44%)      48 (0.01%)            54,735 (14.64%)
526,256 (23.04%)      28 (0.00%)            22,403 (5.99%)
180,405 (7.90%)       17 (0.00%)            3,840 (1.03%)
463,544 (20.29%)      14 (0.00%)            8,938 (2.39%)
289,594 (12.68%)      2 (0.00%)             2,574 (0.69%)

Figure 12 shows that the pentagon decay patterns are not too different across the real transition and simulated ones.

The lines are again approximately symmetric in the middle and gradually decline towards each end on the left and right.

Square Formation from t₁ to t₂/t′₂/t″₂

As shown in Fig. 13, in the real transition squares at t₂ were more likely to emerge by gaining two edges than by gaining the fourth edge to close the ring or by gaining three edges, and they were relatively less likely to be surviving ones or to develop from nothing.

For the simulated networks at t′₂, which fix characteristics at the nodal and dyadic levels, most squares are surviving ones. For the simulated networks at t″₂, which also fix characteristics at the triadic level, we get even fewer squares and about three quarters of them are surviving ones. This is similar to the inflow pattern with triangles: in simulation, most squares observed at t₁ disappear, almost all squares in the simulated networks already existed at t₁, and very few are created.

As shown in Fig. 14, a square can also be viewed as two nodes that have no tie between them but share two common contacts. The formation of a square can be seen as the generation of double structural holes or multiple 2-paths: we need to find not only one null dyad but also two common contacts for it. Put another way, what really makes a square happen is not just multiple intermediaries, but that for some reason the two nodes never connect. This task cannot be fulfilled by random processes at lower levels. Therefore, square formation is not a random process. On the other hand, destroying a square is much easier: it only requires removing any node(s) and/or edge(s) in it. As a result, it is not surprising that the square decay pattern and the square formation pattern do not follow the same mechanism.

Pentagon Formation from t₁ to t₂/t′₂/t″₂

As shown in Fig. 15, pentagons in the observed network at t₂ were more likely to emerge by gaining two or three edges than by gaining four edges or the fifth edge to close the ring, and they were less likely to develop from nothing or to be surviving ones.

Fig. 14 Another way to view a square

| | End with t₂ | End with t′₂ | End with t″₂ |
| --- | --- | --- | --- |
| | 151,869 (3.96%) | 426,431 (99.88%) | 368,747 (74.89%) |
| | 586,575 (15.29%) | 260 (0.06%) | 89,141 (18.10%) |
| | 565,964 (14.75%) | 77 (0.02%) | 19,288 (3.92%) |
| | 490,089 (12.78%) | 65 (0.02%) | 6,242 (1.27%) |
| | 578,914 (15.09%) | 36 (0.00%) | 4,758 (0.97%) |
| | 501,700 (13.07%) | 30 (0.00%) | 2,016 (0.41%) |
| | 685,352 (17.86%) | 27 (0.00%) | 1,848 (0.38%) |
| | 277,025 (7.22%) | 6 (0.00%) | 370 (0.08%) |
| Total | 3,837,488 | 426,932 | 492,410 |

Fig. 15 Pentagon formation in the real and simulated networks (number and share of pentagons, one row per starting configuration)


For the simulated networks at t′₂, which fix characteristics at the nodal and dyadic levels, most pentagons are survived ones. For the simulated networks at t″₂, which also fix characteristics at the triadic level, we get slightly more pentagons on average, and about three quarters of them are survived ones.
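The survived shares quoted for the simulated networks can be checked against the counts reported in Fig. 15. The short Python sketch below does that arithmetic; it assumes that the first count in each column is the number of survived pentagons, which is inferred from the shares quoted in the text rather than stated explicitly, and the dictionary keys are hypothetical labels.

```python
# Counts from Fig. 15, one list per column.
# Assumption: the first entry in each list is the survived pentagons.
counts = {
    "t2": [151869, 586575, 565964, 490089, 578914, 501700, 685352, 277025],
    "t2_prime": [426431, 260, 77, 65, 36, 30, 27, 6],
    "t2_double_prime": [368747, 89141, 19288, 6242, 4758, 2016, 1848, 370],
}


def survived_share(column):
    """Percentage of pentagons in a column that fall in its first row."""
    return 100 * column[0] / sum(column)


for label, column in counts.items():
    print(label, sum(column), round(survived_share(column), 2))
```

The column sums reproduce the totals shown in the figure (3,837,488, 426,932, and 492,410).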

The formation of a pentagon is not a random process either. By contrast, destroying a pentagon is much easier: it only takes removing any node(s) and/or edge(s) in it, or even adding tie(s) inside the pentagon so that it breaks into lower-level configurations such as a triangle. As a result, it is not surprising that the pentagon decay pattern and the pentagon formation pattern do not follow the same mechanism.
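The asymmetry between forming and destroying a pentagon can be made concrete with a small Python check. This is a sketch under the assumption that a pentagon is a ring of 5 nodes with no ties inside the ring; the function name `is_chordless_ring` is hypothetical.

```python
from itertools import combinations


def is_chordless_ring(ring, edges):
    """True if the ordered nodes form a closed ring with no inside ties."""
    present = {frozenset(e) for e in edges}
    n = len(ring)
    on_ring = {frozenset((ring[k], ring[(k + 1) % n])) for k in range(n)}
    if not on_ring <= present:
        return False  # at least one edge on the ring is missing
    inside = {frozenset(p) for p in combinations(ring, 2)} - on_ring
    return not (inside & present)  # any inside tie breaks the ring


pentagon = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
nodes = (1, 2, 3, 4, 5)
print(is_chordless_ring(nodes, pentagon))             # intact pentagon
print(is_chordless_ring(nodes, pentagon[:-1]))        # one edge removed
print(is_chordless_ring(nodes, pentagon + [(1, 3)]))  # one inside tie added
```

All five ring edges must be present and all five possible inside ties must be absent for the check to pass, whereas a single removal or a single inside tie is enough to fail it.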

To sum up, first, both the square and pentagon decay patterns follow a random process: the configurations can be destroyed by randomly removing nodes at the nodal level and edges at the dyadic level.
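As a rough illustration of what random decay implies, the Python sketch below assumes that every edge of a ring disappears independently with the same probability p. This is a deliberate simplification and not the simulation design of the study, which fixes observed parameters at the nodal and dyadic levels. Under this assumption the number of lost edges per ring is binomial, so with p near 0.5 the distribution peaks in the middle and declines towards both ends. The function names and the value of p are hypothetical.

```python
import random
from collections import Counter
from math import comb


def binomial_loss_pmf(ring_size, p):
    """P(k edges lost) for k = 0..ring_size under independent edge decay."""
    return [
        comb(ring_size, k) * p**k * (1 - p) ** (ring_size - k)
        for k in range(ring_size + 1)
    ]


def simulate_losses(ring_size, p, trials, seed=0):
    """Monte Carlo tally of how many edges each simulated ring loses."""
    rng = random.Random(seed)
    losses = Counter()
    for _ in range(trials):
        lost = sum(rng.random() < p for _ in range(ring_size))
        losses[lost] += 1
    return losses


print(binomial_loss_pmf(4, 0.5))  # squares
print(binomial_loss_pmf(5, 0.5))  # pentagons
print(sorted(simulate_losses(5, 0.5, trials=10_000).items()))
```

Comparing an observed loss distribution with such a baseline is one simple way to judge how close a decay pattern is to chance.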

Second, the square and pentagon formation patterns do not follow a random process: the egos need to strategically create some ties (those located on the rings) to close the rings and, at the same time, keep other ties (those inside the rings) from forming, so that the configurations are not identified as lower-level ones (triangles or squares).

Third, the local structure at the triadic level does not significantly affect structures at even higher levels: randomly removing and adding some triangles does influence the transitional pattern at the tetradic and pentadic levels, but not by much.

Limitations and Future Research

Some limitations of this study should be mentioned. First, the mobile phone data we used came from 2008. We expect our findings to be confirmed or refined in future work by data collected in different, especially more recent, years.

Second, we only have access to data across two consecutive four-week periods. Future research needs to verify whether our findings still hold when the periods are not consecutive but spread apart. Third, this study compares the network evolution of mobile phone users from August to September. While the network characteristics were similar across these two months and the observed network evolved smoothly, we do not have data for other periods of the year (or for more recent years) to investigate whether the turnaround numbers (or proportions) at the nodal, dyadic, and triadic levels remain consistent. Therefore, the generalizability of the transition patterns to other months of the year awaits future inquiry.

Conclusions

Our simulations do not support the classic network theories advanced by Faust [1], i.e., that triadic configurations are mostly predicted by features at the nodal and dyadic levels.

Instead, by comparing the real transition with the simulated ones at triadic, tetradic, and pentadic levels, we reach four conclusions in this study.

First, in this large-scale mobile phone network edges were more likely to disappear from configurations that were not triangles, and so were the nodes. Being in triangles prevented the nodes and edges from decaying, and this conclusion is consistent with the embeddedness theory.
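The embeddedness comparison can be expressed as a small Python sketch that splits the edges observed at t₁ by whether their endpoints share at least one common contact (that is, whether the edge sits in a triangle) and reports the share of each group that is gone at t₂. The function name `decay_rate_by_embeddedness` and the toy edge lists are hypothetical, and the sketch assumes an undirected graph without self-loops.

```python
def decay_rate_by_embeddedness(edges_t1, edges_t2):
    """Share of t1 edges missing at t2, split by triangle membership at t1."""
    adj = {}
    for u, v in edges_t1:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    kept = {frozenset(e) for e in edges_t2}
    tally = {"in_triangle": [0, 0], "not_in_triangle": [0, 0]}  # [decayed, total]
    for edge in {frozenset(e) for e in edges_t1}:
        u, v = tuple(edge)  # assumes no self-loops
        key = "in_triangle" if adj[u] & adj[v] else "not_in_triangle"
        tally[key][1] += 1
        if edge not in kept:
            tally[key][0] += 1
    return {k: (d / t if t else None) for k, (d, t) in tally.items()}


# Toy data (hypothetical): one triangle plus a two-edge tail.
edges_t1 = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
edges_t2 = [(1, 2), (2, 3), (1, 3), (3, 4)]
rates = decay_rate_by_embeddedness(edges_t1, edges_t2)
print(rates)
```

A lower decay rate in the `in_triangle` group than in the `not_in_triangle` group is the pattern the embeddedness argument predicts.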

Second, a strong triadic preservation effect is detected: triangles themselves were substantially more likely to survive over time and less likely to disintegrate into lower-level motifs. If a triangle was going to decay, it had higher odds of losing one of its edges and turning into a "forbidden triad" than of turning into a single dyad. In addition, although the third edge of a structural hole might not be observed during certain time period(s), the common contact still maintained a relationship with both nodes.

Third, the square and pentagon decayed over time by a stochastic random process at lower levels. They were more likely to lose two or three of their edges and less likely to lose none or all of their edges.

Finally, none of the configuration formation processes at the triadic, tetradic, and pentadic levels took place by chance. They came into being through a systematic process in the observed networks: the egos strategically try to create some ties (the ones located on the rings) to close the rings and, at the same time, keep other ties (the ones located inside the rings) from forming, so that the higher-level configurations do not degenerate into lower-level ones (triangles or squares).

Funding Not applicable.

Availability of Data and Material Data are available from the author upon reasonable request.

Declarations

Conflict of interest The author declares that there is no conflict of interest.

References

1. Faust K. A puzzle concerning triads in social networks: graph constraints and the triad census. Soc Netw. 2010;32(3):221–33.


2. Bagrow JP, Wang D, Barabási AL. Collective response of human populations to large-scale emergencies. PLoS ONE. 2011;6(3):e17680.

3. Ercsey-Ravasz M, Lichtenwalter RN, Chawla NV, Toroczkai Z. Range-limited centrality measures in non-weighted and weighted complex networks. Phys Rev E. 2012;85:066103.

4. Ghoshal G, Barabási AL. Ranking stability and super-stable nodes in complex networks. Nat Commun. 2011;2:394.

5. Liu YY, Slotine JJ, Barabási AL. Controllability of complex networks. Nature. 2011;473:167–73.

6. Onnela JP, Arbesman S, Gonzalez MC, Barabási AL, Christakis NA. Geographic constraints on social network groups. PLoS ONE. 2011;6(4):e16939.

7. Raeder T, Lizardo O, Hachen DS, Chawla NV. Predictors of short-term decay of cell phone contacts in a large-scale communication network. Soc Netw. 2011;33(4):245–57.

8. Wang C, Lizardo O, Hachen DS. Algorithms for generating large-scale clustered random graphs. Netw Sci. 2013;2(3):403–15.

9. Wang C, Lizardo O, Hachen DS. Triadic evolution in a large-scale mobile phone network. J Complex Netw. 2014;3(2):264–90.

10. Wang C, Lizardo O, Hachen DS. Using big data to examine the effect of urbanism on social networks. J Urban Aff. 2018. https://doi.org/10.1080/07352166.2018.1550350.

11. Granovetter MS. Economic action and social structure: the problem of embeddedness. Am J Sociol. 1985;91(3):481–510.

12. Bearman PS, Everett KD. The structure of social protest, 1961–1983. Soc Netw. 1993;15(2):171–200.

13. Morgan DL, Neal MB, Carder P. The stability of core and peripheral networks over time. Soc Netw. 1997;19(1):9–25.

14. Putnam RD. Bowling alone: the collapse and revival of American community. New York: Simon & Schuster; 2001.

15. Mitchell TR, Holtom BC, Lee TW, Sablynski CJ, Erez M. Why people stay: using job embeddedness to predict voluntary turnover. Acad Manag J. 2001;44(6):1102–21.

16. Felps W, Mitchell TR, Hekman DR, Lee TW, Holtom BC, Harman WS. Turnover contagion: how coworkers’ job embeddedness and job search behaviors influence quitting. Acad Manag J. 2009;52(3):545–61.

17. Granovetter MS. The strength of weak ties. Am J Sociol. 1973;78(6):1360–80.

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.