
The phase transition in random graphs and random graph processes

DISSERTATION

submitted for the academic degree of Doktor der Naturwissenschaften (doctor rerum naturalium) in Computer Science

to the Mathematisch-Naturwissenschaftliche Fakultät II of the Humboldt-Universität zu Berlin

by cand. scient. Taral Guldahl Seierstad, born on 5 September 1979 in Lillehammer

President of the Humboldt-Universität zu Berlin: Prof. Dr. Christoph Markschies

Dean of the Mathematisch-Naturwissenschaftliche Fakultät II: Prof. Dr. Wolfgang Coy

Reviewers:

1. Prof. Dr. Hans Jürgen Prömel
2. Prof. Dr. Tomasz Łuczak
3. Prof. Dr. Stefan Felsner

Submitted: 29 January 2007
Date of oral examination: 28 June 2007


Zusammenfassung

Random graphs are graphs generated by a random process. A phenomenon frequently encountered in connection with random graphs is that the typical properties of a graph change radically when a relatively small number of random edges is added. This phenomenon was first investigated in the seminal papers of Erdős and Rényi.

We consider the random graph G(n, p), which contains n vertices and in which any two vertices are connected by an edge independently with probability p. Erdős and Rényi showed that for p = c/n and c < 1 the graph consists with high probability of components with O(log n) vertices. For p = c/n and c > 1, G(n, p) contains with high probability exactly one component with Θ(n) vertices, which is much larger than all other components.

The point in the evolution of the graph at which the component structure changes drastically through a small increase in the number of edges is called the phase transition. In G(n, p) it occurs at p = 1/n. Moreover, G(n, p) undergoes a so-called double jump. When the probability p increases from (1−ε)/n to 1/n, the largest component grows from O(log n) to Θ(n^{2/3}) vertices. When p finally equals (1+ε)/n, the largest component has Θ(n) vertices. If p = (1+ε)/n, where ε = ε(n) is a function of n tending to 0, we are in the critical phase, which is one of the most interesting phases of the evolution of the random graph. In this case the component structure of the graph depends on the rate at which ε tends to 0.

In this thesis we consider three different random graph models. In Chapter 4 we study the minimum degree graph process. In this process edges vw are added successively, where v is a randomly chosen vertex of minimum degree. We prove that this graph process has a phase transition and, as in G(n, p), a double jump.

The two other models are random graphs with a given degree sequence and random directed graphs. For these models it was already shown in the work of Molloy and Reed (1995), Karp (1990) and Łuczak (1990) that there is a phase transition with respect to the component structure. In Chapters 5 and 6 we examine the critical phase of these processes more closely and show that these models behave similarly to G(n, p).

Keywords:

random graphs, phase transition, critical phase, largest component


Abstract

Random graphs are graphs which are created by a random process. They are used among other places in the study of large networks, and in the analysis of the performance of algorithms.

A common phenomenon in random graphs is that the typical properties of a graph change radically by the addition of a relatively small number of random edges. This phenomenon was first investigated in the seminal papers of Erdős and Rényi.

We consider the graph G(n, p) which contains n vertices, and where any two vertices are connected by an edge independently with probability p.

Erdős and Rényi showed that if p = c/n and c < 1, then with high probability G(n, p) consists of components with O(log n) vertices. If p = c/n and c > 1, then with high probability G(n, p) contains exactly one component, called the giant component, with Θ(n) vertices, which is much larger than all other components.

The phase transition in a random graph refers to the point at which the giant component is formed. In G(n, p) this is when p = 1/n. Moreover, G(n, p) undergoes a so-called double jump at this point. When the probability p increases from (1−ε)/n to 1/n, the largest component grows from O(log n) to Θ(n^{2/3}) vertices. When p becomes (1+ε)/n, the graph contains a giant component with Θ(n) vertices. If we let p = (1+ε)/n, where ε is a function of n tending to 0, we are in the critical phase of the random graph, which is one of the most interesting phases in the evolution of the random graph. In this case the structure depends on how fast ε tends to 0.

In this dissertation we consider three different random graph models. In Chapter 4 we consider the so-called minimum degree graph process. In this process edges vw are added successively, where v is a randomly chosen vertex with minimum degree. We prove that a phase transition occurs in this graph process as well, and also that it undergoes a double jump, similar to G(n, p).

The two other models we will consider are random graphs with a given degree sequence and random directed graphs. In these models the point of the phase transition has already been found, by Molloy and Reed (1995), Karp (1990) and Łuczak (1990). In Chapters 5 and 6 we investigate the critical phase of these processes, and show that their behaviour resembles that of G(n, p).

Keywords:

random graphs, phase transition, critical phase, giant component


Acknowledgement

I would like to thank my supervisor Prof. Dr. Hans Jürgen Prömel for letting me join his research group at the Humboldt University, and the European Graduate Program “Combinatorics, Geometry, and Computation” for scholarship and support. I also want to thank the members of the research group for algorithms and complexity at the Humboldt University for interesting discussions and seminars, especially Prof. Dr. Anusch Taraz who helped me much during my first year, and Dr. Mihyun Kang with whom I have had many stimulating discussions.

Furthermore I want to thank Dr. Manuel Bodirsky who gave me feedback on a draft of this dissertation. I am also grateful to Prof. Dr. Michał Karoński and the members of the Department for Discrete Mathematics at the Adam Mickiewicz University for their hospitality, and in particular to Prof. Dr. Tomasz Łuczak for very inspiring collaboration, and to Dr. Małgorzata Bednarska who was very helpful during my six-month stay in Poznań.

Finally I want to thank my family for their support and encouragement through many years.


Preface

In this thesis we study the evolution of random graph processes. The study of random graphs was initiated by Erdős and Rényi around 1960, and became a flourishing research area in the following decades. We will mostly concern ourselves with a phenomenon which occurs in several random graph processes, generally known as the phase transition, namely that the component structure of a random graph changes substantially upon the addition of relatively few random edges.

One of the motivations for studying random graphs is the desire to describe a “typical” graph. For example, if we consider all labelled graphs on n vertices, it is known that the vast majority of these graphs are connected, contain a copy of any fixed graph F, and have chromatic number close to n/(2 log₂ n), provided that n is large enough. Moreover, the proportion of graphs not having these properties decreases as n grows. We therefore feel justified in saying that “almost all” graphs have these properties. In the terminology of random graphs we say that a random graph has these properties with probability tending to 1 as the number of vertices tends to infinity.

More interesting results can be obtained if we restrict ourselves to subclasses of graphs, for example by fixing the number of edges and asking what the typical properties of a graph with n vertices and m edges are. Often we think of random graphs as states in a process. We begin at time 0 with an empty graph on n vertices. Then, as time goes on, we add edges to the graph at random, either uniformly or according to some other random procedure. A discovery of Erdős and Rényi was that many graph properties enjoy so-called threshold phenomena: when the number of edges in the random graph is significantly smaller than the threshold, the graph has the property with probability very close to 0, while if the number of edges is significantly greater than the threshold, it has the property with probability very close to 1.

The main topic of this dissertation is the phenomenon known as the phase transition. A random graph with n vertices and 0.49n edges is very likely to consist of many small components, none of which has more than O(log n) vertices, while a random graph with n vertices and 0.51n edges most probably contains a unique large component with a linear number of vertices. This large component is called the giant component, and there has been much interest in studying the evolution of this component, and in particular in examining the particular point in a graph process where the giant component is first formed. This point of a graph process is generally referred to as the phase transition, because of the similarities to the physical phenomenon of substances turning from one phase to another by a small change in temperature or pressure.

The phenomenon is also related to percolation in statistical mechanics. In percolation theory a typical question is whether, and with which probability, the centre of a porous stone becomes wet if the stone is put into water. The stone can be represented by a graph, with different points being adjacent if they are connected by a hole in the stone.

The thesis is organised as follows. Chapter 1 contains an introduction to random graph theory and presents the random graph models we will examine later in the thesis. In Chapter 2 we present some well-known results from various areas of mathematics, which will be used in the analysis of the random graph processes later on. In Chapter 3 we discuss recursive trees and the early phase of the minimum degree graph process, while Chapter 4 is concerned with the phase transition of that process. In Chapter 5 we study the critical phase of random graphs with a given degree sequence, and in Chapter 6 we consider the critical phase for random digraphs.

Chapters 4 and 5 are joint work with Mihyun Kang, and are based on [29] and [28] respectively. Chapter 6 is joint work with Tomasz Łuczak and is based on [40].

Contents

1 Introduction
1.1 The basic random graph models
1.2 Notation
1.3 Threshold functions
1.4 The phase transition
1.5 Random graphs with a given degree sequence
1.5.1 The configuration model
1.5.2 The phase transition in random graphs with a given degree sequence
1.5.3 Random regular graphs
1.6 The minimum degree graph process
1.7 Other random graph models
1.8 Random digraphs
1.9 Summary of the main results

2 Tools
2.1 Generating functions
2.2 Branching processes
2.2.1 The Galton-Watson process
2.2.2 Branching process argument for G(n, p)
2.2.3 Multitype branching processes
2.3 Differential equations
2.4 Martingale central limit theorem

3 Recursive trees and forests
3.1 Recursive trees
3.2 The number of leaves
3.3 A random recursive forest
3.4 Martingale
3.5 Quadratic variation
3.6 The final stage

4 The minimum degree graph process
4.1 Connectivity
4.2 Main theorem
4.3 The red phase
4.4 The blue phase
4.5 Branching process
4.6 The phase transition
4.7 The critical phase
4.8 The evolution of the minimum degree graph process

5 Random graphs with a given degree sequence
5.1 Main theorem
5.2 Branching process
5.3 Analysis of generating functions
5.4 The phase transition
5.4.1 The subcritical case
5.4.2 The supercritical case
5.5 Comparison with G(n, p)

6 The critical behaviour of random digraphs
6.1 D(n, p) and G(n, p)
6.2 The subcritical case
6.3 The supercritical case
6.4 Proof of main lemma

Chapter 1 Introduction

The starting point of the theory of random graphs is generally considered to be a series of papers by Erdős and Rényi published between 1959 and 1961, in which random graphs were studied in their own right for the first time. Two notable papers are [17] from 1959 and [16] from 1960; some of the results contained therein will be given below. The field is vast, so we will mostly restrict ourselves to topics which are relevant to later chapters.

1.1 The basic random graph models

The model which Erdős and Rényi first concentrated on is the model we now know as the G(n, m) model. Let Ω_{n,m} be the set of those graphs on the vertex set [n] = {1, . . . , n} which contain exactly m edges. Then G(n, m) is a graph chosen uniformly at random from Ω_{n,m}. Thus, if G is a graph with n vertices and m edges, then

P[G(n, m) = G] = \binom{\binom{n}{2}}{m}^{-1}.

Another model, which was also described in [16], but was first introduced by Gilbert [21], is the G(n, p) model. We let Ω_n be the set of graphs on n vertices, and we let G(n, p) be chosen at random from Ω_n such that if G has n vertices and m edges, then

P[G(n, p) = G] = p^m (1−p)^{\binom{n}{2} − m}.

Equivalently, every pair of vertices in G(n, p) forms an edge with probability p, independently of every other pair of vertices. This model is called the binomial model of random graphs.
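Both models are straightforward to sample directly for small n; the following Python sketch (a minimal illustration, with function names of our choosing) draws G(n, m) as a uniform m-subset of the vertex pairs and G(n, p) by independent coin flips:

```python
import itertools
import random

def sample_gnm(n, m, rng):
    """G(n, m): a uniformly random set of m edges on [n]."""
    pairs = list(itertools.combinations(range(1, n + 1), 2))
    return set(rng.sample(pairs, m))

def sample_gnp(n, p, rng):
    """G(n, p): every pair becomes an edge independently with probability p."""
    return {e for e in itertools.combinations(range(1, n + 1), 2)
            if rng.random() < p}

rng = random.Random(0)
print(len(sample_gnm(10, 15, rng)))  # always 15 edges
```

For large n one would avoid materialising all n(n−1)/2 pairs, but this direct form mirrors the definitions above.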


Typically p and m are not fixed numbers, but functions of n, and we are interested in the asymptotic properties of G(n, p) or G(n, m) as n tends to infinity. In general, if P is a graph property, we want to determine the limit of the probability that G(n, p) or G(n, m) has P as n tends to infinity. Often this probability tends to either 0 or 1; we say that G(n, p) (or G(n, m)) has the property P asymptotically almost surely, abbreviated a.a.s., if the probability that G(n, p) (or G(n, m)) has P tends to 1 as n tends to infinity.

The two models G(n, m) and G(n, p) have very similar properties when m ∼ \binom{n}{2} p. In this case the expected number of edges in G(n, p) is about m. Although the probability that G(n, p) actually contains exactly m edges is generally very small, in many cases theorems proved for G(n, p) also hold for G(n, m) with m ∼ \binom{n}{2} p, and vice versa; see Łuczak [36]. We will mostly consider G(n, p), rather than G(n, m), since the fact that the edges in G(n, p) are present independently of each other makes it more convenient to work with than G(n, m), although some of the graph processes we will consider later have more in common with G(n, m).

We will also consider random graphs from a dynamical viewpoint. In one such model we start with an empty graph G on n vertices which changes over time. At every step in the process we choose a pair of vertices {v, w} uniformly at random from the set of pairs of nonadjacent vertices, and add the edge vw to the graph. After precisely m edges have been added, the probability distribution of G is the same as that of G(n, m).

It is also possible to consider G(n, p) as a dynamical graph process. Here we give every potential edge a “birth time” chosen uniformly at random from the interval [0, 1]. Then we let p increase gradually from 0 to 1. The graph G(n, p) then consists of those edges whose birth time is at most p. Thus the graph process starts as an empty graph and grows until it becomes complete, at the latest when the time reaches 1.
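The birth-time construction can be sketched in a few lines: assigning every pair an independent uniform birth time realises all graphs G(n, p), 0 ≤ p ≤ 1, simultaneously on one probability space. A minimal Python illustration (names are ours):

```python
import itertools
import random

def birth_times(n, seed=0):
    """Give every potential edge a uniform birth time in [0, 1]."""
    rng = random.Random(seed)
    return {e: rng.random() for e in itertools.combinations(range(1, n + 1), 2)}

def graph_at(births, p):
    """G(n, p): the edges born by time p."""
    return {e for e, t in births.items() if t <= p}

births = birth_times(8)
# Monotone coupling: increasing p only ever adds edges.
assert graph_at(births, 0.2) <= graph_at(births, 0.7)
```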

1.2 Notation

Before we can go further, we have to introduce some notation, in particular to deal with the asymptotic behaviour of functions. Let f(n) and g(n) be two positive functions of n. We write f(n) = O(g(n)) if there is a constant C such that f(n) ≤ Cg(n) for all (large enough) n, and we write f(n) = Ω(g(n)) if there is a constant c > 0 such that f(n) ≥ cg(n) for all (large enough) n. If f(n) = O(g(n)) and f(n) = Ω(g(n)), we write f(n) = Θ(g(n)) or f(n) ≍ g(n). If lim_{n→∞} f(n)/g(n) = 0, we write f(n) = o(g(n)) or f(n) ≪ g(n), and if lim_{n→∞} f(n)/g(n) = 1, we write f(n) ∼ g(n).


1.3. THRESHOLD FUNCTIONS 3 As already mentioned, an event happens asymptotically almost surely (a.a.s.) if its probability tends to 1 as n tends to infinity. An event happens almost surely (a.s) if its probability equals 1.

If A is an event, IA is the indicator variable of A, which is equal to 1 if A holds and 0 otherwise. A special case of an indicator variable is the Kronecker delta δij, which equals 1 if i = j and is 0 otherwise. The set {1, . . . , n} is denoted by [n]. If v is a vector, its transpose is denoted by v0. All logarithms are natural.

1.3 Threshold functions

One of the important discoveries by Erdős and Rényi was that many properties exhibit a so-called threshold phenomenon: a small change in the number of edges of a random graph may have a big impact on the probability that the random graph has a certain property. Let P be a graph property. Then we say that t(n) is a threshold function for P if

lim_{n→∞} P[G(n, p) has P] =
  0 if p(n) ≪ t(n),
  1 if p(n) ≫ t(n).

Bollobás and Thomason [13] proved that every property which is preserved by the addition of edges has such a threshold function. However, for several properties the increase in probability happens even more abruptly. A function t(n) is said to be a sharp threshold for P if for every ε > 0

lim_{n→∞} P[G(n, p) has P] =
  0 if p(n) < (1−ε)t(n),
  1 if p(n) > (1+ε)t(n).

In [17] Erdős and Rényi proved that the property of a graph being connected has the function t(n) = (log n)/n as a sharp threshold. In fact, they described the transition phase, in which the limit probability increases from 0 to 1, much more precisely. Let p(n) = (log n + c(n))/n. Then

lim_{n→∞} P[G(n, p) is connected] =
  0 if c(n) → −∞,
  e^{−e^{−c}} if c(n) → c ∈ ℝ,
  1 if c(n) → ∞.
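This limit can be checked empirically. The following Python sketch (a rough Monte Carlo illustration we add here, with a small n, so finite-size effects remain visible) estimates the connectivity probability at p = (log n + c)/n using a simple union-find:

```python
import math
import random

def is_connected(n, p, rng):
    """Sample G(n, p) and test connectivity with a simple union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for v in range(n):
        for w in range(v + 1, n):
            if rng.random() < p:
                parent[find(v)] = find(w)
    return len({find(v) for v in range(n)}) == 1

def estimate(n, c, trials=100, seed=1):
    """Empirical P[G(n, p) is connected] at p = (log n + c)/n."""
    rng = random.Random(seed)
    p = (math.log(n) + c) / n
    return sum(is_connected(n, p, rng) for _ in range(trials)) / trials

# The limiting value as n grows is exp(-exp(-c)); for c = 3 that is about 0.951.
print(estimate(200, 3.0))
```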

They also proved that G(n, m), when considered as a dynamic process, becomes connected a.a.s. at the moment when the last isolated vertex disappears. This holds even more generally: Erdős and Rényi [15] showed that G(n, m) becomes k-connected a.a.s. at the moment when the last vertex of degree k−1 disappears; this has a.a.s. happened when m = (n/2)(log n + k log log n + α(n)) where α(n) → ∞. The first Hamiltonian cycle appears in G(n, m) a.a.s. when the graph becomes 2-connected, which is a.a.s. when the last vertex of degree 1 disappears. This was shown by Komlós and Szemerédi [33] and Bollobás [9].

All the properties just mentioned thus have sharp thresholds. A threshold which is not sharp is called a coarse threshold. An example of a property with a coarse threshold is that of subgraph containment. Let F be a fixed graph with v vertices and e edges. The density of F is defined to be d(F) = e/v, and the maximal density m(F) is the density of the subgraph of F with the highest density. Bollobás [10] showed that

lim_{n→∞} P[G(n, p) ⊃ F] =
  0 if p ≪ n^{−1/m(F)},
  1 if p ≫ n^{−1/m(F)}.

In the case of balanced graphs, that is, graphs for which m(F) = d(F), this was proved already by Erdős and Rényi [16]. If p ≍ n^{−1/m(F)}, then G(n, p) contains a copy of F with probability bounded away from 0 and 1.

More generally it appears that local properties often have coarse thresholds, while global properties often have sharp thresholds. We will now consider a property with a sharp threshold, namely the property that a random graph has a component of order Θ(n). The short period of time in which this component evolves is dubbed the phase transition and is arguably one of the best studied periods of the entire evolution of random graphs.

1.4 The phase transition

The phase transition refers to the sudden change in the component structure of many random graph processes. In the random graph G(n, p) it happens around the time p = 1/n. It was first described in [16] and has later been examined in minute detail by several authors. Let us consider G(n, p) where p = c/n for a constant c. When c < 1, the graph G(n, p) consists a.a.s. of small components, all of which have O(log n) vertices. As we let c increase, these components merge and grow larger, and as soon as c > 1, there is a.a.s. a unique large component which consists of Θ(n) vertices. Thus, in the very short time from c = 1−ε to c = 1+ε, for any ε > 0, many of the small components join to form one large component. This component is known as the giant component. If we stop the time exactly at the point when c = 1, we will see that the largest component has on the order of n^{2/3} vertices, and that there are many components of roughly the same size. As soon as c > 1, however, there is only one large component; the second largest component has O(log n) vertices. Thus the order of the largest component first makes a jump from Θ(log n) vertices to Θ(n^{2/3}), and then to Θ(n); for this reason the phenomenon is also called the double jump. We state this fundamental theorem here.

Theorem 1.1 (Erdős, Rényi 1960). Let c be a positive constant, and let p = c/n.

(i) If c < 1, then a.a.s. no component in G(n, p) contains more than one cycle, and no component has more than (log n)/(c − 1 − log c) vertices.

(ii) If c = 1 and ω(n) is a function tending to infinity as n → ∞, then G(n, p) a.a.s. contains at least one component with more than n^{2/3}/ω(n) vertices and no component with more than n^{2/3} ω(n) vertices.

(iii) If c > 1, then G(n, p) a.a.s. contains a component with (d + o(1))n vertices, where d + e^{−cd} = 1, while every other component has at most (log n)/(c − 1 − log c) vertices and contains at most one cycle.
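The constant d in part (iii) is easy to compute numerically: rewriting d + e^{−cd} = 1 as d = 1 − e^{−cd} suggests fixed-point iteration. A small Python sketch (our illustration):

```python
import math

def giant_fraction(c, iters=200):
    """Solve d + exp(-c*d) = 1, i.e. d = 1 - exp(-c*d), by fixed-point
    iteration; for c > 1, d is the a.a.s. fraction of vertices in the
    giant component of G(n, c/n)."""
    d = 0.5
    for _ in range(iters):
        d = 1.0 - math.exp(-c * d)
    return d

# For c = 2 the giant component covers about 79.7% of the vertices,
# while for c < 1 the iteration collapses to the trivial root d = 0.
print(giant_fraction(2.0), giant_fraction(0.5))
```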

We present a sketch of a proof of parts (i) and (iii) of this theorem in Section 2.2.2, using branching processes. The original proof of this theorem, by Erdős and Rényi, uses a counting argument.

Theorem 1.1 tells us much about the random graph G(n, p) when c ≠ 1, but the most interesting question is arguably to find out how the giant component is formed, which happens when c = 1. This is called the critical phase of the graph process. In this period there are several large components of roughly the same size vying for dominance. As p increases, these large components eat many of the small components and merge with each other until a single giant component remains. One important problem is to determine at which point there is a component which a.a.s. remains largest until the end of the process in the dynamical model. It turns out that the appropriate parametrisation of p in the critical phase is

p = 1/n + λ/n^{4/3}. (1.1)

Bollobás [8] goes a long way to explain the development in this phase, but a fully satisfactory answer was only found by Łuczak [35] in 1990, thirty years after Erdős and Rényi first described the phase transition.

Theorem 1.2 (Łuczak 1990). Let np = 1 + λn^{−1/3}, and let L_k(G(n, p)) be the order of the kth largest component in G(n, p).

(i) If λ → −∞, then a.a.s. L_1(G(n, p)) ≪ n^{2/3}.

(ii) If λ → ∞, then a.a.s. L_1(G(n, p)) ≫ n^{2/3} ≫ L_2(G(n, p)). Furthermore L_1(G(n, p)) = (2 + o(1))λn^{2/3} a.a.s.

If we consider the dynamical model, Łuczak also proved that when λ → ∞, the largest component in G(n, p) will a.a.s. remain the largest until the end of the process, while when λ → −∞, the largest component will a.a.s. not remain the largest. When λ tends to a constant, the probability that the largest component remains the largest is bounded away from 0 and 1; the larger λ is, the closer the probability is to 1. For a detailed description of this phase of the process, see Janson and Spencer [26].

The process G(n, p) obeys an interesting symmetry rule. Suppose that p = c/n with c > 1, and let d < 1 be such that d e^{−d} = c e^{−c}. Let C be the giant component in G(n, p). The structure of G(n, p) \ C is essentially that of G(n′, p′), where n′ is the number of vertices outside the giant component, and p′ = d/n′.
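The conjugate parameter d can be computed numerically: x·e^{−x} is increasing on [0, 1], so bisection applies. A small Python sketch (our illustration):

```python
import math

def conjugate(c, tol=1e-12):
    """Given c > 1, find d in (0, 1) with d*exp(-d) = c*exp(-c).
    x*exp(-x) is increasing on [0, 1], so bisection applies."""
    target = c * math.exp(-c)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid * math.exp(-mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

d = conjugate(2.0)  # roughly 0.406
assert abs(d * math.exp(-d) - 2 * math.exp(-2)) < 1e-9
```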

This symmetry rule has a parallel when np → 1. Suppose that np = 1 + λn^{−1/3} with λ → ∞, but λn^{−1/3} = o(1). At this point a giant component C has a.a.s. appeared in G(n, p). Then the structure of G(n, p) \ C is essentially similar to the structure of G(n′, p′), where n′ = n − |C| and n′p′ = 1 − λn^{−1/3}.

The situation is very similar in the random graph model G(n, m). In fact, this is the model for which Erdős and Rényi first described the phase transition. If m = cn/2 with c < 1, then the largest component in G(n, m) a.a.s. has O(log n) vertices. If m = cn/2 with c > 1, then there is a.a.s. a unique component with Θ(n) vertices, and every other component has O(log n) vertices.

1.5 Random graphs with a given degree sequence

The random graph models G(n, p) and G(n, m) are by far the best understood, but many other ways of generating random graphs have been suggested. One of them is to choose a random graph with a given degree sequence, or, as a special case, to choose a random regular graph. Newman, Strogatz and Watts [45] present several real-world graphs which they demonstrate can be well approximated by this graph model.

A sequence d = (a_1, a_2, . . . , a_n) of integers is called a degree sequence if Σ_{i=1}^n a_i is even, and 0 ≤ a_i ≤ n−1 for all i = 1, . . . , n. We let Ω_d be the set of all graphs on n vertices with degree sequence d. Provided that Ω_d ≠ ∅, we say that a random graph with degree sequence d is a graph chosen uniformly at random from Ω_d.


Since we are mostly interested in random graph models for which we can prove asymptotic results as n tends to infinity, we should define this random graph model for increasing n. We will mostly use the model and the terminology used by Molloy and Reed [44].

Let A ⊆ ℕ be an infinite set of positive integers. An asymptotic degree sequence is a sequence of functions D = (d_0(n), d_1(n), d_2(n), . . .), where d_i : A → ℕ_0 for every i ≥ 0, such that d_i(n) = 0 whenever i ≥ n, and Σ_{i≥0} d_i(n) = n. If D is an asymptotic degree sequence and n ∈ A, we let D_n be the degree sequence (a_1, a_2, . . . , a_n), where a_j ≤ a_{j+1} for every j = 1, . . . , n−1, and #{j | a_j = i} = d_i(n). Thus, d_i(n) denotes the number of vertices of degree i in a graph of order n. The asymptotic degree sequence D is said to be feasible if Ω_{D_n} ≠ ∅ for all n ∈ A.

Suppose that D is a feasible asymptotic degree sequence. If n ∈ A, a random graph G_n(D) is a graph chosen uniformly at random from the set Ω_{D_n}. The graph G_n(D) is called a random graph with the given degree sequence D_n.

In order to be able to state interesting theorems about graphs in this random graph model, we need to impose some structure on D. One way is to consider only random regular graphs: for some r ≥ 0, d_r(n) = n and d_i(n) = 0 for i ≠ r, and A is restricted to the even numbers if r is odd. We will come back to this model in Section 1.5.3.

Another way, which allows for more general graphs, is to assume that the proportion of vertices of any given degree is roughly the same for all n ∈ A. We will say that an asymptotic degree sequence is smooth if there are constants λ_i for i ≥ 0 such that λ_i(n) := d_i(n)/n → λ_i as n → ∞ for all i ≥ 0. We will consider this model in Section 1.5.2.

1.5.1 The configuration model

It is difficult to study random graphs with a given degree sequence directly. Instead, it has become customary to take the route via random configurations. The configuration model was introduced by Bender and Canfield [6] and Bollobás [7], and later examined more closely by Bollobás [11] and Wormald [58].

Given a degree sequence d = (a_1, . . . , a_n), we define a configuration with degree sequence d in the following way. Let V = {v_1, . . . , v_n} be a set of vertices. Let L be a set consisting of a_i distinct copies of the vertex v_i for i = 1, . . . , n. These copies are called half-edges. A configuration C consists of the set L, together with a perfect matching P of the half-edges in L. A random configuration C based on the set L is a configuration in which the perfect matching P is chosen uniformly at random from the set of all perfect matchings of L.

A random perfect matching can be constructed greedily: at every step we take an arbitrary unmatched half-edge and match it with another half-edge chosen uniformly at random from the remaining half-edges. Using this procedure, every perfect matching has the same probability of being generated.

Given a configuration C on L, we define the underlying multigraph G of C to be the multigraph obtained by identifying all the copies of v_i with each other for i = 1, . . . , n. For an asymptotic degree sequence D we let G_n(D) be the underlying multigraph of a random configuration C_n with degree sequence D_n. Figure 1.1 shows a randomly generated configuration with degree sequence (1, 1, 1, 2, 2, 2, 3, 3, 4, 5) and its underlying multigraph. The graph G_n(D) is a random multigraph, but it is not chosen uniformly at random from the set of multigraphs with degree sequence D_n: the probability that G_n(D) = G, where G is a multigraph with degree sequence D_n, is proportional to 2^{−l} ∏_j (j!)^{−m_j}, where l is the number of loops and m_j is the number of multiedges with multiplicity j in G. (See for example Section 9.1 in [27].) This means, however, that P[G_n(D) = G] is the same for any simple graph G.

Thus, if we repeat the above procedure until we obtain a simple graph, we have generated a simple graph with degree sequence D_n uniformly at random. The configuration model can therefore be used to generate the graph G_n(D).

If this procedure is to be used to generate random simple graphs, it is important that the probability that G_n is simple is not too small. If the probability that G_n is simple tends to 0 as n tends to infinity, the expected number of times the procedure must be repeated to obtain a simple graph increases with n. If we impose certain restrictions on D, we can ensure that G_n is simple with probability bounded away from 0. This holds in particular when the maximum degree is bounded. In this case, if G_n has the property P a.a.s., then a simple random graph with degree sequence D also has P a.a.s.
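The greedy matching described above is equivalent to pairing the half-edges off in a uniformly random order, which is how the following minimal Python sketch of the configuration model proceeds (function names are ours):

```python
import random
from collections import Counter

def random_configuration(degrees, seed=0):
    """Uniform random configuration: pairing the half-edges off in a
    uniformly random order yields a uniform perfect matching, which is
    equivalent to the greedy procedure described in the text."""
    rng = random.Random(seed)
    half_edges = [v for v, deg in enumerate(degrees) for _ in range(deg)]
    assert len(half_edges) % 2 == 0, "the degree sum must be even"
    rng.shuffle(half_edges)
    return list(zip(half_edges[::2], half_edges[1::2]))

def is_simple(edges):
    """True if the underlying multigraph has no loops and no multiedges."""
    norm = [tuple(sorted(e)) for e in edges]
    return all(v != w for v, w in norm) and len(set(norm)) == len(norm)

# The degree sequence of Figure 1.1; rejecting until is_simple() holds
# would sample a uniform simple graph with these degrees.
edges = random_configuration([1, 1, 1, 2, 2, 2, 3, 3, 4, 5])
print(len(edges), is_simple(edges))
```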

1.5.2 The phase transition in random graphs with a given degree sequence

Molloy and Reed [44] showed that there is a phase transition for random graphs with a given asymptotic degree sequence D. We will assume that D is smooth, that is, λ_i(n) := d_i(n)/n → λ_i for i ≥ 0, and that it is sparse, which means that Σ_i i·d_i(n) = O(n). We require moreover that D should be “well-behaved” in a way which we will define precisely in Chapter 5, see page 82. In particular the maximum degree should not be too large.

[Figure 1.1: A configuration and its multigraph]

Molloy and Reed defined the quantity

Q(D) = Σ_{i≥1} i(i−2)λ_i,

and proved the following theorem about the phase transition in this graph model.

Theorem 1.3 (Molloy, Reed 1995). Let D be a well-behaved sparse asymptotic degree sequence for which there exists ε > 0 such that d_i(n) = 0 for all n and all i > n^{1/4−ε}, and let G = G_n(D). Then:

(i) If Q(D) < 0, and for some function 0 ≤ ω(n) ≤ n^{1/8−ε} we have d_i(n) = 0 for all i ≥ ω(n), then for some constant R dependent on Q(D), G a.a.s. has no component with more than Rω(n)² log n vertices, and a.a.s. has fewer than 2Rω(n)² log n cycles. Also, a.a.s. no component of G has more than one cycle.

(ii) If Q(D) > 0, then there exist constants ζ_1, ζ_2 > 0 dependent on D, such that G a.a.s. has a component with at least ζ_1 n vertices and ζ_2 n cycles. Furthermore, if Q(D) is finite, then G a.a.s. has exactly one component of size greater than C log n for some constant C dependent on D.

It may not be obvious why the quantity Q(D) appears in Theorem 1.3.

Suppose that we are given a randomly chosen vertex v in the graph and want to determine the order of the component it lies in. We can do this by exposing the component vertex by vertex. The vertex v has degree i with probability λ_i(n). In this case there are i unexplored edges incident to v. If we follow one of these, we reach a new vertex (unless the edge is a loop); if this vertex has degree j, the number of edges we can explore increases by j − 2. We continue to explore the component, until at some point there are no longer any unexplored edges, in which case we have exposed the entire component.

Whenever we follow an edge, the probability that the vertex we find at the other end has degree j is roughly j·λ_j(n)/d, where d is the average degree. This holds as long as the number of explored vertices is small compared to the total number of vertices. Since the number of unexplored edges then increases by j − 2, the expected increase in the number of unexplored edges is roughly (1/d)·∑_j j(j−2)·λ_j. If this value is negative, we expect that the process will die out rather quickly; if it is positive, there is a chance that the number of unsaturated vertices will just continue to grow, so that a large component is generated.
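This heuristic is easy to check numerically. Below is a small Python helper (a hypothetical illustration, not code from the thesis) that computes Q(D) for a limiting degree distribution given as a map from degree i to λ_i; the sign matches the intuition that degree-1 vertices are dead ends, degree-2 vertices are neutral, and higher degrees feed the exploration:

```python
def Q(lam):
    """Molloy-Reed parameter Q(D) = sum over i of i*(i-2)*lambda_i.

    `lam` maps a degree i to the limiting fraction lambda_i of
    vertices of degree i.
    """
    return sum(i * (i - 2) * l for i, l in lam.items())

assert Q({1: 1.0}) < 0   # perfect matching: exploration dies immediately
assert Q({2: 1.0}) == 0  # disjoint cycles: the critical case
assert Q({3: 1.0}) > 0   # 3-regular graphs: a giant component exists
assert Q({1: 0.8, 3: 0.2}) < 0  # many leaves outweigh a few degree-3 vertices
```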

In a subsequent paper [43], Molloy and Reed also determined the order of the giant component: they found a function γ(D) such that the giant component in G_n a.a.s. consists of γ(D)·n + o(n) vertices. Furthermore they proved that a duality principle holds for this graph model: the structure of the graph formed by removing the giant component from G_n is essentially the same as the structure of a random graph with asymptotic degree sequence D′ = (d′_0(n), d′_1(n), …), which can be calculated from D.

The case that Q(D) = 0, which is the critical phase of this random graph model, is not covered by Theorem 1.3; this will be the subject of Chapter 5. We will show that this graph model behaves roughly as G(n, p) does in the critical phase.

1.5.3 Random regular graphs

A random r-regular graph is a graph chosen uniformly at random from the set of r-regular graphs. It is a special case of random graphs with a given degree sequence, where for all n, d_r(n) = n and d_i(n) = 0 if i ≠ r, and A is restricted to the even integers if r is odd. Because it is a natural model, it has mostly been studied in its own right, and not in connection with the more general model described earlier in this section. The configuration model can be used to generate random regular graphs. If r is fixed, the graph produced by the configuration model is simple with probability bounded away from 0 as n grows. Hence, if the underlying multigraph of a random r-regular configuration has the property P a.a.s., a random simple r-regular graph also has the property P a.a.s.

Let G_reg(n, r) be a random r-regular graph. For r = 1, G_reg(n, r) is simply a perfect matching. For r = 2, G_reg(n, 2) is a collection of cycles. It is fairly straightforward to show that G_reg(n, 2) is a Hamiltonian cycle with probability asymptotically equal to √(π/(4n)); thus G_reg(n, 2) is a.a.s. disconnected.

However, for r ≥ 3, G_reg(n, r) is a.a.s. r-connected. A problem which was open for a long time is whether G_reg(n, r) a.a.s. contains a Hamiltonian cycle. This was settled in the affirmative by Robinson and Wormald, for r = 3 in [46] and for r > 3 in [47].

1.6 The minimum degree graph process

In Chapter 4 we will consider the minimum degree graph process, where the mechanism for adding edges guarantees that the minimum degree increases relatively quickly.

Let {G_min(n, m)}_{m≥0} be a Markov chain whose states are multigraphs on the vertex set {1, 2, …, n}. The graph G_min(n, 0) is the empty graph on n vertices, and for m ≥ 0, G_min(n, m+1) is obtained from G_min(n, m) by first choosing a vertex of minimum degree in G_min(n, m) uniformly at random, and then connecting it by a new edge to another vertex chosen uniformly at random among the remaining vertices in G_min(n, m). Thus, at every step at least one vertex of minimum degree has its degree increased.
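A direct simulation of this process is straightforward. The following Python sketch (illustrative names, naive O(n)-per-step bookkeeping) mirrors the definition above, including the possibility of parallel edges:

```python
import random

def min_degree_process(n, m, rng=None):
    """Run m steps of the minimum degree multigraph process on n vertices.

    Each step picks a uniformly random vertex of minimum degree and joins
    it to a uniformly random *other* vertex (parallel edges may arise).
    """
    rng = rng or random.Random()
    deg = [0] * n
    edges = []
    for _ in range(m):
        dmin = min(deg)
        u = rng.choice([v for v in range(n) if deg[v] == dmin])
        w = rng.choice([v for v in range(n) if v != u])
        edges.append((u, w))
        deg[u] += 1
        deg[w] += 1
    return edges, deg
```

This is only meant to make the transition rule concrete; an efficient implementation would maintain the set of minimum-degree vertices incrementally.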

The process was originally introduced by Wormald [57] to illustrate the usage of the differential equation method, which we will come back to in Section 2.3. Kang et al. [30] later found the connectivity threshold for the process, and in Chapter 4 we will determine the point of the phase transition.

Let H_i for i ≥ 1 be the random variables such that G_min(n, H_i − 1) has minimum degree less than i and G_min(n, H_i) has minimum degree at least i. Kang et al. [30] proved that there are constants h_1 ≈ 0.69, h_2 ≈ 1.22 and h_3 ≈ 1.73 such that a.a.s. H_1 = h_1·n + o(n), H_2 = h_2·n + o(n) and H_3 = h_3·n + o(n). (Exact formulas for h_1, h_2 and h_3 are given on page 49.) They moreover proved that if t = m/n, then for t < h_2 the graph is a.a.s. disconnected, while for t > h_3 the graph is a.a.s. connected. When h_2 < t < h_3, the probability that the graph is connected tends to a value bounded away from both 0 and 1 as n tends to infinity. In the case that the graph is not connected in this period, it a.a.s. consists of a giant component with n − o(n) vertices, and one or more isolated cycles. In Section 4.1 we will find an exact expression for the limit probability that G_min(n, tn) is connected when h_2 < t < h_3.

In Chapter 4 we will prove that there is a phase transition in G_min(n, m), as there is in G(n, m). We will prove that there is a constant h_g ≈ 0.86 such that if t < h_g, then the graph a.a.s. consists of only small components, while if t > h_g, there is a giant component. In addition, we will show that the phase transition happens as a double jump: first the order of the largest component jumps from O(log n) to Θ(n^{2/3}), and then from Θ(n^{2/3}) to Θ(n). The behaviour of the phase transition is therefore similar to the phase transition in G(n, m).

1.7 Other random graph models

A graph process similar to the minimum degree graph process is the so-called min-min process. In this process we also start with an empty graph, but at every step we choose two vertices of minimum degree and add an edge between them. This is an attempt to define a graph process which produces random regular graphs, since after rn/2 edges have been added, the graph is necessarily r-regular, provided that rn is even. For r = 1, we get a perfect matching chosen uniformly at random. However, for r = 2, the resulting graph does not have the same distribution as the random regular graph G_reg(n, 2). For r ≥ 3, it is not known whether the min-min graph process has the same distribution as G_reg(n, r). The min-min process is studied closely by Coja-Oghlan and Kang [14]. They find that as soon as the number of edges in the process is (1+ε)n, for any ε > 0, there is a.a.s. a giant component with more than n/2 vertices, and there is a positive probability that the graph is connected.

Another process which can be used to produce regular graphs is the d-process. In this process edges are added at random, subject to the restriction that the maximum degree should remain at most d. Every edge which is not already in the graph, and whose addition to the graph does not increase the degree of any vertex beyond d, has the same probability of being added. The process ends when no further edge can be added. Ruciński and Wormald [48] show that the final graph in this process is a.a.s. d-regular if nd is even, and a.a.s. has n − 1 vertices of degree d and one vertex of degree d − 1 if nd is odd. They also show in [49] that the final graph is a.a.s. connected when d ≥ 3. For d = 2 this is not the case: Telcs, Wormald and Zhou [53] show that the graph is Hamiltonian, and thereby connected, with probability Θ(n^{−1/2}), although with a different constant than in the uniformly random 2-regular graph.

In recent years there has been an interest in studying real-world networks and modelling these with random graph processes. This was sparked by a paper by Watts and Strogatz [54] from 1998. They consider some real-world networks — the neural network of a roundworm, the power grid of the Western United States, and the collaboration graph for movie actors — and show that these graphs share some characteristic properties, namely high clustering along with small diameter, in spite of the graphs being sparse. They call these "small-world" networks, in analogy with the "small-world" phenomenon popularised by the notion of "six degrees of separation". The random graphs which we have presented so far in this chapter typically have small diameters, but lack the clustering property, whereas graphs which exhibit more regularity, such as lattices, generally cluster but have large diameters. Those models are therefore inadequate for the purpose of studying these real-world networks, and researchers have attempted to find random graph models which typically produce graphs sharing these traits, hoping to obtain mathematical models which are useful in the study of real-world networks.

Another real-world network that has been given much attention is the "web graph", in which the vertices are pages on the World Wide Web, and links between pages are represented by edges. Albert, Barabási and Jeong [1] showed that the web graph has relatively small diameter, while Faloutsos, Faloutsos and Faloutsos [19] found that its degree sequence obeys a power law: the number of vertices of degree d is proportional to d^a for some constant a. Graphs satisfying such a law are often called scale-free, and several other real-world networks have also been shown to closely follow such power laws.

Subsequently there have been attempts to design random graph models which produce scale-free graphs having the "small-world" properties mentioned above. A survey of such scale-free random graphs is found in [12]. One such scale-free graph process was proposed by Barabási and Albert [5]. In this process one starts with some small arbitrary graph, and adds vertices one by one to the graph. Every vertex added is connected to some of the existing vertices, in such a way that the probability that a vertex receives a new edge is proportional to its degree.
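A common way to implement such a preferential attachment rule — the sketch below is an illustration, not the exact model of [5] — keeps a list containing every vertex once per incident edge endpoint, so that a uniform draw from that list is automatically degree-biased:

```python
import random

def preferential_attachment(n, m, rng=None):
    """Grow a Barabási-Albert-style graph on n vertices: each new vertex
    attaches m edges, choosing distinct endpoints with probability
    proportional to their current degree.
    """
    rng = rng or random.Random()
    # Arbitrary small seed graph: a star on m + 1 vertices.
    edges = [(i, m) for i in range(m)]
    endpoints = [v for e in edges for v in e]  # degree-weighted urn
    for v in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))  # degree-proportional draw
        for t in targets:
            edges.append((v, t))
            endpoints.extend((v, t))
    return edges
```

The choice of seed graph is arbitrary in the model; what matters is that a vertex's chance of gaining an edge grows with its degree, which is what produces the power-law degree sequence.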

1.8 Random digraphs

Finally we will consider another random structure, namely random digraphs.

In the binomial model for random digraphs, D(n, p), a random digraph on n vertices is chosen such that for every ordered pair (v, w) of vertices there is an arc from v to w with probability p. All pairs are considered independently of one another. In the context of digraphs we will use the term component to mean a strongly connected component.

Both Karp [31] and Łuczak [38] considered the phase transition, and proved that it occurs when p = 1/n. That is, let p = c/n, where c is a constant. If c < 1, then a.a.s. all the components in D(n, p) are single vertices or cycles of length at most ω(n), for any function ω(n) → ∞. If c > 1, then there is a.a.s. a giant component with Θ(n) vertices. In fact, the order of the giant component is a.a.s. (d² + o(1))n, where d is the constant appearing in Theorem 1.1(iii). Thus the probability that a specified vertex belongs to the giant component in D(n, p) is the square of the probability that an arbitrary vertex belongs to the giant component in G(n, p).
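The strongly connected components of such a digraph can be explored empirically with a short sketch: D(n, p) sampled directly from its definition, and component sizes computed with Kosaraju's two-pass algorithm (the function names are illustrative):

```python
import random

def random_digraph(n, p, rng=None):
    """D(n, p): include each ordered pair (v, w), v != w, independently."""
    rng = rng or random.Random()
    return [(v, w) for v in range(n) for w in range(n)
            if v != w and rng.random() < p]

def scc_sizes(n, arcs):
    """Sizes of strongly connected components (Kosaraju), largest first."""
    fwd = [[] for _ in range(n)]
    rev = [[] for _ in range(n)]
    for v, w in arcs:
        fwd[v].append(w)
        rev[w].append(v)

    # Pass 1: record vertices in order of DFS completion on the forward graph.
    order, seen = [], [False] * n
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, 0)]
        while stack:
            v, i = stack.pop()
            if i < len(fwd[v]):
                stack.append((v, i + 1))
                w = fwd[v][i]
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, 0))
            else:
                order.append(v)

    # Pass 2: DFS on the reversed graph in reverse completion order.
    sizes, seen = [], [False] * n
    for s in reversed(order):
        if seen[s]:
            continue
        seen[s] = True
        comp, stack = 0, [s]
        while stack:
            v = stack.pop()
            comp += 1
            for w in rev[v]:
                if not seen[w]:
                    seen[w] = True
                    stack.append(w)
        sizes.append(comp)
    return sorted(sizes, reverse=True)
```

For c > 1 and moderately large n, the largest entry of `scc_sizes` is typically of order n, in line with the (d² + o(1))n asymptotics above.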

In Chapter 6 we will consider the critical phase in the evolution of random digraphs, namely when c = 1. We will prove that, using the same parametrisation as in G(n, p), namely (1.1), the random digraph exhibits the same behaviour as G(n, p): if λ → −∞, then a.a.s. all components in D(n, p) are relatively small cycles or single vertices, while if λ → ∞, then there is a.a.s. exactly one component which is neither a cycle nor a vertex, and which is much bigger than every other component.

1.9 Summary of the main results

The main results in this dissertation concern the phase transitions in three different random graph models: the minimum degree graph process, random graphs with a given degree sequence, and random digraphs. In the first of these random graph models we will locate the point of the phase transition and show that the graph process undergoes a double jump, similar to G(n, p). In the two other random graph models, for which the phase transitions have already been located ([44], [31], [38]), we will consider how the random graphs behave very close to the point of the phase transition. The three random graph models will be treated in Chapters 4, 5 and 6 respectively. Below we state the main theorem of each of these chapters.

The minimum degree random graph process G_min(n, m) was introduced in Section 1.6, and will mainly be treated in Chapter 4, although Chapter 3 also considers some of its aspects. We will prove that the phase transition occurs when roughly h_g·n edges have been added, for a constant h_g.

Theorem 4.2. Let

h_g = log( (16 log 2 − 2) / (3 log 2 − 1 + log 2 · √(27 − 16 log 2)) ) ≈ 0.8607.

(i) If t < h_g, then a.a.s. every component in G_min(n, tn) has O(log n) vertices.

(ii) If t = h_g and ω(n) → ∞, then G_min(n, tn) a.a.s. contains no component of order greater than n^{2/3}·ω(n), and at least one component of order greater than n^{2/3}/ω(n).


(iii) If t > h_g, then a.a.s. the largest component in G_min(n, tn) has s(t)·n + o(n) vertices, where s(t) is a function depending only on t, and every other component, if any, has O(log n) vertices.

Random graphs with a given degree sequence G_n(D) were presented in Section 1.5, and the point of the phase transition was found by Molloy and Reed [44]; see Theorem 1.3. In Chapter 5 we will consider the critical point itself more closely. In order to state the main theorem of Chapter 5, we need to introduce some more notation. Recall that λ_i(n) is the proportion of vertices of degree i in a graph with n vertices. Let

Q_n(z) = ∑_{i≥1} i(i−2)·λ_i(n)·z^i,

and let τ_n be the largest value such that

Q_n(τ_n) = 0.
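For concreteness, τ_n can be computed numerically. The sketch below (hypothetical helper names) evaluates Q_n and locates its positive root by bisection, assuming λ_1 > 0 and a single sign change on the search interval. For a distribution supported on degrees 1, 2 and 3, the positive root is √(λ_1/(3λ_3)), so τ_n = 1 exactly when λ_1 = 3λ_3:

```python
from math import sqrt

def Q_n(z, lam):
    """Q_n(z) = sum over i of i*(i-2)*lambda_i(n)*z**i."""
    return sum(i * (i - 2) * l * z**i for i, l in lam.items())

def tau(lam, lo=1e-9, hi=10.0, tol=1e-12):
    """Positive root of Q_n, found by bisection.

    Assumes Q_n < 0 just above 0 (true when lambda_1 > 0) and
    Q_n(hi) > 0, with a single sign change in between.
    """
    assert Q_n(lo, lam) < 0 < Q_n(hi, lam)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if Q_n(mid, lam) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# lambda_1 = 3 * lambda_3: the critical case, tau = 1.
assert abs(tau({1: 0.3, 2: 0.6, 3: 0.1}) - 1.0) < 1e-6
```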

We will prove the following theorem.

Theorem 5.2. Assume that D is a well-behaved asymptotic degree sequence, such that for some ε > 0, d_i(n) = 0 whenever i > n^{1/4−ε}. Furthermore assume that lim_{n→∞} τ_n = 1 and λ_1 > 0. Let

δ_n = 1 − τ_n.

(i) If δ_n·n^{1/3} → −∞, then a.a.s. all components in G_n(D) have o(n^{2/3}) vertices.

(ii) There is a constant c_1 such that if δ_n·n^{1/3} ≥ c_1·log n, then a.a.s. G_n(D) has a single component with at least n^{2/3} vertices, while all other components have o(n^{2/3}) vertices.

Finally, in Chapter 6, we will consider the random digraph D(n, p), which we introduced in Section 1.8. It was proved by Karp [31] and Łuczak [38] that the phase transition in D(n, p) happens when p = 1/n, which is the same as for G(n, p). The next natural question is whether one can prove an analogue of Theorem 1.2 for random digraphs when np → 1. We will prove the following theorem, which shows that the situation for random digraphs has similarities to the random undirected graph.

Theorem 6.1. Let np = 1 + ε, such that ε = ε(n) → 0, and let ω(n) → ∞.

(i) If ε³n → −∞, then a.a.s. every component in D(n, p) is either a vertex or a cycle of length at most ω(n)/|ε|.

(ii) If ε³n → ∞, then a.a.s. D(n, p) contains a unique complex component, which has order (4 + o(1))·ε²n, while every other component is either a vertex or a cycle of length at most ω(n)/ε.


Chapter 2

Tools

In this chapter we will present some important results from different areas of mathematics, which will prove useful in later chapters.

In Section 2.1 we present the concept of generating functions, and a theorem about the asymptotic growth of the coefficients of certain generating functions, which can be derived using singularity analysis of complex functions.

In Section 2.2 we present some basic results from the theory of branching processes. We will then sketch a proof of parts (i) and (iii) of Theorem 1.1 using a branching process. In Chapters 4 and 5 we will use branching processes to study the phase transitions in the minimum degree graph process from Section 1.6 and in random graphs with a given degree sequence from Section 1.5, respectively. However, the branching process argument is easier and more transparent in the case of G(n, p), and it is therefore useful to consider this case first, so that it can serve as a model for the discussion of the other processes.

In Section 2.3 we present a method using differential equations to study discrete random variables defined on graph processes, and Section 2.4 contains a martingale central limit theorem, which we will use in Chapter 3 to show asymptotic joint normality of a sequence of random variables.

2.1 Generating functions

Generating functions will be used throughout this dissertation, often because their usage significantly simplifies calculations, but sometimes we will also use properties of the generating functions themselves. We therefore include a short introduction of generating functions and present a theorem which provides a connection between the asymptotic behaviour of a sequence and

the singularities of the corresponding generating function, considered as a function in the complex plane. A nice exposition of generating functions has been written by Wilf [55]. A more comprehensive book, which especially focuses on the analytic aspects of generating functions, is being written by Flajolet and Sedgewick [20].

Given a sequence (a_0, a_1, a_2, …) of real numbers, the generating function of the sequence is the formal power series

f(z) = ∑_{k≥0} a_k·z^k.

Given a generating function f(z), we use the notation ⟨z^k⟩f(z) to denote the coefficient of z^k in f(z), so in this example ⟨z^k⟩f(z) = a_k.

We will often work with probability generating functions. Let X be a random variable which takes nonnegative integers as values, and let p_k = P[X = k] for k ≥ 0. Then the probability generating function corresponding to X is

p(z) = ∑_{k≥0} p_k·z^k.

Usually we have p(1) = 1; in some cases, however, we allow the event X = ∞ to have nonzero probability, and in this case p(1) = P[X < ∞] < 1. Several properties of X are easy to derive from the probability generating function.

Assume that p(1) = 1. Then the expected value of X is

E[X] = ∑_{k≥0} k·p_k = p′(1),

and the variance is

Var[X] = p″(1) + p′(1) − p′(1)².
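Since p′(1) = ∑ k·p_k and p″(1) = ∑ k(k−1)·p_k, these formulas are easy to verify on any finitely supported distribution; the following small sketch (illustrative names) does so for Binomial(2, 1/2):

```python
def pgf_moments(p):
    """Mean and variance of X from its masses p[k] = P[X = k],
    using p'(1) = sum k*p_k and p''(1) = sum k*(k-1)*p_k.
    """
    d1 = sum(k * pk for k, pk in enumerate(p))            # p'(1)
    d2 = sum(k * (k - 1) * pk for k, pk in enumerate(p))  # p''(1)
    return d1, d2 + d1 - d1**2

# Binomial(2, 1/2) has masses (1/4, 1/2, 1/4): mean 1, variance 1/2.
mean, var = pgf_moments([0.25, 0.5, 0.25])
assert abs(mean - 1.0) < 1e-12 and abs(var - 0.5) < 1e-12
```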

The generating function f(z) of a sequence {a_0, a_1, …} can be used to determine the asymptotic growth of the sequence. Let ρ be the radius of convergence of f(z): if z is a point in the complex plane, then ∑_k a_k·z^k converges whenever |z| < ρ, and diverges whenever |z| > ρ. Then a_n = ρ^{−n}·θ(n), where θ(n) is a subexponential factor. Sometimes the notation a_n ⋈ ρ^{−n} is used to denote this.

The formal power series can then be viewed as a function in the complex plane, which is analytic in a disc of radius ρ around the origin. We know from complex analysis that an analytic continuation of this function, where one exists, is unique, and we also know that the function must have a singularity on the circle |z| = ρ. Furthermore, Pringsheim's
