Strong induction - A closer look at induction

2. A closer look at induction


$(n^2)^{m+1} = n^{2(m+1)} = n^{(m+2)+m}$ (since $2(m+1) = (m+2)+m$)

$= n^{m+2} n^m$, so that $n^{m+2} \mid (n^2)^{m+1}$.

We have $n \mid n$. Hence, Corollary 2.57 (applied to $d = n$) yields $a^n \equiv b^n \mod nn$.

In other words, $a^n \equiv b^n \mod n^2$ (since $nn = n^2$).

We have assumed that (88) holds for $k = m$. Hence, we can apply (88) to $a^n$, $b^n$, $n^2$ and $m$ instead of $a$, $b$, $n$ and $k$ (since $a^n \equiv b^n \mod n^2$). We thus conclude that

$(a^n)^{n^m} \equiv (b^n)^{n^m} \mod (n^2)^{m+1}$.

Now, $n^{m+1} = n n^m$, so that

$a^{n^{m+1}} = a^{n n^m} = (a^n)^{n^m} \equiv (b^n)^{n^m} = b^{n n^m} = b^{n^{m+1}} \mod (n^2)^{m+1}$

(since $n n^m = n^{m+1}$). Hence, Proposition 2.11(c) (applied to $a^{n^{m+1}}$, $b^{n^{m+1}}$, $(n^2)^{m+1}$ and $n^{m+2}$ instead of $a$, $b$, $n$ and $m$) yields $a^{n^{m+1}} \equiv b^{n^{m+1}} \mod n^{m+2}$ (since $n^{m+2} \mid (n^2)^{m+1}$). In view of $m+2 = (m+1)+1$, this rewrites as $a^{n^{m+1}} \equiv b^{n^{m+1}} \mod n^{(m+1)+1}$.

Now, forget that we fixed $n$, $a$ and $b$. We thus have proven that $a^{n^{m+1}} \equiv b^{n^{m+1}} \mod n^{(m+1)+1}$ for all integers $a$ and $b$ and all $n \in \mathbb{N}$ satisfying $a \equiv b \mod n$. In other words, (88) holds for $k = m+1$. This completes the induction step. Thus, (88) is proven by induction. Hence, Corollary 2.59 is proven again.
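As a sanity check (not a proof), the congruence that (88) asserts can be tested numerically on small inputs. The sketch below assumes that (88) for a given $k$ reads $a^{n^k} \equiv b^{n^k} \mod n^{k+1}$ whenever $a \equiv b \mod n$, as quoted above for $k = m+1$. The function names and search bounds are illustrative choices, and $n = 0$ is skipped to avoid a zero modulus.

```python
def congruence_88_holds(a: int, b: int, n: int, k: int) -> bool:
    """Check a^(n^k) ≡ b^(n^k) (mod n^(k+1)) for one choice of a, b, n, k (n >= 1)."""
    exponent = n ** k
    modulus = n ** (k + 1)
    # Three-argument pow reduces modulo `modulus`; for a positive modulus
    # the result lies in range(modulus), also when the base is negative.
    return pow(a, exponent, modulus) == pow(b, exponent, modulus)


def find_counterexample(max_n: int = 5, max_k: int = 3, span: int = 6):
    """Search small cases with a ≡ b (mod n); return a failing tuple or None."""
    for n in range(1, max_n + 1):
        for k in range(max_k + 1):
            for a in range(-span, span + 1):
                for t in range(-2, 3):
                    b = a + t * n  # guarantees a ≡ b (mod n)
                    if not congruence_88_holds(a, b, n, k):
                        return (a, b, n, k)
    return None
```

Calling `find_counterexample()` scans a few thousand cases; a return value of `None` means that no violation was found within the chosen bounds, which is evidence for the statement but does not replace the induction proof.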

2.8. Strong induction

2.8.1. The strong induction principle

We shall now show another “alternative induction principle”, which is known as the strong induction principle because it feels stronger than Theorem 2.1 (in the sense that it appears to get the same conclusion from weaker assumptions). Just as Theorem 2.53, this principle is not a new axiom, but rather a consequence of the standard induction principle; we shall soon deduce it from Theorem 2.53.

Theorem 2.60. Let $g \in \mathbb{Z}$. For each $n \in \mathbb{Z}_{\geq g}$, let $A(n)$ be a logical statement.

Assume the following:

Assumption 1: If $m \in \mathbb{Z}_{\geq g}$ is such that ($A(n)$ holds for every $n \in \mathbb{Z}_{\geq g}$ satisfying $n < m$), then $A(m)$ holds.

Then, $A(n)$ holds for each $n \in \mathbb{Z}_{\geq g}$.

Notice that Theorem 2.60 has only one assumption (unlike Theorem 2.1 and Theorem 2.53). We shall soon see that this one assumption “incorporates” both an induction base and an induction step.

Let us first explain why Theorem 2.60 is intuitively clear. For example, if you have $g = 4$, and you want to prove (under the assumptions of Theorem 2.60) that $A(7)$ holds, you can argue as follows:

• We know that $A(n)$ holds for every $n \in \mathbb{Z}_{\geq 4}$ satisfying $n < 4$. (Indeed, this is vacuously true, since there is no $n \in \mathbb{Z}_{\geq 4}$ satisfying $n < 4$.)

Hence, Assumption 1 (applied to $m = 4$) shows that the statement $A(4)$ holds.

• Thus, we know that $A(n)$ holds for every $n \in \mathbb{Z}_{\geq 4}$ satisfying $n < 5$ (because $A(4)$ holds).

Hence, Assumption 1 (applied to $m = 5$) shows that the statement $A(5)$ holds.

• Thus, we know that $A(n)$ holds for every $n \in \mathbb{Z}_{\geq 4}$ satisfying $n < 6$ (because $A(4)$ and $A(5)$ hold).

Hence, Assumption 1 (applied to $m = 6$) shows that the statement $A(6)$ holds.

• Thus, we know that $A(n)$ holds for every $n \in \mathbb{Z}_{\geq 4}$ satisfying $n < 7$ (because $A(4)$, $A(5)$ and $A(6)$ hold).

Hence, Assumption 1 (applied to $m = 7$) shows that the statement $A(7)$ holds.

A similar (but longer) argument shows that the statement $A(8)$ holds; likewise, $A(n)$ can be shown to hold for each $n \in \mathbb{Z}_{\geq g}$ by means of an argument that takes $n-g+1$ steps.
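The unrolled argument above can be mirrored by a short bookkeeping loop. The following sketch proves nothing by itself: it only records, for each $m$ from $g$ up to a target value, which earlier statements must already be established before Assumption 1 is applied to $m$. The function name `unroll` and its return format are illustrative assumptions.

```python
from typing import List, Tuple


def unroll(g: int, target: int) -> List[Tuple[int, List[int]]]:
    """List the applications of Assumption 1 needed to reach A(target), in order.

    Each entry (m, earlier) records that Assumption 1 is applied to m,
    relying on the statements A(n) for the integers n in `earlier`.
    """
    proven: List[int] = []
    log: List[Tuple[int, List[int]]] = []
    for m in range(g, target + 1):
        earlier = list(range(g, m))  # all n in Z_{>=g} with n < m (empty for m = g)
        # Every statement that Assumption 1 relies on has been established already.
        assert all(n in proven for n in earlier)
        log.append((m, earlier))
        proven.append(m)
    return log
```

For $g = 4$ and target $7$, the log has four entries, matching the four bullet points and the count $n - g + 1 = 7 - 4 + 1$; its first entry relies on no earlier statement, which is the vacuous case.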

It is easy to see that Theorem 2.60 generalizes Theorem 2.53 (because if the two Assumptions 1 and 2 of Theorem 2.53 hold, then so does Assumption 1 of Theorem 2.60). More interesting for us is the converse implication: We shall show that Theorem 2.60 can be derived from Theorem 2.53. This will allow us to use Theorem 2.60 without having to take it on trust.

Before we derive Theorem 2.60, let us restate Theorem 2.53 as follows:

Corollary 2.61. Let $g \in \mathbb{Z}$. For each $n \in \mathbb{Z}_{\geq g}$, let $B(n)$ be a logical statement.

Assume the following:

Assumption A: The statement $B(g)$ holds.

Assumption B: If $p \in \mathbb{Z}_{\geq g}$ is such that $B(p)$ holds, then $B(p+1)$ also holds.

Then, $B(n)$ holds for each $n \in \mathbb{Z}_{\geq g}$.

Proof of Corollary 2.61. Corollary 2.61 is exactly Theorem 2.53, except that some names have been changed:

• The statements $A(n)$ have been renamed as $B(n)$.

• Assumption 1 and Assumption 2 have been renamed as Assumption A and Assumption B.

• The variable $m$ in Assumption B has been renamed as $p$.

Thus, Corollary 2.61 holds (since Theorem 2.53 holds).

Let us now derive Theorem 2.60 from Theorem 2.53:

Proof of Theorem 2.60. For each $n \in \mathbb{Z}_{\geq g}$, we let $B(n)$ be the statement ($A(q)$ holds for every $q \in \mathbb{Z}_{\geq g}$ satisfying $q < n$).

Now, let us consider the Assumptions A and B from Corollary 2.61. We claim that both of these assumptions are satisfied.

The statement $B(g)$ holds⁶⁵. Thus, Assumption A is satisfied.

Next, let us prove that Assumption B is satisfied. Indeed, let $p \in \mathbb{Z}_{\geq g}$ be such that $B(p)$ holds. We shall show that $B(p+1)$ also holds.

Indeed, we have assumed that $B(p)$ holds. In other words,

($A(q)$ holds for every $q \in \mathbb{Z}_{\geq g}$ satisfying $q < p$)    (89)

(because the statement $B(p)$ is defined as ($A(q)$ holds for every $q \in \mathbb{Z}_{\geq g}$ satisfying $q < p$)). Renaming the variable $q$ as $n$ in this statement, we conclude that

($A(n)$ holds for every $n \in \mathbb{Z}_{\geq g}$ satisfying $n < p$).    (90)

Hence, Assumption 1 (applied to $m = p$) yields that $A(p)$ holds.

Now, we claim that

($A(q)$ holds for every $q \in \mathbb{Z}_{\geq g}$ satisfying $q < p+1$).    (91)

[Proof of (91): Let $q \in \mathbb{Z}_{\geq g}$ be such that $q < p+1$. We must prove that $A(q)$ holds.

If $q = p$, then this follows from the fact that $A(p)$ holds. Hence, for the rest of this proof, we WLOG assume that we don’t have $q = p$. Thus, $q \neq p$. But $q < p+1$ and therefore $q \leq (p+1)-1$ (since $q$ and $p+1$ are integers). Hence, $q \leq (p+1)-1 = p$. Combining this with $q \neq p$, we obtain $q < p$. Hence, (89) shows that $A(q)$ holds. This completes the proof of (91).]

⁶⁵ Proof. Let $q \in \mathbb{Z}_{\geq g}$ be such that $q < g$. Then, $q \geq g$ (since $q \in \mathbb{Z}_{\geq g}$); but this contradicts $q < g$.

Now, forget that we fixed $q$. We thus have found a contradiction for each $q \in \mathbb{Z}_{\geq g}$ satisfying $q < g$. Hence, there exists no $q \in \mathbb{Z}_{\geq g}$ satisfying $q < g$. Thus, the statement ($A(q)$ holds for every $q \in \mathbb{Z}_{\geq g}$ satisfying $q < g$) is vacuously true, and therefore true. In other words, the statement $B(g)$ is true (since $B(g)$ is defined as the statement ($A(q)$ holds for every $q \in \mathbb{Z}_{\geq g}$ satisfying $q < g$)). Qed.

But the statement $B(p+1)$ is defined as ($A(q)$ holds for every $q \in \mathbb{Z}_{\geq g}$ satisfying $q < p+1$). In other words, the statement $B(p+1)$ is precisely the statement (91). Hence, the statement $B(p+1)$ holds (since (91) holds).

Now, forget that we fixed $p$. We thus have shown that if $p \in \mathbb{Z}_{\geq g}$ is such that $B(p)$ holds, then $B(p+1)$ also holds. In other words, Assumption B is satisfied.

We now know that both Assumption A and Assumption B are satisfied. Hence, Corollary 2.61 shows that

$B(n)$ holds for each $n \in \mathbb{Z}_{\geq g}$.    (92)

Now, let $n \in \mathbb{Z}_{\geq g}$. Thus, $n$ is an integer such that $n \geq g$ (by the definition of $\mathbb{Z}_{\geq g}$). Hence, $n+1$ is also an integer and satisfies $n+1 \geq n \geq g$, so that $n+1 \in \mathbb{Z}_{\geq g}$. Hence, (92) (applied to $n+1$ instead of $n$) shows that $B(n+1)$ holds. In other words, ($A(q)$ holds for every $q \in \mathbb{Z}_{\geq g}$ satisfying $q < n+1$) (because the statement $B(n+1)$ is defined as ($A(q)$ holds for every $q \in \mathbb{Z}_{\geq g}$ satisfying $q < n+1$)). We can apply this to $q = n$ (because $n \in \mathbb{Z}_{\geq g}$ satisfies $n < n+1$), and conclude that $A(n)$ holds.

Now, forget that we fixed $n$. We thus have shown that $A(n)$ holds for each $n \in \mathbb{Z}_{\geq g}$. This proves Theorem 2.60.

Thus, proving a sequence of statements A(0),A(1),A(2), . . . using Theorem 2.60 is tantamount to proving a slightly different sequence of statements

B(0),B(1),B(2), . . . using Corollary 2.61 and then deriving the former from the latter.
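The passage from the statements $A(n)$ to the statements $B(n)$ can be illustrated with Boolean-valued functions. In the sketch below, `make_B` builds $B$ from $A$ exactly as in the proof of Theorem 2.60. The sample statement `sample_A` (saying that $2^n \geq n^2$) and the choice $g = 4$ are assumptions made only for this illustration.

```python
from typing import Callable

Statement = Callable[[int], bool]


def make_B(A: Statement, g: int) -> Statement:
    """Return B, where B(n) means: A(q) holds for every q in Z_{>=g} with q < n."""
    def B(n: int) -> bool:
        # range(g, n) lists exactly the integers q with g <= q < n;
        # it is empty for n = g, so B(g) is vacuously true.
        return all(A(q) for q in range(g, n))
    return B


# Hypothetical sample statement (an assumption for illustration): A(n) says 2^n >= n^2.
def sample_A(n: int) -> bool:
    return 2 ** n >= n * n


g = 4
B = make_B(sample_A, g)
```

Here `B(g)` is true whatever `A` is, which corresponds to Assumption A, and `B(n + 1)` being true forces `sample_A(n)` to be true, which is the last step of the proof (applying the statement $B(n+1)$ to $q = n$).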

Theorem 2.60 is called the principle of strong induction starting at $g$, and proofs that use it are usually called proofs by strong induction. We illustrate its use on the following easy property of the Fibonacci sequence:

Proposition 2.62. Let $(f_0, f_1, f_2, \ldots)$ be the Fibonacci sequence (defined as in Example 2.25). Then,

$f_n \leq 2^{n-1}$    (93)

for each $n \in \mathbb{N}$.

Proof of Proposition 2.62. For each $n \in \mathbb{Z}_{\geq 0}$, we let $A(n)$ be the statement ($f_n \leq 2^{n-1}$). Thus, $A(0)$ is the statement ($f_0 \leq 2^{0-1}$); hence, this statement holds (since $f_0 = 0 \leq 2^{0-1}$).

Also, $A(1)$ is the statement ($f_1 \leq 2^{1-1}$) (by the definition of $A(1)$); hence, this statement also holds (since $f_1 = 1 = 2^{1-1}$).

Now, we claim the following:

Claim 1: If $m \in \mathbb{Z}_{\geq 0}$ is such that ($A(n)$ holds for every $n \in \mathbb{Z}_{\geq 0}$ satisfying $n < m$), then $A(m)$ holds.

[Proof of Claim 1: Let $m \in \mathbb{Z}_{\geq 0}$ be such that

($A(n)$ holds for every $n \in \mathbb{Z}_{\geq 0}$ satisfying $n < m$).    (94)

We must prove that $A(m)$ holds.

This is true if $m \in \{0, 1\}$ (because we have shown that both statements $A(0)$ and $A(1)$ hold). Thus, for the rest of the proof of Claim 1, we WLOG assume that we don’t have $m \in \{0, 1\}$. Hence, $m \in \mathbb{N} \setminus \{0, 1\} = \{2, 3, 4, \ldots\}$, so that $m \geq 2$.

From $m \geq 2$, we conclude that $m-1 \geq 2-1 = 1 \geq 0$ and $m-2 \geq 2-2 = 0$. Thus, both $m-1$ and $m-2$ belong to $\mathbb{N}$; therefore, $f_{m-1}$ and $f_{m-2}$ are well-defined.

We have $m-1 \in \mathbb{N} = \mathbb{Z}_{\geq 0}$ and $m-1 < m$. Hence, (94) (applied to $n = m-1$) yields that $A(m-1)$ holds. In other words, $f_{m-1} \leq 2^{(m-1)-1}$ (because this is what the statement $A(m-1)$ says).

We have $m-2 \in \mathbb{N} = \mathbb{Z}_{\geq 0}$ and $m-2 < m$. Hence, (94) (applied to $n = m-2$) yields that $A(m-2)$ holds. In other words, $f_{m-2} \leq 2^{(m-2)-1}$ (because this is what the statement $A(m-2)$ says).

We have $(m-1)-1 = m-2$ and thus $2^{(m-1)-1} = 2^{m-2} = 2 \cdot 2^{(m-2)-1} \geq 2^{(m-2)-1}$ (since $2 \cdot 2^{(m-2)-1} - 2^{(m-2)-1} = 2^{(m-2)-1} \geq 0$). Hence, $2^{(m-2)-1} \leq 2^{(m-1)-1}$.

But the recursive definition of the Fibonacci sequence yields $f_m = f_{m-1} + f_{m-2}$ (since $m \geq 2$). Hence,

$$f_m = \underbrace{f_{m-1}}_{\leq 2^{(m-1)-1}} + \underbrace{f_{m-2}}_{\leq 2^{(m-2)-1} \leq 2^{(m-1)-1}} \leq 2^{(m-1)-1} + 2^{(m-1)-1} = 2 \cdot 2^{(m-1)-1} = 2^{m-1}.$$

In other words, the statement $A(m)$ holds (since the statement $A(m)$ is defined to be ($f_m \leq 2^{m-1}$)). This completes the proof of Claim 1.]

Claim 1 shows that Assumption 1 of Theorem 2.60 (applied to $g = 0$) is satisfied. Hence, Theorem 2.60 (applied to $g = 0$) shows that $A(n)$ holds for each $n \in \mathbb{Z}_{\geq 0}$. In other words, $f_n \leq 2^{n-1}$ holds for each $n \in \mathbb{Z}_{\geq 0}$ (since the statement $A(n)$ is defined to be ($f_n \leq 2^{n-1}$)). In other words, $f_n \leq 2^{n-1}$ holds for each $n \in \mathbb{N}$ (since $\mathbb{Z}_{\geq 0} = \mathbb{N}$). This proves Proposition 2.62.
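The inequality (93) can also be checked numerically for small $n$, which is a useful sanity test though of course no substitute for the proof. The sketch below assumes the usual recursion $f_0 = 0$, $f_1 = 1$, $f_n = f_{n-1} + f_{n-2}$ for the Fibonacci sequence, and uses exact rational arithmetic so that the case $n = 0$ (where the bound is $2^{-1}$) is handled without rounding. The function names are illustrative.

```python
from fractions import Fraction
from typing import List


def fibonacci(count: int) -> List[int]:
    """Return the first `count` Fibonacci numbers f_0, f_1, ..., f_{count-1}."""
    fs = [0, 1]
    while len(fs) < count:
        fs.append(fs[-1] + fs[-2])  # f_m = f_{m-1} + f_{m-2} for m >= 2
    return fs[:count]


def bound_holds(limit: int) -> bool:
    """Check f_n <= 2^(n-1) for all n with 0 <= n < limit."""
    two = Fraction(2)
    # Fraction supports negative integer exponents exactly: Fraction(2) ** -1 == 1/2.
    return all(f <= two ** (n - 1) for n, f in enumerate(fibonacci(limit)))
```

For instance, `bound_holds(50)` checks the first fifty cases; equality occurs at $n = 1$ and $n = 2$, where $f_n = 2^{n-1}$.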

2.8.2. Conventions for writing strong induction proofs

Again, when using the principle of strong induction, one commonly does not directly cite Theorem 2.60; instead one uses the following language:

Convention 2.63. Let $g \in \mathbb{Z}$. For each $n \in \mathbb{Z}_{\geq g}$, let $A(n)$ be a logical statement.

Assume that you want to prove that $A(n)$ holds for each $n \in \mathbb{Z}_{\geq g}$.

Theorem 2.60 offers the following strategy for proving this: Show that Assumption 1 of Theorem 2.60 is satisfied; then, Theorem 2.60 automatically completes your proof.

A proof that follows this strategy is called a proof by strong induction on n starting at g. The proof that Assumption 1 is satisfied is called the induction step of the proof. This kind of proof does not have an “induction base” (unlike proofs that use Theorem 2.1 or Theorem 2.53).⁶⁶

In order to prove that Assumption 1 is satisfied, you will usually want to fix an $m \in \mathbb{Z}_{\geq g}$ such that

($A(n)$ holds for every $n \in \mathbb{Z}_{\geq g}$ satisfying $n < m$),    (95)

and then prove that $A(m)$ holds. In other words, you will usually want to fix $m \in \mathbb{Z}_{\geq g}$, assume that (95) holds, and then prove that $A(m)$ holds. When doing so, it is common to refer to the assumption that (95) holds as the induction hypothesis (or induction assumption).

Using this language, we can rewrite our above proof of Proposition 2.62 as follows:

Proof of Proposition 2.62 (second version). For each $n \in \mathbb{Z}_{\geq 0}$, we let $A(n)$ be the statement ($f_n \leq 2^{n-1}$). Thus, our goal is to prove the statement $A(n)$ for each $n \in \mathbb{N}$. In other words, our goal is to prove the statement $A(n)$ for each $n \in \mathbb{Z}_{\geq 0}$ (since $\mathbb{N} = \mathbb{Z}_{\geq 0}$).

We shall prove this by strong induction on $n$ starting at $0$:

Induction step: Let $m \in \mathbb{Z}_{\geq 0}$. Assume that

($A(n)$ holds for every $n \in \mathbb{Z}_{\geq 0}$ satisfying $n < m$).    (96)

We must then show that $A(m)$ holds. In other words, we must show that $f_m \leq 2^{m-1}$ holds (since the statement $A(m)$ is defined as ($f_m \leq 2^{m-1}$)).

This is true if $m = 0$ (since $f_0 = 0 \leq 2^{0-1}$) and also true if $m = 1$ (since $f_1 = 1 = 2^{1-1}$ and thus $f_1 \leq 2^{1-1}$). In other words, this is true if $m \in \{0, 1\}$. Thus, for the rest of the induction step, we WLOG assume that we don’t have $m \in \{0, 1\}$. Hence, $m \notin \{0, 1\}$, so that $m \in \mathbb{N} \setminus \{0, 1\} = \{2, 3, 4, \ldots\}$. Hence, $m \geq 2$.

From $m \geq 2$, we conclude that $m-1 \geq 2-1 = 1 \geq 0$ and $m-2 \geq 2-2 = 0$. Thus, both $m-1$ and $m-2$ belong to $\mathbb{N}$; therefore, $f_{m-1}$ and $f_{m-2}$ are well-defined.

We have $m-1 \in \mathbb{N} = \mathbb{Z}_{\geq 0}$ and $m-1 < m$. Hence, (96) (applied to $n = m-1$) yields that $A(m-1)$ holds. In other words, $f_{m-1} \leq 2^{(m-1)-1}$ (because this is what the statement $A(m-1)$ says).

⁶⁶ There is a version of strong induction which does include an induction base (or even several). But the version we are using does not.

We have $m-2 \in \mathbb{N} = \mathbb{Z}_{\geq 0}$ and $m-2 < m$. Hence, (96) (applied to $n = m-2$) yields that $A(m-2)$ holds. In other words, $f_{m-2} \leq 2^{(m-2)-1}$ (because this is what the statement $A(m-2)$ says).

We have $(m-1)-1 = m-2$ and thus $2^{(m-1)-1} = 2^{m-2} = 2 \cdot 2^{(m-2)-1} \geq 2^{(m-2)-1}$ (since $2 \cdot 2^{(m-2)-1} - 2^{(m-2)-1} = 2^{(m-2)-1} \geq 0$). Hence, $2^{(m-2)-1} \leq 2^{(m-1)-1}$.

But the recursive definition of the Fibonacci sequence yields $f_m = f_{m-1} + f_{m-2}$ (since $m \geq 2$). Hence,

$$f_m = \underbrace{f_{m-1}}_{\leq 2^{(m-1)-1}} + \underbrace{f_{m-2}}_{\leq 2^{(m-2)-1} \leq 2^{(m-1)-1}} \leq 2^{(m-1)-1} + 2^{(m-1)-1} = 2 \cdot 2^{(m-1)-1} = 2^{m-1}.$$

In other words, the statement $A(m)$ holds (since the statement $A(m)$ is defined to be ($f_m \leq 2^{m-1}$)).

Now, forget that we fixed $m$. We thus have shown that if $m \in \mathbb{Z}_{\geq 0}$ is such that (96) holds, then $A(m)$ holds. This completes the induction step. Hence, by strong induction, we conclude that $A(n)$ holds for each $n \in \mathbb{Z}_{\geq 0}$. This completes our proof of Proposition 2.62.

The proof that we just showed still has a lot of “boilerplate” text that conveys no information. For example, we have again explicitly defined the statement A(n), which is unnecessary: This statement is exactly what one would expect (namely, the claim that we are proving, without the “for each n ∈ N” part). Thus, in our case, this statement is simply (93). Furthermore, we can remove the two sentences

“Now, forget that we fixed $m$. We thus have shown that if $m \in \mathbb{Z}_{\geq 0}$ is such that (96) holds, then $A(m)$ holds.”.

In fact, these sentences merely say that we have completed the induction step; but this is clear anyway when we say that the induction step is completed.

We said that we are proving our statement “by strong induction on $n$ starting at $0$”. Again, we can omit the words “starting at $0$” here, since this is the only option (because our statement is about all $n \in \mathbb{Z}_{\geq 0}$).

Finally, we can remove the words “Induction step:”, because a proof by strong induction (unlike a proof by standard induction) does not have an induction base (so the induction step is all that it consists of).

Thus, our above proof can be shortened to the following:

Proof of Proposition 2.62 (third version). We shall prove (93) by strong induction on n:

Let $m \in \mathbb{Z}_{\geq 0}$. Assume that (93) holds for every $n \in \mathbb{Z}_{\geq 0}$ satisfying $n < m$. We must then show that (93) holds for $n = m$. In other words, we must show that $f_m \leq 2^{m-1}$ holds.

This is true if $m = 0$ (since $f_0 = 0 \leq 2^{0-1}$) and also true if $m = 1$ (since $f_1 = 1 = 2^{1-1}$ and thus $f_1 \leq 2^{1-1}$). In other words, this is true if $m \in \{0, 1\}$. Thus, for the rest of the induction step, we WLOG assume that we don’t have $m \in \{0, 1\}$. Hence, $m \notin \{0, 1\}$, so that $m \in \mathbb{N} \setminus \{0, 1\} = \{2, 3, 4, \ldots\}$. Hence, $m \geq 2$.

From $m \geq 2$, we conclude that $m-1 \geq 2-1 = 1 \geq 0$ and $m-2 \geq 2-2 = 0$. Thus, both $m-1$ and $m-2$ belong to $\mathbb{N}$; therefore, $f_{m-1}$ and $f_{m-2}$ are well-defined.

We have $m-1 \in \mathbb{N} = \mathbb{Z}_{\geq 0}$ and $m-1 < m$. Hence, (93) (applied to $n = m-1$) yields that $f_{m-1} \leq 2^{(m-1)-1}$ (since we have assumed that (93) holds for every $n \in \mathbb{Z}_{\geq 0}$ satisfying $n < m$).

We have $m-2 \in \mathbb{N} = \mathbb{Z}_{\geq 0}$ and $m-2 < m$. Hence, (93) (applied to $n = m-2$) yields that $f_{m-2} \leq 2^{(m-2)-1}$ (since we have assumed that (93) holds for every $n \in \mathbb{Z}_{\geq 0}$ satisfying $n < m$).

But the recursive definition of the Fibonacci sequence yields $f_m = f_{m-1} + f_{m-2}$ (since $m \geq 2$). Hence,

$$f_m = \underbrace{f_{m-1}}_{\leq 2^{(m-1)-1}} + \underbrace{f_{m-2}}_{\leq 2^{(m-2)-1} \leq 2^{(m-1)-1}} \leq 2^{(m-1)-1} + 2^{(m-1)-1} = 2 \cdot 2^{(m-1)-1} = 2^{m-1}.$$

In other words, (93) holds for $n = m$. This completes the induction step. Hence, by strong induction, we conclude that (93) holds for each $n \in \mathbb{Z}_{\geq 0}$. In other words, (93) holds for each $n \in \mathbb{N}$ (since $\mathbb{Z}_{\geq 0} = \mathbb{N}$). This completes our proof of Proposition 2.62.

From the document: Notes on the combinatorial fundamentals of algebra (pages 115-122)