
2. A closer look at induction

2.7. Induction with shifted base

2.7.1. Induction starting at g

All the induction proofs we have done so far were applications of Theorem 2.1 (even though we have often written them up in ways that hide the exact statements $A(n)$ to which Theorem 2.1 is being applied). We are soon going to see several other “induction principles” which can also be used to make proofs. Unlike Theorem 2.1, these other principles need not be taken on trust; instead, they can themselves be proven using Theorem 2.1. Thus, they merely offer convenience, not new logical opportunities.

Our first such “alternative induction principle” is Theorem 2.53 below. First, we introduce a simple notation:

Definition 2.52. Let $g \in \mathbb{Z}$. Then, $\mathbb{Z}_{\geq g}$ denotes the set $\{g, g+1, g+2, \ldots\}$; this is the set of all integers that are $\geq g$.

For example, $\mathbb{Z}_{\geq 0} = \{0, 1, 2, \ldots\} = \mathbb{N}$ is the set of all nonnegative integers, whereas $\mathbb{Z}_{\geq 1} = \{1, 2, 3, \ldots\}$ is the set of all positive integers.
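As a small illustration of Definition 2.52, the set of all integers that are at least g can be modelled in Python as a lazy, infinite iterator. This is only a sketch; the helper names `integers_from` and `first_elements` are hypothetical and do not appear in the text.

```python
from itertools import count, islice

def integers_from(g):
    """Lazily enumerate the set {g, g+1, g+2, ...} (hypothetical helper)."""
    return count(g)

def first_elements(g, how_many):
    """Return the first `how_many` elements of {g, g+1, g+2, ...} as a list."""
    return list(islice(integers_from(g), how_many))

print(first_elements(0, 3))   # the nonnegative integers start with 0, 1, 2
print(first_elements(1, 3))   # the positive integers start with 1, 2, 3
print(first_elements(-2, 4))  # g is allowed to be negative as well
```

The iterator never terminates on its own, which mirrors the fact that the set is infinite; `islice` is what cuts off a finite prefix.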

Now, we state our first “alternative induction principle”:

Theorem 2.53. Let $g \in \mathbb{Z}$. For each $n \in \mathbb{Z}_{\geq g}$, let $A(n)$ be a logical statement.

Assume the following:

Assumption 1: The statement $A(g)$ holds.

Assumption 2: If $m \in \mathbb{Z}_{\geq g}$ is such that $A(m)$ holds, then $A(m+1)$ also holds.

Then, $A(n)$ holds for each $n \in \mathbb{Z}_{\geq g}$.

Again, Theorem 2.53 is intuitively clear: For example, if you have $g = 4$, and you want to prove (under the assumptions of Theorem 2.53) that $A(8)$ holds, you can argue as follows:

• By Assumption 1, the statement $A(4)$ holds.

• Thus, by Assumption 2 (applied to $m = 4$), the statement $A(5)$ holds.

• Thus, by Assumption 2 (applied to $m = 5$), the statement $A(6)$ holds.

• Thus, by Assumption 2 (applied to $m = 6$), the statement $A(7)$ holds.

• Thus, by Assumption 2 (applied to $m = 7$), the statement $A(8)$ holds.

A similar (but longer) argument shows that the statement $A(9)$ holds; likewise, $A(n)$ can be shown to hold for each $n \in \mathbb{Z}_{\geq g}$ by means of an argument that takes $n-g+1$ steps.
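The step count $n-g+1$ can be made concrete with a short Python sketch. The function `derivation_steps` is a hypothetical helper: it only records which assumption is used at each step and does not check any actual statement $A(k)$.

```python
def derivation_steps(g, n):
    """List the steps that establish A(n) from Assumptions 1 and 2.

    Hypothetical helper: it records *which* assumption is invoked at
    each step, nothing more.
    """
    if n < g:
        raise ValueError("n must be >= g")
    # One step for the base statement A(g) ...
    steps = [f"A({g}) holds by Assumption 1"]
    # ... and one further step for each m = g, g+1, ..., n-1.
    for m in range(g, n):
        steps.append(f"A({m + 1}) holds by Assumption 2 (applied to m={m})")
    return steps

steps = derivation_steps(4, 8)
for line in steps:
    print(line)
print(len(steps))  # n - g + 1 = 8 - 4 + 1 = 5 steps
```

For $g = 4$ and $n = 8$ this reproduces the five bullet points above.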

Theorem 2.53 generalizes Theorem 2.1. Indeed, Theorem 2.1 is the particular case of Theorem 2.53 for $g = 0$ (since $\mathbb{Z}_{\geq 0} = \mathbb{N}$). However, Theorem 2.53 can also be derived from Theorem 2.1. In order to do this, we essentially need to “shift” the index $n$ in Theorem 2.53 down by $g$; that is, we need to rename our sequence $(A(g), A(g+1), A(g+2), \ldots)$ of statements as $(B(0), B(1), B(2), \ldots)$, and apply Theorem 2.1 to $B(n)$ instead of $A(n)$. In order to make this renaming procedure rigorous, let us first restate Theorem 2.1 as follows:

Corollary 2.54. For each $n \in \mathbb{N}$, let $B(n)$ be a logical statement.

Assume the following:

Assumption A: The statement $B(0)$ holds.

Assumption B: If $p \in \mathbb{N}$ is such that $B(p)$ holds, then $B(p+1)$ also holds.

Then, $B(n)$ holds for each $n \in \mathbb{N}$.

Proof of Corollary 2.54. Corollary 2.54 is exactly Theorem 2.1, except that some names have been changed:

• The statements $A(n)$ have been renamed as $B(n)$.

• Assumption 1 and Assumption 2 have been renamed as Assumption A and Assumption B.

• The variable $m$ in Assumption B has been renamed as $p$.

Thus, Corollary 2.54 holds (since Theorem 2.1 holds).

Let us now derive Theorem 2.53 from Theorem 2.1:

Proof of Theorem 2.53. For any $n \in \mathbb{N}$, we have $n+g \in \mathbb{Z}_{\geq g}$ ⁶⁰. Hence, for each $n \in \mathbb{N}$, we can define a logical statement $B(n)$ by
\[ B(n) = A(n+g). \]
Consider this $B(n)$.

Now, let us consider the Assumptions A and B from Corollary 2.54. We claim that both of these assumptions are satisfied.

Indeed, the statement $A(g)$ holds (by Assumption 1). But the definition of the statement $B(0)$ shows that $B(0) = A(0+g) = A(g)$. Hence, the statement $B(0)$ holds (since the statement $A(g)$ holds). In other words, Assumption A is satisfied.

Now, we shall show that Assumption B is satisfied. Indeed, let $p \in \mathbb{N}$ be such that $B(p)$ holds. The definition of the statement $B(p)$ shows that $B(p) = A(p+g)$. Hence, the statement $A(p+g)$ holds (since $B(p)$ holds).

Also, $p \in \mathbb{N}$, so that $p \geq 0$ and thus $p+g \geq g$. In other words, $p+g \in \mathbb{Z}_{\geq g}$ (since $\mathbb{Z}_{\geq g}$ is the set of all integers that are $\geq g$).

Recall that Assumption 2 holds. In other words, if $m \in \mathbb{Z}_{\geq g}$ is such that $A(m)$ holds, then $A(m+1)$ also holds. Applying this to $m = p+g$, we conclude that $A((p+g)+1)$ holds (since $A(p+g)$ holds).

But the definition of $B(p+1)$ yields
\[ B(p+1) = A\Bigl(\underbrace{p+1+g}_{=(p+g)+1}\Bigr) = A((p+g)+1). \]
Hence, the statement $B(p+1)$ holds (since the statement $A((p+g)+1)$ holds).

Now, forget that we fixed $p$. We thus have shown that if $p \in \mathbb{N}$ is such that $B(p)$ holds, then $B(p+1)$ also holds. In other words, Assumption B is satisfied.

We now know that both Assumption A and Assumption B are satisfied. Hence, Corollary 2.54 shows that
\[ B(n) \text{ holds for each } n \in \mathbb{N}. \tag{83} \]

Now, let $n \in \mathbb{Z}_{\geq g}$. Thus, $n$ is an integer such that $n \geq g$ (by the definition of $\mathbb{Z}_{\geq g}$). Hence, $n-g \geq 0$, so that $n-g \in \mathbb{N}$. Thus, (83) (applied to $n-g$ instead of $n$) yields that $B(n-g)$ holds. But the definition of $B(n-g)$ yields
\[ B(n-g) = A\Bigl(\underbrace{(n-g)+g}_{=n}\Bigr) = A(n). \]
Hence, the statement $A(n)$ holds (since $B(n-g)$ holds).

Now, forget that we fixed $n$. We thus have shown that $A(n)$ holds for each $n \in \mathbb{Z}_{\geq g}$. This proves Theorem 2.53.

⁶⁰ Proof. Let $n \in \mathbb{N}$. Thus, $n \geq 0$, so that $\underbrace{n}_{\geq 0} + g \geq 0 + g = g$. Hence, $n+g$ is an integer $\geq g$. In other words, $n+g \in \mathbb{Z}_{\geq g}$ (since $\mathbb{Z}_{\geq g}$ is the set of all integers that are $\geq g$). Qed.
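The renaming $B(n) = A(n+g)$ used in this proof can be mirrored in Python. The following is a minimal sketch: `shift_down` is a hypothetical helper, and the sample statement `A` (namely "$2^n \geq n+5$", intended for $n \geq 3$) is an assumption chosen purely for illustration. The code does not prove anything; it only shows how the two index ranges correspond.

```python
def shift_down(A, g):
    """Return B with B(n) = A(n + g), so that B is indexed by n = 0, 1, 2, ..."""
    def B(n):
        if n < 0:
            raise ValueError("B(n) is only defined for nonnegative n")
        return A(n + g)
    return B

# Hypothetical sample statement A(n): "2**n >= n + 5", meant for n >= 3.
def A(n):
    return 2 ** n >= n + 5

g = 3
B = shift_down(A, g)

# B(0) is the same statement as A(g) ...
print(B(0) == A(g))
# ... and B(n - g) is the same statement as A(n) for every n >= g.
print(all(B(n - g) == A(n) for n in range(g, g + 20)))
```

The second check corresponds to the last paragraph of the proof, where (83) is applied to $n-g$ instead of $n$.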

Theorem 2.53 is called the principle of induction starting at $g$, and proofs that use it are usually called proofs by induction or induction proofs. As with the standard induction principle (Theorem 2.1), we don’t usually explicitly cite Theorem 2.53, but instead say certain words that signal that it is being applied and that (ideally) also indicate what integer $g$ and what statements $A(n)$ it is being applied to⁶¹. However, for our very first example of the use of Theorem 2.53, we are going to reference it explicitly:

Proposition 2.55. Let $a$ and $b$ be integers. Then, every positive integer $n$ satisfies
\[ (a+b)^n \equiv a^n + n a^{n-1} b \mod b^2. \tag{84} \]

Note that we have chosen not to allow $n = 0$ in Proposition 2.55, because it is not clear what “$a^{n-1}$” would mean when $n = 0$ and $a = 0$. (Recall that $0^{0-1} = 0^{-1}$ is not defined!) In truth, it is easy to convince oneself that this is not a serious hindrance, since the expression “$n a^{n-1}$” has a meaningful interpretation even when its sub-expression “$a^{n-1}$” does not (one just has to interpret it as $0$ when $n = 0$, without regard to whether “$a^{n-1}$” is well-defined). Nevertheless, we prefer to rule out the case of $n = 0$ by requiring $n$ to be positive, in order to avoid having to discuss such questions of interpretation. (Of course, this also gives us an excuse to apply Theorem 2.53 instead of the old Theorem 2.1.)

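Proof of Proposition 2.55. For each $n \in \mathbb{Z}_{\geq 1}$, we let $A(n)$ be the statement
\[ \left( (a+b)^n \equiv a^n + n a^{n-1} b \mod b^2 \right). \]
Our next goal is to prove the statement $A(n)$ for each $n \in \mathbb{Z}_{\geq 1}$.

We first notice that the statement $A(1)$ holds⁶². Now, we claim that
\[ \text{if } m \in \mathbb{Z}_{\geq 1} \text{ is such that } A(m) \text{ holds, then } A(m+1) \text{ also holds.} \tag{85} \]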

⁶¹ We will explain this in Convention 2.56 below.

⁶² Proof. We have $(a+b)^1 = a+b$. Comparing this with $\underbrace{a^1}_{=a} + 1 \underbrace{a^{1-1}}_{=a^0=1} b = a+b$, we obtain $(a+b)^1 = a^1 + 1a^{1-1}b$. Hence, $(a+b)^1 \equiv a^1 + 1a^{1-1}b \mod b^2$. But this is precisely the statement $A(1)$ (since $A(1)$ is defined to be the statement $\left( (a+b)^1 \equiv a^1 + 1a^{1-1}b \mod b^2 \right)$). Hence, the statement $A(1)$ holds.

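[Proof of (85): Let $m \in \mathbb{Z}_{\geq 1}$ be such that $A(m)$ holds. We must show that $A(m+1)$ also holds.

We have assumed that $A(m)$ holds. In other words, $(a+b)^m \equiv a^m + m a^{m-1} b \mod b^2$ holds⁶³. Now,
\begin{align*}
(a+b)^{m+1} &= \underbrace{(a+b)^m}_{\equiv a^m + m a^{m-1} b \mod b^2} (a+b) \\
&\equiv \left( a^m + m a^{m-1} b \right) (a+b) \\
&= \underbrace{a^m a}_{=a^{m+1}} + a^m b + m \underbrace{a^{m-1} b a}_{\substack{= a^{m-1} a b = a^m b \\ (\text{since } a^{m-1} a = a^m)}} + \underbrace{m a^{m-1} b b}_{\substack{= m a^{m-1} b^2 \equiv 0 \mod b^2 \\ (\text{since } b^2 \mid m a^{m-1} b^2)}} \\
&\equiv a^{m+1} + \underbrace{a^m b + m a^m b}_{=(m+1) a^m b} \\
&= a^{m+1} + (m+1) \underbrace{a^m}_{\substack{= a^{(m+1)-1} \\ (\text{since } m = (m+1)-1)}} b \\
&= a^{m+1} + (m+1) a^{(m+1)-1} b \mod b^2.
\end{align*}
So we have shown that $(a+b)^{m+1} \equiv a^{m+1} + (m+1) a^{(m+1)-1} b \mod b^2$. But this is precisely the statement $A(m+1)$ ⁶⁴. Thus, the statement $A(m+1)$ holds.

Now, forget that we fixed $m$. We thus have shown that if $m \in \mathbb{Z}_{\geq 1}$ is such that $A(m)$ holds, then $A(m+1)$ also holds. This proves (85).]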

Now, both assumptions of Theorem 2.53 (applied to $g = 1$) are satisfied (indeed, Assumption 1 is satisfied because the statement $A(1)$ holds, whereas Assumption 2 is satisfied because of (85)). Thus, Theorem 2.53 (applied to $g = 1$) shows that $A(n)$ holds for each $n \in \mathbb{Z}_{\geq 1}$. In other words, $(a+b)^n \equiv a^n + n a^{n-1} b \mod b^2$ holds for each $n \in \mathbb{Z}_{\geq 1}$ (since $A(n)$ is the statement $\left( (a+b)^n \equiv a^n + n a^{n-1} b \mod b^2 \right)$).

In other words, $(a+b)^n \equiv a^n + n a^{n-1} b \mod b^2$ holds for each positive integer $n$ (because the positive integers are exactly the $n \in \mathbb{Z}_{\geq 1}$). This proves Proposition 2.55.
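As a sanity check (not a substitute for the proof), the congruence (84) can be tested numerically for small values. The sketch below is an illustration only; the helper names `congruent` and `prop_2_55_holds` are hypothetical, and the tested ranges are arbitrary.

```python
def congruent(x, y, modulus):
    """Return True if x ≡ y mod modulus, i.e. if modulus divides x - y."""
    if modulus == 0:
        return x == y  # 0 divides z only when z = 0
    return (x - y) % modulus == 0

def prop_2_55_holds(a, b, n):
    """Check (a+b)^n ≡ a^n + n*a^(n-1)*b mod b^2 for a positive integer n."""
    return congruent((a + b) ** n, a ** n + n * a ** (n - 1) * b, b ** 2)

checked = all(
    prop_2_55_holds(a, b, n)
    for a in range(-6, 7)
    for b in range(-6, 7)
    for n in range(1, 9)  # n must be positive, as in Proposition 2.55
)
print(checked)
```

Note that Python evaluates `0 ** 0` as `1`, which agrees with the value $a^0 = 1$ used in the induction base.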

2.7.2. Conventions for writing proofs by induction starting at g

Now, let us introduce some standard language that is commonly used in proofs by induction starting at g:

⁶³ because $A(m)$ is defined to be the statement $\left( (a+b)^m \equiv a^m + m a^{m-1} b \mod b^2 \right)$

⁶⁴ because $A(m+1)$ is defined to be the statement $\left( (a+b)^{m+1} \equiv a^{m+1} + (m+1) a^{(m+1)-1} b \mod b^2 \right)$

Convention 2.56. Let $g \in \mathbb{Z}$. For each $n \in \mathbb{Z}_{\geq g}$, let $A(n)$ be a logical statement.

Assume that you want to prove that $A(n)$ holds for each $n \in \mathbb{Z}_{\geq g}$.

Theorem 2.53 offers the following strategy for proving this: First show that Assumption 1 of Theorem 2.53 is satisfied; then, show that Assumption 2 of Theorem 2.53 is satisfied; then, Theorem 2.53 automatically completes your proof.

A proof that follows this strategy is called a proof by induction on $n$ (or proof by induction over $n$) starting at $g$ or (less precisely) an inductive proof. Most of the time, the words “starting at $g$” are omitted, since they merely repeat what is clear from the context anyway: For example, if you make a claim about all integers $n \geq 3$, and you say that you are proving it by induction on $n$, then it is clear that you are using induction on $n$ starting at $3$. (And if this isn’t clear from the claim, then the induction base will make it clear.)

The proof that Assumption 1 is satisfied is called the induction base (or base case) of the proof. The proof that Assumption 2 is satisfied is called the induction step of the proof.

In order to prove that Assumption 2 is satisfied, you will usually want to fix an $m \in \mathbb{Z}_{\geq g}$ such that $A(m)$ holds, and then prove that $A(m+1)$ holds. In other words, you will usually want to fix $m \in \mathbb{Z}_{\geq g}$, assume that $A(m)$ holds, and then prove that $A(m+1)$ holds. When doing so, it is common to refer to the assumption that $A(m)$ holds as the induction hypothesis (or induction assumption).

Unsurprisingly, this language parallels the language introduced in Convention 2.3 for proofs by “standard” induction.

Again, we can shorten our inductive proofs by omitting some sentences that convey no information. In particular, we can leave out the explicit definition of the statement $A(n)$ when this statement is precisely the claim that we are proving (without the “for each $n \in \mathbb{Z}_{\geq g}$” part). Thus, we can rewrite our above proof of Proposition 2.55 as follows:

Proof of Proposition 2.55 (second version). We must prove (84) for every positive integer $n$. In other words, we must prove (84) for every $n \in \mathbb{Z}_{\geq 1}$ (since the positive integers are precisely the $n \in \mathbb{Z}_{\geq 1}$). We shall prove this by induction on $n$ starting at $1$:

Induction base: We have $(a+b)^1 = a+b$. Comparing this with $\underbrace{a^1}_{=a} + 1 \underbrace{a^{1-1}}_{=a^0=1} b = a+b$, we obtain $(a+b)^1 = a^1 + 1a^{1-1}b$. Hence, $(a+b)^1 \equiv a^1 + 1a^{1-1}b \mod b^2$. In other words, (84) holds for $n = 1$. This completes the induction base.

Induction step: Let $m \in \mathbb{Z}_{\geq 1}$. Assume that (84) holds for $n = m$. We must show that (84) also holds for $n = m+1$.

We have assumed that (84) holds for $n = m$. In other words,
\[ (a+b)^m \equiv a^m + m a^{m-1} b \mod b^2 \]
holds. Now,
\begin{align*}
(a+b)^{m+1} &= \underbrace{(a+b)^m}_{\equiv a^m + m a^{m-1} b \mod b^2} (a+b) \\
&\equiv \left( a^m + m a^{m-1} b \right) (a+b) \\
&= \underbrace{a^m a}_{=a^{m+1}} + a^m b + m \underbrace{a^{m-1} b a}_{\substack{= a^{m-1} a b = a^m b \\ (\text{since } a^{m-1} a = a^m)}} + \underbrace{m a^{m-1} b b}_{\substack{= m a^{m-1} b^2 \equiv 0 \mod b^2 \\ (\text{since } b^2 \mid m a^{m-1} b^2)}} \\
&\equiv a^{m+1} + \underbrace{a^m b + m a^m b}_{=(m+1) a^m b} \\
&= a^{m+1} + (m+1) \underbrace{a^m}_{\substack{= a^{(m+1)-1} \\ (\text{since } m = (m+1)-1)}} b \\
&= a^{m+1} + (m+1) a^{(m+1)-1} b \mod b^2.
\end{align*}
So we have shown that $(a+b)^{m+1} \equiv a^{m+1} + (m+1) a^{(m+1)-1} b \mod b^2$. In other words, (84) holds for $n = m+1$.

Now, forget that we fixed $m$. We thus have shown that if $m \in \mathbb{Z}_{\geq 1}$ is such that (84) holds for $n = m$, then (84) also holds for $n = m+1$. This completes the induction step. Hence, (84) is proven by induction. This proves Proposition 2.55.
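The heart of the induction step is that the product $\left( a^m + m a^{m-1} b \right)(a+b)$ differs from $a^{m+1} + (m+1) a^m b$ by exactly the term $m a^{m-1} b^2$, which is divisible by $b^2$. The following Python sketch checks this identity on a small, arbitrarily chosen grid of integers; the helper name `step_difference` is hypothetical.

```python
def step_difference(a, b, m):
    """Return (a^m + m*a^(m-1)*b)*(a+b) minus (a^(m+1) + (m+1)*a^m*b)."""
    expanded_product = (a ** m + m * a ** (m - 1) * b) * (a + b)
    target = a ** (m + 1) + (m + 1) * a ** m * b
    return expanded_product - target

# The difference should be exactly m * a^(m-1) * b^2, a multiple of b^2.
identity_ok = all(
    step_difference(a, b, m) == m * a ** (m - 1) * b ** 2
    for a in range(-5, 6)
    for b in range(-5, 6)
    for m in range(1, 8)  # m >= 1, as in the induction step
)
print(identity_ok)
```

This only confirms the algebra for the sampled values; the general statement is what the computation in the induction step establishes.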

Proposition 2.55 can also be seen as a consequence of the binomial formula (Proposition 3.21 further below).

2.7.3. More properties of congruences

Let us use this occasion to show two corollaries of Proposition 2.55:

Corollary 2.57. Let $a$, $b$ and $n$ be three integers such that $a \equiv b \mod n$. Let $d \in \mathbb{N}$ be such that $d \mid n$. Then, $a^d \equiv b^d \mod nd$.

Proof of Corollary 2.57. We have $a \equiv b \mod n$. In other words, $a$ is congruent to $b$ modulo $n$. In other words, $n \mid a-b$ (by the definition of “congruent”). In other words, there exists an integer $w$ such that $a-b = nw$. Consider this $w$. From $a-b = nw$, we obtain $a = b+nw$. Also, $d \mid n$, thus $dn \mid nn$ (by Proposition 2.6, applied to $d$, $n$ and $n$ instead of $a$, $b$ and $c$). On the other hand, $nn \mid (nw)^2$ (since $(nw)^2 = nwnw = nnww$). Hence, Proposition 2.5 (applied to $dn$, $nn$ and $(nw)^2$ instead of $a$, $b$ and $c$) yields $dn \mid (nw)^2$ (since $dn \mid nn$ and $nn \mid (nw)^2$). In other words, $nd \mid (nw)^2$ (since $dn = nd$).

Next, we claim that
\[ nd \mid a^d - b^d. \tag{86} \]

[Proof of (86): If $d = 0$, then (86) holds (because if $d = 0$, then $a^d - b^d = \underbrace{a^0}_{=1} - \underbrace{b^0}_{=1} = 1 - 1 = 0 = 0nd$, and thus $nd \mid a^d - b^d$). Hence, for the rest of this proof of (86), we WLOG assume that we don’t have $d = 0$. Thus, $d \neq 0$. Hence, $d$ is a positive integer (since $d \in \mathbb{N}$). Thus, Proposition 2.55 (applied to $d$, $b$ and $nw$ instead of $n$, $a$ and $b$) yields
\[ (b+nw)^d \equiv b^d + d b^{d-1} nw \mod (nw)^2. \]
In view of $a = b+nw$, this rewrites as
\[ a^d \equiv b^d + d b^{d-1} nw \mod (nw)^2. \]
Hence, Proposition 2.11 (c) (applied to $a^d$, $b^d + d b^{d-1} nw$, $(nw)^2$ and $nd$ instead of $a$, $b$, $n$ and $m$) yields
\[ a^d \equiv b^d + d b^{d-1} nw \mod nd \]
(since $nd \mid (nw)^2$). Hence,
\[ a^d \equiv b^d + \underbrace{d b^{d-1} nw}_{\substack{= nd b^{d-1} w \equiv 0 \mod nd \\ (\text{since } nd \mid nd b^{d-1} w)}} \equiv b^d + 0 = b^d \mod nd. \]
In other words, $nd \mid a^d - b^d$. This proves (86).]

From (86), we immediately obtain $a^d \equiv b^d \mod nd$ (by the definition of “congruent”). This proves Corollary 2.57.
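Corollary 2.57 can likewise be spot-checked on small inputs. In the sketch below, the helpers `congruent`, `divides` and `corollary_2_57_holds` are hypothetical names, and the search ranges are arbitrary; the loop only visits those tuples that satisfy the hypotheses $a \equiv b \mod n$ and $d \mid n$.

```python
def congruent(x, y, modulus):
    """Return True if x ≡ y mod modulus, i.e. if modulus divides x - y."""
    if modulus == 0:
        return x == y  # 0 divides z only when z = 0
    return (x - y) % modulus == 0

def divides(d, n):
    """Return True if d | n, i.e. if n = d*c for some integer c."""
    if d == 0:
        return n == 0
    return n % d == 0

def corollary_2_57_holds(a, b, n, d):
    """Assuming a ≡ b mod n and d | n with d >= 0, check a^d ≡ b^d mod n*d."""
    return congruent(a ** d, b ** d, n * d)

checked = all(
    corollary_2_57_holds(a, b, n, d)
    for a in range(-6, 7)
    for b in range(-6, 7)
    for n in range(-6, 7)
    if congruent(a, b, n)      # hypothesis: a ≡ b mod n
    for d in range(0, 7)
    if divides(d, n)           # hypothesis: d | n
)
print(checked)
```

The case $d = 0$ is included on purpose, since the proof of (86) treats it separately.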

For the next corollary, we need a convention:

Convention 2.58. Let $a$, $b$ and $c$ be three integers. Then, the expression “$a^{b^c}$” shall always be interpreted as “$a^{(b^c)}$”, never as “$\left(a^b\right)^c$”.

Thus, for example, “$3^{3^3}$” means $3^{(3^3)} = 3^{27} = 7\,625\,597\,484\,987$, not $\left(3^3\right)^3 = 27^3 = 19\,683$. The reason for this convention is that $\left(a^b\right)^c$ can be simplified to $a^{bc}$ and thus there is little use in having yet another notation for it. Of course, this convention applies not only to integers, but to any other numbers $a$, $b$, $c$.
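Python's exponentiation operator `**` happens to follow the same convention: it groups from right to left. The short sketch below illustrates this with the numbers from the example; the variable names are chosen for illustration only.

```python
right_nested = 3 ** 3 ** 3       # parsed as 3 ** (3 ** 3)
explicit_right = 3 ** (3 ** 3)   # the interpretation fixed by Convention 2.58
left_nested = (3 ** 3) ** 3      # the other reading, equal to 3 ** (3 * 3)

print(right_nested == explicit_right)  # True
print(right_nested)                    # 7625597484987
print(left_nested)                     # 19683
print(left_nested == 3 ** (3 * 3))     # True
```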

We can now state the following fact, which is sometimes known as “lifting-the-exponent lemma”:

Corollary 2.59. Let $n \in \mathbb{N}$. Let $a$ and $b$ be two integers such that $a \equiv b \mod n$. Let $k \in \mathbb{N}$. Then,
\[ a^{n^k} \equiv b^{n^k} \mod n^{k+1}. \tag{87} \]

We shall give two different proofs of Corollary 2.59 by induction on $k$, to illustrate once again the point (previously made in Remark 2.27) that we have a choice of what precise statement we are proving by induction. In the first proof, the statement will be the congruence (87) for three fixed integers $a$, $b$ and $n$, whereas in the second proof, it will be the statement
\[ \left( a^{n^k} \equiv b^{n^k} \mod n^{k+1} \text{ for all integers } a \text{ and } b \text{ and all } n \in \mathbb{N} \text{ satisfying } a \equiv b \mod n \right). \]

First proof of Corollary 2.59. Forget that we fixed $k$. We thus must prove (87) for each $k \in \mathbb{N}$.

We shall prove this by induction on $k$:

Induction base: We have $n^0 = 1$ and thus $a^{n^0} = a^1 = a$. Similarly, $b^{n^0} = b$. Thus, $a^{n^0} = a \equiv b = b^{n^0} \mod n$. In other words, $a^{n^0} \equiv b^{n^0} \mod n^{0+1}$ (since $n^{0+1} = n^1 = n$). In other words, (87) holds for $k = 0$. This completes the induction base.

Induction step: Let $m \in \mathbb{N}$. Assume that (87) holds for $k = m$. We must prove that (87) holds for $k = m+1$.

We have $n^{m+1} = n n^m$. Hence, $n \mid n^{m+1}$.

We have assumed that (87) holds for $k = m$. In other words, we have
\[ a^{n^m} \equiv b^{n^m} \mod n^{m+1}. \]
Hence, Corollary 2.57 (applied to $a^{n^m}$, $b^{n^m}$, $n^{m+1}$ and $n$ instead of $a$, $b$, $n$ and $d$) yields
\[ \left( a^{n^m} \right)^n \equiv \left( b^{n^m} \right)^n \mod n^{m+1} n. \]
Now, $n^{m+1} = n^m n$, so that
\[ a^{n^{m+1}} = a^{n^m n} = \left( a^{n^m} \right)^n \equiv \left( b^{n^m} \right)^n = b^{n^m n} = b^{n^{m+1}} \mod n^{m+1} n \]
(since $n^m n = n^{m+1}$). In view of $n^{m+1} n = n^{(m+1)+1}$, this rewrites as
\[ a^{n^{m+1}} \equiv b^{n^{m+1}} \mod n^{(m+1)+1}. \]
In other words, (87) holds for $k = m+1$. This completes the induction step. Thus, (87) is proven by induction. Hence, Corollary 2.59 holds.

Second proof of Corollary 2.59. Forget that we fixed $a$, $b$, $n$ and $k$. We thus must prove
\[ \left( a^{n^k} \equiv b^{n^k} \mod n^{k+1} \text{ for all integers } a \text{ and } b \text{ and all } n \in \mathbb{N} \text{ satisfying } a \equiv b \mod n \right) \tag{88} \]
for all $k \in \mathbb{N}$.

We shall prove this by induction on $k$:

Induction base: Let $n \in \mathbb{N}$. Let $a$ and $b$ be two integers such that $a \equiv b \mod n$.

We have $n^0 = 1$ and thus $a^{n^0} = a^1 = a$. Similarly, $b^{n^0} = b$. Thus, $a^{n^0} = a \equiv b = b^{n^0} \mod n$. In other words, $a^{n^0} \equiv b^{n^0} \mod n^{0+1}$ (since $n^{0+1} = n^1 = n$).

Now, forget that we fixed $n$, $a$ and $b$. We thus have proven that $a^{n^0} \equiv b^{n^0} \mod n^{0+1}$ for all integers $a$ and $b$ and all $n \in \mathbb{N}$ satisfying $a \equiv b \mod n$. In other words, (88) holds for $k = 0$. This completes the induction base.

Induction step: Let $m \in \mathbb{N}$. Assume that (88) holds for $k = m$. We must prove that (88) holds for $k = m+1$.

Let $n \in \mathbb{N}$. Let $a$ and $b$ be two integers such that $a \equiv b \mod n$. Now,
\begin{align*}
\left( n^2 \right)^{m+1} &= n^{2(m+1)} = n^{(m+2)+m} \qquad (\text{since } 2(m+1) = (m+2)+m) \\
&= n^{m+2} n^m,
\end{align*}
so that $n^{m+2} \mid \left( n^2 \right)^{m+1}$.

We have $n \mid n$. Hence, Corollary 2.57 (applied to $d = n$) yields $a^n \equiv b^n \mod nn$. In other words, $a^n \equiv b^n \mod n^2$ (since $nn = n^2$).

We have assumed that (88) holds for $k = m$. Hence, we can apply (88) to $a^n$, $b^n$, $n^2$ and $m$ instead of $a$, $b$, $n$ and $k$ (since $a^n \equiv b^n \mod n^2$). We thus conclude that
\[ \left( a^n \right)^{n^m} \equiv \left( b^n \right)^{n^m} \mod \left( n^2 \right)^{m+1}. \]
Now, $n^{m+1} = n n^m$, so that
\[ a^{n^{m+1}} = a^{n n^m} = \left( a^n \right)^{n^m} \equiv \left( b^n \right)^{n^m} = b^{n n^m} = b^{n^{m+1}} \mod \left( n^2 \right)^{m+1} \]
(since $n n^m = n^{m+1}$). Hence, Proposition 2.11 (c) (applied to $a^{n^{m+1}}$, $b^{n^{m+1}}$, $\left( n^2 \right)^{m+1}$ and $n^{m+2}$ instead of $a$, $b$, $n$ and $m$) yields $a^{n^{m+1}} \equiv b^{n^{m+1}} \mod n^{m+2}$ (since $n^{m+2} \mid \left( n^2 \right)^{m+1}$). In view of $m+2 = (m+1)+1$, this rewrites as $a^{n^{m+1}} \equiv b^{n^{m+1}} \mod n^{(m+1)+1}$.

Now, forget that we fixed $n$, $a$ and $b$. We thus have proven that $a^{n^{m+1}} \equiv b^{n^{m+1}} \mod n^{(m+1)+1}$ for all integers $a$ and $b$ and all $n \in \mathbb{N}$ satisfying $a \equiv b \mod n$. In other words, (88) holds for $k = m+1$. This completes the induction step. Thus, (88) is proven by induction. Hence, Corollary 2.59 is proven again.
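The statement of Corollary 2.59, that is, the congruence (87), can be spot-checked numerically. The sketch below is only an illustration on small, arbitrarily chosen ranges; the helper names `congruent` and `corollary_2_59_holds` are hypothetical. It relies on the fact that Python's `**` groups from right to left, in line with Convention 2.58.

```python
def congruent(x, y, modulus):
    """Return True if x ≡ y mod modulus, i.e. if modulus divides x - y."""
    if modulus == 0:
        return x == y  # 0 divides z only when z = 0
    return (x - y) % modulus == 0

def corollary_2_59_holds(a, b, n, k):
    """Assuming a ≡ b mod n, check a^(n^k) ≡ b^(n^k) mod n^(k+1)."""
    # a ** n ** k is parsed as a ** (n ** k).
    return congruent(a ** n ** k, b ** n ** k, n ** (k + 1))

checked = all(
    corollary_2_59_holds(a, b, n, k)
    for n in range(0, 5)
    for a in range(-4, 5)
    for b in range(-4, 5)
    if congruent(a, b, n)   # hypothesis: a ≡ b mod n
    for k in range(0, 4)
)
print(checked)
```

For instance, with $n = 2$, $a = 1$, $b = 3$ and $k = 2$, the check compares $1^4 = 1$ with $3^4 = 81$ modulo $2^3 = 8$, and indeed $8 \mid 80$.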
