

In the document Notes on linear algebra (pages 18-21)


2.7. Properties of matrix operations

The operations of adding, scaling and multiplying matrices, in many respects, “behave almost as nicely as numbers”. Specifically, I mean that they satisfy many of the laws that numbers satisfy:

Proposition 2.20. Let n ∈ N and m ∈ N. Then:

(a) We have A + B = B + A for any two n×m-matrices A and B. (This is called “commutativity of addition”.)

(b) We have A + (B + C) = (A + B) + C for any three n×m-matrices A, B and C. (This is called “associativity of addition”.)

(c1) We have λ(A + B) = λA + λB for any number λ and any two n×m-matrices A and B.

(c2) We have λ(µA) = (λµ)A and (λ + µ)A = λA + µA for any numbers λ and µ and any n×m-matrix A.

(c3) We have 1A = A for any n×m-matrix A.

Let furthermore p ∈ N. Then:

(d) We have A(B + C) = AB + AC for any n×m-matrix A and any two m×p-matrices B and C. (This is called “left distributivity”.)

(e) We have (A + B)C = AC + BC for any two n×m-matrices A and B and any m×p-matrix C. (This is called “right distributivity”.)

(f) We have λ(AB) = (λA)B = A(λB) for any number λ, any n×m-matrix A and any m×p-matrix B.

Finally, let q ∈ N. Then:

(g) We have A(BC) = (AB)C for any n×m-matrix A, any m×p-matrix B and any p×q-matrix C. (This is called “associativity of multiplication”.)
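To make these laws concrete, here is a small numeric sanity check of a few of them in Python, using plain nested lists as matrices. The helper names (mat_add, mat_mul, scale) are ad hoc and not from these notes; this is a sketch for illustration, not a substitute for the proofs.

```python
# Ad-hoc helpers for matrices represented as lists of rows.

def mat_add(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_mul(A, B):
    """Product of an n x m matrix A with an m x p matrix B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def scale(lam, A):
    """The matrix lam * A (entrywise scaling)."""
    return [[lam * entry for entry in row] for row in A]

A = [[1, 2], [3, 4]]          # 2 x 2
B = [[5, 6], [7, 8]]          # 2 x 2
C = [[1, 0, 2], [0, 3, 1]]    # 2 x 3

# (a) commutativity of addition
assert mat_add(A, B) == mat_add(B, A)
# (c1) scaling distributes over addition
assert scale(2, mat_add(A, B)) == mat_add(scale(2, A), scale(2, B))
# (e) right distributivity
assert mat_mul(mat_add(A, B), C) == mat_add(mat_mul(A, C), mat_mul(B, C))
# (g) associativity of multiplication
assert mat_mul(mat_mul(A, B), C) == mat_mul(A, mat_mul(B, C))
print("all checks passed")
```

Of course, a check on one sample only illustrates the laws; it does not prove them for all matrices.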

Example 2.21. Most parts of Proposition 2.20 are fairly easy to visualize and to prove. Let me give an example for the least obvious one: part (g).

Part (g) essentially says that A(BC) = (AB)C holds for any three matrices A, B and C for which the products AB and BC are well-defined (i.e., A has as many columns as B has rows, and B has as many columns as C has rows). For example, take n = 1, m = 3, p = 2 and q = 3. Set

\[
A = \begin{pmatrix} a & b & c \end{pmatrix}, \qquad
B = \begin{pmatrix} d & d' \\ e & e' \\ f & f' \end{pmatrix}, \qquad
C = \begin{pmatrix} x & y & z \\ x' & y' & z' \end{pmatrix}.
\]

Then,

\[
AB = \begin{pmatrix} ad + be + cf & ad' + be' + cf' \end{pmatrix}
\]

and thus

\[
(AB)C
= \begin{pmatrix} ad + be + cf & ad' + be' + cf' \end{pmatrix}
\begin{pmatrix} x & y & z \\ x' & y' & z' \end{pmatrix}
= \begin{pmatrix} adx + bex + cfx + ad'x' + be'x' + cf'x' \\ ady + bey + cfy + ad'y' + be'y' + cf'y' \\ adz + bez + cfz + ad'z' + be'z' + cf'z' \end{pmatrix}^{T}
\]

after some computation. (Here, we have written the result as a transpose of a column vector, because if we had written it as a row vector, it would not fit on this page.) But

\[
BC
= \begin{pmatrix} d & d' \\ e & e' \\ f & f' \end{pmatrix}
\begin{pmatrix} x & y & z \\ x' & y' & z' \end{pmatrix}
= \begin{pmatrix} dx + d'x' & dy + d'y' & dz + d'z' \\ ex + e'x' & ey + e'y' & ez + e'z' \\ fx + f'x' & fy + f'y' & fz + f'z' \end{pmatrix}
\]

and as before

\[
A(BC)
= \begin{pmatrix} adx + bex + cfx + ad'x' + be'x' + cf'x' \\ ady + bey + cfy + ad'y' + be'y' + cf'y' \\ adz + bez + cfz + ad'z' + be'z' + cf'z' \end{pmatrix}^{T}.
\]

Hence, (AB)C = A(BC). Thus, our example confirms Proposition 2.20 (g).
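The symbolic computation of Example 2.21 can also be replayed numerically. Here is one possible check in Python, using plain lists and an ad-hoc mat_mul helper (not from these notes); the particular values substituted for a, b, c, d, d′, … are arbitrary.

```python
# Replaying Example 2.21 with concrete numbers; d1 stands for d', etc.

def mat_mul(A, B):
    """Product of an n x m matrix A with an m x p matrix B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

a, b, c = 1, 2, 3
d, d1, e, e1, f, f1 = 4, 5, 6, 7, 8, 9
x, y, z, x1, y1, z1 = 10, 11, 12, 13, 14, 15

A = [[a, b, c]]                      # 1 x 3
B = [[d, d1], [e, e1], [f, f1]]      # 3 x 2
C = [[x, y, z], [x1, y1, z1]]        # 2 x 3

AB = mat_mul(A, B)
assert AB == [[a*d + b*e + c*f, a*d1 + b*e1 + c*f1]]   # as in the example

BC = mat_mul(B, C)
assert BC[0] == [d*x + d1*x1, d*y + d1*y1, d*z + d1*z1]

# the two ways of computing ABC agree, as Example 2.21 predicts
assert mat_mul(AB, C) == mat_mul(A, BC)
```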

The laws of Proposition 2.20 allow you to do many formal manipulations with matrices, similarly to how you are used to working with numbers. For example, if you have n matrices A1, A2, . . . , An such that successive matrices can be multiplied (i.e., for each i ∈ {1, 2, . . . , n − 1}, the matrix Ai has as many columns as Ai+1 has rows), then the product A1A2 · · · An is well-defined: you can parenthesize it in any order, and the result will always be the same. For example, the product ABCD of four matrices A, B, C, D can be computed in any of the five ways

((AB)C)D, (AB)(CD), (A(BC))D, A((BC)D), A(B(CD)),
and all of them lead to the same result. This is called general associativity and is not obvious (even if you know that Proposition 2.20 (g) holds)11. Let me state this result again as a proposition, just to stress its importance:

Proposition 2.22. Let A1, A2, . . . , An be n matrices. Assume that, for each i ∈ {1, 2, . . . , n − 1}, the number of columns of Ai equals the number of rows of Ai+1 (so that the product AiAi+1 makes sense). Then, the product A1A2 · · · An is well-defined: Any way to compute this product (by parenthesizing it) yields the same result. In particular, it can be computed both as A1(A2(A3(· · · (An−1An)))) and as ((((A1A2)A3) · · · )An−1)An.

11 If you are curious about the proofs: We shall prove Proposition 2.20 (g) further below (in Section 2.9). General associativity can be derived from Proposition 2.20 (g) in the general context of “binary operations”; see (for example) [Zuker14] for this argument.
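As an illustration of Proposition 2.22 (not a proof), the five parenthesizations of a product of four matrices can be compared directly. A possible Python sketch, with an ad-hoc mat_mul helper and arbitrary matrices of compatible sizes:

```python
def mat_mul(A, B):
    """Product of an n x m matrix A with an m x p matrix B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A1 = [[1, 2]]                 # 1 x 2
A2 = [[3, 4, 5], [6, 7, 8]]   # 2 x 3
A3 = [[1], [2], [3]]          # 3 x 1
A4 = [[4, 5]]                 # 1 x 2

m = mat_mul
results = [
    m(m(m(A1, A2), A3), A4),   # ((A1 A2) A3) A4
    m(m(A1, A2), m(A3, A4)),   # (A1 A2)(A3 A4)
    m(m(A1, m(A2, A3)), A4),   # (A1 (A2 A3)) A4
    m(A1, m(m(A2, A3), A4)),   # A1 ((A2 A3) A4)
    m(A1, m(A2, m(A3, A4))),   # A1 (A2 (A3 A4))
]
# all five parenthesizations give the same 1 x 2 matrix
assert all(r == results[0] for r in results)
```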

Please take a moment to appreciate general associativity! Without it, we could not make sense of products like ABC and ABCDE, because their values could depend on how we choose to compute them. This is one reason why, in the definition of AB, we multiply entries of the i-th row of A with entries of the j-th column of B. Using rows both times would break associativity!12

There is also a general associativity law for addition:

Proposition 2.23. Let A1, A2, . . . , An be n matrices of the same size. Then, the sum A1 + A2 + · · · + An is well-defined: Any way to compute this sum (by parenthesizing it) yields the same result. In particular, it can be computed both as A1 + (A2 + (A3 + (· · · + (An−1 + An)))) and as ((((A1 + A2) + A3) + · · · ) + An−1) + An.

There is also another variant of general associativity that concerns the interplay of matrix multiplication and scaling. It claims that products of matrices and numbers can be parenthesized in any order. For example, the product λµAB of two numbers λ and µ and two matrices A and B can be computed in any of the five ways

((λµ)A)B, (λµ)(AB), (λ(µA))B, λ((µA)B), λ(µ(AB)),
and all of them lead to the same result. This can be deduced from parts (c2), (f) and (g) of Proposition 2.20.
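This claim, too, can be sanity-checked numerically. A possible Python sketch, with ad-hoc scale and mat_mul helpers and arbitrary values:

```python
def mat_mul(A, B):
    """Product of an n x m matrix A with an m x p matrix B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def scale(lam, A):
    """The matrix lam * A (entrywise scaling)."""
    return [[lam * entry for entry in row] for row in A]

lam, mu = 2, 3
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

results = [
    mat_mul(scale(lam * mu, A), B),          # ((lam mu) A) B
    scale(lam * mu, mat_mul(A, B)),          # (lam mu)(AB)
    mat_mul(scale(lam, scale(mu, A)), B),    # (lam (mu A)) B
    scale(lam, mat_mul(scale(mu, A), B)),    # lam ((mu A) B)
    scale(lam, scale(mu, mat_mul(A, B))),    # lam (mu (AB))
]
# all five ways of computing lam * mu * A * B agree
assert all(r == results[0] for r in results)
```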

We shall give proofs of parts (d) and (g) of Proposition 2.20 in Section 2.9 below.

Various other identities follow from Proposition 2.20. For example, if A, B and C are three matrices of the same size, then A − (B + C) = A − B − C. For another example, if A and B are two n×m-matrices (for some n ∈ N and m ∈ N) and if C is an m×p-matrix (for some p ∈ N), then (A − B)C = AC − BC. These identities are proved similarly to the analogous properties of numbers; we shall not linger on them.
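These two identities can likewise be checked on sample matrices. A possible Python sketch (ad-hoc helpers; matrix subtraction is entrywise):

```python
def mat_add(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_sub(A, B):
    """Entrywise difference of two matrices of the same size."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_mul(A, B):
    """Product of an n x m matrix A with an m x p matrix B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A  = [[1, 2], [3, 4]]
B  = [[5, 6], [7, 8]]
Cs = [[2, 1], [0, 3]]          # same size as A and B
C  = [[1, 0, 2], [0, 3, 1]]    # 2 x 3, for the product identity

# A - (B + C) = A - B - C  (three matrices of the same size)
assert mat_sub(A, mat_add(B, Cs)) == mat_sub(mat_sub(A, B), Cs)
# (A - B)C = AC - BC
assert mat_mul(mat_sub(A, B), C) == mat_sub(mat_mul(A, C), mat_mul(B, C))
```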

12 Of course, our formulation of general associativity was far from rigorous. After all, we have not defined what a “way to compute a product” means, or what “parenthesizing a product” means.

There are several ways to make Proposition 2.22 rigorous. See [m.se709196] for a discussion of such ways. (Note that the simplest way actually avoids defining “parenthesizing”. Instead, it defines the product A1A2 · · · An by recursion on n, namely defining it to be A1 when n = 1, and defining it to be (A1A2 · · · An−1)An otherwise (where we are using the already-defined product A1A2 · · · An−1). Informally speaking, this means that the product A1A2 · · · An is defined as ((((A1A2)A3) · · · )An−1)An. Now, general associativity says that this product A1A2 · · · An equals (A1A2 · · · Ak)(Ak+1Ak+2 · · · An) for each k ∈ {1, 2, . . . , n − 1}. (This is not too hard to prove by induction over n.) Informally speaking, this shows that our product A1A2 · · · An also equals the result of any way of computing it (not only the ((((A1A2)A3) · · · )An−1)An way).)
