39. Characteristic roots

The characteristic equation of Section 32 is not only important for Hermitian matrices. For an arbitrary (n,n)-matrix A = (aik), you call the polynomial in x

j (x) = |xE - A|

the characteristic polynomial of A. The roots of the characteristic equation j(x)=0 are also in this general case called characteristic roots or eigenvalues of the matrix A

In particular, if you set as in (32.3),

j(x) = xn - p1xn-1 + p2xn-2- ··· + (-1)npn,

then also now

p1 = a11 + a22 + ··· + ann = Trace of A,

pn, = |A|.

On the other hand, if a1, a2, ··· , an are the characteristic roots of A, you have the decomposition

j(x)=(x-a1)(x-a2)···(x-an)=xn-(a1+a2+···+an)xn-1+···+(-1)n(a1a2···an),

whence

a11+a22+···+ann = a1+a2+···+an, |A| = a1a2···an.

The last equation yields the result:

|A| = 0 if and only if one characteristic root of A vanishes.

In particular, if A is triangular,

whence the characteristic roots of a triangular matrix are also the elements of its principal diagonal.

The transposed matrix A' has the same characteristic polynomial as A, because

|xE - A'| = |(xE - A)'| = |xE - A|.

Theorem 22

Similar matrices have the same characteristic polynomials.

Proof: If B = P-1AP, then

|xE - B| = |xP-1P - P-1AP| = |P-1||aE - A||P| = |aE - A|,

because

|P-1||P| = |P-1P| = |E| = 1.

In particular, similar matrices have therefore the same traces and the same determinants.

Let f(x) = q0 + q1x + ··· + qmxm be an arbitrary polynomial and define the polynomial f(A) of an (n,n)-matrix A as follows:

f(A) = q0En + q1A + ··· + qmAm.

Theorem 23

If a1, a2, ··· , an are the characteristic roots of A, the polynomial f(A) has the characteristic roots f(a1), f(a2), ··· , f(an).

Proof: If a1, a2, ··· , an are the characteristic roots of A, the polynomial f(A) has the characteristic roots

f(a1), f(a2), ··· , f(an).

Moreover, A-1 has the characteristic roots a1-1, a2-1, ··· , an-1.*

* If A-1 exists, then |A| 0, whence all the characteristic roots of A are non-zero.

It follows from

P-1(lA)P=l(P-1AP), P-1(A+B)P=P-1AP+P-1BP,P-1(AB)P = (P-1AP)( P-1BP)

that for every polynomial

P-1f(A)P = f(P-1AP).

Select now P so that P-1AP is triangular; this can be done by Theorem 21. Then you have in the principal diagonal a1, a2, ··· , an. The sum and product of triagonal matrices are again diagonal matrices, the elements in the diagonal during addition and multiplication add and multiply, respectively, whence f(P-1AP) is a triagonal matrix, in the principal diagonal of which are the numbers f(a1), ··· , f(an). Consequently, P-1f(A)P = f(P-1AP) has the characteristic roots f(a1), ··· , f(an). By Theorem 22, these are also the characteristic roots of f(A).

In order to prove that A-1 has the characteristic roots a1-1, ··· , an-1, you observe that (P-1AP)-1 = P-1A-1P, whence

(P-1A-1P)(P-1AP) = E.

However, (P-1AP)-1 and P-1AP are triangular matrices, as follows from (15.4) - the formation of inverse matrices. Since in the principal diagonal of (P-1AP)-1 must occur the numbers a1-1, ··· , an-1, for during the multiplication

(P-1A-1P)(P-1AP) = E

the elements in the principal diagonal are multiplied with each other and yield the 1s of the main diagonal of E.

Since now P-1A-1 = (P-1AP)-1 and A-1, by Theorem 22, has the same characteristic roots, the rule has been proved.

You can also derive from polynomials in several variables polynomials of matrices. However, useable manipulation laws for such polynomials in several matrices can only be obtained if the matrices, which formally replace the variables, are like these variables pairwise interchangeable. If

f(x, y, ··· , z) = qr,s···,txrys ··· zt

is a polynomial in x, y, ··· , z, you allot to every variable an (n,n)-matrix

x ® A, y ® B, ··· , z ® C,

where the matrices A, B, ··· , C must be pairwise interchangeable. You then set

f(A, B, ··· , C) = Sqr,s···,tArBs ··· Ct.

If f(x, y, ··· , z) contains a constant term q0, you must replace it by q0En as above during the transition to matrices.

The generalization of Theorem 23 to polynomials in several matrices is

Theorem 24: Let A, B, ··· , C be pairwise interchangeable matrices. The characteristic roots

ai of A, bi of B, ··· , gi of C

can then be arranged so that every polynomial f(A, B, ··· , C) has the characteristic roots

f(a1, b1, ··· , g1) ··· f(an, bn, ··· , gn).

This arrangement does not depend on the polynomial f.

The proof employs the same arguments as that of Theorem 23, for you can find by Theorem 21 a matrix P with the property that all P-1AP, P-1BP, ··· , P-1CP becomes triangular. In the principal diagonals of

P-1f(A, B, ··· , C)P = f(P-1AP, P-1BP, ··· , P-1CP)

are then the numbers

f(a1, b1, ··· , g1) ··· f(an, bn, ··· , gn).

The fact that Theorem 24 without the assumption of interchangeability is wrong demonstrates the example

The matrices on the left hand side have the characteristic roots 1, 1 and 0, 0, respectively. The characteristic roots of their sum are 0, 2.

In contrast, also for non-interchangeable matrices:

The matrices AB and BA have the same characteristic roots.

In fact, if |A| 0, then

A-1(AB)A = BA,

so that AB and BA are similar and have therefore the same characteristic roots. If f(x) and g(x) are the characteristic polynomials of AB and BA, respectively, you form the sum of the squares F of the absolute values of the coefficients of f(x) - g(x). The equality of the characteristic roots of AB and BA, or, what is the same, the agreement of the characteristic polynomials f(x) and g(x) is then equivalent to F = 0. F is a continuous function of the elements of A, moreover F=0 for |A| 0, whence F must also vanish for |A| = 0.

Instead of a detailed knowledge of the characteristic roots of a matrix a, which demands the solution of an algebraic equation, it is frequently sufficient for applications to obtain bounds for the roots. We prove now from among estimates of this kind

Theorem 25

If you set for the (n,n)-matrix A = (aik)

zk = |ak1| + |ak2| + ··· + |akk| and z = Max(z1, z2,··· zk),

then you have for every characteristic root a of A the inequality |a| z. There exists a corresponding estimate regarding the sum of the absolute values in the columns of A.

Proof: Let x be an eigenvector of A belonging to a, that is, ax = Ax or, in detail,

If you let x = Max(|x1|, |x2|, ··· , |xn|) and, may be, it is |x1| = x, then (39.1) yields for i = j

|a|x=|a||xj||aj1||xj|+|aj2||x2|+···+|ajn||xn| (|aj1|+|aj2|+···+|ajn|)zx,

that is, |a| z.

In order to obtain corresponding estimates for the columns, note that A and the transposed matrix A' have the same characteristic roots.

last next