LINEAR ALGEBRA
Second Edition

KENNETH HOFFMAN
Professor of Mathematics, Massachusetts Institute of Technology

RAY KUNZE
Professor of Mathematics, University of California, Irvine

PRENTICE-HALL, INC., Englewood Cliffs, New Jersey
© 1971, 1961 by Prentice-Hall, Inc., Englewood Cliffs, New Jersey

All rights reserved. No part of this book may be reproduced in any form or by any means without permission in writing from the publisher.

Prentice-Hall International, Inc., London
Prentice-Hall of Australia, Pty. Ltd., Sydney
Prentice-Hall of Canada, Ltd., Toronto
Prentice-Hall of India Private Limited, New Delhi
Prentice-Hall of Japan, Inc., Tokyo

Current printing (last digit): 10 9 8 7 6

Library of Congress Catalog Card No. 75-142120
Printed in the United States of America
Preface

Our original purpose in writing this book was to provide a text for the undergraduate linear algebra course at the Massachusetts Institute of Technology. This course was designed for mathematics majors at the junior level, although three-fourths of the students were drawn from other scientific and technological disciplines and ranged from freshmen through graduate students. This description of the M.I.T. audience for the text remains generally accurate today. The ten years since the first edition have seen the proliferation of linear algebra courses throughout the country and have afforded one of the authors the opportunity to teach the basic material to a variety of groups at Brandeis University, Washington University (St. Louis), and the University of California (Irvine).

Our principal aim in revising Linear Algebra has been to increase the variety of courses which can easily be taught from it. On one hand, we have structured the chapters, especially the more difficult ones, so that there are several natural stopping points along the way, allowing the instructor in a one-quarter or one-semester course to exercise a considerable amount of choice in the subject matter. On the other hand, we have increased the amount of material in the text, so that it can be used for a rather comprehensive one-year course in linear algebra and even as a reference book for mathematicians.

The major changes have been in our treatments of canonical forms and inner product spaces. In Chapter 6 we no longer begin with the general spatial theory which underlies the theory of canonical forms. We first handle characteristic values in relation to triangulation and diagonalization theorems and then build our way up to the general theory. We have split Chapter 8 so that the basic material on inner product spaces and unitary diagonalization is followed by a Chapter 9 which treats sesqui-linear forms and the more sophisticated properties of normal operators, including normal operators on real inner product spaces.

We have also made a number of small changes and improvements from the first edition. But the basic philosophy behind the text is unchanged.

We have made no particular concession to the fact that the majority of the students may not be primarily interested in mathematics. For we believe a mathematics course should not give science, engineering, or social science students a hodgepodge of techniques, but should provide them with an understanding of basic mathematical concepts.
On the other hand, we have been keenly aware of the wide range of backgrounds which the students may possess and, in particular, of the fact that the students have had very little experience with abstract mathematical reasoning. For this reason, we have avoided the introduction of too many abstract ideas at the very beginning of the book. In addition, we have included an Appendix which presents such basic ideas as set, function, and equivalence relation. We have found it most profitable not to dwell on these ideas independently, but to advise the students to read the Appendix when these ideas arise.

Throughout the book we have included a great variety of examples of the important concepts which occur. The study of such examples is of fundamental importance and tends to minimize the number of students who can repeat definition, theorem, proof in logical order without grasping the meaning of the abstract concepts. The book also contains a wide variety of graded exercises (about six hundred), ranging from routine applications to ones which will extend the very best students. These exercises are intended to be an important part of the text.

Chapter 1 deals with systems of linear equations and their solution by means of elementary row operations on matrices. It has been our practice to spend about six lectures on this material. It provides the student with some picture of the origins of linear algebra and with the computational technique necessary to understand examples of the more abstract ideas occurring in the later chapters. Chapter 2 deals with vector spaces, subspaces, bases, and dimension. Chapter 3 treats linear transformations, their algebra, their representation by matrices, as well as isomorphism, linear functionals, and dual spaces. Chapter 4 defines the algebra of polynomials over a field, the ideals in that algebra, and the prime factorization of a polynomial. It also deals with roots, Taylor's formula, and the Lagrange interpolation formula. Chapter 5 develops determinants of square matrices, the determinant being viewed as an alternating n-linear function of the rows of a matrix, and then proceeds to multilinear functions on modules as well as the Grassman ring. The material on modules places the concept of determinant in a wider and more comprehensive setting than is usually found in elementary textbooks. Chapters 6 and 7 contain a discussion of the concepts which are basic to the analysis of a single linear transformation on a finite-dimensional vector space: the analysis of characteristic (eigen) values, triangulable and diagonalizable transformations; the concepts of the diagonalizable and nilpotent parts of a more general transformation, and the rational and Jordan canonical forms. The primary and cyclic decomposition theorems play a central role, the latter being arrived at through the study of admissible subspaces. Chapter 7 includes a discussion of matrices over a polynomial domain, the computation of invariant factors and elementary divisors of a matrix, and the development of the Smith canonical form. The chapter ends with a discussion of semi-simple operators, to round out the analysis of a single operator. Chapter 8 treats finite-dimensional inner product spaces in some detail.
It covers the basic geometry, relating orthogonalization to the idea of 'best approximation to a vector' and leading to the concepts of the orthogonal projection of a vector onto a subspace and the orthogonal complement of a subspace. The chapter treats unitary operators and culminates in the diagonalization of self-adjoint and normal operators. Chapter 9 introduces sesqui-linear forms, relates them to positive and self-adjoint operators on an inner product space, moves on to the spectral theory of normal operators and then to more sophisticated results concerning normal operators on real or complex inner product spaces. Chapter 10 discusses bilinear forms, emphasizing canonical forms for symmetric and skew-symmetric forms, as well as groups preserving non-degenerate forms, especially the orthogonal, unitary, pseudo-orthogonal, and Lorentz groups.

We feel that any course which uses this text should cover Chapters 1, 2, and 3
thoroughly, possibly excluding Sections 3.6 and 3.7 which deal with the double dual and the transpose of a linear transformation. Chapters 4 and 5, on polynomials and determinants, may be treated with varying degrees of thoroughness. In fact, polynomial ideals and basic properties of determinants may be covered quite sketchily without serious damage to the flow of the logic in the text; however, our inclination is to deal with these chapters carefully (except the results on modules), because the material illustrates so well the basic ideas of linear algebra. An elementary course may now be concluded nicely with the first four sections of Chapter 6, together with (the new) Chapter 8. If the rational and Jordan forms are to be included, a more extensive coverage of Chapter 6 is necessary.

Our indebtedness remains to those who contributed to the first edition, especially to Professors Harry Furstenberg, Louis Howard, Daniel Kan, Edward Thorp, to Mrs. Judith Bowers, Mrs. Betty Ann (Sargent) Rose and Miss Phyllis Ruby. In addition, we would like to thank the many students and colleagues whose perceptive comments led to this revision, and the staff of Prentice-Hall for their patience in dealing with two authors caught in the throes of academic administration. Lastly, special thanks are due to Mrs. Sophia Koulouras for both her skill and her tireless efforts in typing the revised manuscript.

K. M. H.  /  R. A. K.
Contents

Chapter 1. Linear Equations
  1.1. Fields
  1.2. Systems of Linear Equations
  1.3. Matrices and Elementary Row Operations
  1.4. Row-Reduced Echelon Matrices
  1.5. Matrix Multiplication
  1.6. Invertible Matrices

Chapter 2. Vector Spaces
  2.1. Vector Spaces
  2.2. Subspaces
  2.3. Bases and Dimension
  2.4. Coordinates
  2.5. Summary of Row-Equivalence
  2.6. Computations Concerning Subspaces

Chapter 3. Linear Transformations
  3.1. Linear Transformations
  3.2. The Algebra of Linear Transformations
  3.3. Isomorphism
  3.4. Representation of Transformations by Matrices
  3.5. Linear Functionals
  3.6. The Double Dual
  3.7. The Transpose of a Linear Transformation

Chapter 4. Polynomials
  4.1. Algebras
  4.2. The Algebra of Polynomials
  4.3. Lagrange Interpolation
  4.4. Polynomial Ideals
  4.5. The Prime Factorization of a Polynomial

Chapter 5. Determinants
  5.1. Commutative Rings
  5.2. Determinant Functions
  5.3. Permutations and the Uniqueness of Determinants
  5.4. Additional Properties of Determinants
  5.5. Modules
  5.6. Multilinear Functions
  5.7. The Grassman Ring

Chapter 6. Elementary Canonical Forms
  6.1. Introduction
  6.2. Characteristic Values
  6.3. Annihilating Polynomials
  6.4. Invariant Subspaces
  6.5. Simultaneous Triangulation; Simultaneous Diagonalization
  6.6. Direct-Sum Decompositions
  6.7. Invariant Direct Sums
  6.8. The Primary Decomposition Theorem

Chapter 7. The Rational and Jordan Forms
  7.1. Cyclic Subspaces and Annihilators
  7.2. Cyclic Decompositions and the Rational Form
  7.3. The Jordan Form
  7.4. Computation of Invariant Factors
  7.5. Summary; Semi-Simple Operators

Chapter 8. Inner Product Spaces
  8.1. Inner Products
  8.2. Inner Product Spaces
  8.3. Linear Functionals and Adjoints
  8.4. Unitary Operators
  8.5. Normal Operators

Chapter 9. Operators on Inner Product Spaces
  9.1. Introduction
  9.2. Forms on Inner Product Spaces
  9.3. Positive Forms
  9.4. More on Forms
  9.5. Spectral Theory
  9.6. Further Properties of Normal Operators

Chapter 10. Bilinear Forms
  10.1. Bilinear Forms
  10.2. Symmetric Bilinear Forms
  10.3. Skew-Symmetric Bilinear Forms
  10.4. Groups Preserving Bilinear Forms

Appendix
  A.1. Sets
  A.2. Functions
  A.3. Equivalence Relations
  A.4. Quotient Spaces
  A.5. Equivalence Relations in Linear Algebra
  A.6. The Axiom of Choice

Bibliography

Index
1. Linear Equations

1.1. Fields

We assume that the reader is familiar with the elementary algebra of real and complex numbers. For a large portion of this book the algebraic properties of numbers which we shall use are easily deduced from the following brief list of properties of addition and multiplication. We let F denote either the set of real numbers or the set of complex numbers.

1. Addition is commutative,

    x + y = y + x

for all x and y in F.

2. Addition is associative,

    x + (y + z) = (x + y) + z

for all x, y, and z in F.

3. There is a unique element 0 (zero) in F such that x + 0 = x, for every x in F.

4. To each x in F there corresponds a unique element (-x) in F such that x + (-x) = 0.

5. Multiplication is commutative,

    xy = yx

for all x and y in F.

6. Multiplication is associative,

    x(yz) = (xy)z

for all x, y, and z in F.
7. There is a unique non-zero element 1 (one) in F such that x1 = x, for every x in F.

8. To each non-zero x in F there corresponds a unique element x^{-1} (or 1/x) in F such that xx^{-1} = 1.

9. Multiplication distributes over addition; that is, x(y + z) = xy + xz, for all x, y, and z in F.

Suppose one has a set F of objects x, y, z, ... and two operations on the elements of F as follows. The first operation, called addition, associates with each pair of elements x, y in F an element (x + y) in F; the second operation, called multiplication, associates with each pair x, y an element xy in F; and these two operations satisfy conditions (1)-(9) above. The set F, together with these two operations, is then called a field. Roughly speaking, a field is a set together with some operations on the objects in that set which behave like ordinary addition, subtraction, multiplication, and division of numbers in the sense that they obey the nine rules of algebra listed above. With the usual operations of addition and multiplication, the set C of complex numbers is a field, as is the set R of real numbers.

For most of this book the 'numbers' we use may as well be the elements from any field F. To allow for this generality, we shall use the word 'scalar' rather than 'number.' Not much will be lost to the reader if he always assumes that the field of scalars is a subfield of the field of complex numbers. A subfield of the field C is a set F of complex numbers which is itself a field under the usual operations of addition and multiplication of complex numbers. This means that 0 and 1 are in the set F, and that if x and y are elements of F, so are (x + y), -x, xy, and x^{-1} (if x ≠ 0). An example of such a subfield is the field R of real numbers; for, if we identify the real numbers with the complex numbers (a + ib) for which b = 0, the 0 and 1 of the complex field are real numbers, and if x and y are real, so are (x + y), -x, xy, and x^{-1} (if x ≠ 0). We shall give other examples below. The point of our discussing subfields is essentially this: If we are working with scalars from a certain subfield of C, then the performance of the operations of addition, subtraction, multiplication, or division on these scalars does not take us out of the given subfield.

EXAMPLE 1. The set of positive integers: 1, 2, 3, ..., is not a subfield of C, for a variety of reasons. For example, 0 is not a positive integer; for no positive integer n is -n a positive integer; for no positive integer n except 1 is 1/n a positive integer.

EXAMPLE 2. The set of integers: ..., -2, -1, 0, 1, 2, ..., is not a subfield of C, because for an integer n, 1/n is not an integer unless n is 1 or -1.
With the usual operations of addition and multiplication, the set of integers satisfies all of the conditions (1)-(9) except condition (8).

EXAMPLE 3. The set of rational numbers, that is, numbers of the form p/q, where p and q are integers and q ≠ 0, is a subfield of the field of complex numbers. The division which is not possible within the set of integers is possible within the set of rational numbers. The interested reader should verify that any subfield of C must contain every rational number.

EXAMPLE 4. The set of all complex numbers of the form x + y√2, where x and y are rational, is a subfield of C. We leave it to the reader to verify this.

In the examples and exercises of this book, the reader should assume that the field involved is a subfield of the complex numbers unless it is expressly stated that the field is more general. We do not want to dwell on this point; however, we should indicate why we adopt such a convention. If F is a field, it may be possible to add the unit 1 to itself a finite number of times and obtain 0 (see Exercise 5 following Section 1.2):

    1 + 1 + ... + 1 = 0.

That does not happen in the complex number field (or in any subfield thereof). If it does happen in F, then the least n such that the sum of n 1's is 0 is called the characteristic of the field F. If it does not happen in F, then (for some strange reason) F is called a field of characteristic zero. Often, when we assume F is a subfield of C, what we want to guarantee is that F is a field of characteristic zero; but, in a first exposure to linear algebra, it is usually better not to worry too much about characteristics of fields.

1.2. Systems of Linear Equations

Suppose F is a field. We consider the problem of finding n scalars (elements of F) x_1, ..., x_n which satisfy the conditions

(1-1)    A_{11}x_1 + A_{12}x_2 + ... + A_{1n}x_n = y_1
         A_{21}x_1 + A_{22}x_2 + ... + A_{2n}x_n = y_2
             ...
         A_{m1}x_1 + A_{m2}x_2 + ... + A_{mn}x_n = y_m

where y_1, ..., y_m and A_{ij}, 1 ≤ i ≤ m, 1 ≤ j ≤ n, are given elements of F. We call (1-1) a system of m linear equations in n unknowns. Any n-tuple (x_1, ..., x_n) of elements of F which satisfies each of the
equations in (1-1) is called a solution of the system. If y_1 = y_2 = ... = y_m = 0, we say that the system is homogeneous, or that each of the equations is homogeneous.

Perhaps the most fundamental technique for finding the solutions of a system of linear equations is the technique of elimination. We can illustrate this technique on the homogeneous system

    2x_1 -  x_2 +  x_3 = 0
     x_1 + 3x_2 + 4x_3 = 0.

If we add (-2) times the second equation to the first equation, we obtain

    -7x_2 - 7x_3 = 0

or, x_2 = -x_3. If we add 3 times the first equation to the second equation, we obtain

    7x_1 + 7x_3 = 0

or, x_1 = -x_3. So we conclude that if (x_1, x_2, x_3) is a solution then x_1 = x_2 = -x_3. Conversely, one can readily verify that any such triple is a solution. Thus the set of solutions consists of all triples (-a, -a, a).

We found the solutions to this system of equations by 'eliminating unknowns,' that is, by multiplying equations by scalars and then adding to produce equations in which some of the x_j were not present. We wish to formalize this process slightly so that we may understand why it works, and so that we may carry out the computations necessary to solve a system in an organized manner.

For the general system (1-1), suppose we select m scalars c_1, ..., c_m, multiply the jth equation by c_j and then add. We obtain the equation

    (c_1A_{11} + ... + c_mA_{m1})x_1 + ... + (c_1A_{1n} + ... + c_mA_{mn})x_n = c_1y_1 + ... + c_my_m.

Such an equation we shall call a linear combination of the equations in (1-1). Evidently, any solution of the entire system of equations (1-1) will also be a solution of this new equation. This is the fundamental idea of the elimination process. If we have another system of linear equations

(1-2)    B_{11}x_1 + ... + B_{1n}x_n = z_1
             ...
         B_{k1}x_1 + ... + B_{kn}x_n = z_k

in which each of the k equations is a linear combination of the equations in (1-1), then every solution of (1-1) is a solution of this new system. Of course it may happen that some solutions of (1-2) are not solutions of (1-1). This clearly does not happen if each equation in the original system is a linear combination of the equations in the new system. Let us say that two systems of linear equations are equivalent if each equation in each system is a linear combination of the equations in the other system. We can then formally state our observations as follows.
Theorem 1. Equivalent systems of linear equations have exactly the same solutions.

If the elimination process is to be effective in finding the solutions of a system like (1-1), then one must see how, by forming linear combinations of the given equations, to produce an equivalent system of equations which is easier to solve. In the next section we shall discuss one method of doing this.

Exercises

1. Verify that the set of complex numbers described in Example 4 is a subfield of C.

2. Let F be the field of complex numbers. Are the following two systems of linear equations equivalent? If so, express each equation in each system as a linear combination of the equations in the other system.

     x_1 - x_2 = 0          3x_1 + x_2 = 0
    2x_1 + x_2 = 0           x_1 + x_2 = 0

3. Test the following systems of equations as in Exercise 2.

        -x_1 +  x_2 +    4x_3 = 0         x_1        -  x_3 = 0
         x_1 + 3x_2 +    8x_3 = 0                x_2 + 3x_3 = 0
    (1/2)x_1 +  x_2 + (5/2)x_3 = 0

4. Test the following systems as in Exercise 2.

    2x_1 + (-1 + i)x_2        + x_4 = 0       (1 + i/2)x_1 + 8x_2 - ix_3 -  x_4 = 0
           3x_2 - 2ix_3 + 5x_4 = 0            (2/3)x_1       - (1/2)x_3 + 7x_4 = 0

5. Let F be a set which contains exactly two elements, 0 and 1. Define an addition and multiplication by the tables:

    +  0  1        .  0  1
    0  0  1        0  0  0
    1  1  0        1  0  1

Verify that the set F, together with these two operations, is a field.

6. Prove that if two homogeneous systems of linear equations in two unknowns have the same solutions, then they are equivalent.

7. Prove that each subfield of the field of complex numbers contains every rational number.

8. Prove that each field of characteristic zero contains a copy of the rational number field.
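The two-element field of Exercise 5 is easy to experiment with on a computer. The short Python sketch below is an editorial illustration, not part of the text: it encodes the two tables of Exercise 5 as arithmetic modulo 2, checks conditions (1)-(9) by brute force, and confirms that this field has characteristic 2.

```python
from itertools import product

F = [0, 1]

def add(x, y):
    # the addition table of Exercise 5: 1 + 1 = 0
    return (x + y) % 2

def mul(x, y):
    # the multiplication table of Exercise 5
    return (x * y) % 2

# commutativity, associativity, and distributivity (conditions 1, 2, 5, 6, 9)
for x, y, z in product(F, repeat=3):
    assert add(x, y) == add(y, x) and mul(x, y) == mul(y, x)
    assert add(x, add(y, z)) == add(add(x, y), z)
    assert mul(x, mul(y, z)) == mul(mul(x, y), z)
    assert mul(x, add(y, z)) == add(mul(x, y), mul(x, z))

# identities and inverses (conditions 3, 4, 7, 8): 0 is the zero, 1 is the one,
# each x is its own negative, and the only non-zero element is its own inverse
assert all(add(x, 0) == x and mul(x, 1) == x for x in F)
assert all(any(add(x, y) == 0 for y in F) for x in F)
assert mul(1, 1) == 1

# the characteristic is 2, since 1 + 1 = 0
assert add(1, 1) == 0
```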
1.3. Matrices and Elementary Row Operations

One cannot fail to notice that in forming linear combinations of linear equations there is no need to continue writing the 'unknowns' x_1, ..., x_n, since one actually computes only with the coefficients A_{ij} and the scalars y_i. We shall now abbreviate the system (1-1) by

    AX = Y

where

    A = [ A_{11} ... A_{1n} ]        X = [ x_1 ]        Y = [ y_1 ]
        [  ...         ...  ]            [ ... ]            [ ... ]
        [ A_{m1} ... A_{mn} ]            [ x_n ]            [ y_m ]

We call A the matrix of coefficients of the system. Strictly speaking, the rectangular array displayed above is not a matrix, but is a representation of a matrix. An m × n matrix over the field F is a function A from the set of pairs of integers (i, j), 1 ≤ i ≤ m, 1 ≤ j ≤ n, into the field F. The entries of the matrix A are the scalars A(i, j) = A_{ij}, and quite often it is most convenient to describe the matrix by displaying its entries in a rectangular array having m rows and n columns, as above. Thus X (above) is, or defines, an n × 1 matrix and Y is an m × 1 matrix. For the time being, AX = Y is nothing more than a shorthand notation for our system of linear equations. Later, when we have defined a multiplication for matrices, it will mean that Y is the product of A and X.

We wish now to consider operations on the rows of the matrix A which correspond to forming linear combinations of the equations in the system AX = Y. We restrict our attention to three elementary row operations on an m × n matrix A over the field F:

1. multiplication of one row of A by a non-zero scalar c;
2. replacement of the rth row of A by row r plus c times row s, c any scalar and r ≠ s;
3. interchange of two rows of A.

An elementary row operation is thus a special type of function (rule) e which associates with each m × n matrix A an m × n matrix e(A). One can precisely describe e in the three cases as follows:

1. e(A)_{ij} = A_{ij} if i ≠ r,    e(A)_{rj} = cA_{rj}.
2. e(A)_{ij} = A_{ij} if i ≠ r,    e(A)_{rj} = A_{rj} + cA_{sj}.
3. e(A)_{ij} = A_{ij} if i is different from both r and s,    e(A)_{rj} = A_{sj},    e(A)_{sj} = A_{rj}.
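The three elementary row operations are easy to state as functions in the spirit of the description of e above. The Python sketch below is an editorial illustration, not part of the text: a matrix is represented as a list of rows, and Python's Fraction type keeps the scalars in the field of rational numbers. The assertions at the end check that each operation is undone by one of the same type (compare Theorem 2 below).

```python
from fractions import Fraction

def scale_row(A, r, c):
    """Type 1: multiply row r of A by the non-zero scalar c."""
    assert c != 0
    B = [row[:] for row in A]
    B[r] = [c * entry for entry in B[r]]
    return B

def add_multiple(A, r, s, c):
    """Type 2: replace row r by row r plus c times row s (r != s)."""
    assert r != s
    B = [row[:] for row in A]
    B[r] = [B[r][j] + c * B[s][j] for j in range(len(B[r]))]
    return B

def interchange(A, r, s):
    """Type 3: interchange rows r and s."""
    B = [row[:] for row in A]
    B[r], B[s] = B[s], B[r]
    return B

A = [[Fraction(2), Fraction(-1)], [Fraction(1), Fraction(3)]]
assert scale_row(scale_row(A, 0, Fraction(5)), 0, Fraction(1, 5)) == A
assert add_multiple(add_multiple(A, 0, 1, Fraction(4)), 0, 1, Fraction(-4)) == A
assert interchange(interchange(A, 0, 1), 0, 1) == A
```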
In defining e(A), it is not really important how many columns A has, but the number of rows of A is crucial. For example, one must worry a little to decide what is meant by interchanging rows 5 and 6 of a 5 × 5 matrix. To avoid any such complications, we shall agree that an elementary row operation e is defined on the class of all m × n matrices over F, for some fixed m but any n. In other words, a particular e is defined on the class of all m-rowed matrices over F.

One reason that we restrict ourselves to these three simple types of row operations is that, having performed such an operation e on a matrix A, we can recapture A by performing a similar operation on e(A).

Theorem 2. To each elementary row operation e there corresponds an elementary row operation e_1, of the same type as e, such that e_1(e(A)) = e(e_1(A)) = A for each A. In other words, the inverse operation (function) of an elementary row operation exists and is an elementary row operation of the same type.

Proof. (1) Suppose e is the operation which multiplies the rth row of a matrix by the non-zero scalar c. Let e_1 be the operation which multiplies row r by c^{-1}. (2) Suppose e is the operation which replaces row r by row r plus c times row s, r ≠ s. Let e_1 be the operation which replaces row r by row r plus (-c) times row s. (3) If e interchanges rows r and s, let e_1 = e. In each of these three cases we clearly have e_1(e(A)) = e(e_1(A)) = A for each A.

Definition. If A and B are m × n matrices over the field F, we say that B is row-equivalent to A if B can be obtained from A by a finite sequence of elementary row operations.

Using Theorem 2, the reader should find it easy to verify the following. Each matrix is row-equivalent to itself; if B is row-equivalent to A, then A is row-equivalent to B; if B is row-equivalent to A and C is row-equivalent to B, then C is row-equivalent to A. In other words, row-equivalence is an equivalence relation (see Appendix).

Theorem 3. If A and B are row-equivalent m × n matrices, the homogeneous systems of linear equations AX = 0 and BX = 0 have exactly the same solutions.

Proof. Suppose we pass from A to B by a finite sequence of elementary row operations:

    A = A_0 → A_1 → ... → A_k = B.

It is enough to prove that the systems A_jX = 0 and A_{j+1}X = 0 have the same solutions, i.e., that one elementary row operation does not disturb the set of solutions.
So suppose that B is obtained from A by a single elementary row operation. No matter which of the three types the operation is, (1), (2), or (3), each equation in the system BX = 0 will be a linear combination of the equations in the system AX = 0. Since the inverse of an elementary row operation is an elementary row operation, each equation in AX = 0 will also be a linear combination of the equations in BX = 0. Hence these two systems are equivalent, and by Theorem 1 they have the same solutions.

EXAMPLE 5. Suppose F is the field of rational numbers, and

    A = [ 2  -1   3   2 ]
        [ 1   4   0  -1 ]
        [ 2   6  -1   5 ]

We shall perform a finite sequence of elementary row operations on A, indicating by numbers in parentheses the type of operation performed:

    [ 2  -1   3   2 ]        [ 1  -1/2  3/2   1 ]             [ 1  -1/2  3/2   1  ]
    [ 1   4   0  -1 ]  →(1)  [ 1   4    0    -1 ]  →(2), (2)  [ 0   9/2 -3/2  -2  ]
    [ 2   6  -1   5 ]        [ 2   6   -1     5 ]             [ 0   7   -4     3  ]

    →(1)  [ 1  -1/2   3/2    1   ]   →(2), (2)  [ 1   0    4/3    7/9  ]
          [ 0   1    -1/3   -4/9 ]              [ 0   1   -1/3   -4/9  ]
          [ 0   7    -4      3   ]              [ 0   0   -5/3   55/9  ]

    →(1)  [ 1   0    4/3    7/9  ]   →(2), (2)  [ 1   0   0    17/3  ]
          [ 0   1   -1/3   -4/9  ]              [ 0   1   0    -5/3  ]
          [ 0   0    1    -11/3  ]              [ 0   0   1   -11/3  ]

The row-equivalence of A with the final matrix in the above sequence tells us in particular that the solutions of

    2x_1 -  x_2 + 3x_3 + 2x_4 = 0
     x_1 + 4x_2        -  x_4 = 0
    2x_1 + 6x_2 -  x_3 + 5x_4 = 0

and

    x_1 + (17/3)x_4 = 0
    x_2 -  (5/3)x_4 = 0
    x_3 - (11/3)x_4 = 0

are exactly the same. In the second system it is apparent that if we assign
any rational value c to x_4 we obtain a solution (-(17/3)c, (5/3)c, (11/3)c, c), and also that every solution is of this form.

EXAMPLE 6. Suppose F is the field of complex numbers and

    A = [ -1   i ]
        [ -i   3 ]
        [  1   2 ]

Performing elementary row operations on A, we find

    [ -1   i ]        [ 1  -i ]            [ 1  -i  ]                 [ 1   0 ]
    [ -i   3 ]  →(1)  [ -i  3 ]  →(2), (2) [ 0   4  ]  →(1), (2), (2) [ 0   1 ]
    [  1   2 ]        [  1  2 ]            [ 0  2+i ]                 [ 0   0 ]

Thus the system of equations

     -x_1 +  ix_2 = 0
    -ix_1 + 3x_2 = 0
      x_1 + 2x_2 = 0

has only the trivial solution x_1 = x_2 = 0.

In Examples 5 and 6 we were obviously not performing row operations at random. Our choice of row operations was motivated by a desire to simplify the coefficient matrix in a manner analogous to 'eliminating unknowns' in the system of linear equations. Let us now make a formal definition of the type of matrix at which we were attempting to arrive.

Definition. An m × n matrix R is called row-reduced if:

(a) the first non-zero entry in each non-zero row of R is equal to 1;
(b) each column of R which contains the leading non-zero entry of some row has all its other entries 0.

EXAMPLE 7. One example of a row-reduced matrix is the n × n (square) identity matrix I. This is the n × n matrix defined by

    I_{ij} = δ_{ij} = 1 if i = j,  0 if i ≠ j.

This is the first of many occasions on which we shall use the Kronecker delta (δ).

In Examples 5 and 6, the final matrices in the sequences exhibited there are row-reduced matrices. Two examples of matrices which are not row-reduced are:
    [ 1   0   0   0 ]        [ 0   2   1 ]
    [ 0   1  -1   0 ]        [ 1   0  -3 ]
    [ 0   0   1   0 ]        [ 0   0   0 ]

The second matrix fails to satisfy condition (a), because the leading non-zero entry of the first row is not 1. The first matrix does satisfy condition (a), but fails to satisfy condition (b) in column 3.

We shall now prove that we can pass from any given matrix to a row-reduced matrix, by means of a finite number of elementary row operations. In combination with Theorem 3, this will provide us with an effective tool for solving systems of linear equations.

Theorem 4. Every m × n matrix over the field F is row-equivalent to a row-reduced matrix.

Proof. Let A be an m × n matrix over F. If every entry in the first row of A is 0, then condition (a) is satisfied in so far as row 1 is concerned. If row 1 has a non-zero entry, let k be the smallest positive integer j for which A_{1j} ≠ 0. Multiply row 1 by A_{1k}^{-1}, and then condition (a) is satisfied with regard to row 1. Now for each i ≥ 2, add (-A_{ik}) times row 1 to row i. Now the leading non-zero entry of row 1 occurs in column k, that entry is 1, and every other entry in column k is 0.

Now consider the matrix which has resulted from above. If every entry in row 2 is 0, we do nothing to row 2. If some entry in row 2 is different from 0, we multiply row 2 by a scalar so that the leading non-zero entry is 1. In the event that row 1 had a leading non-zero entry in column k, this leading non-zero entry of row 2 cannot occur in column k; say it occurs in column k' ≠ k. By adding suitable multiples of row 2 to the various rows, we can arrange that all entries in column k' are 0, except the 1 in row 2. The important thing to notice is this: In carrying out these last operations, we will not change the entries of row 1 in columns 1, ..., k; nor will we change any entry of column k. Of course, if row 1 was identically 0, the operations with row 2 will not affect row 1.

Working with one row at a time in the above manner, it is clear that in a finite number of steps we will arrive at a row-reduced matrix.

Exercises

1. Find all solutions to the system of equations

    (1 - i)x_1 -       ix_2 = 0
         2x_1 + (1 - i)x_2 = 0.

2. If

    A = [ 3  -1   2 ]
        [ 2   1   1 ]
        [ 1  -3   0 ]

find all solutions of AX = 0 by row-reducing A.
3. If

    A = [  6  -4   0 ]
        [  4  -2   0 ]
        [ -1   0   3 ]

find all solutions of AX = 2X and all solutions of AX = 3X. (The symbol cX denotes the matrix each entry of which is c times the corresponding entry of X.)

4. Find a row-reduced matrix which is row-equivalent to

    A = [ i  -(1 + i)   0 ]
        [ 1     -2      1 ]
        [ 1     2i     -1 ]

6. Let

    A = [ a  b ]
        [ c  d ]

be a 2 × 2 matrix with complex entries. Suppose that A is row-reduced and also that a + b + c + d = 0. Prove that there are exactly three such matrices.

7. Prove that the interchange of two rows of a matrix can be accomplished by a finite sequence of elementary row operations of the other two types.

8. Consider the system of equations AX = 0 where

    A = [ a  b ]
        [ c  d ]

is a 2 × 2 matrix over the field F. Prove the following.
(a) If every entry of A is 0, then every pair (x_1, x_2) is a solution of AX = 0.
(b) If ad - bc ≠ 0, the system AX = 0 has only the trivial solution x_1 = x_2 = 0.
(c) If ad - bc = 0 and some entry of A is different from 0, then there is a solution (x_1^0, x_2^0) such that (x_1, x_2) is a solution if and only if there is a scalar y such that x_1 = yx_1^0, x_2 = yx_2^0.

1.4. Row-Reduced Echelon Matrices

Until now, our work with systems of linear equations was motivated by an attempt to find the solutions of such a system. In Section 1.3 we established a standardized technique for finding these solutions. We wish now to acquire some information which is slightly more theoretical, and for that purpose it is convenient to go a little beyond row-reduced matrices.

Definition. An m × n matrix R is called a row-reduced echelon matrix if:
(a) R is row-reduced;
(b) every row of R which has all its entries 0 occurs below every row which has a non-zero entry;
(c) if rows 1, ..., r are the non-zero rows of R, and if the leading non-zero entry of row i occurs in column k_i, i = 1, ..., r, then k_1 < k_2 < ... < k_r.

One can also describe an m × n row-reduced echelon matrix R as follows. Either every entry in R is 0, or there exists a positive integer r, 1 ≤ r ≤ m, and r positive integers k_1, ..., k_r with 1 ≤ k_i ≤ n and

(a) R_{ij} = 0 for i > r, and R_{ij} = 0 if j < k_i;
(b) R_{ik_j} = δ_{ij}, 1 ≤ i ≤ r, 1 ≤ j ≤ r;
(c) k_1 < ... < k_r.

EXAMPLE 8. Two examples of row-reduced echelon matrices are the n × n identity matrix, and the m × n zero matrix 0^{m,n}, in which all entries are 0. The reader should have no difficulty in making other examples, but we should like to give one non-trivial one:

    [ 0   1  -3   0   1/2 ]
    [ 0   0   0   1    2  ]
    [ 0   0   0   0    0  ]

Theorem 5. Every m × n matrix A is row-equivalent to a row-reduced echelon matrix.

Proof. We know that A is row-equivalent to a row-reduced matrix. All that we need observe is that by performing a finite number of row interchanges on a row-reduced matrix we can bring it to row-reduced echelon form.

In Examples 5 and 6, we saw the significance of row-reduced matrices in solving homogeneous systems of linear equations. Let us now discuss briefly the system RX = 0, when R is a row-reduced echelon matrix. Let rows 1, ..., r be the non-zero rows of R, and suppose that the leading non-zero entry of row i occurs in column k_i. The system RX = 0 then consists of r non-trivial equations. Also the unknown x_{k_i} will occur (with non-zero coefficient) only in the ith equation. If we let u_1, ..., u_{n-r} denote the (n - r) unknowns which are different from x_{k_1}, ..., x_{k_r}, then the r non-trivial equations in RX = 0 are of the form

(1-3)    x_{k_1} + Σ_{j=1}^{n-r} C_{1j}u_j = 0
             ...
         x_{k_r} + Σ_{j=1}^{n-r} C_{rj}u_j = 0.
All the solutions to the system of equations RX = 0 are obtained by assigning any values whatsoever to u_1, ..., u_{n-r} and then computing the corresponding values of x_{k_1}, ..., x_{k_r} from (1-3). For example, if R is the matrix displayed in Example 8, then r = 2, k_1 = 2, k_2 = 4, and the two non-trivial equations in the system RX = 0 are

    x_2 - 3x_3 + (1/2)x_5 = 0    or    x_2 = 3x_3 - (1/2)x_5
    x_4 + 2x_5 = 0               or    x_4 = -2x_5.

So we may assign any values to x_1, x_3, and x_5, say x_1 = a, x_3 = b, x_5 = c, and obtain the solution (a, 3b - (1/2)c, b, -2c, c).

Let us observe one thing more in connection with the system of equations RX = 0. If the number r of non-zero rows in R is less than n, then the system RX = 0 has a non-trivial solution, that is, a solution (x_1, ..., x_n) in which not every x_j is 0. For, since r < n, we can choose some x_j which is not among the r unknowns x_{k_1}, ..., x_{k_r}, and we can then construct a solution as above in which this x_j is 1. This observation leads us to one of the most fundamental facts concerning systems of homogeneous linear equations.

Theorem 6. If A is an m × n matrix and m < n, then the homogeneous system of linear equations AX = 0 has a non-trivial solution.

Proof. Let R be a row-reduced echelon matrix which is row-equivalent to A. Then the systems AX = 0 and RX = 0 have the same solutions by Theorem 3. If r is the number of non-zero rows in R, then certainly r ≤ m, and since m < n, we have r < n. It follows immediately from our remarks above that AX = 0 has a non-trivial solution.

Theorem 7. If A is an n × n (square) matrix, then A is row-equivalent to the n × n identity matrix if and only if the system of equations AX = 0 has only the trivial solution.

Proof. If A is row-equivalent to I, then AX = 0 and IX = 0 have the same solutions. Conversely, suppose AX = 0 has only the trivial solution X = 0. Let R be an n × n row-reduced echelon matrix which is row-equivalent to A, and let r be the number of non-zero rows of R. Then RX = 0 has no non-trivial solution. Thus r ≥ n. But since R has n rows, certainly r ≤ n, and we have r = n. Since this means that R actually has a leading non-zero entry of 1 in each of its n rows, and since these 1's occur each in a different one of the n columns, R must be the n × n identity matrix.

Let us now ask what elementary row operations do toward solving a system of linear equations AX = Y which is not homogeneous. At the outset, one must observe one basic difference between this and the homogeneous case, namely, that while the homogeneous system always has the
trivial solution x_1 = ... = x_n = 0, an inhomogeneous system need have no solution at all.

We form the augmented matrix A' of the system AX = Y. This is the m × (n + 1) matrix whose first n columns are the columns of A and whose last column is Y. More precisely,

    A'_{ij} = A_{ij},    if j ≤ n
    A'_{i(n+1)} = y_i.

Suppose we perform a sequence of elementary row operations on A, arriving at a row-reduced echelon matrix R. If we perform this same sequence of row operations on the augmented matrix A', we will arrive at a matrix R' whose first n columns are the columns of R and whose last column contains certain scalars z_1, ..., z_m. The scalars z_i are the entries of the m × 1 matrix

    Z = [ z_1 ]
        [ ... ]
        [ z_m ]

which results from applying the sequence of row operations to the matrix Y. It should be clear to the reader that, just as in the proof of Theorem 3, the systems AX = Y and RX = Z are equivalent and hence have the same solutions. It is very easy to determine whether the system RX = Z has any solutions and to determine all the solutions if any exist. For, if R has r non-zero rows, with the leading non-zero entry of row i occurring in column k_i, i = 1, ..., r, then the first r equations of RX = Z effectively express x_{k_1}, ..., x_{k_r} in terms of the (n - r) remaining x_j and the scalars z_1, ..., z_r. The last (m - r) equations are

    0 = z_{r+1}
      ...
    0 = z_m

and accordingly the condition for the system to have a solution is z_i = 0 for i > r. If this condition is satisfied, all solutions to the system are found just as in the homogeneous case, by assigning arbitrary values to (n - r) of the x_j and then computing x_{k_i} from the ith equation.

EXAMPLE 9. Let F be the field of rational numbers and

    A = [ 1  -2   1 ]
        [ 2   1   1 ]
        [ 0   5  -1 ]

and suppose that we wish to solve the system AX = Y for some y_1, y_2, and y_3. Let us perform a sequence of row operations on the augmented matrix A' which row-reduces A:
    [ 1  -2   1   y_1 ]        [ 1  -2   1   y_1        ]        [ 1  -2   1   y_1              ]
    [ 2   1   1   y_2 ]  →(2)  [ 0   5  -1   y_2 - 2y_1 ]  →(2)  [ 0   5  -1   y_2 - 2y_1       ]
    [ 0   5  -1   y_3 ]        [ 0   5  -1   y_3        ]        [ 0   0   0   y_3 - y_2 + 2y_1 ]

    →(1)  [ 1  -2   1     y_1               ]   →(2)  [ 1   0   3/5   (1/5)(y_1 + 2y_2) ]
          [ 0   1  -1/5   (1/5)(y_2 - 2y_1) ]         [ 0   1  -1/5   (1/5)(y_2 - 2y_1) ]
          [ 0   0   0     y_3 - y_2 + 2y_1  ]         [ 0   0   0     y_3 - y_2 + 2y_1  ]

The condition that the system AX = Y have a solution is thus

    2y_1 - y_2 + y_3 = 0

and if the given scalars y_i satisfy this condition, all solutions are obtained by assigning a value c to x_3 and then computing

    x_1 = -(3/5)c + (1/5)(y_1 + 2y_2)
    x_2 =  (1/5)c + (1/5)(y_2 - 2y_1).

Let us observe one final thing about the system AX = Y. Suppose the entries of the matrix A and the scalars y_1, ..., y_m happen to lie in a subfield F_1 of the field F. If the system of equations AX = Y has a solution with x_1, ..., x_n in F, it has a solution with x_1, ..., x_n in F_1. For, over either field, the condition for the system to have a solution is that certain relations hold between y_1, ..., y_m in F_1 (the relations z_i = 0 for i > r, above). For example, if AX = Y is a system of linear equations in which the scalars y_k and A_{ij} are real numbers, and if there is a solution in which x_1, ..., x_n are complex numbers, then there is a solution with x_1, ..., x_n real numbers.

Exercises

1. Find all solutions to the following system of equations by row-reducing the coefficient matrix:

    (1/3)x_1 + 2x_2 -    6x_3 = 0
        -4x_1        +    5x_3 = 0
        -3x_1 + 6x_2 -   13x_3 = 0
    -(7/3)x_1 + 2x_2 - (8/3)x_3 = 0

2. Find a row-reduced echelon matrix which is row-equivalent to

    A = [ 1    -i  ]
        [ 2     2  ]
        [ i   1 + i ]

What are the solutions of AX = 0?
3. Describe explicitly all 2 × 2 row-reduced echelon matrices.

4. Consider the system of equations

     x_1 -  x_2 + 2x_3 = 1
    2x_1        + 2x_3 = 1
     x_1 - 3x_2 + 4x_3 = 2.

Does this system have a solution? If so, describe explicitly all solutions.

5. Give an example of a system of two linear equations in two unknowns which has no solution.

6. Show that the system

    x_1 - 2x_2 +  x_3 + 2x_4 = 1
    x_1 +  x_2 -  x_3 +  x_4 = 2
    x_1 + 7x_2 - 5x_3 -  x_4 = 3

has no solution.

7. Find all solutions of

    2x_1 - 3x_2 - 7x_3 + 5x_4 + 2x_5 = -2
     x_1 - 2x_2 - 4x_3 + 3x_4 +  x_5 = -2
    2x_1        - 4x_3 + 2x_4 +  x_5 =  3
     x_1 - 5x_2 - 7x_3 + 6x_4 + 2x_5 = -7.

8. Let

    A = [ 3  -1   2 ]
        [ 2   1   1 ]
        [ 1  -3   0 ]

For which triples (y_1, y_2, y_3) does the system AX = Y have a solution?

9. Let

    A = [  3  -6   2  -1 ]
        [ -2   4   1   3 ]
        [  0   0   1   1 ]
        [  1  -2   1   0 ]

For which (y_1, y_2, y_3, y_4) does the system of equations AX = Y have a solution?

10. Suppose R and R' are 2 × 3 row-reduced echelon matrices and that the systems RX = 0 and R'X = 0 have exactly the same solutions. Prove that R = R'.
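It may help to see the whole procedure of this section carried out mechanically. The Python sketch below is an editorial illustration, not part of the text: it row-reduces a matrix to row-reduced echelon form by the method of Theorems 4 and 5, and then applies the routine to the augmented matrix of Example 9, with a particular right-hand side chosen to satisfy the condition 2y_1 - y_2 + y_3 = 0. The Fraction type keeps the arithmetic exact over the rationals.

```python
from fractions import Fraction

def row_reduced_echelon(M):
    """Row-reduce M (a list of lists of Fractions) to row-reduced echelon form:
    find a pivot in each column, scale it to 1 (type 1), clear the rest of the
    column (type 2), using interchanges (type 3) to order the non-zero rows."""
    A = [row[:] for row in M]
    m, n = len(A), len(A[0])
    pivot_row = 0
    for k in range(n):
        # look for a row at or below pivot_row with a non-zero entry in column k
        r = next((i for i in range(pivot_row, m) if A[i][k] != 0), None)
        if r is None:
            continue
        A[pivot_row], A[r] = A[r], A[pivot_row]                        # type 3
        A[pivot_row] = [x / A[pivot_row][k] for x in A[pivot_row]]     # type 1
        for i in range(m):                                             # type 2
            if i != pivot_row and A[i][k] != 0:
                c = A[i][k]
                A[i] = [A[i][j] - c * A[pivot_row][j] for j in range(n)]
        pivot_row += 1
        if pivot_row == m:
            break
    return A

# Example 9: solve AX = Y by row-reducing the augmented matrix [A | Y].
A = [[1, -2, 1], [2, 1, 1], [0, 5, -1]]
Y = [Fraction(1), Fraction(4), Fraction(2)]   # chosen so that 2*y1 - y2 + y3 = 0
augmented = [[Fraction(a) for a in row] + [y] for row, y in zip(A, Y)]
for row in row_reduced_echelon(augmented):
    print(row)
```

The printed rows correspond to x_1 + (3/5)x_3 = 9/5, x_2 - (1/5)x_3 = 2/5, and 0 = 0, so x_3 may be chosen freely, exactly as in the discussion above.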
1.5. Matrix Multiplication

It is apparent (or should be, at any rate) that the process of forming linear combinations of the rows of a matrix is a fundamental one. For this reason it is advantageous to introduce a systematic scheme for indicating just what operations are to be performed. More specifically, suppose B is an n × p matrix over a field F with rows β_1, ..., β_n and that from B we construct a matrix C with rows γ_1, ..., γ_m by forming certain linear combinations

(1-4)    γ_i = A_{i1}β_1 + A_{i2}β_2 + ... + A_{in}β_n.

The rows of C are determined by the mn scalars A_{ij} which are themselves the entries of an m × n matrix A. If (1-4) is expanded to

    (C_{i1} ... C_{ip}) = Σ_{r=1}^{n} (A_{ir}B_{r1} ... A_{ir}B_{rp})

we see that the entries of C are given by

    C_{ij} = Σ_{r=1}^{n} A_{ir}B_{rj}.

Definition. Let A be an m × n matrix over the field F and let B be an n × p matrix over F. The product AB is the m × p matrix C whose i, j entry is

    C_{ij} = Σ_{r=1}^{n} A_{ir}B_{rj}.

EXAMPLE 10. Here are some products of matrices with rational entries.

(a)
    [ 5  -1   2 ]   [  1   0 ] [  5  -1   2 ]
    [ 0   7   2 ] = [ -3   1 ] [ 15   4   8 ]

Here

    γ_1 = (5, -1, 2) =  1·(5, -1, 2) + 0·(15, 4, 8)
    γ_2 = (0, 7, 2)  = -3·(5, -1, 2) + 1·(15, 4, 8)

The remaining parts (b) through (g) of this example are further numerical products of various sizes; in parts (d), (e), (f), and (g) both orders of multiplication are defined.
It is important to observe that the product of two matrices need not be defined; the product is defined if and only if the number of columns in the first matrix coincides with the number of rows in the second matrix. Thus it is meaningless to interchange the order of the factors in (a), (b), and (c) above. Frequently we shall write products such as AB without explicitly mentioning the sizes of the factors, and in such cases it will be understood that the product is defined. From (d), (e), (f), (g) we find that even when the products AB and BA are both defined it need not be true that AB = BA; in other words, matrix multiplication is not commutative.

EXAMPLE 11.

(a) If I is the m × m identity matrix and A is an m × n matrix, IA = A.
(b) If I is the n × n identity matrix and A is an m × n matrix, AI = A.
(c) If 0^{k,m} is the k × m zero matrix, 0^{k,n} = 0^{k,m}A. Similarly, A0^{n,p} = 0^{m,p}.

EXAMPLE 12. Let A be an m × n matrix over F. Our earlier shorthand notation, AX = Y, for systems of linear equations is consistent with our definition of matrix products. For if

    X = [ x_1 ]
        [ ... ]
        [ x_n ]

with x_i in F, then AX is the m × 1 matrix

    Y = [ y_1 ]
        [ ... ]
        [ y_m ]

such that y_i = A_{i1}x_1 + A_{i2}x_2 + ... + A_{in}x_n.

The use of column matrices suggests a notation which is frequently useful. If B is an n × p matrix, the columns of B are the n × 1 matrices B_1, ..., B_p defined by

    B_j = [ B_{1j} ]
          [  ...  ]
          [ B_{nj} ],        1 ≤ j ≤ p.

The matrix B is the succession of these columns:

    B = [B_1, ..., B_p].

The i, j entry of the product matrix AB is formed from the ith row of A
and the jth column of B. The reader should verify that the jth column of AB is AB_j:

    AB = [AB_1, ..., AB_p].

In spite of the fact that a product of matrices depends upon the order in which the factors are written, it is independent of the way in which they are associated, as the next theorem shows.

Theorem 8. If A, B, C are matrices over the field F such that the products BC and A(BC) are defined, then so are the products AB, (AB)C and

    A(BC) = (AB)C.

Proof. Suppose B is an n × p matrix. Since BC is defined, C is a matrix with p rows, and BC has n rows. Because A(BC) is defined we may assume A is an m × n matrix. Thus the product AB exists and is an m × p matrix, from which it follows that the product (AB)C exists. To show that A(BC) = (AB)C means to show that

    [A(BC)]_{ij} = [(AB)C]_{ij}

for each i, j. By definition

    [A(BC)]_{ij} = Σ_r A_{ir}(BC)_{rj}
                 = Σ_r A_{ir} Σ_s B_{rs}C_{sj}
                 = Σ_r Σ_s A_{ir}B_{rs}C_{sj}
                 = Σ_s (Σ_r A_{ir}B_{rs}) C_{sj}
                 = Σ_s (AB)_{is}C_{sj}
                 = [(AB)C]_{ij}.

When A is an n × n (square) matrix, the product AA is defined. We shall denote this matrix by A². By Theorem 8, (AA)A = A(AA) or A²A = AA², so that the product AAA is unambiguously defined. This product we denote by A³. In general, the product AA ⋯ A (k times) is unambiguously defined, and we shall denote this product by A^k.

Note that the relation A(BC) = (AB)C implies among other things that linear combinations of linear combinations of the rows of C are again linear combinations of the rows of C.

If B is a given matrix and C is obtained from B by means of an elementary row operation, then each row of C is a linear combination of the rows of B, and hence there is a matrix A such that AB = C. In general there are many such matrices A, and among all such it is convenient and
possible to choose one having a number of special properties. Before going into this we need to introduce a class of matrices.

Definition. An m × m matrix is said to be an elementary matrix if it can be obtained from the m × m identity matrix by means of a single elementary row operation.

EXAMPLE 13. A 2 × 2 elementary matrix is necessarily one of the following:

    [ 0  1 ]    [ 1  c ]    [ 1  0 ]
    [ 1  0 ],   [ 0  1 ],   [ c  1 ],

    [ c  0 ]                [ 1  0 ]
    [ 0  1 ]  (c ≠ 0),      [ 0  c ]  (c ≠ 0).

Theorem 9. Let e be an elementary row operation and let E be the m × m elementary matrix E = e(I). Then, for every m × n matrix A,

    e(A) = EA.

Proof. The point of the proof is that the entry in the ith row and jth column of the product matrix EA is obtained from the ith row of E and the jth column of A. The three types of elementary row operations should be taken up separately. We shall give a detailed proof for an operation of type (2). The other two cases are even easier to handle than this one and will be left as exercises. Suppose r ≠ s and e is the operation 'replacement of row r by row r plus c times row s.' Then

    E_{ik} = δ_{ik},                i ≠ r
    E_{rk} = δ_{rk} + cδ_{sk}.

Therefore,

    (EA)_{ij} = Σ_k E_{ik}A_{kj} = A_{ij},    i ≠ r
    (EA)_{rj} = Σ_k (δ_{rk} + cδ_{sk})A_{kj} = A_{rj} + cA_{sj}.

In other words EA = e(A).

Corollary. Let A and B be m × n matrices over the field F. Then B is row-equivalent to A if and only if B = PA, where P is a product of m × m elementary matrices.

Proof. Suppose B = PA where P = E_k ⋯ E_2E_1 and the E_i are m × m elementary matrices. Then E_1A is row-equivalent to A, and E_2(E_1A) is row-equivalent to E_1A. So E_2E_1A is row-equivalent to A; and continuing in this way we see that (E_k ⋯ E_1)A is row-equivalent to A.

Now suppose that B is row-equivalent to A. Let E_1, E_2, ..., E_k be the elementary matrices corresponding to some sequence of elementary row operations which carries A into B. Then B = (E_k ⋯ E_1)A.
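For readers who like to check such identities by machine, the following Python sketch is an editorial illustration, not part of the text: it implements the definition C_{ij} = Σ_r A_{ir}B_{rj} and verifies, on one numerical instance, both Theorem 9 (e(A) = EA for an operation of the second type) and the associativity of Theorem 8.

```python
from fractions import Fraction

def matmul(A, B):
    """(AB)_{ij} = sum over r of A_{ir} * B_{rj}; defined only when the number
    of columns of A equals the number of rows of B."""
    assert len(A[0]) == len(B)
    return [[sum(A[i][r] * B[r][j] for r in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def identity(m):
    return [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]

def e(M, c):
    """The elementary operation 'replace row 0 by row 0 plus c times row 1'."""
    B = [row[:] for row in M]
    B[0] = [B[0][j] + c * B[1][j] for j in range(len(B[0]))]
    return B

c = Fraction(2)
A = [[Fraction(x) for x in row] for row in [[0, 1, 2], [3, 4, 5], [6, 7, 8]]]
E = e(identity(3), c)            # E = e(I), an elementary matrix
assert matmul(E, A) == e(A, c)   # Theorem 9: e(A) = EA

B = [[Fraction(x) for x in row] for row in [[1, 0], [0, 1], [1, 1]]]
C = [[Fraction(x) for x in row] for row in [[2, 3], [4, 5]]]
assert matmul(A, matmul(B, C)) == matmul(matmul(A, B), C)   # Theorem 8
```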
Exercises

1. Let

    A = [ 2  -1   1 ]        B = [  3 ]        C = [ 1  -1 ]
        [ 1   2   1 ]            [  1 ]
                                 [ -1 ]

Compute ABC and CAB.

2. Let

    A = [ 1  -1   1 ]        B = [ 2  -2 ]
        [ 2   0   1 ]            [ 1   3 ]
        [ 3   0   1 ]            [ 4   4 ]

Verify directly that A(AB) = A²B.

3. Find two different 2 × 2 matrices A such that A² = 0 but A ≠ 0.

4. For the matrix A of Exercise 2, find elementary matrices E_1, E_2, ..., E_k such that

    E_k ⋯ E_2E_1A = I.

5. Let

    A = [ 1  -1 ]        B = [  3   1 ]
        [ 2   2 ]            [ -4   4 ]
        [ 1   0 ]

Is there a matrix C such that CA = B?

6. Let A be an m × n matrix and B an n × k matrix. Show that the columns of C = AB are linear combinations of the columns of A. If α_1, ..., α_n are the columns of A and γ_1, ..., γ_k are the columns of C, then

    γ_j = Σ_{r=1}^{n} B_{rj}α_r.

7. Let A and B be 2 × 2 matrices such that AB = I. Prove that BA = I.

8. Let

    C = [ C_{11}  C_{12} ]
        [ C_{21}  C_{22} ]

be a 2 × 2 matrix. We inquire when it is possible to find 2 × 2 matrices A and B such that C = AB - BA. Prove that such matrices can be found if and only if C_{11} + C_{22} = 0.

1.6. Invertible Matrices

Suppose P is an m × m matrix which is a product of elementary matrices. For each m × n matrix A, the matrix B = PA is row-equivalent to A; hence A is row-equivalent to B and there is a product Q of elementary matrices such that A = QB. In particular this is true when A is the
m × m identity matrix. In other words, there is an m × m matrix Q, which is itself a product of elementary matrices, such that QP = I. As we shall soon see, the existence of a Q with QP = I is equivalent to the fact that P is a product of elementary matrices.

Definition. Let A be an n × n (square) matrix over the field F. An n × n matrix B such that BA = I is called a left inverse of A; an n × n matrix B such that AB = I is called a right inverse of A. If AB = BA = I, then B is called a two-sided inverse of A and A is said to be invertible.

Lemma. If A has a left inverse B and a right inverse C, then B = C.

Proof. Suppose BA = I and AC = I. Then

    B = BI = B(AC) = (BA)C = IC = C.

Thus if A has a left and a right inverse, A is invertible and has a unique two-sided inverse, which we shall denote by A^{-1} and simply call the inverse of A.

Theorem 10. Let A and B be n × n matrices over F.

(i) If A is invertible, so is A^{-1} and (A^{-1})^{-1} = A.
(ii) If both A and B are invertible, so is AB, and (AB)^{-1} = B^{-1}A^{-1}.

Proof. The first statement is evident from the symmetry of the definition. The second follows upon verification of the relations

    (AB)(B^{-1}A^{-1}) = (B^{-1}A^{-1})(AB) = I.

Corollary. A product of invertible matrices is invertible.

Theorem 11. An elementary matrix is invertible.

Proof. Let E be an elementary matrix corresponding to the elementary row operation e. If e_1 is the inverse operation of e (Theorem 2) and E_1 = e_1(I), then

    EE_1 = e(E_1) = e(e_1(I)) = I

and

    E_1E = e_1(E) = e_1(e(I)) = I

so that E is invertible and E_1 = E^{-1}.

EXAMPLE 14.

(a)  [ 0  1 ]^{-1}    [ 0  1 ]
     [ 1  0 ]       = [ 1  0 ]

(b)  [ 1  c ]^{-1}    [ 1  -c ]
     [ 0  1 ]       = [ 0   1 ]
(c)  [ 1  0 ]^{-1}    [  1  0 ]
     [ c  1 ]       = [ -c  1 ]

(d) When c ≠ 0,

     [ c  0 ]^{-1}    [ c^{-1}  0 ]              [ 1  0 ]^{-1}    [ 1   0     ]
     [ 0  1 ]       = [ 0       1 ]     and      [ 0  c ]       = [ 0   c^{-1} ]

Theorem 12. If A is an n × n matrix, the following are equivalent.

(i) A is invertible.
(ii) A is row-equivalent to the n × n identity matrix.
(iii) A is a product of elementary matrices.

Proof. Let R be a row-reduced echelon matrix which is row-equivalent to A. By Theorem 9 (or its corollary),

    R = E_k ⋯ E_2E_1A

where E_1, ..., E_k are elementary matrices. Each E_i is invertible, and so

    A = E_1^{-1} ⋯ E_k^{-1}R.

Since products of invertible matrices are invertible, we see that A is invertible if and only if R is invertible. Since R is a (square) row-reduced echelon matrix, R is invertible if and only if each row of R contains a non-zero entry, that is, if and only if R = I. We have now shown that A is invertible if and only if R = I, and if R = I then A = E_1^{-1} ⋯ E_k^{-1}. It should now be apparent that (i), (ii), and (iii) are equivalent statements about A.

Corollary. If A is an invertible n × n matrix and if a sequence of elementary row operations reduces A to the identity, then that same sequence of operations when applied to I yields A^{-1}.

Corollary. Let A and B be m × n matrices. Then B is row-equivalent to A if and only if B = PA where P is an invertible m × m matrix.

Theorem 13. For an n × n matrix A, the following are equivalent.

(i) A is invertible.
(ii) The homogeneous system AX = 0 has only the trivial solution X = 0.
(iii) The system of equations AX = Y has a solution X for each n × 1 matrix Y.

Proof. According to Theorem 7, condition (ii) is equivalent to the fact that A is row-equivalent to the identity matrix. By Theorem 12, (i) and (ii) are therefore equivalent. If A is invertible, the solution of AX = Y is X = A^{-1}Y. Conversely, suppose AX = Y has a solution for each given Y.
Let R be a row-reduced echelon matrix which is row-equivalent to A. We wish to show that R = I. That amounts to showing that the last row of R is not (identically) 0. Let

    E = [ 0 ]
        [ 0 ]
        [ ⋮ ]
        [ 1 ]

If the system RX = E can be solved for X, the last row of R cannot be 0. We know that R = PA, where P is invertible. Thus RX = E if and only if AX = P^{-1}E. According to (iii), the latter system has a solution.

Corollary. A square matrix with either a left or right inverse is invertible.

Proof. Let A be an n × n matrix. Suppose A has a left inverse, i.e., a matrix B such that BA = I. Then AX = 0 has only the trivial solution, because X = IX = B(AX). Therefore A is invertible. On the other hand, suppose A has a right inverse, i.e., a matrix C such that AC = I. Then C has a left inverse and is therefore invertible. It then follows that A = C^{-1} and so A is invertible with inverse C.

Corollary. Let A = A_1A_2 ⋯ A_k, where A_1, ..., A_k are n × n (square) matrices. Then A is invertible if and only if each A_j is invertible.

Proof. We have already shown that the product of two invertible matrices is invertible. From this one sees easily that if each A_j is invertible then A is invertible.

Suppose now that A is invertible. We first prove that A_k is invertible. Suppose X is an n × 1 matrix and A_kX = 0. Then AX = (A_1 ⋯ A_{k-1})A_kX = 0. Since A is invertible, we must have X = 0. The system of equations A_kX = 0 thus has no non-trivial solution, so A_k is invertible. But now A_1 ⋯ A_{k-1} = AA_k^{-1} is invertible. By the preceding argument, A_{k-1} is invertible. Continuing in this way, we conclude that each A_j is invertible.

We should like to make one final comment about the solution of linear equations. Suppose A is an m × n matrix and we wish to solve the system of equations AX = Y. If R is a row-reduced echelon matrix which is row-equivalent to A, then R = PA where P is an m × m invertible matrix. The solutions of the system AX = Y are exactly the same as the solutions of the system RX = PY (= Z). In practice, it is not much more difficult to find the matrix P than it is to row-reduce A to R. For, suppose we form the augmented matrix A' of the system AX = Y, with arbitrary scalars y_1, ..., y_m occurring in the last column. If we then perform on A' a sequence of elementary row operations which leads from A to R, it will
Page 33 :
Example 16. Let us find the inverse of

A = [[1,   1/2, 1/3],
     [1/2, 1/3, 1/4],
     [1/3, 1/4, 1/5]].

Carrying along two sequences of matrices, one reducing A to the identity and the other recording the effect of the same operations starting from the identity, we arrive at the pair

[[1, 0, 0],        [[  9,  -36,   30],
 [0, 1, 0],   and   [-36,  192, -180],
 [0, 0, 1]]         [ 30, -180,  180]],

so that

A⁻¹ = [[  9,  -36,   30],
       [-36,  192, -180],
       [ 30, -180,  180]].

It must have occurred to the reader that we have carried on a lengthy discussion of the rows of matrices and have said little about the columns. We focused our attention on the rows because this seemed more natural from the point of view of linear equations. Since there is obviously nothing sacred about rows, the discussion in the last sections could have been carried on using columns rather than rows. If one defines an elementary column operation and column-equivalence in a manner analogous to that of elementary row operation and row-equivalence, it is clear that each m × n matrix will be column-equivalent to a 'column-reduced echelon' matrix. Also each elementary column operation will be of the form A → AE, where E is an n × n elementary matrix, and so on.
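The procedure of Example 16 is easy to mechanize. The sketch below is an added illustration, not part of the text: it row-reduces the augmented matrix [A | I] with exact rational arithmetic (Python's fractions module, under the assumption that the entries lie in the rationals) and reads A⁻¹ off the right-hand half. Run on the matrix of Example 16 it reproduces the inverse found above; run on larger matrices of the same form it supports the conjecture of Exercise 12 below.

    from fractions import Fraction

    def invert(A):
        """Gauss-Jordan inversion: row-reduce the augmented matrix [A | I].
        Returns the inverse as Fractions, or None if A is not invertible."""
        n = len(A)
        M = [[Fraction(A[i][j]) for j in range(n)] + [Fraction(int(i == j)) for j in range(n)]
             for i in range(n)]
        for col in range(n):
            pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
            if pivot is None:
                return None                      # no pivot in this column: A is not invertible
            M[col], M[pivot] = M[pivot], M[col]  # interchange two rows
            M[col] = [x / M[col][col] for x in M[col]]   # scale the pivot row to get a leading 1
            for r in range(n):
                if r != col and M[r][col] != 0:          # clear the rest of the column
                    M[r] = [x - M[r][col] * y for x, y in zip(M[r], M[col])]
        return [row[n:] for row in M]            # the right-hand half is A^-1

    # The matrix of Example 16:
    A = [[Fraction(1, i + j + 1) for j in range(3)] for i in range(3)]
    print([[int(x) for x in row] for row in invert(A)])
    # [[9, -36, 30], [-36, 192, -180], [30, -180, 180]]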
Exercises

1. Let

Find a row-reduced echelon matrix R which is row-equivalent to A and an invertible 3 × 3 matrix P such that R = PA.

2. Do Exercise 1, but with

3. For each of the two matrices

use elementary row operations to discover whether it is invertible, and to find the inverse in case it is.

4. Let

A = [[5, 0, 0],
     [1, 5, 0],
     [0, 1, 5]].

For which X does there exist a scalar c such that AX = cX?

5. Discover whether

A = [[1, 2, 3, 4],
     [0, 2, 3, 4],
     [0, 0, 3, 4],
     [0, 0, 0, 4]]

is invertible, and find A⁻¹ if it exists.

6. Suppose A is a 2 × 1 matrix and that B is a 1 × 2 matrix. Prove that C = AB is not invertible.

7. Let A be an n × n (square) matrix. Prove the following two statements:
(a) If A is invertible and AB = 0 for some n × n matrix B, then B = 0.
(b) If A is not invertible, then there exists an n × n matrix B such that AB = 0 but B ≠ 0.

8. Let

A = [[a, b],
     [c, d]].

Prove, using elementary row operations, that A is invertible if and only if (ad − bc) ≠ 0.

9. An n × n matrix A is called upper-triangular if Aij = 0 for i > j, that is, if every entry below the main diagonal is 0. Prove that an upper-triangular (square) matrix is invertible if and only if every entry on its main diagonal is different from 0.

10. Prove the following generalization of Exercise 6. If A is an m × n matrix, B is an n × m matrix and n < m, then AB is not invertible.

11. Let A be an m × n matrix. Show that by means of a finite number of elementary row and/or column operations one can pass from A to a matrix R which is both 'row-reduced echelon' and 'column-reduced echelon,' i.e., Rij = 0 if i ≠ j, Rii = 1, 1 ≤ i ≤ r, Rii = 0 if i > r. Show that R = PAQ, where P is an invertible m × m matrix and Q is an invertible n × n matrix.

12. The result of Example 16 suggests that perhaps the matrix

A = [[1,   1/2,     ...,  1/n],
     [1/2, 1/3,     ...,  1/(n+1)],
      ...
     [1/n, 1/(n+1), ...,  1/(2n-1)]]

is invertible and A⁻¹ has integer entries. Can you prove that?
2. Vector Spaces

2.1. Vector Spaces

In various parts of mathematics, one is confronted with a set, such that it is both meaningful and interesting to deal with 'linear combinations' of the objects in that set. For example, in our study of linear equations we found it quite natural to consider linear combinations of the rows of a matrix. It is likely that the reader has studied calculus and has dealt there with linear combinations of functions; certainly this is so if he has studied differential equations. Perhaps the reader has had some experience with vectors in three-dimensional Euclidean space, and in particular with linear combinations of such vectors.

Loosely speaking, linear algebra is that branch of mathematics which treats the common properties of algebraic systems which consist of a set, together with a reasonable notion of a 'linear combination' of elements in the set. In this section we shall define the mathematical object which experience has shown to be the most useful abstraction of this type of algebraic system.

Definition. A vector space (or linear space) consists of the following:

1. a field F of scalars;
2. a set V of objects, called vectors;
3. a rule (or operation), called vector addition, which associates with each pair of vectors α, β in V a vector α + β in V, called the sum of α and β, in such a way that
(a) addition is commutative, α + β = β + α;
(b) addition is associative, α + (β + γ) = (α + β) + γ;
(c) there is a unique vector 0 in V, called the zero vector, such that α + 0 = α for all α in V;
(d) for each vector α in V there is a unique vector −α in V such that α + (−α) = 0;
4. a rule (or operation), called scalar multiplication, which associates with each scalar c in F and vector α in V a vector cα in V, called the product of c and α, in such a way that
(a) 1α = α for every α in V;
(b) (c1c2)α = c1(c2α);
(c) c(α + β) = cα + cβ;
(d) (c1 + c2)α = c1α + c2α.

It is important to observe, as the definition states, that a vector space is a composite object consisting of a field, a set of 'vectors,' and two operations with certain special properties. The same set of vectors may be part of a number of distinct vector spaces (see Example 5 below). When there is no chance of confusion, we may simply refer to the vector space as V, or when it is desirable to specify the field, we shall say V is a vector space over the field F. The name 'vector' is applied to the elements of the set V largely as a matter of convenience. The origin of the name is to be found in Example 1 below, but one should not attach too much significance to the name, since the variety of objects occurring as the vectors in V may not bear much resemblance to any preassigned concept of vector which the reader has. We shall try to indicate this variety by a list of examples; our list will be enlarged considerably as we begin to study vector spaces.

Example 1. The n-tuple space, Fⁿ. Let F be any field, and let V be the set of all n-tuples α = (x1, x2, ..., xn) of scalars xi in F. If β = (y1, y2, ..., yn) with yi in F, the sum of α and β is defined by

(2-1)  α + β = (x1 + y1, x2 + y2, ..., xn + yn).

The product of a scalar c and vector α is defined by

(2-2)  cα = (cx1, cx2, ..., cxn).

The fact that this vector addition and scalar multiplication satisfy conditions (3) and (4) is easy to verify, using the similar properties of addition and multiplication of elements of F.

Example 2. The space of m × n matrices, F^{m×n}. Let F be any field and let m and n be positive integers. Let F^{m×n} be the set of all m × n matrices over the field F. The sum of two vectors A and B in F^{m×n} is defined by

(2-3)  (A + B)ij = Aij + Bij.
The product of a scalar c and the matrix A is defined by

(2-4)  (cA)ij = cAij.

Note that F^{1×n} = Fⁿ.

Example 3. The space of functions from a set to a field. Let F be any field and let S be any non-empty set. Let V be the set of all functions from the set S into F. The sum of two vectors f and g in V is the vector f + g, i.e., the function from S into F, defined by

(2-5)  (f + g)(s) = f(s) + g(s).

The product of the scalar c and the function f is the function cf defined by

(2-6)  (cf)(s) = cf(s).

The preceding examples are special cases of this one. For an n-tuple of elements of F may be regarded as a function from the set S of integers 1, ..., n into F. Similarly, an m × n matrix over the field F is a function from the set S of pairs of integers (i, j), 1 ≤ i ≤ m, 1 ≤ j ≤ n, into the field F. For this third example we shall indicate how one verifies that the operations we have defined satisfy conditions (3) and (4). For vector addition:

(a) Since addition in F is commutative,

f(s) + g(s) = g(s) + f(s)

for each s in S, so the functions f + g and g + f are identical.

(b) Since addition in F is associative,

f(s) + [g(s) + h(s)] = [f(s) + g(s)] + h(s)

for each s, so f + (g + h) is the same function as (f + g) + h.

(c) The unique zero vector is the zero function which assigns to each element of S the scalar 0 in F.

(d) For each f in V, (−f) is the function which is given by

(−f)(s) = −f(s).

The reader should find it easy to verify that scalar multiplication satisfies the conditions of (4), by arguing as we did with the vector addition.

Example 4. The space of polynomial functions over a field F. Let F be a field and let V be the set of all functions f from F into F which have a rule of the form

(2-7)  f(x) = c0 + c1x + ··· + cnxⁿ

where c0, c1, ..., cn are fixed scalars in F (independent of x). A function of this type is called a polynomial function on F. Let addition and scalar multiplication be defined as in Example 3. One must observe here that if f and g are polynomial functions and c is in F, then f + g and cf are again polynomial functions.
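In Examples 1, 2, and 3 the operations are defined pointwise, and the n-tuple and matrix spaces are the cases in which S is a finite set of indices. The short Python sketch below is an added illustration, not part of the text: it represents vectors of Example 3 as ordinary functions and forms f + g and cf exactly as in (2-5) and (2-6).

    def add(f, g):
        """Vector addition in the space of functions from S into F: (f + g)(s) = f(s) + g(s)."""
        return lambda s: f(s) + g(s)

    def scale(c, f):
        """Scalar multiplication: (cf)(s) = c * f(s)."""
        return lambda s: c * f(s)

    # An n-tuple in F^n is just a function on S = {0, 1, ..., n-1}:
    x = (1, 2, 3)
    f = lambda i: x[i]              # the function corresponding to the triple (1, 2, 3)
    g = lambda i: [4, 5, 6][i]
    h = add(scale(2, f), g)         # the vector 2x + y, computed pointwise
    print([h(i) for i in range(3)]) # [6, 9, 12]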
Example 5. The field C of complex numbers may be regarded as a vector space over the field R of real numbers. More generally, let F be the field of real numbers and let V be the set of n-tuples α = (x1, ..., xn) where x1, ..., xn are complex numbers. Define addition of vectors and scalar multiplication by (2-1) and (2-2), as in Example 1. In this way we obtain a vector space over the field R which is quite different from the space Cⁿ and the space Rⁿ.

There are a few simple facts which follow almost immediately from the definition of a vector space, and we proceed to derive these. If c is a scalar and 0 is the zero vector, then by 3(c) and 4(c)

c0 = c(0 + 0) = c0 + c0.

Adding −(c0) and using 3(d), we obtain

(2-8)  c0 = 0.

Similarly, for the scalar 0 and any vector α we find that

(2-9)  0α = 0.

If c is a non-zero scalar and α is a vector such that cα = 0, then by (2-8), c⁻¹(cα) = 0. But

c⁻¹(cα) = (c⁻¹c)α = 1α = α;

hence, α = 0. Thus we see that if c is a scalar and α a vector such that cα = 0, then either c is the zero scalar or α is the zero vector.

If α is any vector in V, then

0 = 0α = (1 − 1)α = 1α + (−1)α = α + (−1)α

from which it follows that

(2-10)  (−1)α = −α.

Finally, the associative and commutative properties of vector addition imply that a sum involving a number of vectors is independent of the way in which these vectors are combined and associated. For example, if α1, α2, α3, α4 are vectors in V, then

(α1 + α2) + (α3 + α4) = [α2 + (α1 + α3)] + α4

and such a sum may be written without confusion as

α1 + α2 + α3 + α4.

Definition. A vector β in V is said to be a linear combination of the vectors α1, ..., αn in V provided there exist scalars c1, ..., cn in F such that

β = c1α1 + ··· + cnαn.
Other extensions of the associative property of vector addition and the distributive properties 4(c) and 4(d) of scalar multiplication apply to linear combinations:

Σ_{i=1}^{n} ciαi + Σ_{i=1}^{n} diαi = Σ_{i=1}^{n} (ci + di)αi

c Σ_{i=1}^{n} ciαi = Σ_{i=1}^{n} (cci)αi.

Certain parts of linear algebra are intimately related to geometry. The very word 'space' suggests something geometrical, as does the word 'vector' to most people. As we proceed with our study of vector spaces, the reader will observe that much of the terminology has a geometrical connotation. Before concluding this introductory section on vector spaces, we shall consider the relation of vector spaces to geometry to an extent which will at least indicate the origin of the name 'vector space.' This will be a brief intuitive discussion.

Let us consider the vector space R³. In analytic geometry, one identifies triples (x1, x2, x3) of real numbers with the points in three-dimensional Euclidean space. In that context, a vector is usually defined as a directed line segment PQ, from a point P in the space to another point Q. This amounts to a careful formulation of the idea of the 'arrow' from P to Q. As vectors are used, it is intended that they should be determined by their length and direction. Thus one must identify two directed line segments if they have the same length and the same direction.

The directed line segment PQ, from the point P = (x1, x2, x3) to the point Q = (y1, y2, y3), has the same length and direction as the directed line segment from the origin O = (0, 0, 0) to the point (y1 − x1, y2 − x2, y3 − x3). Furthermore, this is the only segment emanating from the origin which has the same length and direction as PQ. Thus, if one agrees to treat only vectors which emanate from the origin, there is exactly one vector associated with each given length and direction.

The vector OP, from the origin to P = (x1, x2, x3), is completely determined by P, and it is therefore possible to identify this vector with the point P. In our definition of the vector space R³, the vectors are simply defined to be the triples (x1, x2, x3).

Given points P = (x1, x2, x3) and Q = (y1, y2, y3), the definition of the sum of the vectors OP and OQ can be given geometrically. If the vectors are not parallel, then the segments OP and OQ determine a plane and these segments are two of the edges of a parallelogram in that plane (see Figure 1). One diagonal of this parallelogram extends from O to a point S, and the sum of OP and OQ is defined to be the vector OS. The coordinates of the point S are (x1 + y1, x2 + y2, x3 + y3) and hence this geometrical definition of vector addition is equivalent to the algebraic definition of Example 1.
FIGURE 1

Scalar multiplication has a simpler geometric interpretation. If c is a real number, then the product of c and the vector OP is the vector from the origin with length |c| times the length of OP and a direction which agrees with the direction of OP if c > 0, and which is opposite to the direction of OP if c < 0. This scalar multiplication just yields the vector OT where T = (cx1, cx2, cx3), and is therefore consistent with the algebraic definition given for R³.

From time to time, the reader will probably find it helpful to 'think geometrically' about vector spaces, that is, to draw pictures for his own benefit to illustrate and motivate some of the ideas. Indeed, he should do this. However, in forming such illustrations he must bear in mind that, because we are dealing with vector spaces as algebraic systems, all proofs we give will be of an algebraic nature.

Exercises

1. If F is a field, verify that Fⁿ (as defined in Example 1) is a vector space over the field F.

2. If V is a vector space over the field F, verify that

(α1 + α2) + (α3 + α4) = [α2 + (α3 + α1)] + α4

for all vectors α1, α2, α3, and α4 in V.

3. If C is the field of complex numbers, which vectors in C³ are linear combinations of (1, 0, −1), (0, 1, 1), and (1, 1, 1)?
4. Let V be the set of all pairs (x, y) of real numbers, and let F be the field of real numbers. Define

(x, y) + (x1, y1) = (x + x1, y + y1)
c(x, y) = (cx, y).

Is V, with these operations, a vector space over the field of real numbers?

5. On Rⁿ, define two operations

α ⊕ β = α − β
c · α = −cα.

The operations on the right are the usual ones. Which of the axioms for a vector space are satisfied by (Rⁿ, ⊕, ·)?

6. Let V be the set of all complex-valued functions f on the real line such that (for all t in R)

f(−t) = \overline{f(t)}.

The bar denotes complex conjugation. Show that V, with the operations

(f + g)(t) = f(t) + g(t)
(cf)(t) = cf(t)

is a vector space over the field of real numbers. Give an example of a function in V which is not real-valued.

7. Let V be the set of pairs (x, y) of real numbers and let F be the field of real numbers. Define

(x, y) + (x1, y1) = (x + x1, 0)
c(x, y) = (cx, 0).

Is V, with these operations, a vector space?

2.2. Subspaces

In this section we shall introduce some of the basic concepts in the study of vector spaces.

Definition. Let V be a vector space over the field F. A subspace of V is a subset W of V which is itself a vector space over F with the operations of vector addition and scalar multiplication on V.

A direct check of the axioms for a vector space shows that the subset W of V is a subspace if for each α and β in W the vector α + β is again in W; the 0 vector is in W; for each α in W the vector (−α) is in W; for each α in W and each scalar c the vector cα is in W. The commutativity and associativity of vector addition, and the properties (4)(a), (b), (c), and (d) of scalar multiplication do not need to be checked, since these are properties of the operations on V. One can simplify things still further.
Theorem 1. A non-empty subset W of V is a subspace of V if and only if for each pair of vectors α, β in W and each scalar c in F the vector cα + β is again in W.

Proof. Suppose that W is a non-empty subset of V such that cα + β belongs to W for all vectors α, β in W and all scalars c in F. Since W is non-empty, there is a vector ρ in W, and hence (−1)ρ + ρ = 0 is in W. Then if α is any vector in W and c any scalar, the vector cα = cα + 0 is in W. In particular, (−1)α = −α is in W. Finally, if α and β are in W, then α + β = 1α + β is in W. Thus W is a subspace of V.

Conversely, if W is a subspace of V, α and β are in W, and c is a scalar, certainly cα + β is in W.

Some people prefer to use the cα + β property in Theorem 1 as the definition of a subspace. It makes little difference. The important point is that, if W is a non-empty subset of V such that cα + β is in W for all α, β in W and all c in F, then (with the operations inherited from V) W is a vector space. This provides us with many new examples of vector spaces.

Example 6.
(a) If V is any vector space, V is a subspace of V; the subset consisting of the zero vector alone is a subspace of V, called the zero subspace of V.
(b) In Fⁿ, the set of n-tuples (x1, ..., xn) with x1 = 0 is a subspace; however, the set of n-tuples with x1 = 1 + x2 is not a subspace (n ≥ 2).
(c) The space of polynomial functions over the field F is a subspace of the space of all functions from F into F.
(d) An n × n (square) matrix A over the field F is symmetric if Aij = Aji for each i and j. The symmetric matrices form a subspace of the space of all n × n matrices over F.
(e) An n × n (square) matrix A over the field C of complex numbers is Hermitian (or self-adjoint) if

Ajk = \overline{Akj}

for each j, k, the bar denoting complex conjugation. A 2 × 2 matrix is Hermitian if and only if it has the form

[[z, x + iy], [x − iy, w]]

where x, y, z, and w are real numbers. The set of all Hermitian matrices is not a subspace of the space of all n × n matrices over C. For if A is Hermitian, its diagonal entries A11, A22, ... are all real numbers, but the diagonal entries of iA are in general not real. On the other hand, it is easily verified that the set of n × n complex Hermitian matrices is a vector space over the field R of real numbers (with the usual operations).
Example 7. The solution space of a system of homogeneous linear equations. Let A be an m × n matrix over F. Then the set of all n × 1 (column) matrices X over F such that AX = 0 is a subspace of the space of all n × 1 matrices over F. To prove this we must show that A(cX + Y) = 0 when AX = 0, AY = 0, and c is an arbitrary scalar in F. This follows immediately from the following general fact.

Lemma. If A is an m × n matrix over F and B, C are n × p matrices over F then

(2-11)  A(dB + C) = d(AB) + AC

for each scalar d in F.

Proof.

[A(dB + C)]ij = Σ_k Aik(dB + C)kj
              = Σ_k (dAikBkj + AikCkj)
              = d Σ_k AikBkj + Σ_k AikCkj
              = d(AB)ij + (AC)ij
              = [d(AB) + AC]ij.

Similarly, one can show that (dB + C)A = d(BA) + CA, if the matrix sums and products are defined.

Theorem 2. Let V be a vector space over the field F. The intersection of any collection of subspaces of V is a subspace of V.

Proof. Let {Wa} be a collection of subspaces of V, and let W = ∩a Wa be their intersection. Recall that W is defined as the set of all elements belonging to every Wa (see Appendix). Since each Wa is a subspace, each contains the zero vector. Thus the zero vector is in the intersection W, and W is non-empty. Let α and β be vectors in W and let c be a scalar. By definition of W, both α and β belong to each Wa, and because each Wa is a subspace, the vector (cα + β) is in every Wa. Thus (cα + β) is again in W. By Theorem 1, W is a subspace of V.

From Theorem 2 it follows that if S is any collection of vectors in V, then there is a smallest subspace of V which contains S, that is, a subspace which contains S and which is contained in every other subspace containing S.

Definition. Let S be a set of vectors in a vector space V. The subspace spanned by S is defined to be the intersection W of all subspaces of V which contain S. When S is a finite set of vectors, S = {α1, α2, ..., αn}, we shall simply call W the subspace spanned by the vectors α1, α2, ..., αn.
Theorem 3. The subspace spanned by a non-empty subset S of a vector space V is the set of all linear combinations of vectors in S.

Proof. Let W be the subspace spanned by S. Then each linear combination

α = x1α1 + x2α2 + ··· + xmαm

of vectors α1, α2, ..., αm in S is clearly in W. Thus W contains the set L of all linear combinations of vectors in S. The set L, on the other hand, contains S and is non-empty. If α, β belong to L then α is a linear combination,

α = x1α1 + x2α2 + ··· + xmαm

of vectors αi in S, and β is a linear combination,

β = y1β1 + y2β2 + ··· + ynβn

of vectors βj in S. For each scalar c,

cα + β = Σ_{i=1}^{m} (cxi)αi + Σ_{j=1}^{n} yjβj.

Hence cα + β belongs to L. Thus L is a subspace of V.

Now we have shown that L is a subspace of V which contains S, and also that any subspace which contains S contains L. It follows that L is the intersection of all subspaces containing S, i.e., that L is the subspace spanned by the set S.

Definition. If S1, S2, ..., Sk are subsets of a vector space V, the set of all sums

α1 + α2 + ··· + αk

of vectors αi in Si is called the sum of the subsets S1, S2, ..., Sk and is denoted by

S1 + S2 + ··· + Sk

or by

Σ_{i=1}^{k} Si.

If W1, W2, ..., Wk are subspaces of V, then the sum

W = W1 + W2 + ··· + Wk

is easily seen to be a subspace of V which contains each of the subspaces Wi. From this it follows, as in the proof of Theorem 3, that W is the subspace spanned by the union of W1, W2, ..., Wk.

Example 8. Let F be a subfield of the field C of complex numbers. Suppose
α1 = (1, 2, 0, 3, 0)
α2 = (0, 0, 1, 4, 0)
α3 = (0, 0, 0, 0, 1).

By Theorem 3, a vector α is in the subspace W of F⁵ spanned by α1, α2, α3 if and only if there exist scalars c1, c2, c3 in F such that

α = c1α1 + c2α2 + c3α3.

Thus W consists of all vectors of the form

α = (c1, 2c1, c2, 3c1 + 4c2, c3)

where c1, c2, c3 are arbitrary scalars in F. Alternatively, W can be described as the set of all 5-tuples

α = (x1, x2, x3, x4, x5)

with xi in F such that

x2 = 2x1
x4 = 3x1 + 4x3.

Thus (−3, −6, 1, −5, 2) is in W, whereas (2, 4, 6, 7, 8) is not.

Example 9. Let F be a subfield of the field C of complex numbers, and let V be the vector space of all 2 × 2 matrices over F. Let W1 be the subset of V consisting of all matrices of the form

[[x, y], [z, 0]]

where x, y, z are arbitrary scalars in F. Finally, let W2 be the subset of V consisting of all matrices of the form

[[x, 0], [0, y]]

where x and y are arbitrary scalars in F. Then W1 and W2 are subspaces of V. Also

V = W1 + W2

because

[[a, b], [c, d]] = [[a, b], [c, 0]] + [[0, 0], [0, d]].

The subspace W1 ∩ W2 consists of all matrices of the form

[[x, 0], [0, 0]].

Example 10. Let A be an m × n matrix over a field F. The row vectors of A are the vectors in Fⁿ given by αi = (Ai1, ..., Ain), i = 1, ..., m. The subspace of Fⁿ spanned by the row vectors of A is called the row
space of A. The subspace considered in Example 8 is the row space of the matrix

[[1, 2, 0, 3, 0],
 [0, 0, 1, 4, 0],
 [0, 0, 0, 0, 1]].

It is also the row space of the matrix

[[ 1,  2, 0,  3, 0],
 [ 0,  0, 1,  4, 0],
 [ 0,  0, 0,  0, 1],
 [-4, -8, 1, -8, 0]].

Example 11. Let V be the space of all polynomial functions over F. Let S be the subset of V consisting of the polynomial functions f0, f1, f2, ... defined by

fn(x) = xⁿ,  n = 0, 1, 2, ....

Then V is the subspace spanned by the set S.
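Deciding whether a given vector lies in the span of α1, ..., αm, as in Example 8, amounts to solving a linear system: α belongs to the subspace spanned by the αi exactly when c1α1 + ··· + cmαm = α has a solution. The sketch below is an added numerical illustration (numpy's least-squares routine followed by a residual check, with a hypothetical function name of our own choosing); exact row reduction as in Chapter 1 would serve equally well.

    import numpy as np

    def in_span(vectors, target, tol=1e-10):
        """Return True if `target` is a linear combination of `vectors`."""
        A = np.array(vectors, dtype=float).T          # columns are the spanning vectors
        b = np.array(target, dtype=float)
        c, *_ = np.linalg.lstsq(A, b, rcond=None)     # best coefficients c with A c close to b
        return np.allclose(A @ c, b, atol=tol)        # exact solution iff zero residual

    a1 = (1, 2, 0, 3, 0)
    a2 = (0, 0, 1, 4, 0)
    a3 = (0, 0, 0, 0, 1)
    print(in_span([a1, a2, a3], (-3, -6, 1, -5, 2)))  # True, as in Example 8
    print(in_span([a1, a2, a3], (2, 4, 6, 7, 8)))     # False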
Exercises

1. Which of the following sets of vectors α = (a1, ..., an) in Rⁿ are subspaces of Rⁿ (n ≥ 3)?
(a) all α such that a1 ≥ 0;
(b) all α such that a1 + 3a2 = a3;
(c) all α such that a2 = a1²;
(d) all α such that a1a2 = 0;
(e) all α such that a2 is rational.

2. Let V be the (real) vector space of all functions f from R into R. Which of the following sets of functions are subspaces of V?
(a) all f such that f(x²) = f(x)²;
(b) all f such that f(0) = f(1);
(c) all f such that f(3) = 1 + f(−5);
(d) all f such that f(−1) = 0;
(e) all f which are continuous.

3. Is the vector (3, −1, 0, −1) in the subspace of R⁴ spanned by the vectors (2, −1, 3, 2), (−1, 1, 1, −3), and (1, 1, 9, −5)?

4. Let W be the set of all (x1, x2, x3, x4, x5) in R⁵ which satisfy

2x1 − x2 + (4/3)x3 − x4       = 0
  x1      + (2/3)x3       − x5 = 0
9x1 − 3x2 +    6x3 − 3x4 − 3x5 = 0.

Find a finite set of vectors which spans W.

5. Let F be a field and let n be a positive integer (n ≥ 2). Let V be the vector space of all n × n matrices over F. Which of the following sets of matrices A in V are subspaces of V?
(a) all invertible A;
(b) all non-invertible A;
(c) all A such that AB = BA, where B is some fixed matrix in V;
(d) all A such that A² = A.

6. (a) Prove that the only subspaces of R¹ are R¹ and the zero subspace.
(b) Prove that a subspace of R² is R², or the zero subspace, or consists of all scalar multiples of some fixed vector in R². (The last type of subspace is, intuitively, a straight line through the origin.)
(c) Can you describe the subspaces of R³?

7. Let W1 and W2 be subspaces of a vector space V such that the set-theoretic union of W1 and W2 is also a subspace. Prove that one of the spaces Wi is contained in the other.

8. Let V be the vector space of all functions from R into R; let Ve be the subset of even functions, f(−x) = f(x); let Vo be the subset of odd functions, f(−x) = −f(x).
(a) Prove that Ve and Vo are subspaces of V.
(b) Prove that Ve + Vo = V.
(c) Prove that Ve ∩ Vo = {0}.

9. Let W1 and W2 be subspaces of a vector space V such that W1 + W2 = V and W1 ∩ W2 = {0}. Prove that for each vector α in V there are unique vectors α1 in W1 and α2 in W2 such that α = α1 + α2.

2.3. Bases and Dimension

We turn now to the task of assigning a dimension to certain vector spaces. Although we usually associate 'dimension' with something geometrical, we must find a suitable algebraic definition of the dimension of a vector space. This will be done through the concept of a basis for the space.

Definition. Let V be a vector space over F. A subset S of V is said to be linearly dependent (or simply, dependent) if there exist distinct vectors α1, α2, ..., αn in S and scalars c1, c2, ..., cn in F, not all of which are 0, such that

c1α1 + c2α2 + ··· + cnαn = 0.

A set which is not linearly dependent is called linearly independent. If the set S contains only finitely many vectors α1, α2, ..., αn, we sometimes say that α1, α2, ..., αn are dependent (or independent) instead of saying S is dependent (or independent).
The following are easy consequences of the definition.

1. Any set which contains a linearly dependent set is linearly dependent.
2. Any subset of a linearly independent set is linearly independent.
3. Any set which contains the 0 vector is linearly dependent; for 1 · 0 = 0.
4. A set S of vectors is linearly independent if and only if each finite subset of S is linearly independent, i.e., if and only if for any distinct vectors α1, ..., αn of S, c1α1 + ··· + cnαn = 0 implies each ci = 0.

Definition. Let V be a vector space. A basis for V is a linearly independent set of vectors in V which spans the space V. The space V is finite-dimensional if it has a finite basis.

Example 12. Let F be a subfield of the complex numbers. In F³ the vectors

α1 = ( 3, 0, −3)
α2 = (−1, 1,  2)
α3 = ( 4, 2, −2)
α4 = ( 2, 1,  1)

are linearly dependent, since

2α1 + 2α2 − α3 + 0·α4 = 0.

The vectors

ε1 = (1, 0, 0)
ε2 = (0, 1, 0)
ε3 = (0, 0, 1)

are linearly independent.

Example 13. Let F be a field and in Fⁿ let S be the subset consisting of the vectors ε1, ε2, ..., εn defined by

ε1 = (1, 0, 0, ..., 0)
ε2 = (0, 1, 0, ..., 0)
      ···
εn = (0, 0, 0, ..., 1).

Let x1, x2, ..., xn be scalars in F and put α = x1ε1 + x2ε2 + ··· + xnεn. Then

(2-12)  α = (x1, x2, ..., xn).

This shows that ε1, ..., εn span Fⁿ. Since α = 0 if and only if x1 = x2 = ··· = xn = 0, the vectors ε1, ..., εn are linearly independent. The set S = {ε1, ..., εn} is accordingly a basis for Fⁿ. We shall call this particular basis the standard basis of Fⁿ.
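Whether finitely many vectors in Fⁿ are linearly dependent, as in Example 12, is likewise a matter of computation: they are independent exactly when the matrix having them as rows has rank equal to the number of vectors. A small numpy sketch, added here as an illustration and not part of the text:

    import numpy as np

    def linearly_independent(vectors):
        """Vectors (rows) in R^n are independent iff the matrix they form has full row rank."""
        A = np.array(vectors, dtype=float)
        return np.linalg.matrix_rank(A) == len(vectors)

    # The four vectors of Example 12 are dependent: 2a1 + 2a2 - a3 + 0*a4 = 0.
    print(linearly_independent([(3, 0, -3), (-1, 1, 2), (4, 2, -2), (2, 1, 1)]))  # False
    # The standard basis vectors of Example 13 are independent.
    print(linearly_independent([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))                # True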
Example 14. Let P be an invertible n × n matrix with entries in the field F. Then P1, ..., Pn, the columns of P, form a basis for the space of column matrices, F^{n×1}. We see that as follows. If X is a column matrix, then

PX = x1P1 + ··· + xnPn.

Since PX = 0 has only the trivial solution X = 0, it follows that {P1, ..., Pn} is a linearly independent set. Why does it span F^{n×1}? Let Y be any column matrix. If X = P⁻¹Y, then Y = PX, that is,

Y = x1P1 + ··· + xnPn.

So {P1, ..., Pn} is a basis for F^{n×1}.

Example 15. Let A be an m × n matrix and let S be the solution space for the homogeneous system AX = 0 (Example 7). Let R be a row-reduced echelon matrix which is row-equivalent to A. Then S is also the solution space for the system RX = 0. If R has r non-zero rows, then the system of equations RX = 0 simply expresses r of the unknowns x1, ..., xn in terms of the remaining (n − r) unknowns xj. Suppose that the leading non-zero entries of the non-zero rows occur in columns k1, ..., kr. Let J be the set consisting of the n − r indices different from k1, ..., kr:

J = {1, ..., n} − {k1, ..., kr}.

The system RX = 0 has the form

x_{k1} + Σ_J c1j xj = 0
        ···
x_{kr} + Σ_J crj xj = 0

where the cij are certain scalars. All solutions are obtained by assigning (arbitrary) values to those xj's with j in J and computing the corresponding values of x_{k1}, ..., x_{kr}. For each j in J, let Ej be the solution obtained by setting xj = 1 and xi = 0 for all other i in J. We assert that the (n − r) vectors Ej, j in J, form a basis for the solution space.

Since the column matrix Ej has a 1 in row j and zeros in the rows indexed by other elements of J, the reasoning of Example 13 shows us that the set of these vectors is linearly independent. That set spans the solution space, for this reason. If the column matrix T, with entries t1, ..., tn, is in the solution space, the matrix

N = Σ_J tjEj

is also in the solution space and is a solution such that xj = tj for each j in J. The solution with that property is unique; hence N = T and T is in the span of the vectors Ej.
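The construction of Example 15 is entirely algorithmic: row-reduce A, note the pivot columns k1, ..., kr, and for each free index j in J form the solution Ej with xj = 1 and the remaining free unknowns 0. The following Python sketch is an added illustration with exact rational arithmetic; the function name is ours, not the text's. It is applied to the matrix whose rows are the vectors of Example 8, for which n − r = 5 − 3 = 2.

    from fractions import Fraction

    def solution_space_basis(A):
        """Basis vectors E_j for the solution space of AX = 0, as in Example 15."""
        m, n = len(A), len(A[0])
        R = [[Fraction(x) for x in row] for row in A]
        pivots = []                                   # columns k_1, ..., k_r
        row = 0
        for col in range(n):
            piv = next((r for r in range(row, m) if R[r][col] != 0), None)
            if piv is None:
                continue                              # no pivot in this column
            R[row], R[piv] = R[piv], R[row]
            R[row] = [x / R[row][col] for x in R[row]]
            for r in range(m):
                if r != row and R[r][col] != 0:
                    R[r] = [x - R[r][col] * y for x, y in zip(R[r], R[row])]
            pivots.append(col)
            row += 1
        free = [j for j in range(n) if j not in pivots]
        basis = []
        for j in free:                                # E_j: x_j = 1, other free unknowns 0
            E = [Fraction(0)] * n
            E[j] = Fraction(1)
            for i, k in enumerate(pivots):            # pivot unknowns from x_k = -c_{ij}
                E[k] = -R[i][j]
            basis.append(E)
        return basis

    A = [[1, 2, 0, 3, 0],
         [0, 0, 1, 4, 0],
         [0, 0, 0, 0, 1]]
    for E in solution_space_basis(A):
        print(E)        # the two solutions (-2, 1, 0, 0, 0) and (-3, 0, -4, 1, 0)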
Example 16. We shall now give an example of an infinite basis. Let F be a subfield of the complex numbers and let V be the space of polynomial functions over F. Recall that these functions are the functions from F into F which have a rule of the form

f(x) = c0 + c1x + ··· + cnxⁿ.

Let fk(x) = x^k, k = 0, 1, 2, .... The (infinite) set {f0, f1, f2, ...} is a basis for V. Clearly the set spans V, because the function f (above) is

f = c0f0 + c1f1 + ··· + cnfn.

The reader should see that this is virtually a repetition of the definition of polynomial function, that is, a function f from F into F is a polynomial function if and only if there exists an integer n and scalars c0, ..., cn such that f = c0f0 + ··· + cnfn. Why are the functions independent? To show that the set {f0, f1, f2, ...} is independent means to show that each finite subset of it is independent. It will suffice to show that, for each n, the set {f0, ..., fn} is independent. Suppose that

c0f0 + ··· + cnfn = 0.

This says that

c0 + c1x + ··· + cnxⁿ = 0

for every x in F; in other words, every x in F is a root of the polynomial f(x) = c0 + c1x + ··· + cnxⁿ. We assume that the reader knows that a polynomial of degree n with complex coefficients cannot have more than n distinct roots. It follows that c0 = c1 = ··· = cn = 0.

We have exhibited an infinite basis for V. Does that mean that V is not finite-dimensional? As a matter of fact it does; however, that is not immediate from the definition, because for all we know V might also have a finite basis. That possibility is easily eliminated. (We shall eliminate it in general in the next theorem.) Suppose that we have a finite number of polynomial functions g1, ..., gr. There will be a largest power of x which appears (with non-zero coefficient) in g1(x), ..., gr(x). If that power is k, clearly f_{k+1}(x) = x^{k+1} is not in the linear span of g1, ..., gr. So V is not finite-dimensional.

A final remark about this example is in order. Infinite bases have nothing to do with 'infinite linear combinations.' The reader who feels an irresistible urge to inject power series

Σ_{k=0}^{∞} ck x^k

into this example should study the example carefully again. If that does not effect a cure, he should consider restricting his attention to finite-dimensional spaces from now on.
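The finite-subset check in Example 16 can also be carried out numerically for any fixed n: if c0f0 + ··· + cnfn is the zero function, then evaluating at n + 1 distinct points gives a homogeneous system whose (Vandermonde) matrix is invertible, forcing every ci = 0. A brief numpy illustration, added here and not part of the text:

    import numpy as np

    # Independence of f_0, ..., f_n (f_k(x) = x^k): evaluate a supposed dependence
    # relation at n+1 distinct points; the resulting square system V c = 0 has an
    # invertible matrix, so the only solution is c = 0.
    n = 5
    points = np.arange(1.0, n + 2)            # n+1 distinct evaluation points
    V = np.vander(points, increasing=True)    # V[i, k] = points[i] ** k
    print(np.linalg.matrix_rank(V))           # n+1, so c = 0 is the only solution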
Theorem 4. Let V be a vector space which is spanned by a finite set of vectors β1, β2, ..., βm. Then any independent set of vectors in V is finite and contains no more than m elements.

Proof. To prove the theorem it suffices to show that every subset S of V which contains more than m vectors is linearly dependent. Let S be such a set. In S there are distinct vectors α1, α2, ..., αn where n > m. Since β1, ..., βm span V, there exist scalars Aij in F such that

αj = Σ_{i=1}^{m} Aij βi.

For any n scalars x1, x2, ..., xn we have

x1α1 + ··· + xnαn = Σ_{j=1}^{n} xjαj
                  = Σ_{j=1}^{n} xj Σ_{i=1}^{m} Aijβi
                  = Σ_{i=1}^{m} ( Σ_{j=1}^{n} Aijxj ) βi.

Since n > m, Theorem 6 of Chapter 1 implies that there exist scalars x1, x2, ..., xn not all 0 such that

Σ_{j=1}^{n} Aijxj = 0,   1 ≤ i ≤ m.

Hence x1α1 + x2α2 + ··· + xnαn = 0. This shows that S is a linearly dependent set.

Corollary 1. If V is a finite-dimensional vector space, then any two bases of V have the same (finite) number of elements.

Proof. Since V is finite-dimensional, it has a finite basis

{β1, β2, ..., βm}.

By Theorem 4 every basis of V is finite and contains no more than m elements. Thus if {α1, α2, ..., αn} is a basis, n ≤ m. By the same argument, m ≤ n. Hence m = n.

This corollary allows us to define the dimension of a finite-dimensional vector space as the number of elements in a basis for V. We shall denote the dimension of a finite-dimensional space V by dim V. This allows us to reformulate Theorem 4 as follows.

Corollary 2. Let V be a finite-dimensional vector space and let n = dim V. Then
(a) any subset of V which contains more than n vectors is linearly dependent;
(b) no subset of V which contains fewer than n vectors can span V.

Example 17. If F is a field, the dimension of Fⁿ is n, because the standard basis for Fⁿ contains n vectors. The matrix space F^{m×n} has dimension mn. That should be clear by analogy with the case of Fⁿ, because the mn matrices which have a 1 in the i, j place with zeros elsewhere form a basis for F^{m×n}. If A is an m × n matrix, then the solution space for A has dimension n − r, where r is the number of non-zero rows in a row-reduced echelon matrix which is row-equivalent to A. See Example 15.

If V is any vector space over F, the zero subspace of V is spanned by the vector 0, but {0} is a linearly dependent set and not a basis. For this reason, we shall agree that the zero subspace has dimension 0. Alternatively, we could reach the same conclusion by arguing that the empty set is a basis for the zero subspace. The empty set spans {0}, because the intersection of all subspaces containing the empty set is {0}, and the empty set is linearly independent because it contains no vectors.

Lemma. Let S be a linearly independent subset of a vector space V. Suppose β is a vector in V which is not in the subspace spanned by S. Then the set obtained by adjoining β to S is linearly independent.

Proof. Suppose α1, ..., αm are distinct vectors in S and that

c1α1 + ··· + cmαm + bβ = 0.

Then b = 0; for otherwise,

β = (−c1/b)α1 + ··· + (−cm/b)αm

and β is in the subspace spanned by S. Thus c1α1 + ··· + cmαm = 0, and since S is a linearly independent set each ci = 0.

Theorem 5. If W is a subspace of a finite-dimensional vector space V, every linearly independent subset of W is finite and is part of a (finite) basis for W.

Proof. Suppose S0 is a linearly independent subset of W. If S is a linearly independent subset of W containing S0, then S is also a linearly independent subset of V; since V is finite-dimensional, S contains no more than dim V elements.

We extend S0 to a basis for W, as follows. If S0 spans W, then S0 is a basis for W and we are done. If S0 does not span W, we use the preceding lemma to find a vector β1 in W such that the set S1 = S0 ∪ {β1} is independent. If S1 spans W, fine. If not, apply the lemma to obtain a vector β2
in W such that S2 = S1 ∪ {β2} is independent. If we continue in this way, then (in not more than dim V steps) we reach a set

Sm = S0 ∪ {β1, ..., βm}

which is a basis for W.

Corollary 1. If W is a proper subspace of a finite-dimensional vector space V, then W is finite-dimensional and dim W < dim V.

Proof. We may suppose W contains a vector α ≠ 0. By Theorem 5 and its proof, there is a basis of W containing α which contains no more than dim V elements. Hence W is finite-dimensional, and dim W ≤ dim V. Since W is a proper subspace, there is a vector β in V which is not in W. Adjoining β to any basis of W, we obtain a linearly independent subset of V. Thus dim W < dim V.

Corollary 2. In a finite-dimensional vector space V every non-empty linearly independent set of vectors is part of a basis.

Corollary 3. Let A be an n × n matrix over a field F, and suppose the row vectors of A form a linearly independent set of vectors in Fⁿ. Then A is invertible.

Proof. Let α1, α2, ..., αn be the row vectors of A, and suppose W is the subspace of Fⁿ spanned by α1, α2, ..., αn. Since α1, α2, ..., αn are linearly independent, the dimension of W is n. Corollary 1 now shows that W = Fⁿ. Hence there exist scalars Bij in F such that

εi = Σ_{j=1}^{n} Bijαj,   1 ≤ i ≤ n

where {ε1, ε2, ..., εn} is the standard basis of Fⁿ. Thus for the matrix B with entries Bij we have

BA = I.

Theorem 6. If W1 and W2 are finite-dimensional subspaces of a vector space V, then W1 + W2 is finite-dimensional and

dim W1 + dim W2 = dim (W1 ∩ W2) + dim (W1 + W2).

Proof. By Theorem 5 and its corollaries, W1 ∩ W2 has a finite basis {α1, ..., αk} which is part of a basis

{α1, ..., αk, β1, ..., βm}  for W1

and part of a basis

{α1, ..., αk, γ1, ..., γn}  for W2.

The subspace W1 + W2 is spanned by the vectors
α1, ..., αk, β1, ..., βm, γ1, ..., γn

and these vectors form an independent set. For suppose

Σ xiαi + Σ yjβj + Σ zrγr = 0.

Then

−Σ zrγr = Σ xiαi + Σ yjβj

which shows that Σ zrγr belongs to W1. As Σ zrγr also belongs to W2, it follows that

Σ zrγr = Σ ciαi

for certain scalars c1, ..., ck. Because the set

{α1, ..., αk, γ1, ..., γn}

is independent, each of the scalars zr = 0. Thus

Σ xiαi + Σ yjβj = 0

and since

{α1, ..., αk, β1, ..., βm}

is also an independent set, each xi = 0 and each yj = 0. Thus,

{α1, ..., αk, β1, ..., βm, γ1, ..., γn}

is a basis for W1 + W2. Finally,

dim W1 + dim W2 = (k + m) + (k + n)
                = k + (m + k + n)
                = dim (W1 ∩ W2) + dim (W1 + W2).

Let us close this section with a remark about linear independence and dependence. We defined these concepts for sets of vectors. It is useful to have them defined for finite sequences (ordered n-tuples) of vectors: α1, ..., αn. We say that the vectors α1, ..., αn are linearly dependent if there exist scalars c1, ..., cn, not all 0, such that c1α1 + ··· + cnαn = 0. This is all so natural that the reader may find that he has been using this terminology already. What is the difference between a finite sequence α1, ..., αn and a set {α1, ..., αn}? There are two differences, identity and order.

If we discuss the set {α1, ..., αn}, usually it is presumed that no two of the vectors α1, ..., αn are identical. In a sequence α1, ..., αn all the αi's may be the same vector. If αi = αj for some i ≠ j, then the sequence α1, ..., αn is linearly dependent:

αi + (−1)αj = 0.

Thus, if α1, ..., αn are linearly independent, they are distinct and we may talk about the set {α1, ..., αn} and know that it has n vectors in it. So, clearly, no confusion will arise in discussing bases and dimension. The dimension of a finite-dimensional space V is the largest n such that some n-tuple of vectors in V is linearly independent, and so on. The reader
who feels that this paragraph is much ado about nothing might ask himself whether the vectors

α1 = (e^(π/2), 1),   α2 = (√110, 1)

are linearly independent in R².

The elements of a sequence are enumerated in a specific order. A set is a collection of objects, with no specified arrangement or order. Of course, to describe the set we may list its members, and that requires choosing an order. But the order is not part of the set. The sets {1, 2, 3, 4} and {4, 3, 2, 1} are identical, whereas 1, 2, 3, 4 is quite a different sequence from 4, 3, 2, 1. The order aspect of sequences has no bearing on questions of independence, dependence, etc., because dependence (as defined) is not affected by the order. The sequence αn, ..., α1 is dependent if and only if the sequence α1, ..., αn is dependent. In the next section, order will be important.

Exercises

1. Prove that if two vectors are linearly dependent, one of them is a scalar multiple of the other.

2. Are the vectors

α1 = (1, 1, 2, 4),  α2 = (2, −1, −5, 2),  α3 = (1, −1, −4, 0),  α4 = (2, 1, 1, 6)

linearly independent in R⁴?

3. Find a basis for the subspace of R⁴ spanned by the four vectors of Exercise 2.

4. Show that the vectors

α1 = (1, 0, −1),  α2 = (1, 2, 1),  α3 = (0, −3, 2)

form a basis for R³. Express each of the standard basis vectors as linear combinations of α1, α2, and α3.

5. Find three vectors in R³ which are linearly dependent, and are such that any two of them are linearly independent.

6. Let V be the vector space of all 2 × 2 matrices over the field F. Prove that V has dimension 4 by exhibiting a basis for V which has four elements.

7. Let V be the vector space of Exercise 6. Let W1 be the set of matrices of the form

[[x, −x], [y, z]]

and let W2 be the set of matrices of the form
[[a, b], [−a, c]].

(a) Prove that W1 and W2 are subspaces of V.
(b) Find the dimensions of W1, W2, W1 + W2, and W1 ∩ W2.

8. Again let V be the space of 2 × 2 matrices over F. Find a basis {A1, A2, A3, A4} for V such that Aj² = Aj for each j.

9. Let V be a vector space over a subfield F of the complex numbers. Suppose α, β, and γ are linearly independent vectors in V. Prove that (α + β), (β + γ), and (γ + α) are linearly independent.

10. Let V be a vector space over the field F. Suppose there are a finite number of vectors α1, ..., αr in V which span V. Prove that V is finite-dimensional.

11. Let V be the set of all 2 × 2 matrices A with complex entries which satisfy A11 + A22 = 0.
(a) Show that V is a vector space over the field of real numbers, with the usual operations of matrix addition and multiplication of a matrix by a scalar.
(b) Find a basis for this vector space.
(c) Let W be the set of all matrices A in V such that A21 = −\overline{A12} (the bar denotes complex conjugation). Prove that W is a subspace of V and find a basis for W.

12. Prove that the space of all m × n matrices over the field F has dimension mn, by exhibiting a basis for this space.

13. Discuss Exercise 9, when V is a vector space over the field with two elements described in Exercise 5, Section 1.1.

14. Let V be the set of real numbers. Regard V as a vector space over the field of rational numbers, with the usual operations. Prove that this vector space is not finite-dimensional.

2.4. Coordinates

One of the useful features of a basis ℬ in an n-dimensional space V is that it essentially enables one to introduce coordinates in V analogous to the 'natural coordinates' xi of a vector α = (x1, ..., xn) in the space Fⁿ. In this scheme, the coordinates of a vector α in V relative to the basis ℬ will be the scalars which serve to express α as a linear combination of the vectors in the basis. Thus, we should like to regard the natural coordinates of a vector α in Fⁿ as being defined by α and the standard basis for Fⁿ; however, in adopting this point of view we must exercise a certain amount of care. If

α = (x1, ..., xn) = Σ xiεi

and ℬ is the standard basis for Fⁿ, just how are the coordinates of α determined by ℬ and α? One way to phrase the answer is this. A given vector α has a unique expression as a linear combination of the standard basis vectors, and the ith coordinate xi of α is the coefficient of εi in this expression. From this point of view we are able to say which is the ith coordinate
because we have a 'natural' ordering of the vectors in the standard basis, that is, we have a rule for determining which is the 'first' vector in the basis, which is the 'second,' and so on. If ℬ is an arbitrary basis of the n-dimensional space V, we shall probably have no natural ordering of the vectors in ℬ, and it will therefore be necessary for us to impose some order on these vectors before we can define 'the ith coordinate of α relative to ℬ.' To put it another way, coordinates will be defined relative to sequences of vectors rather than sets of vectors.

Definition. If V is a finite-dimensional vector space, an ordered basis for V is a finite sequence of vectors which is linearly independent and spans V.

If the sequence α1, ..., αn is an ordered basis for V, then the set {α1, ..., αn} is a basis for V. The ordered basis is the set, together with the specified ordering. We shall engage in a slight abuse of notation and describe all that by saying that

ℬ = {α1, ..., αn}

is an ordered basis for V.

Now suppose V is a finite-dimensional vector space over the field F and that

ℬ = {α1, ..., αn}

is an ordered basis for V. Given α in V, there is a unique n-tuple (x1, ..., xn) of scalars such that

α = Σ_{i=1}^{n} xiαi.

The n-tuple is unique, because if we also have

α = Σ_{i=1}^{n} ziαi

then

Σ_{i=1}^{n} (xi − zi)αi = 0

and the linear independence of the αi tells us that xi − zi = 0 for each i. We shall call xi the ith coordinate of α relative to the ordered basis

ℬ = {α1, ..., αn}.

If

β = Σ_{i=1}^{n} yiαi

then

α + β = Σ_{i=1}^{n} (xi + yi)αi

so that the ith coordinate of (α + β) in this ordered basis is (xi + yi).
Similarly, the ith coordinate of (cα) is cxi. One should also note that every n-tuple (x1, ..., xn) in Fⁿ is the n-tuple of coordinates of some vector in V, namely the vector

Σ_{i=1}^{n} xiαi.

To summarize, each ordered basis for V determines a one-one correspondence

α → (x1, ..., xn)

between the set of all vectors in V and the set of all n-tuples in Fⁿ. This correspondence has the property that the correspondent of (α + β) is the sum in Fⁿ of the correspondents of α and β, and that the correspondent of (cα) is the product in Fⁿ of the scalar c and the correspondent of α.

One might wonder at this point why we do not simply select some ordered basis for V and describe each vector in V by its corresponding n-tuple of coordinates, since we would then have the convenience of operating only with n-tuples. This would defeat our purpose, for two reasons. First, as our axiomatic definition of vector space indicates, we are attempting to learn to reason with vector spaces as abstract algebraic systems. Second, even in those situations in which we use coordinates, the significant results follow from our ability to change the coordinate system, i.e., to change the ordered basis.

Frequently, it will be more convenient for us to use the coordinate matrix of α relative to the ordered basis ℬ:

X = [[x1],
     [x2],
      ...
     [xn]]

rather than the n-tuple (x1, ..., xn) of coordinates. To indicate the dependence of this coordinate matrix on the basis, we shall use the symbol

[α]_ℬ

for the coordinate matrix of the vector α relative to the ordered basis ℬ. This notation will be particularly useful as we now proceed to describe what happens to the coordinates of a vector α as we change from one ordered basis to another.

Suppose then that V is n-dimensional and that

ℬ = {α1, ..., αn}  and  ℬ′ = {α′1, ..., α′n}

are two ordered bases for V. There are unique scalars Pij such that

(2-13)  α′j = Σ_{i=1}^{n} Pij αi,   1 ≤ j ≤ n.

Let x′1, ..., x′n be the coordinates of a given vector α in the ordered basis ℬ′. Then
α = x′1α′1 + ··· + x′nα′n
  = Σ_{j=1}^{n} x′j α′j
  = Σ_{j=1}^{n} x′j Σ_{i=1}^{n} Pij αi
  = Σ_{i=1}^{n} ( Σ_{j=1}^{n} Pij x′j ) αi.

Thus we obtain the relation

(2-14)  α = Σ_{i=1}^{n} ( Σ_{j=1}^{n} Pij x′j ) αi.

Since the coordinates x1, x2, ..., xn of α in the ordered basis ℬ are uniquely determined, it follows from (2-14) that

(2-15)  xi = Σ_{j=1}^{n} Pij x′j,   1 ≤ i ≤ n.

Let P be the n × n matrix whose i, j entry is the scalar Pij, and let X and X′ be the coordinate matrices of the vector α in the ordered bases ℬ and ℬ′. Then we may reformulate (2-15) as

(2-16)  X = PX′.

Since ℬ and ℬ′ are linearly independent sets, X = 0 if and only if X′ = 0. Thus from (2-16) and Theorem 7 of Chapter 1, it follows that P is invertible. Hence

(2-17)  X′ = P⁻¹X.

If we use the notation introduced above for the coordinate matrix of a vector relative to an ordered basis, then (2-16) and (2-17) say

[α]_ℬ = P[α]_ℬ′
[α]_ℬ′ = P⁻¹[α]_ℬ.

Thus the preceding discussion may be summarized as follows.

Theorem 7. Let V be an n-dimensional vector space over the field F, and let ℬ and ℬ′ be two ordered bases of V. Then there is a unique, necessarily invertible, n × n matrix P with entries in F such that

(i)  [α]_ℬ = P[α]_ℬ′
(ii) [α]_ℬ′ = P⁻¹[α]_ℬ

for every vector α in V. The columns of P are given by

Pj = [α′j]_ℬ,   j = 1, ..., n.
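In Fⁿ Theorem 7 is easy to use directly: writing the vectors of ℬ′ in the coordinates of ℬ gives the columns of P, and passing between [α]_ℬ and [α]_ℬ′ is a matrix-vector product or a linear solve. A short numpy sketch, added here as an illustration (not part of the text) and using the rotated basis of Example 19 below:

    import numpy as np

    theta = np.pi / 6
    # The columns of P are the vectors of B' expressed in the ordered basis B
    # (here B is the standard basis of R^2 and B' is the rotated basis of Example 19).
    P = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])

    x = np.array([2.0, 1.0])          # [alpha]_B, coordinates in the standard basis
    x_prime = np.linalg.solve(P, x)   # [alpha]_B' = P^{-1} [alpha]_B, by (2-17)
    print(x_prime)                    # coordinates of alpha relative to the rotated basis
    print(P @ x_prime)                # back to [alpha]_B, by (2-16); recovers [2. 1.]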
To complete the above analysis we shall also prove the following result.

Theorem 8. Suppose P is an n × n invertible matrix over F. Let V be an n-dimensional vector space over F, and let ℬ be an ordered basis of V. Then there is a unique ordered basis ℬ′ of V such that

(i)  [α]_ℬ = P[α]_ℬ′
(ii) [α]_ℬ′ = P⁻¹[α]_ℬ

for every vector α in V.

Proof. Let ℬ consist of the vectors α1, ..., αn. If ℬ′ = {α′1, ..., α′n} is an ordered basis of V for which (i) is valid, it is clear that

α′j = Σ_{i=1}^{n} Pij αi.

Thus we need only show that the vectors α′j, defined by these equations, form a basis. Let Q = P⁻¹. Then

Σ_j Qjk α′j = Σ_j Qjk Σ_i Pij αi
            = Σ_j Σ_i Pij Qjk αi
            = Σ_i ( Σ_j Pij Qjk ) αi
            = αk.

Thus the subspace spanned by the set

ℬ′ = {α′1, ..., α′n}

contains ℬ and hence equals V. Thus ℬ′ is a basis, and from its definition and Theorem 7, it is clear that (i) is valid and hence also (ii).

Example 18. Let F be a field and let

α = (x1, x2, ..., xn)

be a vector in Fⁿ. If ℬ is the standard ordered basis of Fⁿ,

ℬ = {ε1, ..., εn},

the coordinate matrix of the vector α in the basis ℬ is given by

[α]_ℬ = [[x1],
         [x2],
          ...
         [xn]].

Example 19. Let R be the field of the real numbers and let θ be a fixed real number. The matrix

P = [[cos θ, −sin θ],
     [sin θ,  cos θ]]
is invertible with inverse

P⁻¹ = [[ cos θ, sin θ],
       [−sin θ, cos θ]].

Thus for each θ the set ℬ′ consisting of the vectors (cos θ, sin θ), (−sin θ, cos θ) is a basis for R²; intuitively, this basis may be described as the one obtained by rotating the standard basis through the angle θ. If α is the vector (x1, x2), then

[α]_ℬ′ = [[ cos θ, sin θ], [−sin θ, cos θ]] [[x1], [x2]]

or

x′1 =  x1 cos θ + x2 sin θ
x′2 = −x1 sin θ + x2 cos θ.

Example 20. Let F be a subfield of the complex numbers. The matrix

P = [[−1, 4,  5],
     [ 0, 2, −3],
     [ 0, 0,  8]]

is invertible with inverse

P⁻¹ = [[−1,   2,  11/8],
       [ 0, 1/2,  3/16],
       [ 0,   0,   1/8]].

Thus the vectors

α′1 = (−1, 0, 0),  α′2 = (4, 2, 0),  α′3 = (5, −3, 8)

form a basis ℬ′ of F³. The coordinates x′1, x′2, x′3 of the vector α = (x1, x2, x3) in the basis ℬ′ are given by

[[x′1], [x′2], [x′3]] = [[−1, 2, 11/8], [0, 1/2, 3/16], [0, 0, 1/8]] [[x1], [x2], [x3]]

that is,

x′1 = −x1 + 2x2 + (11/8)x3
x′2 = (1/2)x2 + (3/16)x3
x′3 = (1/8)x3.

In particular,

(3, 2, −8) = −10α′1 − (1/2)α′2 − α′3.

Exercises

1. Show that the vectors

α1 = (1, 1, 0, 0),  α2 = (0, 0, 1, 1),  α3 = (1, 0, 0, 4),  α4 = (0, 0, 0, 2)

form a basis for R⁴. Find the coordinates of each of the standard basis vectors in the ordered basis {α1, α2, α3, α4}.
Exercises

1. Show that the vectors

α_1 = (1, 1, 0, 0),   α_2 = (0, 0, 1, 1),
α_3 = (1, 0, 0, 4),   α_4 = (0, 0, 0, 2)

form a basis for R⁴. Find the coordinates of each of the standard basis vectors in the ordered basis {α_1, α_2, α_3, α_4}.

2. Find the coordinate matrix of the vector (1, 0, 1) in the basis of C³ consisting of the vectors (2i, 1, 0), (2, −1, 1), (0, 1 + i, 1 − i), in that order.

3. Let ℬ = {α_1, α_2, α_3} be the ordered basis for R³ consisting of

α_1 = (1, 0, −1),   α_2 = (1, 1, 1),   α_3 = (1, 0, 0).

What are the coordinates of the vector (a, b, c) in the ordered basis ℬ?

4. Let W be the subspace of C³ spanned by α_1 = (1, 0, i) and α_2 = (1 + i, 1, −1).
(a) Show that α_1 and α_2 form a basis for W.
(b) Show that the vectors β_1 = (1, 1, 0) and β_2 = (1, i, 1 + i) are in W and form another basis for W.
(c) What are the coordinates of α_1 and α_2 in the ordered basis {β_1, β_2} for W?

5. Let α = (x_1, x_2) and β = (y_1, y_2) be vectors in R² such that

x_1y_1 + x_2y_2 = 0,    x_1² + x_2² = y_1² + y_2² = 1.

Prove that ℬ = {α, β} is a basis for R². Find the coordinates of the vector (a, b) in the ordered basis ℬ = {α, β}. (The conditions on α and β say, geometrically, that α and β are perpendicular and each has length 1.)

6. Let V be the vector space over the complex numbers of all functions from R into C, i.e., the space of all complex-valued functions on the real line. Let f_1(x) = 1, f_2(x) = e^{ix}, f_3(x) = e^{−ix}.
(a) Prove that f_1, f_2, and f_3 are linearly independent.
(b) Let g_1(x) = 1, g_2(x) = cos x, g_3(x) = sin x. Find an invertible 3 × 3 matrix P such that

g_j = Σ_{i=1}^{3} P_ij f_i.

7. Let V be the (real) vector space of all polynomial functions from R into R of degree 2 or less, i.e., the space of all functions f of the form

f(x) = c_0 + c_1x + c_2x².

Let t be a fixed real number and define

g_1(x) = 1,   g_2(x) = x + t,   g_3(x) = (x + t)².

Prove that ℬ = {g_1, g_2, g_3} is a basis for V. If

f(x) = c_0 + c_1x + c_2x²,

what are the coordinates of f in this ordered basis ℬ?

2.5. Summary of Row-Equivalence

In this section we shall utilize some elementary facts on bases and dimension in finite-dimensional vector spaces to complete our discussion of row-equivalence of matrices. We recall that if A is an m × n matrix over the field F the row vectors of A are the vectors α_1, ..., α_m in Fⁿ defined by

α_i = (A_i1, ..., A_in)

and that the row space of A is the subspace of Fⁿ spanned by these vectors. The row rank of A is the dimension of the row space of A.

If P is a k × m matrix over F, then the product B = PA is a k × n matrix whose row vectors β_1, ..., β_k are linear combinations

β_i = P_i1α_1 + ⋯ + P_imα_m

of the row vectors of A. Thus the row space of B is a subspace of the row space of A. If P is an m × m invertible matrix, then B is row-equivalent to A, so that the symmetry of row-equivalence, or the equation A = P⁻¹B, implies that the row space of A is also a subspace of the row space of B.

Theorem 9. Row-equivalent matrices have the same row space.

Thus we see that to study the row space of A we may as well study the row space of a row-reduced echelon matrix which is row-equivalent to A. This we proceed to do.

Theorem 10. Let R be a non-zero row-reduced echelon matrix. Then the non-zero row vectors of R form a basis for the row space of R.

Proof. Let ρ_1, ..., ρ_r be the non-zero row vectors of R:

ρ_i = (R_i1, ..., R_in).

Certainly these vectors span the row space of R; we need only prove they are linearly independent. Since R is a row-reduced echelon matrix, there are positive integers k_1, ..., k_r such that, for i ≤ r,

(2-18)  (a) R(i, j) = 0 if j < k_i
        (b) R(i, k_j) = δ_ij
        (c) k_1 < ⋯ < k_r.

Suppose β = (b_1, ..., b_n) is a vector in the row space of R:

(2-19)    β = c_1ρ_1 + ⋯ + c_rρ_r.

Then we claim that c_j = b_{k_j}. For, by (2-18),

(2-20)    b_{k_j} = Σ_{i=1}^{r} c_iR(i, k_j) = c_j.

In particular, if β = 0, i.e., if c_1ρ_1 + ⋯ + c_rρ_r = 0, then c_j must be the k_jth coordinate of the zero vector, so that c_j = 0, j = 1, ..., r. Thus ρ_1, ..., ρ_r are linearly independent.

Theorem 11. Let m and n be positive integers and let F be a field. Suppose W is a subspace of Fⁿ and dim W ≤ m. Then there is precisely one m × n row-reduced echelon matrix over F which has W as its row space.

Proof. There is at least one m × n row-reduced echelon matrix with row space W. Since dim W ≤ m, we can select some m vectors α_1, ..., α_m in W which span W. Let A be the m × n matrix with row vectors α_1, ..., α_m and let R be a row-reduced echelon matrix which is row-equivalent to A. Then the row space of R is W.

Now let R be any row-reduced echelon matrix which has W as its row space. Let ρ_1, ..., ρ_r be the non-zero row vectors of R and suppose that the leading non-zero entry of ρ_i occurs in column k_i, i = 1, ..., r. The vectors ρ_1, ..., ρ_r form a basis for W. In the proof of Theorem 10, we observed that if β = (b_1, ..., b_n) is in W, then

β = c_1ρ_1 + ⋯ + c_rρ_r,

and c_i = b_{k_i}; in other words, the unique expression for β as a linear combination of ρ_1, ..., ρ_r is

(2-21)    β = Σ_{i=1}^{r} b_{k_i} ρ_i.

Thus any vector β is determined if one knows the coordinates b_{k_i}, i = 1, ..., r. For example, ρ_s is the unique vector in W which has k_sth coordinate 1 and k_ith coordinate 0 for i ≠ s.

Suppose β is in W and β ≠ 0. We claim the first non-zero coordinate of β occurs in one of the columns k_s. Since

β = Σ_{i=1}^{r} b_{k_i} ρ_i

and β ≠ 0, we can write

(2-22)    β = Σ_{i=s}^{r} b_{k_i} ρ_i,    b_{k_s} ≠ 0.

From the conditions (2-18) one has R_ij = 0 if i ≥ s and j < k_s. Thus

β = (0, ..., 0, b_{k_s}, ..., b_n),    b_{k_s} ≠ 0

and the first non-zero coordinate of β occurs in column k_s. Note also that for each k_s, s = 1, ..., r, there exists a vector in W which has a non-zero k_sth coordinate, namely ρ_s.

It is now clear that R is uniquely determined by W. The description of R in terms of W is as follows. We consider all vectors β = (b_1, ..., b_n) in W. If β ≠ 0, then the first non-zero coordinate of β must occur in some column t:

β = (0, ..., 0, b_t, ..., b_n),    b_t ≠ 0.

Let k_1, ..., k_r be those positive integers t such that there is some β ≠ 0 in W, the first non-zero coordinate of which occurs in column t. Arrange k_1, ..., k_r in the order k_1 < k_2 < ⋯ < k_r. For each of the positive integers k_s there will be one and only one vector ρ_s in W such that the k_sth coordinate of ρ_s is 1 and the k_ith coordinate of ρ_s is 0 for i ≠ s. Then R is the m × n matrix which has row vectors ρ_1, ..., ρ_r, 0, ..., 0.

Corollary. Each m × n matrix A is row-equivalent to one and only one row-reduced echelon matrix.

Proof. We know that A is row-equivalent to at least one row-reduced echelon matrix R. If A is row-equivalent to another such matrix R', then R is row-equivalent to R'; hence, R and R' have the same row space and must be identical.

Corollary. Let A and B be m × n matrices over the field F. Then A and B are row-equivalent if and only if they have the same row space.

Proof. We know that if A and B are row-equivalent, then they have the same row space. So suppose that A and B have the same row space. Now A is row-equivalent to a row-reduced echelon matrix R and B is row-equivalent to a row-reduced echelon matrix R'. Since A and B have the same row space, R and R' have the same row space. Thus R = R' and A is row-equivalent to B.

To summarize—if A and B are m × n matrices over the field F, the following statements are equivalent:

1. A and B are row-equivalent.
2. A and B have the same row space.
3. B = PA, where P is an invertible m × m matrix.

A fourth equivalent statement is that the homogeneous systems AX = 0 and BX = 0 have the same solutions; however, although we know that the row-equivalence of A and B implies that these systems have the same solutions, it seems best to leave the proof of the converse until later.
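In computational terms, the second corollary says that row-equivalence can be tested by reducing both matrices to their row-reduced echelon forms and comparing. The sketch below (Python with exact rational arithmetic; the two sample matrices are arbitrary spanning sets for one subspace of R⁴, and the same vectors reappear in Example 21 below) illustrates this.

```python
from fractions import Fraction

def rref(A):
    """Row-reduced echelon form of A (a list of rows) over the rationals."""
    R = [[Fraction(x) for x in row] for row in A]
    m, n = len(R), len(R[0])
    row = 0
    for col in range(n):
        # find a pivot in this column at or below 'row'
        pivot = next((r for r in range(row, m) if R[r][col] != 0), None)
        if pivot is None:
            continue
        R[row], R[pivot] = R[pivot], R[row]
        R[row] = [x / R[row][col] for x in R[row]]          # make the leading entry 1
        for r in range(m):
            if r != row and R[r][col] != 0:                  # clear the rest of the column
                R[r] = [a - R[r][col] * b for a, b in zip(R[r], R[row])]
        row += 1
        if row == m:
            break
    return R

# Two matrices whose rows span the same subspace of R^4:
A = [[1, 2, 2, 1], [0, 2, 0, 1], [-2, 0, -4, 3]]
B = [[1, 0, 2, 0], [0, 2, 0, 1], [0, 0, 0, 3]]
print(rref(A) == rref(B))    # True: same row-reduced echelon matrix, hence row-equivalent
```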
2.6. Computations Concerning Subspaces

We should like now to show how elementary row operations provide a standardized method of answering certain concrete questions concerning subspaces of Fⁿ. We have already derived the facts we shall need. They are gathered here for the convenience of the reader. The discussion applies to any n-dimensional vector space over the field F, if one selects a fixed ordered basis ℬ and describes each vector α in V by the n-tuple (x_1, ..., x_n) which gives the coordinates of α in the ordered basis ℬ.

Suppose we are given m vectors α_1, ..., α_m in Fⁿ. We consider the following questions.

1. How does one determine if the vectors α_1, ..., α_m are linearly independent? More generally, how does one find the dimension of the subspace W spanned by these vectors?
2. Given β in Fⁿ, how does one determine whether β is a linear combination of α_1, ..., α_m, i.e., whether β is in the subspace W?
3. How can one give an explicit description of the subspace W?

The third question is a little vague, since it does not specify what is meant by an 'explicit description'; however, we shall clear up this point by giving the sort of description we have in mind. With this description, questions (1) and (2) can be answered immediately.

Let A be the m × n matrix with row vectors α_i:

α_i = (A_i1, ..., A_in).

Perform a sequence of elementary row operations, starting with A and terminating with a row-reduced echelon matrix R. We have previously described how to do this. At this point, the dimension of W (the row space of A) is apparent, since this dimension is simply the number of non-zero row vectors of R. If ρ_1, ..., ρ_r are the non-zero row vectors of R, then ℬ = {ρ_1, ..., ρ_r} is a basis for W. If the first non-zero coordinate of ρ_i is the k_ith one, then we have for i ≤ r

(a) R(i, j) = 0 if j < k_i
(b) R(i, k_j) = δ_ij
(c) k_1 < ⋯ < k_r.

The subspace W consists of all vectors

β = c_1ρ_1 + ⋯ + c_rρ_r = Σ_{i=1}^{r} c_i(R_i1, ..., R_in).

The coordinates b_1, ..., b_n of such a vector β are then

(2-23)    b_j = Σ_{i=1}^{r} c_iR_ij.

In particular, b_{k_j} = c_j, and so if β = (b_1, ..., b_n) is a linear combination of the ρ_i, it must be the particular linear combination

(2-24)    β = Σ_{i=1}^{r} b_{k_i}ρ_i.

The conditions on β that (2-24) should hold are

(2-25)    b_j = Σ_{i=1}^{r} b_{k_i}R_ij,    j = 1, ..., n.

Now (2-25) is the explicit description of the subspace W spanned by α_1, ..., α_m, that is, the subspace consists of all vectors β in Fⁿ whose coordinates satisfy (2-25). What kind of description is (2-25)? In the first place it describes W as all solutions β = (b_1, ..., b_n) of the system of homogeneous linear equations (2-25). This system of equations is of a very special nature, because it expresses (n − r) of the coordinates as linear combinations of the r distinguished coordinates b_{k_1}, ..., b_{k_r}. One has complete freedom of choice in the coordinates b_{k_i}, that is, if c_1, ..., c_r are any r scalars, there is one and only one vector β in W which has c_i as its k_ith coordinate.

The significant point here is this: Given the vectors α_i, row-reduction is a straightforward method of determining the integers r, k_1, ..., k_r and the scalars R_ij which give the description (2-25) of the subspace spanned by α_1, ..., α_m. One should observe, as we did in Theorem 11, that every subspace W of Fⁿ has a description of the type (2-25). We should also point out some things about question (2). We have already stated how one can find an invertible m × m matrix P such that R = PA, in Section 1.4. The knowledge of P enables one to find the scalars x_1, ..., x_m such that

β = x_1α_1 + ⋯ + x_mα_m

when this is possible. For the row vectors of R are given by

ρ_i = Σ_{j=1}^{m} P_ijα_j

so that if β is a linear combination of the α_j, we have

β = Σ_{i=1}^{r} b_{k_i}ρ_i = Σ_{i=1}^{r} b_{k_i} Σ_{j=1}^{m} P_ijα_j = Σ_{j=1}^{m} ( Σ_{i=1}^{r} b_{k_i}P_ij ) α_j

and thus

x_j = Σ_{i=1}^{r} b_{k_i}P_ij

is one possible choice for the x_j (there may be many).

The question of whether β = (b_1, ..., b_n) is a linear combination of the α_i, and if so what the scalars x_i are, can also be looked at by asking whether the system of equations

Σ_{i=1}^{m} A_ijx_i = b_j,    j = 1, ..., n

has a solution and what the solutions are. The coefficient matrix of this system of equations is the n × m matrix B with column vectors α_1, ..., α_m. In Chapter 1 we discussed the use of elementary row operations in solving a system of equations BX = Y. Let us consider one example in which we adopt both points of view in answering questions about subspaces of Fⁿ.
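The passage above describes an algorithm: row-reduce the matrix of spanning vectors and read off r, the columns k_1, ..., k_r, and the scalars R_ij of (2-25). A small sketch of that procedure using SymPy's rref (the spanning vectors here are an arbitrary illustration, not an example from the text):

```python
from sympy import Matrix, symbols, Eq

# Row vectors spanning a subspace W of R^4 (any example will do).
A = Matrix([[1, 2, 1, 0],
            [0, 1, 1, 1],
            [1, 3, 2, 1]])

R, pivots = A.rref()          # R is row-reduced echelon; pivots = (k_1, ..., k_r), 0-indexed
r = len(pivots)
print(R)                      # dim W = number of non-zero rows of R = r

# The description (2-25): for each non-pivot column j,  b_j = sum_i b_{k_i} R_ij.
b = symbols('b1:5')
for j in range(A.cols):
    if j not in pivots:
        print(Eq(b[j], sum(b[pivots[i]] * R[i, j] for i in range(r))))
```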
EXAMPLE 21. Let us pose the following problem. Let W be the subspace of R⁴ spanned by the vectors

α_1 = (1, 2, 2, 1)
α_2 = (0, 2, 0, 1)
α_3 = (−2, 0, −4, 3).

(a) Prove that α_1, α_2, α_3 form a basis for W, i.e., that these vectors are linearly independent.
(b) Let β = (b_1, b_2, b_3, b_4) be a vector in W. What are the coordinates of β relative to the ordered basis {α_1, α_2, α_3}?
(c) Let

α'_1 = (1, 0, 2, 0)
α'_2 = (0, 2, 0, 1)
α'_3 = (0, 0, 0, 3).

Show that α'_1, α'_2, α'_3 form a basis for W.
(d) If β is in W, let X denote the coordinate matrix of β relative to the α-basis and X' the coordinate matrix of β relative to the α'-basis. Find the 3 × 3 matrix P such that X = PX' for every such β.

To answer these questions by the first method we form the matrix A with row vectors α_1, α_2, α_3, find the row-reduced echelon matrix R which is row-equivalent to A and simultaneously perform the same operations on the identity to obtain the invertible matrix Q such that R = QA:

A = [  1  2  2  1 ]        R = [ 1  0  2  0 ]        Q = [   1    −1     0  ]
    [  0  2  0  1 ]            [ 0  1  0  0 ]            [ −1/3   5/6  −1/6 ]
    [ −2  0 −4  3 ]            [ 0  0  0  1 ]            [  2/3  −2/3   1/3 ].

(a) Clearly R has rank 3, so α_1, α_2 and α_3 are independent.
(b) Which vectors β = (b_1, b_2, b_3, b_4) are in W? We have the basis for W given by ρ_1, ρ_2, ρ_3, the row vectors of R. One can see at a glance that the span of ρ_1, ρ_2, ρ_3 consists of the vectors β for which b_3 = 2b_1. For such a β we have

β = b_1ρ_1 + b_2ρ_2 + b_4ρ_3
  = [b_1, b_2, b_4]R
  = [b_1, b_2, b_4]QA
  = x_1α_1 + x_2α_2 + x_3α_3

where [x_1, x_2, x_3] = [b_1, b_2, b_4]Q:

(2-26)    x_1 =  b_1 − ⅓b_2 + ⅔b_4
          x_2 = −b_1 + ⅚b_2 − ⅔b_4
          x_3 =      − ⅙b_2 + ⅓b_4.

(c) The vectors α'_1, α'_2, α'_3 are all of the form (y_1, y_2, y_3, y_4) with y_3 = 2y_1, and thus they are in W. One can see at a glance that they are independent.
(d) The matrix P has for its columns

P_j = [α'_j]_ℬ

where ℬ = {α_1, α_2, α_3}. The equations (2-26) tell us how to find the coordinate matrices for α'_1, α'_2, α'_3. For example with β = α'_1 we have b_1 = 1, b_2 = 0, b_3 = 2, b_4 = 0, and

x_1 =  1 − ⅓(0) + ⅔(0) =  1
x_2 = −1 + ⅚(0) − ⅔(0) = −1
x_3 =    − ⅙(0) + ⅓(0) =  0.

Thus α'_1 = α_1 − α_2. Similarly we obtain α'_2 = α_2 and α'_3 = 2α_1 − 2α_2 + α_3. Hence

P = [  1  0   2 ]
    [ −1  1  −2 ]
    [  0  0   1 ].

Now let us see how we would answer the questions by the second method which we described. We form the 4 × 3 matrix B with column vectors α_1, α_2, α_3:

B = [ 1  0  −2 ]
    [ 2  2   0 ]
    [ 2  0  −4 ]
    [ 1  1   3 ]

and row-reduce the augmented matrix of the system BX = Y. The reduction ends with

[ 1  0  0 |  y_1 − ⅓y_2 + ⅔y_4 ]
[ 0  1  0 | −y_1 + ⅚y_2 − ⅔y_4 ]
[ 0  0  1 |      − ⅙y_2 + ⅓y_4 ]
[ 0  0  0 |        y_3 − 2y_1  ].

Thus the condition that the system BX = Y have a solution is y_3 = 2y_1. So β = (b_1, b_2, b_3, b_4) is in W if and only if b_3 = 2b_1. If β is in W, then the coordinates (x_1, x_2, x_3) in the ordered basis {α_1, α_2, α_3} can be read off from the last matrix above. We obtain once again the formulas (2-26) for those coordinates. The questions (c) and (d) are now answered as before.
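The formulas (2-26) can be checked mechanically. The following SymPy sketch (an illustration, not part of the example) recomputes the coordinates of α'_3 and verifies that they reproduce the vector.

```python
from sympy import Matrix, Rational

a1 = Matrix([1, 2, 2, 1]); a2 = Matrix([0, 2, 0, 1]); a3 = Matrix([-2, 0, -4, 3])

def coords(b1, b2, b4):
    """Formulas (2-26): coordinates of (b1, b2, 2*b1, b4) in the basis {a1, a2, a3}."""
    x1 = b1 - Rational(1, 3)*b2 + Rational(2, 3)*b4
    x2 = -b1 + Rational(5, 6)*b2 - Rational(2, 3)*b4
    x3 = -Rational(1, 6)*b2 + Rational(1, 3)*b4
    return x1, x2, x3

# alpha'_3 = (0, 0, 0, 3) should have coordinates (2, -2, 1), the last column of P.
beta = Matrix([0, 0, 0, 3])
x1, x2, x3 = coords(0, 0, 3)
print(x1, x2, x3)                          # 2 -2 1
assert x1*a1 + x2*a2 + x3*a3 == beta
```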
EXAMPLE 22. We consider the 5 × 5 matrix

A = [ 1  2  0   3  0 ]
    [ 1  2 −1  −1  0 ]
    [ 0  0  1   4  0 ]
    [ 2  4  1  10  1 ]
    [ 0  0  0   0  1 ]

and the following problems concerning A:

(a) Find an invertible matrix P such that PA is a row-reduced echelon matrix R.
(b) Find a basis for the row space W of A.
(c) Say which vectors (b_1, b_2, b_3, b_4, b_5) are in W.
(d) Find the coordinate matrix of each vector (b_1, b_2, b_3, b_4, b_5) in W in the ordered basis chosen in (b).
(e) Write each vector (b_1, b_2, b_3, b_4, b_5) in W as a linear combination of the rows of A.
(f) Give an explicit description of the vector space V of all 5 × 1 column matrices X such that AX = 0.
(g) Find a basis for V.
(h) For what 5 × 1 column matrices Y does the equation AX = Y have solutions X?

To solve these problems we form the augmented matrix A' of the system AX = Y and apply an appropriate sequence of row operations to A'.

(a) If

PY = [            y_1            ]
     [         y_1 − y_2         ]
     [            y_5            ]
     [     −y_1 + y_2 + y_3      ]
     [ −3y_1 + y_2 + y_4 − y_5   ]

for all Y, then

P = [  1   0  0  0   0 ]
    [  1  −1  0  0   0 ]
    [  0   0  0  0   1 ]
    [ −1   1  1  0   0 ]
    [ −3   1  0  1  −1 ]

hence PA is the row-reduced echelon matrix

R = [ 1  2  0  3  0 ]
    [ 0  0  1  4  0 ]
    [ 0  0  0  0  1 ]
    [ 0  0  0  0  0 ]
    [ 0  0  0  0  0 ].

It should be stressed that the matrix P is not unique. There are, in fact, many invertible matrices P (which arise from different choices for the operations used to reduce A') such that PA = R.

(b) As a basis for W we may take the non-zero rows

ρ_1 = (1, 2, 0, 3, 0)
ρ_2 = (0, 0, 1, 4, 0)
ρ_3 = (0, 0, 0, 0, 1)

of R.

(c) The row space W consists of all vectors of the form

β = c_1ρ_1 + c_2ρ_2 + c_3ρ_3 = (c_1, 2c_1, c_2, 3c_1 + 4c_2, c_3)

where c_1, c_2, c_3 are arbitrary scalars. Thus (b_1, b_2, b_3, b_4, b_5) is in W if and only if

(b_1, b_2, b_3, b_4, b_5) = b_1ρ_1 + b_3ρ_2 + b_5ρ_3

which is true if and only if

b_2 = 2b_1
b_4 = 3b_1 + 4b_3.

These equations are instances of the general system (2-25), and using them we may tell at a glance whether a given vector lies in W. Thus (−5, −10, 1, −11, 20) is a linear combination of the rows of A, but (1, 2, 3, 4, 5) is not.

(d) The coordinate matrix of the vector (b_1, 2b_1, b_3, 3b_1 + 4b_3, b_5) in the basis {ρ_1, ρ_2, ρ_3} is evidently

[ b_1 ]
[ b_3 ]
[ b_5 ].

(e) There are many ways to write the vectors in W as linear combinations of the rows of A. Perhaps the easiest method is to follow the first procedure indicated before Example 21:

β = (b_1, 2b_1, b_3, 3b_1 + 4b_3, b_5)
  = [b_1, b_3, b_5, 0, 0] · R
  = [b_1, b_3, b_5, 0, 0] · PA
  = ([b_1, b_3, b_5, 0, 0] · P) · A
  = [b_1 + b_3, −b_3, 0, 0, b_5] · A.

In particular, with β = (−5, −10, 1, −11, 20) we have

β = [−4, −1, 0, 0, 20] · A.

(f) The equations in the system RX = 0 are

x_1 + 2x_2 + 3x_4 = 0
x_3 + 4x_4 = 0
x_5 = 0.

Thus V consists of all columns of the form

X = [ −2x_2 − 3x_4 ]
    [      x_2      ]
    [    −4x_4      ]
    [      x_4      ]
    [       0       ]

where x_2 and x_4 are arbitrary.

(g) The columns

[ −2 ]     [ −3 ]
[  1 ]     [  0 ]
[  0 ]     [ −4 ]
[  0 ]     [  1 ]
[  0 ]     [  0 ]

form a basis of V. This is an example of the basis described in Example 15.

(h) The equation AX = Y has solutions X if and only if

−y_1 + y_2 + y_3 = 0
−3y_1 + y_2 + y_4 − y_5 = 0.
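Parts (f), (g) and (h) of Example 22 can be reproduced with a computer algebra system. The sketch below (SymPy; an illustration rather than part of the text) computes a basis for the solution space V and recovers the solvability conditions of (h) from the left null space of A, using the fact that AX = Y has a solution exactly when vY = 0 for every row vector v with vA = 0.

```python
from sympy import Matrix

A = Matrix([[1, 2,  0,  3, 0],
            [1, 2, -1, -1, 0],
            [0, 0,  1,  4, 0],
            [2, 4,  1, 10, 1],
            [0, 0,  0,  0, 1]])

# (g) A basis for the solution space of AX = 0:
print(A.nullspace())          # the columns (-2,1,0,0,0) and (-3,0,-4,1,0)

# (h) Each basis vector v of the left null space gives one condition v.T * Y = 0;
#     together they are equivalent to the two equations found in (h).
for v in A.T.nullspace():
    print(v.T)
```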
Exercises

1. Let s < n and A an s × n matrix with entries in the field F. Use Theorem 4 (not its proof) to show that there is a non-zero X in F^{n×1} such that AX = 0.

2. Let

α_1 = (1, 1, −2, 1),   α_2 = (3, 0, 4, −1),   α_3 = (−1, 2, 5, 2).

Let

α = (4, −5, 9, −7),   β = (3, 1, −4, 4),   γ = (−1, 1, 0, 1).

(a) Which of the vectors α, β, γ are in the subspace of R⁴ spanned by the α_i?
(b) Which of the vectors α, β, γ are in the subspace of C⁴ spanned by the α_i?
(c) Does this suggest a theorem?

3. Consider the vectors in R⁴ defined by

α_1 = (−1, 0, 1, 2),   α_2 = (3, 4, −2, 5),   α_3 = (1, 4, 0, 9).

Find a system of homogeneous linear equations for which the space of solutions is exactly the subspace of R⁴ spanned by the three given vectors.

4. In C³, let

α_1 = (1, 0, −i),   α_2 = (1 + i, 1 − i, 1),   α_3 = (i, i, i).

Prove that these vectors form a basis for C³. What are the coordinates of the vector (a, b, c) in this basis?

5. Give an explicit description of the type (2-25) for the vectors

β = (b_1, b_2, b_3, b_4, b_5)

in R⁵ which are linear combinations of the vectors

α_1 = (1, 0, 2, 1, −1),    α_2 = (−1, 2, −4, 2, 0),
α_3 = (2, −1, 5, 2, 1),    α_4 = (2, 1, 3, 5, 2).

6. Let V be the real vector space spanned by the rows of the matrix

A = [ 3  21   0   9   0 ]
    [ 1   7  −1  −2  −1 ]
    [ 2  14   0   6   1 ]
    [ 6  42  −1  13   0 ].

(a) Find a basis for V.
(b) Tell which vectors (x_1, x_2, x_3, x_4, x_5) are elements of V.
(c) If (x_1, x_2, x_3, x_4, x_5) is in V what are its coordinates in the basis chosen in part (a)?

7. Let A be an m × n matrix over the field F, and consider the system of equations AX = Y. Prove that this system of equations has a solution if and only if the row rank of A is equal to the row rank of the augmented matrix of the system.
3. Linear Transformations

3.1. Linear Transformations

We shall now introduce linear transformations, the objects which we shall study in most of the remainder of this book. The reader may find it helpful to read (or reread) the discussion of functions in the Appendix, since we shall freely use the terminology of that discussion.

Definition. Let V and W be vector spaces over the field F. A linear transformation from V into W is a function T from V into W such that

T(cα + β) = c(Tα) + Tβ

for all α and β in V and all scalars c in F.

EXAMPLE 1. If V is any vector space, the identity transformation I, defined by Iα = α, is a linear transformation from V into V. The zero transformation 0, defined by 0α = 0, is a linear transformation from V into V.

EXAMPLE 2. Let F be a field and let V be the space of polynomial functions f from F into F, given by

f(x) = c_0 + c_1x + ⋯ + c_kx^k.

Let

(Df)(x) = c_1 + 2c_2x + ⋯ + kc_kx^{k−1}.

Then D is a linear transformation from V into V—the differentiation transformation.
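Example 2 is easy to experiment with if one represents the polynomial c_0 + c_1x + ⋯ + c_kx^k by its coefficient list [c_0, c_1, ..., c_k]; that representation, and the Python sketch below checking the linearity condition, are illustrative additions.

```python
# Represent c0 + c1 x + ... + ck x^k by the list [c0, c1, ..., ck].
def D(coeffs):
    """The differentiation transformation of Example 2."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [0]

def add(f, g):
    n = max(len(f), len(g))
    f, g = f + [0] * (n - len(f)), g + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]

def scale(c, f):
    return [c * a for a in f]

f = [1, 0, 3, 2]      # 1 + 3x^2 + 2x^3
g = [0, 5, 1]         # 5x + x^2
c = 4

# Linearity: D(c f + g) = c (D f) + D g
print(D(add(scale(c, f), g)))       # [5, 26, 24]
print(add(scale(c, D(f)), D(g)))    # [5, 26, 24]
```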
EXAMPLE 3. Let A be a fixed m × n matrix with entries in the field F. The function T defined by T(X) = AX is a linear transformation from F^{n×1} into F^{m×1}. The function U defined by U(α) = αA is a linear transformation from F^m into F^n.

EXAMPLE 4. Let P be a fixed m × m matrix with entries in the field F and let Q be a fixed n × n matrix over F. Define a function T from the space F^{m×n} into itself by T(A) = PAQ. Then T is a linear transformation from F^{m×n} into F^{m×n}, because

T(cA + B) = P(cA + B)Q
          = (cPA + PB)Q
          = cPAQ + PBQ
          = cT(A) + T(B).

EXAMPLE 5. Let R be the field of real numbers and let V be the space of all functions from R into R which are continuous. Define T by

(Tf)(x) = ∫₀ˣ f(t) dt.

Then T is a linear transformation from V into V. The function Tf is not only continuous but has a continuous first derivative. The linearity of integration is one of its fundamental properties.

The reader should have no difficulty in verifying that the transformations defined in Examples 1, 2, 3, and 5 are linear transformations. We shall expand our list of examples considerably as we learn more about linear transformations.

It is important to note that if T is a linear transformation from V into W, then T(0) = 0; one can see this from the definition because

T(0) = T(0 + 0) = T(0) + T(0).

This point is often confusing to the person who is studying linear algebra for the first time, since he probably has been exposed to a slightly different use of the term 'linear function.' A brief comment should clear up the confusion. Suppose V is the vector space R¹. A linear transformation from V into V is then a particular type of real-valued function on the real line R. In a calculus course, one would probably call such a function linear if its graph is a straight line. A linear transformation from R¹ into R¹, according to our definition, will be a function from R into R, the graph of which is a straight line passing through the origin.

In addition to the property T(0) = 0, let us point out another property of the general linear transformation T. Such a transformation 'preserves' linear combinations; that is, if α_1, ..., α_n are vectors in V and c_1, ..., c_n are scalars, then

T(c_1α_1 + ⋯ + c_nα_n) = c_1(Tα_1) + ⋯ + c_n(Tα_n).

This follows readily from the definition. For example,

T(c_1α_1 + c_2α_2) = c_1(Tα_1) + T(c_2α_2)
                   = c_1(Tα_1) + c_2(Tα_2).

Theorem 1. Let V be a finite-dimensional vector space over the field F and let {α_1, ..., α_n} be an ordered basis for V. Let W be a vector space over the same field F and let β_1, ..., β_n be any vectors in W. Then there is precisely one linear transformation T from V into W such that

Tα_j = β_j,    j = 1, ..., n.

Proof. To prove there is some linear transformation T with Tα_j = β_j we proceed as follows. Given α in V, there is a unique n-tuple (x_1, ..., x_n) such that

α = x_1α_1 + ⋯ + x_nα_n.

For this vector α we define

Tα = x_1β_1 + ⋯ + x_nβ_n.

Then T is a well-defined rule for associating with each vector α in V a vector Tα in W. From the definition it is clear that Tα_j = β_j for each j. To see that T is linear, let

β = y_1α_1 + ⋯ + y_nα_n

be in V and let c be any scalar. Now

cα + β = (cx_1 + y_1)α_1 + ⋯ + (cx_n + y_n)α_n

and so by definition

T(cα + β) = (cx_1 + y_1)β_1 + ⋯ + (cx_n + y_n)β_n.

On the other hand,

c(Tα) + Tβ = c Σ_{i=1}^{n} x_iβ_i + Σ_{i=1}^{n} y_iβ_i = Σ_{i=1}^{n} (cx_i + y_i)β_i

and thus

T(cα + β) = c(Tα) + Tβ.

If U is a linear transformation from V into W with Uα_j = β_j, j = 1, ..., n, then for the vector α = Σ_{i=1}^{n} x_iα_i we have

Uα = U( Σ_{i=1}^{n} x_iα_i ) = Σ_{i=1}^{n} x_i(Uα_i) = Σ_{i=1}^{n} x_iβ_i

so that U is exactly the rule T which we defined above. This shows that the linear transformation T with Tα_j = β_j is unique.

Theorem 1 is quite elementary; however, it is so basic that we have stated it formally. The concept of function is very general. If V and W are (non-zero) vector spaces, there is a multitude of functions from V into W. Theorem 1 helps to underscore the fact that the functions which are linear are extremely special.

EXAMPLE 6. The vectors

α_1 = (1, 2)
α_2 = (3, 4)

are linearly independent and therefore form a basis for R². According to Theorem 1, there is a unique linear transformation from R² into R³ such that

Tα_1 = (3, 2, 1)
Tα_2 = (6, 5, 4).

If so, we must be able to find T(ε_1). We find scalars c_1, c_2 such that ε_1 = c_1α_1 + c_2α_2 and then we know that Tε_1 = c_1Tα_1 + c_2Tα_2. If (1, 0) = c_1(1, 2) + c_2(3, 4) then c_1 = −2 and c_2 = 1. Thus

T(1, 0) = −2(3, 2, 1) + (6, 5, 4)
        = (0, 1, 2).

EXAMPLE 7. Let T be a linear transformation from the m-tuple space F^m into the n-tuple space F^n. Theorem 1 tells us that T is uniquely determined by the sequence of vectors β_1, ..., β_m where

β_i = Tε_i,    i = 1, ..., m.

In short, T is uniquely determined by the images of the standard basis vectors. The determination is

α = (x_1, ..., x_m)
Tα = x_1β_1 + ⋯ + x_mβ_m.

If B is the m × n matrix which has row vectors β_1, ..., β_m, this says that

Tα = αB.

In other words, if β_i = (B_i1, ..., B_in), then

T(x_1, ..., x_m) = [x_1 ⋯ x_m] [ B_11 ⋯ B_1n ]
                               [  ⋮         ⋮ ]
                               [ B_m1 ⋯ B_mn ].

This is a very explicit description of the linear transformation. In Section 3.4 we shall make a serious study of the relationship between linear transformations and matrices. We shall not pursue the particular description Tα = αB because it has the matrix B on the right of the vector α, and that can lead to some confusion. The point of this example is to show that we can give an explicit and reasonably simple description of all linear transformations from F^m into F^n.
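In coordinates, Theorem 1 gives a recipe for evaluating T anywhere once its values on a basis are known. A NumPy sketch of that recipe, applied to the data of Example 6 (the code itself is an illustration, not part of the text):

```python
import numpy as np

# A linear map R^2 -> R^3 is pinned down by its values on a basis.
# From Example 6: T(1,2) = (3,2,1) and T(3,4) = (6,5,4).
basis  = np.array([[1.0, 2.0], [3.0, 4.0]])              # rows: alpha_1, alpha_2
values = np.array([[3.0, 2.0, 1.0], [6.0, 5.0, 4.0]])    # rows: T(alpha_1), T(alpha_2)

def T(v):
    # Write v = c1*alpha_1 + c2*alpha_2, then T(v) = c1*T(alpha_1) + c2*T(alpha_2).
    c = np.linalg.solve(basis.T, v)
    return values.T @ c

print(T(np.array([1.0, 0.0])))      # [0. 1. 2.], as computed in Example 6
```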
If T is a linear transformation from V into W, then the range of T is not only a subset of W; it is a subspace of W. Let R_T be the range of T, that is, the set of all vectors β in W such that β = Tα for some α in V. Let β_1 and β_2 be in R_T and let c be a scalar. There are vectors α_1 and α_2 in V such that Tα_1 = β_1 and Tα_2 = β_2. Since T is linear,

T(cα_1 + α_2) = cTα_1 + Tα_2 = cβ_1 + β_2,

which shows that cβ_1 + β_2 is also in R_T.

Another interesting subspace associated with the linear transformation T is the set N consisting of the vectors α in V such that Tα = 0. It is a subspace of V because

(a) T(0) = 0, so that N is non-empty;
(b) if Tα_1 = Tα_2 = 0, then

T(cα_1 + α_2) = cTα_1 + Tα_2 = c0 + 0 = 0

so that cα_1 + α_2 is in N.

Definition. Let V and W be vector spaces over the field F and let T be a linear transformation from V into W. The null space of T is the set of all vectors α in V such that Tα = 0.

If V is finite-dimensional, the rank of T is the dimension of the range of T and the nullity of T is the dimension of the null space of T.

The following is one of the most important results in linear algebra.

Theorem 2. Let V and W be vector spaces over the field F and let T be a linear transformation from V into W. Suppose that V is finite-dimensional. Then

rank (T) + nullity (T) = dim V.

Proof. Let {α_1, ..., α_k} be a basis for N, the null space of T. There are vectors α_{k+1}, ..., α_n in V such that {α_1, ..., α_n} is a basis for V. We shall now prove that {Tα_{k+1}, ..., Tα_n} is a basis for the range of T. The vectors Tα_1, ..., Tα_n certainly span the range of T, and since Tα_j = 0 for j ≤ k, we see that Tα_{k+1}, ..., Tα_n span the range. To see that these vectors are independent, suppose we have scalars c_i such that

Σ_{i=k+1}^{n} c_i(Tα_i) = 0.

This says that

T( Σ_{i=k+1}^{n} c_iα_i ) = 0

and accordingly the vector α = Σ_{i=k+1}^{n} c_iα_i is in the null space of T. Since α_1, ..., α_k form a basis for N, there must be scalars b_1, ..., b_k such that

α = Σ_{i=1}^{k} b_iα_i.

Thus

Σ_{i=1}^{k} b_iα_i − Σ_{j=k+1}^{n} c_jα_j = 0

and since α_1, ..., α_n are linearly independent we must have

b_1 = ⋯ = b_k = c_{k+1} = ⋯ = c_n = 0.

If r is the rank of T, the fact that Tα_{k+1}, ..., Tα_n form a basis for the range of T tells us that r = n − k. Since k is the nullity of T and n is the dimension of V, we are done.

Theorem 3. If A is an m × n matrix with entries in the field F, then

row rank (A) = column rank (A).

Proof. Let T be the linear transformation from F^{n×1} into F^{m×1} defined by T(X) = AX. The null space of T is the solution space for the system AX = 0, i.e., the set of all column matrices X such that AX = 0. The range of T is the set of all m × 1 column matrices Y such that AX = Y has a solution for X. If A_1, ..., A_n are the columns of A, then

AX = x_1A_1 + ⋯ + x_nA_n

so that the range of T is the subspace spanned by the columns of A. In other words, the range of T is the column space of A. Therefore,

rank (T) = column rank (A).

Theorem 2 tells us that if S is the solution space for the system AX = 0, then

dim S + column rank (A) = n.

We now refer to Example 15 of Chapter 2. Our deliberations there showed that, if r is the dimension of the row space of A, then the solution space S has a basis consisting of n − r vectors:

dim S = n − row rank (A).

It is now apparent that

row rank (A) = column rank (A).

The proof of Theorem 3 which we have just given depends upon explicit calculations concerning systems of linear equations. There is a more conceptual proof which does not rely on such calculations. We shall give such a proof in Section 3.7.
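Theorems 2 and 3 are easy to confirm numerically for the operator T(X) = AX determined by a matrix A. A small SymPy sketch (the matrix is an arbitrary example, not one from the text):

```python
from sympy import Matrix

A = Matrix([[1, 2, 0,  3, 0],
            [0, 0, 1,  4, 0],
            [2, 4, 1, 10, 1]])

n = A.cols
rank = A.rank()                      # = column rank = row rank (Theorem 3)
nullity = len(A.nullspace())         # dimension of the solution space of AX = 0

print(rank, nullity, rank + nullity == n)    # 3 2 True  (Theorem 2 for T(X) = AX)
print(A.rank() == A.T.rank())                # True: row rank equals column rank
```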
Exercises

1. Which of the following functions T from R² into R² are linear transformations?
(a) T(x_1, x_2) = (1 + x_1, x_2);
(b) T(x_1, x_2) = (x_2, x_1);
(c) T(x_1, x_2) = (x_1², x_2);
(d) T(x_1, x_2) = (sin x_1, x_2);
(e) T(x_1, x_2) = (x_1 − x_2, 0).

2. Find the range, rank, null space, and nullity for the zero transformation and the identity transformation on a finite-dimensional space V.

3. Describe the range and the null space for the differentiation transformation of Example 2. Do the same for the integration transformation of Example 5.

4. Is there a linear transformation T from R³ into R² such that T(1, −1, 1) = (1, 0) and T(1, 1, 1) = (0, 1)?

5. If

α_1 = (1, −1),    β_1 = (1, 0)
α_2 = (2, −1),    β_2 = (0, 1)
α_3 = (−3, 2),    β_3 = (1, 1)

is there a linear transformation T from R² into R² such that Tα_i = β_i for i = 1, 2 and 3?

6. Describe explicitly (as in Exercises 1 and 2) the linear transformation T from F² into F² such that Tε_1 = (a, b), Tε_2 = (c, d).

7. Let F be a subfield of the complex numbers and let T be the function from F³ into F³ defined by

T(x_1, x_2, x_3) = (x_1 − x_2 + 2x_3, 2x_1 + x_2, −x_1 − 2x_2 + 2x_3).

(a) Verify that T is a linear transformation.
(b) If (a, b, c) is a vector in F³, what are the conditions on a, b, and c that the vector be in the range of T? What is the rank of T?
(c) What are the conditions on a, b, and c that (a, b, c) be in the null space of T? What is the nullity of T?

8. Describe explicitly a linear transformation from R³ into R³ which has as its range the subspace spanned by (1, 0, −1) and (1, 2, 2).

9. Let V be the vector space of all n × n matrices over the field F, and let B be a fixed n × n matrix. If

T(A) = AB − BA

verify that T is a linear transformation from V into V.

10. Let V be the set of all complex numbers regarded as a vector space over the field of real numbers (usual operations). Find a function from V into V which is a linear transformation on the above vector space, but which is not a linear transformation on C¹, i.e., which is not complex linear.

11. Let V be the space of n × 1 matrices over F and let W be the space of m × 1 matrices over F. Let A be a fixed m × n matrix over F and let T be the linear transformation from V into W defined by T(X) = AX. Prove that T is the zero transformation if and only if A is the zero matrix.

12. Let V be an n-dimensional vector space over the field F and let T be a linear transformation from V into V such that the range and null space of T are identical. Prove that n is even. (Can you give an example of such a linear transformation T?)

13. Let V be a vector space and T a linear transformation from V into V. Prove that the following two statements about T are equivalent.
(a) The intersection of the range of T and the null space of T is the zero subspace of V.
(b) If T(Tα) = 0, then Tα = 0.

3.2. The Algebra of Linear Transformations

In the study of linear transformations from V into W, it is of fundamental importance that the set of these transformations inherits a natural vector space structure. The set of linear transformations from a space V into itself has even more algebraic structure, because ordinary composition of functions provides a 'multiplication' of such transformations. We shall explore these ideas in this section.

Theorem 4. Let V and W be vector spaces over the field F. Let T and U be linear transformations from V into W. The function (T + U) defined by

(T + U)(α) = Tα + Uα

is a linear transformation from V into W. If c is any element of F, the function (cT) defined by

(cT)(α) = c(Tα)

is a linear transformation from V into W. The set of all linear transformations from V into W, together with the addition and scalar multiplication defined above, is a vector space over the field F.

Proof. Suppose T and U are linear transformations from V into W and that we define (T + U) as above. Then

(T + U)(cα + β) = T(cα + β) + U(cα + β)
                = c(Tα) + Tβ + c(Uα) + Uβ
                = c(Tα + Uα) + (Tβ + Uβ)
                = c(T + U)(α) + (T + U)(β)

which shows that (T + U) is a linear transformation. Similarly,

(cT)(dα + β) = c[T(dα + β)]
             = c[d(Tα) + Tβ]
             = cd(Tα) + c(Tβ)
             = d[c(Tα)] + c(Tβ)
             = d[(cT)α] + (cT)β

which shows that (cT) is a linear transformation.

To verify that the set of linear transformations of V into W (together with these operations) is a vector space, one must directly check each of the conditions on the vector addition and scalar multiplication. We leave the bulk of this to the reader, and content ourselves with this comment: The zero vector in this space will be the zero transformation, which sends every vector of V into the zero vector in W; each of the properties of the two operations follows from the corresponding property of the operations in the space W.

We should perhaps mention another way of looking at this theorem. If one defines sum and scalar multiple as we did above, then the set of all functions from V into W becomes a vector space over the field F. This has nothing to do with the fact that V is a vector space, only that V is a non-empty set. When V is a vector space we can define a linear transformation from V into W, and Theorem 4 says that the linear transformations are a subspace of the space of all functions from V into W.

We shall denote the space of linear transformations from V into W by L(V, W). We remind the reader that L(V, W) is defined only when V and W are vector spaces over the same field.

Theorem 5. Let V be an n-dimensional vector space over the field F, and let W be an m-dimensional vector space over F. Then the space L(V, W) is finite-dimensional and has dimension mn.

Proof. Let

ℬ = {α_1, ..., α_n}   and   ℬ' = {β_1, ..., β_m}

be ordered bases for V and W, respectively. For each pair of integers (p, q) with 1 ≤ p ≤ m and 1 ≤ q ≤ n, we define a linear transformation E^{p,q} from V into W by

E^{p,q}(α_i) = δ_{iq}β_p.

According to Theorem 1, there is a unique linear transformation from V into W satisfying these conditions. The claim is that the mn transformations E^{p,q} form a basis for L(V, W).

Let T be a linear transformation from V into W. For each j, 1 ≤ j ≤ n, let A_1j, ..., A_mj be the coordinates of the vector Tα_j in the ordered basis ℬ', i.e.,

(3-1)    Tα_j = Σ_{p=1}^{m} A_pjβ_p.

We wish to show that

(3-2)    T = Σ_{p=1}^{m} Σ_{q=1}^{n} A_pqE^{p,q}.

Let U be the linear transformation in the right-hand member of (3-2). Then for each j

Uα_j = Σ_p Σ_q A_pqE^{p,q}(α_j)
     = Σ_p Σ_q A_pqδ_{jq}β_p
     = Σ_{p=1}^{m} A_pjβ_p
     = Tα_j

and consequently U = T. Now (3-2) shows that the E^{p,q} span L(V, W); we must prove that they are independent. But this is clear from what we did above; for, if the transformation

U = Σ_p Σ_q A_pqE^{p,q}

is the zero transformation, then Uα_j = 0 for each j, so

Σ_{p=1}^{m} A_pjβ_p = 0

and the independence of the β_p implies that A_pj = 0 for every p and j.

Theorem 6. Let V, W, and Z be vector spaces over the field F. Let T be a linear transformation from V into W and U a linear transformation from W into Z. Then the composed function UT defined by (UT)(α) = U(T(α)) is a linear transformation from V into Z.

Proof.

(UT)(cα + β) = U[T(cα + β)]
             = U(cTα + Tβ)
             = c[U(Tα)] + U(Tβ)
             = c(UT)(α) + (UT)(β).

In what follows, we shall be primarily concerned with linear transformations of a vector space into itself. Since we would so often have to write 'T is a linear transformation from V into V,' we shall replace this with 'T is a linear operator on V.'

Definition. If V is a vector space over the field F, a linear operator on V is a linear transformation from V into V.

In the case of Theorem 6 when V = W = Z, so that U and T are linear operators on the space V, we see that the composition UT is again a linear operator on V. Thus the space L(V, V) has a 'multiplication' defined on it by composition. In this case the operator TU is also defined, and one should note that in general UT ≠ TU, i.e., UT − TU ≠ 0. We should take special note of the fact that if T is a linear operator on V then we can compose T with T. We shall use the notation T² = TT, and in general Tⁿ = T ⋯ T (n times) for n = 1, 2, 3, .... We define T⁰ = I if T ≠ 0.

Lemma. Let V be a vector space over the field F; let U, T_1 and T_2 be linear operators on V; let c be an element of F.

(a) IU = UI = U;
(b) U(T_1 + T_2) = UT_1 + UT_2; (T_1 + T_2)U = T_1U + T_2U;
(c) c(UT_1) = (cU)T_1 = U(cT_1).

Proof. (a) This property of the identity function is obvious. We have stated it here merely for emphasis.

(b)
[U(T_1 + T_2)](α) = U[(T_1 + T_2)(α)]
                  = U(T_1α + T_2α)
                  = U(T_1α) + U(T_2α)
                  = (UT_1)(α) + (UT_2)(α)

so that U(T_1 + T_2) = UT_1 + UT_2. Also

[(T_1 + T_2)U](α) = (T_1 + T_2)(Uα)
                  = T_1(Uα) + T_2(Uα)
                  = (T_1U)(α) + (T_2U)(α)

so that (T_1 + T_2)U = T_1U + T_2U. (The reader may note that the proofs of these two distributive laws do not use the fact that T_1 and T_2 are linear, and the proof of the second one does not use the fact that U is linear either.)

(c) We leave the proof of part (c) to the reader.

The contents of this lemma and a portion of Theorem 5 tell us that the vector space L(V, V), together with the composition operation, is what is known as a linear algebra with identity. We shall discuss this in Chapter 4.

EXAMPLE 8. If A is an m × n matrix with entries in F, we have the linear transformation T defined by T(X) = AX, from F^{n×1} into F^{m×1}. If B is a p × m matrix, we have the linear transformation U from F^{m×1} into F^{p×1} defined by U(Y) = BY. The composition UT is easily described:

(UT)(X) = U(T(X)) = U(AX) = B(AX) = (BA)X.

Thus UT is 'left multiplication by the product matrix BA.'

EXAMPLE 9. Let F be a field and V the vector space of all polynomial functions from F into F. Let D be the differentiation operator defined in Example 2, and let T be the linear operator 'multiplication by x':

(Tf)(x) = xf(x).

Then DT ≠ TD. In fact, the reader should find it easy to verify that DT − TD = I, the identity operator.

Even though the 'multiplication' we have on L(V, V) is not commutative, it is nicely related to the vector space operations of L(V, V).

EXAMPLE 10. Let ℬ = {α_1, ..., α_n} be an ordered basis for a vector space V. Consider the linear operators E^{p,q} which arose in the proof of Theorem 5:

E^{p,q}(α_i) = δ_{iq}α_p.

These n² linear operators form a basis for the space of linear operators on V. What is E^{p,q}E^{r,s}? We have

(E^{p,q}E^{r,s})(α_i) = E^{p,q}(δ_{is}α_r) = δ_{is}E^{p,q}(α_r) = δ_{is}δ_{rq}α_p.

Therefore,

E^{p,q}E^{r,s} = 0        if r ≠ q
E^{p,q}E^{r,s} = E^{p,s}  if r = q.

Let T be a linear operator on V. We showed in the proof of Theorem 5 that if

A_j = [Tα_j]_ℬ,    A = [A_1, ..., A_n]

then

T = Σ_p Σ_q A_pqE^{p,q}.

If

U = Σ_r Σ_s B_rsE^{r,s}

is another linear operator on V, then the last lemma tells us that

TU = ( Σ_p Σ_q A_pqE^{p,q} )( Σ_r Σ_s B_rsE^{r,s} )
   = Σ_p Σ_q Σ_r Σ_s A_pqB_rsE^{p,q}E^{r,s}.

As we have noted, the only terms which survive in this huge sum are the terms where q = r, and since E^{p,q}E^{q,s} = E^{p,s}, we have

TU = Σ_p Σ_s ( Σ_r A_prB_rs ) E^{p,s} = Σ_p Σ_s (AB)_{ps}E^{p,s}.

Thus, the effect of composing T and U is to multiply the matrices A and B.
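Example 8 in code: composing the maps X ↦ AX and Y ↦ BY is the same as multiplying by BA. (A NumPy sketch with randomly chosen integer matrices; purely illustrative.)

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(3, 4))    # T(X) = A X : F^{4x1} -> F^{3x1}
B = rng.integers(-3, 4, size=(2, 3))    # U(Y) = B Y : F^{3x1} -> F^{2x1}
X = rng.integers(-3, 4, size=(4, 1))

# Example 8: (UT)(X) = B(AX) = (BA)X
print(np.array_equal(B @ (A @ X), (B @ A) @ X))    # True
```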
In our discussion of algebraic operations with linear transformations we have not yet said anything about invertibility. One specific question of interest is this. For which linear operators T on the space V does there exist a linear operator T⁻¹ such that TT⁻¹ = T⁻¹T = I?

The function T from V into W is called invertible if there exists a function U from W into V such that UT is the identity function on V and TU is the identity function on W. If T is invertible, the function U is unique and is denoted by T⁻¹. (See Appendix.) Furthermore, T is invertible if and only if

1. T is 1:1, that is, Tα = Tβ implies α = β;
2. T is onto, that is, the range of T is (all of) W.

Theorem 7. Let V and W be vector spaces over the field F and let T be a linear transformation from V into W. If T is invertible, then the inverse function T⁻¹ is a linear transformation from W onto V.

Proof. We repeat ourselves in order to underscore a point. When T is one-one and onto, there is a uniquely determined inverse function T⁻¹ which maps W onto V such that T⁻¹T is the identity function on V, and TT⁻¹ is the identity function on W. What we are proving here is that if a linear function T is invertible, then the inverse T⁻¹ is also linear.

Let β_1 and β_2 be vectors in W and let c be a scalar. We wish to show that

T⁻¹(cβ_1 + β_2) = cT⁻¹β_1 + T⁻¹β_2.

Let α_i = T⁻¹β_i, i = 1, 2, that is, let α_i be the unique vector in V such that Tα_i = β_i. Since T is linear,

T(cα_1 + α_2) = cTα_1 + Tα_2 = cβ_1 + β_2.

Thus cα_1 + α_2 is the unique vector in V which is sent by T into cβ_1 + β_2, and so

T⁻¹(cβ_1 + β_2) = cα_1 + α_2 = c(T⁻¹β_1) + T⁻¹β_2

and T⁻¹ is linear.

Suppose that we have an invertible linear transformation T from V onto W and an invertible linear transformation U from W onto Z. Then UT is invertible and (UT)⁻¹ = T⁻¹U⁻¹. That conclusion does not require the linearity nor does it involve checking separately that UT is 1:1 and onto. All it involves is verifying that T⁻¹U⁻¹ is both a left and a right inverse for UT.

If T is linear, then T(α − β) = Tα − Tβ; hence, Tα = Tβ if and only if T(α − β) = 0. This simplifies enormously the verification that T is 1:1. Let us call a linear transformation T non-singular if Tγ = 0 implies γ = 0, i.e., if the null space of T is {0}. Evidently, T is 1:1 if and only if T is non-singular. The extension of this remark is that non-singular linear transformations are those which preserve linear independence.

Theorem 8. Let T be a linear transformation from V into W. Then T is non-singular if and only if T carries each linearly independent subset of V onto a linearly independent subset of W.

Proof. First suppose that T is non-singular. Let S be a linearly independent subset of V. If α_1, ..., α_k are vectors in S, then the vectors Tα_1, ..., Tα_k are linearly independent; for if

c_1(Tα_1) + ⋯ + c_k(Tα_k) = 0

then

T(c_1α_1 + ⋯ + c_kα_k) = 0

and since T is non-singular

c_1α_1 + ⋯ + c_kα_k = 0

from which it follows that each c_i = 0 because S is an independent set. This argument shows that the image of S under T is independent.

Suppose that T carries independent subsets onto independent subsets. Let α be a non-zero vector in V. Then the set S consisting of the one vector α is independent. The image of S is the set consisting of the one vector Tα, and this set is independent. Therefore Tα ≠ 0, because the set consisting of the zero vector alone is dependent. This shows that the null space of T is the zero subspace, i.e., T is non-singular.

EXAMPLE 11. Let F be a subfield of the complex numbers (or a field of characteristic zero) and let V be the space of polynomial functions over F. Consider the differentiation operator D and the 'multiplication by x' operator T from Example 9. Since D sends all constants into 0, D is singular; however, V is not finite-dimensional, the range of D is all of V, and it is possible to define a right inverse for D. For example, if E is the indefinite integral operator:

E(c_0 + c_1x + ⋯ + c_nxⁿ) = c_0x + ½c_1x² + ⋯ + (1/(n+1))c_nx^{n+1}

then E is a linear operator on V and DE = I. On the other hand, ED ≠ I because ED sends the constants into 0. The operator T is in what we might call the reverse situation. If xf(x) = 0 for all x, then f = 0. Thus T is non-singular and it is possible to find a left inverse for T. For example, if U is the operation 'remove the constant term and divide by x':

U(c_0 + c_1x + ⋯ + c_nxⁿ) = c_1 + c_2x + ⋯ + c_nx^{n−1}

then U is a linear operator on V and UT = I. But TU ≠ I since every function in the range of TU is in the range of T, which is the space of polynomial functions f such that f(0) = 0.
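The operators D and E of Example 11 can be imitated on coefficient lists, with the same list representation used in the earlier sketch (again an illustrative addition, not part of the text):

```python
from fractions import Fraction

# Polynomials as coefficient lists [c0, c1, ..., cn].
def D(c):                      # differentiation
    return [k * a for k, a in enumerate(c)][1:] or [0]

def E(c):                      # indefinite integral (constant of integration 0)
    return [0] + [Fraction(a, k + 1) for k, a in enumerate(c)]

f = [3, 0, 2]                  # the polynomial 3 + 2x^2

print(D(E(f)) == f)            # True:  DE = I
print(E(D(f)) == f)            # False: ED loses the constant term; E(D(f)) is 2x^2
```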
EXAMPLE 12. Let F be a field and let T be the linear operator on F² defined by

T(x_1, x_2) = (x_1 + x_2, x_1).

Then T is non-singular, because if T(x_1, x_2) = 0 we have

x_1 + x_2 = 0
x_1 = 0

so that x_1 = x_2 = 0. We also see that T is onto; for, let (z_1, z_2) be any vector in F². To show that (z_1, z_2) is in the range of T we must find scalars x_1 and x_2 such that

x_1 + x_2 = z_1
x_1 = z_2

and the obvious solution is x_1 = z_2, x_2 = z_1 − z_2. This last computation gives us an explicit formula for T⁻¹, namely,

T⁻¹(z_1, z_2) = (z_2, z_1 − z_2).

We have seen in Example 11 that a linear transformation may be non-singular without being onto and may be onto without being non-singular. The present example illustrates an important case in which that cannot happen.

Theorem 9. Let V and W be finite-dimensional vector spaces over the field F such that dim V = dim W. If T is a linear transformation from V into W, the following are equivalent:

(i) T is invertible.
(ii) T is non-singular.
(iii) T is onto, that is, the range of T is W.

Proof. Let n = dim V = dim W. From Theorem 2 we know that

rank (T) + nullity (T) = n.

Now T is non-singular if and only if nullity (T) = 0, and (since n = dim W) the range of T is W if and only if rank (T) = n. Since the rank plus the nullity is n, the nullity is 0 precisely when the rank is n. Therefore T is non-singular if and only if T(V) = W. So, if either condition (ii) or (iii) holds, the other is satisfied as well and T is invertible.

We caution the reader not to apply Theorem 9 except in the presence of finite-dimensionality and with dim V = dim W. Under the hypotheses of Theorem 9, the conditions (i), (ii), and (iii) are also equivalent to these.

(iv) If {α_1, ..., α_n} is a basis for V, then {Tα_1, ..., Tα_n} is a basis for W.
(v) There is some basis {α_1, ..., α_n} for V such that {Tα_1, ..., Tα_n} is a basis for W.

We shall give a proof of the equivalence of the five conditions which contains a different proof that (i), (ii), and (iii) are equivalent.

(i) → (ii). If T is invertible, T is non-singular. (ii) → (iii). Suppose T is non-singular. Let {α_1, ..., α_n} be a basis for V. By Theorem 8, {Tα_1, ..., Tα_n} is a linearly independent set of vectors in W, and since the dimension of W is also n, this set of vectors is a basis for W. Now let β be any vector in W. There are scalars c_1, ..., c_n such that

β = c_1(Tα_1) + ⋯ + c_n(Tα_n) = T(c_1α_1 + ⋯ + c_nα_n)

which shows that β is in the range of T. (iii) → (iv). We now assume that T is onto. If {α_1, ..., α_n} is any basis for V, the vectors Tα_1, ..., Tα_n span the range of T, which is all of W by assumption. Since the dimension of W is n, these n vectors must be linearly independent, that is, must comprise a basis for W. (iv) → (v). This requires no comment. (v) → (i). Suppose there is some basis {α_1, ..., α_n} for V such that {Tα_1, ..., Tα_n} is a basis for W. Since the Tα_i span W, it is clear that the range of T is all of W. If α = c_1α_1 + ⋯ + c_nα_n is in the null space of T, then

T(c_1α_1 + ⋯ + c_nα_n) = 0

or

c_1(Tα_1) + ⋯ + c_n(Tα_n) = 0

and since the Tα_i are independent each c_i = 0, and thus α = 0. We have shown that the range of T is W, and that T is non-singular, hence T is invertible.

The set of invertible linear operators on a space V, with the operation of composition, provides a nice example of what is known in algebra as a 'group.' Although we shall not have time to discuss groups in any detail, we shall at least give the definition.

Definition. A group consists of the following.

1. A set G;
2. A rule (or operation) which associates with each pair of elements x, y in G an element xy in G in such a way that
(a) x(yz) = (xy)z, for all x, y, and z in G (associativity);
(b) there is an element e in G such that ex = xe = x, for every x in G;
(c) to each element x in G there corresponds an element x⁻¹ in G such that xx⁻¹ = x⁻¹x = e.

We have seen that composition (U, T) → UT associates with each pair of invertible linear operators on a space V another invertible operator on V. Composition is an associative operation. The identity operator I satisfies IT = TI for each T, and for an invertible T there is (by Theorem 7) an invertible linear operator T⁻¹ such that TT⁻¹ = T⁻¹T = I. Thus the set of invertible linear operators on V, together with this operation, is a group. The set of invertible n × n matrices with matrix multiplication as the operation is another example of a group. A group is called commutative if it satisfies the condition xy = yx for each x and y. The two examples we gave above are not commutative groups, in general. One often writes the operation in a commutative group as (x, y) → x + y, rather than (x, y) → xy, and then uses the symbol 0 for the 'identity' element e. The set of vectors in a vector space, together with the operation of vector addition, is a commutative group. A field can be described as a set with two operations, called addition and multiplication, which is a commutative group under addition, and in which the non-zero elements form a commutative group under multiplication, with the distributive law x(y + z) = xy + xz holding.

Exercises

1. Let T and U be the linear operators on R² defined by

T(x_1, x_2) = (x_2, x_1)   and   U(x_1, x_2) = (x_1, 0).

(a) How would you describe T and U geometrically?
(b) Give rules like the ones defining T and U for each of the transformations (U + T), UT, TU, T², U².

2. Let T be the (unique) linear operator on C³ for which

Tε_1 = (1, 0, i),   Tε_2 = (0, 1, 1),   Tε_3 = (i, 1, 0).

Is T invertible?

3. Let T be the linear operator on R³ defined by

T(x_1, x_2, x_3) = (3x_1, x_1 − x_2, 2x_1 + x_2 + x_3).

Is T invertible? If so, find a rule for T⁻¹ like the one which defines T.

4. For the linear operator T of Exercise 3, prove that

(T² − I)(T − 3I) = 0.

5. Let C^{2×2} be the complex vector space of 2 × 2 matrices with complex entries. Let

B = [  1  −1 ]
    [ −4   4 ]

and let T be the linear operator on C^{2×2} defined by T(A) = BA. What is the rank of T? Can you describe T²?

6. Let T be a linear transformation from R³ into R², and let U be a linear transformation from R² into R³. Prove that the transformation UT is not invertible. Generalize the theorem.
Page 92 :
84, , Linear, , Transformations, , 7. Find, , two linear, , operators, , Chap. 3, T and U on R2 such that, , TU = 0 but UT # 0., , 8. Let V be a vector space over the field F and T a linear operator on V. If T2 = 0,, what can you say about the relation of the range of T to the null space of T?, Give an example of a linear operator T on R2 such that T2 = 0 but T # 0., 9. Let T be a linear operator on the finite-dimensional, space V. Suppose there, and, is a linear operator U on V such that TU = I. Prove that T is invertible, U = T-1. Give an example which shows that this is false when V is not finite(Hint: Let T = D, the differentiation, operator on the space of polydimensional., nomial functions.), 10. Let A be an m X n matrix with entries in F and let T be the linear transformation from FnX1 into Fmxl defined by T(X) = AX. Show that if m < n it may, Similarly,, show that if m > n, happen that T is onto without being non-singular., but not onto., we may have T non-singular, vector space and let T be a linear operator on V., 11. Let V be a finite-dimensional, Suppose that rank (T*) = rank (T). Prove that the range and null space of T are, disjoint,, i.e., have only the zero vector in common., 12. Let p, m, and n, matrices over F and, matrix, and let T, T(A) = BA. Prove, m X m matrix., , 3.3., , be positive integers and F a field. Let V be the space of m X n, W the space of p X n matrices over F. Let B be a fixed p X m, be the linear transformation, from V into W defined by, that T is invertible, if and only if p = m and B is an invertible, , Isomorphism, If V and W are vector, transformation, T of V onto, If there exists an isomorphism, to w., Note that V is trivially, , spaces over the field F, any one-one, linear, W is called an isomorphism, of V onto, W., of V onto W, we say that V is isomorphic, , isomorphic, to V, the identity, operator, being, an isomorphism of V onto V. Also, if V is isomorphic to W via an isomorphism T, then W is isomorphic to V, because T-l is an isomorphism, of W onto V. The reader should find it easy to verify that if V is isomorphic to W and W is isomorphic to 2, then V is isomorphic to 2. Briefly,, isomorphism, is an equivalence relation on the class of vector spaces. If, there exists an isomorphism of V onto W, we may sometimes say that V, and W are isomorphic, rather than V is isomorphic to W. This will cause, no confusion because V is isomorphic to W if and only if W is isomorphic, to v., Theorem, 10. Every n-dimensional, morphic to the space F”., , vector space over the field F is iso-, , Proof. Let V be an n-dimensional, space over the field F and let, 63 = {al, . . . ) cr,} be an ordered basis for V. We define a function, T
from V into F^n as follows: If α is in V, let Tα be the n-tuple (x1, . . . , xn) of coordinates of α relative to the ordered basis ℬ, i.e., the n-tuple such that

α = x1α1 + · · · + xnαn.

In our discussion of coordinates in Chapter 2, we verified that this T is linear, one-one, and maps V onto F^n. ∎

For many purposes one often regards isomorphic vector spaces as being 'the same,' although the vectors and operations in the spaces may be quite different, that is, one often identifies isomorphic spaces. We shall not attempt a lengthy discussion of this idea at present but shall let the understanding of isomorphism and the sense in which isomorphic spaces are 'the same' grow as we continue our study of vector spaces.

We shall make a few brief comments. Suppose T is an isomorphism of V onto W. If S is a subset of V, then Theorem 8 tells us that S is linearly independent if and only if the set T(S) in W is independent. Thus in deciding whether S is independent it doesn't matter whether we look at S or T(S). From this one sees that an isomorphism is 'dimension preserving,' that is, any finite-dimensional subspace of V has the same dimension as its image under T. Here is a very simple illustration of this idea. Suppose A is an m × n matrix over the field F. We have really given two definitions of the solution space of the matrix A. The first is the set of all n-tuples (x1, . . . , xn) in F^n which satisfy each of the equations in the system AX = 0. The second is the set of all n × 1 column matrices X such that AX = 0. The first solution space is thus a subspace of F^n and the second is a subspace of the space of all n × 1 matrices over F. Now there is a completely obvious isomorphism between F^n and F^{n×1}, namely,

(x1, . . . , xn)  →
[ x1 ]
[ ⋮  ]
[ xn ]

Under this isomorphism, the first solution space of A is carried onto the second solution space. These spaces have the same dimension, and so if we want to prove a theorem about the dimension of the solution space, it is immaterial which space we choose to discuss. In fact, the reader would probably not balk if we chose to identify F^n and the space of n × 1 matrices. We may do this when it is convenient, and when it is not convenient we shall not.
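The identification of F^n with the space of n × 1 matrices is easy to see in coordinates. The following short computational sketch (an editorial illustration in Python, not part of the original text; the matrix and vector are chosen arbitrarily) checks that an n-tuple solves the system AX = 0 exactly when the corresponding column matrix does.

```python
import numpy as np

# A small numerical sketch of the isomorphism between n-tuples and n x 1 column matrices.
A = np.array([[1.0, 2.0, -1.0],
              [2.0, 4.0, -2.0]])            # a 2 x 3 matrix, so n = 3

x_tuple = (1.0, 0.0, 1.0)                    # an n-tuple solution of the system AX = 0
X_col = np.array(x_tuple).reshape(3, 1)      # its image under (x1, ..., xn) -> column matrix

# The tuple satisfies every equation of the system iff the column matrix satisfies AX = 0.
print(A @ X_col)                             # the 2 x 1 zero matrix
```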
Exercises

1. Let V be the set of complex numbers and let F be the field of real numbers. With the usual operations, V is a vector space over F. Describe explicitly an isomorphism of this space onto R².

2. Let V be a vector space over the field of complex numbers, and suppose there is an isomorphism T of V onto C³. Let α1, α2, α3, α4 be vectors in V such that

Tα1 = (1, 0, i),        Tα2 = (−2, 1 + i, 0),
Tα3 = (−1, 1, 1),       Tα4 = (√2, i, 3).

(a) Is α1 in the subspace spanned by α2 and α3?
(b) Let W1 be the subspace spanned by α1 and α2, and let W2 be the subspace spanned by α3 and α4. What is the intersection of W1 and W2?
(c) Find a basis for the subspace of V spanned by the four vectors αj.

3. Let W be the set of all 2 × 2 complex Hermitian matrices, that is, the set of 2 × 2 complex matrices A such that A_{jk} = Ā_{kj} (the bar denoting complex conjugation). As we pointed out in Example 6 of Chapter 2, W is a vector space over the field of real numbers, under the usual operations. Verify that

(x, y, z, t)  →
[ t + x    y + iz ]
[ y − iz   t − x  ]

is an isomorphism of R⁴ onto W.

4. Show that F^{m×n} is isomorphic to F^{mn}.

5. Let V be the set of complex numbers regarded as a vector space over the field of real numbers (Exercise 1). We define a function T from V into the space of 2 × 2 real matrices, as follows. If z = x + iy with x and y real numbers, then

T(z) =
[ x + 7y    5y    ]
[ −10y      x − 7y ]

(a) Verify that T is a one-one (real) linear transformation of V into the space of 2 × 2 real matrices.
(b) Verify that T(z1z2) = T(z1)T(z2).
(c) How would you describe the range of T?

6. Let V and W be finite-dimensional vector spaces over the field F. Prove that V and W are isomorphic if and only if dim V = dim W.

7. Let V and W be vector spaces over the field F and let U be an isomorphism of V onto W. Prove that T → UTU⁻¹ is an isomorphism of L(V, V) onto L(W, W).

3.4. Representation of Transformations by Matrices

Let V be an n-dimensional vector space over the field F and let W be an m-dimensional vector space over F. Let ℬ = {α1, . . . , αn} be an ordered basis for V and ℬ' = {β1, . . . , βm} an ordered basis for W. If T is any linear transformation from V into W, then T is determined by its action on the vectors αj. Each of the n vectors Tαj is uniquely expressible as a linear combination

(3-3)   Tαj = Σ_{i=1}^m A_{ij} βi
of the βi, the scalars A_{1j}, . . . , A_{mj} being the coordinates of Tαj in the ordered basis ℬ'. Accordingly, the transformation T is determined by the mn scalars A_{ij} via the formulas (3-3). The m × n matrix A defined by A(i, j) = A_{ij} is called the matrix of T relative to the pair of ordered bases ℬ and ℬ'. Our immediate task is to understand explicitly how the matrix A determines the linear transformation T.

If α = x1α1 + · · · + xnαn is a vector in V, then

Tα = T( Σ_{j=1}^n xj αj )
   = Σ_{j=1}^n xj (Tαj)
   = Σ_{j=1}^n xj Σ_{i=1}^m A_{ij} βi
   = Σ_{i=1}^m ( Σ_{j=1}^n A_{ij} xj ) βi.

If X is the coordinate matrix of α in the ordered basis ℬ, then the computation above shows that AX is the coordinate matrix of the vector Tα in the ordered basis ℬ', because the scalar

Σ_{j=1}^n A_{ij} xj

is the entry in the ith row of the column matrix AX. Let us also observe that if A is any m × n matrix over the field F, then

(3-4)   T( Σ_{j=1}^n xj αj ) = Σ_{i=1}^m ( Σ_{j=1}^n A_{ij} xj ) βi

defines a linear transformation T from V into W, the matrix of which is A, relative to ℬ, ℬ'. We summarize formally:

Theorem 11. Let V be an n-dimensional vector space over the field F and W an m-dimensional vector space over F. Let ℬ be an ordered basis for V and ℬ' an ordered basis for W. For each linear transformation T from V into W, there is an m × n matrix A with entries in F such that

[Tα]_{ℬ'} = A[α]_ℬ

for every vector α in V. Furthermore, T → A is a one-one correspondence between the set of all linear transformations from V into W and the set of all m × n matrices over the field F.

The matrix A which is associated with T in Theorem 11 is called the matrix of T relative to the ordered bases ℬ, ℬ'. Note that Equation (3-3) says that A is the matrix whose columns A1, . . . , An are given by

Aj = [Tαj]_{ℬ'},   j = 1, . . . , n.
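As a concrete illustration of Theorem 11 and of the remark above, the following Python sketch (an editorial addition; the map and the bases are invented for the example) builds the matrix of a transformation column by column, the jth column being the ℬ'-coordinates of Tαj.

```python
import numpy as np

def T(v):
    """A sample linear map from R^3 into R^2 (chosen for illustration only)."""
    x1, x2, x3 = v
    return np.array([x1 - x3, x2])

# An ordered basis B of V = R^3 and an ordered basis B' of W = R^2 (its vectors are the columns of Bp).
B  = [np.array([1., 0., -1.]), np.array([1., 1., 1.]), np.array([1., 0., 0.])]
Bp = np.column_stack([np.array([0., 1.]), np.array([1., 0.])])

# The B'-coordinates of a vector w solve Bp @ c = w; the j-th column of A is [T(alpha_j)]_{B'}.
A = np.column_stack([np.linalg.solve(Bp, T(alpha)) for alpha in B])
print(A)   # the 2 x 3 matrix of T relative to the pair B, B'
```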
If U is another linear transformation from V into W and B = [B1, . . . , Bn] is the matrix of U relative to the ordered bases ℬ, ℬ', then cA + B is the matrix of cT + U relative to ℬ, ℬ'. That is clear because

cAj + Bj = c[Tαj]_{ℬ'} + [Uαj]_{ℬ'}
         = [cTαj + Uαj]_{ℬ'}
         = [(cT + U)αj]_{ℬ'}.

Theorem 12. Let V be an n-dimensional vector space over the field F and let W be an m-dimensional vector space over F. For each pair of ordered bases ℬ, ℬ' for V and W respectively, the function which assigns to a linear transformation T its matrix relative to ℬ, ℬ' is an isomorphism between the space L(V, W) and the space of all m × n matrices over the field F.

Proof. We observed above that the function in question is linear, and as stated in Theorem 11, this function is one-one and maps L(V, W) onto the set of m × n matrices. ∎

We shall be particularly interested in the representation by matrices of linear transformations of a space into itself, i.e., linear operators on a space V. In this case it is most convenient to use the same ordered basis in each case, that is, to take ℬ = ℬ'. We shall then call the representing matrix simply the matrix of T relative to the ordered basis ℬ. Since this concept will be so important to us, we shall review its definition. If T is a linear operator on the finite-dimensional vector space V and ℬ = {α1, . . . , αn} is an ordered basis for V, the matrix of T relative to ℬ (or, the matrix of T in the ordered basis ℬ) is the n × n matrix A whose entries A_{ij} are defined by the equations

(3-5)   Tαj = Σ_{i=1}^n A_{ij} αi,   j = 1, . . . , n.

One must always remember that this matrix representing T depends upon the ordered basis ℬ, and that there is a representing matrix for T in each ordered basis for V. (For transformations of one space into another the matrix depends upon two ordered bases, one for V and one for W.) In order that we shall not forget this dependence, we shall use the notation

[T]_ℬ

for the matrix of the linear operator T in the ordered basis ℬ. The manner in which this matrix and the ordered basis describe T is that for each α in V

[Tα]_ℬ = [T]_ℬ [α]_ℬ.

EXAMPLE 13. Let V be the space of n × 1 column matrices over the field F; let W be the space of m × 1 matrices over F; and let A be a fixed m × n matrix over F. Let T be the linear transformation of V into W defined by T(X) = AX. Let ℬ be the ordered basis for V analogous to the
standard basis in F^n, i.e., the ith vector in ℬ is the n × 1 matrix Xi with a 1 in row i and all other entries 0. Let ℬ' be the corresponding ordered basis for W, i.e., the jth vector in ℬ' is the m × 1 matrix Yj with a 1 in row j and all other entries 0. Then the matrix of T relative to the pair ℬ, ℬ' is the matrix A itself. This is clear because the matrix AXj is the jth column of A.

EXAMPLE 14. Let F be a field and let T be the operator on F² defined by

T(x1, x2) = (x1, 0).

It is easy to see that T is a linear operator on F². Let ℬ be the standard ordered basis for F², ℬ = {ε1, ε2}. Now

Tε1 = T(1, 0) = (1, 0) = 1ε1 + 0ε2
Tε2 = T(0, 1) = (0, 0) = 0ε1 + 0ε2

so the matrix of T in the ordered basis ℬ is

[T]_ℬ =
[ 1  0 ]
[ 0  0 ]

EXAMPLE 15. Let V be the space of all polynomial functions from R into R of the form

f(x) = c0 + c1x + c2x² + c3x³

that is, the space of polynomial functions of degree three or less. The differentiation operator D of Example 2 maps V into V, since D is 'degree decreasing.' Let ℬ be the ordered basis for V consisting of the four functions f1, f2, f3, f4 defined by fj(x) = x^{j−1}. Then

(Df1)(x) = 0,      Df1 = 0f1 + 0f2 + 0f3 + 0f4
(Df2)(x) = 1,      Df2 = 1f1 + 0f2 + 0f3 + 0f4
(Df3)(x) = 2x,     Df3 = 0f1 + 2f2 + 0f3 + 0f4
(Df4)(x) = 3x²,    Df4 = 0f1 + 0f2 + 3f3 + 0f4

so that the matrix of D in the ordered basis ℬ is

[D]_ℬ =
[ 0  1  0  0 ]
[ 0  0  2  0 ]
[ 0  0  0  3 ]
[ 0  0  0  0 ]

We have seen what happens to representing matrices when transformations are added, namely, that the matrices add. We should now like to ask what happens when we compose transformations. More specifically, let V, W, and Z be vector spaces over the field F of respective dimensions n, m, and p. Let T be a linear transformation from V into W and U a linear transformation from W into Z. Suppose we have ordered bases

ℬ = {α1, . . . , αn},   ℬ' = {β1, . . . , βm},   ℬ'' = {γ1, . . . , γp}
for the respective spaces V, W, and Z. Let A be the matrix of T relative to the pair ℬ, ℬ' and let B be the matrix of U relative to the pair ℬ', ℬ''. It is then easy to see that the matrix C of the transformation UT relative to the pair ℬ, ℬ'' is the product of B and A; for, if α is any vector in V,

[Tα]_{ℬ'} = A[α]_ℬ
[U(Tα)]_{ℬ''} = B[Tα]_{ℬ'}

and so

[(UT)(α)]_{ℬ''} = BA[α]_ℬ

and hence, by the definition and uniqueness of the representing matrix, we must have C = BA. One can also see this by carrying out the computation

(UT)(αj) = U(Tαj)
         = U( Σ_{k=1}^m A_{kj} βk )
         = Σ_{k=1}^m A_{kj} (Uβk)
         = Σ_{k=1}^m A_{kj} Σ_{i=1}^p B_{ik} γi
         = Σ_{i=1}^p ( Σ_{k=1}^m B_{ik} A_{kj} ) γi

so that we must have

(3-6)   C_{ij} = Σ_{k=1}^m B_{ik} A_{kj}.

We motivated the definition (3-6) of matrix multiplication via operations on the rows of a matrix. One sees here that a very strong motivation for the definition is to be found in composing linear transformations. Let us summarize formally.

Theorem 13. Let V, W, and Z be finite-dimensional vector spaces over the field F; let T be a linear transformation from V into W and U a linear transformation from W into Z. If ℬ, ℬ', and ℬ'' are ordered bases for the spaces V, W, and Z, respectively, if A is the matrix of T relative to the pair ℬ, ℬ', and B is the matrix of U relative to the pair ℬ', ℬ'', then the matrix of the composition UT relative to the pair ℬ, ℬ'' is the product matrix C = BA.

We remark that Theorem 13 gives a proof that matrix multiplication is associative, a proof which requires no calculations and is independent of the proof we gave in Chapter 1. We should also point out that we proved a special case of Theorem 13 in Example 12.

It is important to note that if T and U are linear operators on a space V and we are representing by a single ordered basis ℬ, then Theorem 13 assumes the simple form [UT]_ℬ = [U]_ℬ[T]_ℬ. Thus in this case, the
correspondence which ℬ determines between operators and matrices is not only a vector space isomorphism but also preserves products. A simple consequence of this is that the linear operator T is invertible if and only if [T]_ℬ is an invertible matrix. For, the identity operator I is represented by the identity matrix in any ordered basis, and thus

UT = TU = I

is equivalent to

[U]_ℬ[T]_ℬ = [T]_ℬ[U]_ℬ = I.

Of course, when T is invertible,

[T⁻¹]_ℬ = [T]_ℬ⁻¹.

Now we should like to inquire what happens to representing matrices when the ordered basis is changed. For the sake of simplicity, we shall consider this question only for linear operators on a space V, so that we can use a single ordered basis. The specific question is this. Let T be a linear operator on the finite-dimensional space V, and let

ℬ = {α1, . . . , αn}   and   ℬ' = {α'1, . . . , α'n}

be two ordered bases for V. How are the matrices [T]_ℬ and [T]_{ℬ'} related? As we observed in Chapter 2, there is a unique (invertible) n × n matrix P such that

(3-7)   [α]_ℬ = P[α]_{ℬ'}

for every vector α in V. It is the matrix P = [P1, . . . , Pn] where Pj = [α'j]_ℬ. By definition

(3-8)   [Tα]_ℬ = [T]_ℬ[α]_ℬ.

Applying (3-7) to the vector Tα, we have

(3-9)   [Tα]_ℬ = P[Tα]_{ℬ'}.

Combining (3-7), (3-8), and (3-9), we obtain

[T]_ℬ P[α]_{ℬ'} = P[Tα]_{ℬ'}

or

P⁻¹[T]_ℬ P[α]_{ℬ'} = [Tα]_{ℬ'}

and so it must be that

(3-10)   [T]_{ℬ'} = P⁻¹[T]_ℬ P.

This answers our question.

Before stating this result formally, let us observe the following. There is a unique linear operator U which carries ℬ onto ℬ', defined by

Uαj = α'j,   j = 1, . . . , n.

This operator U is invertible since it carries a basis for V onto a basis for
V. The matrix P (above) is precisely the matrix of the operator U in the ordered basis ℬ. For, P is defined by

α'j = Σ_{i=1}^n P_{ij} αi

and since Uαj = α'j, this equation can be written

Uαj = Σ_{i=1}^n P_{ij} αi.

So P = [U]_ℬ, by definition.

Theorem 14. Let V be a finite-dimensional vector space over the field F, and let

ℬ = {α1, . . . , αn}   and   ℬ' = {α'1, . . . , α'n}

be ordered bases for V. Suppose T is a linear operator on V. If P = [P1, . . . , Pn] is the n × n matrix with columns Pj = [α'j]_ℬ, then

[T]_{ℬ'} = P⁻¹[T]_ℬ P.

Alternatively, if U is the invertible operator on V defined by Uαj = α'j, j = 1, . . . , n, then

[T]_{ℬ'} = [U]_ℬ⁻¹ [T]_ℬ [U]_ℬ.

EXAMPLE 16. Let T be the linear operator on R² defined by T(x1, x2) = (x1, 0). In Example 14 we showed that the matrix of T in the standard ordered basis ℬ = {ε1, ε2} is

[T]_ℬ =
[ 1  0 ]
[ 0  0 ]

Suppose ℬ' is the ordered basis for R² consisting of the vectors ε'1 = (1, 1), ε'2 = (2, 1). Then

ε'1 = ε1 + ε2
ε'2 = 2ε1 + ε2

so that P is the matrix

P =
[ 1  2 ]
[ 1  1 ]

By a short computation,

P⁻¹ =
[ −1   2 ]
[  1  −1 ]

Thus

[T]_{ℬ'} = P⁻¹[T]_ℬ P
         = [ −1 2 ; 1 −1 ] [ 1 0 ; 0 0 ] [ 1 2 ; 1 1 ]
         = [ −1 2 ; 1 −1 ] [ 1 2 ; 0 0 ]
         = [ −1 −2 ; 1 2 ].
We can easily check that this is correct because

Tε'1 = (1, 0) = −ε'1 + ε'2
Tε'2 = (2, 0) = −2ε'1 + 2ε'2.

EXAMPLE 17. Let V be the space of polynomial functions from R into R which have 'degree' less than or equal to 3. As in Example 15, let D be the differentiation operator on V, and let

ℬ = {f1, f2, f3, f4}

be the ordered basis for V defined by fj(x) = x^{j−1}. Let t be a real number and define gj(x) = (x + t)^{j−1}, that is

g1 = f1
g2 = tf1 + f2
g3 = t²f1 + 2tf2 + f3
g4 = t³f1 + 3t²f2 + 3tf3 + f4.

Since the matrix

P =
[ 1  t  t²  t³  ]
[ 0  1  2t  3t² ]
[ 0  0  1   3t  ]
[ 0  0  0   1   ]

is easily seen to be invertible with

P⁻¹ =
[ 1  −t  t²   −t³  ]
[ 0   1  −2t   3t² ]
[ 0   0   1   −3t  ]
[ 0   0   0    1   ]

it follows that ℬ' = {g1, g2, g3, g4} is an ordered basis for V. In Example 15, we found that the matrix of D in the ordered basis ℬ is

[D]_ℬ =
[ 0  1  0  0 ]
[ 0  0  2  0 ]
[ 0  0  0  3 ]
[ 0  0  0  0 ]

The matrix of D in the ordered basis ℬ' is thus

P⁻¹[D]_ℬ P =
[ 0  1  0  0 ]
[ 0  0  2  0 ]
[ 0  0  0  3 ]
[ 0  0  0  0 ]
Thus D is represented by the same matrix in the ordered bases ℬ and ℬ'. Of course, one can see this somewhat more directly since

Dg1 = 0
Dg2 = g1
Dg3 = 2g2
Dg4 = 3g3.

This example illustrates a good point. If one knows the matrix of a linear operator in some ordered basis ℬ and wishes to find the matrix in another ordered basis ℬ', it is often most convenient to perform the coordinate change using the invertible matrix P; however, it may be a much simpler task to find the representing matrix by a direct appeal to its definition.

Definition. Let A and B be n × n (square) matrices over the field F. We say that B is similar to A over F if there is an invertible n × n matrix P over F such that B = P⁻¹AP.

According to Theorem 14, we have the following: If V is an n-dimensional vector space over F and ℬ and ℬ' are two ordered bases for V, then for each linear operator T on V the matrix B = [T]_{ℬ'} is similar to the matrix A = [T]_ℬ. The argument also goes in the other direction. Suppose A and B are n × n matrices and that B is similar to A. Let V be any n-dimensional space over F and let ℬ be an ordered basis for V. Let T be the linear operator on V which is represented in the basis ℬ by A. If B = P⁻¹AP, let ℬ' be the ordered basis for V obtained from ℬ by P, i.e.,

α'j = Σ_{i=1}^n P_{ij} αi.

Then the matrix of T in the ordered basis ℬ' will be B.

Thus the statement that B is similar to A means that on each n-dimensional space over F the matrices A and B represent the same linear transformation in two (possibly) different ordered bases.

Note that each n × n matrix A is similar to itself, using P = I; if B is similar to A, then A is similar to B, for B = P⁻¹AP implies that A = (P⁻¹)⁻¹BP⁻¹; if B is similar to A and C is similar to B, then C is similar to A, for B = P⁻¹AP and C = Q⁻¹BQ imply that C = (PQ)⁻¹A(PQ). Thus similarity is an equivalence relation on the set of n × n matrices over the field F. Also note that the only matrix similar to the identity matrix I is I itself, and that the only matrix similar to the zero matrix is the zero matrix itself.
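The following short Python check (an editorial illustration using the data of Example 16, not part of the original text) verifies numerically that the two representing matrices are similar via the change-of-basis matrix P.

```python
import numpy as np

# Quick numerical check of Example 16: the matrices representing T(x1, x2) = (x1, 0)
# in the standard basis and in the basis {(1, 1), (2, 1)} are similar via P.
A = np.array([[1, 0],
              [0, 0]])            # [T] in the standard ordered basis
P = np.array([[1, 2],
              [1, 1]])            # columns are the new basis vectors in standard coordinates

B = np.linalg.inv(P) @ A @ P      # [T] in the new ordered basis
print(B)                          # [[-1. -2.], [ 1.  2.]], as computed above
```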
Representation, , Sec. 3.4, , of Transformations, , by Matrices, , Exercises, 1. Let T be the linear operator, the standard ordered basis for C2, by czi = (1, i), (~2 = (-i, 2)., (a) What is the matrix of T, (b) What is the matrix of T, (c) What is the matrix of T, (d) What is the matrix of T, 2. Let T be the linear, , on C2 defined by T(Q, x2) = (x1, 0). Let @ be, and let OS’ = {(pi, 02) be the ordered basis defined, relative to the, relative to the, in the ordered, in the ordered, , transformation, T(q, , pair, pair, basis, basis, , @, a’?, a’, a?, (I?/?, {LYE,cur}?, , from R3 into R2 defined, , by, , 22, 23) = (21 + xz, 223 - 21)., , (a) If OS is the standard ordered basis for R3 and a3’ is the standard, basis for R2, what is the matrix of T relative to the pair 03, OS’?, (b) If & = {czi, CQ, as} and OY = {pi, &}, where, a1 = (1, 0, -1),, what is the matrix, , a2, , = (1, 1, l),, , of T relative, , (Y3 = (1, 0, O),, , P1 = (0, l),, , P2, , ordered, , = (LO), , to the pair OS, a3’?, , 3. Let T be a linear operator on F”, let A be the matrix of T in the standard, ordered basis for Z+, and let W be the subspace of Fn spanned by the column, vectors of A. What does 1Y have to do with T?, 4. Let I’ be a two-dimensional, vector space over the field, ordered basis for V. If T is a linear operator on V and, , prove that T2 -, , (a + d)T + (ad -, , 5. Let T be the linear, basis is, , operator, , bc)Z = 0., , on R3, the matrix, , A=, , F, and let a3 be an, , 1, 0, , of which in the standard, , ordered, , 2 1, 11., 3 4, , [ 1, -1, , Find a basis for the range of T and a basis for the null space of T., 6. Let T be the linear, , operator, , on R2 defined, T(zl,, , (a) What, (b) What, and (Y~ = (1,, (c) Prove, (d) Prove, , (-x2,4., , is the matrix of T in the standard ordered basis for R2?, is the matrix of T in the ordered basis 6S = {LY~,a2}, where CYI = (1,2), -l)?, that for every real number c the operator (T - cZ) is invertible., that if OSis any ordered basis for R2 and [T]m = A, then A12A21 # 0., , 7. Let T be the linear, T(xI,, (a) What, , zz) =, , by, , x2,, , operator, , on R3 defined, , x3) = (321 + x3, -221, , is the matrix, , +, , of T in the standard, , by, x2,, , -x1, , +, , ordered, , 2x2, , +, , 4x3)., , basis for R3?, , 95
96, , Linear, , Transformations, , (b) What, , Chap. 3, , is the matrix, , of T in the ordered, , basis, , +-a, a2, a31, where LYI = (1, 0, l), a2 = (- 1, 2, l), and (Y~ = (2, 1, I)?, (c) Prove that T is invertible, and give a rule for T-1 like the one which, fines T., 8. Let 6 be a real number., Prove, over the field of complex numbers:, [, , ;?J;, , -;;I’, , that, , the following, , [r, , two matrices, , de-, , are similar, , ?.I, , (Hint: Let T be the linear operator on C2 which is represented by the first matrix, in the standard ordered basis. Then find vectors crl and (Y~ such that Tcvl = eiecyl,, Tm = e-%, and {CYI, (~2) is a basis.), 9. Let V be a finite-dimensional, vector space over the field F and let S and T, be linear operators on V. We ask: When do there exist ordered bases @ and a, for V such that [&‘]a = [T](B~? Prove that such bases exist if and only if there is, an invertible, linear operator, U on V such that T = USU-1. (Outline of proof:, If [S]aa = [T]abt, let U be the operator which carries B onto a’ and show that, S = UTP., Conversely,, if T = USP, for some invertible, U, let (I?, be any, ordered basis for V and let a3’ be its image under U. Then show that [S]a = [T]~J.), 10. We have seen that the linear operator T on R2 defined, is represented in the standard ordered basis by the matrix, , by T(q, x2) = (x1, 0), , This operator satisfies T2 = T. Prove that if S is a linear operator on K? such that, X2 = S, then S = 0, or S = I, or there is an ordered basis G3 for R2 such that, [S]a = A (above)., 11. Let W be the space of all n X 1 column matrices over a field F. If A is an, over F, then A defines a linear operator La on W through, left, n X n matrix, multiplication:, LA(X) = AX. Prove that every linear operator on W is left multiplication, by some n X n matrix, i.e., is LA for some A., vector space over the field F, and let (R, Now suppose V is an n-dimensional, be an ordered basis for V. For each (Y in V, define Ua = [a]~. Prove that U is an, isomorphism, of V onto W. If T is a linear operator on V, then UTU-1 is a linear, UTU-l is left multiplication, by some n X n matrix A., operator on IV. Accordingly,, What is A?, 12. Let V be an n-dimensional, vector space over the field F, and let @ =, +I, . * . , a,,} be an ordered basis for V., (a) According to Theorem 1, there is a unique linear operator T on V such that, t, TCY~= aj+l,, j = 1,. . .) 12 - 1 t, Tcu, = 0., is the matrix A of T in the ordered basis a?, (b) Prove that T* = 0 but Tnml # 0., (c) Let S be any linear operator on V such that Sn = 0 but Sri-l # 0. Prove, that there is an ordered basis 6~’ for V such that the matrix of S in the ordered, basis E.V is the matrix A of part (a)., , What
(d) Prove that if M and N are n × n matrices over F such that Mⁿ = Nⁿ = 0 but M^{n−1} ≠ 0 ≠ N^{n−1}, then M and N are similar.

13. Let V and W be finite-dimensional vector spaces over the field F and let T be a linear transformation from V into W. If

ℬ = {α1, . . . , αn}   and   ℬ' = {β1, . . . , βm}

are ordered bases for V and W, respectively, define the linear transformations E^{p,q} as in the proof of Theorem 5: E^{p,q}(αj) = δ_{jq} βp. Then the E^{p,q}, 1 ≤ p ≤ m, 1 ≤ q ≤ n, form a basis for L(V, W), and so

T = Σ_{p=1}^m Σ_{q=1}^n A_{pq} E^{p,q}

for certain scalars A_{pq} (the coordinates of T in this basis for L(V, W)). Show that the matrix A with entries A(p, q) = A_{pq} is precisely the matrix of T relative to the pair ℬ, ℬ'.

3.5. Linear Functionals

If V is a vector space over the field F, a linear transformation f from V into the scalar field F is also called a linear functional on V. If we start from scratch, this means that f is a function from V into F such that

f(cα + β) = cf(α) + f(β)

for all vectors α and β in V and all scalars c in F. The concept of linear functional is important in the study of finite-dimensional spaces because it helps to organize and clarify the discussion of subspaces, linear equations, and coordinates.

EXAMPLE 18. Let F be a field and let a1, . . . , an be scalars in F. Define a function f on F^n by

f(x1, . . . , xn) = a1x1 + · · · + anxn.

Then f is a linear functional on F^n. It is the linear functional which is represented by the matrix [a1 · · · an] relative to the standard ordered basis for F^n and the basis {1} for F:

aj = f(εj),   j = 1, . . . , n.

Every linear functional on F^n is of this form, for some scalars a1, . . . , an. That is immediate from the definition of linear functional because we define aj = f(εj) and use the linearity

f(x1, . . . , xn) = f( Σ_j xj εj )
                 = Σ_j xj f(εj)
                 = Σ_j aj xj.
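In coordinates, Example 18 says that a linear functional is computed by a single row of scalars. The sketch below (an editorial illustration with arbitrarily chosen numbers, not part of the original text) evaluates such a functional on R³ and checks its linearity.

```python
import numpy as np

# A linear functional on R^3 is represented by the row [a1 a2 a3];
# its value at (x1, x2, x3) is a1*x1 + a2*x2 + a3*x3.
a = np.array([2.0, -1.0, 3.0])            # represents f(x1, x2, x3) = 2x1 - x2 + 3x3
x = np.array([1.0, 4.0, 0.0])
y = np.array([2.0, 1.0, -1.0])

print(a @ x)                               # f(x) = -2.0
print(a @ (5*x + y) == 5*(a @ x) + a @ y)  # linearity of f: True
```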
EXAMPLE 19. Here is an important example of a linear functional. Let n be a positive integer and F a field. If A is an n × n matrix with entries in F, the trace of A is the scalar

tr A = A_{11} + A_{22} + · · · + A_{nn}.

The trace function is a linear functional on the matrix space F^{n×n} because

tr (cA + B) = Σ_{i=1}^n (cA_{ii} + B_{ii})
            = c Σ_{i=1}^n A_{ii} + Σ_{i=1}^n B_{ii}
            = c tr A + tr B.

EXAMPLE 20. Let V be the space of all polynomial functions from the field F into itself. Let t be an element of F. If we define

L_t(p) = p(t)

then L_t is a linear functional on V. One usually describes this by saying that, for each t, 'evaluation at t' is a linear functional on the space of polynomial functions. Perhaps we should remark that the fact that the functions are polynomials plays no role in this example. Evaluation at t is a linear functional on the space of all functions from F into F.

EXAMPLE 21. This may be the most important linear functional in mathematics. Let [a, b] be a closed interval on the real line and let C([a, b]) be the space of continuous real-valued functions on [a, b]. Then

L(g) = ∫_a^b g(t) dt

defines a linear functional L on C([a, b]).

If V is a vector space, the collection of all linear functionals on V forms a vector space in a natural way. It is the space L(V, F). We denote this space by V* and call it the dual space of V:

V* = L(V, F).

If V is finite-dimensional, we can obtain a rather explicit description of the dual space V*. From Theorem 5 we know something about the space V*, namely that

dim V* = dim V.

Let ℬ = {α1, . . . , αn} be a basis for V. According to Theorem 1, there is (for each i) a unique linear functional fi on V such that

(3-11)   fi(αj) = δ_{ij}.

In this way we obtain from ℬ a set of n distinct linear functionals f1, . . . , fn on V. These functionals are also linearly independent. For, suppose
(3-12)   f = Σ_{i=1}^n ci fi.

Then

f(αj) = Σ_{i=1}^n ci fi(αj)
      = Σ_{i=1}^n ci δ_{ij}
      = cj.

In particular, if f is the zero functional, f(αj) = 0 for each j and hence the scalars cj are all 0. Now f1, . . . , fn are n linearly independent functionals, and since we know that V* has dimension n, it must be that ℬ* = {f1, . . . , fn} is a basis for V*. This basis is called the dual basis of ℬ.

Theorem 15. Let V be a finite-dimensional vector space over the field F, and let ℬ = {α1, . . . , αn} be a basis for V. Then there is a unique dual basis ℬ* = {f1, . . . , fn} for V* such that fi(αj) = δ_{ij}. For each linear functional f on V we have

(3-13)   f = Σ_{i=1}^n f(αi) fi

and for each vector α in V we have

(3-14)   α = Σ_{i=1}^n fi(α) αi.

Proof. We have shown above that there is a unique basis which is 'dual' to ℬ. If f is a linear functional on V, then f is some linear combination (3-12) of the fi, and as we observed after (3-12) the scalars cj must be given by cj = f(αj). Similarly, if

α = Σ_{i=1}^n xi αi

is a vector in V, then

fj(α) = Σ_{i=1}^n xi fj(αi)
      = Σ_{i=1}^n xi δ_{ij}
      = xj

so that the unique expression for α as a linear combination of the αi is

α = Σ_{i=1}^n fi(α) αi. ∎
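Numerically, the dual basis can be read off from a matrix inverse. In the sketch below (an editorial illustration, not part of the original text; the basis of R³ is chosen arbitrarily), the basis vectors are the columns of M, and the rows of M⁻¹ represent the dual functionals, since the (i, j) entry of M⁻¹M is fi(αj).

```python
import numpy as np

# Columns of M are the basis vectors alpha_1, alpha_2, alpha_3 of R^3.
M = np.column_stack([[1, 0, -1], [0, 1, -2], [-1, -1, 0]]).astype(float)
F = np.linalg.inv(M)                      # row i of F represents the functional f_i

print(np.round(F @ M, 10))                # the identity matrix: f_i(alpha_j) = delta_ij
print(F[0] @ np.array([2., 3., -1.]))     # value of f_1 at an arbitrary vector
```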
Equation (3-14) provides us with a nice way of describing what the dual basis is. It says, if ℬ = {α1, . . . , αn} is an ordered basis for V and ℬ* = {f1, . . . , fn} is the dual basis, then fi is precisely the function which assigns to each vector α in V the ith coordinate of α relative to the ordered basis ℬ. Thus we may also call the fi the coordinate functions for ℬ. The formula (3-13), when combined with (3-14), tells us the following: If f is in V*, and we let f(αi) = ai, then when

α = x1α1 + · · · + xnαn

we have

(3-15)   f(α) = a1x1 + · · · + anxn.

In other words, if we choose an ordered basis ℬ for V and describe each vector in V by its n-tuple of coordinates (x1, . . . , xn) relative to ℬ, then every linear functional on V has the form (3-15). This is the natural generalization of Example 18, which is the special case V = F^n and ℬ = {ε1, . . . , εn}.

EXAMPLE 22. Let V be the vector space of all polynomial functions from R into R which have degree less than or equal to 2. Let t1, t2, and t3 be any three distinct real numbers, and let

Li(p) = p(ti).

Then L1, L2, and L3 are linear functionals on V. These functionals are linearly independent; for, suppose

L = c1L1 + c2L2 + c3L3.

If L = 0, i.e., if L(p) = 0 for each p in V, then applying L to the particular polynomial 'functions' 1, x, x², we obtain

c1 + c2 + c3 = 0
t1c1 + t2c2 + t3c3 = 0
t1²c1 + t2²c2 + t3²c3 = 0.

From this it follows that c1 = c2 = c3 = 0, because (as a short computation shows) the matrix

[ 1    1    1   ]
[ t1   t2   t3  ]
[ t1²  t2²  t3² ]

is invertible when t1, t2, and t3 are distinct. Now the Li are independent, and since V has dimension 3, these functionals form a basis for V*. What is the basis for V, of which this is the dual? Such a basis {p1, p2, p3} for V must satisfy

Li(pj) = δ_{ij}

or

pj(ti) = δ_{ij}.

These polynomial functions are rather easily seen to be

p1(x) = (x − t2)(x − t3) / ((t1 − t2)(t1 − t3))
p2(x) = (x − t1)(x − t3) / ((t2 − t1)(t2 − t3))
p3(x) = (x − t1)(x − t2) / ((t3 − t1)(t3 − t2)).
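These are the familiar Lagrange interpolation polynomials. The following sketch (an editorial illustration, not part of the original text; the points t1, t2, t3 are chosen arbitrarily) evaluates them and checks the defining property pj(ti) = δij.

```python
import numpy as np

t = [0.0, 1.0, 2.0]          # three distinct real numbers, chosen for illustration

def p(j, x):
    """The j-th Lagrange polynomial for the points t[0], t[1], t[2]."""
    num = np.prod([x - t[i] for i in range(3) if i != j])
    den = np.prod([t[j] - t[i] for i in range(3) if i != j])
    return num / den

# Evaluating p_j at t_i gives the 3 x 3 identity matrix, i.e. p_j(t_i) = delta_ij.
print(np.array([[p(j, ti) for j in range(3)] for ti in t]))
```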
The basis {p1, p2, p3} for V is interesting, because according to (3-14) we have for each p in V

p = p(t1)p1 + p(t2)p2 + p(t3)p3.

Thus, if c1, c2, and c3 are any real numbers, there is exactly one polynomial function p over R which has degree at most 2 and satisfies p(tj) = cj, j = 1, 2, 3. This polynomial function is p = c1p1 + c2p2 + c3p3.

Now let us discuss the relationship between linear functionals and subspaces. If f is a non-zero linear functional, then the rank of f is 1 because the range of f is a non-zero subspace of the scalar field and must (therefore) be the scalar field. If the underlying space V is finite-dimensional, the rank plus nullity theorem (Theorem 2) tells us that the null space N_f has dimension

dim N_f = dim V − 1.

In a vector space of dimension n, a subspace of dimension n − 1 is called a hyperspace. Such spaces are sometimes called hyperplanes or subspaces of codimension 1. Is every hyperspace the null space of a linear functional? The answer is easily seen to be yes. It is not much more difficult to show that each d-dimensional subspace of an n-dimensional space is the intersection of the null spaces of (n − d) linear functionals (Theorem 16 below).

Definition. If V is a vector space over the field F and S is a subset of V, the annihilator of S is the set S⁰ of linear functionals f on V such that f(α) = 0 for every α in S.

It should be clear to the reader that S⁰ is a subspace of V*, whether S is a subspace of V or not. If S is the set consisting of the zero vector alone, then S⁰ = V*. If S = V, then S⁰ is the zero subspace of V*. (This is easy to see when V is finite-dimensional.)

Theorem 16. Let V be a finite-dimensional vector space over the field F, and let W be a subspace of V. Then

dim W + dim W⁰ = dim V.

Proof. Let k be the dimension of W and {α1, . . . , αk} a basis for W. Choose vectors αk+1, . . . , αn in V such that {α1, . . . , αn} is a basis for V. Let {f1, . . . , fn} be the basis for V* which is dual to this basis for V.
The claim is that {fk+1, . . . , fn} is a basis for the annihilator W⁰. Certainly fi belongs to W⁰ for i ≥ k + 1, because

fi(αj) = δ_{ij}

and δ_{ij} = 0 if i ≥ k + 1 and j ≤ k; from this it follows that, for i ≥ k + 1, fi(α) = 0 whenever α is a linear combination of α1, . . . , αk. The functionals fk+1, . . . , fn are independent, so all we must show is that they span W⁰. Suppose f is in V*. Now

f = Σ_{i=1}^n f(αi) fi

so that if f is in W⁰ we have f(αi) = 0 for i ≤ k and

f = Σ_{i=k+1}^n f(αi) fi.

We have shown that if dim W = k and dim V = n then dim W⁰ = n − k. ∎

Corollary. If W is a k-dimensional subspace of an n-dimensional vector space V, then W is the intersection of (n − k) hyperspaces in V.

Proof. This is a corollary of the proof of Theorem 16 rather than its statement. In the notation of the proof, W is exactly the set of vectors α such that fi(α) = 0, i = k + 1, . . . , n. In case k = n − 1, W is the null space of fn. ∎

Corollary. If W1 and W2 are subspaces of a finite-dimensional vector space, then W1 = W2 if and only if W1⁰ = W2⁰.

Proof. If W1 = W2, then of course W1⁰ = W2⁰. If W1 ≠ W2, then one of the two subspaces contains a vector which is not in the other. Suppose there is a vector α which is in W2 but not in W1. By the previous corollaries (or the proof of Theorem 16) there is a linear functional f such that f(β) = 0 for all β in W1 but f(α) ≠ 0. Then f is in W1⁰ but not in W2⁰, and W1⁰ ≠ W2⁰. ∎

In the next section we shall give different proofs for these two corollaries. The first corollary says that, if we select some ordered basis for the space, each k-dimensional subspace can be described by specifying (n − k) homogeneous linear conditions on the coordinates relative to that basis.

Let us look briefly at systems of homogeneous linear equations from the point of view of linear functionals. Suppose we have a system of linear equations,

A_{11}x1 + · · · + A_{1n}xn = 0
  ⋮
A_{m1}x1 + · · · + A_{mn}xn = 0
for which we wish to find the solutions. If we let fi, i = 1, . . . , m, be the linear functional on F^n defined by

fi(x1, . . . , xn) = A_{i1}x1 + · · · + A_{in}xn

then we are seeking the subspace of F^n of all α such that

fi(α) = 0,   i = 1, . . . , m.

In other words, we are seeking the subspace annihilated by f1, . . . , fm. Row-reduction of the coefficient matrix provides us with a systematic method of finding this subspace. The n-tuple (A_{i1}, . . . , A_{in}) gives the coordinates of the linear functional fi relative to the basis which is dual to the standard basis for F^n. The row space of the coefficient matrix may thus be regarded as the space of linear functionals spanned by f1, . . . , fm. The solution space is the subspace annihilated by this space of functionals.

Now one may look at the system of equations from the 'dual' point of view. That is, suppose that we are given m vectors in F^n

αi = (A_{i1}, . . . , A_{in})

and we wish to find the annihilator of the subspace spanned by these vectors. Since a typical linear functional on F^n has the form

f(x1, . . . , xn) = c1x1 + · · · + cnxn

the condition that f be in this annihilator is that

Σ_{j=1}^n A_{ij} cj = 0,   i = 1, . . . , m

that is, that (c1, . . . , cn) be a solution of the system AX = 0. From this point of view, row-reduction gives us a systematic method of finding the annihilator of the subspace spanned by a given finite set of vectors in F^n.

EXAMPLE 23. Here are three linear functionals on R⁴:

f1(x1, x2, x3, x4) = x1 + 2x2 + 2x3 + x4
f2(x1, x2, x3, x4) = 2x2 + x4
f3(x1, x2, x3, x4) = −2x1 − 4x3 + 3x4.

The subspace which they annihilate may be found explicitly by finding the row-reduced echelon form of the matrix

A =
[  1   2   2   1 ]
[  0   2   0   1 ]
[ −2   0  −4   3 ]

A short calculation, or a peek at Example 21 of Chapter 2, shows that

R =
[ 1  0  2  0 ]
[ 0  1  0  0 ]
[ 0  0  0  1 ]
Sec. 3.5, , Linear, , Functionals, , The dimension, of W” is 2 and a basis {ji, ji} for W” can be found, takinga, = 1, b = Oand thena, = 0, b = 1:, x5) = x1 + x2, : : : : x5) = x1 - 223 +, , kg::, The, , above, , by first, , x4., , j in W” is j = uji + bfi., , general, , Exercises, 1. In R3, let (Y~ = (1, 0, I), CQ = (0, 1, -2), a3 = (-1,, (a) If f is a linear functional, on R3 such that, f(4, , f(a2) = -1,, , = 1,, , and if cr = (a, b, c), find f(a)., (b) Describe explicitly, a linear, f(ai), (c) Let f be any linear, , functional, , f(~r), If (Y = (2, 3, --I),, , but, , f(aJ, , and, , # 0., , f(a3) # 0., , # 0., , 2. Let @ = {CQ, (Ye,a3} be the basis for C3 defined, w = (LO,, , = 3,, , fb3), , such that, , = f(az) = 0, , show that f(a), , 0)., , f on R3 such that, , functional, , = f(czz) = 0, , -1,, , --I),, , = (1, 1, I),, , ff2, , by, a3, , = (2, 2, 0)., , Find the dual basis of (8., 3. If A and B are n X n matrices over the field F, show that trace (AB), (BA). Now show that similar matrices have the same trace., 4. Let V be the vector, have degree 2 or less:, , space of all polynomial, , p(z), , Define, , three linear, h(p), , Show that, the dual., , =, , Jd, , functionals, ~($1, , {fi, f2,f3}, , dx,, , =, , co, , +, , Cl5, , functions, , +, , from, , p, , = trace, , R into R which, , c2x2., , on V by, fi(p), , =, , 1,2pC4, , dx,, , $3(p), , is a basis for V* by exhibiting, , 5. If A and B are n x n complex, possible., , matrices,, , 6. Let m and n be positive integers, tionals on Fn. For a in F” define, , and F a field., , !!‘a! = (fi(a),, , =, , /,-1~(4, , dx., , the basis for V of which it is, , show that, , AB -, , BA = I is im-, , Let fr, . . . , J,, be linear, , func-, , . . . ,fm(a))., , Show that T is a linear transformation, from F” into Fm. Then show that every, linear transformation, from F* into Fm is of the above form, for some jr, . . ., f 7n., 7. Let cur = (1, 0, - 1,2) and CV~= (2,3, 1, l), and let W be the subspace of R4, spanned by cri and CY~.Which linear functionals, f:, , 105
106, , Linear Transformations, , Chap. 3, , fhx2,23,, , are in the annihilator, , x4), , =, , ClXl, , +, , c222, , +, , +, , c4x4, , of W?, , 8. Let W be the subspace of Rb which is spanned, a1 = Cl + 2E2 + -5,, LYE =, , Find, , cax3, , ~1 $, , 4E2 +, , 6~3 +, , ff2, , =, , E2, , 4~4 +, , by the vectors, , + 3Q +, , 3E4 +, , cl, , ~5., , a basis for W”., , 9. Let V be the vector space of all 2 X 2 matrices, and let, , B = [-;, , over the field of real numbers,, , -;I., , Let W be the subspace of V consisting of all A such that A B = 0. Let f be a linear, of W. Suppose that f(1) = 0 and, functional, on V which is in the annihilator, f(C) = 3, where I is the 2 X 2 identity, matrix and, , c=, Find f(B)., 10. Let F be a subfield, on Fn (n 2 2) by, , of the complex, , [ 1, O O, 0, , 1’, , numbers., , fk(%, . . . , XJ = jh, (k - A%, What, , is the dimension, , of the subspace annihilated, , We define, , n linear, , functionals, , l<k<n., by fi, . . . , fn?, , 11. Let WI and W2 be subspaces of a finite-dimensional, (a) Prove that (WI + W2)0 = W’j n W$, (b) Prove that (WI n WZ)O = WY + W&, , vector space V., , 12. Let V be a finite-dimensional, vector space over the field F and let W be a, on W, prove that there is a linear functional, subspace of V. If f is a linear functional, g on V such that g(ar) = f(a) for each (I! in the subspace W., 13. Let F be a subfield of the field of complex numbers and let V be any vector, on V such that the funcspace over F. Suppose that f and g are linear functionals, tion h defined by h(o) = f(a)g(a), is also a linear functional, on V. Prove that, eitherf=, Oorg = 0., zero and let V be a finite-dimensional, 14. Let F be a field of characteristic, space over F. If (pi, . . . , urn are finitely many vectors in V, each different, zero vector, prove that there is a linear functionalf, on V such that, f(w), , # 0,, , i=l, , vector, from the, , , . . . , m., , 15. According to Exercise 3, similar matrices have the same trace. Thus we can, define the trace of a linear operator on a finite-dimensional, space to be the trace, of any matrix, which represents the operator in an ordered basis. This is welldefined since all such representing, matrices for one operator are similar., Now let V be the space of all 2 X 2 matrices over the field F and let P be a, fixed 2 X 2 matrix. Let T be the linear operator on V defined by T(A) = PA., Prove that trace (T) = 2 trace (P).
16. Show that the trace functional on n × n matrices is unique in the following sense. If W is the space of n × n matrices over the field F and if f is a linear functional on W such that f(AB) = f(BA) for each A and B in W, then f is a scalar multiple of the trace function. If, in addition, f(I) = n, then f is the trace function.

17. Let W be the space of n × n matrices over the field F, and let W0 be the subspace spanned by the matrices C of the form C = AB − BA. Prove that W0 is exactly the subspace of matrices which have trace zero. (Hint: What is the dimension of the space of matrices of trace zero? Use the matrix 'units,' i.e., matrices with exactly one non-zero entry, to construct enough linearly independent matrices of the form AB − BA.)

3.6. The Double Dual

One question about dual bases which we did not answer in the last section was whether every basis for V* is the dual of some basis for V. One way to answer that question is to consider V**, the dual space of V*.

If α is a vector in V, then α induces a linear functional Lα on V* defined by

Lα(f) = f(α),   f in V*.

The fact that Lα is linear is just a reformulation of the definition of linear operations in V*:

Lα(cf + g) = (cf + g)(α)
           = (cf)(α) + g(α)
           = cf(α) + g(α)
           = cLα(f) + Lα(g).

If V is finite-dimensional and α ≠ 0, then Lα ≠ 0; in other words, there exists a linear functional f such that f(α) ≠ 0. The proof is very simple and was given in Section 3.5: Choose an ordered basis ℬ = {α1, . . . , αn} for V such that α1 = α and let f be the linear functional which assigns to each vector in V its first coordinate in the ordered basis ℬ.

Theorem 17. Let V be a finite-dimensional vector space over the field F. For each vector α in V define

Lα(f) = f(α),   f in V*.

The mapping α → Lα is then an isomorphism of V onto V**.

Proof. We showed that for each α the function Lα is linear. Suppose α and β are in V and c is in F, and let γ = cα + β. Then for each f in V*

Lγ(f) = f(γ)
      = f(cα + β)
      = cf(α) + f(β)
      = cLα(f) + Lβ(f)

and so

Lγ = cLα + Lβ.
This shows that the mapping α → Lα is a linear transformation from V into V**. This transformation is non-singular; for, according to the remarks above, Lα = 0 if and only if α = 0. Now α → Lα is a non-singular linear transformation from V into V**, and since

dim V** = dim V* = dim V

Theorem 9 tells us that this transformation is invertible, and is therefore an isomorphism of V onto V**. ∎

Corollary. Let V be a finite-dimensional vector space over the field F. If L is a linear functional on the dual space V* of V, then there is a unique vector α in V such that

L(f) = f(α)

for every f in V*.

Corollary. Let V be a finite-dimensional vector space over the field F. Each basis for V* is the dual of some basis for V.

Proof. Let ℬ* = {f1, . . . , fn} be a basis for V*. By Theorem 15, there is a basis {L1, . . . , Ln} for V** such that

Li(fj) = δ_{ij}.

Using the corollary above, for each i there is a vector αi in V such that

Li(f) = f(αi)

for every f in V*, i.e., such that Li = L_{αi}. It follows immediately that {α1, . . . , αn} is a basis for V and that ℬ* is the dual of this basis. ∎

In view of Theorem 17, we usually identify α with Lα and say that V 'is' the dual space of V* or that the spaces V, V* are naturally in duality with one another. Each is the dual space of the other. In the last corollary we have an illustration of how that can be useful. Here is a further illustration.

If E is a subset of V*, then the annihilator E⁰ is (technically) a subset of V**. If we choose to identify V and V** as in Theorem 17, then E⁰ is a subspace of V, namely, the set of all α in V such that f(α) = 0 for all f in E. In a corollary of Theorem 16 we noted that each subspace W is determined by its annihilator W⁰. How is it determined? The answer is that W is the subspace annihilated by all f in W⁰, that is, the intersection of the null spaces of all f's in W⁰. In our present notation for annihilators, the answer may be phrased very simply: W = (W⁰)⁰.

Theorem 18. If S is any subset of a finite-dimensional vector space V, then (S⁰)⁰ is the subspace spanned by S.
Proof. Let W be the subspace spanned by S. Clearly W⁰ = S⁰. Therefore, what we are to prove is that W = W⁰⁰. We have given one proof. Here is another. By Theorem 16

dim W + dim W⁰ = dim V
dim W⁰ + dim W⁰⁰ = dim V*

and since dim V = dim V* we have

dim W = dim W⁰⁰.

Since W is a subspace of W⁰⁰, we see that W = W⁰⁰. ∎

The results of this section hold for arbitrary vector spaces; however, the proofs require the use of the so-called Axiom of Choice. We want to avoid becoming embroiled in a lengthy discussion of that axiom, so we shall not tackle annihilators for general vector spaces. But, there are two results about linear functionals on arbitrary vector spaces which are so fundamental that we should include them.

Let V be a vector space. We want to define hyperspaces in V. Unless V is finite-dimensional, we cannot do that with the dimension of the hyperspace. But, we can express the idea that a space N falls just one dimension short of filling out V, in the following way:

1. N is a proper subspace of V;
2. if W is a subspace of V which contains N, then either W = N or W = V.

Conditions (1) and (2) together say that N is a proper subspace and there is no larger proper subspace, in short, N is a maximal proper subspace.

Definition. If V is a vector space, a hyperspace in V is a maximal proper subspace of V.

Theorem 19. If f is a non-zero linear functional on the vector space V, then the null space of f is a hyperspace in V. Conversely, every hyperspace in V is the null space of a (not unique) non-zero linear functional on V.

Proof. Let f be a non-zero linear functional on V and N_f its null space. Let α be a vector in V which is not in N_f, i.e., a vector such that f(α) ≠ 0. We shall show that every vector in V is in the subspace spanned by N_f and α. That subspace consists of all vectors

γ + cα,   γ in N_f, c in F.

Let β be in V. Define
c = f(β) / f(α)

which makes sense because f(α) ≠ 0. Then the vector γ = β − cα is in N_f since

f(γ) = f(β − cα)
     = f(β) − cf(α)
     = 0.

So β is in the subspace spanned by N_f and α.

Now let N be a hyperspace in V. Fix some vector α which is not in N. Since N is a maximal proper subspace, the subspace spanned by N and α is the entire space V. Therefore each vector β in V has the form

β = γ + cα,   γ in N, c in F.

The vector γ and the scalar c are uniquely determined by β. If we have also

β = γ' + c'α,   γ' in N, c' in F

then

(c' − c)α = γ − γ'.

If c' − c ≠ 0, then α would be in N; hence c' = c and γ' = γ. Another way to phrase our conclusion is this: If β is in V, there is a unique scalar c such that β − cα is in N. Call that scalar g(β). It is easy to see that g is a linear functional on V and that N is the null space of g. ∎

Lemma. If f and g are linear functionals on a vector space V, then g is a scalar multiple of f if and only if the null space of g contains the null space of f, that is, if and only if f(α) = 0 implies g(α) = 0.

Proof. If f = 0 then g = 0 as well and g is trivially a scalar multiple of f. Suppose f ≠ 0 so that the null space N_f is a hyperspace in V. Choose some vector α in V with f(α) ≠ 0 and let

c = g(α) / f(α).

The linear functional h = g − cf is 0 on N_f, since both f and g are 0 there, and h(α) = g(α) − cf(α) = 0. Thus h is 0 on the subspace spanned by N_f and α, and that subspace is V. We conclude that h = 0, i.e., that g = cf. ∎

Theorem 20. Let g, f1, . . . , fr be linear functionals on a vector space V, with respective null spaces N, N1, . . . , Nr. Then g is a linear combination of f1, . . . , fr if and only if N contains the intersection N1 ∩ · · · ∩ Nr.

Proof. If g = c1f1 + · · · + crfr and fi(α) = 0 for each i, then clearly g(α) = 0. Therefore, N contains N1 ∩ · · · ∩ Nr.

We shall prove the converse (the 'if' half of the theorem) by induction on the number r. The preceding lemma handles the case r = 1. Suppose we know the result for r = k − 1, and let f1, . . . , fk be linear functionals with null spaces N1, . . . , Nk such that N1 ∩ · · · ∩ Nk is contained in N, the
null space of g. Let g', f'1, . . . , f'_{k−1} be the restrictions of g, f1, . . . , f_{k−1} to the subspace N_k. Then g', f'1, . . . , f'_{k−1} are linear functionals on the vector space N_k. Furthermore, if α is a vector in N_k and f'_i(α) = 0, i = 1, . . . , k − 1, then α is in N1 ∩ · · · ∩ N_k and so g'(α) = 0. By the induction hypothesis (the case r = k − 1), there are scalars ci such that

g' = c1f'1 + · · · + c_{k−1}f'_{k−1}.

Now let

(3-16)   h = g − Σ_{i=1}^{k−1} ci fi.

Then h is a linear functional on V and (3-16) tells us that h(α) = 0 for every α in N_k. By the preceding lemma, h is a scalar multiple of f_k. If h = c_k f_k, then

g = Σ_{i=1}^k ci fi. ∎

Exercises

1. Let n be a positive integer and F a field. Let W be the set of all vectors (x1, . . . , xn) in F^n such that x1 + · · · + xn = 0.
(a) Prove that W⁰ consists of all linear functionals f of the form

f(x1, . . . , xn) = c Σ_{j=1}^n xj.

(b) Show that the dual space W* of W can be 'naturally' identified with the linear functionals

f(x1, . . . , xn) = c1x1 + · · · + cnxn

on F^n which satisfy c1 + · · · + cn = 0.

2. Use Theorem 20 to prove the following. If W is a subspace of a finite-dimensional vector space V and if {g1, . . . , gr} is any basis for W⁰, then

W = ∩_{i=1}^r N_{gi}.

3. Let S be a set, F a field, and V(S; F) the space of all functions from S into F:

(f + g)(x) = f(x) + g(x)
(cf)(x) = cf(x).

Let W be any n-dimensional subspace of V(S; F). Show that there exist points x1, . . . , xn in S and functions f1, . . . , fn in W such that fi(xj) = δ_{ij}.

3.7. The Transpose of a Linear Transformation

Suppose that we have two vector spaces over the field F, V and W, and a linear transformation T from V into W. Then T induces a linear
transformation from W* into V*, as follows. Suppose g is a linear functional on W, and let

(3-17)   f(α) = g(Tα)

for each α in V. Then (3-17) defines a function f from V into F, namely, the composition of T, a function from V into W, with g, a function from W into F. Since both T and g are linear, Theorem 6 tells us that f is also linear, i.e., f is a linear functional on V. Thus T provides us with a rule Tᵗ which associates with each linear functional g on W a linear functional f = Tᵗg on V, defined by (3-17). Note also that Tᵗ is actually a linear transformation from W* into V*; for, if g1 and g2 are in W* and c is a scalar

[Tᵗ(cg1 + g2)](α) = (cg1 + g2)(Tα)
                  = cg1(Tα) + g2(Tα)
                  = c(Tᵗg1)(α) + (Tᵗg2)(α)

so that Tᵗ(cg1 + g2) = cTᵗg1 + Tᵗg2. Let us summarize.

Theorem 21. Let V and W be vector spaces over the field F. For each linear transformation T from V into W, there is a unique linear transformation Tᵗ from W* into V* such that

(Tᵗg)(α) = g(Tα)

for every g in W* and α in V.

We shall call Tᵗ the transpose of T. This transformation Tᵗ is often called the adjoint of T; however, we shall not use this terminology.

Theorem 22. Let V and W be vector spaces over the field F, and let T be a linear transformation from V into W. The null space of Tᵗ is the annihilator of the range of T. If V and W are finite-dimensional, then

(i) rank (Tᵗ) = rank (T)
(ii) the range of Tᵗ is the annihilator of the null space of T.

Proof. If g is in W*, then by definition

(Tᵗg)(α) = g(Tα)

for each α in V. The statement that g is in the null space of Tᵗ means that g(Tα) = 0 for every α in V. Thus the null space of Tᵗ is precisely the annihilator of the range of T.

Suppose that V and W are finite-dimensional, say dim V = n and dim W = m. For (i): Let r be the rank of T, i.e., the dimension of the range of T. By Theorem 16, the annihilator of the range of T then has dimension (m − r). By the first statement of this theorem, the nullity of Tᵗ must be (m − r). But then since Tᵗ is a linear transformation on an m-dimensional space, the rank of Tᵗ is m − (m − r) = r, and so T and Tᵗ have the same rank. For (ii): Let N be the null space of T. Every functional in the range
of Tᵗ is in the annihilator of N; for, suppose f = Tᵗg for some g in W*; then, if α is in N,

f(α) = (Tᵗg)(α) = g(Tα) = g(0) = 0.

Now the range of Tᵗ is a subspace of the space N⁰, and

dim N⁰ = n − dim N = rank (T) = rank (Tᵗ)

so that the range of Tᵗ must be exactly N⁰. ∎

Theorem 23. Let V and W be finite-dimensional vector spaces over the field F. Let ℬ be an ordered basis for V with dual basis ℬ*, and let ℬ' be an ordered basis for W with dual basis ℬ'*. Let T be a linear transformation from V into W; let A be the matrix of T relative to ℬ, ℬ' and let B be the matrix of Tᵗ relative to ℬ'*, ℬ*. Then B_{ij} = A_{ji}.

Proof. Let

ℬ = {α1, . . . , αn},   ℬ' = {β1, . . . , βm},
ℬ* = {f1, . . . , fn},   ℬ'* = {g1, . . . , gm}.

By definition,

Tαj = Σ_{i=1}^m A_{ij} βi,   j = 1, . . . , n
Tᵗgj = Σ_{i=1}^n B_{ij} fi,   j = 1, . . . , m.

On the other hand,

(Tᵗgj)(αi) = gj(Tαi)
           = gj( Σ_{k=1}^m A_{ki} βk )
           = Σ_{k=1}^m A_{ki} gj(βk)
           = Σ_{k=1}^m A_{ki} δ_{jk}
           = A_{ji}.

For any linear functional f on V,

f = Σ_{i=1}^n f(αi) fi.

If we apply this formula to the functional f = Tᵗgj and use the fact that (Tᵗgj)(αi) = A_{ji}, we have

Tᵗgj = Σ_{i=1}^n A_{ji} fi

from which it immediately follows that B_{ij} = A_{ji}. ∎
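In coordinates, Theorem 23 can be checked directly: if T is given by a matrix A in the standard bases, a functional g on W corresponds to a row vector, and Tᵗg corresponds to the row vector gA, so the matrix of Tᵗ in the dual bases is Aᵗ. The Python sketch below is an editorial illustration with an arbitrarily chosen A, not part of the original text.

```python
import numpy as np

# With V = R^3, W = R^2 and standard bases, T(X) = A @ X, a functional g on W is a
# row vector, and (T^t g)(X) = g(T X) = (g A) X, so T^t g is the row vector g A.
A = np.array([[1., 2., 0.],
              [0., -1., 3.]])     # matrix of T relative to the standard bases

g1 = np.array([1., 0.])           # the dual basis functionals g1, g2 on W, as row vectors
g2 = np.array([0., 1.])

# The column of the matrix of T^t corresponding to g_j is the row vector g_j A,
# so the matrix of T^t in the dual bases is the transpose of A.
B = np.column_stack([g1 @ A, g2 @ A])
print(np.array_equal(B, A.T))     # True
```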
Definition. If A is an m × n matrix over the field F, the transpose of A is the n × m matrix Aᵗ defined by Aᵗ_{ij} = A_{ji}.

Theorem 23 thus states that if T is a linear transformation from V into W, the matrix of which in some pair of bases is A, then the transpose transformation Tᵗ is represented in the dual pair of bases by the transpose matrix Aᵗ.

Theorem 24. Let A be any m × n matrix over the field F. Then the row rank of A is equal to the column rank of A.

Proof. Let ℬ be the standard ordered basis for F^n and ℬ' the standard ordered basis for F^m. Let T be the linear transformation from F^n into F^m such that the matrix of T relative to the pair ℬ, ℬ' is A, i.e.,

T(x1, . . . , xn) = (y1, . . . , ym)

where

yi = Σ_{j=1}^n A_{ij} xj.

The column rank of A is the rank of the transformation T, because the range of T consists of all m-tuples which are linear combinations of the column vectors of A.

Relative to the dual bases ℬ'* and ℬ*, the transpose mapping Tᵗ is represented by the matrix Aᵗ. Since the columns of Aᵗ are the rows of A, we see by the same reasoning that the row rank of A (the column rank of Aᵗ) is equal to the rank of Tᵗ. By Theorem 22, T and Tᵗ have the same rank, and hence the row rank of A is equal to the column rank of A. ∎

Now we see that if A is an m × n matrix over F and T is the linear transformation from F^n into F^m defined above, then

rank (T) = row rank (A) = column rank (A)

and we shall call this number simply the rank of A.

EXAMPLE 25. This example will be of a general nature, more discussion than example. Let V be an n-dimensional vector space over the field F, and let T be a linear operator on V. Suppose ℬ = {α1, . . . , αn} is an ordered basis for V. The matrix of T in the ordered basis ℬ is defined to be the n × n matrix A such that

Tαj = Σ_{i=1}^n A_{ij} αi

in other words, A_{ij} is the ith coordinate of the vector Tαj in the ordered basis ℬ. If {f1, . . . , fn} is the dual basis of ℬ, this can be stated simply

A_{ij} = fi(Tαj).
Sec. 3.7, , The Transpose, , of a Linear Transformation, , Let us see what happens when we change basis. Suppose, 63 = {cd, . . . , a@, is another ordered basis for V, with dual basis {f;, . . . , f;}., matrix of T in the ordered basis a’, then, , If B is the, , Bij = f;(Ta;)., Let U be the invertible, linear, transpose of U is given by Ulfl, invertible, so is Ut and (Ut)-l =, Therefore,, Bij =, =, =, , operator such that Uaj = a;. Then the, = fi. It is easy to verify that since U is, ( U-‘)t. Thusf: = (U-l)“fi,, i = 1, . . . , n., [ ( U-l) $I( Tcx;), fi( U-‘Ta;), fi( U-‘TUaj)., , Now what does this say? Well, f;(U-lTUcq), is the i, j entry of the matrix, of U-‘TU, in the ordered basis 6% Our computation, above shows that this, scalar is also the i, j entry of the matrix of T in the ordered basis CB’. In, other words, [T]@t = [U-‘TU]a, , = K-‘ImP’l~~~l,, , = Wlci’[Tl~[~l~, and this is precisely the change-of-basis, , formula, , which we derived, , earlier., , Exercises, 1. Let F be a field and let j be the linear functional on F2 defined by j(q, ZJ =, azl + bxz. For each of the following linear operators T, let g = Ty, and find, dx1,4., (a), , T(xl,, , x2), , =, , (xl,, , (b) T(xI, x4 =, , C-22,, , (c), , (xl, , T(xl,, , x2), , =, , 0), , -, , ;, XI), , ;, , x2,, , x1, , +, , ~2)., , 2. Let V be the vector space of all polynomial functions over the field of real, numbers. Let, a and b be fixed real numbers and let j be the linear functional on V, , defined by, f(P), , If D is the differentiation, , = /J p(x) (ix*, , operator on V, what is DEf?, , 3. Let, V be the space of all n X n matrices over a field F and let B be a fixed, n X n matrix. If T is the linear operator on V defined by T(A) = AB - BA,, and if j is the trace function, what is Ttf?, 4. Let V be a finite-dimensional vector space over the field F and let T be a, linear operator on V. Let, c be a scalar and suppose there is a non-zero vector CY, in V such that TCI = CQ. Prove that there is a non-zero linear functional j on V, such that TEf = cf., , 115
116, , Linear Transformations, , Chap. 3, , 5. Let A be an m X n matrix with real entries. Prove that A = 0 if and only, if trace (A’A) = 0., 6. Let n be a positive integer and let V be the space of all polynomial functions, over the field of real numbers which have degree at most n, i.e., functions of the, form, j(z) = co + Cl2 + * * * + c&P., Let D be the differentiation operator on V. Find a basis for the null space of the, transpose operator D’., 7. Let V be a finite-dimensional vector space over the field F. Show that T + Tt, is an isomorphism of L(V, V) onto L(V*, V*)., 8. Let V be the vector space of n X n matrices over the field F., (a) If B is a fixed n X n matrix, define a function Jo on V by js(A) = trace, (B”A). Show that jB is a linear functional on V., (b) Show that every linear functional on V is of the above form, i.e., is js, for some B., (c) Show that B + js is an isomorphism of V onto V*.
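Theorem 24 is easy to spot-check numerically before moving on. The sketch below, which assumes NumPy is available (the matrix and names are illustrative, not from the text), compares the dimension of the row space with the dimension of the column space of an arbitrary integer matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(4, 6)).astype(float)   # an arbitrary 4 x 6 matrix

# Column rank = rank of T; row rank = column rank of A^t. Theorem 24 says they agree.
col_rank = np.linalg.matrix_rank(A)
row_rank = np.linalg.matrix_rank(A.T)
assert col_rank == row_rank
print(col_rank)
```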
4. Polynomials

4.1. Algebras

The purpose of this chapter is to establish a few of the basic properties of the algebra of polynomials over a field. The discussion will be facilitated if we first introduce the concept of a linear algebra over a field.

Definition. Let F be a field. A linear algebra over the field F is a vector space 𝒜 over F with an additional operation called multiplication of vectors, which associates with each pair of vectors α, β in 𝒜 a vector αβ in 𝒜, called the product of α and β, in such a way that

(a) multiplication is associative,
αβγ = α(βγ) = (αβ)γ;
(b) multiplication is distributive with respect to addition,
α(β + γ) = αβ + αγ    and    (α + β)γ = αγ + βγ;
(c) for each scalar c in F,
c(αβ) = (cα)β = α(cβ).

If there is an element 1 in 𝒜 such that 1α = α1 = α for each α in 𝒜, we call 𝒜 a linear algebra with identity over F, and call 1 the identity of 𝒜. The algebra 𝒜 is called commutative if αβ = βα for all α and β in 𝒜.

EXAMPLE 1. The set of n × n matrices over a field, with the usual operations, is a linear algebra with identity; in particular, the field itself is an algebra with identity. This algebra is not commutative if n ≥ 2. The field itself is (of course) commutative.
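As a quick illustration of the last remark in Example 1 (purely illustrative, assuming NumPy; the particular matrices are not from the text), two 2 × 2 matrices generally fail to commute:

```python
import numpy as np

A = np.array([[0, 1],
              [0, 0]])
B = np.array([[0, 0],
              [1, 0]])

# AB != BA, so the algebra of 2 x 2 matrices is not commutative.
print(A @ B)   # [[1, 0], [0, 0]]
print(B @ A)   # [[0, 0], [0, 1]]
```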
118, , Chap. 4, , Polynomials, , EXAMPLE 2. The space of all linear operators on a vector space, with, composition as the product, is a linear algebra with identity. It is commutative if and only if the space is one-dimensionai., The reader may have had some experience with the dot product and, cross product of vectors in R3. If so, he should observe that neither of, these products is of the type described in the definition of a linear algebra., The dot product is a ‘scalar product,’ that is, it associates with a pair of, vectors a scalar, and thus it is certainly not the type of product we are, presently discussing. The cross product does associate a vector with each, pair of vectors in R3; however, this is not an associative multiplication., The rest of this section will be devoted to the construction, of an, algebra which is significantly, different from the algebras in either of the, preceding examples. Let F be a field and S the set of non-negative, integers. By Example 3 of Chapter 2, the set of all functions from S into, F is a vector space over F. We shall denote this vector space by F”. The, vectors in F” are therefore infinite sequences f = (fo, fi, fi, . . .) of scalars, fi in F. If g = (go, 91, g2, . . .>, gi in F, and a, b are scalars in F, af + bg is, the infinite sequence given by, (4-l), , af + bg = (afo + bgo,afl + bgl, afi + be, . . .>., with each pair of vectors f and, , We define a product in F” by associating, g in F” the vector fg which is given by, (4-z), , n = 0, 1, 2, . . . ., , (fgln = jofig.+, , Thus, , fg = (fogo,fog1+, , f1g0,, , fog2, , +, , f1g1 +, , f2g0,, , ., , . .>, , and as, , for n = 0, 1, 2, . . . , it follows, If h also belongs to F”, then, , that multiplication, , is commutative,, , fg = gf.
Sec. 4.2, , The Algebra of Polynomials, , for n = 0, 1, 2, . . . , so that, (4-3), , (fg)h, , = fW4., , We leave it to the reader to verify that the multiplication, defined by (4-2), satisfies (b) and (c) in the definition, of a linear algebra, and that the, vector 1 = (1, 0, 0, . . .) serves as an identity for F”. Then Fm, with the, operations defined above, is a commutative, linear algebra with identity, over the field F., role in what, The vector (0, 1, 0, . . . , 0, . . .) plays a distinguished, follows and we shall consistently denote it by 2. Throughout, this chapter, x will never be used to denote an element of the field F. The product of x, with itself n times will be denoted by x” and we shall put x0 = 1. Then, x2 = (0, 0, 1, 0, . . .),, , x3 = (0, 0, 0, 1, 0, . . .), , and in general for each integer k 2 0, (x”)k = 1 and (xk), = 0 for all nonnegative integers n # lc. In concluding this section we observe that the, set consisting of 1, x, x2, . . . is both independent, and infinite. Thus the, algebra F* is not finite-dimensional., The algebra Fm is sometimes called the algebra, of formal, power, series over F. The element f = (fo, fi, f2, . . .) is frequently, written, (4-4), This notation is very convenient for dealing with the algebraic operations., When used, it must be remembered that it is purely formal. There are no, ‘infinite sums’ in algebra, and the power series notation (4-4) is not intended to suggest anything about convergence, if the reader knows what, that is. By using sequences, we were able to define carefully an algebra, in which the operations behave like addition and multiplication, of formal, power series, without running the risk of confusion over such things as, infinite sums., , 4.2., We are now in a position, , The, , to define a polynomial, , Algebra, , of Polynomials, , over the field F., , DeJinition., Let F[x] be the subspace of F* spanned by the vectors, 1, x, x2, . . . . An element of F[x] is called a polynomial, over F., , Since F[x] consists of all (finite) linear combinations, of x and its, powers, a non-zero vector f in F” is a polynomial, if and only if there is, an integer n 2 0 such that fn # 0 and such that fk = 0 for all integers, k > n; this integer (when it exists) is obviously unique and is called the, degree, of f. We denote the degree of a polynomial f by deg f, and do, , 119
1.20, , Polynomials, , Chap. 4, , not assign a degree to the O-polynomial., degree n it follows that, , If, , f, , is a non-zero, , polynomial, , of, , fn z 0., f = f&O + flZ + f2L2 + * * * + f?c,, (4-5), of f, and, The scalars fo, fl, . . . , fn are sometimes called the coefficients, with coefficients in F. We shall call, we may say that f is a polynomial, and frequently, write c, polynomials of the form CZ scalar polynomials,, for cx”. A non-zero polynomial f of degree n such that f,, = 1 is said to, polynomial., be a manic, The reader should note that polynomials, are not the same sort of, objects as the polynomial, functions on F which we have discussed on, several occasions. If F contains an infinite number of elements, there is a, natural isomorphism between F[x] and the algebra of polynomial, functions on F. We shall discuss that in the next section. Let us verify that, F[x] is an algebra., Theorem, , 1. Let f and g be non-zero polynomials, , over F. Then, , (i) fg is a mm-zero polynomial;, (ii) deg (fg) = deg f + deg g;, (iii) fg is a manic polynomial if both f and g are manic polynomials;, (iv) fg is a scalar polynomial if and only if both f and g are scalar, polynomials;, (v> if f + g Z 0,, deg (f + g> I mm (deg f, deg g)., Proof. Suppose, non-negative, integer,, , f, , has degree m and that g has degree n. If k is a, m+n+k, , (fd, , =, , m+n+k, , z, i=o, , f, , ignzfnfk-i-, , In order that figm+n+k-i, # 0, it is necessary that i I m and m + n +, k - i < n. Hence it is necessary that m + k I i 5 m, which implies, k = 0 and i = m. Thus, , (fs>m+n = f&3, , (4-6), and, , (fd, , (4-7), , m+n+k, , - 0,, , k > 0., , The statements (i), (ii), (iii) follow immediately, from (4-6) and (4-7),, while (iv) is a consequence of (i) and (ii). We leave the verification, of (v), to the reader., 1, Corollary, , 1. The set of all polynomials, , with the operations, identity over F., , over a given jield F equipped, linear algebra with, , (4-l) and (4-2) is a commutative
The Algebra of Polynomials, , Sec. 4.2, , Proof. Since the operations (4-l) and (4-2) are those defined in, the algebra F” and since F[x] is a subspace of Fm, it suffices to prove that, the product of two polynomials is again a polynomial., This is trivial when, one of the factors is 0 and otherwise follows from (i). 1, Corollary, 2. Suppose f, g, and h are polynomials, that f # 0 and fg = fh. Then g = h., , over the Jield F such, , Proof. Since jg = fh, j(g - h) = 0, and as j # 0 it follows, once from (i) that g - h = 0. 1, , at, , Certain additional facts follow rather easily from the proof of Theorem, 1, and we shall mention some of these., Suppose, f = : fixi, i=o, , and, , g = 2 gjxi., j-0, , Then from (4-7) we obtain,, (4-W, The reader should verify,, F, that (4-8) reduces to, , in the special case j = cx*, g = dx” with c, d in, (cxm) (dx”) = cdxmfn., , (4-9), , Now from (4-9) and the distributive, product in (4-8) is also given by, , laws in F[x],, , it follows, , that, , the, , z jigjxi+j, id, , (4-10), where the sum is extended, and 0 I j < n., , over all integer pairs i, j such that 0 5 i < m,, , Let @.be a linear algebra with identity over the field F. We, shall denote the identity of 0, by 1 and make the convention that a0 = 1 for, Dejinition., , each CYin @. Then to each polynomial, , f = ;, , fix’ over F and a in @ we asso-, , i-0, , ciate an element f(a) in c?,by the rule, , EXAMPLE, , 3. Let C be the field of complex numbers and letj, , (a) If a = C and z belongs to C, f(z) = x2 + 2, in particular, and, l+i, 4 1 ->, , 1, =, , ’, , = x2 + 2., j(2), , = 6, , 121
122, , Chap. 4, , Polynomials, (b) If Q is the algebra, , of all 2 X 2 matrices, , over C and if, , then, , (c) If @,is the algebra, ment of Q. given by, , of all linear operators, , T(Cl, c2, cg) = (iti, then f(T), , is the linear operator, , Cl, c2, 95, , c,), , on C3 defined by, , f(T)(cl,, , cz, cs) = (0, 3~ 0)., , (d) If a is the algebra of all polynomials, thenf(g), is the polynomial, in Q. given by, f(g), , on C3 and T is the ele-, , = -7, , over, , C and g = x4 + 3i,, , + 6ix4 + x8., , The observant reader may notice in connection with this last example, that if f is a polynomial over any field and z is the polynomial, (0, 1, 0, . . .), then f = f(z), but he is advised to forget this fact., Theorem, 2. Let 5’ be a field and a be a linear algebra with identity, over F. Suppose f and g are polynomials over F, that a! is an element of a,, and that c belongs to F. Then, , 6) (cf + d (00 = cf(d + g(4;, (ii> (fg)(d, = f(&(4., Proof. As (i) is quite easy to establish,, Suppose, , f = 5 fixi, , and, , i=O, , BY (4-W,, and hence by (i),, , we shall only prove, , (ii)., , g = 5 gjxi., j=o, , fg = zfigjxi+i, i,i, (fs>(d = ~.fig&+i, = (i:ofiai)(joC7Pi), = f(4d4., , I, , Exercises, 1. Let F be a subfield of the complex numbers and let A be the following 2 X 2, matrix over F
The Algebra of Polynomials, , Sec. 4.2, For each of, .(a) j =, (b) j=, (c) j =, , the, x2, x3, 22, , following polynomials, - x + 2;, - 1;, - 52 + 7., , 2. Let T be the linear, , operator, T(xI,, , j over F, compute, , on R3 defined, , by, , xz, x3) = (XI, x3, -2x2, , over R defined, , Let j be the polynomial, , by j = -x3, , 3. Let A be an n X n diagonal matrix, Aij = 0 for i # j. Let j be the polynomial, , j(A)., , - x3)., + 2. Find j(T)., , over the field F, i.e., a matrix, over F defined by, , satisfying, , j = (x - AlI) . . . (x - A,,)., What, , is the matrix, , j(A)?, , 4. If j and g are independent, polynomials, over a field F and h is a non-zero, polynomial, over F, show that jh and gh are independent., 5. If F is a field, show that the product, , of two non-zero, , elements, , of F” is non-zero., , 6. Let S be a set of non-zero polynomials, over a field P. If no two elements, have the same degree, show that S is an independent, set in P[x]., , of S, , 7. If a and b are elements of a field F and a # 0, show that the polynomials, ax + b, (az + b)2, (az + b)3, . . . form a basis of F[x]., , 1,, , 8. If F is a field and h is a polynomial, over F of degree 2 1, show that the mapping j + j(h) is a one-one linear transformation, of F[x] into F[x]. Show that this, transformation, is an isomorphism, of F[x] onto F[x] if and only if deg h = 1., 9. Let F be a subfield, on F[x) defined by, , of the complex, , numbers, , and let T, D be the transformations, , and, , D (i$,, , ,ixi), , = ii, iCixi-‘., , (a) Show that T is a non-singular, linear operator on F[x]. Show also that T, is not invertible., (b) Show that D is a linear operator on F[x] and find its null space., (c) Show that DT = I, and TD # I., (d) Show that T[(Tj)g], = (Tj)(Tg) - T[j(Tg)] for all j, g in F[x]., (e) State and prove a rule for D similar to the one given for T in (d)., (f) Suppose V is a non-zero subspace of F[x] such that Tj belongs to V for, each j in V. Show that V is not finite-dimensional., subspace of F[x]. Prove there is an, (g) Suppose V is a finite-dimensional, integer m 2 0 such that Dmj = 0 for each j in V., , 123
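The product rule (4-2) says that multiplying two polynomials amounts to convolving their coefficient sequences, and Theorem 1(ii) then says the degrees add. A minimal sketch, assuming polynomials are stored as Python lists of coefficients [f_0, f_1, ...] (a representation chosen here only for illustration):

```python
def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists, per formula (4-2):
    (fg)_n = sum_{i=0}^{n} f_i * g_{n-i}."""
    prod = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            prod[i + j] += fi * gj
    return prod

# f = 1 + 2x and g = 3 + x^2; their product is 3 + 6x + x^2 + 2x^3,
# and deg(fg) = deg f + deg g, as in Theorem 1(ii).
print(poly_mul([1, 2], [3, 0, 1]))   # [3, 6, 1, 2]
```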
4.3. Lagrange Interpolation

Throughout this section we shall assume F is a fixed field and that t_0, t_1, ..., t_n are n + 1 distinct elements of F. Let V be the subspace of F[x] consisting of all polynomials of degree less than or equal to n (together with the 0-polynomial), and let L_i be the function from V into F defined for f in V by

L_i(f) = f(t_i),    0 ≤ i ≤ n.

By part (i) of Theorem 2, each L_i is a linear functional on V, and one of the things we intend to show is that the set consisting of L_0, L_1, ..., L_n is a basis for V*, the dual space of V.

Of course in order that this be so, it is sufficient (cf. Theorem 15 of Chapter 3) that {L_0, L_1, ..., L_n} be the dual of a basis {P_0, P_1, ..., P_n} of V. There is at most one such basis, and if it exists it is characterized by

(4-11)    L_j(P_i) = P_i(t_j) = δ_ij.

The polynomials

(4-12)    P_i = [(x - t_0) ··· (x - t_{i-1})(x - t_{i+1}) ··· (x - t_n)] / [(t_i - t_0) ··· (t_i - t_{i-1})(t_i - t_{i+1}) ··· (t_i - t_n)]
              = Π_{j≠i} (x - t_j)/(t_i - t_j)

are of degree n, hence belong to V, and by Theorem 2, they satisfy (4-11). If f = Σ_i c_i P_i, then for each j

(4-13)    f(t_j) = Σ_i c_i P_i(t_j) = c_j.

Since the 0-polynomial has the property that 0(t) = 0 for each t in F, it follows from (4-13) that the polynomials P_0, P_1, ..., P_n are linearly independent. The polynomials 1, x, ..., x^n form a basis of V and hence the dimension of V is (n + 1). So, the independent set {P_0, P_1, ..., P_n} must also be a basis for V. Thus for each f in V

(4-14)    f = Σ_{i=0}^n f(t_i) P_i.

The expression (4-14) is called Lagrange's interpolation formula.
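A short computational sketch of (4-12) and (4-14), assuming exact rational arithmetic via Python's fractions module; the function name and the sample points are illustrative only, not taken from the text.

```python
from fractions import Fraction

def lagrange_coeffs(points):
    """Coefficients [c_0, c_1, ...] of the interpolating polynomial of (4-14),
    given pairs (t_i, f(t_i)) with distinct t_i."""
    ts = [Fraction(t) for t, _ in points]
    vals = [Fraction(v) for _, v in points]
    n = len(points)
    coeffs = [Fraction(0)] * n
    for i in range(n):
        # Build P_i = prod_{j != i} (x - t_j)/(t_i - t_j) as a coefficient list (4-12).
        p = [Fraction(1)]
        for j in range(n):
            if j == i:
                continue
            denom = ts[i] - ts[j]
            new = [Fraction(0)] * (len(p) + 1)
            for k, c in enumerate(p):
                new[k] += c * (-ts[j]) / denom   # constant part of (x - t_j)/denom
                new[k + 1] += c / denom          # x part of (x - t_j)/denom
            p = new
        # Accumulate f(t_i) * P_i, as in (4-14).
        for k, c in enumerate(p):
            coeffs[k] += vals[i] * c
    return coeffs

# Interpolating x^3 at the four points t = 0, 1, 2, 3 recovers x^3 itself,
# since a polynomial of degree <= n is determined by its values at n + 1 points.
print(lagrange_coeffs([(0, 0), (1, 1), (2, 8), (3, 27)]))
# [Fraction(0, 1), Fraction(0, 1), Fraction(0, 1), Fraction(1, 1)]
```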
Setting f = x^j in (4-14) we obtain

x^j = Σ_{i=0}^n (t_i)^j P_i.

Now from Theorem 7 of Chapter 2 it follows that the matrix

(4-15)    [ 1  t_0  t_0^2  ···  t_0^n ]
          [ 1  t_1  t_1^2  ···  t_1^n ]
          [ ·   ·    ·           ·    ]
          [ 1  t_n  t_n^2  ···  t_n^n ]

is invertible. The matrix in (4-15) is called a Vandermonde matrix; it is an interesting exercise to show directly that such a matrix is invertible, when t_0, t_1, ..., t_n are n + 1 distinct elements of F.

If f is any polynomial over F we shall, in our present discussion, denote by f~ the polynomial function from F into F taking each t in F into f(t). By definition (cf. Example 4, Chapter 2) every polynomial function arises in this way; however, it may happen that f~ = g~ for two polynomials f and g such that f ≠ g. Fortunately, as we shall see, this unpleasant situation only occurs in the case where F is a field having only a finite number of distinct elements. In order to describe in a precise way the relation between polynomials and polynomial functions, we need to define the product of two polynomial functions. If f, g are polynomials over F, the product of f~ and g~ is the function f~g~ from F into F given by

(4-16)    (f~g~)(t) = f~(t)g~(t),    t in F.

By part (ii) of Theorem 2, (fg)(t) = f(t)g(t), and hence

(fg)~(t) = f~(t)g~(t)

for each t in F. Thus f~g~ = (fg)~, and is a polynomial function. At this point it is a straightforward matter, which we leave to the reader, to verify that the vector space of polynomial functions over F becomes a linear algebra with identity over F if multiplication is defined by (4-16).

Definition. Let F be a field and let 𝒜 and 𝒜~ be linear algebras over F. The algebras 𝒜 and 𝒜~ are said to be isomorphic if there is a one-to-one mapping α → α~ of 𝒜 onto 𝒜~ such that

(a) (cα + dβ)~ = cα~ + dβ~
(b) (αβ)~ = α~β~

for all α, β in 𝒜 and all scalars c, d in F. The mapping α → α~ is called an isomorphism of 𝒜 onto 𝒜~. An isomorphism of 𝒜 onto 𝒜~ is thus a vector-space isomorphism of 𝒜 onto 𝒜~ which has the additional property (b) of 'preserving' products.

EXAMPLE 4. Let V be an n-dimensional vector space over the field F. By Theorem 13 of Chapter 3 and subsequent remarks, each ordered basis ℬ of V determines an isomorphism T → [T]_ℬ of the algebra of linear operators on V onto the algebra of n × n matrices over F. Suppose now that U is a fixed linear operator on V and that we are given a polynomial

f = Σ_{i=0}^n c_i x^i

with coefficients c_i in F. Then

f(U) = Σ_{i=0}^n c_i U^i
Chap. 4, , Polynomials, and since T + [T]a is a linear mapping, , [f(Ulc% = iio dU"lc5., Now from the additional, , fact that, , [Td”da = [TMT&, for all T1, Tz in L(V,, , V) it follows, , that, , lIwa3 = ([mdi,, As this relation, , 2<i<n., , is also valid for i = 0, 1 we obtain, , the result that, , Lf(~>lcB =f(FJld., , (4-17), , In words, if Ii is a linear operator on V, the matrix of a polynomial, in a given basis, is the same polynomial in the matrix of U., , in U,, , Theorem, 3. If I’ is a field containing an in$nite number of distinct, elements, the mapping f + fW is an isomorphism of the algebra of polynomials, over F onto the algebra of polynomial junctions over F., , F[x], , Proof. By definition,, it is evident that, , the mapping, , is onto, and if f, g belong to, , (cf + dg)” = df- + dgfor all scalars c and d. Since we have already shown that (jg)” = j-g-, we, need only show that the mapping is one-to-one. To do this it suffices by, linearity to show that j- = 0 implies j = 0. Suppose then that j is a polynomial of degree n or less such that j’ = 0. Let to, tl, . . . , t, be any n + 1, distinct elements of F. Sincej= 0, j(tJ = 0 for i = 0, 1, . . . , n, and it, is an immediate consequence of (4-14) that j = 0. 1, From the results of the next section we shall obtain, different proof of this theorem., , an altogether, , Exercises, 1. Use the Lagrange interpolation formula to find a polynomial f with real coefficients such that f has degree 5 3 and f( - 1) = -6, f(0) = 2, j(1) = -2,, f(2), , = 6., , 2. Let ar, 6, y, 6 be real numbers., , We ask when it is possible to find a polynomial, j, = CC,j(l) = /3, f(3) = y and, that this is possible if and only if, , over R, of degree not more than 2, such that f(-1), j(0), , = 6. Prove, , 3cu + 6@ - y 3., , Let F be the field of real numbers,, , 86 = 0.
Sec. 4.4, , Polynomial, 2, , 0, , 0, , 0, , 0, 0, , 0, 0, , 3, 0, , 0, 1, , [ 1, , Ideals, , 127, , A=0200, , p = (z - 2)(x - 3)(X (a) Show that p(A), (b) Let P1, PI, P3, Compute Ei = Pi(A), i, (c), Show that El +, (d) Show that A =, , 1)., , = 0., be the Lagrange, polynomials, for ti = 2, tz = 3, t3 = 1., = 1, 2, 3., EZ + E3 = I, EiEi = 0 if i # j, Ef = Ei., 2E1 + 3Ez + Ea., , 4. Let p = (z - 2)(s - 3)(2 - 1) and let T be any linear operator on R4 such, that p(T) = 0. Let PI, Pz, P, be the Lagrange polynomials, of Exercise 3, and let, Ei = Pi(T),, i = 1, 2, 3. Prove that, EiEi, , El + Ez + Ea = I,, E,2 = Ei,, 5. Let n be a positive, and P is an invertible, that, , and, , if, , i #j,, , T = 2E1 + 3Ez + Es., , integer and F a field. Suppose A is an n X n matrix over F, n X n matrix over F. If f is any polynomial, over F, prove, f(P-IAP), , = P-tf(A)P., , 6. Let F be a field. We have considered, obtained via ‘evaluation, at t’:, , certain, , Uf), Such functionals, L(f)L(g)., Prove, , = 0, , special linear, , functionals, , = f(t)., , are not only linear but also have the property, that if L is any linear functional, on F[x] such that, Wg), , for all f and g, then either, , that, , L(fg), , =, , = Ufmd, , L = 0 or there is a t in F such that, , 4.4., In this section, on the multiplicative, , on F[z], , L(f), , = f(t) for all f., , Polynomial, , we are concerned, with results which depend primarily, structure, of the algebra, of polynomials, over a field., , Lemma., Suppose f and d are non-zero polynomials, over a jield F such, that deg d 5 deg f. Then there exists a polynomial, g in F[x] such that either, , f Proof., , dg = 0, , or, , deg (f -, , dg), , < deg f., , Suppose, j, , =, , a,x", , +, , m-1, z, i=o, , #, , 0, , C&Xi,, , a,, , biXi,, , b, # 0., , and that, n-1, , d =, , b,Xn, , + Z., , Ideals
Then m ≥ n, and

f - (a_m/b_n) x^{m-n} d = 0    or    deg [f - (a_m/b_n) x^{m-n} d] < deg f.

Thus we may take g = (a_m/b_n) x^{m-n}. ∎

Using this lemma we can show that the familiar process of 'long division' of polynomials with real or complex coefficients is possible over any field.

Theorem 4. If f, d are polynomials over a field F and d is different from 0, then there exist polynomials q, r in F[x] such that

(i) f = dq + r;
(ii) either r = 0 or deg r < deg d.

The polynomials q, r satisfying (i) and (ii) are unique.

Proof. If f is 0 or deg f < deg d we may take q = 0 and r = f. In case f ≠ 0 and deg f ≥ deg d, the preceding lemma shows we may choose a polynomial g such that f - dg = 0 or deg (f - dg) < deg f. If f - dg ≠ 0 and deg (f - dg) ≥ deg d we choose a polynomial h such that (f - dg) - dh = 0 or

deg [f - d(g + h)] < deg (f - dg).

Continuing this process as long as necessary, we ultimately obtain polynomials q, r such that r = 0 or deg r < deg d, and f = dq + r. Now suppose we also have f = dq_1 + r_1 where r_1 = 0 or deg r_1 < deg d. Then dq + r = dq_1 + r_1, and d(q - q_1) = r_1 - r. If q - q_1 ≠ 0 then d(q - q_1) ≠ 0 and

deg d + deg (q - q_1) = deg (r_1 - r).

But as the degree of r_1 - r is less than the degree of d, this is impossible, and q - q_1 = 0. Hence also r_1 - r = 0. ∎

Definition. Let d be a non-zero polynomial over the field F. If f is in F[x], the preceding theorem shows there is at most one polynomial q in F[x] such that f = dq. If such a q exists we say that d divides f, that f is divisible by d, that f is a multiple of d, and call q the quotient of f and d. We also write q = f/d.

Corollary 1. Let f be a polynomial over the field F, and let c be an element of F. Then f is divisible by x - c if and only if f(c) = 0.

Proof. By the theorem, f = (x - c)q + r where r is a scalar polynomial. By Theorem 2,

f(c) = 0·q(c) + r(c) = r(c).
Polynomial Ideals, , Sec. 4.4, Hence r = 0 if and only if f(c), , = 0., , 1, , Dejinition., Let F be a field. An element c in F is said to be a root, a zero of a given polynomial f over F if f(c) = 0., , 2. A polynomial, , Corollary, , or, , f of degree n over aJield F has at most n roots, , in F., Proof. The result is obviously true for polynomials, of degree 0, and degree 1. We assume it to be true for polynomials, of degree n - 1. If, a is a root off, f = (x - a)q where 4 has degree n - 1. Since f(b) = 0 if, and only if a = b or q(b) = 0, it follows by our inductive assumption that, f has at most n roots. 1, The reader should observe that the main step in the proof of Theorem, 3 follows immediately from this corollary., The formal derivatives, of a polynomial, are useful in discussing mulof the polynomial, tiple roots. The derivative, f = co + ClZ +, , . . . + CnZ”, , is the polynomial, f’ = cl + 2czx + . . . + nc,xn-‘., We also use the notation Dj = f’. Differentiation, linear operator on F[s]. We have the higher, f” = Dzf 7f@’ = Oaf, and so on., , is linear, that is, D is a, order formal derivatives, , Theorem, 5 (Taylor’s, Formula)., Let F be a field of characteristic, zero, c an element of F, and n a positive integer. If f is a polynomial over f, with deg f _< n, then, , f = i,F, , (c)(x - c)“., , Proof. Taylor’s formula is a consequence of the binomial, and the linearity of the operators D, D2, . . . , Dn. The binomial, is easily proved by induction and asserts that, (a + b)n = 5, k=O, , T, , theorem, theorem, , ampk bk, , 0, , where, m, 0k, , m!, m(m - 1) . ~3 (m - k + 1), 1 . 2 ... k, = k!(m - Ic)! =, , is the familiar binomial coefficient giving the number, m objects taken Ic at a time. By the binomial theorem, , of combinations, , xm = [c + (x - c)]”, , cm-y5- c)”, = cm+, , mcm-l(x, , -, , c) +, , .. . +, , (x -, , c)”, , of, , 129
Polynomials, and this is the statement, , Chap. 4, of Taylor’s, , formula, , for the case f = xm. If, , f = jOumxthen, Dkf(c), , = z a,(Dkxq(c), m, , and, , = Z a,xm, =;, , [, , It should be noted that because the polynomials, 1, (x - c), . . . ,, (x - c)~ are linearly independent, (cf. Exercise 6, Section 4.2) Taylor’s, formula provides the unique method for writing f as a linear combination, of the polynomials, (x - c)” (0 5 k 5 n)., Although we shall not give any details, it is perhaps worth mentioning, at this point that with the proper interpretation, Taylor’s formula is also, valid for polynomials over fields of finite characteristic., If the field F has, finite characteristic, (the sum of some finite number of l’s in F is 0) then, we may have k! = 0 in F, in which case the division of (Dkf) (c) by lc! is, meaningless. Nevertheless,, sense can be made out of the division of Dkf, by k!, because every coefficient of Dkf is an element of F multiplied, by an, integer divisible by k! If all of this seems confusing, we advise the reader, to restrict his attention to fields of characteristic, 0 or to subfields of the, complex numbers., of c as a root of, If c is a root of the polynomial f, the multiplicity, f is the largest positive integer r such that (x - c)~ divides f., The multiplicity, of a root is clearly less than or equal to the degree, of f. For polynomials, over fields of characteristic, zero, the multiplicity, of c as a root off is related to the number of derivatives, off that are 0 at c., Theorem, 6. Let F be a field of characteristic zero and f a polynomial, over F with deg f 2 n. Then the scalar c is a root off of multiplicity, r if and, only if, Olklr-1, (D”f)(c), = 0,, , (Drf)(c), , f 0., , Proof. Suppose that r is the multiplicity, of c as a root off. Then, there is a polynomial, g such that f = (a: - c)‘g and g(c) # 0. For other-
Polynomial, , Sec. 4.4, , wise f would be divisible by (z - c) 7+1, by Corollary, Taylor’s formula applied to g, , f = (x - c)r [>;, , 9, , Ideals, , 1 of Theorem, , 4. By, , (c) (z - c)-], , = y ----&, (Dv) (x - c)‘fm, m=O ., Since there is only one way to write f as a linear combination, (z - c)” (0 5 k < n) it follows that, , of the powers, , OifO<kIr-1, (D”f 1Cc>, __- k!, =, , Dk-‘g(c) if r < k < n, ., 1 (k - T)!, , Therefore,, Dkf(c) = 0 for 0 5 k 5 r - 1, and D’f(c), = g(c) # 0. Conversely, if these conditions are satisfied, it follows at once from Taylor’s, formula that there is a polynomial g such that f = (5 - c)rg and g(c) # 0., Now suppose that r is not the largest positive integer such that (X - c)’, divides f. Then there is a polynomial, h such that f = (zr - c)‘+%. But, this implies g = (5 - c)h, by Corollary 2 of Theorem 1; hence g(c) = 0,, a contradiction., 1, Dejinition., Let F be a jield., An ideal, in F[x] is a subspace, F[x] such that fg belongs to M whenever f is in F[x] and g is in M., , M of, , over F, the set, EXAMPLE 5. If F is a field and d is a polynomial, J1 = dF [r], of all multiples df of d by arbitrary f in F [xl, is an ideal. For, M in fact contains d. If f, g belong to F[x], and c is a, M is non-empty,, scalar, then, c@f) - dg = 4cf - d, belongs to M, so that M is a subspace. Finally M contains (df)g = d(fg), as well. The ideal M is called the principal, ideal, generated, by d., EXAMPLE 6. Let dl,...,, d, be a finite number of polynomials over F., Then the sum M of the subspaces dzF[x] is a subspace and is also an ideal., For suppose p belongs to M. Then there exist polynomials fl, . . . , fn in, polynomial, F [r] such that p = dlfl + * . . + dnfn. If g is an arbitrary, over F, then, pg = dl(flg) + . . 1 + dn(fng), so that pg also belongs to M. Thus M is an ideal, and we say that M is the, by the polynomials,, dl, . . . , d,., ideal generated, EXAMPLE 7. Let F be a subfield, sider the ideal, M, , = (x + 2)F[x], , +, , of the complex, , numbers,, , (x” + 8x + 16)F[x]., , and con-
Polynomials, , Chap. 4, , We assert that M = F[x]., , For M contains, , x2 + 8x + 16 - x(x + 2) = 62 + 16, and hence M contains 6x + 16 - 6(x + 2) = 4. Thus, nomial 1 belongs to M as well as all its multiples., , the scalar poly-, , Theorem, 7. If F is a field, and M is any non-zero ideal in F[x], there, is a unique manic polynomial d in F[x] such that M is the principal ideal, generated by d., , Proof. By assumption, M contains a non-zero polynomial;, among, all non-zero polynomials in M there is a polynomial d of minimal degree., We may assume d is manic, for otherwise we can multiply d by a scalar to, make it manic. Now if f belongs to M, Theorem 4 shows that f = dq + r, where r = 0 or deg r < deg d. Since d is in M, dq and f - dq = r also, belong to M. Because d is an element of M of minimal degree we cannot, have deg r < deg d, so r = 0. Thus M = dF[x]. If g is another manic, polynomial, such that M = gF[x], then there exist non-zero polynomials, p, q such that d = gp and g = dq. Thus d = dpq and, deg d = deg d + deg p + deg q., Hence, d=g., , deg p = deg q = 0, and, 1, , as d, g are manic,, , p = q = 1. Thus, , It is worth observing that in the proof just given we have used a, special case of a more general and rather useful fact; namely, if p is a nonin M which is not, zero polynomial, in an ideal M and if f is a polynomial, r belongs to M, is, divisible by p, then f = pq + r where the ‘remainder’, different from 0, and has smaller degree than p. We have already made, use of this fact in Example 7 to show that the scalar polynomial, 1 is the, manic generator of the ideal considered there. In principle it is always, possible to find the manic polynomial, generating a given non-zero ideal., For one can ultimately obtain a polynomial in the ideal of minimal degree, by a finite number of successive divisions., pi, . . . , pn are polynomials over a field F, not all of, which are 0, there is a unique manic polynomial d in F[x] such that, Corollary., , If, , (a) d is in the ideal generated by pi, . . . , p,,;, (b) d divides each of the polynomials pi., Any polynomial satisfying (a) and (b) necessarily satisjies, (c) d is divisible by every polynomial which divides each of the polynomials pi, . . . , pn., Proof., , Let d be the manic generator, , of the ideal, , plF[xl + .-a + p,FCxl.
Polynomial Ideas, , Sec. 4.4, , Every member of this ideal is divisible by d; thus each of the polynomials, which divides each of, pi is divisible by d. N ow suppose f is a polynomial, the polynomials, PI, . . . , p,. Then there exist polynomials, gi, . . . , gn, such that p; = fgi, 1 5 i 5 n. Also, since d is in the ideal, , p,F[zl + * ** + PnF[~l,, there exist polynomials, , ~1, . . . , qn in F[z] such that, d = plql + . . . + pnqn., , Thus, d = f[glql + . . . + gnqnl., We have shown that d is a manic polynomial satisfying (a), (b), and (c)., If d’ is any polynomial, satisfying (a) and (b) it follows, from (a) and the, definition of d, that d’ is a scalar multiple of d and satisfies (c) as well., Finally, in case d’ is a manic polynomial,, we have d’ = d. 1, Dejinition., If pi, . . . , pn are polynomials, which are 0, the manic generator d of the ideal, , over a Jield F, not all of, , p,F[xl + . . . + pnF[xl, is called the greatest, common, divisor, (g.c.d.) of pl, . . . , pn. This, terminology is justiJied by the preceding corollary. We say that the polyprime, if their greatest common divisor, nomials p1, . . . , pn are relatively, is 1, or equivalently if the ideal they generate is all of F[x]., EXAMPLE 8. Let C be the field of complex numbers., , Then, , (a) g.c.d. (z + 2, x2 + 8x + 16) = 1 (see Example 7);, (b) g.c.d. ((x - 2)2(x + i), (x’ - 2)(x2 + 1)) = (J: - 2)(x i- i). For,, the ideal, (x - 2)2(x + i)F[x], , + (x - 2)(x2 + l)F[zl, , contains, (x - 2)2(z + i) - (5 - 2) (x2 + 1) = (x - 2) (x + i) (i - 2)., Hence it contains, , (x - 2)(x + i), which is monk, (z - 2)2(x + i), , and, , both, , (x - 2)(x2 + 1)., , EXAMPLE 9. Let F be the field of rational, M be the ideal generated by, (x - 1)(x + a2,, , and divides, , (x + 2)Yx - 3),, , numbers, and, , Then M contains, 4(x + 2)2[(x - 1) - (x - 3)] = (r + 2>2, and since, (x + 2)2 = (x - 3)(X + 7) - 17, , and in F[z], (x - 3)., , let
134, , Polynomials, , Chap. 4, , M contains, , the scalar, , (II: + q2kC - 3),, , (x - 1)(x + w,, are relatively, , 1. Thus M = F[z], , polynomial, , and the polynomials, , and, , (x - 3), , prime., , Exercises, 1. Let, of &[z], (a), (b), (c), (d), (e), , 2. Find, (a), (b), (c), , & be the field of rational numbers. Determine, which of the following, are ideals. When the set is an ideal, find its manic generator., all f of even degree;, allf of degree 2 5;, all f such that f(0) = 0;, all f such that f(2) = f(4) = 0;, all f in the range of the linear operator T defined by, , subsets, , the g.c.d. of each of the following pairs of polynomials, 29 - x3 - 3x2 - 6x + 4, x4 + x3 - x2 - 2~ - 2;, 324+822-3,23+2x2+3x+6;, x4-2~~-2~~-2~-3,~~+6~~+7~+1., , 3. Let A be an n X n matrix over a field F. Show that the set of all polynomials, f in F[x] such that f(A) = 0 is an ideal., 4. Let F be a subfield, , of the complex, A=, , Find the manic, f(A) = 0., , generator, , 5. Let F be a field., is an ideal., , numbers,, , [, , of the ideal, , Show that, , l-2, 0, , and let, , 1, , 3’, , of all polynomials, , the intersection, , f in F[z], , of any number, , such that, , of ideals in F[x], , 6. Let F be a field. Show that the ideal generated by a finite number of polynomials fi, . , . , fn in F[z] is the intersection, of all ideals containing, fi, . . . , fn., 7. Let K be a subfield of a field F, and suppose f, g are polynomials, in K[x]., Let MK be the ideal generated by f and g in K[x] and MP be the ideal they generate, in F[x]. Show that MK and MF have the same manic generator., , 4.5., , The, , Prime, , Factorization, , of a Polynomial, In this section we shall prove that each polynomial, over the field F, can be written as a product of ‘prime’ polynomials., This factorization, provides us with an effective tool for finding the greatest common divisor
Sec. 4.5, , The Prime Factorization of a Polynomial, , of a finite number of polynomials,, and in particular,, provides an effective, means for deciding when the polynomials are relatively, prime., De$nition., Let F be a jield. A polynomial f in F[x] is said to be, reducible, over F if there exist polynomials g, h in F[x] of degree 2 1 such, over F. A non-scalar, that f = gh, and if not, f is said to be irreducible, polynomial, over F, and we, irreducible polynomial over I; is called a prime, in F[x]., sometimes say it is a prime, , EXAMPLE 10. The polynomial, complex numbers. For, x2+1, , x2 + 1 is reducible, , over the field C of, , = (x+i)(x-ii), , and the polynomials, 2 + i, z - i belong to C[X]. On the other, 9 + 1 is irreducible over the field E of real numbers. For if, x2+1, , hand,, , = (az+b)(a’J:+b’), , with a, a’, b, b’ in R, then, aa’ = 1,, , ab’ + ba’ = 0,, , bb’ = 1., , These relations imply a2 + b2 = 0, which is impossible, a and b, unless a = b = 0., , with real numbers, , Theorem, 8. Let p, f, and g be polynomials over the Jield F’. Suppose, that p is a prime polynomial and that 11divides the product fg. Then either p, divides f or p divides g., , Proof. It is no loss of generality to assume that p is a manic prime, polynomial. The fact that p is prime then simply says that the only manic, divisors of p are 1 and p. Let d be the g.c.d. of f and p. Then either, d = 1 or d = p, since d is a monk polynomial which divides p. If d = p,, then p divides f and we are done. So suppose d = 1, i.e., suppose f and p, are relatively, prime. We shall prove that p divides g. Since (j, p) = 1,, there are polynomialsfO, and p. such that 1 = fof + pop. Multiplying, by g,, we obtain, 9 = MC7 + PoPg, = (fs)fo + P(PoS)., Since p divides fg it divides, p divides g. 1, , (fg)fo, and certainly, , p divides, , p(pog). Thus, , Corollary., If p is a prime and divides a product fl . . . f,, then p divides, one of the polynomials fl, . . . , f,., , Proof. The proof is by induction. When n = 2, the result is simply, the statement of Theorem 6. Suppose we have proved the corollary for, n = k, and that p divides the product fi . . . fk+l of some (k + 1) poly-
136, , Polynomials, , Chap. 4, , nomials. Since p divides (ji . . . jk)jk+l, either p divides jk+l or p divides, j-1 * * * fk. By the induction hypothesis, if p divides fi 9. . fk, then p divides, fj for some j, 1 5 j 5 k. So we see that in any case p must divide some fj,, llj<k+l., 1, Theorem, 9. If F is a jield, a non-scalar manic polynomial in F[x] can, be factored as a product of manic primes in F[x] in one and, except for order,, only one way., , over F. As, Proof. Suppose f is a non-scalar manic polynomial, polynomials, of degree one are irreducible,, there is nothing to prove if, deg f = 1. Suppose j has degree n > 1. By induction we may assume the, theorem is true for all non-scalar manic polynomials of degree less than n., If f is irreducible, it is already factored as a product of manic primes, and, otherwise j = gh where g and h are non-scalar manic polynomials, of, degree less than n. Thus g and h can be factored as products of manic, primes in F [z] and hence so can f. Now suppose, pm, =, q1 "', qn, f = p1e.e, where pl, . . . , p, and q1, . . . , qn are manic primes in F[x]. Then p,, divides the product ql . . . qm. By the above corollary, p, must divide, some qi. Since qi and p, are both manic primes, this means that, , (4-16), , pa = pm, , From (4-16) we see that m = n = 1 if either m = 1 or n = 1. For, , de f = i!, degP, = j$, degqjIn this case there is nothing more to prove, so we may assume m > 1 and, n > 1. By rearranging the q’s we can then assume p, = qnr and that, p1, , Now by Corollary, , **., , pm-1pwa, , 2 of Theorem, p1, , *.*, , =, , q1, , **., , 1 it follows, p?n-1, , As the polynomial, pl . . . P,-~ has, assumption applies and shows that, a rearrangement, of the sequence pl,, shows that the factorization, of f as, up to the order of the factors., 1, , =, , q1, , . . ., , qn4pm., that, qn-1., , degree less than n, our inductive, the sequence ql, . . . , q,,-1 is at most, . . . , p,-1. This together with (4-16), a product of manic primes is unique, , In the above factorization, of a given non-scalar manic polynomial j,, some of the manic prime factors may be repeated. If pl, pz, . . . , p, are, the distinct manic primes occurring in this factorization, of j, then, f = p;‘pF . . . p:‘,, (4-17), the exponent, , ni being the number, , of times the prime, , pi occurs in the
The Prime Factorization of a Polynomial, , Sec. 4.5, , factorization., This decomposition, is also clearly unique, and is called, decomposition, of f. It is easily verified that every manic, the primary, divisor off has the form, (4-18), , 0 I m; 5 ni., , p;l”‘pT * * * p?“,, , From (4-18) it follows that the g.c.d. of a finite number of non-scalar, manic polynomials fi, . . . , fs is obtained by combining all those manic’, primes which occur simultaneously, in the factorizations, of fi, . . . , fs., The exponent to which each prime is to be taken is the largest for which, the corresponding, prime power is a factor of each fi. If no (non-trivial), prime power is a factor of each fi, the polynomials are relatively prime., EXAMPLE 11. Suppose F is a field, and let a, b, c be distinct elements, of F. Then the polynomials x - a, z - b, x - c are distinct manic primes, in F[x]. If m, n, and s are positive integers, (x - c)~ is the g.c.d. of the, polynomials., (x - b)“(z - c)”, , and, , (x - CJ)~(X - c)”, , whereas the three polynomials, (x - b)“(z - c)*,, are relatively prime., Theorem, , 10., , (5 - a>yx - c)a,, , Let f be a non-scalar, , (x - a>yx - b)”, , monk polynomial, , over the field F, , and let, f = pp . . . p4, be the prime factorization, , of f. For each j, 1 5 j 5 k, let, fj, , Then t, . . . , fk are relatively, , =, , f/p;j, , = JIj PP’., , prime., , Proof. We leave the (easy) proof of this to the reader. We have, stated this theorem largely because we wish to refer to it later., 1, Theorem, 11. Let f be a polynomial, Then f is a product of distinct irreducible, f and f’ are relatively prime., , over the field F with derivative f’., polynomials over F if and only if, , Proof. Suppose in the prime factorization, of f over the field F, that some (non-scalar) prime polynomial, p is repeated. Then f = p2h for, some h in F[x]. Then, f’ = p2h’ + 2pp’h, and p is also a divisor of f’. Hence f and f’ are not relatively prime., Now suppose f = pl * . . pk, where pl, . . . , pk are distinct non-scalar, irreducible polynomials over F. Let fi = f/p+ Then, , s’ = p:fi +, , g&f* + * . * + dfk.
138, , Chap. 4, , Polynomials, , Let p be a prime polynomial which divides both f andf’. Then p = pi for, some i. Now pi divides fi for j # i, and since pi also divides, , we see that p, must divide p:ji. Therefore pi divides either fi or pi. But pi, does not divide f; since pl, . . . , pl, are distinct. So pi divides pi. This is, not possible, since pi has degree one less than the degree of pi. We conclude that no prime divides both f and s’, or that, f and 7 are relatively, prime., 1, Dejinition., , polynomial, , The Jeld F is called algebraically, over F has degree 1., , closed, , if every prime, , To say that F is algebraically, closed means every non-scalar irreducible manic polynomial, over F is of the form (J: - c). We have already, observed that each such polynomial is irreducible for any F. Accordingly,, an equivalent, definition, of an algebraically, closed field is a field F such, that each non-scalar polynomial f in F[x] can be expressed in the form, f = c(z - cp, , . . . (cc - cp, , where c is a scalar, cl, . . . , clc are distinct elements of F, and nl, . . . , nk, are positive integers. Still another formulation, is that if f is a non-scalar, polynomial over F, then there is an element c in F such that f(c) = 0., The field l2 of real numbers is not algebraically, closed, since the polynomial (9 + 1) is irreducible, over R but not of degree 1, or, because, there is no real number c such that c2 + 1 = 0. The so-called Fundamental Theorem of Algebra states that the field C of complex numbers is, algebraically, closed. We shall not prove this theorem, although we shall, use it somewhat later in this book. The proof is omitted partly because, of the limitations, of time and partly because the proof depends upon a, ‘non-algebraic’, property of the system of real numbers. For one possible, proof the interested reader may consult the book by Schreier and Sperner, in the Bibliography., The Fundamental, Theorem of Algebra also makes it clear what the, possibilities are for the prime factorization, of a polynomial, with real, coefficients. If f is a polynomial, with real coefficients and c is a complex, root off, then the complex conjugate 1 is also a root off. Therefore, those, complex roots which are not real must occur in conjugate pairs, and the, entire set of roots has the form {tl, . . . , tk, cl, El, . . . , cr, F,} where tl, . .., 2k, are real and cl, . . . , c7 are non-real complex numbers. Thus f factors, f = c(z - tl) * ’ * (Li?where pi is the quadratic, , t&l, , polynomial, pi = (cc - Ci)(Z - Fi)., , * ’ ’ p,
Sec. 4.5, , The Prime, , Factorization, , of a Polynomial, , These polynomials, pi have real coefficients., We conclude that every, irreducible polynomial over the real number field has degree 1 or 2. Each, polynomial over R is the product of certain linear factors, obtained from, the real roots off, and certain irreducible quadratic polynomials., , Exercises, 1. Let p be a manic, , prime, , polynomials, , polynomial, over F. Prove, , over the field F, and let j and g be relatively, that the g.c.d. of pj and pg is p., , 2. Assuming the Fundamental, Theorem of Algebra,, g are polynomials, over the field of complex numbers,, only if j and g have no common root., , prove the following. If j and, then g.c.d. (j, g) = 1 if and, , 3. Let D be the differentiation, operator on the space of polynomials, over the, field of complex numbers. Let j be a manic polynomial, over the field of complex, numbers. Prove that, j = (z - Cl) * . . (z - Ck), where cl, . . . , ck are distinct complex numbers if and only if j and Dj are relatively, prime. In other words, j has no repeated root if and only if j and Dj have no common root, (Assume the Fundamental, Theorem of Algebra.), 4. Prove, polynomials, , the following, generalization, of Taylor’s, over a subfield of the complex numbers,, , j(g) = $ Ij”)(h)(g, .?$=I3k!, , formula., Let j, g, and h be, with deg j 5 n. Then, , - h)k., , (Here j(g) denotes ‘j of g.‘), For the remaining, exercises, we shall need the following, definition., If j, g,, and p are polynomials, over the field F with p # 0, we say that j is congruent, to g, modulo, p if (j - g) is divisible, by p. If j is congruent, to g modulo p, we write, j = g mod p., 5. Prove, for any non-zero, lence relation., (a) It is reflexive: j = j, (b) It is symmetric:, if j, (c) It is transitive:, if j, , polynomial, , p, that congruence, , modulo, , p is an equiva-, , mod p., = g mod p, then g = j mod p., = g mod p and g = h mod p, then j = h mod p., , 6. Suppose j = g mod p and ji = g1 mod p., (a) Prove that j + ji = g + g1 mod p., (b) Prove that jfi = gg1 mod p., 7. Use Exercise 7 to prove the following. If j, g, h, and p are polynomials, field F and p # 0, and if j = g mod p, then h(j) = h(g) mod p., , over the, , 8. If p is an irreducible, polynomial, and jg = 0 mod p, prove that either, j = 0 mod p or g = 0 mod p. Give an example which shows that, this is false if p, is not irreducible., , 139
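The division algorithm of Theorem 4, the greatest common divisor of Section 4.4, and the repeated-factor test of Theorem 11 are all easy to experiment with. Below is a minimal sketch over the field of rationals, assuming polynomials are stored as coefficient lists [f_0, f_1, ...] with exact Fraction arithmetic; the helper names are illustrative only, not taken from the text.

```python
from fractions import Fraction

def trim(f):
    """Drop trailing zero coefficients; the 0-polynomial becomes []."""
    while f and f[-1] == 0:
        f = f[:-1]
    return f

def poly_divmod(f, d):
    """Return (q, r) with f = dq + r and r = 0 or deg r < deg d (Theorem 4)."""
    f = trim([Fraction(c) for c in f])
    d = trim([Fraction(c) for c in d])
    assert d, "d must be different from 0"
    q = [Fraction(0)] * max(len(f) - len(d) + 1, 1)
    while f and len(f) >= len(d):
        k = len(f) - len(d)              # gap between the degrees
        c = f[-1] / d[-1]                # next coefficient of the quotient
        q[k] = c
        f = trim([f[i] - c * d[i - k] if 0 <= i - k < len(d) else f[i]
                  for i in range(len(f))])
    return trim(q), f

def poly_gcd(f, g):
    """Monic greatest common divisor, computed by repeated division (Euclid)."""
    f = trim([Fraction(c) for c in f])
    g = trim([Fraction(c) for c in g])
    while g:
        f, g = g, poly_divmod(f, g)[1]
    return [c / f[-1] for c in f] if f else []

def derivative(f):
    """The formal derivative f' of Section 4.4."""
    return [i * Fraction(c) for i, c in enumerate(f)][1:]

# f = (x - 1)^2 (x + 2) = x^3 - 3x + 2 has a repeated factor, so by Theorem 11
# gcd(f, f') is not 1; the computation exhibits the repeated factor x - 1.
f = [2, -3, 0, 1]
print(poly_gcd(f, derivative(f)))        # [Fraction(-1, 1), Fraction(1, 1)], i.e. x - 1
```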
5. Determinants

5.1. Commutative Rings

In this chapter we shall prove the essential facts about determinants of square matrices. We shall do this not only for matrices over a field, but also for matrices with entries which are 'scalars' of a more general type. There are two reasons for this generality. First, at certain points in the next chapter, we shall find it necessary to deal with determinants of matrices with polynomial entries. Second, in the treatment of determinants which we present, one of the axioms for a field plays no role, namely, the axiom which guarantees a multiplicative inverse for each non-zero element. For these reasons, it is appropriate to develop the theory of determinants for matrices, the entries of which are elements from a commutative ring with identity.

Definition. A ring is a set K, together with two operations (x, y) → x + y and (x, y) → xy satisfying

(a) K is a commutative group under the operation (x, y) → x + y (K is a commutative group under addition);
(b) (xy)z = x(yz) (multiplication is associative);
(c) x(y + z) = xy + xz; (y + z)x = yx + zx (the two distributive laws hold).

If xy = yx for all x and y in K, we say that the ring K is commutative. If there is an element 1 in K such that 1x = x1 = x for each x, K is said to be a ring with identity, and 1 is called the identity for K.
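As a toy illustration of the definition (not from the text), the integers modulo 6 form a commutative ring with identity under addition and multiplication mod 6, but not a field, since 2 has no multiplicative inverse there. A minimal Python spot-check of the axioms:

```python
n = 6
elements = range(n)

def add(x, y): return (x + y) % n
def mul(x, y): return (x * y) % n

# Multiplication is associative and distributes over addition.
assert all(mul(mul(x, y), z) == mul(x, mul(y, z))
           for x in elements for y in elements for z in elements)
assert all(mul(x, add(y, z)) == add(mul(x, y), mul(x, z))
           for x in elements for y in elements for z in elements)

# xy = yx and 1 is an identity, so Z/6Z is a commutative ring with identity;
# but 2 has no multiplicative inverse, so it is not a field.
assert all(mul(x, y) == mul(y, x) for x in elements for y in elements)
assert all(mul(1, x) == x for x in elements)
assert not any(mul(2, x) == 1 for x in elements)
```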
Determinant Functions, , Sec. 5.2, , We are interested here in commutative, rings with identity. Such a, ring can be described briefly as a set K, together with two operations, which satisfy all the axioms for a field given in Chapter 1, except possibly, for axiom (8) and the condition 1 # 0. Thus, a field is a commutative, ring with non-zero identity such that to each non-zero x there corresponds, an element x:-r with xx-l = 1. The set of integers, with the usual operations, is a commutative, ring with identity which is not a field. Another, commutative, ring with identity is the set of all polynomials, over a field,, together with the addition and multiplication, which we have defined for, polynomials., If K is a commutative, ring with identity, we define an m X n matrix, over K to be a function A from the set of pairs (i, j) of integers, 1 5 i _< m,, 1 2 j < n, into K. As usual we represent such a matrix by a rectangular, array having m rows and n columns. The sum and product of matrices, over K are defined as for matrices over a field, (A + B)ij, (A&, , = Aij +, = z A&j, , Bij, , the sum being defined when A and B have the same number of rows and, the same number of columns, the product being defined when the number, of columns of A is equal to the number of rows of B. The basic algebraic, properties of these operations are again valid. For example,, A(B + C) = AB + AC,, , (AB)C, , = A(X),, , etc., , As in the case of fields, we shall refer to the elements of K as scalars., We may then define linear combinations, of the rows or columns of a, matrix as we did earlier. Roughly speaking, all that we previously did for, matrices over a field is valid for matrices over K, excluding those results, which depended upon the ability to ‘divide’ in K., , 5.2., , Determinant, , Functions, , Let K be a commutative, ring with identity. We wish to assign to, each n X n (square) matrix over K a scalar (element of K) to be known, as the determinant, of the matrix. It is possible to define the determinant, of a square matrix A by simply writing down a formula for this determinant in terms of the entries of A. One can then deduce the various properties of determinants, from this formula., However,, such a formula is, rather complicated, and to gain some technical advantage we shall proceed, as follows. We shall define a ‘determinant, function’ on Knxn as a function, which assigns to each n X n matrix over K a scalar, the function having, these special properties. It is linear as a function of each of the rows of the, , 141
Determinants, , Chap. 5, , matrix: its value is 0 on any matrix having two equal rows; and its value, on the n X n identity matrix is 1. We shall prove that such a function, exists, and then that it is unique, i.e., that there is precisely one such, function. As we prove the uniqueness, an explicit formula for the determinant will be obtained, along with many of its useful properties., This section will be devoted to the definition of ‘determinant, function’, and to the proof that at least one such function exists., DeJinition., Let Ei be a, integer, and let D be a function, a scalar D(A) in I<. We say, D is a linear function of the ith, , commutative ring with identity, n a positive, which assigns to each n X n matrix A over K, if for each i, 1 5 i 2 n,, that D is n-linear, row when the other (n - 1) rows are held jixed., , This definition, requires some clarification., If D is a function from, Knxn into K, and if (~1, . . . , QI, are the rows of the matrix A, let us also, write, D(A) = D(al, . . . , a,), that is, let us also think of D as the function, ment that D is n-linear then means, (5-l), , D(q, , of the rows of A. The state-, , . . . , C(Y(+ CI:, . . . , a,) = cD(a~, . . . , ai, . . . , ar,,), + WC-Q, . . . ) a:, . . . , a,)., , If we fix all rows except row i and regard D as a function of the ith row,, it is often convenient to write D(cxJ for D(A). Thus, we may abbreviate, (5-l) to, D(cai + a;) = cD(ai) + D(c~:), so long as it is clear what the meaning, , is., , EXAMPLE 1. Let kl, . . . , k, be positive integers, 1 5 Ici 5 n, and, let a be an element of K. For each n X n matrix A over K, define, (5-2), , D(A), , = aA(l,, , kI) ... A(n, k,)., , Then the function D defined by (5-2) is n-linear. For, if we regard D as a, function of the ith row of A, the others being fixed, we may write, D((Y;) = A(i, ki)b, where, have, , b is some fixed element, , of K. Let (Y: = (A&, . . . , Ak)., , D(cai + ai:) = [cA(i, ki) + A’(i, ki)]b, = cD(ai) + D(a:)*, Thus D is a linear function of each of the rows of A., A particular n-linear function of this type is, D(A), , = AllAs, , . . . A,,., , Then, , we
Determinant Functions, , Sec. 5.2, In other words, the ‘product, on Knxn., , of the diagonal, , entries’ is an n-linear, , function, , EXAMPLE 2. Let us find all 2-linear functions on 2 X 2 matrices over, K. Let D be such a function. If we denote the rows of the 2 X 2 identity, matrix by ~1, q, we have, D(A) = D(Ane1 + AEQ, Azel + A2d, Using the fact that D is 2-linear, (5-l), we have, D(A), , = A,ID(EI, A,I~ + Am) + &D(Q,, A,,e + A2d, = An&D(tl,, ~1) + AddV~l,, ez), + AddV~2,, 4 + &A&(Q,, Thus D is completely determined by the four scalars, and, D(Q, 4,, D(~I, ~2>,, D(Ez,, d,, The reader should find it easy to verify the following., four scalars in K and if we define, , D(A) = AnA21a + An&b, then D is a 2-linear, , function, , Lemma., , A linear combination, , ~2)., , If a, b, c, cl are any, , + AuAac + &z&d, , on 2 X 2 matrices, , D(EI, 61) = a,, D(Ez, ~1) = c,, , D(Ez,, , 4., , over K and, , D(EI, ~2) = b, DC ES,E2) = d., of n-linear junctions, , is n-linear., , Proof. It suffices to prove that a linear combination, of two, n-linear functions is n-linear. Let D and E be n-linear functions. If a and b, belong to K, the linear combination aD + bE is of course defined by, (aD + bE)(A), , = aD(A), , + bE(A)., , Hence, if we fix all rows except row i, (aD + bE)(ccq + a;) = aD(ccu; + (.y;) + bE(cai + (Y:), = acD(aJ + aD(ac:) + bcE(aJ + bE(cr:), = c(aD + bE)(ai) + (aD + bE)(a:)., 1, If K is a field and V is the set of n X n matrices, lemma says the following. The set of n-linear functions, of the space of all functions from V into K., EXAMPLE 3. Let D be the function, K by, , defined, , over K, the above, on V is a subspace, , on 2 X 2 matrices, , D(A) = &A,, - A,,&., (5-3), Now D is the sum of two functions of the type described, D = D1 + D,, DIM) = &I&Z, D,(A) = -A,,&., , in Example, , over, , 1:
Lemma. A linear combination of n-linear functions is n-linear.

Proof. It suffices to prove that a linear combination of two n-linear functions is n-linear. Let D and E be n-linear functions. If a and b belong to K, the linear combination aD + bE is of course defined by

(aD + bE)(A) = aD(A) + bE(A).

Hence, if we fix all rows except row i,

(aD + bE)(cαi + α'i) = aD(cαi + α'i) + bE(cαi + α'i)
                     = acD(αi) + aD(α'i) + bcE(αi) + bE(α'i)
                     = c(aD + bE)(αi) + (aD + bE)(α'i).

If K is a field and V is the set of n × n matrices over K, the above lemma says the following. The set of n-linear functions on V is a subspace of the space of all functions from V into K.

EXAMPLE 3. Let D be the function defined on 2 × 2 matrices over K by

(5-3)  D(A) = A11A22 − A12A21.

Now D is the sum of two functions of the type described in Example 1:

D = D1 + D2,   D1(A) = A11A22,   D2(A) = −A12A21.

By the above lemma, D is a 2-linear function. The reader who has had any experience with determinants will not find this surprising, since he will recognize (5-3) as the usual definition of the determinant of a 2 × 2 matrix. Of course the function D we have just defined is not a typical 2-linear function. It has many special properties. Let us note some of these properties. First, if I is the 2 × 2 identity matrix, then D(I) = 1, i.e., D(ε1, ε2) = 1. Second, if the two rows of A are equal, then

D(A) = A11A12 − A12A11 = 0.

Third, if A' is the matrix obtained from a 2 × 2 matrix A by interchanging its rows, then D(A') = −D(A); for

D(A') = A'11A'22 − A'12A'21 = A21A12 − A22A11 = −D(A).

Definition. Let D be an n-linear function. We say D is alternating (or alternate) if the following two conditions are satisfied:

(a) D(A) = 0 whenever two rows of A are equal.
(b) If A' is a matrix obtained from A by interchanging two rows of A, then D(A') = −D(A).

We shall prove below that any n-linear function D which satisfies (a) automatically satisfies (b). We have put both properties in the definition of alternating n-linear function as a matter of convenience. The reader will probably also note that if D satisfies (b) and A is a matrix with two equal rows, then D(A) = −D(A). It is tempting to conclude that D satisfies condition (a) as well. This is true, for example, if K is a field in which 1 + 1 ≠ 0, but in general (a) is not a consequence of (b).

Definition. Let K be a commutative ring with identity, and let n be a positive integer. Suppose D is a function from n × n matrices over K into K. We say that D is a determinant function if D is n-linear, alternating, and D(I) = 1.

As we stated earlier, we shall ultimately show that there is exactly one determinant function on n × n matrices over K. This is easily seen for 1 × 1 matrices A = [a] over K. The function D given by D(A) = a is a determinant function, and clearly this is the only determinant function on 1 × 1 matrices.
We are also in a position to dispose of the case n = 2. The function

D(A) = A11A22 − A12A21

was shown in Example 3 to be a determinant function. Furthermore, the formula exhibited in Example 2 shows that D is the only determinant function on 2 × 2 matrices. For we showed that for any 2-linear function D

D(A) = A11A21 D(ε1, ε1) + A11A22 D(ε1, ε2) + A12A21 D(ε2, ε1) + A12A22 D(ε2, ε2).

If D is alternating, then

D(ε1, ε1) = D(ε2, ε2) = 0

and

D(ε2, ε1) = −D(ε1, ε2) = −D(I).

If D also satisfies D(I) = 1, then

D(A) = A11A22 − A12A21.

EXAMPLE 4. Let F be a field and let D be any alternating 3-linear function on 3 × 3 matrices over the polynomial ring F[x]. Let

A = [ x   0   −x²
      0   1    0
      1   0    x³ ].

If we denote the rows of the 3 × 3 identity matrix by ε1, ε2, ε3, then

D(A) = D(xε1 − x²ε3, ε2, ε1 + x³ε3).

Since D is linear as a function of each row,

D(A) = xD(ε1, ε2, ε1 + x³ε3) − x²D(ε3, ε2, ε1 + x³ε3)
     = xD(ε1, ε2, ε1) + x⁴D(ε1, ε2, ε3) − x²D(ε3, ε2, ε1) − x⁵D(ε3, ε2, ε3).

Because D is alternating it follows that

D(A) = (x⁴ + x²)D(ε1, ε2, ε3).
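Taking for D the determinant function constructed later in this section (for which D(ε1, ε2, ε3) = 1), Example 4 predicts det A = x⁴ + x². A quick check, assuming the sympy library is available:

```python
import sympy as sp

x = sp.Symbol('x')
A = sp.Matrix([[x, 0, -x**2],
               [0, 1,      0],
               [1, 0,  x**3]])
# Example 4 with D = det, where det(e1, e2, e3) = 1
assert sp.expand(A.det()) == x**4 + x**2
```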
Lemma. Let D be a 2-linear function with the property that D(A) = 0 for all 2 × 2 matrices A over K having equal rows. Then D is alternating.

Proof. What we must show is that if A is a 2 × 2 matrix and A' is obtained by interchanging the rows of A, then D(A') = −D(A). If the rows of A are α and β, this means we must show that D(β, α) = −D(α, β). Since D is 2-linear,

D(α + β, α + β) = D(α, α) + D(α, β) + D(β, α) + D(β, β).

By our hypothesis D(α + β, α + β) = D(α, α) = D(β, β) = 0. So

0 = D(α, β) + D(β, α).

Lemma. Let D be an n-linear function on n × n matrices over K. Suppose D has the property that D(A) = 0 whenever two adjacent rows of A are equal. Then D is alternating.

Proof. We must show that D(A) = 0 when any two rows of A are equal, and that D(A') = −D(A) if A' is obtained by interchanging some two rows of A. First, let us suppose that A' is obtained by interchanging two adjacent rows of A. The reader should see that the argument used in the proof of the preceding lemma extends to the present case and gives us D(A') = −D(A).

Now let B be obtained by interchanging rows i and j of A, where i < j. We can obtain B from A by a succession of interchanges of pairs of adjacent rows. We begin by interchanging row i with row (i + 1) and continue until the rows are in the order

α1, ..., α(i−1), α(i+1), ..., αj, αi, α(j+1), ..., αn.

This requires k = j − i interchanges of adjacent rows. We now move αj to the ith position using (k − 1) interchanges of adjacent rows. We have thus obtained B from A by k + (k − 1) = 2k − 1 interchanges of adjacent rows. Thus

D(B) = (−1)^(2k−1) D(A) = −D(A).

Suppose A is any n × n matrix with two equal rows, say αi = αj with i < j. If j = i + 1, then A has two equal and adjacent rows and D(A) = 0. If j > i + 1, we interchange α(i+1) and αj, and the resulting matrix B has two equal and adjacent rows, so D(B) = 0. On the other hand, D(B) = −D(A), hence D(A) = 0.

Definition. If n > 1 and A is an n × n matrix over K, we let A(i|j) denote the (n − 1) × (n − 1) matrix obtained by deleting the ith row and jth column of A. If D is an (n − 1)-linear function and A is an n × n matrix, we put Dij(A) = D[A(i|j)].

Theorem 1. Let n > 1 and let D be an alternating (n − 1)-linear function on (n − 1) × (n − 1) matrices over K. For each j, 1 ≤ j ≤ n, the function Ej defined by

(5-4)  Ej(A) = Σ_{i=1}^n (−1)^{i+j} Aij Dij(A)

is an alternating n-linear function on n × n matrices A. If D is a determinant function, so is each Ej.

Proof. If A is an n × n matrix, Dij(A) is independent of the ith row of A. Since D is (n − 1)-linear, it is clear that Dij is linear as a function of any row except row i. Therefore Aij Dij(A) is an n-linear function of A. A linear combination of n-linear functions is n-linear; hence, Ej is n-linear. To prove that Ej is alternating, it will suffice to show that Ej(A) = 0 whenever A has two equal and adjacent rows. Suppose αk = α(k+1). If i ≠ k and i ≠ k + 1, the matrix A(i|j) has two equal rows, and thus Dij(A) = 0. Therefore

Ej(A) = (−1)^{k+j} Akj Dkj(A) + (−1)^{k+1+j} A(k+1)j D(k+1)j(A).
Since αk = α(k+1),

Akj = A(k+1)j   and   A(k|j) = A(k+1|j).

Clearly then Ej(A) = 0.

Now suppose D is a determinant function. If I^(n) is the n × n identity matrix, then I^(n)(j|j) is the (n − 1) × (n − 1) identity matrix I^(n−1). Since I^(n)ij = δij, it follows from (5-4) that

(5-5)  Ej(I^(n)) = D(I^(n−1)).

Now D(I^(n−1)) = 1, so that Ej(I^(n)) = 1 and Ej is a determinant function.

Corollary. Let K be a commutative ring with identity and let n be a positive integer. There exists at least one determinant function on K^{n×n}.

Proof. We have shown the existence of a determinant function on 1 × 1 matrices over K, and even on 2 × 2 matrices over K. Theorem 1 tells us explicitly how to construct a determinant function on n × n matrices, given such a function on (n − 1) × (n − 1) matrices. The corollary follows by induction.
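The induction in the corollary can be carried out literally: expand along a column, with 1 × 1 determinants at the bottom. A rough sketch (helper names ours, 0-based indices); it also illustrates the point, made in Example 5 below, that the different column choices E1, ..., En give the same value.

```python
def minor(A, i, j):
    # the matrix A(i|j): delete row i and column j (0-indexed)
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def E(A, j=0):
    # E_j(A) = sum_i (-1)^(i+j) * A[i][j] * D(A(i|j)), as in (5-4),
    # with D built recursively down to the 1 x 1 case
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** (i + j) * A[i][j] * E(minor(A, i, j)) for i in range(n))

A = [[1, -1, 2], [0, 3, 1], [4, 0, 2]]
print([E(A, j) for j in range(3)])   # the same value three times
```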
EXAMPLE 5. If B is a 2 × 2 matrix over K, we let

|B| = B11B22 − B12B21.

Then |B| = D(B), where D is the determinant function on 2 × 2 matrices. We showed that this function on K^{2×2} is unique. Let

A = [ A11  A12  A13
      A21  A22  A23
      A31  A32  A33 ]

be a 3 × 3 matrix over K. If we define E1, E2, E3 as in (5-4), then

(5-6)  E1(A) = A11 |A22 A23; A32 A33| − A21 |A12 A13; A32 A33| + A31 |A12 A13; A22 A23|
(5-7)  E2(A) = −A12 |A21 A23; A31 A33| + A22 |A11 A13; A31 A33| − A32 |A11 A13; A21 A23|
(5-8)  E3(A) = A13 |A21 A22; A31 A32| − A23 |A11 A12; A31 A32| + A33 |A11 A12; A21 A22|.

It follows from Theorem 1 that E1, E2, and E3 are determinant functions. Actually, as we shall show later, E1 = E2 = E3, but this is not yet apparent even in this simple case. It could, however, be verified directly, by expanding each of the above expressions. Instead of doing this we give some specific examples.

(a) Let K = R[x] and

A = [ x − 1   x²      x³
      0       x − 2   1
      0       0       x − 3 ].

Then

E1(A) = (x − 1) |x − 2  1; 0  x − 3| = (x − 1)(x − 2)(x − 3),
E2(A) = −x² |0  1; 0  x − 3| + (x − 2) |x − 1  x³; 0  x − 3| = (x − 1)(x − 2)(x − 3),

and

E3(A) = x³ |0  x − 2; 0  0| − |x − 1  x²; 0  0| + (x − 3) |x − 1  x²; 0  x − 2| = (x − 1)(x − 2)(x − 3).

(b) Let K = R and

A = [ 0  1  0
      0  0  1
      1  0  0 ].

Then

E1(A) = |1  0; 0  1| = 1,   E2(A) = −|0  1; 1  0| = 1,   E3(A) = −|0  1; 1  0| = 1.

Exercises

1. Each of the following expressions defines a function D on the set of 3 × 3 matrices over the field of real numbers. In which of these cases is D a 3-linear function?
(a) D(A) = A11 + A22 + A33;
(b) D(A) = (A11)² + 3A11A22;
(c) D(A) = A11A22A33;
(d) D(A) = A11A22A32 + 5A12A22A32;
(e) D(A) = 0;
(f) D(A) = 1.

2. Verify directly that the three functions E1, E2, E3 defined by (5-6), (5-7), and (5-8) are identical.

3. Let K be a commutative ring with identity. If A is a 2 × 2 matrix over K, the classical adjoint of A is the 2 × 2 matrix adj A defined by

adj A = [ A22  −A12 ; −A21  A11 ].

If det denotes the unique determinant function on 2 × 2 matrices over K, show that
Determinant Functions, , Sec. 5.2, (a) (adj A)A = A(adj A) = (det A)I;, (b) det (adj A) = det (A);, (c) adj (At) = (adj A)t., (At denotes the transpose of A.), , 4. Let A be a 2 X 2 matrix over a field F. Show that A is invertible, if det A # 0. When A is invertible, give a formula for A-l., , if and only, , 5. Let A be a 2 X 2 matrix over a field F, and suppose that A2 = 0. Show for, each scalar c that det (cl - A) = c2., 6. Let K be a subfield of the complex numbers and n a positive integer. Let, 311 f * *, j, and ICI, . . . , k, be positive integers not exceeding n. For an n X n, matrix A over K define, D(A) = A($, k3A(jz,k2), , . . . ALL,, , W., , Prove that D is n-linear if and only if the integers jr, . . . , j,, are distinct., 7. Let K be a commutative ring with identity. Show that the determinant function on 2 X 2 matrices A over K is alternating and 2-linear as a function of the, columns of A., 8. Let K be a commutative, matrices over K by the rule, , Show that D is alternating, 9. Let, function, (a), (b), one row, , ring with identity. Define a function D on 3 X 3, , and 3-linear as a function of the columns of A., , K be a commutative ring with identity and D an alternating nJinear, on n X n matrices over K. Show that, D(A) = 0, if one of the rows of A is 0., D(B) = D(A), if B is obtained from A by adding a scalar multiple of, of A to another., , 10. Let P be a field, A a 2 X 3 matrix over F, and (cl, c2, ct) the vector in F3, defined by, , Show that, (a) rank (A) = 2 if and only if (cl, ~2,c3) # 0;, (b) if A has rank 2, then (cl, c2,ca) is a basis for the solution space of the, system of equations AX = 0., 11. Let K be a commutative ring with identity, and let D be an alternating 2-linear, function on 2 X 2 matrices over K, Show that D(A) = (det A)D(I) for all A., Now use this result (no computations with the entries allowed) to show that, det (AB) = (det A)(det B) for any 2 X 2 matrices A and B over K., 12. Let F be a field and D a function on n X n matrices over F (with values in F)., , Suppose D(AB) = D(A)D(B) for all A, B. Show that either D(A) = 0 for all A,, or D(I) = 1. In the latter case show that D(A) Z 0 whenever A is invertible., 13. Let R be the field of real numbers, and let D be a function on 2 X 2 matrices
over R, with values in R, such that D(AB) = D(A)D(B) for all A, B. Suppose also that

D([ 0  1 ; 1  0 ]) ≠ D([ 1  0 ; 0  1 ]).

Prove the following.
(a) D(0) = 0;
(b) D(A) = 0 if A² = 0;
(c) D(B) = −D(A) if B is obtained by interchanging the rows (or columns) of A;
(d) D(A) = 0 if one row (or one column) of A is 0;
(e) D(A) = 0 whenever A is singular.

14. Let A be a 2 × 2 matrix over a field F. Then the set of all matrices of the form f(A), where f is a polynomial over F, is a commutative ring K with identity. If B is a 2 × 2 matrix over K, the determinant of B is then a 2 × 2 matrix over F, of the form f(A). Suppose I is the 2 × 2 identity matrix over F and that B is the 2 × 2 matrix over K

B = [ A − A11·I   −A12·I ; −A21·I   A − A22·I ].

Show that det B = f(A), where f = x² − (A11 + A22)x + det A, and also that f(A) = 0.

5.3. Permutations and the Uniqueness of Determinants

In this section we prove the uniqueness of the determinant function on n × n matrices over K. The proof will lead us quite naturally to consider permutations and some of their basic properties.

Suppose D is an alternating n-linear function on n × n matrices over K. Let A be an n × n matrix over K with rows α1, α2, ..., αn. If we denote the rows of the n × n identity matrix over K by ε1, ε2, ..., εn, then

(5-9)  αi = Σ_{j=1}^n A(i, j) εj,   1 ≤ i ≤ n.

Hence

D(A) = D( Σ_j A(1, j) εj, α2, ..., αn ) = Σ_j A(1, j) D(εj, α2, ..., αn).

If we now replace α2 by Σ_k A(2, k) εk, we see that

D(εj, α2, ..., αn) = Σ_k A(2, k) D(εj, εk, α3, ..., αn).

Thus

D(A) = Σ_{j,k} A(1, j) A(2, k) D(εj, εk, α3, ..., αn).
In D(εj, εk, α3, ..., αn) we next replace α3 by Σ_l A(3, l) εl, and so on. We finally obtain a complicated but theoretically important expression for D(A), namely

(5-10)  D(A) = Σ_{k1, k2, ..., kn} A(1, k1) A(2, k2) ⋯ A(n, kn) D(ε_{k1}, ε_{k2}, ..., ε_{kn}).

In (5-10) the sum is extended over all sequences (k1, k2, ..., kn) of positive integers not exceeding n. This shows that D is a finite sum of functions of the type described by (5-2). It should be noted that (5-10) is a consequence just of the assumption that D is n-linear, and that a special case of (5-10) was obtained in Example 2. Since D is alternating,

D(ε_{k1}, ε_{k2}, ..., ε_{kn}) = 0

whenever two of the indices ki are equal. A sequence (k1, k2, ..., kn) of positive integers not exceeding n, with the property that no two of the ki are equal, is called a permutation of degree n. In (5-10) we need therefore sum only over those sequences which are permutations of degree n.

Since a finite sequence, or n-tuple, is a function defined on the first n positive integers, a permutation of degree n may be defined as a one-one function from the set {1, 2, ..., n} onto itself. Such a function σ corresponds to the n-tuple (σ1, σ2, ..., σn) and is thus simply a rule for ordering 1, 2, ..., n in some well-defined way.

If D is an alternating n-linear function and A is an n × n matrix over K, we then have

(5-11)  D(A) = Σ_σ A(1, σ1) ⋯ A(n, σn) D(ε_{σ1}, ..., ε_{σn})

where the sum is extended over the distinct permutations σ of degree n. Next we shall show that

(5-12)  D(ε_{σ1}, ..., ε_{σn}) = ±D(ε1, ..., εn)

where the sign ± depends only on the permutation σ. The reason for this is as follows. The sequence (σ1, σ2, ..., σn) can be obtained from the sequence (1, 2, ..., n) by a finite number of interchanges of pairs of elements. For example, if σ1 ≠ 1, we can transpose 1 and σ1, obtaining (σ1, ..., 1, ...). Proceeding in this way we shall arrive at the sequence (σ1, ..., σn) after n or fewer such interchanges of pairs. Since D is alternating, the sign of its value changes each time that we interchange two of the rows εi and εj. Thus, if we pass from (1, 2, ..., n) to (σ1, σ2, ..., σn) by means of m interchanges of pairs (i, j), we shall have

D(ε_{σ1}, ..., ε_{σn}) = (−1)^m D(ε1, ..., εn).

In particular, if D is a determinant function

(5-13)  D(ε_{σ1}, ..., ε_{σn}) = (−1)^m
where m depends only upon σ, not upon D. Thus all determinant functions assign the same value to the matrix with rows ε_{σ1}, ..., ε_{σn}, and this value is either 1 or −1.

Now a basic fact about permutations is the following. If σ is a permutation of degree n, one can pass from the sequence (1, 2, ..., n) to the sequence (σ1, σ2, ..., σn) by a succession of interchanges of pairs, and this can be done in a variety of ways; however, no matter how it is done, the number of interchanges used is either always even or always odd. The permutation is then called even or odd, respectively. One defines the sign of a permutation by

sgn σ = 1, if σ is even
        −1, if σ is odd

the symbol '1' denoting here the integer 1.

We shall show below that this basic property of permutations can be deduced from what we already know about determinant functions. Let us assume this for the time being. Then the integer m occurring in (5-13) is always even if σ is an even permutation, and is always odd if σ is an odd permutation. For any alternating n-linear function D we then have

D(ε_{σ1}, ..., ε_{σn}) = (sgn σ) D(ε1, ..., εn)

and using (5-11)

(5-14)  D(A) = [ Σ_σ (sgn σ) A(1, σ1) ⋯ A(n, σn) ] D(I).

Of course I denotes the n × n identity matrix.

From (5-14) we see that there is precisely one determinant function on n × n matrices over K. If we denote this function by det, it is given by

(5-15)  det(A) = Σ_σ (sgn σ) A(1, σ1) ⋯ A(n, σn)

the sum being extended over the distinct permutations σ of degree n. We can formally summarize as follows.

Theorem 2. Let K be a commutative ring with identity and let n be a positive integer. There is precisely one determinant function on the set of n × n matrices over K, and it is the function det defined by (5-15). If D is any alternating n-linear function on K^{n×n}, then for each n × n matrix A

D(A) = (det A) D(I).
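Formula (5-15) can be programmed directly. A small sketch (names ours), with the sign of a permutation computed by counting inversions, which gives the same parity as counting interchanges:

```python
from itertools import permutations

def sgn(sigma):
    # sign of a permutation of (0, 1, ..., n-1), from the number of inversions;
    # this equals (-1)^m for any way of reaching sigma by m interchanges of pairs
    inversions = sum(1 for i in range(len(sigma))
                       for j in range(i + 1, len(sigma))
                       if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1

def det(A):
    # det A = sum over sigma of (sgn sigma) * A(1, sigma 1) ... A(n, sigma n), as in (5-15)
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        term = sgn(sigma)
        for i in range(n):
            term *= A[i][sigma[i]]
        total += term
    return total

print(det([[1, -1, 2], [0, 3, 1], [4, 0, 2]]))   # same value as the cofactor expansion above
```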
This is the theorem we have been seeking, but we have left a gap in the proof. That gap is the proof that for a given permutation σ, when we pass from (1, 2, ..., n) to (σ1, σ2, ..., σn) by interchanging pairs, the number of interchanges is always even or always odd. This basic combinatorial fact can be proved without any reference to determinants; however, we should like to point out how it follows from the existence of a determinant function on n × n matrices.

Let us take K to be the ring of integers. Let D be a determinant function on n × n matrices over K. Let σ be a permutation of degree n, and suppose we pass from (1, 2, ..., n) to (σ1, σ2, ..., σn) by m interchanges of pairs (i, j), i ≠ j. As we showed in (5-13),

(−1)^m = D(ε_{σ1}, ..., ε_{σn})

that is, the number (−1)^m must be the value of D on the matrix with rows ε_{σ1}, ..., ε_{σn}. If

D(ε_{σ1}, ..., ε_{σn}) = 1,

then m must be even. If

D(ε_{σ1}, ..., ε_{σn}) = −1,

then m must be odd.

Since we have an explicit formula for the determinant of an n × n matrix and this formula involves the permutations of degree n, let us conclude this section by making a few more observations about permutations. First, let us note that there are precisely n! = 1 · 2 ⋯ n permutations of degree n. For, if σ is such a permutation, there are n possible choices for σ1; when this choice has been made, there are (n − 1) choices for σ2, then (n − 2) choices for σ3, and so on. So there are

n(n − 1)(n − 2) ⋯ 2 · 1 = n!

permutations σ. The formula (5-15) for det(A) thus gives det(A) as a sum of n! terms, one for each permutation of degree n. A given term is a product

A(1, σ1) ⋯ A(n, σn)

of n entries of A, one entry from each row and one from each column, and is prefixed by a '+' or '−' sign according as σ is an even or odd permutation.

When permutations are regarded as one-one functions from the set {1, 2, ..., n} onto itself, one can define a product of permutations. The product of σ and τ will simply be the composed function στ defined by

(στ)(i) = σ(τ(i)).

If e denotes the identity permutation, e(i) = i, then each σ has an inverse σ⁻¹ such that

σσ⁻¹ = σ⁻¹σ = e.

One can summarize these observations by saying that, under the operation of composition, the set of permutations of degree n is a group. This group is usually called the symmetric group of degree n.

From the point of view of products of permutations, the basic property of the sign of a permutation is that

(5-16)  sgn(στ) = (sgn σ)(sgn τ).
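A brief numerical check of (5-16), with the composition (στ)(i) = σ(τ(i)) and the sign again computed from inversions (a sketch only; names ours):

```python
import random
from itertools import permutations

def sgn(sigma):
    inv = sum(1 for i in range(len(sigma))
                for j in range(i + 1, len(sigma)) if sigma[i] > sigma[j])
    return -1 if inv % 2 else 1

def compose(sigma, tau):
    # (sigma tau)(i) = sigma(tau(i))
    return tuple(sigma[tau[i]] for i in range(len(sigma)))

perms = list(permutations(range(5)))
for _ in range(100):
    sigma, tau = random.choice(perms), random.choice(perms)
    assert sgn(compose(sigma, tau)) == sgn(sigma) * sgn(tau)   # (5-16)
```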
In other words, στ is an even permutation if σ and τ are either both even or both odd, while στ is odd if one of the two permutations is odd and the other is even. One can see this from the definition of the sign in terms of successive interchanges of pairs (i, j). It may also be instructive if we point out how sgn(στ) = (sgn σ)(sgn τ) follows from a fundamental property of determinants.

Let K be the ring of integers and let σ and τ be permutations of degree n. Let ε1, ..., εn be the rows of the n × n identity matrix over K, let A be the matrix with rows ε_{τ1}, ..., ε_{τn}, and let B be the matrix with rows ε_{σ1}, ..., ε_{σn}. The ith row of A contains exactly one non-zero entry, namely the 1 in column τi. From this it is easy to see that ε_{(στ)i} is the ith row of the product matrix AB. Now

det(A) = sgn τ,   det(B) = sgn σ,   and   det(AB) = sgn(στ).

So we shall have sgn(στ) = (sgn σ)(sgn τ) as soon as we prove the following.

Theorem 3. Let K be a commutative ring with identity, and let A and B be n × n matrices over K. Then

det(AB) = (det A)(det B).

Proof. Let B be a fixed n × n matrix over K, and for each n × n matrix A define D(A) = det(AB). If we denote the rows of A by α1, ..., αn, then

D(α1, ..., αn) = det(α1B, ..., αnB).

Here αjB denotes the 1 × n matrix which is the product of the 1 × n matrix αj and the n × n matrix B. Since

(cαi + α'i)B = cαiB + α'iB

and det is n-linear, it is easy to see that D is n-linear. If αi = αj, then αiB = αjB, and since det is alternating,

D(α1, ..., αn) = 0.

Hence, D is alternating. Now D is an alternating n-linear function, and by Theorem 2

D(A) = (det A)D(I).

But D(I) = det(IB) = det B, so

det(AB) = D(A) = (det A)(det B).

The fact that sgn(στ) = (sgn σ)(sgn τ) is only one of many corollaries to Theorem 3. We shall consider some of these corollaries in the next section.
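The permutation-matrix argument preceding Theorem 3, and the theorem itself, can be illustrated numerically (a sketch assuming numpy; floating-point determinants, so the comparisons are approximate):

```python
import numpy as np

rng = np.random.default_rng(0)

# Theorem 3 on random integer matrices
A = rng.integers(-3, 4, size=(4, 4))
B = rng.integers(-3, 4, size=(4, 4))
assert np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))

# If P_tau has rows e_{tau(1)}, ..., e_{tau(n)}, then det P_tau = sgn tau, and
# P_tau P_sigma has rows e_{(sigma tau)(i)}, so Theorem 3 gives (5-16).
def perm_matrix(sigma):
    n = len(sigma)
    P = np.zeros((n, n), dtype=int)
    for i, s in enumerate(sigma):
        P[i, s] = 1
    return P

tau, sigma = (2, 0, 1, 3), (1, 3, 2, 0)
P, Q = perm_matrix(tau), perm_matrix(sigma)
print(round(np.linalg.det(P)), round(np.linalg.det(Q)), round(np.linalg.det(P @ Q)))
# prints 1 1 1 here: sgn tau, sgn sigma, sgn(sigma tau)
```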
Sec. 5.3, , Permutations and the Uniqueness of Determinants, , Exercises, 1. If, , K is a commutative, , ring with identity, A=[:;, , and A is the matrix, j, , over K given by, , 3, , show that det A = 0., 2. Prove, , that the determinant, , of the Vandermonde, , a)(c -, , a2, b b2, , [ 1, 1, 1, 1, , is (6 - a)(c -, , matrix, , b)., , a, , c, , c2, , 3. List explicitly, the six permutations, of degree 3, state which are odd and which, are even, and use this to give the complete formula (5-15) for the determinant, of a, 3 X 3 matrix., 4. Let, , u and r be the permutations, , of degree, , 4 defined, , by al = 2, a2 = 3,, , a3 = 4, u4 = 1, 71 = 3, 72 = 1, 73 = 2, 74 = 4., , (a) Is (T odd or even? Is r odd or even?, (b) Find UT and TU., 5. If A is an invertible, , n X, , n matrix, , 6. Let A be a 2 X 2 matrix, if and only if trace (A) = 0., , over a field, show that, , over a field., , Prove, , that, , det A # 0., , det (I + A) = 1 + det A, , 7. An n X n matrix, A is called triangular, if Aii = 0 whenever i > j or if, Aii = 0 whenever i < j. Prove that the determinant, of a triangular, matrix is the, . . . A,, of its diagonal entries., product AIlA, 8. Let A be a 3 X 3 matrix over the field of complex numbers. We form the, entries, the i, j entry of this matrix being the, matrix zl - A with polynomial, polynomial, 6+r - Ai+ If j = det (21 - A), show that j is a manic polynomial, of degree 3. If we write, j = (x - Cl) (z with complex, , numbers, , -, , c3), , cl, cz, and c3, prove that, , ci + c2 + c3 = trace (A), 9. Let n be a positive, prove that the function, , cz)(z, , integer, , and, , and F a field., , clczc3 = det A., If u is a permutation, , of degree n,, , T(z1, . . . , 2,) = (2,1, . . . ,2.n), is an invertible, , linear, , operator, , on F”., , F be a field, n a positive integer, and S the set of n X n matrices over F., Let V be the vector space of all functions from S into F. Let W be the set of alternating n-linear functions on S. Prove that W is a subspace of V. What is the dimension of W?, 10. Let, , 155
166, , Chap. 5, , Determinants, 11. Let T be a linear operator on Fn. Define, , Mai, . . . , a,) = det (Toll, . . , , Ta.)., (a) Show that DT is an alternating n-linear function., (b) If, c = det (TEE, . . . , TE,), show that for any n vectors crl, . . . , cr., we have, det (!!‘cui, . . . , Tcr,) =cdet((~~,...,(~J., (c) If @ is any ordered basis for Fn and A is the matrix of T in the ordered, basis CR,show that det A = c., (d) What do you think is a reasonable name for the scalar c?, 12. If u is a permutation of degree n and A is an n X n matrix over the field F, with row vectors 01, . . . , ayn,let a(A) denote the n X n matrix with row vectors, a&l,, , . . . ) Gn., , (a) Prove that u(AB) = u(A)B, and in particular that u(A) = a(l, (b) If T is the linear operator of Exercise 9, prove that the matrix of T in, the standard ordered basis is u(l)., (c) Is u-l(I) the inverse matrix of u(I)?, (d) Is it true that u(A) is similar to A?, 13. Prove that the sign function on permutations is unique in the following sense., If f is any function which assigns to each permutation of degree n an integer, and, if I, = f(u)f(~), then f is identically 0, or f is identically 1, or f is the sign, function., , 5.4., , Additional, , Properties, , of Determinants, , In this section we shall relate some of the useful properties of the, determinant, function on n X n matrices. Perhaps the first thing we should, point out is the following. In our discussion of det A, the rows of A have, played a privileged role. Since there is no fundamental, difference between, rows and columns, one might very well expect that det A is an alternating, n-linear function of the columns of A. This is the case, and to prove it,, it suffices to show that, det (At) = det (A), , (5-17), , where At denotes the transpose of A., If u is a permutation, of degree n,, At@,, , From the expression, , ui), , = A(&,, , (5-15) one then has, , det (At) = L: (sgn u)A(al,, , 1) . * - A(un, n)., , c, , When i = u-‘j, A(&,, , i)., , i) = A(j,, , u-‘j)., , Thus, , A(u1, 1) . . . A(un, n) = A(l,, , a-ll), , . . . A(n, u-h).
Since σσ⁻¹ is the identity permutation,

(sgn σ)(sgn σ⁻¹) = 1   or   sgn(σ⁻¹) = sgn(σ).

Furthermore, as σ varies over all permutations of degree n, so does σ⁻¹. Therefore

det(Aᵗ) = Σ_σ (sgn σ⁻¹) A(1, σ⁻¹1) ⋯ A(n, σ⁻¹n) = det A

proving (5-17).

On certain occasions one needs to compute specific determinants. When this is necessary, it is frequently useful to take advantage of the following fact. If B is obtained from A by adding a multiple of one row of A to another (or a multiple of one column to another), then

(5-18)  det B = det A.

We shall prove the statement about rows. Let B be obtained from A by adding cαj to αi, where i < j. Since det is linear as a function of the ith row,

det B = det A + c det(α1, ..., αj, ..., αj, ..., αn) = det A.

Another useful fact is the following. Suppose we have an n × n matrix of the block form

[ A  B ; 0  C ]

where A is an r × r matrix, C is an s × s matrix, B is r × s, and 0 denotes the s × r zero matrix. Then

(5-19)  det [ A  B ; 0  C ] = (det A)(det C).

To prove this, define

D(A, B, C) = det [ A  B ; 0  C ].

If we fix A and B, then D is alternating and s-linear as a function of the rows of C. Thus, by Theorem 2,

D(A, B, C) = (det C) D(A, B, I)

where I is the s × s identity matrix. By subtracting multiples of the rows of I from the rows of B and using the statement above (5-18), we obtain

D(A, B, I) = D(A, 0, I).

Now D(A, 0, I) is clearly alternating and r-linear as a function of the rows of A. Thus

D(A, 0, I) = (det A) D(I, 0, I).

But D(I, 0, I) = 1, so

D(A, B, C) = (det C) D(A, B, I) = (det C) D(A, 0, I) = (det C)(det A).
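A quick numerical illustration of (5-19), assuming numpy (np.block assembles the block matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
r, s = 3, 2
A = rng.integers(-4, 5, size=(r, r))
B = rng.integers(-4, 5, size=(r, s))
C = rng.integers(-4, 5, size=(s, s))

M = np.block([[A, B],
              [np.zeros((s, r), dtype=int), C]])
# (5-19): det of the block upper-triangular matrix is (det A)(det C)
assert np.isclose(np.linalg.det(M), np.linalg.det(A) * np.linalg.det(C))
```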
By the same sort of argument, or by taking transposes,

(5-20)  det [ A  0 ; B  C ] = (det A)(det C).

EXAMPLE 6. Suppose K is the field of rational numbers and we wish to compute the determinant of the 4 × 4 matrix

A = [ 1  -1   2   3
      2   2   0   2
      4   1  -1  -1
      1   2   3   0 ].

By subtracting suitable multiples of row 1 from rows 2, 3, and 4, we obtain the matrix

[ 1  -1   2    3
  0   4  -4   -4
  0   5  -9  -13
  0   3   1   -3 ]

which we know by (5-18) will have the same determinant as A. If we subtract 5/4 of row 2 from row 3 and then subtract 3/4 of row 2 from row 4, we obtain

B = [ 1  -1   2   3
      0   4  -4  -4
      0   0  -4  -8
      0   0   4   0 ]

and again det B = det A. The block form of B tells us that

det A = det B = |1  -1; 0  4| · |-4  -8; 4  0| = 4(32) = 128.
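The row reduction of Example 6 can be retraced with exact rational arithmetic; a sketch (helper name ours) using the matrix of Example 6:

```python
from fractions import Fraction

A = [[1, -1, 2, 3],
     [2,  2, 0, 2],
     [4,  1, -1, -1],
     [1,  2, 3, 0]]
A = [[Fraction(x) for x in row] for row in A]

def add_multiple_of_row(M, target, source, c):
    # replace row `target` by row_target + c * row_source; by (5-18) this
    # leaves the determinant unchanged
    M[target] = [t + c * s for t, s in zip(M[target], M[source])]

# the steps carried out in Example 6
for i, c in [(1, -2), (2, -4), (3, -1)]:
    add_multiple_of_row(A, i, 0, Fraction(c))
add_multiple_of_row(A, 2, 1, Fraction(-5, 4))
add_multiple_of_row(A, 3, 1, Fraction(-3, 4))

# block form: det = det of the upper-left 2 x 2 block times det of the lower-right one
upper_left = A[0][0] * A[1][1] - A[0][1] * A[1][0]
lower_right = A[2][2] * A[3][3] - A[2][3] * A[3][2]
print(upper_left * lower_right)   # 128
```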
Now let n > 1 and let A be an n × n matrix over K. In Theorem 1, we showed how to construct a determinant function on n × n matrices, given one on (n − 1) × (n − 1) matrices. Now that we have proved the uniqueness of the determinant function, the formula (5-4) tells us the following. If we fix any column index j,

det A = Σ_{i=1}^n (−1)^{i+j} Aij det A(i|j).

The scalar (−1)^{i+j} det A(i|j) is usually called the i, j cofactor of A or the cofactor of the i, j entry of A. The above formula for det A is then called the expansion of det A by cofactors of the jth column (or sometimes the expansion by minors of the jth column). If we set

Cij = (−1)^{i+j} det A(i|j)

then the above formula says that for each j

det A = Σ_{i=1}^n Aij Cij

where the cofactor Cij is (−1)^{i+j} times the determinant of the (n − 1) × (n − 1) matrix obtained by deleting the ith row and jth column of A.

If j ≠ k, then

Σ_{i=1}^n Aik Cij = 0.

For, replace the jth column of A by its kth column, and call the resulting matrix B. Then B has two equal columns and so det B = 0. Since B(i|j) = A(i|j), we have

0 = det B = Σ_{i=1}^n (−1)^{i+j} Bij det B(i|j)
          = Σ_{i=1}^n (−1)^{i+j} Aik det A(i|j)
          = Σ_{i=1}^n Aik Cij.

These properties of the cofactors can be summarized by

(5-21)  Σ_{i=1}^n Aik Cij = δjk det A.

The n × n matrix adj A, which is the transpose of the matrix of cofactors of A, is called the classical adjoint of A. Thus

(5-22)  (adj A)ij = Cji = (−1)^{i+j} det A(j|i).

The formulas (5-21) can be summarized in the matrix equation

(5-23)  (adj A)A = (det A)I.

We wish to see that A(adj A) = (det A)I also. Since Aᵗ(i|j) = A(j|i)ᵗ, we have

(−1)^{i+j} det Aᵗ(i|j) = (−1)^{j+i} det A(j|i)

which simply says that the i, j cofactor of Aᵗ is the j, i cofactor of A. Thus

(5-24)  adj(Aᵗ) = (adj A)ᵗ.

By applying (5-23) to Aᵗ, we obtain

(adj Aᵗ)Aᵗ = (det Aᵗ)I = (det A)I

and transposing,

A(adj Aᵗ)ᵗ = (det A)I.

Using (5-24), we have what we want:

(5-25)  A(adj A) = (det A)I.
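The cofactor identities (5-21), (5-23) and (5-25) are easy to test with exact integer arithmetic; a small sketch (helper names ours, 0-based indices):

```python
def minor(A, i, j):
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    # cofactor expansion along the first column
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** i * A[i][0] * det(minor(A, i, 0)) for i in range(len(A)))

def adj(A):
    # classical adjoint: (adj A)_{ij} = C_{ji} = (-1)^(i+j) det A(j|i), as in (5-22)
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, j, i)) for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, -1, 2], [0, 3, 1], [4, 0, 2]]
print(det(A))               # -22
print(matmul(adj(A), A))    # (5-23): (det A) I
print(matmul(A, adj(A)))    # (5-25): (det A) I
```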
As for matrices over a field, an n × n matrix A over K is called invertible over K if there is an n × n matrix A⁻¹ with entries in K such that AA⁻¹ = A⁻¹A = I. If such an inverse matrix exists it is unique; for the same argument used in Chapter 1 shows that when BA = AC = I we have B = C. The formulas (5-23) and (5-25) tell us the following about invertibility of matrices over K. If the element det A has a multiplicative inverse in K, then A is invertible and A⁻¹ = (det A)⁻¹ adj A is the unique inverse of A. Conversely, it is easy to see that if A is invertible over K, the element det A is invertible in K. For, if BA = I we have

1 = det I = det(BA) = (det B)(det A).

What we have proved is the following.

Theorem 4. Let A be an n × n matrix over K. Then A is invertible over K if and only if det A is invertible in K. When A is invertible, the unique inverse for A is

A⁻¹ = (det A)⁻¹ adj A.

In particular, an n × n matrix over a field is invertible if and only if its determinant is different from zero.

We should point out that this determinant criterion for invertibility proves that an n × n matrix with either a left or right inverse is invertible. This proof is completely independent of the proof which we gave in Chapter 1 for matrices over a field. We should also like to point out what invertibility means for matrices with polynomial entries. If K is the polynomial ring F[x], the only elements of K which are invertible are the non-zero scalar polynomials. For if f and g are polynomials and fg = 1, we have deg f + deg g = 0, so that deg f = deg g = 0, i.e., f and g are scalar polynomials. So an n × n matrix over the polynomial ring F[x] is invertible over F[x] if and only if its determinant is a non-zero scalar polynomial.

EXAMPLE 7. Let K = R[x], the ring of polynomials over the field of real numbers. Let

A = [ x² + x   x + 1 ; x − 1   1 ],    B = [ x² − 1   x + 2 ; x² − 2x + 3   x ].

Then, by a short computation, det A = x + 1 and det B = −6. Thus A is not invertible over K, whereas B is invertible over K. Note that

adj A = [ 1   −x − 1 ; −x + 1   x² + x ],    adj B = [ x   −x − 2 ; −x² + 2x − 3   x² − 1 ]

and (adj A)A = (x + 1)I, (adj B)B = −6I. Of course,

B⁻¹ = −(1/6) adj B = −(1/6) [ x   −x − 2 ; −x² + 2x − 3   x² − 1 ].
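Example 7 can be reproduced with a computer algebra system; a sketch assuming sympy, whose adjugate method computes the classical adjoint:

```python
import sympy as sp

x = sp.Symbol('x')
A = sp.Matrix([[x**2 + x, x + 1],
               [x - 1,    1    ]])
B = sp.Matrix([[x**2 - 1,         x + 2],
               [x**2 - 2*x + 3,   x    ]])

print(sp.factor(A.det()))   # x + 1: not a unit in R[x], so A is not invertible over R[x]
print(B.det())              # -6: a non-zero scalar, so B is invertible over R[x]
# B**-1 = -(1/6) adj B, so the following prints the zero matrix
print((B.inv() + B.adjugate() / 6).applyfunc(sp.simplify))
```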
EXAMPLE 8. Let K be the ring of integers and let

A = [ 1  2 ; 3  4 ].

Then det A = −2 and

adj A = [ 4  −2 ; −3  1 ].

Thus A is not invertible as a matrix over the ring of integers; however, we can also regard A as a matrix over the field of rational numbers. If we do, then A is invertible and

A⁻¹ = −(1/2) [ 4  −2 ; −3  1 ] = [ −2  1 ; 3/2  −1/2 ].

In connection with invertible matrices, we should like to mention one further elementary fact. Similar matrices have the same determinant, that is, if P is invertible over K and B = P⁻¹AP, then det B = det A. This is clear since

det(P⁻¹AP) = (det P⁻¹)(det A)(det P) = det A.

This simple observation makes it possible to define the determinant of a linear operator on a finite-dimensional vector space. If T is a linear operator on V, we define the determinant of T to be the determinant of any n × n matrix which represents T in an ordered basis for V. Since all such matrices are similar, they have the same determinant and our definition makes sense. In this connection, see Exercise 11 of Section 5.3.

We should like now to discuss Cramer's rule for solving systems of linear equations. Suppose A is an n × n matrix over the field F and we wish to solve the system of linear equations AX = Y for some given n-tuple (y1, ..., yn). If AX = Y, then

(adj A)AX = (adj A)Y

and so

(det A)X = (adj A)Y.

Thus

(det A)xj = Σ_{i=1}^n (adj A)ji yi = Σ_{i=1}^n (−1)^{i+j} yi det A(i|j).

This last expression is the determinant of the n × n matrix obtained by replacing the jth column of A by Y. If det A = 0, all this tells us nothing; however, if det A ≠ 0, we have what is known as Cramer's rule. Let A be an n × n matrix over the field F such that det A ≠ 0. If y1, ..., yn are any scalars in F, the unique solution X = A⁻¹Y of the system of equations AX = Y is given by

xj = det Bj / det A,   j = 1, ..., n

where Bj is the n × n matrix obtained from A by replacing the jth column of A by Y.
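Cramer's rule, as stated, translates directly into code; a sketch (names ours) using exact rational arithmetic and the permutation formula (5-15) for det:

```python
from fractions import Fraction
from itertools import permutations

def det(A):
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])
        term = Fraction(-1 if inv % 2 else 1)
        for i in range(n):
            term *= A[i][sigma[i]]
        total += term
    return total

def cramer(A, Y):
    # x_j = det(B_j) / det(A), where B_j is A with its jth column replaced by Y
    d = det(A)
    n = len(A)
    return [det([[Y[i] if k == j else A[i][k] for k in range(n)] for i in range(n)]) / d
            for j in range(n)]

A = [[2, 1], [5, 3]]
Y = [4, 7]
print(cramer(A, Y))   # [Fraction(5, 1), Fraction(-6, 1)], i.e. x = 5, y = -6
```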
In concluding this chapter, we should like to make some comments which serve to place determinants in what we believe to be the proper perspective. From time to time it is necessary to compute specific determinants, and this section has been partially devoted to techniques which will facilitate such work. However, the principal role of determinants in this book is theoretical. There is no disputing the beauty of facts such as Cramer's rule. But Cramer's rule is an inefficient tool for solving systems of linear equations, chiefly because it involves too many computations. So one should concentrate on what Cramer's rule says, rather than on how to compute with it. Indeed, while reflecting on this entire chapter, we hope that the reader will place more emphasis on understanding what the determinant function is and how it behaves than on how to compute determinants of specific matrices.

Exercises

1. Use the classical adjoint formula to compute the inverses of each of the following 3 × 3 real matrices.

2. Use Cramer's rule to solve each of the following systems of linear equations over the field of rational numbers.
(a) x + y + z = 11
    2x − y − z = 0
    3x + 4y + 2z = 0.
(b) 3x − 2y = 7
    3y − 2z = 6
    3z − 2x = −1.

3. An n × n matrix A over a field F is skew-symmetric if Aᵗ = −A. If A is a skew-symmetric n × n matrix with complex entries and n is odd, prove that det A = 0.

4. An n × n matrix A over a field F is called orthogonal if AAᵗ = I. If A is orthogonal, show that det A = ±1. Give an example of an orthogonal matrix for which det A = −1.
Sec. 5.4, , Additional, , Properties of Determinants, , 163, , 5. AAn n X n matrix A over the field of complex numbers is said to be unitary, transpose of A). If A is unitary,, show, if AA* = I (A* denotes the conjugate, that ldet Al = 1., 6. Let T and U be linear operators on the finite dimensional, (a) det (TU) = (det T)(det U);, if and only if det T # 0., (b) T is invertible, 7. Let A be an n X n matrix, A has the block form, , where Ai is an ri X ri matrix., , A1, 0.a., 0, [ :I, , over K, a commutative, , A =, , vector space V. Prove, , 0, .., 0, , Az, .., 0, , ..., , 0, , .. ., , k,, , ring with identity., , Suppose, , Prove, , det A = (det AJ(det, , AZ) . . . (det Ak),, , 8. Let V be the vector space of n X n matrices over the field F. Let B be a fixed, element of V and let TB be the linear operator on V defined by Te(A) = AB - BA., Show that det Tg = 0., 9. Let A be an n X n matrix over a field, A # 0. If r is any positive integer, of A is any r X r matrix obtained by deleting, between 1 and n, an r X r submatrix, rank, of A is the, (n - r) rows and (n - T) columns of A. The determinant, largest positive integer r such that some r X r submatrix, of A has a non-zero, determinant., Prove that the determinant, rank of A is equal to the row rank of, A ( = column rank A)., 10. Let A be an n X n matrix, over the field F. Prove, distinct scalars c in F such that det (cl - A) = 0., , that, , there are at most, , n, , A and B be n X n matrices over the field F. Show that if A is invertible, there are at most n scalars c in F for which the matrix CA + B is not invertible., , 11. Let, , 12. If V is the vector space of n X n matrices over F and B is a fixed n X n matrix, BA and, over F, let LB and Rg be the linear operators on V defined by LB(A), Re(A) = AB. Show that, (a) det LB = (det B)“;, (b) det RE, = (det B)“., 13. Let V be the vector space of all n X n matrices over the field of complex, numbers, and let B be a fixed n X n matrix over C. Define a linear operator 121~, on V by MB(A) = BAB*, where B* = F. Show that, det MB = ldet B12”., Now let H be the set of all Hermitian, matrices in V, A being Hermitian, if, A = A*. Then H is a vector space over the field of real numbers. Show that the, Te defined by T&A) = BAB* is a linear operator on the real vector, function, det TB,, space H, and then show that det Te = ldet B12”. (Hint: In computing, show that V has a basis consisting of Hermitian, matrices and then show that, det TB = det MB.)
164, , Chap. 5, , Determinants, , 14. Let A, B, C, D be commuting n X n matrices over the field F. Show that the, determinant of the 2n X 2n matrix, , A, C, , is det (AD - BC)., , 5.5., , B, D, , [ 1, , Modules, If K is a commutative, ring with identity, a module over K is an algebraic system which behaves like a vector space, with K playing the role, of the scalar field. To be precise, we say that V is a module, over K (or a, K-module), if, 1. there is an addition, (ac, fi) -+ QI + /3 on V, under which V is a, commutative, group;, 2. there is a multiplication, (c, a) + ca of elements a! in V and c in K, such that, (Cl, c(w, , +, , c&-i, , =, , Cla!, , +, , czff, , %), , =, , Cal, , +, , cc%, , =, , Cl(CZ(Y), , +, (ClCJ(Y, , la = (Y., For us, the most important K-modules will be the n-tuple modules Kn., The matrix modules KmX” will also be important., If V is any module, we, speak of linear combinations,, linear dependence and linear independence,, just as we do in a vector space. We must be careful not to apply to V any, vector space results which depend upon division by non-zero scalars, the, one field operation which may be lacking in the ring K. For example, if, dependent, we cannot conclude that some ai is a, w,, . . f , ak are linearly, linear combination of the others. This makes it more difficult to find bases, in modules., A basis, for the module V is a linearly independent, subset which, spans (or generates) the module. This is the same definition which we gave, for vector spaces; and, the important, property of a basis 6~ is that each, element of V can be expressed uniquely as a linear combination, of (some, finite number of) elements of a. If one admits into mathematics the Axiom, of Choice (see Appendix), it can be shown that every vector space has a, basis. The reader is well aware that a basis exists in any vector space, which is spanned by a finite number of vectors. But this is not the case, for modules. Therefore, we need special names for modules which have, bases and for modules which are spanned by finite numbers of elements., Dejinition., , If V has afinite, with, , The K-module V is called a free module, if it has a basis., basis containing n elements, then V is called a free K-module, , n generators.
Sec. 5.5, , Modules, , DeJinition., The module V is finitely, generated, if it contains a finite, subset which spans V. The rank of a finitely generated module is the smallest, integer k such that some k elements span V., , We repeat that a module may be finitely generated without having, a finite basis. If V is a free K-module with n generators, then V is isomorphic to the module K”. If {&, . . . , &} is a basis for V, there is an isomorphism, which sends the vector c& + . . . + cnPn onto the n-tuple, c,), in Kn. It is not immediately, apparent that the same module V, (Cl, . . . ,, could not also be a free module on k generators, with k # n. In other, words, it is not obvious that any two bases for V must contain the same, number of elements. The proof of that fact is an interesting, application, of determinants., 5. Let K be a commutative ring with identity., with n generators, then the rank of V is n., , Theorem, , K-module, , If V is a free, , Proof. We are to prove that V cannot be spanned by less than, n of its elements. Since V is isomorphic to Kn, we must show that, if, m < n, the module KS is not spanned by n-tuples acl, . . . , (Y,. Let A be, the matrix with rows al, . . . , CY,. Suppose that each of the standard basis, vectors cl, . . . , en is a linear combination of Q(~,. . . , 01,. Then there exists, a matrix P in Knxm such that, PA = I, where I is the n X n identity matrix. Let A” be the n X n matrix obtained, by adjoining n - m rows of O’s to the bottom of A, and let P be any n X n, matrix which has the columns of P as its first n columns. Then, PA = I., Therefore det A” # 0. But, since m < n, at least one row of A has all 0, entries. This contradiction, shows that CQ, . . . , CY, do not span Kn. 1, It is interesting to note that Theorem 5 establishes the uniqueness, of the dimension of a (finite-dimensional), vector space. The proof, based, upon the existence of the determinant, function, is quite different from the, proof we gave in Chapter 2. From Theorem 5 we know that ‘free module, of rank n’ is the same as ‘free module with n generators.’, V* consists of all linear, If V is a module over K, the dual module, functions f from V into K. If V is a free module of rank n, then V* is also, a free module of rank n. The proof is just the same as for vector spaces., If {Pl, . . * , Pn) is an ordered basis for V, there is an associated dual basis, ifi, * . . , fn} for the module V*. The function fi assigns to each OLin V its, ith coordinate relative to {PI, . . . , prr} :, , CY= fi(&% + . * . + fn(Q)>Pn., If f is a linear function, , on V, then, , f, , =, , f (Pllfl, , +, , . . ., , +, , f@nlfn-, , 165
166, , 5.4., , Determinants, , Multilinear, , Chap. 5, , Functions, , The purpose of this section is to place our discussion of determinants, in what we believe to be the proper perspective. We shall treat alternating, multilinear, forms on modules. These forms are the natural generalization, of determinants, as we presented them. The reader who has not read (or, does not wish to read) the brief account of modules in Section 5.5 can still, study this section profitably, by consistently reading ‘vector space over F, of dimension n’ for ‘free module over K of rank n.’, Let K be a commutative, ring with identity and let V be a module, over K. If r is a positive integer, a function L from v’ = V X V X . . . X V, into K is called multilinear, if L(al, . . . , ar,) is linear as a function of, each cy;when the other q’s are held fixed, that is, if for each i, L(cf1, . . . ) cai + pi, . . . ) a,) = CL(crl, . . . ) (Yi) . . . ) a, +, L( al, . . . , Pi, . . . , 4., A multilinear, function on v’ will also be called an r-linear, form, on V, or a multilinear, form, of degree, r on V. Such functions are sometimes, on V. The collection of all multilinear, functions on, called r-tensors, V will be denoted by W(V)., If L and M are in My(V), then the sum, L + M:, (L + M>(a, . . . , a,) = L(w, . . . , a,) + Mb,, . . . , a,), is also multilinear;, and, if c is an element of K, the product CL,:, (CL)(arl, . . . ) a,) = cL(q, , . . . , a,), , is multilinear., Therefore, W(V), is a K-module-a, submodule, of the, module of all functions from V into K., If T = 1 we have M’(V) = V*, the dual module of linear functions, on V. Linear functions can also be used to construct examples of multilinear forms of higher order. If fi, . . . , fi are linear functions on V, define, L(crl, . . . ) 4, Clearly, , L is an r-linear, , =, , fl(4fd4, , . . * fr(%>., , form on V., , EXAMPLE 9. If V is a module, a a-linear form on V is usually called a, form, on V. Let A be an n X n matrix with entries in K. Then, , bilinear, , L(X,, defines a bilinear, , Y) = Y%X, , form L on the module KnX1. Similarly,, M(a, /3) = aApt, , defines a bilinear, , form M on Kn., , EXAMPLE 10. The determinant, , function, , associates with, , each n X n
Multilinear, , Sec. 5.6, matrix A an element, the rows of A:, , det A in K. If det A is considered, , Functions, , as a function, , of, , det A = D(cQ, . . . , CT,), then D is an n-linear, , form on K”., , 11. It is easy to obtain an algebraic expression for the, general r-linear form on the module Kn. If (or, . . . , cr, are vectors in V, and A is the r X n matrix with rows CQ, . . . , (Ye,then for any function L, in MT(K”),, EXAMPLE, , L((Y1,, , . . . ), , a,) =L, , i, , Aljei,az,...,a,, , j=l, , = jil, , =, , >, , AljL(q,, , 5, j,k=l, , e, . . . , 4, , AljAd(Ejp, , Ek,, , 013), , * . . 2 a+)-, , If we replace 01~~. . . , (Yein turn by their expressions as linear combinations, of the standard basis vectors, and if we write A(i, j) for Aii, we obtain the, following :, (5-26), , A(lpjl), *** A(r,jr)L(~,, - - . Ej,)., 5, il, . . . , jr==1, In (5-26), there is one term for each r-tuple J = (jl, . . . ,j,) of positive, integers between 1 and n. There are n+’such r-tuples. Thus L is completely, determined by (5-26) and the particular values:, L(a,, , . . . , ffr) =, , CJ, , = L(Cjjl, . . . ,, , Ej,), , assigned to the nr elements (cjl, . . . , q,). It is also easy to see that if for, each r-tuple J we choose an element CJ of K then, (5-27), , L(arl, f . . , a,) = T A(4 jd . . - A(r, jrh, , defines an r-linear, , form on Kn., , Suppose that L is a multilinear, function on V and M is a multilinear, function on v”. We define a function L @ M on V*f8 by, (5-28), , (L 0 M)(al,, , . . . , a,+*) = L(N,, , . . . , (Y,)M((Y,+I, . . . , c++s)., , If we think of Vr+s as V X TJ8, then for CYin V and 0 in Trs, UJ 0 M)(a,, , P> = L(4M(P)., , 167
168, , Determinants, It is clear, called the, mutative., the tensor, and MB., , Chap. 5, that, , L @ M is multilinear, on Vr+s. The function L @ M is, product, of L and M. The tensor product is not comIn fact, M @ L # L @ M unless L = 0 or M = 0; however,, product does relate nicely to the module operations in 1Mr, , tensor, , Lemma., Let L, Ll be r-linear, on V and let c be an element of IX., , forms on V, let M, M1 be s-linear forms, , (a) (CL + L1) @ M = c(L @ M) + L1 0 M;, (b) L 0 (CM + M,) = c(L @ M) + L @ Ml., Proof., , Exercise., , Tensoring is associative, i.e., if L, M and N are (respectively), and t-linear forms on V, then, (L@M)@N, , r-, s-, , = L@(M@N)., , This is immediate from the fact that the multiplication, in K is associative., functions, on Vrl, . . . , VTk,, Therefore,, if L,, L,, . . . , Lk are multilinear, then the tensor product, is unambiguously, defined as a multilinear, function on P, where r =, r1 + a.. + rk. We mentioned a particular case of this earlier. If ji, . .‘, j 7are linear functions on V, then the tensor product, L =f1Q, , -9. Qfr, , is given by, L(cILl, . . . ) cur), , =, , fl(cQ), , . . . jr(w)., , Theorem, 6. Let K be a commutative ring with identity. If V is a free, IX-module of rank n then M*(V) is a free K-module of rank nr; in fact, if, {fl, . . . , fn} is a basis for the dual module V*, the n’ tensor products, 1 < jl 5 n, . . . , 1 5 j, 4 n, fh 0 ’ . ’ 0 fh, , form a basis for M*(V)., Proof. Let {ji, . . . , jn} be an ordered basis for V* which is dual, to the basis {pl, . . . , Pn} for V. For each vector (Y in V we have, a, , =, , fl(cr)Pl, , +, , * * *, , +, , f&)/L, , We now make the calculation carried out in Example 11. If L is an r-linear, form on V and al, . . . , (Y, are elements of V, then by (5-26), L(cul, . . . ) (Y,) =, jl,, , 2, . . . , j,, , fjl(W>, , . . . fj,(4Wjl,, , * * - 7 Pj3., , In other words,, (5-29), , L =, , 22 L@ju . * * 7&l.fh 0 * ’ ’ Oh+, II ,..., j,
Sec. 5.6, , Muttikaear, , Functions, , This shows that the nr tensor products, (5-30), , EJ = film, , ..+ Ofj,, , given by the r-tuples J = (j,, . . . , j,) span the module m(V)., We see, that the various r-forms EJ are independent,, as follows. Suppose that for, each J we have an element CJ in K and we form the multilinear, function, L = 2 c&E/;., .J, , (5-31), , Notice that if I = (il, . . . , &), then, , Therefore, , we see from (5-31) that, , (5-32), In particular,, Dejbition., , is alternating, , CI = Jm,,, , . . . , Pd., , if L = 0 then CI = 0 for each r-tuple, , I., , 1, , Let L be an r-linear form on a K-module V. We say that L, if L(QI~, . . . , a,) = 0 whenever ai = crj with i # j., , If L is an alternating, , multilinear, , function, , L(a1, . . . ) cti) . . . , “j, . . . ) a,) = -L(cQ,, , on V”, then, . . . , “j, . . . ) ai, . . . ) 4., , In other words, if we transpose two of the vectors (with different indices), in the r-tuple (0~1,. . . , a,) the associated value of L changes sign. Since, every permutation, u is a product of transpositions,, we see that L(Q, . . . ,, G,) = (sgn u> Uw, . . . , a,>., We denote by h’(V) the collection of all alternating, r-linear forms, on V. It should be clear that h’(V) is a submodule of MT(V)., EXAMPLE 12. Earlier in this chapter, we showed that on the module, Kn there is precisely one alternating, n-linear form D with the property, that D(Q, . . . , E%)= 1. We also showed in Theorem 2 that if L is any, form in hn(Kn) then, L = L(E,, . . . , e,)D., In other words, h”(K”) is a free K-module of rank 1. We also developed, an explicit formula (5-15) for D. In terms of the notation we are now, using, that formula may be written, (5-33), , D = Z (sgn g) fUl 0 . . 3 Ofon., 0, , wherefi, . . . , fn are the standard coordinate functions on K” and the sum, is extended over the n! different permutations, u of the set (1, . . . , n}., If we write the determinant, of a matrix A as, detA, , =Z(sgna)A(al,l), c, , . ..A(un.n), , 169
170, , Determinants, , Chap. 5, , then we obtain, (5-34), , a different, I%,., , expression for D:, , . ., a,) = ? (sgn u> fi(hJ, , . - . fn(4, , = Z (sgn u) L(ad,, LT, , . . . , ch,), , There is a general method for associating an alternating, form with, a multilinear, form. If L is an r-linear form on a module V and if g is a, permutation, of (1, . . . , r}, we obtain another r-linear function L, by, defining, Lda1, . . . ) 4 = L(&l, . . . ) am)., If L happens to be alternating,, then L, = (sgn a)L. Now, for each L in, Mr( V) we define a function r,L in Mr( V) by, r,L, , (5-35), , = 2 (sgn a)L,, (I, , that is,, (5-36), , (GL)(ar*,, , . . . , ~11,)= 2 (sgn U) L(ad,, (r, , Lemma., ?T~is a linear, is in Ar(V) then ?r,L = r!L., , Proof., (&)(a,,,, , transformation, , Let T be any permutation, . . . , ~4, , from, , . . . , a,,)., Mr(V), , . . . , a,,,), , = (kg 7) Z (sgn 7u) L(M,, D, , (a&)(%,, , f . . , G), , If I,, , of (1, . . . , r}. Then, , = Z (sgn u> L(a,d,, , As u runs (once) over all permutations, , into AI(V)., , . . . , a,,,)., , of (1, . . . , r}, so does W. Therefore,, , = (sgn T) (?r,L) (03, . . . , 0~~)., , Thus T,L is an alternating, form., If L is in h’(V), then L(Q,, 1, each a; hence T,L = r!L., , . . . , abT) = (sgn u) L(q,, , In (5-33) we showed that the determinant, , function, , . . . , a,) for, , D in An(Kn), , is, , where fi, . . . , fn are the standard coordinate functions on Kn. There is, an important remark we should make in connection with the last lemma., If K is a field of characteristic, zero, such that r! is invertible, in K, then, T maps Mr( V) onto h’(V). In fact, in that case it is more natural from one, point of view to use the map ?rl = (l/r!)?r rather than ?r, because al is a, projection of W(V) onto Ar(V), i.e., a linear map of MT(V) onto k(V), such that rl(L) = L if and only if L is in Ar(V).
Multilinear, , Sec. 5.6, , Functions, , 7. Let K be a commutative ring with identity and let V be, of rank n. If r > n, then Ar(V) = {O}. If 1 5 r 5 n, then, n, ~ ., h’(V) is a free K-module of rank, 0, Proof. Let {pr, . . . , &) be an ordered basis for V with dual, basis {fi, . . . , fn}. If L is in Mr(V), we have, Theorem, , a free K-module, , L = F L@j~,- * * , Pi3fjl CD* ’ * 0 fj,, , (5-37), , where the sum extends over all r-tuples, then, tween 1 and n. If L is alternating,, , J = (j,, . . . , j,) of integers, , be-, , L@ju * * * 7 Pi,) = 0, whenever two of the subscripts ji are the same. If r > n, then in each, r-tuple J some integer must be repeated. Thus M(V) = {0} if r > n., Now suppose 1 5 r 5 n. If L is in k(V), the sum in (5-37) need be, extended only over the r-tuples J for which jl, . . . , j, are distinct, because, all other terms are 0. Each r-tuple of distinct integers between 1 and n is, a permutation, of an r-tuple J = (jl, . . . , j,) such that j, < . . 1 < j,., This special type of r-tuple is called an r-shuffle, of (1, . . . , n}. There are, n, 0r, , n!, = r!(n - r)!, , such shuffles., Suppose we fix an r-shuffle J. Let LJ be the sum of all the terms in, (5-37) corresponding to permutations, of the shuffle J. If (Tis a permutation, of (1, . . . , r}, then, L(Pjm * . . ) Pj.3 = (w, , u> L@ji, . . - 7 &I*, , Thus, (5-38), , LJ = L(Pju . . . 9Pj,lDJ, , where, (5-39), , DJ = 2 (en u>fj.1 0 * * * 0 fj.., L7, =, , rv(fjl, , 0, , * ’ * 0, , fj,), , *, , We see from (5-39) that each D.T is alternating, (5-40), , L, , =, , ahuze8 J L@ju, , and that, , . - - , PADJ, , : forms DJ constitute a, 0, basis for k(V). We have seen that they span AT(V). It is easy to see that, they are independent,, as follows. If I = (ii, . . . , i7) and J = (jl, . . . , j,), are shuffles, then, , for every L in h’(V)., , The assertion, , (5-41), , DOi,,, , ...,, , is that the
Chap. 5, , Determinants, Suppose we have a scalar CJ for each shuffle and we define, L = 2 c.,DJ., J, , From, , (5-40) and (5-41) we obtain, cr = -mi,,, , In particular,, , - - * , Pi,>., , if L = 0 then cr = 0 for each shuffle I., , 1, , Corollary., If V is a free K-module of rank n, then An(V) is a free, K-module of rank 1. If T is a linear operator on V, there is a unique element, c in K such that, L(Tar,, for every alternating, Proof., , . . . , Tan) = CL&, , . . . , cr,), , n-linear form L on V., , If L is in An(v),, , then clearly, , LT(al, . . . ) a,) = L(Tq, , . . . ) Ta,), , defines an alternating n-linear form LT. Let M be a generator for the rank, 1 module A”(V). Each L in An(V) is uniquely expressible as L = aM for, some a in K. In particular, MT = CM for a certain c. For L = aM we have, LT, , = (Cd&, = UMT, = a(cM), = c(aM), =cL., 1, , Of course, the element c in the last corollary, of T. From (5-39) for the case r = n (when, J = (1,. . . , n)) we see that the determinant, the matrix which represents T in any ordered, see why. The representing matrix has i, j entry, , is called the determinant, there is only one shuffle, of T is the determinant, of, basis {PI, . . . , p,}. Let us, , Aij = fj(TPi), so that, DJ(%,, , . . .,, , Tp,) = Z (sgn u) A(1, cl) . . . A(n, an), c, = det A., , On the other hand,, DdTPl,, , . . . , T/3,) = (det T) D&I,, = det T., , . . . , PJ, , The point of these remarks is that via Theorem 7 and its corollary we, obtain a definition of the determinant, of a linear operator which does not, presume knowledge of determinants, of matrices. Determinants, of matrices, can be defined in terms of determinants, of operators instead of the other, way around.
Page 181 :
We want to say a bit more about the special alternating r-linear forms D_J which we associated with a basis {f_1, …, f_n} for V* in (5-39). It is important to understand that D_J(α_1, …, α_r) is the determinant of a certain r × r matrix. If

α_i = A_{i1}β_1 + ⋯ + A_{in}β_n,   1 ≤ i ≤ r,

and J is the r-shuffle (j_1, …, j_r), then

(5-42)  D_J(α_1, …, α_r) = Σ_σ (sgn σ) A(1, j_{σ1}) ⋯ A(r, j_{σr})

                         = det [ A(1, j_1) ⋯ A(1, j_r) ]
                               [     ⋮            ⋮     ]
                               [ A(r, j_1) ⋯ A(r, j_r) ].

Thus D_J(α_1, …, α_r) is the determinant of the r × r matrix formed from columns j_1, …, j_r of the r × n matrix which has (the coordinate n-tuples of) α_1, …, α_r as its rows. Another notation which is sometimes used for this determinant is

(5-43)  D_J(α_1, …, α_r) = ∂(α_1, …, α_r) / ∂(β_{j1}, …, β_{jr}).

In this notation, the proof of Theorem 7 shows that every alternating r-linear form L can be expressed relative to a basis {β_1, …, β_n} by the equation

(5-44)  L(α_1, …, α_r) = Σ_{j1<⋯<jr} [∂(α_1, …, α_r)/∂(β_{j1}, …, β_{jr})] L(β_{j1}, …, β_{jr}).

5.7. The Grassman Ring

Many of the important properties of determinants and alternating multilinear forms are best described in terms of a multiplication operation on forms, called the exterior product. If L and M are, respectively, alternating r- and s-linear forms on the module V, we have an associated product of L and M, the tensor product L ⊗ M. This is not an alternating form unless L = 0 or M = 0; however, we have a natural way of projecting it into Λ^{r+s}(V). It appears that

(5-45)  L · M = π_{r+s}(L ⊗ M)

should be the 'natural' multiplication of alternating forms. But, is it?

Let us take a specific example. Suppose that V is the module K^n and f_1, …, f_n are the standard coordinate functions on K^n. If i ≠ j, then

f_i · f_j = π_2(f_i ⊗ f_j)

is the (determinant) function

D_{ij} = f_i ⊗ f_j − f_j ⊗ f_i

given by (5-39).
Page 182 :
Now suppose k is an index different from i and j. Then

D_{ij} · f_k = π_3[(f_i ⊗ f_j − f_j ⊗ f_i) ⊗ f_k] = π_3(f_i ⊗ f_j ⊗ f_k) − π_3(f_j ⊗ f_i ⊗ f_k).

The proof of the lemma following equation (5-36) shows that for any r-linear form L and any permutation σ of {1, …, r},

π_r(L_σ) = (sgn σ) π_r(L).

Hence

D_{ij} · f_k = 2π_3(f_i ⊗ f_j ⊗ f_k).

By a similar computation, f_i · D_{jk} = 2π_3(f_i ⊗ f_j ⊗ f_k). Thus we have

(f_i · f_j) · f_k = f_i · (f_j · f_k)

and all of this looks very promising. But there is a catch. Despite the computation that we have just completed, the putative multiplication in (5-45) is not associative. In fact, if l is an index different from i, j, k, then one can calculate that

D_{ij} · D_{kl} = 4π_4(f_i ⊗ f_j ⊗ f_k ⊗ f_l)

and that

(D_{ij} · f_k) · f_l = 12π_4(f_i ⊗ f_j ⊗ f_k ⊗ f_l).

Thus, in general,

(f_i · f_j) · (f_k · f_l) ≠ [(f_i · f_j) · f_k] · f_l

and we see that our first attempt to find a multiplication has produced a non-associative operation.

The reader should not be surprised if he finds it rather tedious to give a direct verification of the two equations showing non-associativity. This is typical of the subject, and it is also typical that there is a general fact which considerably simplifies the work.
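The two coefficients above can be confirmed by brute force. The following sketch (our own, with our own helper names tensor, pi, and dot; it assumes Python with numpy) represents multilinear forms on R^4 as Python functions, implements the tensor product and the projection π_r, and evaluates the relevant forms on the standard basis vectors.

```python
# Brute-force check of the products computed above, on V = R^4 with the
# standard coordinate functions f1, ..., f4.
from itertools import permutations
import numpy as np

def sgn(p):
    # sign of a permutation given as a tuple of 0-based indices
    s, p = 1, list(p)
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            s = -s
    return s

def tensor(L, r, M, s):
    # (L tensor M)(a_1, ..., a_{r+s}) = L(a_1, ..., a_r) * M(a_{r+1}, ..., a_{r+s})
    return lambda *a: L(*a[:r]) * M(*a[r:])

def pi(L, r):
    # pi_r L = sum over sigma of (sgn sigma) L_sigma
    return lambda *a: sum(sgn(p) * L(*(a[i] for i in p)) for p in permutations(range(r)))

f = [lambda v, i=i: v[i] for i in range(4)]   # the coordinate functions f1, ..., f4
e = list(np.eye(4))                           # the standard basis vectors

dot = lambda L, r, M, s: pi(tensor(L, r, M, s), r + s)   # the product (5-45)

D12 = dot(f[0], 1, f[1], 1)                   # f1 . f2
D34 = dot(f[2], 1, f[3], 1)                   # f3 . f4
lhs = dot(D12, 2, D34, 2)                     # (f1 . f2) . (f3 . f4)
rhs = dot(dot(D12, 2, f[2], 1), 3, f[3], 1)   # ((f1 . f2) . f3) . f4
ref = pi(tensor(tensor(tensor(f[0], 1, f[1], 1), 2, f[2], 1), 3, f[3], 1), 4)

print(lhs(*e), rhs(*e), ref(*e))              # prints 4.0 12.0 1.0
```

Since ref is π_4(f_1 ⊗ f_2 ⊗ f_3 ⊗ f_4), the printed values confirm the coefficients 4 and 12 and hence the failure of associativity.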
Page 183 :
Suppose L is an r-linear form and that M is an s-linear form on the module V. Then

π_{r+s}[(π_r L) ⊗ (π_s M)] = π_{r+s}[ Σ_{σ,τ} (sgn σ)(sgn τ) L_σ ⊗ M_τ ]
                           = Σ_{σ,τ} (sgn σ)(sgn τ) π_{r+s}(L_σ ⊗ M_τ)

where σ varies over the symmetric group S_r of all permutations of {1, …, r}, and τ varies over S_s. Each pair σ, τ defines an element (σ, τ) of S_{r+s} which permutes the first r elements of {1, …, r + s} according to σ and the last s elements according to τ. It is clear that

sgn (σ, τ) = (sgn σ)(sgn τ)

and that

(L ⊗ M)_{(σ,τ)} = L_σ ⊗ M_τ.

Therefore

π_{r+s}[(π_r L) ⊗ (π_s M)] = Σ_{σ,τ} sgn (σ, τ) π_{r+s}[(L ⊗ M)_{(σ,τ)}].

Now we have already observed that

sgn (σ, τ) π_{r+s}[(L ⊗ M)_{(σ,τ)}] = π_{r+s}(L ⊗ M).

Thus, it follows that

(5-46)  π_{r+s}[(π_r L) ⊗ (π_s M)] = r!s! π_{r+s}(L ⊗ M).

This formula simplifies a number of computations. For example, suppose we have an r-shuffle I = (i_1, …, i_r) and an s-shuffle J = (j_1, …, j_s). To make things simple, assume, in addition, that

i_1 < ⋯ < i_r < j_1 < ⋯ < j_s.

Then we have the associated determinant functions

D_I = π_r(E_I),   D_J = π_s(E_J)

where E_I and E_J are given by (5-30). Using (5-46), we see immediately that

D_I · D_J = π_{r+s}[π_r(E_I) ⊗ π_s(E_J)] = r!s! π_{r+s}(E_I ⊗ E_J).

Since E_I ⊗ E_J = E_{I∪J}, it follows that

D_I · D_J = r!s! D_{I∪J}.

This suggests that the lack of associativity for the multiplication (5-45) results from the fact that D_I · D_J ≠ D_{I∪J}. After all, the product of D_I and D_J ought to be D_{I∪J}. To repair the situation, we should define a new product, the exterior product (or wedge product) of an alternating r-linear form L and an alternating s-linear form M, by

(5-47)  L ∧ M = (1/(r!s!)) π_{r+s}(L ⊗ M).

We then have

D_I ∧ D_J = D_{I∪J}

for the determinant functions on K^n, and, if there is any justice at all, we must have found the proper multiplication of alternating multilinear forms. Unfortunately, (5-47) fails to make sense for the most general case under consideration, since we may not be able to divide by r!s! in the ring K. If K is a field of characteristic zero, then (5-47) is meaningful, and one can proceed quite rapidly to show that the wedge product is associative.

Theorem 8. Let K be a field of characteristic zero and V a vector space over K. Then the exterior product is an associative operation on the alternating multilinear forms on V. In other words, if L, M, and N are alternating multilinear forms on V of degrees r, s, and t, respectively, then

(L ∧ M) ∧ N = L ∧ (M ∧ N).
Page 184 :
Proof. It follows from (5-47) that cd(L ∧ M) = cL ∧ dM for any scalars c and d. Hence

r!s!t![(L ∧ M) ∧ N] = r!s!(L ∧ M) ∧ t!N

and since π_t(N) = t!N, it results that

r!s!t![(L ∧ M) ∧ N] = π_{r+s}(L ⊗ M) ∧ π_t(N)
                    = (1/((r + s)!t!)) π_{r+s+t}[π_{r+s}(L ⊗ M) ⊗ π_t(N)].

From (5-46) we now see that

r!s!t![(L ∧ M) ∧ N] = π_{r+s+t}(L ⊗ M ⊗ N).

By a similar computation

r!s!t![L ∧ (M ∧ N)] = π_{r+s+t}(L ⊗ M ⊗ N)

and therefore (L ∧ M) ∧ N = L ∧ (M ∧ N). ∎

Now we return to the general case, in which it is only assumed that K is a commutative ring with identity. Our first problem is to replace (5-47) by an equivalent definition which works in general. If L and M are alternating multilinear forms of degrees r and s respectively, we shall construct a canonical alternating multilinear form L ∧ M of degree r + s such that

r!s!(L ∧ M) = π_{r+s}(L ⊗ M).

Let us recall how we define π_{r+s}(L ⊗ M). With each permutation σ of {1, …, r + s} we associate the multilinear function

(5-48)  (sgn σ)(L ⊗ M)_σ

where

(L ⊗ M)_σ(α_1, …, α_{r+s}) = (L ⊗ M)(α_{σ1}, …, α_{σ(r+s)}),

and we sum the functions (5-48) over all permutations σ. There are (r + s)! permutations; however, since L and M are alternating, many of the functions (5-48) are the same. In fact there are at most

(r + s)! / (r!s!)

distinct functions (5-48). Let us see why. Let S_{r+s} be the set of permutations of {1, …, r + s}, i.e., let S_{r+s} be the symmetric group of degree r + s. As in the proof of (5-46), we distinguish the subset G that consists of the permutations σ which permute the sets {1, …, r} and {r + 1, …, r + s} within themselves. In other words, σ is in G if 1 ≤ σi ≤ r for each i between 1 and r. (It necessarily follows that r + 1 ≤ σj ≤ r + s for each j between r + 1 and r + s.) Now G is a subgroup of S_{r+s}, that is, if σ and τ are in G then στ^{-1} is in G. Evidently, G has r!s! members.
Page 185 :
We have a map ψ, defined on S_{r+s} by

ψ(σ) = (sgn σ)(L ⊗ M)_σ.

Since L and M are alternating,

ψ(γ) = L ⊗ M

for every γ in G. Therefore, since (N_σ)_τ = N_{τσ} for any (r + s)-linear form N on V, we have

ψ(τγ) = ψ(τ),   τ in S_{r+s}, γ in G.

This says that the map ψ is constant on each (left) coset τG of the subgroup G. If τ_1 and τ_2 are in S_{r+s}, the cosets τ_1G and τ_2G are either identical or disjoint, according as τ_2^{-1}τ_1 is in G or is not in G. Each coset contains r!s! elements; hence, there are

(r + s)! / (r!s!)

distinct cosets. If S_{r+s}/G denotes the collection of cosets, then ψ defines a function on S_{r+s}/G, i.e., by what we have shown, there is a function ψ̃ on that set so that

ψ(τ) = ψ̃(τG)

for every τ in S_{r+s}. If H is a left coset of G, then ψ̃(H) = ψ(τ) for every τ in H.

We now define the exterior product of the alternating multilinear forms L and M of degrees r and s by setting

(5-49)  L ∧ M = Σ_H ψ̃(H)

where H varies over S_{r+s}/G. Another way to phrase the definition of L ∧ M is the following. Let S be any set of permutations of {1, …, r + s} which contains exactly one element from each left coset of G. Then

(5-50)  L ∧ M = Σ_{σ in S} (sgn σ)(L ⊗ M)_σ

where σ varies over S. Clearly

r!s! (L ∧ M) = π_{r+s}(L ⊗ M),

so that the new definition is equivalent to (5-47) when K is a field of characteristic zero.

Theorem 9. Let K be a commutative ring with identity and let V be a module over K. Then the exterior product is an associative operation on the alternating multilinear forms on V. In other words, if L, M, and N are alternating multilinear forms on V of degrees r, s, and t, respectively, then

(L ∧ M) ∧ N = L ∧ (M ∧ N).
Page 186 :
Proof. Although the proof of Theorem 8 does not apply here, it does suggest how to handle the general case. Let G(r, s, t) be the subgroup of S_{r+s+t} that consists of the permutations which permute the sets

{1, …, r}, {r + 1, …, r + s}, {r + s + 1, …, r + s + t}

within themselves. Then (sgn μ)(L ⊗ M ⊗ N)_μ is the same multilinear function for all μ in a given left coset of G(r, s, t). Choose one element from each left coset of G(r, s, t), and let E be the sum of the corresponding terms (sgn μ)(L ⊗ M ⊗ N)_μ. Then E is independent of the way in which the representatives μ are chosen, and

r!s!t! E = π_{r+s+t}(L ⊗ M ⊗ N).

We shall show that (L ∧ M) ∧ N and L ∧ (M ∧ N) are both equal to E.

Let G(r + s, t) be the subgroup of S_{r+s+t} that permutes the sets

{1, …, r + s}, {r + s + 1, …, r + s + t}

within themselves. Let T be any set of permutations of {1, …, r + s + t} which contains exactly one element from each left coset of G(r + s, t). By (5-50),

(L ∧ M) ∧ N = Σ_τ (sgn τ)[(L ∧ M) ⊗ N]_τ

where the sum is extended over the permutations τ in T. Now let G(r, s) be the subgroup of S_{r+s} that permutes the sets

{1, …, r}, {r + 1, …, r + s}

within themselves. Let S be any set of permutations of {1, …, r + s} which contains exactly one element from each left coset of G(r, s). From (5-50) and what we have shown above, it follows that

(L ∧ M) ∧ N = Σ_{σ,τ} (sgn σ)(sgn τ)[(L ⊗ M)_σ ⊗ N]_τ

where the sum is extended over all pairs σ, τ in S × T. If we agree to identify each σ in S_{r+s} with the element of S_{r+s+t} which agrees with σ on {1, …, r + s} and is the identity on {r + s + 1, …, r + s + t}, then we may write

(L ∧ M) ∧ N = Σ_{σ,τ} sgn (στ) [(L ⊗ M ⊗ N)_σ]_τ.

But

[(L ⊗ M ⊗ N)_σ]_τ = (L ⊗ M ⊗ N)_{τσ}.

Therefore

(L ∧ M) ∧ N = Σ_{σ,τ} sgn (τσ)(L ⊗ M ⊗ N)_{τσ}.

Now suppose we have

τ_1σ_1 = τ_2σ_2γ

with σ_i in S, τ_i in T, and γ in G(r, s, t). Then τ_2^{-1}τ_1 = σ_2γσ_1^{-1}, and since
Page 187 :
σ_2γσ_1^{-1} lies in G(r + s, t), it follows that τ_1 and τ_2 are in the same left coset of G(r + s, t). Therefore, τ_1 = τ_2, and σ_1 = σ_2γ. But this implies that σ_1 and σ_2 (regarded as elements of S_{r+s}) lie in the same coset of G(r, s); hence σ_1 = σ_2. Therefore, the products τσ corresponding to the

[(r + s + t)! / ((r + s)!t!)] · [(r + s)! / (r!s!)]

pairs (τ, σ) in T × S are all distinct and lie in distinct cosets of G(r, s, t). Since there are exactly

(r + s + t)! / (r!s!t!)

left cosets of G(r, s, t) in S_{r+s+t}, it follows that (L ∧ M) ∧ N = E. By an analogous argument, L ∧ (M ∧ N) = E as well. ∎

EXAMPLE 13. The exterior product is closely related to certain formulas for evaluating determinants known as the Laplace expansions. Let K be a commutative ring with identity and n a positive integer. Suppose that 1 ≤ r < n, and let L be the alternating r-linear form on K^n defined by

L(α_1, …, α_r) = det [ A_{11} ⋯ A_{1r} ]
                     [   ⋮          ⋮   ]
                     [ A_{r1} ⋯ A_{rr} ]

where, as in (5-42), A_{ij} is the j-th standard coordinate of α_i. If s = n − r and M is the alternating s-linear form

M(α_1, …, α_s) = det [ A_{1,r+1} ⋯ A_{1n} ]
                     [     ⋮           ⋮   ]
                     [ A_{s,r+1} ⋯ A_{sn} ]

then L ∧ M = D, the determinant function on K^n. This is immediate from the fact that L ∧ M is an alternating n-linear form and (as can be seen)

(L ∧ M)(ε_1, …, ε_n) = 1.

If we now describe L ∧ M in the correct way, we obtain one Laplace expansion for the determinant of an n × n matrix over K.

In the permutation group S_n, let G be the subgroup which permutes the sets {1, …, r} and {r + 1, …, n} within themselves. Each left coset of G contains precisely one permutation σ such that σ1 < σ2 < ⋯ < σr and σ(r + 1) < ⋯ < σn. The sign of this permutation is given by

sgn σ = (−1)^{σ1+⋯+σr+r(r+1)/2}.

The wedge product L ∧ M is given by

(L ∧ M)(α_1, …, α_n) = Σ (sgn σ) L(α_{σ1}, …, α_{σr}) M(α_{σ(r+1)}, …, α_{σn})

where the sum is taken over a collection of σ's, one from each coset of G. Therefore,
Page 188 :
(L ∧ M)(α_1, …, α_n) = Σ_{j_1<⋯<j_r} e_J L(α_{j_1}, …, α_{j_r}) M(α_{k_1}, …, α_{k_s})

where

e_J = (−1)^{j_1+⋯+j_r+r(r+1)/2},   k_i = σ(r + i).

In other words,

det A = Σ_{j_1<⋯<j_r} e_J det [ A_{j_1 1} ⋯ A_{j_1 r} ]  det [ A_{k_1,r+1} ⋯ A_{k_1 n} ]
                              [     ⋮            ⋮    ]      [      ⋮             ⋮    ]
                              [ A_{j_r 1} ⋯ A_{j_r r} ]      [ A_{k_s,r+1} ⋯ A_{k_s n} ].

This is one Laplace expansion. Others may be obtained by replacing the sets {1, …, r} and {r + 1, …, n} by two different complementary sets of indices.

If V is a K-module, we may put the various form modules Λ^r(V) together and use the exterior product to define a ring. For simplicity, we shall do this only for the case of a free K-module of rank n. The modules Λ^r(V) are then trivial for r > n. We define

Λ(V) = Λ^0(V) ⊕ Λ^1(V) ⊕ ⋯ ⊕ Λ^n(V).

This is an external direct sum, something which we have not discussed previously. The elements of Λ(V) are the (n + 1)-tuples (L_0, …, L_n) with L_r in Λ^r(V). Addition and multiplication by elements of K are defined as one would expect for (n + 1)-tuples. Incidentally, Λ^0(V) = K. If we identify Λ^r(V) with the (n + 1)-tuples (0, …, 0, L, 0, …, 0) where L is in Λ^r(V), then Λ^r(V) is a submodule of Λ(V) and the direct sum decomposition

Λ(V) = Λ^0(V) ⊕ ⋯ ⊕ Λ^n(V)

holds in the usual sense. Since Λ^r(V) is a free K-module of rank (n choose r), we see that Λ(V) is a free K-module and

rank Λ(V) = Σ_{r=0}^{n} (n choose r) = 2^n.

The exterior product defines a multiplication in Λ(V): Use the exterior product on forms and extend it linearly to Λ(V). It distributes over the addition of Λ(V) and gives Λ(V) the structure of a ring. This ring is the Grassman ring over V*. It is not a commutative ring, e.g., if L, M are respectively in Λ^r and Λ^s, then

L ∧ M = (−1)^{rs} M ∧ L.

But, the Grassman ring is important in several parts of mathematics.
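The Laplace expansion just obtained is easy to test numerically. The sketch below (ours, assuming Python with sympy) expands the determinant of a 4 × 4 matrix by complementary 2 × 2 minors drawn from columns 1, 2 and columns 3, 4, using the sign e_J above.

```python
# Checking the Laplace expansion for n = 4, r = 2 on an arbitrary 4 x 4 matrix.
from itertools import combinations
from sympy import Matrix

n, r = 4, 2
A = Matrix(4, 4, [3, 1, 4, 1,
                  5, 9, 2, 6,
                  5, 3, 5, 8,
                  9, 7, 9, 3])

total = 0
for J in combinations(range(n), r):            # rows j1 < j2 (0-indexed here)
    K = [k for k in range(n) if k not in J]    # the complementary rows k1 < ... < ks
    e_J = (-1) ** (sum(j + 1 for j in J) + r * (r + 1) // 2)
    minor_L = A.extract(list(J), list(range(r))).det()   # rows J, columns 1, ..., r
    minor_M = A.extract(K, list(range(r, n))).det()      # rows K, columns r+1, ..., n
    total += e_J * minor_L * minor_M

print(total == A.det())                        # True
```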
Page 189 :
6. Elementary Canonical Forms

6.1. Introduction

We have mentioned earlier that our principal aim is to study linear transformations on finite-dimensional vector spaces. By this time, we have seen many specific examples of linear transformations, and we have proved a few theorems about the general linear transformation. In the finite-dimensional case we have utilized ordered bases to represent such transformations by matrices, and this representation adds to our insight into their behavior. We have explored the vector space L(V, W), consisting of the linear transformations from one space into another, and we have explored the linear algebra L(V, V), consisting of the linear transformations of a space into itself.

In the next two chapters, we shall be preoccupied with linear operators. Our program is to select a single linear operator T on a finite-dimensional vector space V and to 'take it apart to see what makes it tick.' At this early stage, it is easiest to express our goal in matrix language: Given the linear operator T, find an ordered basis for V in which the matrix of T assumes an especially simple form.

Here is an illustration of what we have in mind. Perhaps the simplest matrices to work with, beyond the scalar multiples of the identity, are the diagonal matrices:

(6-1)   [ c_1  0   ⋯   0  ]
        [  0  c_2  ⋯   0  ]
        [  ⋮         ⋱  ⋮  ]
        [  0   0   ⋯  c_n ].
Page 190 :
Let T be a linear operator on an n-dimensional space V. If we could find an ordered basis ℬ = {α_1, …, α_n} for V in which T were represented by a diagonal matrix D (6-1), we would gain considerable information about T. For instance, simple numbers associated with T, such as the rank of T or the determinant of T, could be determined with little more than a glance at the matrix D. We could describe explicitly the range and the null space of T. Since [T]_ℬ = D if and only if

(6-2)  Tα_k = c_kα_k,   k = 1, …, n,

the range would be the subspace spanned by those α_k's for which c_k ≠ 0 and the null space would be spanned by the remaining α_k's. Indeed, it seems fair to say that, if we knew a basis ℬ and a diagonal matrix D such that [T]_ℬ = D, we could answer readily any question about T which might arise.

Can each linear operator T be represented by a diagonal matrix in some ordered basis? If not, for which operators T does such a basis exist? How can we find such a basis if there is one? If no such basis exists, what is the simplest type of matrix by which we can represent T? These are some of the questions which we shall attack in this (and the next) chapter. The form of our questions will become more sophisticated as we learn what some of the difficulties are.

6.2. Characteristic Values

The introductory remarks of the previous section provide us with a starting point for our attempt to analyze the general linear operator T. We take our cue from (6-2), which suggests that we should study vectors which are sent by T into scalar multiples of themselves.

Definition. Let V be a vector space over the field F and let T be a linear operator on V. A characteristic value of T is a scalar c in F such that there is a non-zero vector α in V with Tα = cα. If c is a characteristic value of T, then

(a) any α such that Tα = cα is called a characteristic vector of T associated with the characteristic value c;
(b) the collection of all α such that Tα = cα is called the characteristic space associated with c.

Characteristic values are often called characteristic roots, latent roots, eigenvalues, proper values, or spectral values. In this book we shall use only the name 'characteristic values.'

If T is any linear operator and c is any scalar, the set of vectors α such that Tα = cα is a subspace of V. It is the null space of the linear transformation (T − cI).
Page 191 :
We call c a characteristic value of T if this subspace is different from the zero subspace, i.e., if (T − cI) fails to be 1:1. If the underlying space V is finite-dimensional, (T − cI) fails to be 1:1 precisely when its determinant is different from 0. Let us summarize.

Theorem 1. Let T be a linear operator on a finite-dimensional space V and let c be a scalar. The following are equivalent.

(i) c is a characteristic value of T.
(ii) The operator (T − cI) is singular (not invertible).
(iii) det (T − cI) = 0.

The determinant criterion (iii) is very important, because it tells us where to look for the characteristic values of T. Since det (T − cI) is a polynomial of degree n in the variable c, we will find the characteristic values as the roots of that polynomial. Let us explain carefully.

If ℬ is any ordered basis for V and A = [T]_ℬ, then (T − cI) is invertible if and only if the matrix (A − cI) is invertible. Accordingly, we make the following definition.

Definition. If A is an n × n matrix over the field F, a characteristic value of A in F is a scalar c in F such that the matrix (A − cI) is singular (not invertible).

Since c is a characteristic value of A if and only if det (A − cI) = 0, or equivalently if and only if det (cI − A) = 0, we form the matrix (xI − A) with polynomial entries, and consider the polynomial f = det (xI − A). Clearly the characteristic values of A in F are just the scalars c in F such that f(c) = 0. For this reason f is called the characteristic polynomial of A. It is important to note that f is a monic polynomial which has degree exactly n. This is easily seen from the formula for the determinant of a matrix in terms of its entries.

Lemma. Similar matrices have the same characteristic polynomial.

Proof. If B = P^{-1}AP, then

det (xI − B) = det (xI − P^{-1}AP)
             = det (P^{-1}(xI − A)P)
             = det P^{-1} · det (xI − A) · det P
             = det (xI − A). ∎

This lemma enables us to define sensibly the characteristic polynomial of the operator T as the characteristic polynomial of any n × n matrix which represents T in some ordered basis for V. Just as for matrices, the characteristic values of T will be the roots of the characteristic polynomial for T. In particular, this shows us that T cannot have more than n distinct characteristic values.
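For readers who wish to experiment, the characteristic polynomial and the characteristic values of a small matrix can be computed with a computer algebra system. The fragment below (ours, assuming Python with sympy) does this for the 2 × 2 matrix that appears in Example 1 below, and shows how the answer depends on the field in which roots are sought.

```python
# The characteristic values of A are the roots of f = det(xI - A).
from sympy import Matrix, symbols, factor

x = symbols('x')
A = Matrix([[0, -1],
            [1,  0]])                  # the matrix used in Example 1 below

f = (x * Matrix.eye(2) - A).det()      # the characteristic polynomial
print(factor(f))                       # x**2 + 1: no roots in R
print(A.eigenvals())                   # {-I: 1, I: 1}: the roots over the complex field
```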
Page 192 :
It is important to point out that T may not have any characteristic values.

EXAMPLE 1. Let T be the linear operator on R^2 which is represented in the standard ordered basis by the matrix

A = [ 0  −1 ]
    [ 1   0 ].

The characteristic polynomial for T (or for A) is

det (xI − A) = |  x  1 |
               | −1  x |  = x^2 + 1.

Since this polynomial has no real roots, T has no characteristic values. If U is the linear operator on C^2 which is represented by A in the standard ordered basis, then U has two characteristic values, i and −i. Here we see a subtle point. In discussing the characteristic values of a matrix A, we must be careful to stipulate the field involved. The matrix A above has no characteristic values in R, but has the two characteristic values i and −i in C.

EXAMPLE 2. Let A be the (real) 3 × 3 matrix

A = [ 3  1  −1 ]
    [ 2  2  −1 ]
    [ 2  2   0 ].

Then the characteristic polynomial for A is

| x − 3   −1    1 |
|  −2   x − 2   1 |  = x^3 − 5x^2 + 8x − 4 = (x − 1)(x − 2)^2.
|  −2    −2     x |

Thus the characteristic values of A are 1 and 2.

Suppose that T is the linear operator on R^3 which is represented by A in the standard basis. Let us find the characteristic vectors of T associated with the characteristic values 1 and 2. Now

A − I = [ 2  1  −1 ]
        [ 2  1  −1 ]
        [ 2  2  −1 ].

It is obvious at a glance that A − I has rank equal to 2 (and hence T − I has nullity equal to 1). So the space of characteristic vectors associated with the characteristic value 1 is one-dimensional. The vector α_1 = (1, 0, 2) spans the null space of T − I. Thus Tα = α if and only if α is a scalar multiple of α_1. Now consider

A − 2I = [ 1  1  −1 ]
         [ 2  0  −1 ]
         [ 2  2  −2 ].
Page 193 :
Evidently A − 2I also has rank 2, so that the space of characteristic vectors associated with the characteristic value 2 has dimension 1. Evidently Tα = 2α if and only if α is a scalar multiple of α_2 = (1, 1, 2).

Definition. Let T be a linear operator on the finite-dimensional space V. We say that T is diagonalizable if there is a basis for V each vector of which is a characteristic vector of T.

The reason for the name should be apparent; for, if there is an ordered basis ℬ = {α_1, …, α_n} for V in which each α_i is a characteristic vector of T, then the matrix of T in the ordered basis ℬ is diagonal. If Tα_i = c_iα_i, then

[T]_ℬ = [ c_1  0   ⋯   0  ]
        [  0  c_2  ⋯   0  ]
        [  ⋮         ⋱  ⋮  ]
        [  0   0   ⋯  c_n ].

We certainly do not require that the scalars c_1, …, c_n be distinct; indeed, they may all be the same scalar (when T is a scalar multiple of the identity operator).

One could also define T to be diagonalizable when the characteristic vectors of T span V. This is only superficially different from our definition, since we can select a basis out of any spanning set of vectors.

For Examples 1 and 2 we purposely chose linear operators T on R^n which are not diagonalizable. In Example 1, we have a linear operator on R^2 which is not diagonalizable, because it has no characteristic values. In Example 2, the operator T has characteristic values; in fact, the characteristic polynomial for T factors completely over the real number field: f = (x − 1)(x − 2)^2. Nevertheless, T fails to be diagonalizable. There is only a one-dimensional space of characteristic vectors associated with each of the two characteristic values of T. Hence, we cannot possibly form a basis for R^3 which consists of characteristic vectors of T.

Suppose that T is a diagonalizable linear operator. Let c_1, …, c_k be the distinct characteristic values of T. Then there is an ordered basis ℬ in which T is represented by a diagonal matrix which has for its diagonal entries the scalars c_i, each repeated a certain number of times. If c_i is repeated d_i times, then (we may arrange that) the matrix has the block form

(6-3)  [T]_ℬ = [ c_1I_1    0     ⋯    0    ]
               [   0     c_2I_2  ⋯    0    ]
               [   ⋮                   ⋮    ]
               [   0       0     ⋯  c_kI_k ]
Page 194 :
where I_j is the d_j × d_j identity matrix. From that matrix we see two things. First, the characteristic polynomial for T is the product of (possibly repeated) linear factors:

f = (x − c_1)^{d_1} ⋯ (x − c_k)^{d_k}.

If the scalar field F is algebraically closed, e.g., the field of complex numbers, every polynomial over F can be so factored (see Section 4.5); however, if F is not algebraically closed, we are citing a special property of T when we say that its characteristic polynomial has such a factorization. The second thing we see from (6-3) is that d_i, the number of times which c_i is repeated as root of f, is equal to the dimension of the space of characteristic vectors associated with the characteristic value c_i. That is because the nullity of a diagonal matrix is equal to the number of zeros which it has on its main diagonal, and the matrix [T − c_iI]_ℬ has d_i zeros on its main diagonal. This relation between the dimension of the characteristic space and the multiplicity of the characteristic value as a root of f does not seem exciting at first; however, it will provide us with a simpler way of determining whether a given operator is diagonalizable.

Lemma. Suppose that Tα = cα. If f is any polynomial, then f(T)α = f(c)α.

Proof. Exercise.

Lemma. Let T be a linear operator on the finite-dimensional space V. Let c_1, …, c_k be the distinct characteristic values of T and let W_i be the space of characteristic vectors associated with the characteristic value c_i. If W = W_1 + ⋯ + W_k, then

dim W = dim W_1 + ⋯ + dim W_k.

In fact, if ℬ_i is an ordered basis for W_i, then ℬ = (ℬ_1, …, ℬ_k) is an ordered basis for W.

Proof. The space W = W_1 + ⋯ + W_k is the subspace spanned by all of the characteristic vectors of T. Usually when one forms the sum W of subspaces W_i, one expects that dim W < dim W_1 + ⋯ + dim W_k because of linear relations which may exist between vectors in the various spaces. This lemma states that the characteristic spaces associated with different characteristic values are independent of one another.

Suppose that (for each i) we have a vector β_i in W_i, and assume that β_1 + ⋯ + β_k = 0. We shall show that β_i = 0 for each i. Let f be any polynomial. Since Tβ_i = c_iβ_i, the preceding lemma tells us that

0 = f(T)0 = f(T)β_1 + ⋯ + f(T)β_k
          = f(c_1)β_1 + ⋯ + f(c_k)β_k.

Choose polynomials f_1, …, f_k such that

f_i(c_j) = δ_{ij} = 1 if i = j, 0 if i ≠ j.
Page 195 :
Then

0 = f_i(T)0 = Σ_j f_i(c_j)β_j = β_i.

Now, let ℬ_i be an ordered basis for W_i, and let ℬ be the sequence ℬ = (ℬ_1, …, ℬ_k). Then ℬ spans the subspace W = W_1 + ⋯ + W_k. Also, ℬ is a linearly independent sequence of vectors, for the following reason. Any linear relation between the vectors in ℬ will have the form β_1 + ⋯ + β_k = 0, where β_i is some linear combination of the vectors in ℬ_i. From what we just did, we know that β_i = 0 for each i. Since each ℬ_i is linearly independent, we see that we have only the trivial linear relation between the vectors in ℬ. ∎

Theorem 2. Let T be a linear operator on a finite-dimensional space V. Let c_1, …, c_k be the distinct characteristic values of T and let W_i be the null space of (T − c_iI). The following are equivalent.

(i) T is diagonalizable.
(ii) The characteristic polynomial for T is

f = (x − c_1)^{d_1} ⋯ (x − c_k)^{d_k}

and dim W_i = d_i, i = 1, …, k.
(iii) dim W_1 + ⋯ + dim W_k = dim V.

Proof. We have observed that (i) implies (ii). If the characteristic polynomial f is the product of linear factors, as in (ii), then d_1 + ⋯ + d_k = dim V. For, the sum of the d_i's is the degree of the characteristic polynomial, and that degree is dim V. Therefore (ii) implies (iii). Suppose (iii) holds. By the lemma, we must have V = W_1 + ⋯ + W_k, i.e., the characteristic vectors of T span V. ∎

The matrix analogue of Theorem 2 may be formulated as follows. Let A be an n × n matrix with entries in a field F, and let c_1, …, c_k be the distinct characteristic values of A in F. For each i, let W_i be the space of column matrices X (with entries in F) such that

(A − c_iI)X = 0,

and let ℬ_i be an ordered basis for W_i. The bases ℬ_1, …, ℬ_k collectively string together to form the sequence of columns of a matrix P:

P = [P_1, P_2, …] = (ℬ_1, …, ℬ_k).

The matrix A is similar over F to a diagonal matrix if and only if P is a square matrix. When P is square, P is invertible, and P^{-1}AP is diagonal.
Page 196 :
EXAMPLE 3. Let T be the linear operator on R^3 which is represented in the standard ordered basis by the matrix

A = [  5  −6  −6 ]
    [ −1   4   2 ]
    [  3  −6  −4 ].

Let us indicate how one might compute the characteristic polynomial, using various row and column operations:

det (xI − A) = | x − 5    6      6   |
               |   1    x − 4   −2   |
               |  −3      6    x + 4 |

             = | x − 5    0      6   |      (column 2 minus column 3)
               |   1    x − 2   −2   |
               |  −3    2 − x  x + 4 |

             = (x − 2) | x − 5   0    6   |
                       |   1     1   −2   |
                       |  −3    −1  x + 4 |

             = (x − 2) | x − 5   0    6   |  (row 3 plus row 2)
                       |   1     1   −2   |
                       |  −2     0  x + 2 |

             = (x − 2)[(x − 5)(x + 2) + 12]
             = (x − 2)(x^2 − 3x + 2)
             = (x − 1)(x − 2)^2.

We know that A − I is singular and obviously rank (A − I) ≥ 2. Therefore rank (A − I) = 2. It is evident that rank (A − 2I) = 1.

Let W_1, W_2 be the spaces of characteristic vectors associated with the characteristic values 1, 2. We know that dim W_1 = 1 and dim W_2 = 2. By Theorem 2, T is diagonalizable. It is easy to exhibit a basis for R^3 in which T is represented by a diagonal matrix. The null space of (T − I) is spanned by the vector α_1 = (3, −1, 3), and so {α_1} is a basis for W_1. The null space of T − 2I (i.e., the space W_2) consists of the vectors (x_1, x_2, x_3) with x_1 = 2x_2 + 2x_3. Thus, one example of a basis for W_2 is

α_2 = (2, 1, 0),
α_3 = (2, 0, 1).

If ℬ = {α_1, α_2, α_3}, then [T]_ℬ is the diagonal matrix

D = [ 1  0  0 ]
    [ 0  2  0 ]
    [ 0  0  2 ].

The fact that T is diagonalizable means that the original matrix A is similar (over R) to the diagonal matrix D. The matrix P which enables us to change coordinates from the basis ℬ to the standard basis is (of course) the matrix which has the transposes of α_1, α_2, α_3 as its column vectors:

P = [  3  2  2 ]
    [ −1  1  0 ]
    [  3  0  1 ].

Furthermore, AP = PD, so that

P^{-1}AP = D.
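The computations of Example 3 can also be checked mechanically. The fragment below (ours, assuming Python with sympy) reproduces the characteristic polynomial, the characteristic spaces, and the relation P^{-1}AP = D.

```python
# Verifying Example 3 with sympy.
from sympy import Matrix, symbols, factor

x = symbols('x')
A = Matrix([[ 5, -6, -6],
            [-1,  4,  2],
            [ 3, -6, -4]])

print(factor(A.charpoly(x).as_expr()))   # (x - 1)*(x - 2)**2, up to the order of the factors
print(A.eigenvects())                    # value 1 with one basis vector, value 2 with two

P = Matrix([[ 3, 2, 2],
            [-1, 1, 0],
            [ 3, 0, 1]])                 # columns are the coordinate vectors of a1, a2, a3
print(P.inv() * A * P)                   # Matrix([[1, 0, 0], [0, 2, 0], [0, 0, 2]])
```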
Page 197 :
Exercises

1. In each of the following cases, let T be the linear operator on R^2 which is represented by the matrix A in the standard ordered basis for R^2, and let U be the linear operator on C^2 represented by A in the standard ordered basis. Find the characteristic polynomial for T and that for U, find the characteristic values of each operator, and for each such characteristic value c find a basis for the corresponding space of characteristic vectors.

2. Let V be an n-dimensional vector space over F. What is the characteristic polynomial of the identity operator on V? What is the characteristic polynomial for the zero operator?

3. Let A be an n × n triangular matrix over the field F. Prove that the characteristic values of A are the diagonal entries of A, i.e., the scalars A_{ii}.

4. Let T be the linear operator on R^3 which is represented in the standard ordered basis by the matrix

[  −9  4  4 ]
[  −8  3  4 ]
[ −16  8  7 ].

Prove that T is diagonalizable by exhibiting a basis for R^3, each vector of which is a characteristic vector of T.

5. Let A = [ ⋯ ]. Is A similar over the field R to a diagonal matrix? Is A similar over the field C to a diagonal matrix?
Page 198 :
6. Let T be the linear operator on R^4 which is represented in the standard ordered basis by the matrix

[ 0  0  0  0 ]
[ a  0  0  0 ]
[ 0  b  0  0 ]
[ 0  0  c  0 ].

Under what conditions on a, b, and c is T diagonalizable?

7. Let T be a linear operator on the n-dimensional vector space V, and suppose that T has n distinct characteristic values. Prove that T is diagonalizable.

8. Let A and B be n × n matrices over the field F. Prove that if (I − AB) is invertible, then I − BA is invertible and

(I − BA)^{-1} = I + B(I − AB)^{-1}A.

9. Use the result of Exercise 8 to prove that, if A and B are n × n matrices over the field F, then AB and BA have precisely the same characteristic values in F.

10. Suppose that A is a 2 × 2 matrix with real entries which is symmetric (A^t = A). Prove that A is similar over R to a diagonal matrix.

11. Let N be a 2 × 2 complex matrix such that N^2 = 0. Prove that either N = 0 or N is similar over C to

[ 0  0 ]
[ 1  0 ].

12. Use the result of Exercise 11 to prove the following: If A is a 2 × 2 matrix with complex entries, then A is similar over C to a matrix of one of the two types

[ a  0 ]      [ a  0 ]
[ 0  b ],     [ 1  a ].

13. Let V be the vector space of all functions from R into R which are continuous, i.e., the space of continuous real-valued functions on the real line. Let T be the linear operator on V defined by

(Tf)(x) = ∫_0^x f(t) dt.

Prove that T has no characteristic values.

14. Let A be an n × n diagonal matrix with characteristic polynomial

(x − c_1)^{d_1} ⋯ (x − c_k)^{d_k},

where c_1, …, c_k are distinct. Let V be the space of n × n matrices B such that AB = BA. Prove that the dimension of V is d_1^2 + ⋯ + d_k^2.

15. Let V be the space of n × n matrices over F. Let A be a fixed n × n matrix over F. Let T be the linear operator 'left multiplication by A' on V. Is it true that A and T have the same characteristic values?

6.3. Annihilating Polynomials
Page 199 :
In attempting to analyze a linear operator T, one of the most useful things to know is the class of polynomials which annihilate T. Specifically, suppose T is a linear operator on V, a vector space over the field F. If p is a polynomial over F, then p(T) is again a linear operator on V. If q is another polynomial over F, then

(p + q)(T) = p(T) + q(T)
(pq)(T) = p(T)q(T).

Therefore, the collection of polynomials p which annihilate T, in the sense that

p(T) = 0,

is an ideal in the polynomial algebra F[x]. It may be the zero ideal, i.e., it may be that T is not annihilated by any non-zero polynomial. But, that cannot happen if the space V is finite-dimensional.

Suppose T is a linear operator on the n-dimensional space V. Look at the first (n^2 + 1) powers of T:

I, T, T^2, …, T^{n^2}.

This is a sequence of n^2 + 1 operators in L(V, V), the space of linear operators on V. The space L(V, V) has dimension n^2. Therefore, that sequence of n^2 + 1 operators must be linearly dependent, i.e., we have

c_0I + c_1T + ⋯ + c_{n^2}T^{n^2} = 0

for some scalars c_i, not all zero. So, the ideal of polynomials which annihilate T contains a non-zero polynomial of degree n^2 or less.

According to Theorem 5 of Chapter 4, every polynomial ideal consists of all multiples of some fixed monic polynomial, the generator of the ideal. Thus, there corresponds to the operator T a monic polynomial p with this property: If f is a polynomial over F, then f(T) = 0 if and only if f = pg, where g is some polynomial over F.

Definition. Let T be a linear operator on a finite-dimensional vector space V over the field F. The minimal polynomial for T is the (unique) monic generator of the ideal of polynomials over F which annihilate T.

The name 'minimal polynomial' stems from the fact that the generator of a polynomial ideal is characterized by being the monic polynomial of minimum degree in the ideal. That means that the minimal polynomial p for the linear operator T is uniquely determined by these three properties:

(1) p is a monic polynomial over the scalar field F.
(2) p(T) = 0.
(3) No polynomial over F which annihilates T has smaller degree than p has.

If A is an n × n matrix over F, we define the minimal polynomial for A in an analogous way, as the unique monic generator of the ideal of all polynomials over F which annihilate A.
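The definition can be turned directly into a computation: the minimal polynomial for a matrix is obtained from the first linear dependence among I, A, A^2, …. The sketch below (ours, assuming Python with sympy; the function name minimal_polynomial is our own) does exactly this, and applies it to the matrix of Example 2.

```python
# Brute-force minimal polynomial: find the first linear dependence among
# I, A, A^2, ... and normalize it to be monic.
from sympy import Matrix, symbols, Poly

x = symbols('x')

def minimal_polynomial(A):
    n = A.shape[0]
    powers = [Matrix.eye(n)]
    for k in range(1, n + 1):
        powers.append(powers[-1] * A)
        M = Matrix.hstack(*[p.reshape(n * n, 1) for p in powers])  # columns are vec(A^j)
        null = M.nullspace()
        if null:
            coeffs = null[0] / null[0][k]   # the A^k coefficient cannot be zero here
            return Poly(sum(coeffs[j] * x**j for j in range(k + 1)), x)

A = Matrix([[3, 1, -1],
            [2, 2, -1],
            [2, 2,  0]])                    # the matrix of Example 2
print(minimal_polynomial(A).as_expr())      # x**3 - 5*x**2 + 8*x - 4
```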
Page 200 :
If the operator T is represented in some ordered basis by the matrix A, then T and A have the same minimal polynomial. That is because f(T) is represented in the basis by the matrix f(A), so that f(T) = 0 if and only if f(A) = 0.

From the last remark about operators and matrices it follows that similar matrices have the same minimal polynomial. That fact is also clear from the definitions because

f(P^{-1}AP) = P^{-1}f(A)P

for every polynomial f.

There is another basic remark which we should make about minimal polynomials of matrices. Suppose that A is an n × n matrix with entries in the field F. Suppose that F_1 is a field which contains F as a subfield. (For example, A might be a matrix with rational entries, while F_1 is the field of real numbers. Or, A might be a matrix with real entries, while F_1 is the field of complex numbers.) We may regard A either as an n × n matrix over F or as an n × n matrix over F_1. On the surface, it might appear that we obtain two different minimal polynomials for A. Fortunately that is not the case; and we must see why. What is the definition of the minimal polynomial for A, regarded as an n × n matrix over the field F? We consider all monic polynomials with coefficients in F which annihilate A, and we choose the one of least degree. If f is a monic polynomial over F:

(6-4)  f = x^k + Σ_{j=0}^{k−1} a_jx^j

then f(A) = 0 merely says that we have a linear relation between the powers of A:

(6-5)  A^k + a_{k−1}A^{k−1} + ⋯ + a_1A + a_0I = 0.

The degree of the minimal polynomial is the least positive integer k such that there is a linear relation of the form (6-5) between the powers I, A, …, A^k. Furthermore, by the uniqueness of the minimal polynomial, there is for that k one and only one relation of the form (6-5); i.e., once the minimal k is determined, there are unique scalars a_0, …, a_{k−1} in F such that (6-5) holds. They are the coefficients of the minimal polynomial.

Now (for each k) we have in (6-5) a system of n^2 linear equations for the 'unknowns' a_0, …, a_{k−1}. Since the entries of A lie in F, the coefficients of the system of equations (6-5) are in F. Therefore, if the system has a solution with a_0, …, a_{k−1} in F_1 it has a solution with a_0, …, a_{k−1} in F. (See the end of Section 1.4.) It should now be clear that the two minimal polynomials are the same.

What do we know thus far about the minimal polynomial for a linear operator on an n-dimensional space? Only that its degree does not exceed n^2. That turns out to be a rather poor estimate, since the degree cannot exceed n. We shall prove shortly that the operator is annihilated by its characteristic polynomial. First, let us observe a more elementary fact.
Page 201 :
Theorem 3. Let T be a linear operator on an n-dimensional vector space V [or, let A be an n × n matrix]. The characteristic and minimal polynomials for T [for A] have the same roots, except for multiplicities.

Proof. Let p be the minimal polynomial for T. Let c be a scalar. What we want to show is that p(c) = 0 if and only if c is a characteristic value of T.

First, suppose p(c) = 0. Then

p = (x − c)q

where q is a polynomial. Since deg q < deg p, the definition of the minimal polynomial p tells us that q(T) ≠ 0. Choose a vector β such that q(T)β ≠ 0. Let α = q(T)β. Then

0 = p(T)β = (T − cI)q(T)β = (T − cI)α

and thus, c is a characteristic value of T.

Now, suppose that c is a characteristic value of T, say Tα = cα with α ≠ 0. As we noted in a previous lemma,

p(T)α = p(c)α.

Since p(T) = 0 and α ≠ 0, we have p(c) = 0. ∎

Let T be a diagonalizable linear operator and let c_1, …, c_k be the distinct characteristic values of T. Then it is easy to see that the minimal polynomial for T is the polynomial

p = (x − c_1) ⋯ (x − c_k).

If α is a characteristic vector, then one of the operators T − c_1I, …, T − c_kI sends α into 0. Therefore

(T − c_1I) ⋯ (T − c_kI)α = 0

for every characteristic vector α. There is a basis for the underlying space which consists of characteristic vectors of T; hence

p(T) = (T − c_1I) ⋯ (T − c_kI) = 0.

What we have concluded is this. If T is a diagonalizable linear operator, then the minimal polynomial for T is a product of distinct linear factors. As we shall soon see, that property characterizes diagonalizable operators.

EXAMPLE 4. Let's try to find the minimal polynomials for the operators in Examples 1, 2, and 3. We shall discuss them in reverse order. The operator in Example 3 was found to be diagonalizable, with characteristic polynomial

f = (x − 1)(x − 2)^2.
Page 202 :
From the preceding paragraph, we know that the minimal polynomial for T is

p = (x − 1)(x − 2).

The reader might find it reassuring to verify directly that

(A − I)(A − 2I) = 0.

In Example 2, the operator T also had the characteristic polynomial f = (x − 1)(x − 2)^2. But, this T is not diagonalizable, so we don't know that the minimal polynomial is (x − 1)(x − 2). What do we know about the minimal polynomial in this case? From Theorem 3 we know that its roots are 1 and 2, with some multiplicities allowed. Thus we search for p among polynomials of the form (x − 1)^k(x − 2)^l, k ≥ 1, l ≥ 1. Try (x − 1)(x − 2):

(A − I)(A − 2I) = [ 2  1  −1 ] [ 1  1  −1 ]   [ 2  0  −1 ]
                  [ 2  1  −1 ] [ 2  0  −1 ] = [ 2  0  −1 ].
                  [ 2  2  −1 ] [ 2  2  −2 ]   [ 4  0  −2 ]

Thus, the minimal polynomial has degree at least 3. So, next we should try either (x − 1)^2(x − 2) or (x − 1)(x − 2)^2. The second, being the characteristic polynomial, would seem a less random choice. One can readily compute that (A − I)(A − 2I)^2 = 0. Thus the minimal polynomial for T is its characteristic polynomial.

In Example 1 we discussed the linear operator T on R^2 which is represented in the standard basis by the matrix

A = [ 0  −1 ]
    [ 1   0 ].

The characteristic polynomial is x^2 + 1, which has no real roots. To determine the minimal polynomial, forget about T and concentrate on A. As a complex 2 × 2 matrix, A has the characteristic values i and −i. Both roots must appear in the minimal polynomial. Thus the minimal polynomial is divisible by x^2 + 1. It is trivial to verify that A^2 + I = 0. So the minimal polynomial is x^2 + 1.
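All of the verifications suggested in Example 4 are one-line computations in a computer algebra system; the fragment below (ours, assuming Python with sympy) carries them out.

```python
# Verifying the computations of Example 4.
from sympy import Matrix, eye

A1 = Matrix([[0, -1], [1, 0]])                        # the matrix of Example 1
A2 = Matrix([[3, 1, -1], [2, 2, -1], [2, 2, 0]])      # the matrix of Example 2
A3 = Matrix([[5, -6, -6], [-1, 4, 2], [3, -6, -4]])   # the matrix of Example 3

print((A3 - eye(3)) * (A3 - 2 * eye(3)))              # zero matrix: p = (x - 1)(x - 2)
print((A2 - eye(3)) * (A2 - 2 * eye(3)))              # not zero
print((A2 - eye(3)) * (A2 - 2 * eye(3)) ** 2)         # zero matrix: p = (x - 1)(x - 2)^2
print(A1 ** 2 + eye(2))                               # zero matrix: p = x^2 + 1
```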
Page 203 :
Theorem 4 (Cayley-Hamilton). Let T be a linear operator on a finite-dimensional vector space V. If f is the characteristic polynomial for T, then f(T) = 0; in other words, the minimal polynomial divides the characteristic polynomial for T.

Proof. Later on we shall give two proofs of this result independent of the one to be given here. The present proof, although short, may be difficult to understand. Aside from brevity, it has the virtue of providing an illuminating and far from trivial application of the general theory of determinants developed in Chapter 5.

Let K be the commutative ring with identity consisting of all polynomials in T. Of course, K is actually a commutative algebra with identity over the scalar field. Choose an ordered basis {α_1, …, α_n} for V, and let A be the matrix which represents T in the given basis. Then

Tα_i = Σ_{j=1}^{n} A_{ji}α_j,   1 ≤ i ≤ n.

These equations may be written in the equivalent form

Σ_{j=1}^{n} (δ_{ij}T − A_{ji}I)α_j = 0,   1 ≤ i ≤ n.

Let B denote the element of K^{n×n} with entries

B_{ij} = δ_{ij}T − A_{ji}I.

When n = 2,

B = [ T − A_{11}I    −A_{21}I   ]
    [   −A_{12}I    T − A_{22}I ]

and

det B = (T − A_{11}I)(T − A_{22}I) − A_{12}A_{21}I
      = T^2 − (A_{11} + A_{22})T + (A_{11}A_{22} − A_{12}A_{21})I
      = f(T)

where f is the characteristic polynomial:

f = x^2 − (trace A)x + det A.

For the case n > 2, it is also clear that

det B = f(T)

since f is the determinant of the matrix xI − A whose entries are the polynomials

(xI − A)_{ij} = δ_{ij}x − A_{ji}.

We wish to show that f(T) = 0. In order that f(T) be the zero operator, it is necessary and sufficient that (det B)α_k = 0 for k = 1, …, n. By the definition of B, the vectors α_1, …, α_n satisfy the equations

(6-6)  Σ_{j=1}^{n} B_{ij}α_j = 0,   1 ≤ i ≤ n.

When n = 2, it is suggestive to write (6-6) in the form

[ T − A_{11}I    −A_{21}I   ] [ α_1 ]   [ 0 ]
[   −A_{12}I    T − A_{22}I ] [ α_2 ] = [ 0 ].

In this case, the classical adjoint, adj B, is the matrix

B̃ = [ T − A_{22}I    A_{21}I   ]
    [   A_{12}I     T − A_{11}I ]
Page 204 :
and

B̃B = [ det B    0   ]
     [   0    det B ].

Hence, we have

(det B) [ α_1 ]  =  (B̃B) [ α_1 ]  =  B̃ ( B [ α_1 ] )  =  [ 0 ]
        [ α_2 ]           [ α_2 ]         [ α_2 ]         [ 0 ].

In the general case, let B̃ = adj B. Then by (6-6)

Σ_{j=1}^{n} B̃_{ki}B_{ij}α_j = 0

for each pair k, i, and summing on i, we have

0 = Σ_{i=1}^{n} Σ_{j=1}^{n} B̃_{ki}B_{ij}α_j = Σ_{j=1}^{n} ( Σ_{i=1}^{n} B̃_{ki}B_{ij} ) α_j.

Since B̃B = (det B)I, we have Σ_i B̃_{ki}B_{ij} = δ_{kj} det B. Therefore

0 = Σ_{j=1}^{n} δ_{kj}(det B)α_j = (det B)α_k,   1 ≤ k ≤ n. ∎

The Cayley-Hamilton theorem is useful to us at this point primarily because it narrows down the search for the minimal polynomials of various operators. If we know the matrix A which represents T in some ordered basis, then we can compute the characteristic polynomial f. We know that the minimal polynomial p divides f and that the two polynomials have the same roots. There is no method for computing precisely the roots of a polynomial (unless its degree is small); however, if f factors

(6-7)  f = (x − c_1)^{d_1} ⋯ (x − c_k)^{d_k},   c_1, …, c_k distinct, d_i ≥ 1,

then

(6-8)  p = (x − c_1)^{r_1} ⋯ (x − c_k)^{r_k},   1 ≤ r_j ≤ d_j.

That is all we can say in general. If f is the polynomial (6-7) and has degree n, then for every polynomial p as in (6-8) we can find an n × n matrix which has f as its characteristic polynomial and p as its minimal polynomial. We shall not prove this now. But, we want to emphasize the fact that the knowledge that the characteristic polynomial has the form (6-7) tells us that the minimal polynomial has the form (6-8), and it tells us nothing else about p.
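A reader who wants to see the Cayley-Hamilton theorem in action on a specific matrix can evaluate f(A) directly. The fragment below (ours, assuming Python with sympy) does so for the matrix of Example 2.

```python
# A quick check of the Cayley-Hamilton theorem on the matrix of Example 2.
from sympy import Matrix, symbols, zeros

x = symbols('x')
A = Matrix([[3, 1, -1],
            [2, 2, -1],
            [2, 2,  0]])

f = A.charpoly(x)                       # f = x**3 - 5*x**2 + 8*x - 4
coeffs = f.all_coeffs()                 # [1, -5, 8, -4], highest degree first
fA = sum((c * A**(len(coeffs) - 1 - i) for i, c in enumerate(coeffs)), zeros(3, 3))
print(fA == zeros(3, 3))                # True: f(A) = 0, so the minimal polynomial divides f
```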
Page 205 :
EXAMPLE 5. Let A be the 4 × 4 (rational) matrix

A = [ 0  1  0  1 ]
    [ 1  0  1  0 ]
    [ 0  1  0  1 ]
    [ 1  0  1  0 ].

The powers of A are easy to compute:

A^2 = [ 2  0  2  0 ]        A^3 = [ 0  4  0  4 ]
      [ 0  2  0  2 ]              [ 4  0  4  0 ]
      [ 2  0  2  0 ]              [ 0  4  0  4 ]
      [ 0  2  0  2 ]              [ 4  0  4  0 ].

Thus A^3 = 4A, i.e., if p = x^3 − 4x = x(x + 2)(x − 2), then p(A) = 0. The minimal polynomial for A must divide p. That minimal polynomial is obviously not of degree 1, since that would mean that A was a scalar multiple of the identity. Hence, the candidates for the minimal polynomial are: p, x(x + 2), x(x − 2), x^2 − 4. The three quadratic polynomials can be eliminated because it is obvious at a glance that A^2 ≠ −2A, A^2 ≠ 2A, A^2 ≠ 4I. Therefore p is the minimal polynomial for A. In particular 0, 2, and −2 are the characteristic values of A. One of the factors x, x − 2, x + 2 must be repeated twice in the characteristic polynomial. Evidently, rank (A) = 2. Consequently, there is a two-dimensional space of characteristic vectors associated with the characteristic value 0. From Theorem 2, it should now be clear that the characteristic polynomial is x^2(x^2 − 4), and that A is similar over the field of rational numbers to the matrix

[ 2   0  0  0 ]
[ 0  −2  0  0 ]
[ 0   0  0  0 ]
[ 0   0  0  0 ].
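The claims of Example 5 are easily verified by machine. The fragment below (ours, assuming Python with sympy) checks that A^3 = 4A, that no quadratic candidate annihilates A, that rank (A) = 2, and that the characteristic polynomial is x^2(x^2 − 4).

```python
# Verifying the computations of Example 5.
from sympy import Matrix, symbols, factor

x = symbols('x')
A = Matrix([[0, 1, 0, 1],
            [1, 0, 1, 0],
            [0, 1, 0, 1],
            [1, 0, 1, 0]])

print(A**3 == 4 * A)                                              # True: x**3 - 4x annihilates A
print(A**2 == 2 * A, A**2 == -2 * A, A**2 == 4 * Matrix.eye(4))   # False False False
print(A.rank())                                                   # 2
print(factor(A.charpoly(x).as_expr()))                            # x**2*(x - 2)*(x + 2)
```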
Page 206 :
Exercises

1. Let V be a finite-dimensional vector space. What is the minimal polynomial for the identity operator on V? What is the minimal polynomial for the zero operator?

2. Let a, b, and c be elements of a field F, and let A be the following 3 × 3 matrix over F:

A = [ 0  0  c ]
    [ 1  0  b ]
    [ 0  1  a ].

Prove that the characteristic polynomial for A is x^3 − ax^2 − bx − c and that this is also the minimal polynomial for A.

3. Let A be the 4 × 4 real matrix

A = [  1   1   0  0 ]
    [ −1  −1   0  0 ]
    [ −2  −2   2  1 ]
    [  1   1  −1  0 ].

Show that the characteristic polynomial for A is x^2(x − 1)^2 and that it is also the minimal polynomial.

4. Is the matrix A of Exercise 3 similar over the field of complex numbers to a diagonal matrix?

5. Let V be an n-dimensional vector space and let T be a linear operator on V. Suppose that there exists some positive integer k so that T^k = 0. Prove that T^n = 0.

6. Find a 3 × 3 matrix for which the minimal polynomial is x^2.

7. Let n be a positive integer, and let V be the space of polynomials over R which have degree at most n (throw in the 0-polynomial). Let D be the differentiation operator on V. What is the minimal polynomial for D?

8. Let P be the operator on R^2 which projects each vector onto the x-axis, parallel to the y-axis: P(x, y) = (x, 0). Show that P is linear. What is the minimal polynomial for P?

9. Let A be an n × n matrix with characteristic polynomial

f = (x − c_1)^{d_1} ⋯ (x − c_k)^{d_k}.

Show that

c_1d_1 + ⋯ + c_kd_k = trace (A).

10. Let V be the vector space of n × n matrices over the field F. Let A be a fixed n × n matrix. Let T be the linear operator on V defined by

T(B) = AB.

Show that the minimal polynomial for T is the minimal polynomial for A.

11. Let A and B be n × n matrices over the field F. According to Exercise 9 of Section 6.2, the matrices AB and BA have the same characteristic values. Do they have the same characteristic polynomial? Do they have the same minimal polynomial?

6.4. Invariant Subspaces

In this section, we shall introduce a few concepts which are useful in attempting to analyze a linear operator. We shall use these ideas to obtain characterizations of diagonalizable (and triangulable) operators in terms of their minimal polynomials.
Page 207 :
Definition. Let V be a vector space and T a linear operator on V. If W is a subspace of V, we say that W is invariant under T if for each vector α in W the vector Tα is in W, i.e., if T(W) is contained in W.

EXAMPLE 6. If T is any linear operator on V, then V is invariant under T, as is the zero subspace. The range of T and the null space of T are also invariant under T.

EXAMPLE 7. Let F be a field and let D be the differentiation operator on the space F[x] of polynomials over F. Let n be a positive integer and let W be the subspace of polynomials of degree not greater than n. Then W is invariant under D. This is just another way of saying that D is 'degree decreasing.'

EXAMPLE 8. Here is a very useful generalization of Example 6. Let T be a linear operator on V. Let U be any linear operator on V which commutes with T, i.e., TU = UT. Let W be the range of U and let N be the null space of U. Both W and N are invariant under T. If α is in the range of U, say α = Uβ, then Tα = T(Uβ) = U(Tβ) so that Tα is in the range of U. If α is in N, then U(Tα) = T(Uα) = T(0) = 0; hence, Tα is in N.

A particular type of operator which commutes with T is an operator U = g(T), where g is a polynomial. For instance, we might have U = T − cI, where c is a characteristic value of T. The null space of U is familiar to us. We see that this example includes the (obvious) fact that the space of characteristic vectors of T associated with the characteristic value c is invariant under T.

EXAMPLE 9. Let T be the linear operator on R^2 which is represented in the standard ordered basis by the matrix

A = [ 0  −1 ]
    [ 1   0 ].

Then the only subspaces of R^2 which are invariant under T are R^2 and the zero subspace. Any other invariant subspace would necessarily have dimension 1. But, if W is the subspace spanned by some non-zero vector α, the fact that W is invariant under T means that α is a characteristic vector, but A has no real characteristic values.

When the subspace W is invariant under the operator T, then T induces a linear operator T_W on the space W. The linear operator T_W is defined by T_W(α) = T(α), for α in W, but T_W is quite a different object from T since its domain is W not V.

When V is finite-dimensional, the invariance of W under T has a
Page 208 :
simple matrix interpretation, and perhaps we should mention it at this point. Suppose we choose an ordered basis ℬ = {α_1, …, α_n} for V such that ℬ' = {α_1, …, α_r} is an ordered basis for W (r = dim W). Let A = [T]_ℬ so that

Tα_j = Σ_{i=1}^{n} A_{ij}α_i.

Since W is invariant under T, the vector Tα_j belongs to W for j ≤ r. This means that

(6-9)  Tα_j = Σ_{i=1}^{r} A_{ij}α_i,   j ≤ r.

In other words, A_{ij} = 0 if j ≤ r and i > r.

Schematically, A has the block form

(6-10)  A = [ B  C ]
            [ 0  D ]

where B is an r × r matrix, C is an r × (n − r) matrix, and D is an (n − r) × (n − r) matrix. The reader should note that according to (6-9) the matrix B is precisely the matrix of the induced operator T_W in the ordered basis ℬ'.

Most often, we shall carry out arguments about T and T_W without making use of the block form of the matrix A in (6-10). But we should note how certain relations between T_W and T are apparent from that block form.

Lemma. Let W be an invariant subspace for T. The characteristic polynomial for the restriction operator T_W divides the characteristic polynomial for T. The minimal polynomial for T_W divides the minimal polynomial for T.

Proof. We have

A = [ B  C ]
    [ 0  D ]

where A = [T]_ℬ and B = [T_W]_ℬ'. Because of the block form of the matrix,

det (xI − A) = det (xI − B) det (xI − D).

That proves the statement about characteristic polynomials. Notice that we used I to represent identity matrices of three different sizes.

The kth power of the matrix A has the block form

A^k = [ B^k  C_k ]
      [  0   D^k ]

where C_k is some r × (n − r) matrix. Therefore, any polynomial which annihilates A also annihilates B (and D too). So, the minimal polynomial for B divides the minimal polynomial for A. ∎

EXAMPLE 10. Let T be any linear operator on a finite-dimensional space V. Let W be the subspace spanned by all of the characteristic vectors
Page 209 :
of T. Let c_1, …, c_k be the distinct characteristic values of T. For each i, let W_i be the space of characteristic vectors associated with the characteristic value c_i, and let ℬ_i be an ordered basis for W_i. The lemma before Theorem 2 tells us that ℬ' = (ℬ_1, …, ℬ_k) is an ordered basis for W. In particular,

dim W = dim W_1 + ⋯ + dim W_k.

Let ℬ' = {α_1, …, α_r} so that the first few α's form the basis ℬ_1, the next few ℬ_2, and so on. Then

Tα_i = t_iα_i,   i = 1, …, r,

where (t_1, …, t_r) = (c_1, c_1, …, c_1, …, c_k, c_k, …, c_k) with c_i repeated dim W_i times.

Now W is invariant under T, since for each α in W we have

α = x_1α_1 + ⋯ + x_rα_r
Tα = t_1x_1α_1 + ⋯ + t_rx_rα_r.

Choose any other vectors α_{r+1}, …, α_n in V such that ℬ = {α_1, …, α_n} is a basis for V. The matrix of T relative to ℬ has the block form (6-10), and the matrix of the restriction operator T_W relative to the basis ℬ' is

B = [ t_1  0   ⋯   0  ]
    [  0  t_2  ⋯   0  ]
    [  ⋮         ⋱  ⋮  ]
    [  0   0   ⋯  t_r ].

The characteristic polynomial of B (i.e., of T_W) is

g = (x − c_1)^{e_1} ⋯ (x − c_k)^{e_k}

where e_i = dim W_i. Furthermore, g divides f, the characteristic polynomial for T. Therefore, the multiplicity of c_i as a root of f is at least dim W_i.

All of this should make Theorem 2 transparent. It merely says that T is diagonalizable if and only if r = n, if and only if e_1 + ⋯ + e_k = n. It does not help us too much with the non-diagonalizable case, since we don't know the matrices C and D of (6-10).

Definition. Let W be an invariant subspace for T and let α be a vector in V. The T-conductor of α into W is the set S_T(α; W), which consists of all polynomials g (over the scalar field) such that g(T)α is in W.

Since the operator T will be fixed throughout most discussions, we shall usually drop the subscript T and write S(α; W). The authors usually call that collection of polynomials the 'stuffer' (das einstopfende Ideal). 'Conductor' is the more standard term, preferred by those who envision a less aggressive operator g(T), gently leading the vector α into W. In the special case W = {0} the conductor is called the T-annihilator of α.
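The T-annihilator of a vector can be computed in the same spirit as the minimal polynomial: it is obtained from the first linear dependence among α, Tα, T^2α, …. The sketch below (ours, assuming Python with sympy and a non-zero α; the function name t_annihilator is our own) illustrates this for the matrix of Example 2.

```python
# The monic polynomial g of least degree with g(T)alpha = 0, where T has matrix A.
from sympy import Matrix, symbols, Poly

x = symbols('x')

def t_annihilator(A, alpha):
    # assumes alpha != 0; looks for the first dependence among alpha, A*alpha, A^2*alpha, ...
    vecs = [alpha]
    while True:
        k = len(vecs)
        M = Matrix.hstack(*vecs, A * vecs[-1])
        null = M.nullspace()
        if null:
            coeffs = null[0] / null[0][k]   # the top coefficient cannot be zero here
            return Poly(sum(coeffs[j] * x**j for j in range(k + 1)), x)
        vecs.append(A * vecs[-1])

A = Matrix([[3, 1, -1], [2, 2, -1], [2, 2, 0]])        # the matrix of Example 2
print(t_annihilator(A, Matrix([1, 0, 2])).as_expr())   # x - 1: (1, 0, 2) is a characteristic vector
print(t_annihilator(A, Matrix([1, 0, 0])).as_expr())   # x**3 - 5*x**2 + 8*x - 4, the minimal polynomial itself
```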
Lemma. If W is an invariant subspace for T, then W is invariant under every polynomial in T. Thus, for each α in V, the conductor S(α; W) is an ideal in the polynomial algebra F[x].

Proof. If β is in W, then Tβ is in W. Consequently, T(Tβ) = T²β is in W. By induction, T^kβ is in W for each k. Take linear combinations to see that f(T)β is in W for every polynomial f.
The definition of S(α; W) makes sense if W is any subset of V. If W is a subspace, then S(α; W) is a subspace of F[x], because

    (cf + g)(T) = cf(T) + g(T).

If W is also invariant under T, let g be a polynomial in S(α; W), i.e., let g(T)α be in W. If f is any polynomial, then f(T)[g(T)α] will be in W. Since

    (fg)(T) = f(T)g(T),

fg is in S(α; W). Thus the conductor absorbs multiplication by any polynomial. ▮

The unique monic generator of the ideal S(α; W) is also called the T-conductor of α into W (the T-annihilator in case W = {0}). The T-conductor of α into W is the monic polynomial g of least degree such that g(T)α is in W. A polynomial f is in S(α; W) if and only if g divides f. Note that the conductor S(α; W) always contains the minimal polynomial for T; hence, every T-conductor divides the minimal polynomial for T.
As the first illustration of how to use the conductor S(α; W), we shall characterize triangulable operators. The linear operator T is called triangulable if there is an ordered basis in which T is represented by a triangular matrix.

Lemma. Let V be a finite-dimensional vector space over the field F. Let T be a linear operator on V such that the minimal polynomial for T is a product of linear factors

    p = (x − c1)^{r1} · · · (x − ck)^{rk},    ci in F.

Let W be a proper (W ≠ V) subspace of V which is invariant under T. There exists a vector α in V such that
(a) α is not in W;
(b) (T − cI)α is in W, for some characteristic value c of the operator T.

Proof. What (a) and (b) say is that the T-conductor of α into W is a linear polynomial. Let β be any vector in V which is not in W. Let g be the T-conductor of β into W. Then g divides p, the minimal polynomial for T. Since β is not in W, the polynomial g is not constant. Therefore,

    g = (x − c1)^{e1} · · · (x − ck)^{ek}
where at least one of the integers ei is positive. Choose j so that ej > 0. Then (x − cj) divides g:

    g = (x − cj)h.

By the definition of g, the vector α = h(T)β cannot be in W. But

    (T − cjI)α = (T − cjI)h(T)β = g(T)β

is in W. ▮

Theorem 5. Let V be a finite-dimensional vector space over the field F and let T be a linear operator on V. Then T is triangulable if and only if the minimal polynomial for T is a product of linear polynomials over F.

Proof. Suppose that the minimal polynomial factors

    p = (x − c1)^{r1} · · · (x − ck)^{rk}.

By repeated application of the lemma above, we shall arrive at an ordered basis ℬ = {α1, . . . , αn} in which the matrix representing T is upper-triangular:

(6-11)    [T]ℬ = [ a11  a12  a13  . . .  a1n ]
                 [ 0    a22  a23  . . .  a2n ]
                 [ 0    0    a33  . . .  a3n ]
                 [ .    .    .           .   ]
                 [ 0    0    0    . . .  ann ].

Now (6-11) merely says that

(6-12)    Tαj = a1j α1 + · · · + ajj αj,    1 ≤ j ≤ n,

that is, Tαj is in the subspace spanned by α1, . . . , αj. To find α1, . . . , αn, we start by applying the lemma to the subspace W = {0}, to obtain the vector α1. Then apply the lemma to W1, the space spanned by α1, and we obtain α2. Next apply the lemma to W2, the space spanned by α1 and α2. Continue in that way. One point deserves comment. After α1, . . . , αi have been found, it is the triangular-type relations (6-12) for j = 1, . . . , i which ensure that the subspace spanned by α1, . . . , αi is invariant under T.
If T is triangulable, it is evident that the characteristic polynomial for T has the form

    f = (x − c1)^{d1} · · · (x − ck)^{dk},    ci in F.

Just look at the triangular matrix (6-11). The diagonal entries a11, . . . , ann are the characteristic values, with ci repeated di times. But, if f can be so factored, so can the minimal polynomial p, because it divides f. ▮

Corollary. Let F be an algebraically closed field, e.g., the complex number field. Every n × n matrix over F is similar over F to a triangular matrix.
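Numerically, this corollary is what the Schur decomposition delivers: over C every square matrix is unitarily similar to an upper-triangular one. A brief sketch, assuming NumPy and SciPy are available; the rotation matrix chosen here has no real characteristic values, so the triangulation is only possible over C.

    import numpy as np
    from scipy.linalg import schur

    A = np.array([[0.0, -1.0],
                  [1.0,  0.0]])                 # no real characteristic values
    T, Z = schur(A, output='complex')           # A = Z T Z*, with T upper-triangular
    print(np.round(T, 10))                      # diagonal entries are i and -i
    print(np.allclose(Z @ T @ Z.conj().T, A))   # True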
Theorem 6. Let V be a finite-dimensional vector space over the field F and let T be a linear operator on V. Then T is diagonalizable if and only if the minimal polynomial for T has the form

    p = (x − c1) · · · (x − ck)

where c1, . . . , ck are distinct elements of F.

Proof. We have noted earlier that, if T is diagonalizable, its minimal polynomial is a product of distinct linear factors (see the discussion prior to Example 4). To prove the converse, let W be the subspace spanned by all of the characteristic vectors of T, and suppose W ≠ V. By the lemma used in the proof of Theorem 5, there is a vector α not in W and a characteristic value cj of T such that the vector

    β = (T − cjI)α

lies in W. Since β is in W,

    β = β1 + · · · + βk

where Tβi = ciβi, 1 ≤ i ≤ k, and therefore the vector

    h(T)β = h(c1)β1 + · · · + h(ck)βk

is in W, for every polynomial h.
Now p = (x − cj)q, for some polynomial q. Also

    q − q(cj) = (x − cj)h.

We have

    q(T)α − q(cj)α = h(T)(T − cjI)α = h(T)β.

But h(T)β is in W and, since

    0 = p(T)α = (T − cjI)q(T)α,

the vector q(T)α is in W. Therefore, q(cj)α is in W. Since α is not in W, we have q(cj) = 0. That contradicts the fact that p has distinct roots. ▮

At the end of Section 6.7, we shall give a different proof of Theorem 6. In addition to being an elegant result, Theorem 6 is useful in a computational way. Suppose we have a linear operator T, represented by the matrix A in some ordered basis, and we wish to know if T is diagonalizable. We compute the characteristic polynomial f. If we can factor f:

    f = (x − c1)^{d1} · · · (x − ck)^{dk},

we have two different methods for determining whether or not T is diagonalizable. One method is to see whether (for each i) we can find di independent characteristic vectors associated with the characteristic value ci. The other method is to check whether or not (T − c1I) · · · (T − ckI) is the zero operator.
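The second method is easy to carry out by machine. Here is a rough numerical sketch, assuming NumPy; characteristic values are computed approximately and compared with a tolerance, so this is an illustration rather than a proof.

    import numpy as np

    def is_diagonalizable(A, tol=1e-8):
        # Theorem 6 test: form (A - c1 I)(A - c2 I)...(A - ck I) over the
        # *distinct* characteristic values and check whether it is zero.
        distinct = []
        for c in np.linalg.eigvals(A):
            if all(abs(c - d) > tol for d in distinct):
                distinct.append(c)
        P = np.eye(len(A), dtype=complex)
        for c in distinct:
            P = P @ (A - c * np.eye(len(A)))
        return np.max(np.abs(P)) < tol

    print(is_diagonalizable(np.array([[2.0, 1.0], [0.0, 2.0]])))   # False
    print(is_diagonalizable(np.array([[2.0, 0.0], [0.0, 3.0]])))   # True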
Theorem 5 provides a different proof of the Cayley-Hamilton theorem. That theorem is easy for a triangular matrix. Hence, via Theorem 5, we obtain the result for any matrix over an algebraically closed field. Any field is a subfield of an algebraically closed field. If one knows that result, one obtains a proof of the Cayley-Hamilton theorem for matrices over any field. If we at least admit into our discussion the Fundamental Theorem of Algebra (the complex number field is algebraically closed), then Theorem 5 provides a proof of the Cayley-Hamilton theorem for complex matrices, and that proof is independent of the one which we gave earlier.

Exercises

1. Let T be the linear operator on R², the matrix of which in the standard ordered basis is

    A = [    ].

(a) Prove that the only subspaces of R² invariant under T are R² and the zero subspace.
(b) If U is the linear operator on C², the matrix of which in the standard ordered basis is A, show that U has 1-dimensional invariant subspaces.
2. Let W be an invariant subspace for T. Prove that the minimal polynomial for the restriction operator T_W divides the minimal polynomial for T, without referring to matrices.
3. Let c be a characteristic value of T and let W be the space of characteristic vectors associated with the characteristic value c. What is the restriction operator T_W?
4. Let

    A = [ 0   1   0 ]
        [ 2  −2   2 ]
        [ 2  −3   2 ].

Is A similar over the field of real numbers to a triangular matrix? If so, find such a triangular matrix.
5. Every matrix A such that A² = A is similar to a diagonal matrix.
6. Let T be a diagonalizable linear operator on the n-dimensional vector space V, and let W be a subspace which is invariant under T. Prove that the restriction operator T_W is diagonalizable.
7. Let T be a linear operator on a finite-dimensional vector space over the field of complex numbers. Prove that T is diagonalizable if and only if T is annihilated by some polynomial over C which has distinct roots.
8. Let T be a linear operator on V. If every subspace of V is invariant under T, then T is a scalar multiple of the identity operator.
9. Let T be the indefinite integral operator

    (Tf)(x) = ∫₀ˣ f(t) dt
Z’O6, , Elementary Canonical Forms, , Chap. 6, , on the space of continuous functions on the interval [0, I]. Is the space of polynomial functions, invariant, under T? The space of differentiable, functions?, The, space of functions, , which vanish, , at z = a?, , 10. Let A be a 3 X 3 matrix with real entries. Prove that, if A is not similar over R, to a triangular, matrix, then A is similar over C to a diagonal matrix., 11. True or false? If the triangular, A is already diagonal., , matrix, , il is similar, , to a diagonal, , matrix,, , then, , 12. Let T be a linear operator on a finite-dimensional, vector space over an algebraically closed field F. Let f be a polynomial, over F. Prove that c is a characteristic value of f(T) if and only if c = f(t), where t is a characteristic, value of T., 13. Let V be the space of n X n matrices over F. Let A be a fixed n X n matrix, over F. Let T and U be the linear operators on V defined by, , T(B) = AB, U(B), (a) True, (b) True, , 6.5., , Simultaneous, , Simultaneous, , = AB - BA., , or false? If ,2 is diagonalizable, or false? If A is diagonalizable,, , (over F), then T is diagonalizablc., then U is diagonalizablc., , Triangulation;, Diagonalization, , Let V be a finite-dimensional, space and let F be a family of linear, triangulate, or diagooperators on I’. We ask when we can simultaneously, nalize the operators in 5, i.e., find one basis @ such that all of the matrices, [?“]a, T in 3, are triangular, (or diagonal). In the case of diagonalization,, it, is necessary that F be a commuting family of operators: UT = T U for all, T, U in 5. That follows from the fact that all diagonal matrices commute., Of course, it is also necessary that each operator in 5 be a diagonalizable, operator. In order to simultaneously, triangulate,, each operator in 5 must, be triangulable. It is not necessary that 5 be a commuting family; however,, that condition is sufficient for simultaneous triangulation, (if each T caI1 be, individually, triangulated)., These results follow from minor variations, of, the proofs of Theorems 5 and 6., The subspace W is invariant, under, (the family of operators) 5 if, W is invariant under each operator in 5., Lemma., Let 5 be a commuting family of triangulable, on V. Let W be a proper subspace of V u>hich is invariant, exists a vector CYin V such that, , linear operators, under 5. There, , a! is not in W;, (b) for each T in 5, the vector TCX is in the subspace spanned by a! and W., , (a), , Proof. It is no loss of generality to assume that 5 contains only a, finite number of operators, because of this observation., Let {Tl, . . . , T,}
be a maximal linearly independent subset of ℱ, i.e., a basis for the subspace spanned by ℱ. If α is a vector such that (b) holds for each Ti, then (b) will hold for every operator which is a linear combination of T1, . . . , Tr.
By the lemma before Theorem 5 (this lemma for a single operator), we can find a vector β1 (not in W) and a scalar c1 such that (T1 − c1I)β1 is in W. Let V1 be the collection of all vectors β in V such that (T1 − c1I)β is in W. Then V1 is a subspace of V which is properly larger than W. Furthermore, V1 is invariant under ℱ, for this reason. If T commutes with T1, then

    (T1 − c1I)(Tβ) = T(T1 − c1I)β.

If β is in V1, then (T1 − c1I)β is in W. Since W is invariant under each T in ℱ, we have T(T1 − c1I)β in W, i.e., Tβ in V1, for all β in V1 and all T in ℱ.
Now W is a proper subspace of V1. Let U2 be the linear operator on V1 obtained by restricting T2 to the subspace V1. The minimal polynomial for U2 divides the minimal polynomial for T2. Therefore, we may apply the lemma before Theorem 5 to that operator and the invariant subspace W. We obtain a vector β2 in V1 (not in W) and a scalar c2 such that (T2 − c2I)β2 is in W. Note that
(a) β2 is not in W;
(b) (T1 − c1I)β2 is in W;
(c) (T2 − c2I)β2 is in W.
Let V2 be the set of all vectors β in V1 such that (T2 − c2I)β is in W. Then V2 is invariant under ℱ. Apply the lemma before Theorem 5 to U3, the restriction of T3 to V2. If we continue in this way, we shall reach a vector α = βr (not in W) such that (Tj − cjI)α is in W, j = 1, . . . , r. ▮

Theorem 7. Let V be a finite-dimensional vector space over the field F. Let ℱ be a commuting family of triangulable linear operators on V. There exists an ordered basis for V such that every operator in ℱ is represented by a triangular matrix in that basis.

Proof. Given the lemma which we just proved, this theorem has the same proof as does Theorem 5, if one replaces T by ℱ. ▮

Corollary. Let ℱ be a commuting family of n × n matrices over an algebraically closed field F. There exists a non-singular n × n matrix P with entries in F such that P⁻¹AP is upper-triangular, for every matrix A in ℱ.

Theorem 8. Let ℱ be a commuting family of diagonalizable linear operators on the finite-dimensional vector space V. There exists an ordered basis for V such that every operator in ℱ is represented in that basis by a diagonal matrix.

Proof. We could prove this theorem by adapting the lemma before Theorem 7 to the diagonalizable case, just as we adapted the lemma
before Theorem 5 to the diagonalizable case in order to prove Theorem 6. However, at this point it is easier to proceed by induction on the dimension of V.
If dim V = 1, there is nothing to prove. Assume the theorem for vector spaces of dimension less than n, and let V be an n-dimensional space. Choose any T in ℱ which is not a scalar multiple of the identity. Let c1, . . . , ck be the distinct characteristic values of T, and (for each i) let Wi be the null space of T − ciI. Fix an index i. Then Wi is invariant under every operator which commutes with T. Let ℱi be the family of linear operators on Wi obtained by restricting the operators in ℱ to the (invariant) subspace Wi. Each operator in ℱi is diagonalizable, because its minimal polynomial divides the minimal polynomial for the corresponding operator in ℱ. Since dim Wi < dim V, the operators in ℱi can be simultaneously diagonalized. In other words, Wi has a basis ℬi which consists of vectors which are simultaneously characteristic vectors for every operator in ℱi. Since T is diagonalizable, the lemma before Theorem 2 tells us that ℬ = (ℬ1, . . . , ℬk) is a basis for V. That is the basis we seek. ▮
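A small numerical illustration of Theorem 8 (a sketch assuming NumPy; the matrices below are chosen only so that one of them has distinct characteristic values, in which case any basis of its characteristic vectors already works for the whole commuting family):

    import numpy as np

    A = np.array([[1.0, 1.0], [1.0, 1.0]])
    B = np.array([[3.0, 1.0], [1.0, 3.0]])
    print(np.allclose(A @ B, B @ A))      # the family {A, B} commutes

    w, P = np.linalg.eig(A)               # A has distinct values 0 and 2
    Pinv = np.linalg.inv(P)
    print(np.round(Pinv @ A @ P, 10))     # diagonal
    print(np.round(Pinv @ B @ P, 10))     # diagonal in the same basis

When the chosen operator has repeated characteristic values, one restricts the family to each of its characteristic spaces and repeats the argument, exactly as in the induction step above.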
Exercises

1. Find an invertible real matrix P such that P⁻¹AP and P⁻¹BP are both diagonal, where A and B are the real matrices

    (a)  A = [    ],  B = [    ]        (b)  A = [    ],  B = [    ].

2. Let ℱ be a commuting family of 3 × 3 complex matrices. How many linearly independent matrices can ℱ contain? What about the n × n case?
3. Let T be a linear operator on an n-dimensional space, and suppose that T has n distinct characteristic values. Prove that any linear operator which commutes with T is a polynomial in T.
4. Let A, B, C, and D be n × n complex matrices which commute. Let E be the 2n × 2n matrix

    E = [ A  B ]
        [ C  D ].

Prove that det E = det (AD − BC).
5. Let F be a field, n a positive integer, and let V be the space of n × n matrices over F. If A is a fixed n × n matrix over F, let T_A be the linear operator on V defined by T_A(B) = AB − BA. Consider the family of linear operators T_A obtained by letting A vary over all diagonal matrices. Prove that the operators in that family are simultaneously diagonalizable.
6.6. Direct-Sum Decompositions

As we continue with our analysis of a single linear operator, we shall formulate our ideas in a slightly more sophisticated way, less in terms of matrices and more in terms of subspaces. When we began this chapter, we described our goal this way: To find an ordered basis in which the matrix of T assumes an especially simple form. Now, we shall describe our goal as follows: To decompose the underlying space V into a sum of invariant subspaces for T such that the restriction operators on those subspaces are simple.

Definition. Let W1, . . . , Wk be subspaces of the vector space V. We say that W1, . . . , Wk are independent if

    α1 + · · · + αk = 0,    αi in Wi

implies that each αi is 0.

For k = 2, the meaning of independence is {0} intersection, i.e., W1 and W2 are independent if and only if W1 ∩ W2 = {0}. If k > 2, the independence of W1, . . . , Wk says much more than W1 ∩ · · · ∩ Wk = {0}. It says that each Wj intersects the sum of the other subspaces Wi only in the zero vector.
The significance of independence is this. Let W = W1 + · · · + Wk be the subspace spanned by W1, . . . , Wk. Each vector α in W can be expressed as a sum

    α = α1 + · · · + αk,    αi in Wi.

If W1, . . . , Wk are independent, then that expression for α is unique; for if

    α = β1 + · · · + βk,    βi in Wi

then 0 = (α1 − β1) + · · · + (αk − βk), hence αi − βi = 0, i = 1, . . . , k. Thus, when W1, . . . , Wk are independent, we can operate with the vectors in W as k-tuples (α1, . . . , αk), αi in Wi, in the same way as we operate with vectors in Rᵏ as k-tuples of numbers.

Lemma. Let V be a finite-dimensional vector space. Let W1, . . . , Wk be subspaces of V and let W = W1 + · · · + Wk. The following are equivalent.
(a) W1, . . . , Wk are independent.
(b) For each j, 2 ≤ j ≤ k, we have

    Wj ∩ (W1 + · · · + Wj−1) = {0}.

(c) If ℬi is an ordered basis for Wi, 1 ≤ i ≤ k, then the sequence ℬ = (ℬ1, . . . , ℬk) is an ordered basis for W.
Proof. Assume (a). Let α be a vector in the intersection Wj ∩ (W1 + · · · + Wj−1). Then there are vectors α1, . . . , αj−1 with αi in Wi such that α = α1 + · · · + αj−1. Since

    α1 + · · · + αj−1 + (−α) + 0 + · · · + 0 = 0

and since W1, . . . , Wk are independent, it must be that α1 = α2 = · · · = αj−1 = α = 0.
Now, let us observe that (b) implies (a). Suppose

    0 = α1 + · · · + αk,    αi in Wi.

Let j be the largest integer i such that αi ≠ 0. Then

    0 = α1 + · · · + αj,    αj ≠ 0.

Thus αj = −α1 − · · · − αj−1 is a non-zero vector in Wj ∩ (W1 + · · · + Wj−1).
Now that we know (a) and (b) are the same, let us see why (a) is equivalent to (c). Assume (a). Let ℬi be a basis for Wi, 1 ≤ i ≤ k, and let ℬ = (ℬ1, . . . , ℬk). Any linear relation between the vectors in ℬ will have the form

    β1 + · · · + βk = 0

where βi is some linear combination of the vectors in ℬi. Since W1, . . . , Wk are independent, each βi is 0. Since each ℬi is independent, the relation we have between the vectors in ℬ is the trivial relation.
We relegate the proof that (c) implies (a) to the exercises (Exercise 2). ▮

If any (and hence all) of the conditions of the last lemma hold, we say that the sum W = W1 + · · · + Wk is direct or that W is the direct sum of W1, . . . , Wk, and we write

    W = W1 ⊕ · · · ⊕ Wk.

In the literature, the reader may find this direct sum referred to as an independent sum or the interior direct sum of W1, . . . , Wk.

EXAMPLE 11. Let V be a finite-dimensional vector space over the field F and let {α1, . . . , αn} be any basis for V. If Wi is the one-dimensional subspace spanned by αi, then V = W1 ⊕ · · · ⊕ Wn.

EXAMPLE 12. Let n be a positive integer and F a subfield of the complex numbers, and let V be the space of all n × n matrices over F. Let W1 be the subspace of all symmetric matrices, i.e., matrices A such that Aᵗ = A. Let W2 be the subspace of all skew-symmetric matrices, i.e., matrices A such that Aᵗ = −A. Then V = W1 ⊕ W2. If A is any matrix in V, the unique expression for A as a sum of matrices, one in W1 and the other in W2, is
    A = A1 + A2
    A1 = ½(A + Aᵗ)
    A2 = ½(A − Aᵗ).

EXAMPLE 13. Let T be any linear operator on a finite-dimensional space V. Let c1, . . . , ck be the distinct characteristic values of T, and let Wi be the space of characteristic vectors associated with the characteristic value ci. Then W1, . . . , Wk are independent. See the lemma before Theorem 2. In particular, if T is diagonalizable, then V = W1 ⊕ · · · ⊕ Wk.

Definition. If V is a vector space, a projection of V is a linear operator E on V such that E² = E.

Suppose that E is a projection. Let R be the range of E and let N be the null space of E.
1. The vector β is in the range R if and only if Eβ = β. If β = Eα, then Eβ = E²α = Eα = β. Conversely, if β = Eβ, then (of course) β is in the range of E.
2. V = R ⊕ N.
3. The unique expression for α as a sum of vectors in R and N is α = Eα + (α − Eα).
From (1), (2), (3) it is easy to see the following. If R and N are subspaces of V such that V = R ⊕ N, there is one and only one projection operator E which has range R and null space N. That operator is called the projection on R along N.
Any projection E is (trivially) diagonalizable. If {α1, . . . , αr} is a basis for R and {αr+1, . . . , αn} a basis for N, then the basis ℬ = {α1, . . . , αn} diagonalizes E:

    [E]ℬ = [ I  0 ]
           [ 0  0 ]

where I is the r × r identity matrix. That should help explain some of the terminology connected with projections. The reader should look at various cases in the plane R² (or 3-space, R³), to convince himself that the projection on R along N sends each vector into R by projecting it parallel to N.
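Concretely (a sketch assuming NumPy, with two arbitrarily chosen complementary lines in R²): listing a basis of R followed by a basis of N as the columns of an invertible matrix P, the projection on R along N is P · diag(1, . . . , 1, 0, . . . , 0) · P⁻¹, which is the diagonalization displayed above read backwards.

    import numpy as np

    # Projection of R^2 on R = span{(2, 1)} along N = span{(1, 1)}.
    P = np.column_stack([[2.0, 1.0], [1.0, 1.0]])      # basis of R, then of N
    E = P @ np.diag([1.0, 0.0]) @ np.linalg.inv(P)
    print(np.allclose(E @ E, E))                       # idempotent
    print(E @ np.array([2.0, 1.0]))                    # fixed:  [2. 1.]
    print(E @ np.array([1.0, 1.0]))                    # killed: [0. 0.]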
Projections can be used to describe direct-sum decompositions of the space V. For, suppose V = W1 ⊕ · · · ⊕ Wk. For each j we shall define an operator Ej on V. Let α be in V, say α = α1 + · · · + αk with αi in Wi. Define Ejα = αj. Then Ej is a well-defined rule. It is easy to see that Ej is linear, that the range of Ej is Wj, and that Ej² = Ej. The null space of Ej is the subspace

    (W1 + · · · + Wj−1 + Wj+1 + · · · + Wk)

for, the statement that Ejα = 0 simply means αj = 0, i.e., that α is actually a sum of vectors from the spaces Wi with i ≠ j. In terms of the projections Ej we have

(6-13)    α = E1α + · · · + Ekα

for each α in V. What (6-13) says is that

    I = E1 + · · · + Ek.

Note also that if i ≠ j, then EiEj = 0, because the range of Ej is the subspace Wj which is contained in the null space of Ei. We shall now summarize our findings and state and prove a converse.

Theorem 9. If V = W1 ⊕ · · · ⊕ Wk, then there exist k linear operators E1, . . . , Ek on V such that
(i) each Ei is a projection (Ei² = Ei);
(ii) EiEj = 0, if i ≠ j;
(iii) I = E1 + · · · + Ek;
(iv) the range of Ei is Wi.
Conversely, if E1, . . . , Ek are k linear operators on V which satisfy conditions (i), (ii), and (iii), and if we let Wi be the range of Ei, then V = W1 ⊕ · · · ⊕ Wk.

Proof. We have only to prove the converse statement. Suppose E1, . . . , Ek are linear operators on V which satisfy the first three conditions, and let Wi be the range of Ei. Then certainly

    V = W1 + · · · + Wk;

for, by condition (iii) we have

    α = E1α + · · · + Ekα

for each α in V, and Eiα is in Wi. This expression for α is unique, because if

    α = α1 + · · · + αk

with αi in Wi, say αi = Eiβi, then using (i) and (ii) we have

    Ejα = Ejα1 + · · · + Ejαk = EjE1β1 + · · · + EjEkβk = Ej²βj = Ejβj = αj.

This shows that V is the direct sum of the Wi. ▮
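Example 12 supplies a concrete pair of such projections: E1(A) = ½(A + Aᵗ) and E2(A) = ½(A − Aᵗ) on the space of n × n matrices. A quick numerical check of conditions (i)-(iii), as a sketch assuming NumPy and an arbitrary 2 × 2 matrix:

    import numpy as np

    E1 = lambda A: (A + A.T) / 2        # projection onto the symmetric matrices
    E2 = lambda A: (A - A.T) / 2        # projection onto the skew-symmetric matrices

    A = np.array([[1.0, 2.0], [5.0, 3.0]])
    print(np.allclose(E1(A) + E2(A), A))         # (iii)  I = E1 + E2
    print(np.allclose(E1(E2(A)), 0))             # (ii)   E1 E2 = 0
    print(np.allclose(E1(E1(A)), E1(A)))         # (i)    E1^2 = E1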
Exercises

1. Let V be a finite-dimensional vector space and let W1 be any subspace of V. Prove that there is a subspace W2 of V such that V = W1 ⊕ W2.
2. Let V be a finite-dimensional vector space and let W1, . . . , Wk be subspaces of V such that

    V = W1 + · · · + Wk    and    dim V = dim W1 + · · · + dim Wk.

Prove that V = W1 ⊕ · · · ⊕ Wk.
3. Find a projection E which projects R² onto the subspace spanned by (1, −1) along the subspace spanned by (1, 2).
4. If E1 and E2 are projections onto independent subspaces, then E1 + E2 is a projection. True or false?
5. If E is a projection and f is a polynomial, then f(E) = aI + bE. What are a and b in terms of the coefficients of f?
6. True or false? If a diagonalizable operator has only the characteristic values 0 and 1, it is a projection.
7. Prove that if E is the projection on R along N, then (I − E) is the projection on N along R.
8. Let E1, . . . , Ek be linear operators on the space V such that E1 + · · · + Ek = I.
(a) Prove that if EiEj = 0 for i ≠ j, then Ei² = Ei for each i.
(b) In the case k = 2, prove the converse of (a). That is, if E1 + E2 = I and E1² = E1, E2² = E2, then E1E2 = 0.
9. Let V be a real vector space and E an idempotent linear operator on V, i.e., a projection. Prove that (I + E) is invertible. Find (I + E)⁻¹.
10. Let F be a subfield of the complex numbers (or, a field of characteristic zero). Let V be a finite-dimensional vector space over F. Suppose that E1, . . . , Ek are projections of V and that E1 + · · · + Ek = I. Prove that EiEj = 0 for i ≠ j. (Hint: Use the trace function and ask yourself what the trace of a projection is.)
11. Let V be a vector space, let W1, . . . , Wk be subspaces of V, and let

    Vj = W1 + · · · + Wj−1 + Wj+1 + · · · + Wk.

Suppose that V = W1 ⊕ · · · ⊕ Wk. Prove that the dual space V* has the direct-sum decomposition V* = V1⁰ ⊕ · · · ⊕ Vk⁰.

6.7. Invariant Direct Sums

We are primarily interested in direct-sum decompositions V = W1 ⊕ · · · ⊕ Wk, where each of the subspaces Wi is invariant under some given linear operator T. Given such a decomposition of V, T induces a linear operator Ti on each Wi by restriction. The action of T is then this.
If α is a vector in V, we have unique vectors α1, . . . , αk with αi in Wi such that

    α = α1 + · · · + αk

and then

    Tα = T1α1 + · · · + Tkαk.

We shall describe this situation by saying that T is the direct sum of the operators T1, . . . , Tk. It must be remembered in using this terminology that the Ti are not linear operators on the space V but on the various subspaces Wi. The fact that V = W1 ⊕ · · · ⊕ Wk enables us to associate with each α in V a unique k-tuple (α1, . . . , αk) of vectors αi in Wi (by α = α1 + · · · + αk) in such a way that we can carry out the linear operations in V by working in the individual subspaces Wi. The fact that each Wi is invariant under T enables us to view the action of T as the independent action of the operators Ti on the subspaces Wi. Our purpose is to study T by finding invariant direct-sum decompositions in which the Ti are operators of an elementary nature.
Before looking at an example, let us note the matrix analogue of this situation. Suppose we select an ordered basis ℬi for each Wi, and let ℬ be the ordered basis for V consisting of the union of the ℬi arranged in the order ℬ1, . . . , ℬk, so that ℬ is a basis for V. From our discussion concerning the matrix analogue for a single invariant subspace, it is easy to see that if A = [T]ℬ and Ai = [Ti]ℬi, then A has the block form

(6-14)    A = [ A1  0   . . .  0  ]
              [ 0   A2  . . .  0  ]
              [ .   .          .  ]
              [ 0   0   . . .  Ak ].

In (6-14), Ai is a di × di matrix (di = dim Wi), and the 0's are symbols for rectangular blocks of scalar 0's of various sizes. It also seems appropriate to describe (6-14) by saying that A is the direct sum of the matrices A1, . . . , Ak.
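In computations this block form is easy to produce; for example (a sketch assuming NumPy and SciPy, with two small blocks chosen only for illustration):

    import numpy as np
    from scipy.linalg import block_diag

    A1 = np.array([[2.0, 1.0],
                   [0.0, 2.0]])          # matrix of T1 on W1
    A2 = np.array([[-1.0]])              # matrix of T2 on W2
    A = block_diag(A1, A2)               # the direct sum (6-14)
    print(A)
    # Computations with A reduce to the blocks; e.g. the characteristic
    # polynomial of A is the product of those of A1 and A2 (cf. Exercise 4
    # at the end of this section).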
Most often, we shall describe the subspace Wi by means of the associated projections Ei (Theorem 9). Therefore, we need to be able to phrase the invariance of the subspaces Wi in terms of the Ei.

Theorem 10. Let T be a linear operator on the space V, and let W1, . . . , Wk and E1, . . . , Ek be as in Theorem 9. Then a necessary and sufficient condition that each subspace Wi be invariant under T is that T commute with each of the projections Ei, i.e.,

    TEi = EiT,    i = 1, . . . , k.

Proof. Suppose T commutes with each Ei. Let α be in Wj. Then Ejα = α, and

    Tα = T(Ejα) = Ej(Tα)

which shows that Tα is in the range of Ej, i.e., that Wj is invariant under T.
Assume now that each Wi is invariant under T. We shall show that TEj = EjT. Let α be any vector in V. Then

    α = E1α + · · · + Ekα
    Tα = TE1α + · · · + TEkα.

Since Eiα is in Wi, which is invariant under T, we must have T(Eiα) = Eiβi for some vector βi. Then

    EjTEiα = EjEiβi = 0, if i ≠ j;    EjTEjα = Ejβj.

Thus

    EjTα = EjTE1α + · · · + EjTEkα = Ejβj = TEjα.

This holds for each α in V, so EjT = TEj. ▮

We shall now describe a diagonalizable operator T in the language of invariant direct sum decompositions (projections which commute with T). This will be a great help to us in understanding some deeper decomposition theorems later. The reader may feel that the description which we are about to give is rather complicated, in comparison to the matrix formulation or to the simple statement that the characteristic vectors of T span the underlying space. But, he should bear in mind that this is our first glimpse at a very effective method, by means of which various problems concerned with subspaces, bases, matrices, and the like can be reduced to algebraic calculations with linear operators. With a little experience, the efficiency and elegance of this method of reasoning should become apparent.

Theorem 11. Let T be a linear operator on a finite-dimensional space V. If T is diagonalizable and if c1, . . . , ck are the distinct characteristic values of T, then there exist linear operators E1, . . . , Ek on V such that
(i) T = c1E1 + · · · + ckEk;
(ii) I = E1 + · · · + Ek;
(iii) EiEj = 0, i ≠ j;
(iv) Ei² = Ei (Ei is a projection);
(v) the range of Ei is the characteristic space for T associated with ci.
Conversely, if there exist k distinct scalars c1, . . . , ck and k non-zero linear operators E1, . . . , Ek which satisfy conditions (i), (ii), and (iii), then T is diagonalizable, c1, . . . , ck are the distinct characteristic values of T, and conditions (iv) and (v) are satisfied also.

Proof. Suppose that T is diagonalizable, with distinct
characteristic values c1, . . . , ck. Let Wi be the space of characteristic vectors associated with the characteristic value ci. As we have seen,

    V = W1 ⊕ · · · ⊕ Wk.

Let E1, . . . , Ek be the projections associated with this decomposition, as in Theorem 9. Then (ii), (iii), (iv) and (v) are satisfied. To verify (i), proceed as follows. For each α in V,

    α = E1α + · · · + Ekα

and so

    Tα = TE1α + · · · + TEkα = c1E1α + · · · + ckEkα.

In other words, T = c1E1 + · · · + ckEk.
Now suppose that we are given a linear operator T along with distinct scalars ci and non-zero operators Ei which satisfy (i), (ii) and (iii). Since EiEj = 0 when i ≠ j, we multiply both sides of I = E1 + · · · + Ek by Ei and obtain immediately Ei² = Ei. Multiplying T = c1E1 + · · · + ckEk by Ei, we then have TEi = ciEi, which shows that any vector in the range of Ei is in the null space of (T − ciI). Since we have assumed that Ei ≠ 0, this proves that there is a non-zero vector in the null space of (T − ciI), i.e., that ci is a characteristic value of T. Furthermore, the ci are all of the characteristic values of T; for, if c is any scalar, then

    T − cI = (c1 − c)E1 + · · · + (ck − c)Ek,

so if (T − cI)α = 0, we must have (ci − c)Eiα = 0. If α is not the zero vector, then Eiα ≠ 0 for some i, so that for this i we have ci − c = 0.
Certainly T is diagonalizable, since we have shown that every non-zero vector in the range of Ei is a characteristic vector of T, and the fact that I = E1 + · · · + Ek shows that these characteristic vectors span V. All that remains to be demonstrated is that the null space of (T − ciI) is exactly the range of Ei. But this is clear, because if Tα = ciα, then

    (c1 − ci)E1α + · · · + (ck − ci)Ekα = 0,

hence

    (cj − ci)Ejα = 0    for each j

and then

    Ejα = 0,    j ≠ i.

Since α = E1α + · · · + Ekα, and Ejα = 0 for j ≠ i, we have α = Eiα, which proves that α is in the range of Ei. ▮
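Numerically, the resolution T = c1E1 + · · · + ckEk can be assembled from a basis of characteristic vectors. A sketch, assuming NumPy and assuming for simplicity that the characteristic values are distinct; the symmetric matrix below is just a convenient example:

    import numpy as np

    A = np.array([[5.0, -2.0], [-2.0, 5.0]])      # characteristic values 3 and 7
    w, P = np.linalg.eig(A)
    Pinv = np.linalg.inv(P)
    E = [P @ np.diag(np.eye(len(w))[i]) @ Pinv for i in range(len(w))]

    print(np.allclose(sum(c * Ei for c, Ei in zip(w, E)), A))   # (i)
    print(np.allclose(sum(E), np.eye(2)))                       # (ii)
    print(np.allclose(E[0] @ E[1], 0))                          # (iii)
    print(np.allclose(E[0] @ E[0], E[0]))                       # (iv)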
One part of Theorem 11 says that for a diagonalizable operator T, the scalars c1, . . . , ck and the operators E1, . . . , Ek are uniquely determined by conditions (i), (ii), (iii), the fact that the ci are distinct, and the fact that the Ei are non-zero. One of the pleasant features of the decomposition T = c1E1 + · · · + ckEk is that if g is any polynomial over the field F, then

    g(T) = g(c1)E1 + · · · + g(ck)Ek.

We leave the details of the proof to the reader. To see how it is proved one need only compute T^r for each positive integer r. For example,

    T² = (c1E1 + · · · + ckEk)(c1E1 + · · · + ckEk) = Σ_{i,j} cicj EiEj = c1²E1 + · · · + ck²Ek.

The reader should compare this with g(A) where A is a diagonal matrix; for then g(A) is simply the diagonal matrix with diagonal entries g(A11), . . . , g(Ann).
We should like in particular to note what happens when one applies the Lagrange polynomials corresponding to the scalars c1, . . . , ck:

    pj = Π_{i≠j} (x − ci)/(cj − ci).

We have pj(ci) = δij, which means that

    pj(T) = δ1j E1 + · · · + δkj Ek = Ej.

Thus the projections Ej not only commute with T but are polynomials in T.
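In matrix terms this says each Ej can be produced directly from T, without ever computing characteristic vectors, by evaluating the Lagrange polynomial pj at the matrix. A sketch, assuming NumPy and that the distinct characteristic values are known; the same illustrative matrix as above:

    import numpy as np

    A = np.array([[5.0, -2.0], [-2.0, 5.0]])
    c = [3.0, 7.0]                                  # distinct characteristic values

    def lagrange_projection(j):
        # E_j = p_j(A), where p_j = prod_{i != j} (x - c_i)/(c_j - c_i)
        E = np.eye(len(A))
        for i, ci in enumerate(c):
            if i != j:
                E = E @ (A - ci * np.eye(len(A))) / (c[j] - ci)
        return E

    E0, E1 = lagrange_projection(0), lagrange_projection(1)
    print(np.allclose(E0 + E1, np.eye(2)))          # I = E1 + E2
    print(np.allclose(3 * E0 + 7 * E1, A))          # T = c1 E1 + c2 E2
    print(np.allclose(E0 @ E1, 0))                  # Ei Ej = 0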
Such calculations with polynomials in T can be used to give an alternative proof of Theorem 6, which characterized diagonalizable operators in terms of their minimal polynomials. The proof is entirely independent of our earlier proof.
If T is diagonalizable, T = c1E1 + · · · + ckEk, then

    g(T) = g(c1)E1 + · · · + g(ck)Ek

for every polynomial g. Thus g(T) = 0 if and only if g(ci) = 0 for each i. In particular, the minimal polynomial for T is

    p = (x − c1) · · · (x − ck).

Now suppose T is a linear operator with minimal polynomial p = (x − c1) · · · (x − ck), where c1, . . . , ck are distinct elements of the scalar field. We form the Lagrange polynomials

    pj = Π_{i≠j} (x − ci)/(cj − ci).

We recall from Chapter 4 that pi(cj) = δij and for any polynomial g of degree less than or equal to (k − 1) we have

    g = g(c1)p1 + · · · + g(ck)pk.

Taking g to be the scalar polynomial 1 and then the polynomial x, we have

(6-15)    1 = p1 + · · · + pk
          x = c1p1 + · · · + ckpk.

(The astute reader will note that the application to x may not be valid because k may be 1. But if k = 1, T is a scalar multiple of the identity and hence diagonalizable.) Now let Ej = pj(T). From (6-15) we have

(6-16)    I = E1 + · · · + Ek
          T = c1E1 + · · · + ckEk.

Observe that if i ≠ j, then pipj is divisible by the minimal polynomial p, because pipj contains every (x − cr) as a factor. Thus

(6-17)    EiEj = 0,    i ≠ j.

We must note one further thing, namely, that Ei ≠ 0 for each i. This is because p is the minimal polynomial for T and so we cannot have pi(T) = 0 since pi has degree less than the degree of p. This last comment, together with (6-16), (6-17), and the fact that the ci are distinct enables us to apply Theorem 11 to conclude that T is diagonalizable. ▮

Exercises

1. Let E be a projection of V and let T be a linear operator on V. Prove that the range of E is invariant under T if and only if ETE = TE. Prove that both the range and null space of E are invariant under T if and only if ET = TE.
2. Let T be the linear operator on R², the matrix of which in the standard ordered basis is

    A = [ 2  1 ]
        [ 0  2 ].

Let W1 be the subspace of R² spanned by the vector ε1 = (1, 0).
(a) Prove that W1 is invariant under T.
(b) Prove that there is no subspace W2 which is invariant under T and which is complementary to W1:

    R² = W1 ⊕ W2.

(Compare with Exercise 1 of Section 6.5.)
3. Let T be a linear operator on a finite-dimensional vector space V. Let R be the range of T and let N be the null space of T. Prove that R and N are independent if and only if V = R ⊕ N.
4. Let T be a linear operator on V. Suppose V = W1 ⊕ · · · ⊕ Wk, where each Wi is invariant under T. Let Ti be the induced (restriction) operator on Wi.
(a) Prove that det (T) = det (T1) · · · det (Tk).
(b) Prove that the characteristic polynomial for T is the product of the characteristic polynomials for T1, . . . , Tk.
(c) Prove that the minimal polynomial for T is the least common multiple of the minimal polynomials for T1, . . . , Tk. (Hint: Prove and then use the corresponding facts about direct sums of matrices.)
5. Let T be the diagonalizable linear operator on R³ which we discussed in Example 3 of Section 6.2. Use the Lagrange polynomials to write the representing matrix A in the form A = E1 + 2E2, E1 + E2 = I, E1E2 = 0.
6. Let A be the 4 × 4 matrix in Example 6 of Section 6.3. Find matrices E1, E2, E3 such that A = c1E1 + c2E2 + c3E3, E1 + E2 + E3 = I, and EiEj = 0, i ≠ j.
7. In Exercises 5 and 6, notice that (for each i) the space of characteristic vectors associated with the characteristic value ci is spanned by the column vectors of the various matrices Ej with j ≠ i. Is that a coincidence?
8. Let T be a linear operator on V which commutes with every projection operator on V. What can you say about T?
9. Let V be the vector space of continuous real-valued functions on the interval [−1, 1] of the real line. Let We be the subspace of even functions, f(−x) = f(x), and let Wo be the subspace of odd functions, f(−x) = −f(x).
(a) Show that V = We ⊕ Wo.
(b) If T is the indefinite integral operator

    (Tf)(x) = ∫₀ˣ f(t) dt,

are We and Wo invariant under T?

6.8. The Primary Decomposition Theorem

We are trying to study a linear operator T on the finite-dimensional space V, by decomposing T into a direct sum of operators which are in some sense elementary. We can do this through the characteristic values and vectors of T in certain special cases, i.e., when the minimal polynomial for T factors over the scalar field F into a product of distinct monic polynomials of degree 1. What can we do with the general T? If we try to study T using characteristic values, we are confronted with two problems. First, T may not have a single characteristic value; this is really a deficiency in the scalar field, namely, that it is not algebraically closed. Second, even if the characteristic polynomial factors completely over F into a product of polynomials of degree 1, there may not be enough characteristic vectors for T to span the space V; this is clearly a deficiency in T. The second situation
is illustrated by the operator T on F³ (F any field) represented in the standard basis by

    A = [ 2  0   0 ]
        [ 1  2   0 ]
        [ 0  0  −1 ].

The characteristic polynomial for A is (x − 2)²(x + 1) and this is plainly also the minimal polynomial for A (or for T). Thus T is not diagonalizable. One sees that this happens because the null space of (T − 2I) has dimension 1 only. On the other hand, the null space of (T + I) and the null space of (T − 2I)² together span V, the former being the subspace spanned by ε3 and the latter the subspace spanned by ε1 and ε2.
This will be more or less our general method for the second problem. If (remember this is an assumption) the minimal polynomial for T decomposes

    p = (x − c1)^{r1} · · · (x − ck)^{rk}

where c1, . . . , ck are distinct elements of F, then we shall show that the space V is the direct sum of the null spaces of (T − ciI)^{ri}, i = 1, . . . , k. The hypothesis about p is equivalent to the fact that T is triangulable (Theorem 5); however, that knowledge will not help us.
The theorem which we prove is more general than what we have described, since it works with the primary decomposition of the minimal polynomial, whether or not the primes which enter are all of first degree. The reader will find it helpful to think of the special case when the primes are of degree 1, and even more particularly, to think of the projection-type proof of Theorem 6, a special case of this theorem.

Theorem 12 (Primary Decomposition Theorem). Let T be a linear operator on the finite-dimensional vector space V over the field F. Let p be the minimal polynomial for T,

    p = p1^{r1} · · · pk^{rk},

where the pi are distinct irreducible monic polynomials over F and the ri are positive integers. Let Wi be the null space of pi(T)^{ri}, i = 1, . . . , k. Then
(i) V = W1 ⊕ · · · ⊕ Wk;
(ii) each Wi is invariant under T;
(iii) if Ti is the operator induced on Wi by T, then the minimal polynomial for Ti is pi^{ri}.

Proof. The idea of the proof is this. If the direct-sum decomposition (i) is valid, how can we get hold of the projections E1, . . . , Ek associated with the decomposition? The projection Ei will be the identity on Wi and zero on the other Wj. We shall find a polynomial hi such that hi(T) is the identity on Wi and is zero on the other Wj, and so that h1(T) + · · · + hk(T) = I, etc.
For each i, let

    fi = p / pi^{ri} = Π_{j≠i} pj^{rj}.

Since p1, . . . , pk are distinct prime polynomials, the polynomials f1, . . . , fk are relatively prime (Theorem 10, Chapter 4). Thus there are polynomials g1, . . . , gk such that

    f1g1 + · · · + fkgk = 1.

Note also that if i ≠ j, then fifj is divisible by the polynomial p, because fifj contains each pm^{rm} as a factor. We shall show that the polynomials hi = figi behave in the manner described in the first paragraph of the proof.
Let Ei = hi(T) = fi(T)gi(T). Since h1 + · · · + hk = 1 and p divides fifj for i ≠ j, we have

    E1 + · · · + Ek = I
    EiEj = 0,    if i ≠ j.

Thus the Ei are projections which correspond to some direct-sum decomposition of the space V. We wish to show that the range of Ei is exactly the subspace Wi. It is clear that each vector in the range of Ei is in Wi, for if α is in the range of Ei, then α = Eiα and so

    pi(T)^{ri} α = pi(T)^{ri} Eiα = pi(T)^{ri} fi(T) gi(T) α = 0

because pi^{ri} fi gi is divisible by the minimal polynomial p. Conversely, suppose that α is in the null space of pi(T)^{ri}. If j ≠ i, then fjgj is divisible by pi^{ri} and so fj(T)gj(T)α = 0, i.e., Ejα = 0 for j ≠ i. But then it is immediate that Eiα = α, i.e., that α is in the range of Ei. This completes the proof of statement (i).
It is certainly clear that the subspaces Wi are invariant under T. If Ti is the operator induced on Wi by T, then evidently pi(Ti)^{ri} = 0, because by definition pi(T)^{ri} is 0 on the subspace Wi. This shows that the minimal polynomial for Ti divides pi^{ri}. Conversely, let g be any polynomial such that g(Ti) = 0. Then g(T)fi(T) = 0. Thus gfi is divisible by the minimal polynomial p of T, i.e., pi^{ri} fi divides gfi. It is easily seen that pi^{ri} divides g. Hence the minimal polynomial for Ti is pi^{ri}. ▮

Corollary. If E1, . . . , Ek are the projections associated with the primary decomposition of T, then each Ei is a polynomial in T, and accordingly if a linear operator U commutes with T then U commutes with each of the Ei, i.e., each subspace Wi is invariant under U.
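For the operator that opened this section, whose minimal polynomial is (x − 2)²(x + 1), the decomposition can be checked numerically. A sketch assuming NumPy; the null spaces are found from the singular value decomposition with a tolerance:

    import numpy as np

    def null_space_basis(M, tol=1e-10):
        _, s, Vt = np.linalg.svd(M)
        return Vt[np.sum(s > tol):].T              # orthonormal basis of the null space

    A = np.array([[2.0, 0.0, 0.0],
                  [1.0, 2.0, 0.0],
                  [0.0, 0.0, -1.0]])
    W1 = null_space_basis(np.linalg.matrix_power(A - 2 * np.eye(3), 2))
    W2 = null_space_basis(A + np.eye(3))
    print(W1.shape[1], W2.shape[1])                            # 2 1
    print(np.linalg.matrix_rank(np.column_stack([W1, W2])))    # 3, so F^3 = W1 (+) W2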
In the notation of the proof of Theorem 12, let us take a look at the special case in which the minimal polynomial for T is a product of first-degree polynomials, i.e., the case in which each pi is of the form pi = x − ci. Now the range of Ei is the null space Wi of (T − ciI)^{ri}.
Let us put D = c1E1 + · · · + ckEk. By Theorem 11, D is a diagonalizable operator which we shall call the diagonalizable part of T. Let us look at the operator N = T − D. Now

    T = TE1 + · · · + TEk
    D = c1E1 + · · · + ckEk

so

    N = (T − c1I)E1 + · · · + (T − ckI)Ek.

The reader should be familiar enough with projections by now so that he sees that

    N² = (T − c1I)²E1 + · · · + (T − ckI)²Ek

and in general that

    N^r = (T − c1I)^r E1 + · · · + (T − ckI)^r Ek.

When r ≥ ri for each i, we shall have N^r = 0, because the operator (T − ciI)^r will then be 0 on the range of Ei.

Definition. Let N be a linear operator on the vector space V. We say that N is nilpotent if there is some positive integer r such that N^r = 0.

Theorem 13. Let T be a linear operator on the finite-dimensional vector space V over the field F. Suppose that the minimal polynomial for T decomposes over F into a product of linear polynomials. Then there is a diagonalizable operator D on V and a nilpotent operator N on V such that
(i) T = D + N,
(ii) DN = ND.
The diagonalizable operator D and the nilpotent operator N are uniquely determined by (i) and (ii) and each of them is a polynomial in T.

Proof. We have just observed that we can write T = D + N, where D is diagonalizable and N is nilpotent, and where D and N not only commute but are polynomials in T. Now suppose that we also have T = D′ + N′ where D′ is diagonalizable, N′ is nilpotent, and D′N′ = N′D′. We shall prove that D = D′ and N = N′.
Since D′ and N′ commute with one another and T = D′ + N′, we see that D′ and N′ commute with T. Thus D′ and N′ commute with any polynomial in T; hence they commute with D and with N. Now we have

    D + N = D′ + N′

or

    D − D′ = N′ − N

and all four of these operators commute with one another. Since D and D′ are both diagonalizable and they commute, they are simultaneously
diagonalizable, and D − D′ is diagonalizable. Since N and N′ are both nilpotent and they commute, the operator (N′ − N) is nilpotent; for, using the fact that N and N′ commute,

    (N′ − N)^r = Σ_{j=0}^{r} (r choose j) (N′)^{r−j}(−N)^j

and so when r is sufficiently large every term in this expression for (N′ − N)^r will be 0. (Actually, a nilpotent operator on an n-dimensional space must have its nth power 0; if we take r = 2n above, that will be large enough. It then follows that r = n is large enough, but this is not obvious from the above expression.) Now D − D′ is a diagonalizable operator which is also nilpotent. Such an operator is obviously the zero operator; for since it is nilpotent, the minimal polynomial for this operator is of the form x^r for some r ≥ 1; but then since the operator is diagonalizable, the minimal polynomial cannot have a repeated root; hence r = 1 and the minimal polynomial is simply x, which says the operator is 0. Thus we see that D = D′ and N = N′. ▮

Corollary. Let V be a finite-dimensional vector space over an algebraically closed field F, e.g., the field of complex numbers. Then every linear operator T on V can be written as the sum of a diagonalizable operator D and a nilpotent operator N which commute. These operators D and N are unique and each is a polynomial in T.

From these results, one sees that the study of linear operators on vector spaces over an algebraically closed field is essentially reduced to the study of nilpotent operators. For vector spaces over non-algebraically closed fields, we still need to find some substitute for characteristic values and vectors. It is a very interesting fact that these two problems can be handled simultaneously and this is what we shall do in the next chapter.
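One concrete way to produce the pair D, N in exact arithmetic is through the Jordan form of the next chapter: on each Jordan block the diagonal part is a scalar, so it commutes with what is left over (compare Exercises 11 and 12 below, which show that an arbitrary triangular form does not work). A sketch assuming SymPy, with a small matrix that is not diagonalizable:

    import sympy as sp

    A = sp.Matrix([[1, 1], [-1, 3]])            # minimal polynomial (x - 2)^2
    P, J = A.jordan_form()                      # A = P J P^{-1}
    D = P * sp.diag(*[J[i, i] for i in range(J.rows)]) * P.inv()
    N = A - D
    print(D, N)                                 # here D = 2I and N = A - 2I
    print(N**2 == sp.zeros(2, 2), D*N == N*D)   # N nilpotent, D and N commute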
In concluding this section, we should like to give an example which illustrates some of the ideas of the primary decomposition theorem. We have chosen to give it at the end of the section since it deals with differential equations and thus is not purely linear algebra.

EXAMPLE 14. In the primary decomposition theorem, it is not necessary that the vector space V be finite dimensional, nor is it necessary for parts (i) and (ii) that p be the minimal polynomial for T. If T is a linear operator on an arbitrary vector space and if there is a monic polynomial p such that p(T) = 0, then parts (i) and (ii) of Theorem 12 are valid for T with the proof which we gave.
Let n be a positive integer and let V be the space of all n times continuously differentiable functions f on the real line which satisfy the differential equation

(6-18)    dⁿf/dtⁿ + a_{n−1} d^{n−1}f/dt^{n−1} + · · · + a1 df/dt + a0 f = 0

where a0, . . . , a_{n−1} are some fixed constants. If Cn denotes the space of n times continuously differentiable functions, then the space V of solutions of this differential equation is a subspace of Cn. If D denotes the differentiation operator and p is the polynomial

    p = xⁿ + a_{n−1}x^{n−1} + · · · + a1x + a0,

then V is the null space of the operator p(D), because (6-18) simply says p(D)f = 0. Therefore, V is invariant under D. Let us now regard D as a linear operator on the subspace V. Then p(D) = 0.
If we are discussing differentiable complex-valued functions, then Cn and V are complex vector spaces, and a0, . . . , a_{n−1} may be any complex numbers. We now write

    p = (x − c1)^{r1} · · · (x − ck)^{rk}

where c1, . . . , ck are distinct complex numbers. If Wj is the null space of (D − cjI)^{rj}, then Theorem 12 says that

    V = W1 ⊕ · · · ⊕ Wk.

In other words, if f satisfies the differential equation (6-18), then f is uniquely expressible in the form

    f = f1 + · · · + fk

where fj satisfies the differential equation (D − cjI)^{rj} fj = 0. Thus, the study of the solutions to the equation (6-18) is reduced to the study of the space of solutions of a differential equation of the form

(6-19)    (D − cI)^r f = 0.

This reduction has been accomplished by the general methods of linear algebra, i.e., by the primary decomposition theorem.
To describe the space of solutions to (6-19), one must know something about differential equations, that is, one must know something about D other than the fact that it is a linear operator. However, one does not need to know very much. It is very easy to establish by induction on r that if f is in Cr then

    (D − cI)^r f = e^{ct} D^r (e^{−ct} f),

that is,

    df/dt − cf(t) = e^{ct} d/dt (e^{−ct} f),  etc.

Thus (D − cI)^r f = 0 if and only if D^r(e^{−ct} f) = 0. A function g such that D^r g = 0, i.e., d^r g/dt^r = 0, must be a polynomial function of degree (r − 1) or less:

    g(t) = b0 + b1t + · · · + b_{r−1}t^{r−1}.
Thus f satisfies (6-19) if and only if f has the form

    f(t) = e^{ct}(b0 + b1t + · · · + b_{r−1}t^{r−1}).

Accordingly, the 'functions' e^{ct}, te^{ct}, . . . , t^{r−1}e^{ct} span the space of solutions of (6-19). Since 1, t, . . . , t^{r−1} are linearly independent functions and the exponential function has no zeros, these r functions t^j e^{ct}, 0 ≤ j ≤ r − 1, form a basis for the space of solutions.
Returning to the differential equation (6-18), which is

    p(D)f = 0,    p = (x − c1)^{r1} · · · (x − ck)^{rk},

we see that the n functions t^m e^{cjt}, 0 ≤ m ≤ rj − 1, 1 ≤ j ≤ k, form a basis for the space of solutions to (6-18). In particular, the space of solutions is finite-dimensional and has dimension equal to the degree of the polynomial p.
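The basis just described is what a symbolic solver returns. For instance, with c = 2 and r = 2, i.e. the equation f″ − 4f′ + 4f = 0, a sketch assuming SymPy:

    import sympy as sp

    t = sp.symbols('t')
    f = sp.Function('f')
    # (D - 2I)^2 f = 0, written out as  f'' - 4 f' + 4 f = 0
    print(sp.dsolve(f(t).diff(t, 2) - 4*f(t).diff(t) + 4*f(t), f(t)))
    # Eq(f(t), (C1 + C2*t)*exp(2*t)) : the span of e^{2t} and t e^{2t}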
Exercises

1. Let T be a linear operator on R³ which is represented in the standard ordered basis by the matrix

    [    ].

Express the minimal polynomial p for T in the form p = p1p2, where p1 and p2 are monic and irreducible over the field of real numbers. Let Wi be the null space of pi(T). Find bases ℬi for the spaces W1 and W2. If Ti is the operator induced on Wi by T, find the matrix of Ti in the basis ℬi (above).
2. Let T be the linear operator on R³ which is represented by the matrix

    [ 3  1  −1 ]
    [ 2  2  −1 ]
    [ 2  2   0 ]

in the standard ordered basis. Show that there is a diagonalizable operator D on R³ and a nilpotent operator N on R³ such that T = D + N and DN = ND. Find the matrices of D and N in the standard basis. (Just repeat the proof of Theorem 12 for this special case.)
3. If V is the space of all polynomials of degree less than or equal to n over a field F, prove that the differentiation operator on V is nilpotent.
4. Let T be a linear operator on the finite-dimensional space V with characteristic polynomial

    f = (x − c1)^{d1} · · · (x − ck)^{dk}

and minimal polynomial

    p = (x − c1)^{r1} · · · (x − ck)^{rk}.

Let Wi be the null space of (T − ciI)^{ri}.
(a) Prove that Wi is the set of all vectors α in V such that (T − ciI)^m α = 0 for some positive integer m (which may depend upon α).
(b) Prove that the dimension of Wi is di. (Hint: If Ti is the operator induced on Wi by T, then Ti − ciI is nilpotent; thus the characteristic polynomial for Ti − ciI must be x^{ei} where ei is the dimension of Wi (proof?); thus the characteristic polynomial of Ti is (x − ci)^{ei}; now use the fact that the characteristic polynomial for T is the product of the characteristic polynomials of the Ti to show that ei = di.)
5. Let V be a finite-dimensional vector space over the field of complex numbers. Let T be a linear operator on V and let D be the diagonalizable part of T. Prove that if g is any polynomial with complex coefficients, then the diagonalizable part of g(T) is g(D).
6. Let V be a finite-dimensional vector space over the field F, and let T be a linear operator on V such that rank (T) = 1. Prove that either T is diagonalizable or T is nilpotent, not both.
7. Let V be a finite-dimensional vector space over F, and let T be a linear operator on V. Suppose that T commutes with every diagonalizable linear operator on V. Prove that T is a scalar multiple of the identity operator.
8. Let V be the space of n × n matrices over a field F, and let A be a fixed n × n matrix over F. Define a linear operator T on V by T(B) = AB − BA. Prove that if A is a nilpotent matrix, then T is a nilpotent operator.
9. Give an example of two 4 × 4 nilpotent matrices which have the same minimal polynomial (they necessarily have the same characteristic polynomial) but which are not similar.
10. Let T be a linear operator on the finite-dimensional space V, let p = p1^{r1} · · · pk^{rk} be the minimal polynomial for T, and let V = W1 ⊕ · · · ⊕ Wk be the primary decomposition for T, i.e., Wi is the null space of pi(T)^{ri}. Let W be any subspace of V which is invariant under T. Prove that

    W = (W ∩ W1) ⊕ (W ∩ W2) ⊕ · · · ⊕ (W ∩ Wk).

11. What's wrong with the following proof of Theorem 13? Suppose that the minimal polynomial for T is a product of linear factors. Then, by Theorem 5, T is triangulable. Let ℬ be an ordered basis such that A = [T]ℬ is upper-triangular. Let D be the diagonal matrix with diagonal entries a11, . . . , ann. Then A = D + N, where N is strictly upper-triangular. Evidently N is nilpotent.
12. If you thought about Exercise 11, think about it again, after you observe what Theorem 7 tells you about the diagonalizable and nilpotent parts of T.
13. Let T be a linear operator on V with minimal polynomial of the form p^m, where p is irreducible over the scalar field. Show that there is a vector α in V such that the T-annihilator of α is p^m.
14. Use the primary decomposition theorem and the result of Exercise 13 to prove the following. If T is any linear operator on a finite-dimensional vector space V, then there is a vector α in V with T-annihilator equal to the minimal polynomial for T.
15. If N is a nilpotent linear operator on an n-dimensional vector space V, then the characteristic polynomial for N is xⁿ.
7. The Rational and Jordan Forms

7.1. Cyclic Subspaces and Annihilators

Once again V is a finite-dimensional vector space over the field F and T is a fixed (but arbitrary) linear operator on V. If α is any vector in V, there is a smallest subspace of V which is invariant under T and contains α. This subspace can be defined as the intersection of all T-invariant subspaces which contain α; however, it is more profitable at the moment for us to look at things this way. If W is any subspace of V which is invariant under T and contains α, then W must also contain the vector Tα; hence W must contain T(Tα) = T²α, T(T²α) = T³α, etc. In other words W must contain g(T)α for every polynomial g over F. The set of all vectors of the form g(T)α, with g in F[x], is clearly invariant under T, and is thus the smallest T-invariant subspace which contains α.

Definition. If α is any vector in V, the T-cyclic subspace generated by α is the subspace Z(α; T) of all vectors of the form g(T)α, g in F[x]. If Z(α; T) = V, then α is called a cyclic vector for T.

Another way of describing the subspace Z(α; T) is that Z(α; T) is the subspace spanned by the vectors T^kα, k ≥ 0, and thus α is a cyclic vector for T if and only if these vectors span V. We caution the reader that the general operator T has no cyclic vectors.

EXAMPLE 1. For any T, the T-cyclic subspace generated by the zero vector is the zero subspace. The space Z(α; T) is one-dimensional if and only if α is a characteristic vector for T. For the identity operator, every
non-zero vector generates a one-dimensional cyclic subspace; thus, if dim V > 1, the identity operator has no cyclic vector. An example of an operator which has a cyclic vector is the linear operator T on F² which is represented in the standard ordered basis by the matrix

[0  0]
[1  0]

Here the cyclic vector (a cyclic vector) is ε1; for, if β = (a, b), then with g = a + bx we have β = g(T)ε1. For this same operator T, the cyclic subspace generated by ε2 is the one-dimensional space spanned by ε2, because ε2 is a characteristic vector of T.

For any T and α, we shall be interested in linear relations

c0α + c1Tα + ⋯ + c_kT^kα = 0

between the vectors T^iα; that is, we shall be interested in the polynomials g = c0 + c1x + ⋯ + c_kx^k which have the property that g(T)α = 0. The set of all g in F[x] such that g(T)α = 0 is clearly an ideal in F[x]. It is also a non-zero ideal, because it contains the minimal polynomial p of the operator T (p(T)α = 0 for every α in V).

Definition. If α is any vector in V, the T-annihilator of α is the ideal M(α; T) in F[x] consisting of all polynomials g over F such that g(T)α = 0. The unique monic polynomial p_α which generates this ideal will also be called the T-annihilator of α.

As we pointed out above, the T-annihilator p_α divides the minimal polynomial of the operator T. The reader should also note that deg(p_α) > 0 unless α is the zero vector.

Theorem 1. Let α be any non-zero vector in V and let p_α be the T-annihilator of α.

(i) The degree of p_α is equal to the dimension of the cyclic subspace Z(α; T).
(ii) If the degree of p_α is k, then the vectors α, Tα, T²α, …, T^{k−1}α form a basis for Z(α; T).
(iii) If U is the linear operator on Z(α; T) induced by T, then the minimal polynomial for U is p_α.

Proof. Let g be any polynomial over the field F. Write

g = p_α q + r

where either r = 0 or deg(r) < deg(p_α) = k. The polynomial p_α q is in the T-annihilator of α, and so

g(T)α = r(T)α.

Since r = 0 or deg(r) < k, the vector r(T)α is a linear combination of the vectors α, Tα, …, T^{k−1}α, and since g(T)α is a typical vector in
Z(α; T), this shows that these k vectors span Z(α; T). These vectors are certainly linearly independent, because any non-trivial linear relation between them would give us a non-zero polynomial g such that g(T)α = 0 and deg(g) < deg(p_α), which is absurd. This proves (i) and (ii).

Let U be the linear operator on Z(α; T) obtained by restricting T to that subspace. If g is any polynomial over F, then

p_α(U)g(T)α = p_α(T)g(T)α
            = g(T)p_α(T)α
            = g(T)0
            = 0.

Thus the operator p_α(U) sends every vector in Z(α; T) into 0 and is the zero operator on Z(α; T). Furthermore, if h is a polynomial of degree less than k, we cannot have h(U) = 0, for then h(U)α = h(T)α = 0, contradicting the definition of p_α. This shows that p_α is the minimal polynomial for U.

A particular consequence of this theorem is the following: If α happens to be a cyclic vector for T, then the minimal polynomial for T must have degree equal to the dimension of the space V; hence, the Cayley-Hamilton theorem tells us that the minimal polynomial for T is the characteristic polynomial for T. We shall prove later that for any T there is a vector α in V which has the minimal polynomial of T for its annihilator. It will then follow that T has a cyclic vector if and only if the minimal and characteristic polynomials for T are identical. But it will take a little work for us to see this.

Our plan is to study the general T by using operators which have a cyclic vector. So, let us take a look at a linear operator U on a space W of dimension k which has a cyclic vector α. By Theorem 1, the vectors α, Uα, …, U^{k−1}α form a basis for the space W, and the annihilator p_α of α is the minimal polynomial for U (and hence also the characteristic polynomial for U). If we let α_i = U^{i−1}α, i = 1, …, k, then the action of U on the ordered basis ℬ = {α_1, …, α_k} is

(7-1)  Uα_i = α_{i+1},  i = 1, …, k − 1
       Uα_k = −c0α_1 − c1α_2 − ⋯ − c_{k−1}α_k

where p_α = c0 + c1x + ⋯ + c_{k−1}x^{k−1} + x^k. The expression for Uα_k follows from the fact that p_α(U)α = 0, i.e.,

U^kα + c_{k−1}U^{k−1}α + ⋯ + c1Uα + c0α = 0.

This says that the matrix of U in the ordered basis ℬ is

(7-2)  [0  0  0  ⋯  0  −c0      ]
       [1  0  0  ⋯  0  −c1      ]
       [0  1  0  ⋯  0  −c2      ]
       [⋮  ⋮  ⋮      ⋮   ⋮       ]
       [0  0  0  ⋯  1  −c_{k−1} ]
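Theorem 1 and the matrix (7-2) can be made concrete with a short computation. The sketch below is my own illustration, not taken from the text; it assumes SymPy, and the helper name t_annihilator is hypothetical. It finds the T-annihilator of a vector α as the first linear dependence among α, Tα, T²α, …, and then exhibits the matrix of the operator induced on Z(α; T) in the basis {α, Tα, …}, which is the companion matrix (7-2) of p_α.

```python
from sympy import Matrix, symbols

x = symbols('x')

def t_annihilator(A, alpha):
    # Returns (p_alpha, iterates): p_alpha is the monic T-annihilator of the
    # non-zero vector alpha, and iterates = [alpha, A*alpha, ..., A^(k-1)*alpha]
    # is a basis of Z(alpha; T).
    iterates = [alpha]
    while True:
        nxt = A * iterates[-1]
        B = Matrix.hstack(*iterates)
        if Matrix.hstack(B, nxt).rank() == B.rank():       # first linear dependence
            coeffs = (B.T * B).inv() * B.T * nxt           # nxt = sum_i coeffs[i] * A^i * alpha
            p = x**len(iterates) - sum(coeffs[i] * x**i for i in range(len(iterates)))
            return p.expand(), iterates
        iterates.append(nxt)

A = Matrix([[5, -6, -6], [-1, 4, 2], [3, -6, -4]])
alpha = Matrix([1, 0, 0])
p_alpha, basis = t_annihilator(A, alpha)
print(p_alpha)               # x**2 - 3*x + 2, so dim Z(alpha; T) = 2  (Theorem 1(i))

B = Matrix.hstack(*basis)
U = (B.T * B).inv() * B.T * A * B    # matrix of the induced operator on Z(alpha; T)
print(U)                     # Matrix([[0, -2], [1, 3]]): the companion matrix (7-2) of p_alpha
```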
The matrix (7-2) is called the companion matrix of the monic polynomial p_α.

Theorem 2. If U is a linear operator on the finite-dimensional space W, then U has a cyclic vector if and only if there is some ordered basis for W in which U is represented by the companion matrix of the minimal polynomial for U.

Proof. We have just observed that if U has a cyclic vector, then there is such an ordered basis for W. Conversely, if we have some ordered basis {α_1, …, α_k} for W in which U is represented by the companion matrix of its minimal polynomial, it is obvious that α_1 is a cyclic vector for U.

Corollary. If A is the companion matrix of a monic polynomial p, then p is both the minimal and the characteristic polynomial of A.

Proof. One way to see this is to let U be the linear operator on F^k which is represented by A in the standard ordered basis, and to apply Theorem 1 together with the Cayley-Hamilton theorem. Another method is to use Theorem 1 to see that p is the minimal polynomial for A and to verify by a direct calculation that p is the characteristic polynomial for A.

One last comment: if T is any linear operator on the space V and α is any vector in V, then the operator U which T induces on the cyclic subspace Z(α; T) has a cyclic vector, namely, α. Thus Z(α; T) has an ordered basis in which U is represented by the companion matrix of p_α, the T-annihilator of α.

Exercises

1. Let T be a linear operator on F². Prove that any non-zero vector which is not a characteristic vector for T is a cyclic vector for T. Hence, prove that either T has a cyclic vector or T is a scalar multiple of the identity operator.

2. Let T be the linear operator on R³ which is represented in the standard ordered basis by the matrix

[2  0   0]
[0  2   0]
[0  0  −1]

Prove that T has no cyclic vector. What is the T-cyclic subspace generated by the vector (1, −1, 3)?

3. Let T be the linear operator on C³ which is represented in the standard ordered basis by the matrix

[-i, 4, -(ii.
Find the T-annihilator of the vector (1, 0, 0). Find the T-annihilator of (1, 0, i).

4. Prove that if T² has a cyclic vector, then T has a cyclic vector. Is the converse true?

5. Let V be an n-dimensional vector space over the field F, and let N be a nilpotent linear operator on V. Suppose N^{n−1} ≠ 0, and let α be any vector in V such that N^{n−1}α ≠ 0. Prove that α is a cyclic vector for N. What exactly is the matrix of N in the ordered basis {α, Nα, …, N^{n−1}α}?

6. Give a direct proof that if A is the companion matrix of the monic polynomial p, then p is the characteristic polynomial for A.

7. Let V be an n-dimensional vector space, and let T be a linear operator on V. Suppose that T is diagonalizable.
(a) If T has a cyclic vector, show that T has n distinct characteristic values.
(b) If T has n distinct characteristic values, and if {α_1, …, α_n} is a basis of characteristic vectors for T, show that α = α_1 + ⋯ + α_n is a cyclic vector for T.

8. Let T be a linear operator on the finite-dimensional vector space V. Suppose T has a cyclic vector. Prove that if U is any linear operator which commutes with T, then U is a polynomial in T.

7.2. Cyclic Decompositions and the Rational Form

The primary purpose of this section is to prove that if T is any linear operator on a finite-dimensional space V, then there exist vectors α_1, …, α_r in V such that

V = Z(α_1; T) ⊕ ⋯ ⊕ Z(α_r; T).

In other words, we wish to prove that V is a direct sum of T-cyclic subspaces. This will show that T is the direct sum of a finite number of linear operators, each of which has a cyclic vector. The effect of this will be to reduce many questions about the general linear operator to similar questions about an operator which has a cyclic vector. The theorem which we prove (Theorem 3) is one of the deepest results in linear algebra and has many interesting corollaries.

The cyclic decomposition theorem is closely related to the following question. Which T-invariant subspaces W have the property that there exists a T-invariant subspace W′ such that V = W ⊕ W′? If W is any subspace of a finite-dimensional space V, then there exists a subspace W′ such that V = W ⊕ W′. Usually there are many such subspaces W′ and each of these is called complementary to W. We are asking when a T-invariant subspace has a complementary subspace which is also invariant under T.

Let us suppose that V = W ⊕ W′ where both W and W′ are invariant under T and then see what we can discover about the subspace W. Each
vector β in V is of the form β = γ + γ′ where γ is in W and γ′ is in W′. If f is any polynomial over the scalar field, then

f(T)β = f(T)γ + f(T)γ′.

Since W and W′ are invariant under T, the vector f(T)γ is in W and f(T)γ′ is in W′. Therefore f(T)β is in W if and only if f(T)γ′ = 0. What interests us is the seemingly innocent fact that, if f(T)β is in W, then f(T)β = f(T)γ.

Definition. Let T be a linear operator on a vector space V and let W be a subspace of V. We say that W is T-admissible if

(i) W is invariant under T;
(ii) if f(T)β is in W, there exists a vector γ in W such that f(T)β = f(T)γ.

As we just showed, if W is invariant and has a complementary invariant subspace, then W is admissible. One of the consequences of Theorem 3 will be the converse, so that admissibility characterizes those invariant subspaces which have complementary invariant subspaces.

Let us indicate how the admissibility property is involved in the attempt to obtain a decomposition

V = Z(α_1; T) ⊕ ⋯ ⊕ Z(α_r; T).

Our basic method for arriving at such a decomposition will be to inductively select the vectors α_1, …, α_r. Suppose that by some process or another we have selected α_1, …, α_j and the subspace

W_j = Z(α_1; T) + ⋯ + Z(α_j; T)

is proper. We would like to find a non-zero vector α_{j+1} such that

W_j ∩ Z(α_{j+1}; T) = {0}

because the subspace W_{j+1} = W_j ⊕ Z(α_{j+1}; T) would then come at least one dimension nearer to exhausting V. But why should any such α_{j+1} exist? If α_1, …, α_j have been chosen so that W_j is a T-admissible subspace, then it is rather easy to see that we can find a suitable α_{j+1}. This is what will make our proof of Theorem 3 work, even if that is not how we phrase the argument.

Let W be a proper T-invariant subspace. Let us try to find a non-zero vector α such that

(7-3)  W ∩ Z(α; T) = {0}.

We can choose some vector β which is not in W. Consider the T-conductor S(β; W), which consists of all polynomials g such that g(T)β is in W. Recall that the monic polynomial f = s(β; W) which generates the ideal S(β; W) is also called the T-conductor of β into W. The vector f(T)β is in W. Now, if W is T-admissible, there is a γ in W with f(T)β = f(T)γ. Let α = β − γ and let g be any polynomial. Since β − α is in W, g(T)β will be in W if and
only if g(T)α is in W; in other words, S(α; W) = S(β; W). Thus the polynomial f is also the T-conductor of α into W. But f(T)α = 0. That tells us that g(T)α is in W if and only if g(T)α = 0, i.e., the subspaces Z(α; T) and W are independent (7-3) and f is the T-annihilator of α.

Theorem 3 (Cyclic Decomposition Theorem). Let T be a linear operator on a finite-dimensional vector space V and let W0 be a proper T-admissible subspace of V. There exist non-zero vectors α_1, …, α_r in V with respective T-annihilators p_1, …, p_r such that

(i) V = W0 ⊕ Z(α_1; T) ⊕ ⋯ ⊕ Z(α_r; T);
(ii) p_k divides p_{k−1}, k = 2, …, r.

Furthermore, the integer r and the annihilators p_1, …, p_r are uniquely determined by (i), (ii), and the fact that no α_k is 0.

Proof. The proof is rather long; hence, we shall divide it into four steps. For the first reading it may seem easier to take W0 = {0}, although it does not produce any substantial simplification. Throughout the proof, we shall abbreviate f(T)β to fβ.

Step 1. There exist non-zero vectors β_1, …, β_r in V such that

(a) V = W0 + Z(β_1; T) + ⋯ + Z(β_r; T);
(b) if 1 ≤ k ≤ r and

W_k = W0 + Z(β_1; T) + ⋯ + Z(β_k; T)

then the conductor p_k = s(β_k; W_{k−1}) has maximum degree among all T-conductors into the subspace W_{k−1}, i.e., for every k

deg p_k = max over α in V of deg s(α; W_{k−1}).

This step depends only upon the fact that W0 is an invariant subspace. If W is a proper T-invariant subspace, then

0 < max_α deg s(α; W) ≤ dim V

and we can choose a vector β so that deg s(β; W) attains that maximum. The subspace W + Z(β; T) is then T-invariant and has dimension larger than dim W. Apply this process to W = W0 to obtain β_1. If W_1 = W0 + Z(β_1; T) is still proper, then apply the process to W_1 to obtain β_2. Continue in that manner. Since dim W_k > dim W_{k−1}, we must reach W_r = V in not more than dim V steps.

Step 2. Let β_1, …, β_r be non-zero vectors which satisfy conditions (a) and (b) of Step 1. Fix k, 1 ≤ k ≤ r. Let β be any vector in V and let f = s(β; W_{k−1}). If

fβ = β_0 + Σ_{1≤i<k} g_iβ_i,   β_0 in W0

then f divides each polynomial g_i and β_0 = fγ_0, where γ_0 is in W0.
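The conductors s(α; W_{k−1}) that drive Step 1 and Step 2 can be computed mechanically. The following sketch is my own illustration, not part of the proof; it assumes SymPy, and the helper name is hypothetical. It finds the monic T-conductor of β into a subspace W, given by a matrix whose columns form a basis of W, as the monic polynomial of least degree d with T^dβ in span{β, Tβ, …, T^{d−1}β} + W.

```python
from sympy import Matrix, symbols, eye

x = symbols('x')

def t_conductor(A, beta, W):
    # Monic generator s(beta; W) of the ideal {g in F[x] : g(T)beta is in W}.
    # W is a matrix whose columns are a linearly independent spanning set for the subspace.
    iterates, v = [], beta
    while True:
        span = Matrix.hstack(*iterates, W)
        if Matrix.hstack(span, v).rank() == span.rank():
            # v = A^d beta lies in span{beta, ..., A^(d-1) beta} + W: read off the coefficients.
            coeffs = (span.T * span).inv() * span.T * v
            d = len(iterates)
            return (x**d - sum(coeffs[i] * x**i for i in range(d))).expand()
        iterates.append(v)
        v = A * v

# Small check: W = null space of (T - 2I) for the operator below, beta = (1, 0, 0).
A = Matrix([[2, 0, 0], [1, 2, 0], [0, 0, 3]])
W = (A - 2 * eye(3)).nullspace()[0]       # a single basis vector, the column (0, 1, 0)
beta = Matrix([1, 0, 0])
print(t_conductor(A, beta, W))            # x - 2: indeed (T - 2I)beta already lies in W
```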
If k = 1, this is just the statement that W0 is T-admissible. In order to prove the assertion for k > 1, apply the division algorithm:

(7-4)  g_i = f h_i + r_i,   r_i = 0 or deg r_i < deg f.

We wish to show that r_i = 0 for each i. Let

(7-5)  γ = β − Σ_{i=1}^{k−1} h_iβ_i.

Since γ − β is in W_{k−1},

s(γ; W_{k−1}) = s(β; W_{k−1}) = f.

Furthermore,

(7-6)  fγ = β_0 + Σ_{i=1}^{k−1} r_iβ_i.

Suppose that some r_i is different from 0. We shall deduce a contradiction. Let j be the largest index i for which r_i ≠ 0. Then

(7-7)  fγ = β_0 + Σ_{i=1}^{j} r_iβ_i,   r_j ≠ 0 and deg r_j < deg f.

Let p = s(γ; W_{j−1}). Since W_{k−1} contains W_{j−1}, the conductor f = s(γ; W_{k−1}) must divide p:

p = fg.

Apply g(T) to both sides of (7-7):

(7-8)  pγ = gfγ = g r_jβ_j + gβ_0 + Σ_{1≤i<j} g r_iβ_i.

By definition, pγ is in W_{j−1}, and the last two terms on the right side of (7-8) are in W_{j−1}. Therefore, g r_jβ_j is in W_{j−1}. Now we use condition (b) of Step 1:

deg (g r_j) ≥ deg s(β_j; W_{j−1})
           = deg p_j
           ≥ deg s(γ; W_{j−1})
           = deg p
           = deg (fg).

Thus deg r_j ≥ deg f, and that contradicts the choice of j. We now know that f divides each g_i and hence that β_0 = fγ. Since W0 is T-admissible, β_0 = fγ_0 where γ_0 is in W0. We remark in passing that Step 2 is a strengthened form of the assertion that each of the subspaces W_1, W_2, …, W_r is T-admissible.

Step 3. There exist non-zero vectors α_1, …, α_r in V which satisfy conditions (i) and (ii) of Theorem 3.

Start with vectors β_1, …, β_r as in Step 1. Fix k, 1 ≤ k ≤ r. We apply Step 2 to the vector β = β_k and the T-conductor f = p_k. We obtain

(7-9)  p_kβ_k = p_kγ_0 + Σ_{1≤i<k} p_k h_iβ_i
where γ_0 is in W0 and h_1, …, h_{k−1} are polynomials. Let

(7-10)  α_k = β_k − γ_0 − Σ_{1≤i<k} h_iβ_i.

Since β_k − α_k is in W_{k−1},

(7-11)  s(α_k; W_{k−1}) = s(β_k; W_{k−1}) = p_k

and since p_kα_k = 0, we have

(7-12)  W_{k−1} ∩ Z(α_k; T) = {0}.

Because each α_k satisfies (7-11) and (7-12), it follows that

W_k = W0 ⊕ Z(α_1; T) ⊕ ⋯ ⊕ Z(α_k; T)

and that p_k is the T-annihilator of α_k. In other words, the vectors α_1, …, α_r define the same sequence of subspaces W_1, W_2, … as do the vectors β_1, …, β_r, and the T-conductors p_k = s(α_k; W_{k−1}) have the same maximality properties (condition (b) of Step 1). The vectors α_1, …, α_r have the additional property that the subspaces W0, Z(α_1; T), Z(α_2; T), … are independent. It is therefore easy to verify condition (ii) in Theorem 3. Since p_iα_i = 0 for each i, we have the trivial relation

p_kα_k = 0 + p_1α_1 + ⋯ + p_{k−1}α_{k−1}.

Apply Step 2 with β_1, …, β_k replaced by α_1, …, α_k and with β = α_k. Conclusion: p_k divides each p_i with i < k.

Step 4. The number r and the polynomials p_1, …, p_r are uniquely determined by the conditions of Theorem 3.

Suppose that in addition to the vectors α_1, …, α_r in Theorem 3 we have non-zero vectors γ_1, …, γ_s with respective T-annihilators g_1, …, g_s such that

(7-13)  V = W0 ⊕ Z(γ_1; T) ⊕ ⋯ ⊕ Z(γ_s; T)
        g_k divides g_{k−1},  k = 2, …, s.

We shall show that r = s and p_i = g_i for each i.

It is very easy to see that p_1 = g_1. The polynomial g_1 is determined from (7-13) as the T-conductor of V into W0. Let S(V; W0) be the collection of polynomials f such that fβ is in W0 for every β in V, i.e., polynomials f such that the range of f(T) is contained in W0. Then S(V; W0) is a non-zero ideal in the polynomial algebra. The polynomial g_1 is the monic generator of that ideal, for this reason. Each β in V has the form

β = β_0 + f_1γ_1 + ⋯ + f_sγ_s

and so

g_1β = g_1β_0 + Σ g_1f_iγ_i.

Since each g_i divides g_1, we have g_1γ_i = 0 for all i and g_1β = g_1β_0 is in W0. Thus g_1 is in S(V; W0). Since g_1 is the monic polynomial of least degree
which sends γ_1 into W0, we see that g_1 is the monic polynomial of least degree in the ideal S(V; W0). By the same argument, p_1 is the generator of that ideal, so p_1 = g_1.

If f is a polynomial and W is a subspace of V, we shall employ the shorthand fW for the set of all vectors fα with α in W. We have left to the exercises the proofs of the following three facts.

1. fZ(α; T) = Z(fα; T).
2. If V = V_1 ⊕ ⋯ ⊕ V_k, where each V_i is invariant under T, then fV = fV_1 ⊕ ⋯ ⊕ fV_k.
3. If α and γ have the same T-annihilator, then fα and fγ have the same T-annihilator and (therefore)

dim Z(fα; T) = dim Z(fγ; T).

Now, we proceed by induction to show that r = s and p_i = g_i for i = 2, …, r. The argument consists of counting dimensions in the right way. We shall give the proof that if r ≥ 2 then p_2 = g_2, and from that the induction should be clear. Suppose that r ≥ 2. Then

dim W0 + dim Z(α_1; T) < dim V.

Since we know that p_1 = g_1, we know that Z(α_1; T) and Z(γ_1; T) have the same dimension. Therefore,

dim W0 + dim Z(γ_1; T) < dim V

which shows that s ≥ 2. Now it makes sense to ask whether or not p_2 = g_2. From the two decompositions of V, we obtain two decompositions of the subspace p_2V:

(7-14)  p_2V = p_2W0 ⊕ Z(p_2α_1; T)
        p_2V = p_2W0 ⊕ Z(p_2γ_1; T) ⊕ ⋯ ⊕ Z(p_2γ_s; T).

We have made use of facts (1) and (2) above and we have used the fact that p_2α_i = 0, i ≥ 2. Since we know that p_1 = g_1, fact (3) above tells us that Z(p_2α_1; T) and Z(p_2γ_1; T) have the same dimension. Hence, it is apparent from (7-14) that

dim Z(p_2γ_i; T) = 0,   i ≥ 2.

We conclude that p_2γ_2 = 0 and g_2 divides p_2. The argument can be reversed to show that p_2 divides g_2. Therefore p_2 = g_2.

Corollary. If T is a linear operator on a finite-dimensional vector space, then every T-admissible subspace has a complementary subspace which is also invariant under T.

Proof. Let W0 be an admissible subspace of V. If W0 = V, the complement we seek is {0}. If W0 is proper, apply Theorem 3 and let

W0′ = Z(α_1; T) ⊕ ⋯ ⊕ Z(α_r; T).

Then W0′ is invariant under T and V = W0 ⊕ W0′.
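The corollary that follows asserts, among other things, that some vector has the minimal polynomial for T as its T-annihilator. Here is a small sketch of how one might locate such a vector (my own illustration, assuming SymPy; the helper names are hypothetical). It uses the fact, easy to verify, that the minimal polynomial of T is the least common multiple of the T-annihilators of any spanning set.

```python
from functools import reduce
from sympy import Matrix, symbols, lcm, eye

x = symbols('x')

def annihilator(A, v):
    # Monic T-annihilator of the non-zero vector v: first monic dependence among v, Av, A^2 v, ...
    its = [v]
    while True:
        nxt = A * its[-1]
        B = Matrix.hstack(*its)
        if Matrix.hstack(B, nxt).rank() == B.rank():
            c = (B.T * B).inv() * B.T * nxt
            return (x**len(its) - sum(c[i] * x**i for i in range(len(its)))).expand()
        its.append(nxt)

# A nilpotent example: two 2 x 2 elementary nilpotent blocks.
A = Matrix([[0, 0, 0, 0],
            [1, 0, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 1, 0]])
n = A.rows

# Minimal polynomial = lcm of the annihilators of the standard basis vectors.
p = reduce(lcm, [annihilator(A, eye(n).col(i)) for i in range(n)])
print(p)                                    # x**2 (the characteristic polynomial is x**4)

alpha = Matrix([1, 0, 0, 0])
print(annihilator(A, alpha))                # x**2: alpha already has the minimal polynomial as annihilator
```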
Corollary. Let T be a linear operator on a finite-dimensional vector space V.

(a) There exists a vector α in V such that the T-annihilator of α is the minimal polynomial for T.
(b) T has a cyclic vector if and only if the characteristic and minimal polynomials for T are identical.

Proof. If V = {0}, the results are trivially true. If V ≠ {0}, let

(7-15)  V = Z(α_1; T) ⊕ ⋯ ⊕ Z(α_r; T)

where the T-annihilators p_1, …, p_r are such that p_{k+1} divides p_k, 1 ≤ k ≤ r − 1. As we noted in the proof of Theorem 3, it follows easily that p_1 is the minimal polynomial for T, i.e., the T-conductor of V into {0}. We have proved (a).

We saw in Section 7.1 that, if T has a cyclic vector, the minimal polynomial for T coincides with the characteristic polynomial. The content of (b) is in the converse. Choose any α as in (a). If the degree of the minimal polynomial is dim V, then V = Z(α; T).

Theorem 4 (Generalized Cayley-Hamilton Theorem). Let T be a linear operator on a finite-dimensional vector space V. Let p and f be the minimal and characteristic polynomials for T, respectively.

(i) p divides f.
(ii) p and f have the same prime factors, except for multiplicities.
(iii) If

(7-16)  p = f_1^{r_1} ⋯ f_k^{r_k}

is the prime factorization of p, then

(7-17)  f = f_1^{d_1} ⋯ f_k^{d_k}

where d_i is the nullity of f_i(T)^{r_i} divided by the degree of f_i.

Proof. We disregard the trivial case V = {0}. To prove (i) and (ii), consider a cyclic decomposition (7-15) of V obtained from Theorem 3. As we noted in the proof of the second corollary, p_1 = p. Let U_i be the restriction of T to Z(α_i; T). Then U_i has a cyclic vector and so p_i is both the minimal polynomial and the characteristic polynomial for U_i. Therefore the characteristic polynomial f is the product f = p_1 ⋯ p_r. That is evident from the block form (6-14) which the matrix of T assumes in a suitable basis. Clearly p_1 = p divides f, and this proves (i). Obviously any prime divisor of p is a prime divisor of f. Conversely, a prime divisor of f = p_1 ⋯ p_r must divide one of the factors p_i, which in turn divides p_1.

Let (7-16) be the prime factorization of p. We employ the primary decomposition theorem (Theorem 12 of Chapter 6). It tells us that, if V_i is the null space of f_i(T)^{r_i}, then
(7-18)  V = V_1 ⊕ ⋯ ⊕ V_k

and f_i^{r_i} is the minimal polynomial of the operator T_i, obtained by restricting T to the (invariant) subspace V_i. Apply part (ii) of the present theorem to the operator T_i. Since its minimal polynomial is a power of the prime f_i, the characteristic polynomial for T_i has the form f_i^{d_i}, where d_i ≥ r_i. Obviously

d_i = dim V_i / deg f_i

and (almost by definition) dim V_i = nullity f_i(T)^{r_i}. Since T is the direct sum of the operators T_1, …, T_k, the characteristic polynomial f is the product

f = f_1^{d_1} ⋯ f_k^{d_k}.

Corollary. If T is a nilpotent linear operator on a vector space of dimension n, then the characteristic polynomial for T is x^n.

Now let us look at the matrix analogue of the cyclic decomposition theorem. If we have the operator T and the direct-sum decomposition of Theorem 3, let ℬ_i be the 'cyclic ordered basis'

{α_i, Tα_i, …, T^{k_i−1}α_i}

for Z(α_i; T). Here k_i denotes the dimension of Z(α_i; T), that is, the degree of the annihilator p_i. The matrix of the induced operator T_i in the ordered basis ℬ_i is the companion matrix of the polynomial p_i. Thus, if we let ℬ be the ordered basis for V which is the union of the ℬ_i arranged in the order ℬ_1, …, ℬ_r, then the matrix of T in the ordered basis ℬ will be

(7-19)  A = [A_1              ]
            [     A_2         ]
            [          ⋱      ]
            [             A_r ]

where A_i is the k_i × k_i companion matrix of p_i. An n × n matrix A which is the direct sum (7-19) of companion matrices of non-scalar monic polynomials p_1, …, p_r such that p_{i+1} divides p_i for i = 1, …, r − 1 will be said to be in rational form. The cyclic decomposition theorem tells us the following concerning matrices.

Theorem 5. Let F be a field and let B be an n × n matrix over F. Then B is similar over the field F to one and only one matrix which is in rational form.

Proof. Let T be the linear operator on F^n which is represented by B in the standard ordered basis. As we have just observed, there is some ordered basis for F^n in which T is represented by a matrix A in rational form. Then B is similar to this matrix A. Suppose B is similar over F to
another matrix C which is in rational form. This means simply that there is some ordered basis for F^n in which the operator T is represented by the matrix C. If C is the direct sum of companion matrices C_i of monic polynomials g_1, …, g_s such that g_{i+1} divides g_i for i = 1, …, s − 1, then it is apparent that we shall have non-zero vectors β_1, …, β_s in V with T-annihilators g_1, …, g_s such that

V = Z(β_1; T) ⊕ ⋯ ⊕ Z(β_s; T).

But then by the uniqueness statement in the cyclic decomposition theorem, the polynomials g_i are identical with the polynomials p_i which define the matrix A. Thus C = A.

The polynomials p_1, …, p_r are called the invariant factors for the matrix B. In Section 7.4, we shall describe an algorithm for calculating the invariant factors of a given matrix B. The fact that it is possible to compute these polynomials by means of a finite number of rational operations on the entries of B is what gives the rational form its name.

EXAMPLE 2. Suppose that V is a two-dimensional vector space over the field F and T is a linear operator on V. The possibilities for the cyclic subspace decomposition for T are very limited. For, if the minimal polynomial for T has degree 2, it is equal to the characteristic polynomial for T and T has a cyclic vector. Thus there is some ordered basis for V in which T is represented by the companion matrix of its characteristic polynomial. If, on the other hand, the minimal polynomial for T has degree 1, then T is a scalar multiple of the identity operator. If T = cI, then for any two linearly independent vectors α_1 and α_2 in V we have

V = Z(α_1; T) ⊕ Z(α_2; T)
p_1 = p_2 = x − c.

For matrices, this analysis says that every 2 × 2 matrix over the field F is similar over F to exactly one matrix of the types

[c  0]      [0  −c0]
[0  c]      [1  −c1]

EXAMPLE 3. Let T be the linear operator on R³ which is represented in the standard ordered basis by the matrix

A = [ 5  −6  −6]
    [−1   4   2]
    [ 3  −6  −4]

We have computed earlier that the characteristic polynomial for T is f = (x − 1)(x − 2)² and the minimal polynomial for T is p = (x − 1)(x − 2). Thus we know that in the cyclic decomposition for T the first vector α_1 will have p as its T-annihilator.
Since we are operating in a three-dimensional space, there can be only one further vector α_2. It must generate a cyclic subspace of dimension 1, i.e., it must be a characteristic vector for T. Its T-annihilator p_2 must be (x − 2), because we must have p p_2 = f. Notice that this tells us immediately that the matrix A is similar to the matrix

B = [0  −2  0]
    [1   3  0]
    [0   0  2]

that is, that T is represented by B in some ordered basis. How can we find suitable vectors α_1 and α_2? Well, we know that any vector which generates a T-cyclic subspace of dimension 2 is a suitable α_1. So let's just try ε_1. We have

Tε_1 = (5, −1, 3)

which is not a scalar multiple of ε_1; hence Z(ε_1; T) has dimension 2. This space consists of all vectors aε_1 + b(Tε_1):

a(1, 0, 0) + b(5, −1, 3) = (a + 5b, −b, 3b)

or, all vectors (x_1, x_2, x_3) satisfying x_3 = −3x_2. Now what we want is a vector α_2 such that Tα_2 = 2α_2 and Z(α_2; T) is disjoint from Z(ε_1; T). Since α_2 is to be a characteristic vector for T, the space Z(α_2; T) will simply be the one-dimensional space spanned by α_2, and so what we require is that α_2 not be in Z(ε_1; T). If α = (x_1, x_2, x_3), one can easily compute that Tα = 2α if and only if x_1 = 2x_2 + 2x_3. Thus α_2 = (2, 1, 0) satisfies Tα_2 = 2α_2 and generates a T-cyclic subspace disjoint from Z(ε_1; T). The reader should verify directly that the matrix of T in the ordered basis

{(1, 0, 0), (5, −1, 3), (2, 1, 0)}

is the matrix B above.

EXAMPLE 4. Suppose that T is a diagonalizable linear operator on V. It is interesting to relate a cyclic decomposition for T to a basis which diagonalizes the matrix of T. Let c_1, …, c_k be the distinct characteristic values of T and let V_i be the space of characteristic vectors associated with the characteristic value c_i. Then

V = V_1 ⊕ ⋯ ⊕ V_k

and if d_i = dim V_i then

f = (x − c_1)^{d_1} ⋯ (x − c_k)^{d_k}

is the characteristic polynomial for T. If α is a vector in V, it is easy to relate the cyclic subspace Z(α; T) to the subspaces V_1, …, V_k. There are unique vectors β_1, …, β_k such that β_i is in V_i and

α = β_1 + ⋯ + β_k.
Since Tβ_i = c_iβ_i, we have

(7-20)  f(T)α = f(c_1)β_1 + ⋯ + f(c_k)β_k

for every polynomial f. Given any scalars t_1, …, t_k there exists a polynomial f such that f(c_i) = t_i, 1 ≤ i ≤ k. Therefore, Z(α; T) is just the subspace spanned by the vectors β_1, …, β_k. What is the annihilator of α? According to (7-20), we have f(T)α = 0 if and only if f(c_i)β_i = 0 for each i. In other words, f(T)α = 0 provided f(c_i) = 0 for each i such that β_i ≠ 0. Accordingly, the annihilator of α is the product

(7-21)  ∏_{β_i ≠ 0} (x − c_i).

Now, let ℬ_i = {β_{i1}, …, β_{i d_i}} be an ordered basis for V_i. Let

r = max_i d_i.

We define vectors α_1, …, α_r by

(7-22)  α_j = Σ_{d_i ≥ j} β_{ij},   1 ≤ j ≤ r.

The cyclic subspace Z(α_j; T) is the subspace spanned by the vectors β_{ij}, as i runs over those indices for which d_i ≥ j. The T-annihilator of α_j is

(7-23)  p_j = ∏_{d_i ≥ j} (x − c_i).

We have

V = Z(α_1; T) ⊕ ⋯ ⊕ Z(α_r; T)

because each β_{ij} belongs to one and only one of the subspaces Z(α_1; T), …, Z(α_r; T) and ℬ = (ℬ_1, …, ℬ_k) is a basis for V. By (7-23), p_{j+1} divides p_j.

Exercises

1. Let T be the linear operator on F² which is represented in the standard ordered basis by the matrix

[0  0]
[1  0]

Let α_1 = (0, 1). Show that F² ≠ Z(α_1; T) and that there is no non-zero vector α_2 in F² with Z(α_2; T) disjoint from Z(α_1; T).

2. Let T be a linear operator on the finite-dimensional space V, and let R be the range of T.
(a) Prove that R has a complementary T-invariant subspace if and only if R is independent of the null space N of T.
(b) If R and N are independent, prove that N is the unique T-invariant subspace complementary to R.

3. Let T be the linear operator on R³ which is represented in the standard ordered basis by the matrix

[2  0  0]
[1  2  0]
[0  0  3]
Let W be the null space of T − 2I. Prove that W has no complementary T-invariant subspace. (Hint: Let β = ε_1 and observe that (T − 2I)β is in W. Prove there is no α in W with (T − 2I)β = (T − 2I)α.)

4. Let T be the linear operator on F⁴ which is represented in the standard ordered basis by the matrix

[c  0  0  0]
[1  c  0  0]
[0  1  c  0]
[0  0  1  c]

Let W be the null space of T − cI.
(a) Prove that W is the subspace spanned by ε_4.
(b) Find the monic generators of the ideals S(ε_4; W), S(ε_3; W), S(ε_2; W), S(ε_1; W).

5. Let T be a linear operator on the vector space V over the field F. If f is a polynomial over F and α is in V, let fα = f(T)α. If V_1, …, V_k are T-invariant subspaces and V = V_1 ⊕ ⋯ ⊕ V_k, show that

fV = fV_1 ⊕ ⋯ ⊕ fV_k.

6. Let T, V, and F be as in Exercise 5. Suppose α and β are vectors in V which have the same T-annihilator. Prove that, for any polynomial f, the vectors fα and fβ have the same T-annihilator.

7. Find the minimal polynomials and the rational forms of each of the following real matrices.

8. Let T be the linear operator on R³ which is represented in the standard ordered basis by

Find non-zero vectors α_1, …, α_r satisfying the conditions of Theorem 3.

9. Let A be the real matrix

Find an invertible 3 × 3 real matrix P such that P⁻¹AP is in rational form.

10. Let F be a subfield of the complex numbers and let T be the linear operator on F⁴ which is represented in the standard ordered basis by the matrix
Find the characteristic polynomial for T. Consider the cases a = b = 1; a = b = 0; a = 0, b = 1. In each of these cases, find the minimal polynomial for T and non-zero vectors α_1, …, α_r which satisfy the conditions of Theorem 3.

11. Prove that if A and B are 3 × 3 matrices over the field F, a necessary and sufficient condition that A and B be similar over F is that they have the same characteristic polynomial and the same minimal polynomial. Give an example which shows that this is false for 4 × 4 matrices.

12. Let F be a subfield of the field of complex numbers, and let A and B be n × n matrices over F. Prove that if A and B are similar over the field of complex numbers, then they are similar over F. (Hint: Prove that the rational form of A is the same whether A is viewed as a matrix over F or a matrix over C; likewise for B.)

13. Let A be an n × n matrix with complex entries. Prove that if every characteristic value of A is real, then A is similar to a matrix with real entries.

14. Let T be a linear operator on the finite-dimensional space V. Prove that there exists a vector α in V with this property: if f is a polynomial and f(T)α = 0, then f(T) = 0. (Such a vector α is called a separating vector for the algebra of polynomials in T.) When T has a cyclic vector, give a direct proof that any cyclic vector is a separating vector for the algebra of polynomials in T.

15. Let F be a subfield of the field of complex numbers, and let A be an n × n matrix over F. Let p be the minimal polynomial for A. If we regard A as a matrix over C, then A has a minimal polynomial f as an n × n matrix over C. Use a theorem on linear equations to prove p = f. Can you also see how this follows from the cyclic decomposition theorem?

16. Let A be an n × n matrix with real entries such that A² + I = 0. Prove that n is even, and if n = 2k, then A is similar over the field of real numbers to a matrix of the block form

[ 0  I]
[−I  0]

where I is the k × k identity matrix.

17. Let T be a linear operator on a finite-dimensional vector space V. Suppose that
(a) the minimal polynomial for T is a power of an irreducible polynomial;
(b) the minimal polynomial is equal to the characteristic polynomial.
Show that no non-trivial T-invariant subspace has a complementary T-invariant subspace.

18. If T is a diagonalizable linear operator, then every T-invariant subspace has a complementary T-invariant subspace.

19. Let T be a linear operator on the finite-dimensional space V. Prove that T has a cyclic vector if and only if the following is true: Every linear operator U which commutes with T is a polynomial in T.

20. Let V be a finite-dimensional vector space over the field F, and let T be a linear operator on V. We ask when it is true that every non-zero vector in V is a cyclic vector for T. Prove that this is the case if and only if the characteristic polynomial for T is irreducible over F.
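Several of these exercises, as well as Exercise 7 of Section 7.1, come down to deciding whether a given vector is cyclic. A quick test (my own sketch, assuming NumPy; the function name is hypothetical): α is a cyclic vector for A exactly when the Krylov matrix with columns α, Aα, …, A^{n−1}α has rank n.

```python
import numpy as np

def is_cyclic_vector(A, alpha):
    # alpha is cyclic for A iff alpha, A*alpha, ..., A^(n-1)*alpha span the whole space.
    n = A.shape[0]
    krylov = np.column_stack([np.linalg.matrix_power(A, k) @ alpha for k in range(n)])
    return np.linalg.matrix_rank(krylov) == n

# A diagonalizable operator with distinct characteristic values 1, 2, 3:
A = np.diag([1.0, 2.0, 3.0])
print(is_cyclic_vector(A, np.array([1.0, 1.0, 1.0])))   # True: a sum of characteristic vectors
print(is_cyclic_vector(A, np.array([1.0, 1.0, 0.0])))   # False: it lies in a proper invariant subspace
```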
21. Let A be an n × n matrix with real entries. Let T be the linear operator on R^n which is represented by A in the standard ordered basis, and let U be the linear operator on C^n which is represented by A in the standard ordered basis. Use the result of Exercise 20 to prove the following: If the only subspaces invariant under T are R^n and the zero subspace, then U is diagonalizable.

7.3. The Jordan Form

Suppose that N is a nilpotent linear operator on the finite-dimensional space V. Let us look at the cyclic decomposition for N which we obtain from Theorem 3. We have a positive integer r and r non-zero vectors α_1, …, α_r in V with N-annihilators p_1, …, p_r such that

V = Z(α_1; N) ⊕ ⋯ ⊕ Z(α_r; N)

and p_{i+1} divides p_i for i = 1, …, r − 1. Since N is nilpotent, the minimal polynomial is x^k for some k ≤ n. Thus each p_i is of the form p_i = x^{k_i}, and the divisibility condition simply says that

k_1 ≥ k_2 ≥ ⋯ ≥ k_r.

Of course, k_1 = k and k_r ≥ 1. The companion matrix of x^{k_i} is the k_i × k_i matrix

(7-24)  A_i = [0  0  ⋯  0  0]
              [1  0  ⋯  0  0]
              [0  1  ⋯  0  0]
              [⋮  ⋮      ⋮  ⋮]
              [0  0  ⋯  1  0]

Thus Theorem 3 gives us an ordered basis for V in which the matrix of N is the direct sum of the elementary nilpotent matrices (7-24), the sizes of which decrease as i increases. One sees from this that associated with a nilpotent n × n matrix is a positive integer r and r positive integers k_1, …, k_r such that k_1 + ⋯ + k_r = n and k_i ≥ k_{i+1}, and these positive integers determine the rational form of the matrix, i.e., determine the matrix up to similarity.

Here is one thing we should like to point out about the nilpotent operator N above. The positive integer r is precisely the nullity of N; in fact, the null space has as a basis the r vectors

(7-25)  N^{k_1−1}α_1, …, N^{k_r−1}α_r.

For, let α be in the null space of N. We write α in the form

α = f_1α_1 + ⋯ + f_rα_r

where f_i is a polynomial, the degree of which we may assume is less than k_i. Since Nα = 0, for each i we have
0 = N(f_iα_i)
  = Nf_i(N)α_i
  = (xf_i)α_i.

Thus xf_i is divisible by x^{k_i}, and since deg(f_i) < k_i this means that

f_i = c_i x^{k_i−1}

where c_i is some scalar. But then

α = c_1(x^{k_1−1}α_1) + ⋯ + c_r(x^{k_r−1}α_r)

which shows us that the vectors (7-25) form a basis for the null space of N. The reader should note that this fact is also quite clear from the matrix point of view.

Now what we wish to do is to combine our findings about nilpotent operators or matrices with the primary decomposition theorem of Chapter 6. The situation is this: Suppose that T is a linear operator on V and that the characteristic polynomial for T factors over F as follows:

f = (x − c_1)^{d_1} ⋯ (x − c_k)^{d_k}

where c_1, …, c_k are distinct elements of F and d_i ≥ 1. Then the minimal polynomial for T will be

p = (x − c_1)^{r_1} ⋯ (x − c_k)^{r_k}

where 1 ≤ r_i ≤ d_i. If W_i is the null space of (T − c_iI)^{r_i}, then the primary decomposition theorem tells us that

V = W_1 ⊕ ⋯ ⊕ W_k

and that the operator T_i induced on W_i by T has minimal polynomial (x − c_i)^{r_i}. Let N_i be the linear operator on W_i defined by N_i = T_i − c_iI. Then N_i is nilpotent and has minimal polynomial x^{r_i}. On W_i, T acts like N_i plus the scalar c_i times the identity operator. Suppose we choose a basis for the subspace W_i corresponding to the cyclic decomposition for the nilpotent operator N_i. Then the matrix of T_i in this ordered basis will be the direct sum of matrices

(7-26)  [c  0  ⋯  0  0]
        [1  c  ⋯  0  0]
        [⋮  ⋮      ⋮  ⋮]
        [0  0  ⋯  c  0]
        [0  0  ⋯  1  c]

each with c = c_i. Furthermore, the sizes of these matrices will decrease as one reads from left to right. A matrix of the form (7-26) is called an elementary Jordan matrix with characteristic value c. Now if we put all the bases for the W_i together, we obtain an ordered basis for V. Let us describe the matrix A of T in this ordered basis.
The matrix A is the direct sum

(7-27)  A = [A_1              ]
            [     A_2         ]
            [          ⋱      ]
            [             A_k ]

of matrices A_1, …, A_k. Each A_i is of the form

A_i = [J_1^{(i)}                      ]
      [          J_2^{(i)}            ]
      [                    ⋱          ]
      [                      J_{n_i}^{(i)}]

where each J_j^{(i)} is an elementary Jordan matrix with characteristic value c_i. Also, within each A_i, the sizes of the matrices J_j^{(i)} decrease as j increases. An n × n matrix A which satisfies all the conditions described so far in this paragraph (for some distinct scalars c_1, …, c_k) will be said to be in Jordan form.

We have just pointed out that if T is a linear operator for which the characteristic polynomial factors completely over the scalar field, then there is an ordered basis for V in which T is represented by a matrix which is in Jordan form. We should like to show now that this matrix is something uniquely associated with T, up to the order in which the characteristic values of T are written down. In other words, if two matrices are in Jordan form and they are similar, then they can differ only in that the order of the scalars c_i is different.

The uniqueness we see as follows. Suppose there is some ordered basis for V in which T is represented by the Jordan matrix A described in the previous paragraph. If A_i is a d_i × d_i matrix, then d_i is clearly the multiplicity of c_i as a root of the characteristic polynomial for A, or for T. In other words, the characteristic polynomial for T is

f = (x − c_1)^{d_1} ⋯ (x − c_k)^{d_k}.

This shows that c_1, …, c_k and d_1, …, d_k are unique, up to the order in which we write them. The fact that A is the direct sum of the matrices A_i gives us a direct sum decomposition V = W_1 ⊕ ⋯ ⊕ W_k invariant under T. Now note that W_i must be the null space of (T − c_iI)^n, where n = dim V; for, A_i − c_iI is clearly nilpotent and A_j − c_iI is non-singular for j ≠ i. So we see that the subspaces W_i are unique. If T_i is the operator induced on W_i by T, then the matrix A_i is uniquely determined as the rational form for (T_i − c_iI).

Now we wish to make some further observations about the operator T and the Jordan matrix A which represents T in some ordered basis. We shall list a string of observations:

(1) Every entry of A not on or immediately below the main diagonal
is 0. On the diagonal of A occur the k distinct characteristic values c_1, …, c_k of T. Also, c_i is repeated d_i times, where d_i is the multiplicity of c_i as a root of the characteristic polynomial, i.e., d_i = dim W_i.

(2) For each i, the matrix A_i is the direct sum of n_i elementary Jordan matrices J_j^{(i)} with characteristic value c_i. The number n_i is precisely the dimension of the space of characteristic vectors associated with the characteristic value c_i. For, n_i is the number of elementary nilpotent blocks in the rational form for (T_i − c_iI), and is thus equal to the dimension of the null space of (T − c_iI). In particular, notice that T is diagonalizable if and only if n_i = d_i for each i.

(3) For each i, the first block J_1^{(i)} in the matrix A_i is an r_i × r_i matrix, where r_i is the multiplicity of c_i as a root of the minimal polynomial for T. This follows from the fact that the minimal polynomial for the nilpotent operator (T_i − c_iI) is x^{r_i}.

Of course we have as usual the straight matrix result. If B is an n × n matrix over the field F and if the characteristic polynomial for B factors completely over F, then B is similar over F to an n × n matrix A in Jordan form, and A is unique up to a rearrangement of the order of its characteristic values. We call A the Jordan form of B.

Also, note that if F is an algebraically closed field, then the above remarks apply to every linear operator on a finite-dimensional space over F, or to every n × n matrix over F. Thus, for example, every n × n matrix over the field of complex numbers is similar to an essentially unique matrix in Jordan form.

EXAMPLE 5. Suppose T is a linear operator on C². The characteristic polynomial for T is either (x − c_1)(x − c_2), where c_1 and c_2 are distinct complex numbers, or is (x − c)². In the former case, T is diagonalizable and is represented in some ordered basis by

[c_1   0 ]
[ 0   c_2]

In the latter case, the minimal polynomial for T may be (x − c), in which case T = cI, or may be (x − c)², in which case T is represented in some ordered basis by the matrix

[c  0]
[1  c]

Thus every 2 × 2 matrix over the field of complex numbers is similar to a matrix of one of the two types displayed above, possibly with c_1 = c_2.

EXAMPLE 6. Let A be the complex 3 × 3 matrix

A = [2  0   0]
    [a  2   0]
    [b  c  −1]
The characteristic polynomial for A is obviously (x − 2)²(x + 1). Either this is the minimal polynomial, in which case A is similar to

[2  0   0]
[1  2   0]
[0  0  −1]

or the minimal polynomial is (x − 2)(x + 1), in which case A is similar to

[2  0   0]
[0  2   0]
[0  0  −1]

Now

(A − 2I)(A + I) = [ 0  0  0]
                  [3a  0  0]
                  [ac  0  0]

and thus A is similar to a diagonal matrix if and only if a = 0.

EXAMPLE 7. Let

A = [2  0  0  0]
    [1  2  0  0]
    [0  0  2  0]
    [0  0  a  2]

The characteristic polynomial for A is (x − 2)⁴. Since A is the direct sum of two 2 × 2 matrices, it is clear that the minimal polynomial for A is (x − 2)². Now if a = 0 or if a = 1, then the matrix A is in Jordan form. Notice that the two matrices we obtain for a = 0 and a = 1 have the same characteristic polynomial and the same minimal polynomial, but are not similar. They are not similar because for the first matrix the solution space of (A − 2I) has dimension 3, while for the second matrix it has dimension 2.

EXAMPLE 8. Linear differential equations with constant coefficients (Example 14, Chapter 6) provide a nice illustration of the Jordan form. Let a_0, …, a_{n−1} be complex numbers and let V be the space of all n times differentiable functions f on an interval of the real line which satisfy the differential equation

d^n f/dx^n + a_{n−1} d^{n−1}f/dx^{n−1} + ⋯ + a_1 df/dx + a_0 f = 0.

Let D be the differentiation operator. Then V is invariant under D, because V is the null space of p(D), where

p = x^n + ⋯ + a_1x + a_0.

What is the Jordan form for the differentiation operator on V?
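Example 7 is easy to probe with a computer algebra system. The sketch below is my own illustration, assuming SymPy: for a = 0 and a = 1 the two matrices have the same characteristic and minimal polynomials, yet their Jordan forms differ, because the null spaces of A − 2I have different dimensions.

```python
from sympy import Matrix, eye

def example_matrix(a):
    return Matrix([[2, 0, 0, 0],
                   [1, 2, 0, 0],
                   [0, 0, 2, 0],
                   [0, 0, a, 2]])

for a in (0, 1):
    A = example_matrix(a)
    P, J = A.jordan_form()                        # A = P * J * P**-1
    blocks = len((A - 2 * eye(4)).nullspace())    # number of elementary Jordan matrices
    print(a, blocks, J.tolist())

# a = 0: three blocks (the null space of A - 2I has dimension 3)
# a = 1: two blocks  (dimension 2); hence the two matrices are not similar
```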
Let c_1, …, c_k be the distinct complex roots of p:

p = (x − c_1)^{r_1} ⋯ (x − c_k)^{r_k}.

Let V_i be the null space of (D − c_iI)^{r_i}, that is, the set of solutions to the differential equation

(D − c_iI)^{r_i} f = 0.

Then as we noted in Example 15, Chapter 6, the primary decomposition theorem tells us that

V = V_1 ⊕ ⋯ ⊕ V_k.

Let N_i be the restriction of D − c_iI to V_i. The Jordan form for the operator D (on V) is then determined by the rational forms for the nilpotent operators N_1, …, N_k on the spaces V_1, …, V_k.

So, what we must know (for various values of c) is the rational form for the operator N = (D − cI) on the space V_c, which consists of the solutions of the equation

(D − cI)^r f = 0.

How many elementary nilpotent blocks will there be in the rational form for N? The number will be the nullity of N, i.e., the dimension of the characteristic space associated with the characteristic value c. That dimension is 1, because any function which satisfies the differential equation

Df = cf

is a scalar multiple of the exponential function h(x) = e^{cx}. Therefore, the operator N (on the space V_c) has a cyclic vector. A good choice for a cyclic vector is g = x^{r−1}h:

g(x) = x^{r−1}e^{cx}.

This gives

Ng = (r − 1)x^{r−2}h
⋮
N^{r−1}g = (r − 1)! h.

The preceding paragraph shows us that the Jordan form for D (on the space V) is the direct sum of k elementary Jordan matrices, one for each root c_i.

Exercises

1. Let N_1 and N_2 be 3 × 3 nilpotent matrices over the field F. Prove that N_1 and N_2 are similar if and only if they have the same minimal polynomial.

2. Use the result of Exercise 1 and the Jordan form to prove the following: Let
A and B be n × n matrices over the field F which have the same characteristic polynomial

f = (x − c_1)^{d_1} ⋯ (x − c_k)^{d_k}

and the same minimal polynomial. If no d_i is greater than 3, then A and B are similar.

3. If A is a complex 5 × 5 matrix with characteristic polynomial

f = (x − 2)³(x + 7)²

and minimal polynomial p = (x − 2)²(x + 7), what is the Jordan form for A?

4. How many possible Jordan forms are there for a 6 × 6 complex matrix with characteristic polynomial (x + 2)⁴(x − 1)²?

5. The differentiation operator on the space of polynomials of degree less than or equal to 3 is represented in the 'natural' ordered basis by the matrix

[0  1  0  0]
[0  0  2  0]
[0  0  0  3]
[0  0  0  0]

What is the Jordan form of this matrix? (F a subfield of the complex numbers.)

6. Let A be the complex matrix

[ 2  0  0  0  0   0]
[ 1  2  0  0  0   0]
[−1  0  2  0  0   0]
[ 0  1  0  2  0   0]
[ 1  1  1  1  2   0]
[ 0  0  0  0  1  −1]

Find the Jordan form for A.

7. If A is an n × n matrix over the field F with characteristic polynomial

f = (x − c_1)^{d_1} ⋯ (x − c_k)^{d_k}

what is the trace of A?

8. Classify up to similarity all 3 × 3 complex matrices A such that A³ = I.

9. Classify up to similarity all n × n complex matrices A such that A^n = I.

10. Let n be a positive integer, n ≥ 2, and let N be an n × n matrix over the field F such that N^n = 0 but N^{n−1} ≠ 0. Prove that N has no square root, i.e., that there is no n × n matrix A such that A² = N.

11. Let N_1 and N_2 be 6 × 6 nilpotent matrices over the field F. Suppose that N_1 and N_2 have the same minimal polynomial and the same nullity. Prove that N_1 and N_2 are similar. Show that this is not true for 7 × 7 nilpotent matrices.

12. Use the result of Exercise 11 and the Jordan form to prove the following: Let A and B be n × n matrices over the field F which have the same characteristic polynomial

f = (x − c_1)^{d_1} ⋯ (x − c_k)^{d_k}
and the same minimal polynomial. Suppose also that for each i the solution spaces of (A − c_iI) and (B − c_iI) have the same dimension. If no d_i is greater than 6, then A and B are similar.

13. If N is a k × k elementary nilpotent matrix, i.e., N^k = 0 but N^{k−1} ≠ 0, show that N^t is similar to N. Now use the Jordan form to prove that every complex n × n matrix is similar to its transpose.

14. What's wrong with the following proof? If A is a complex n × n matrix such that A^t = −A, then A is 0. (Proof: Let J be the Jordan form of A. Since A^t = −A, J^t = −J. But J is triangular, so that J^t = −J implies that every entry of J is zero. Since J = 0 and A is similar to J, we see that A = 0.) (Give an example of a non-zero A such that A^t = −A.)

15. If N is a nilpotent 3 × 3 matrix over C, prove that A = I + ½N − ⅛N² satisfies A² = I + N, i.e., A is a square root of I + N. Use the binomial series for (1 + t)^{1/2} to obtain a similar formula for a square root of I + N, where N is any nilpotent n × n matrix over C.

16. Use the result of Exercise 15 to prove that if c is a non-zero complex number and N is a nilpotent complex matrix, then (cI + N) has a square root. Now use the Jordan form to prove that every non-singular complex n × n matrix has a square root.

7.4. Computation of Invariant Factors

Suppose that A is an n × n matrix with entries in the field F. We wish to find a method for computing the invariant factors p_1, …, p_r which define the rational form for A. Let us begin with the very simple case in which A is the companion matrix (7-2) of a monic polynomial

p = x^n + c_{n−1}x^{n−1} + ⋯ + c_1x + c_0.

In Section 7.1 we saw that p is both the minimal and the characteristic polynomial for the companion matrix A. Now, we want to give a direct calculation which shows that p is the characteristic polynomial for A. In this case,

xI − A = [ x   0  0  ⋯  0  c_0          ]
         [−1   x  0  ⋯  0  c_1          ]
         [ 0  −1  x  ⋯  0  c_2          ]
         [ ⋮   ⋮  ⋮      ⋮   ⋮           ]
         [ 0   0  0  ⋯ −1  x + c_{n−1}  ]

Add x times row n to row (n − 1). This will remove the x in the (n − 1, n − 1) place and it will not change the determinant. Then, add x times the new row (n − 1) to row (n − 2). Continue successively until all of the x's on the main diagonal have been removed by that process. The result is the matrix
[ 0   0  ⋯   0  p                                    ]
[−1   0  ⋯   0  x^{n−1} + c_{n−1}x^{n−2} + ⋯ + c_1   ]
[ 0  −1  ⋯   0  x^{n−2} + c_{n−1}x^{n−3} + ⋯ + c_2   ]
[ ⋮   ⋮       ⋮  ⋮                                    ]
[ 0   0  ⋯  −1  x + c_{n−1}                          ]

which has the same determinant as xI − A. The upper right-hand entry of this matrix is the polynomial p. We clean up the last column by adding to it appropriate multiples of the other columns:

[ 0   0  ⋯   0  p]
[−1   0  ⋯   0  0]
[ 0  −1  ⋯   0  0]
[ ⋮   ⋮       ⋮  ⋮]
[ 0   0  ⋯  −1  0]

Multiply each of the first (n − 1) columns by −1 and then perform (n − 1) interchanges of adjacent columns to bring the present column n to the first position. The total effect of the 2n − 2 sign changes is to leave the determinant unaltered. We obtain the matrix

(7-28)  [p            ]
        [   1         ]
        [      1      ]
        [        ⋱    ]
        [           1 ]

It is then clear that p = det(xI − A). We are going to show that, for any n × n matrix A, there is a succession of row and column operations which will transform xI − A into a matrix much like (7-28), in which the invariant factors of A appear down the main diagonal. Let us be completely clear about the operations we shall use.

We shall be concerned with F[x]^{m×n}, the collection of m × n matrices with entries which are polynomials over the field F. If M is such a matrix, an elementary row operation on M is one of the following:

1. multiplication of one row of M by a non-zero scalar in F;
2. replacement of the r-th row of M by row r plus f times row s, where f is any polynomial over F and r ≠ s;
3. interchange of two rows of M.

The inverse operation of an elementary row operation is an elementary row operation of the same type. Notice that we could not make such an assertion if we allowed non-scalar polynomials in (1). An m × m ele-
mentary matrix, that is, an elementary matrix in F[x]^{m×m}, is one which can be obtained from the m × m identity matrix by means of a single elementary row operation. Clearly each elementary row operation on M can be effected by multiplying M on the left by a suitable m × m elementary matrix; in fact, if e is the operation, then

e(M) = e(I)M.

Let M, N be matrices in F[x]^{m×n}. We say that N is row-equivalent to M if N can be obtained from M by a finite succession of elementary row operations:

M = M_0 → M_1 → ⋯ → M_k = N.

Evidently N is row-equivalent to M if and only if M is row-equivalent to N, so that we may use the terminology 'M and N are row-equivalent.' If N is row-equivalent to M, then

N = PM

where the m × m matrix P is a product of elementary matrices:

P = E_1 ⋯ E_k.

In particular, P is an invertible matrix with inverse

P⁻¹ = E_k⁻¹ ⋯ E_1⁻¹.

Of course, the inverse of E_i comes from the inverse elementary row operation.

All of this is just as it is in the case of matrices with entries in F. It parallels the elementary results in Chapter 1. Thus, the next problem which suggests itself is to introduce a row-reduced echelon form for polynomial matrices. Here, we meet a new obstacle. How do we row-reduce a matrix? The first step is to single out the leading non-zero entry of row 1 and to divide every entry of row 1 by that entry. We cannot (necessarily) do that when the matrix has polynomial entries. As we shall see in the next theorem, we can circumvent this difficulty in certain cases; however, there is not any entirely suitable row-reduced form for the general matrix in F[x]^{m×n}. If we introduce column operations as well and study the type of equivalence which results from allowing the use of both types of operations, we can obtain a very useful standard form for each matrix. The basic tool is the following.

Lemma. Let M be a matrix in F[x]^{m×n} which has some non-zero entry in its first column, and let p be the greatest common divisor of the entries in column 1 of M. Then M is row-equivalent to a matrix N which has

[p]
[0]
[⋮]
[0]

as its first column.
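The algorithm this section develops reduces xI − A by row and column operations. An equivalent route to the invariant factors, convenient for checking small examples by machine, uses the greatest common divisors of the k × k minors of xI − A (the determinantal divisors). The sketch below is my own illustration, assuming SymPy; it is not the reduction procedure of the text, and the helper name is hypothetical.

```python
from itertools import combinations
from sympy import Matrix, symbols, eye, gcd, factor, cancel

x = symbols('x')

def invariant_factors(A):
    # Invariant factors of A, largest first (each dividing the previous one), computed
    # from xI - A: d_k = gcd of all k x k minors of xI - A, and the k-th diagonal entry
    # of the equivalent diagonal form is d_k / d_(k-1).
    n = A.rows
    M = x * eye(n) - A
    d = [1]
    for k in range(1, n + 1):
        minors = [M[list(r), list(c)].det()
                  for r in combinations(range(n), k)
                  for c in combinations(range(n), k)]
        g = minors[0]
        for m in minors[1:]:
            g = gcd(g, m)
        d.append(g)
    diag = [cancel(d[k] / d[k - 1]) for k in range(1, n + 1)]
    nonconstant = [q.as_poly(x).monic().as_expr() for q in diag if q.free_symbols]
    return [factor(q) for q in reversed(nonconstant)]

A = Matrix([[5, -6, -6], [-1, 4, 2], [3, -6, -4]])
print(invariant_factors(A))   # [(x - 2)*(x - 1), x - 2]: p1 is the minimal polynomial, p2 = x - 2
```

For the matrix of Example 3 in Section 7.2 this recovers the invariant factors found there.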
Proof. We shall prove something more than we have stated. We shall show that there is an algorithm for finding N, i.e., a prescription which a machine could use to calculate N in a finite number of steps. First, we need some notation.

Let M be any m × n matrix with entries in F[x] which has a non-zero first column

M_1 = [f_1]
      [⋮  ]
      [f_m]

Define

(7-29)  l(M_1) = min_{f_i ≠ 0} deg f_i
        p(M_1) = g.c.d. (f_1, …, f_m).

Let j be some index such that deg f_j = l(M_1). To be specific, let j be the smallest index i for which deg f_i = l(M_1). Attempt to divide each f_i by f_j:

(7-30)  f_i = f_j g_i + r_i,   r_i = 0 or deg r_i < deg f_j.

For each i different from j, replace row i of M by row i minus g_i times row j. Multiply row j by the reciprocal of the leading coefficient of f_j and then interchange rows j and 1. The result of all these operations is a matrix M′ which has for its first column

(7-31)  M_1′ = [f̄_j]
               [r_2]
               [⋮  ]
               [r_m]

where f̄_j is the monic polynomial obtained by normalizing f_j to have leading coefficient 1. We have given a well-defined procedure for associating with each M a matrix M′ with these properties:

(a) M′ is row-equivalent to M;
(b) p(M_1′) = p(M_1);
(c) either l(M_1′) < l(M_1) or

M_1′ = [p(M_1)]
       [0     ]
       [⋮     ]
       [0     ]

It is easy to verify (b) and (c) from (7-30) and (7-31). Property (c)
Page 263 :
is just another way of stating that either there is some $i$ such that $r_i \ne 0$ and $\deg r_i < \deg f_j$, or else $r_i = 0$ for all $i$ and $\tilde f_j$ is (therefore) the greatest common divisor of $f_1, \ldots, f_m$.
The proof of the lemma is now quite simple. We start with the matrix $M$ and apply the above procedure to obtain $M'$. Property (c) tells us that either $M'$ will serve as the matrix $N$ in the lemma or $l(M_1') < l(M_1)$. In the latter case, we apply the procedure to $M'$ to obtain the matrix $M^{(2)} = (M')'$. If $M^{(2)}$ is not a suitable $N$, we form $M^{(3)} = (M^{(2)})'$, and so on. The point is that the strict inequalities
$$l(M_1) > l(M_1') > l(M_1^{(2)}) > \cdots$$
cannot continue for very long. After not more than $l(M_1)$ iterations of our procedure, we must arrive at a matrix $M^{(k)}$ which has the properties we seek. ∎

Theorem 6. Let $P$ be an $m \times m$ matrix with entries in the polynomial algebra $F[x]$. The following are equivalent.
(i) $P$ is invertible.
(ii) The determinant of $P$ is a non-zero scalar polynomial.
(iii) $P$ is row-equivalent to the $m \times m$ identity matrix.
(iv) $P$ is a product of elementary matrices.

Proof. Certainly (i) implies (ii) because the determinant function is multiplicative and the only polynomials invertible in $F[x]$ are the non-zero scalar ones. As a matter of fact, in Chapter 5 we used the classical adjoint to show that (i) and (ii) are equivalent. Our argument here provides a different proof that (i) follows from (ii). We shall complete the merry-go-round
$$(\text{i}) \Rightarrow (\text{ii}) \Rightarrow (\text{iii}) \Rightarrow (\text{iv}) \Rightarrow (\text{i}).$$
The only implication which is not obvious is that (iii) follows from (ii).
Assume (ii) and consider the first column of $P$. It contains certain polynomials $p_1, \ldots, p_m$, and
$$\text{g.c.d.}(p_1, \ldots, p_m) = 1$$
because any common divisor of $p_1, \ldots, p_m$ must divide (the scalar) $\det P$. Apply the previous lemma to $P$ to obtain a matrix
(7-32)
$$Q = \begin{bmatrix} 1 & a_2 & \cdots & a_m \\ 0 & & & \\ \vdots & & B & \\ 0 & & & \end{bmatrix}$$
which is row-equivalent to $P$. An elementary row operation changes the determinant of a matrix by (at most) a non-zero scalar factor. Thus $\det Q$
Page 264 :
is a non-zero scalar polynomial. Evidently the $(m-1) \times (m-1)$ matrix $B$ in (7-32) has the same determinant as does $Q$. Therefore, we may apply the last lemma to $B$. If we continue this way for $m$ steps, we obtain an upper-triangular matrix
$$R = \begin{bmatrix} 1 & a_2 & \cdots & a_m \\ 0 & 1 & \cdots & b_m \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}$$
with 1's on its main diagonal, which is row-equivalent to $P$. Obviously $R$ is row-equivalent to the $m \times m$ identity matrix. ∎

Corollary. Let $M$ and $N$ be $m \times n$ matrices with entries in the polynomial algebra $F[x]$. Then $N$ is row-equivalent to $M$ if and only if
$$N = PM$$
where $P$ is an invertible $m \times m$ matrix with entries in $F[x]$.

We now define elementary column operations and column-equivalence in a manner analogous to row operations and row-equivalence. We do not need a new concept of elementary matrix because the class of matrices which can be obtained by performing one elementary column operation on the identity matrix is the same as the class obtained by using a single elementary row operation.

Definition. The matrix $N$ is equivalent to the matrix $M$ if we can pass from $M$ to $N$ by means of a sequence of operations
$$M = M_0 \to M_1 \to \cdots \to M_k = N,$$
each of which is an elementary row operation or an elementary column operation.

Theorem 7. Let $M$ and $N$ be $m \times n$ matrices with entries in the polynomial algebra $F[x]$. Then $N$ is equivalent to $M$ if and only if
$$N = PMQ$$
where $P$ is an invertible matrix in $F[x]^{m\times m}$ and $Q$ is an invertible matrix in $F[x]^{n\times n}$.

Theorem 8. Let $A$ be an $n \times n$ matrix with entries in the field $F$, and let $p_1, \ldots, p_r$ be the invariant factors for $A$. The matrix $xI - A$ is equivalent to the $n \times n$ diagonal matrix with diagonal entries $p_1, \ldots, p_r, 1, 1, \ldots, 1$.

Proof. There exists an invertible $n \times n$ matrix $P$, with entries in $F$, such that $PAP^{-1}$ is in rational form, that is, has the block form
Page 265 :
$$PAP^{-1} = \begin{bmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & A_r \end{bmatrix}$$
where $A_i$ is the companion matrix of the polynomial $p_i$. According to Theorem 7, the matrix
(7-33)
$$P(xI - A)P^{-1} = xI - PAP^{-1}$$
is equivalent to $xI - A$. Now
(7-34)
$$xI - PAP^{-1} = \begin{bmatrix} xI - A_1 & 0 & \cdots & 0 \\ 0 & xI - A_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & xI - A_r \end{bmatrix}$$
where the various $I$'s we have used are identity matrices of appropriate sizes. At the beginning of this section, we showed that $xI - A_i$ is equivalent to the matrix
$$\begin{bmatrix} p_i & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}.$$
From (7-33) and (7-34) it is then clear that $xI - A$ is equivalent to a diagonal matrix which has the polynomials $p_i$ and $(n - r)$ 1's on its main diagonal. By a succession of row and column interchanges, we can arrange those diagonal entries in any order we choose, for example: $p_1, \ldots, p_r, 1, \ldots, 1$. ∎

Theorem 8 does not give us an effective way of calculating the invariant factors $p_1, \ldots, p_r$, because our proof depends upon the cyclic decomposition theorem. We shall now give an explicit algorithm for reducing a polynomial matrix to diagonal form. Theorem 8 suggests that we may also arrange that successive elements on the main diagonal divide one another.

Definition. Let $N$ be a matrix in $F[x]^{m\times n}$. We say that $N$ is in (Smith) normal form if
(a) every entry of $N$ not on the main diagonal is 0;
(b) on the main diagonal of $N$ there appear (in order) polynomials $f_1, \ldots, f_l$ such that $f_k$ divides $f_{k+1}$, $1 \le k \le l - 1$.

In the definition, the number $l$ is $l = \min(m, n)$. The main diagonal entries are $f_k = N_{kk}$, $k = 1, \ldots, l$.

Theorem 9. Let $M$ be an $m \times n$ matrix with entries in the polynomial algebra $F[x]$. Then $M$ is equivalent to a matrix $N$ which is in normal form.
Page 266 :
Proof. If $M = 0$, there is nothing to prove. If $M \ne 0$, we shall give an algorithm for finding a matrix $M'$ which is equivalent to $M$ and which has the form
(7-35)
$$M' = \begin{bmatrix} f_1 & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & R & \\ 0 & & & \end{bmatrix}$$
where $R$ is an $(m-1) \times (n-1)$ matrix and $f_1$ divides every entry of $R$. We shall then be finished, because we can apply the same procedure to $R$ and obtain $f_2$, etc.
Let $l(M)$ be the minimum of the degrees of the non-zero entries of $M$. Find the first column which contains an entry with degree $l(M)$ and interchange that column with column 1. Call the resulting matrix $M^{(0)}$. We describe a procedure for finding a matrix of the form
(7-36)
$$\begin{bmatrix} g & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & S & \\ 0 & & & \end{bmatrix}$$
which is equivalent to $M^{(0)}$. We begin by applying to the matrix $M^{(0)}$ the procedure of the lemma before Theorem 6, a procedure which we shall call PL6. There results a matrix
(7-37)
$$\begin{bmatrix} p & a & \cdots & b \\ 0 & & & \\ \vdots & & & \\ 0 & & & \end{bmatrix}.$$
If the entries $a, \ldots, b$ are all 0, fine. If not, we use the analogue of PL6 for the first row, a procedure which we might call PL6'. The result is a matrix
(7-38) whose first row is $(q, 0, \ldots, 0)$,
where $q$ is the greatest common divisor of $p, a, \ldots, b$. In producing $M^{(2)}$, we may or may not have disturbed the nice form of column 1. If we did, we can apply PL6 once again. Here is the point. In not more than $l(M)$ steps:
$$M^{(0)} \xrightarrow{\ \text{PL6}\ } M^{(1)} \xrightarrow{\ \text{PL6}'\ } M^{(2)} \xrightarrow{\ \text{PL6}\ } \cdots \to M^{(t)}$$
we must arrive at a matrix $M^{(t)}$ which has the form (7-36), because at each successive step we have $l(M^{(k+1)}) < l(M^{(k)})$. We name the process which we have just defined P7-36:
$$M^{(0)} \xrightarrow{\ \text{P7-36}\ } M^{(t)}.$$
Page 267 :
In (7-36), the polynomial $g$ may or may not divide every entry of $S$. If it does not, find the first column which has an entry not divisible by $g$ and add that column to column 1. The new first column contains both $g$ and an entry $gh + r$ where $r \ne 0$ and $\deg r < \deg g$. Apply process P7-36 and the result will be another matrix of the form (7-36), where the degree of the corresponding $g$ has decreased.
It should now be obvious that in a finite number of steps we will obtain (7-35), i.e., we will reach a matrix of the form (7-36) where the degree of $g$ cannot be further reduced. ∎

We want to show that the normal form associated with a matrix $M$ is unique. Two things we have seen provide clues as to how the polynomials $f_1, \ldots, f_l$ in Theorem 9 are uniquely determined by $M$. First, elementary row and column operations do not change the determinant of a square matrix by more than a non-zero scalar factor. Second, elementary row and column operations do not change the greatest common divisor of the entries of a matrix.

Definition. Let $M$ be an $m \times n$ matrix with entries in $F[x]$. If $1 \le k \le \min(m, n)$, we define $\delta_k(M)$ to be the greatest common divisor of the determinants of all $k \times k$ submatrices of $M$.

Recall that a $k \times k$ submatrix of $M$ is one obtained by deleting some $m - k$ rows and some $n - k$ columns of $M$. In other words, we select certain $k$-tuples
$$I = (i_1, \ldots, i_k), \qquad 1 \le i_1 < \cdots < i_k \le m$$
$$J = (j_1, \ldots, j_k), \qquad 1 \le j_1 < \cdots < j_k \le n$$
and look at the matrix formed using those rows and columns of $M$. We are interested in the determinants
(7-39)
$$D_{I,J}(M) = \det \begin{bmatrix} M_{i_1j_1} & \cdots & M_{i_1j_k} \\ \vdots & & \vdots \\ M_{i_kj_1} & \cdots & M_{i_kj_k} \end{bmatrix}.$$
The polynomial $\delta_k(M)$ is the greatest common divisor of the polynomials $D_{I,J}(M)$, as $I$ and $J$ range over the possible $k$-tuples.

Theorem 10. If $M$ and $N$ are equivalent $m \times n$ matrices with entries in $F[x]$, then
(7-40) $\delta_k(M) = \delta_k(N)$, $\quad 1 \le k \le \min(m, n)$.

Proof. It will suffice to show that a single elementary row operation $e$ does not change $\delta_k$. Since the inverse of $e$ is also an elementary row operation, it will suffice to show this: If a polynomial $f$ divides every $D_{I,J}(M)$, then $f$ divides $D_{I,J}(e(M))$ for all $k$-tuples $I$ and $J$.
Page 268 :
Since we are considering a row operation, let $\alpha_1, \ldots, \alpha_m$ be the rows of $M$ and let us employ the notation
$$D_J(\alpha_{i_1}, \ldots, \alpha_{i_k}) = D_{I,J}(M).$$
Given $I$ and $J$, what is the relation between $D_{I,J}(M)$ and $D_{I,J}(e(M))$? Consider the three types of operations $e$:
(a) multiplication of row $r$ by a non-zero scalar $c$;
(b) replacement of row $r$ by row $r$ plus $g$ times row $s$, $r \ne s$;
(c) interchange of rows $r$ and $s$, $r \ne s$.
Forget about type (c) operations for the moment, and concentrate on types (a) and (b), which change only row $r$. If $r$ is not one of the indices $i_1, \ldots, i_k$, then
$$D_{I,J}(e(M)) = D_{I,J}(M).$$
If $r$ is among the indices $i_1, \ldots, i_k$, then in the two cases we have
(a) $D_{I,J}(e(M)) = D_J(\alpha_{i_1}, \ldots, c\alpha_r, \ldots, \alpha_{i_k}) = c\,D_J(\alpha_{i_1}, \ldots, \alpha_r, \ldots, \alpha_{i_k}) = c\,D_{I,J}(M)$;
(b) $D_{I,J}(e(M)) = D_J(\alpha_{i_1}, \ldots, \alpha_r + g\alpha_s, \ldots, \alpha_{i_k}) = D_{I,J}(M) + g\,D_J(\alpha_{i_1}, \ldots, \alpha_s, \ldots, \alpha_{i_k})$.
For type (a) operations, it is clear that any $f$ which divides $D_{I,J}(M)$ also divides $D_{I,J}(e(M))$. For the case of a type (b) operation, notice that
$$D_J(\alpha_{i_1}, \ldots, \alpha_s, \ldots, \alpha_{i_k}) = 0, \qquad \text{if } s = i_j \text{ for some } j$$
$$D_J(\alpha_{i_1}, \ldots, \alpha_s, \ldots, \alpha_{i_k}) = \pm D_{I',J}(M), \qquad \text{if } s \ne i_j \text{ for all } j.$$
The $I'$ in the last equation is the $k$-tuple $(i_1, \ldots, s, \ldots, i_k)$ arranged in increasing order. It should now be apparent that, if $f$ divides every $D_{I,J}(M)$, then $f$ divides every $D_{I,J}(e(M))$.
Operations of type (c) can be taken care of by roughly the same argument or by using the fact that such an operation can be effected by a sequence of operations of types (a) and (b). ∎

Corollary. Each matrix $M$ in $F[x]^{m\times n}$ is equivalent to precisely one matrix $N$ which is in normal form. The polynomials $f_1, \ldots, f_l$ which occur on the main diagonal of $N$ are
$$f_k = \frac{\delta_k(M)}{\delta_{k-1}(M)}, \qquad 1 \le k \le \min(m, n),$$
where, for convenience, we define $\delta_0(M) = 1$.

Proof. If $N$ is in normal form with diagonal entries $f_1, \ldots, f_l$, it is quite easy to see that
$$\delta_k(N) = f_1f_2 \cdots f_k.$$
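The corollary gives a way of computing the normal form of a polynomial matrix directly from determinants, without performing any row or column operations. The sketch below is not part of the text; it assumes the sympy library, and the particular 3 x 3 matrix A is an arbitrary illustration. It computes the determinantal divisors delta_k(xI - A) and recovers the diagonal of the normal form as the quotients f_k = delta_k / delta_{k-1}.

import itertools
import sympy as sp

x = sp.symbols('x')

def delta(M, k):
    """delta_k(M): gcd of the determinants of all k x k submatrices of M, as in (7-39)."""
    g = sp.Integer(0)
    for I in itertools.combinations(range(M.rows), k):
        for J in itertools.combinations(range(M.cols), k):
            g = sp.gcd(g, M[list(I), list(J)].det())
    return sp.expand(g)

A = sp.Matrix([[2, 0, 0],
               [0, 2, 0],
               [0, 0, 3]])
M = x * sp.eye(3) - A
d = [sp.Integer(1)] + [delta(M, k) for k in range(1, 4)]          # d[0] = 1 by convention
f = [sp.factor(sp.cancel(d[k] / d[k - 1])) for k in range(1, 4)]  # diagonal of the normal form
print(f)   # [1, x - 2, (x - 2)*(x - 3)]; the non-trivial entries are the invariant factors of A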
Page 269 :
Sec. 7.4, , Computation of Invariant, , Factors, , Of course, we call the matrix N in the last corollary the normal, form, of M. The polynomials jr, . . . , ji are often called the invariant, factors, of M., Suppose that A is an n X n matrix with entries in F, and let pl, . . . , p,, be the invariant, factors for A. We now see that the normal form of the, matrix zl - A has diagonal entries 1, 1, . . . , 1, p,, . . . , pl. The last, corollary tells us what pl, . . . , p, are, in terms of submatrices of z1 - A., The number n - r is the largest k such that &(x1 - A) = 1. The minimal, polynomial pl is the characteristic polynomial for A divided by the greatest, common divisor of the determinants, of all (n - 1) X (n - 1) submatrices, of XI - A, etc., , Exercises, 1. True or false? Every matrix, matrix., , in F[z] nXn is row-equivalent, , to an upper-triangular, , 2. Let T be a linear operator on a finite-dimensional, vector space and let A be, the matrix of T in some ordered basis. Then T has a cyclic vector if and only if, the determinants, of the (n - 1) X (n - 1) submatrices, of z1 - A are relatively, prime., , 3. Let A be an n X n matrix with entries in the field F and let ji, . . . , j,, be the, diagonal, , entries, , of the normal, , form, , of zI - A. For which, , matrices, , A is jl # I?, , polynomial, z2(z - 1)2 and charac4. Construct a linear operator T with minimal, teristic polynomial, z3(z - 1)4. Describe the primary, decomposition, of the vector, on the primary, components., Find a basis, space under T and find the projections, in which the matrix of T is in Jordan form. Also find an explicit direct sum decomposition of the space into T-cyclic subspaces as in Theorem 3 and give the invariant, factors., 5. Let T be the linear, basis by the matrix, , A=, , operator, , I, 1, 0, 0, 0, 0, 0, 0, 0, , on R8 which, , 1, 0, 0, 1, 0, 1, , -1, , 0, , 1, 0, 0, 1, 0, 1, -1, , 1, 0, 0, 0, 1, 1, -1, , 0, , 0, , is represented, , 111, 000, 0 0, 000, 100, 110, -1, 0, 000, , in the, , standard, , 1, 1, 0, , -1, 1, 0, 1, , 1, , -1, 0, , (a) Find the characteristic, polynomial, and the invariant, factors., on, (b) Find the primary, decomposition, of R8 under T and the projections, the primary components., Find cyclic decompositions, of each primary, component, as in Theorem 3., , 261
Page 270 :
(c) Find the Jordan form of $A$.
(d) Find a direct-sum decomposition of $R^8$ into $T$-cyclic subspaces as in Theorem 3. (Hint: One way to do this is to use the results in (b) and an appropriate generalization of the ideas discussed in Example 4.)

7.5. Summary; Semi-Simple Operators

In the last two chapters, we have been dealing with a single linear operator $T$ on a finite-dimensional vector space $V$. The program has been to decompose $T$ into a direct sum of linear operators of an elementary nature, for the purpose of gaining detailed information about how $T$ 'operates' on the space $V$. Let us review briefly where we stand.
We began to study $T$ by means of characteristic values and characteristic vectors. We introduced diagonalizable operators, the operators which can be completely described in terms of characteristic values and vectors. We then observed that $T$ might not have a single characteristic vector. Even in the case of an algebraically closed scalar field, when every linear operator does have at least one characteristic vector, we noted that the characteristic vectors of $T$ need not span the space.
We then proved the cyclic decomposition theorem, expressing any linear operator as the direct sum of operators with a cyclic vector, with no assumption about the scalar field. If $U$ is a linear operator with a cyclic vector, there is a basis $\{\alpha_1, \ldots, \alpha_n\}$ with
$$U\alpha_j = \alpha_{j+1}, \qquad j = 1, \ldots, n - 1$$
$$U\alpha_n = -c_0\alpha_1 - c_1\alpha_2 - \cdots - c_{n-1}\alpha_n.$$
The action of $U$ on this basis is then to shift each $\alpha_j$ to the next vector $\alpha_{j+1}$, except that $U\alpha_n$ is some prescribed linear combination of the vectors in the basis. Since the general linear operator $T$ is the direct sum of a finite number of such operators $U$, we obtained an explicit and reasonably elementary description of the action of $T$.
We next applied the cyclic decomposition theorem to nilpotent operators. For the case of an algebraically closed scalar field, we combined this with the primary decomposition theorem to obtain the Jordan form. The Jordan form gives a basis $\{\alpha_1, \ldots, \alpha_n\}$ for the space $V$ such that, for each $j$, either $T\alpha_j$ is a scalar multiple of $\alpha_j$ or $T\alpha_j = c\alpha_j + \alpha_{j+1}$. Such a basis certainly describes the action of $T$ in an explicit and elementary manner.
The importance of the rational form (or the Jordan form) derives from the fact that it exists, rather than from the fact that it can be computed in specific cases. Of course, if one is given a specific linear operator $T$ and can compute its cyclic or Jordan form, that is the thing to do; for, having such a form, one can reel off vast amounts of information
Page 271 :
about $T$. Two different types of difficulties arise in the computation of such standard forms. One difficulty is, of course, the length of the computations. The other difficulty is that there may not be any method for doing the computations, even if one has the necessary time and patience. The second difficulty arises in, say, trying to find the Jordan form of a complex matrix. There simply is no well-defined method for factoring the characteristic polynomial, and thus one is stopped at the outset. The rational form does not suffer from this difficulty. As we showed in Section 7.4, there is a well-defined method for finding the rational form of a given $n \times n$ matrix; however, such computations are usually extremely lengthy.
In our summary of the results of these last two chapters, we have not yet mentioned one of the theorems which we proved. This is the theorem which states that if $T$ is a linear operator on a finite-dimensional vector space over an algebraically closed field, then $T$ is uniquely expressible as the sum of a diagonalizable operator and a nilpotent operator which commute. This was proved from the primary decomposition theorem and certain information about diagonalizable operators. It is not as deep a theorem as the cyclic decomposition theorem or the existence of the Jordan form, but it does have important and useful applications in certain parts of mathematics. In concluding this chapter, we shall prove an analogous theorem without assuming that the scalar field is algebraically closed. We begin by defining the operators which will play the role of the diagonalizable operators.

Definition. Let $V$ be a finite-dimensional vector space over the field $F$, and let $T$ be a linear operator on $V$. We say that $T$ is semi-simple if every $T$-invariant subspace has a complementary $T$-invariant subspace.

What we are about to prove is that, with some restriction on the field $F$, every linear operator $T$ is uniquely expressible in the form $T = S + N$, where $S$ is semi-simple, $N$ is nilpotent, and $SN = NS$. First, we are going to characterize semi-simple operators by means of their minimal polynomials, and this characterization will show us that, when $F$ is algebraically closed, an operator is semi-simple if and only if it is diagonalizable.

Lemma. Let $T$ be a linear operator on the finite-dimensional vector space $V$, and let $V = W_1 \oplus \cdots \oplus W_k$ be the primary decomposition for $T$. In other words, if $p$ is the minimal polynomial for $T$ and $p = p_1^{r_1} \cdots p_k^{r_k}$ is the prime factorization of $p$, then $W_j$ is the null space of $p_j(T)^{r_j}$. Let $W$ be any subspace of $V$ which is invariant under $T$. Then
$$W = (W \cap W_1) \oplus \cdots \oplus (W \cap W_k).$$

Proof. For the proof we need to recall a corollary to our proof of the primary decomposition theorem in Section 6.8. If $E_1, \ldots, E_k$ are
Page 272 :
the projections associated with the decomposition $V = W_1 \oplus \cdots \oplus W_k$, then each $E_j$ is a polynomial in $T$. That is, there are polynomials $h_1, \ldots, h_k$ such that $E_j = h_j(T)$.
Now let $W$ be a subspace which is invariant under $T$. If $\alpha$ is any vector in $W$, then $\alpha = \alpha_1 + \cdots + \alpha_k$, where $\alpha_j$ is in $W_j$. Now $\alpha_j = E_j\alpha = h_j(T)\alpha$, and since $W$ is invariant under $T$, each $\alpha_j$ is also in $W$. Thus each vector $\alpha$ in $W$ is of the form $\alpha = \alpha_1 + \cdots + \alpha_k$, where $\alpha_j$ is in the intersection $W \cap W_j$. This expression is unique, since $V = W_1 \oplus \cdots \oplus W_k$. Therefore
$$W = (W \cap W_1) \oplus \cdots \oplus (W \cap W_k). \quad ∎$$

Lemma. Let $T$ be a linear operator on $V$, and suppose that the minimal polynomial for $T$ is irreducible over the scalar field $F$. Then $T$ is semi-simple.

Proof. Let $W$ be a subspace of $V$ which is invariant under $T$. We must prove that $W$ has a complementary $T$-invariant subspace. According to a corollary of Theorem 3, it will suffice to prove that if $f$ is a polynomial and $\beta$ is a vector in $V$ such that $f(T)\beta$ is in $W$, then there is a vector $\alpha$ in $W$ with $f(T)\beta = f(T)\alpha$. So suppose $\beta$ is in $V$ and $f$ is a polynomial such that $f(T)\beta$ is in $W$. If $f(T)\beta = 0$, we let $\alpha = 0$ and then $\alpha$ is a vector in $W$ with $f(T)\beta = f(T)\alpha$. If $f(T)\beta \ne 0$, the polynomial $f$ is not divisible by the minimal polynomial $p$ of the operator $T$. Since $p$ is prime, this means that $f$ and $p$ are relatively prime, and there exist polynomials $g$ and $h$ such that $fg + ph = 1$. Because $p(T) = 0$, we then have $f(T)g(T) = I$. From this it follows that the vector $\beta$ must itself be in the subspace $W$; for,
$$\beta = g(T)f(T)\beta = g(T)(f(T)\beta)$$
while $f(T)\beta$ is in $W$ and $W$ is invariant under $T$. Take $\alpha = \beta$. ∎

Theorem 11. Let $T$ be a linear operator on the finite-dimensional vector space $V$. A necessary and sufficient condition that $T$ be semi-simple is that the minimal polynomial $p$ for $T$ be of the form $p = p_1 \cdots p_k$, where $p_1, \ldots, p_k$ are distinct irreducible polynomials over the scalar field $F$.

Proof. Suppose $T$ is semi-simple. We shall show that no irreducible polynomial is repeated in the prime factorization of the minimal polynomial $p$. Suppose the contrary. Then there is some non-scalar monic polynomial $g$ such that $g^2$ divides $p$. Let $W$ be the null space of the operator $g(T)$. Then $W$ is invariant under $T$. Now $p = g^2h$ for some polynomial $h$. Since $g$ is not a scalar polynomial, the operator $g(T)h(T)$ is not the zero operator, and there is some vector $\beta$ in $V$ such that $g(T)h(T)\beta \ne 0$, i.e., $(gh)\beta \ne 0$. Now $(gh)\beta$ is in the subspace $W$, since $g(gh\beta) = g^2h\beta = p\beta = 0$. But there is no vector $\alpha$ in $W$ such that $gh\beta = gh\alpha$; for, if $\alpha$ is in $W$,
$$(gh)\alpha = (hg)\alpha = h(g\alpha) = h(0) = 0.$$
Page 273 :
Thus, $W$ cannot have a complementary $T$-invariant subspace, contradicting the hypothesis that $T$ is semi-simple.
Now suppose the prime factorization of $p$ is $p = p_1 \cdots p_k$, where $p_1, \ldots, p_k$ are distinct irreducible (non-scalar) monic polynomials. Let $W$ be a subspace of $V$ which is invariant under $T$. We shall prove that $W$ has a complementary $T$-invariant subspace. Let $V = W_1 \oplus \cdots \oplus W_k$ be the primary decomposition for $T$, i.e., let $W_j$ be the null space of $p_j(T)$. Let $T_j$ be the linear operator induced on $W_j$ by $T$, so that the minimal polynomial for $T_j$ is the prime $p_j$. Now $W \cap W_j$ is a subspace of $W_j$ which is invariant under $T_j$ (or under $T$). By the last lemma, there is a subspace $V_j$ of $W_j$ such that $W_j = (W \cap W_j) \oplus V_j$ and $V_j$ is invariant under $T_j$ (and hence under $T$). Then we have
$$V = (W \cap W_1) \oplus V_1 \oplus \cdots \oplus (W \cap W_k) \oplus V_k = (W \cap W_1) \oplus \cdots \oplus (W \cap W_k) \oplus V_1 \oplus \cdots \oplus V_k.$$
By the first lemma above, $W = (W \cap W_1) \oplus \cdots \oplus (W \cap W_k)$, so that if $W' = V_1 \oplus \cdots \oplus V_k$, then $V = W \oplus W'$ and $W'$ is invariant under $T$. ∎

Corollary. If $T$ is a linear operator on a finite-dimensional vector space over an algebraically closed field, then $T$ is semi-simple if and only if $T$ is diagonalizable.

Proof. If the scalar field $F$ is algebraically closed, the monic primes over $F$ are the polynomials $x - c$. In this case, $T$ is semi-simple if and only if the minimal polynomial for $T$ is $p = (x - c_1) \cdots (x - c_k)$, where $c_1, \ldots, c_k$ are distinct elements of $F$. This is precisely the criterion for $T$ to be diagonalizable, which we established in Chapter 6. ∎

We should point out that $T$ is semi-simple if and only if there is some polynomial $f$, which is a product of distinct primes, such that $f(T) = 0$. This is only superficially different from the condition that the minimal polynomial be a product of distinct primes.
We turn now to expressing a linear operator as the sum of a semi-simple operator and a nilpotent operator which commute. In this, we shall restrict the scalar field to a subfield of the complex numbers. The informed reader will see that what is important is that the field $F$ be a field of characteristic zero, that is, that for each positive integer $n$ the sum $1 + \cdots + 1$ ($n$ times) in $F$ should not be 0. For a polynomial $f$ over $F$, we denote by $f^{(k)}$ the $k$th formal derivative of $f$. In other words, $f^{(k)} = D^kf$, where $D$ is the differentiation operator on the space of polynomials. If $g$ is another polynomial, $f(g)$ denotes the result of substituting $g$ in $f$, i.e., the polynomial obtained by applying $f$ to the element $g$ in the linear algebra $F[x]$.
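The criterion of the corollary, and the remark that semi-simplicity amounts to $f(T) = 0$ for some product of distinct primes $f$, can be checked on a concrete example. The sketch below is not part of the text; it assumes the sympy library and uses the 'rotation through 90 degrees' matrix as an illustration. Its polynomial $x^2 + 1$ is irreducible over $R$, so the operator on $R^2$ is semi-simple by the second lemma above, while over the complex numbers the matrix is similar to a diagonal matrix, as Theorem 12 on the following pages asserts.

import sympy as sp

x = sp.symbols('x')
A = sp.Matrix([[0, -1], [1, 0]])       # rotation through 90 degrees, acting on R^2

p = A.charpoly(x).as_expr()            # x**2 + 1; here it is also the minimal polynomial
print(sp.factor(p))                    # stays x**2 + 1: no real root, irreducible over R
print(sp.gcd(p, sp.diff(p, x)))        # 1: p is a product of distinct primes
P, D = A.diagonalize()                 # over C the matrix is similar to diag(-i, i)
print(D)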
Page 274 :
Lemma (Taylor's Formula). Let $F$ be a field of characteristic zero and let $g$ and $h$ be polynomials over $F$. If $f$ is any polynomial over $F$ with $\deg f \le n$, then
$$f(g) = f(h) + f^{(1)}(h)(g - h) + \frac{f^{(2)}(h)}{2!}(g - h)^2 + \cdots + \frac{f^{(n)}(h)}{n!}(g - h)^n.$$

Proof. What we are proving is a generalized Taylor formula. The reader is probably used to seeing the special case in which $h = c$, a scalar polynomial, and $g = x$. Then the formula says
$$f = f(x) = f(c) + f^{(1)}(c)(x - c) + \frac{f^{(2)}(c)}{2!}(x - c)^2 + \cdots + \frac{f^{(n)}(c)}{n!}(x - c)^n.$$
The proof of the general formula is just an application of the binomial theorem
$$(a + b)^k = a^k + ka^{k-1}b + \frac{k(k-1)}{2}a^{k-2}b^2 + \cdots + b^k.$$
For the reader should see that, since substitution and differentiation are linear processes, one need only prove the formula when $f = x^k$. The formula for $f = \sum_k c_kx^k$ follows by a linear combination. In the case $f = x^k$ with $k \le n$, the formula says
$$g^k = h^k + kh^{k-1}(g - h) + \frac{k(k-1)}{2}h^{k-2}(g - h)^2 + \cdots + (g - h)^k$$
which is just the binomial expansion of
$$g^k = [h + (g - h)]^k. \quad ∎$$

Lemma. Let $F$ be a subfield of the complex numbers, let $f$ be a polynomial over $F$, and let $f'$ be the derivative of $f$. The following are equivalent:
(a) $f$ is the product of distinct polynomials irreducible over $F$.
(b) $f$ and $f'$ are relatively prime.
(c) As a polynomial with complex coefficients, $f$ has no repeated root.

Proof. Let us first prove that (a) and (b) are equivalent statements about $f$. Suppose in the prime factorization of $f$ over the field $F$ that some (non-scalar) prime polynomial $p$ is repeated. Then $f = p^2h$ for some $h$ in $F[x]$. Then
$$f' = p^2h' + 2pp'h$$
and $p$ is also a divisor of $f'$. Hence $f$ and $f'$ are not relatively prime. We conclude that (b) implies (a).
Now suppose $f = p_1 \cdots p_k$, where $p_1, \ldots, p_k$ are distinct non-scalar irreducible polynomials over $F$. Let $f_j = f/p_j$. Then
$$f' = p_1'f_1 + p_2'f_2 + \cdots + p_k'f_k.$$
Page 275 :
Let $p$ be a prime polynomial which divides both $f$ and $f'$. Then $p = p_i$ for some $i$. Now $p_i$ divides $f_j$ for $j \ne i$, and since $p_i$ also divides
$$f' = \sum_{j=1}^{k} p_j'f_j$$
we see that $p_i$ must divide $p_i'f_i$. Therefore $p_i$ divides either $f_i$ or $p_i'$. But $p_i$ does not divide $f_i$ since $p_1, \ldots, p_k$ are distinct. So $p_i$ divides $p_i'$. This is not possible, since $p_i'$ has degree one less than the degree of $p_i$. We conclude that no prime divides both $f$ and $f'$, or that $(f, f') = 1$.
To see that statement (c) is equivalent to (a) and (b), we need only observe the following: Suppose $f$ and $g$ are polynomials over $F$, a subfield of the complex numbers. We may also regard $f$ and $g$ as polynomials with complex coefficients. The statement that $f$ and $g$ are relatively prime as polynomials over $F$ is equivalent to the statement that $f$ and $g$ are relatively prime as polynomials over the field of complex numbers. We leave the proof of this as an exercise. We use this fact with $g = f'$. Note that (c) is just (a) when $f$ is regarded as a polynomial over the field of complex numbers. Thus (b) and (c) are equivalent, by the same argument that we used above. ∎

We can now prove a theorem which makes the relation between semi-simple operators and diagonalizable operators even more apparent.

Theorem 12. Let $F$ be a subfield of the field of complex numbers, let $V$ be a finite-dimensional vector space over $F$, and let $T$ be a linear operator on $V$. Let $\mathcal{B}$ be an ordered basis for $V$ and let $A$ be the matrix of $T$ in the ordered basis $\mathcal{B}$. Then $T$ is semi-simple if and only if the matrix $A$ is similar over the field of complex numbers to a diagonal matrix.

Proof. Let $p$ be the minimal polynomial for $T$. According to Theorem 11, $T$ is semi-simple if and only if $p = p_1 \cdots p_k$ where $p_1, \ldots, p_k$ are distinct irreducible polynomials over $F$. By the last lemma, we see that $T$ is semi-simple if and only if $p$ has no repeated complex root.
Now $p$ is also the minimal polynomial for the matrix $A$. We know that $A$ is similar over the field of complex numbers to a diagonal matrix if and only if its minimal polynomial has no repeated complex root. This proves the theorem. ∎

Theorem 13. Let $F$ be a subfield of the field of complex numbers, let $V$ be a finite-dimensional vector space over $F$, and let $T$ be a linear operator on $V$. There is a semi-simple operator $S$ on $V$ and a nilpotent operator $N$ on $V$ such that
(i) $T = S + N$;
(ii) $SN = NS$.
Page 276 :
Furthermore, the semi-simple $S$ and nilpotent $N$ satisfying (i) and (ii) are unique, and each is a polynomial in $T$.

Proof. Let $p_1^{r_1} \cdots p_k^{r_k}$ be the prime factorization of the minimal polynomial for $T$, and let $f = p_1 \cdots p_k$. Let $r$ be the greatest of the positive integers $r_1, \ldots, r_k$. Then the polynomial $f$ is a product of distinct primes, $f^r$ is divisible by the minimal polynomial for $T$, and so
$$f(T)^r = 0.$$
We are going to construct a sequence of polynomials $g_0, g_1, g_2, \ldots$ such that
$$f\Big(x - \sum_{j=0}^{n} g_jf^j\Big)$$
is divisible by $f^{n+1}$, $n = 0, 1, 2, \ldots$. We take $g_0 = 0$; then $f(x - g_0f^0) = f(x) = f$ is divisible by $f$. Suppose we have chosen $g_0, \ldots, g_{n-1}$. Let
$$h = x - \sum_{j=0}^{n-1} g_jf^j$$
so that, by assumption, $f(h)$ is divisible by $f^n$. We want to choose $g_n$ so that
$$f(h - g_nf^n)$$
is divisible by $f^{n+1}$. We apply the general Taylor formula and obtain
$$f(h - g_nf^n) = f(h) - g_nf^nf'(h) + f^{n+1}b$$
where $b$ is some polynomial. By assumption $f(h) = qf^n$. Thus, we see that to have $f(h - g_nf^n)$ divisible by $f^{n+1}$ we need only choose $g_n$ in such a way that $(q - g_nf')$ is divisible by $f$. This can be done, because $f$ has no repeated prime factors and so $f$ and $f'$ are relatively prime. If $a$ and $e$ are polynomials such that $af + ef' = 1$, and if we let $g_n = eq$, then $q - g_nf'$ is divisible by $f$.
Now we have a sequence $g_0, g_1, \ldots$ such that $f^{n+1}$ divides
$$f\Big(x - \sum_{j=0}^{n} g_jf^j\Big).$$
Let us take $n = r - 1$; then, since $f(T)^r = 0$,
$$f\Big(T - \sum_{j=0}^{r-1} g_j(T)f(T)^j\Big) = 0.$$
Let
$$N = \sum_{j=1}^{r-1} g_j(T)f(T)^j = \sum_{j=0}^{r-1} g_j(T)f(T)^j.$$
Since $\sum_{j=1}^{r-1} g_jf^j$ is divisible by $f$, we see that $N^r = 0$ and $N$ is nilpotent. Let $S = T - N$. Then $f(S) = f(T - N) = 0$. Since $f$ has distinct prime factors, $S$ is semi-simple.
Now we have $T = S + N$ where $S$ is semi-simple, $N$ is nilpotent, and each is a polynomial in $T$. To prove the uniqueness statement, we
Page 277 :
shall pass from the scalar field $F$ to the field of complex numbers. Let $\mathcal{B}$ be some ordered basis for the space $V$. Then we have
$$[T]_\mathcal{B} = [S]_\mathcal{B} + [N]_\mathcal{B}$$
while $[S]_\mathcal{B}$ is diagonalizable over the complex numbers and $[N]_\mathcal{B}$ is nilpotent. This diagonalizable matrix and nilpotent matrix which commute are uniquely determined, as we have shown in Chapter 6. ∎

Exercises

1. If $N$ is a nilpotent linear operator on $V$, show that for any polynomial $f$ the semi-simple part of $f(N)$ is a scalar multiple of the identity operator ($F$ a subfield of $C$).

2. Let $F$ be a subfield of the complex numbers, $V$ a finite-dimensional vector space over $F$, and $T$ a semi-simple linear operator on $V$. If $f$ is any polynomial over $F$, prove that $f(T)$ is semi-simple.

3. Let $T$ be a linear operator on a finite-dimensional space over a subfield of $C$. Prove that $T$ is semi-simple if and only if the following is true: If $f$ is a polynomial and $f(T)$ is nilpotent, then $f(T) = 0$.
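The proof of Theorem 13 is constructive, and the construction can be carried out mechanically. The sketch below is not part of the text: it assumes the sympy library, uses the characteristic polynomial in place of the minimal polynomial (harmless, since the two have the same prime factors and any $r$ with $p$ dividing $f^r$ will do), and the matrix A and the helper poly_at are illustrative only. By the uniqueness statement, the computed S and N must be the semi-simple and nilpotent parts of A.

import sympy as sp

x = sp.symbols('x')
A = sp.Matrix([[3, 1, 0],
               [0, 3, 0],
               [0, 0, 2]])                        # example matrix; S should come out as diag(3, 3, 2)

p = A.charpoly(x).as_expr()                       # (x - 3)**2 * (x - 2), and p(A) = 0
f = sp.quo(p, sp.gcd(p, sp.diff(p, x)), x)        # square-free part of p: (x - 3)*(x - 2)
r = int(sp.degree(p, x))                          # an r large enough that p divides f**r

a, e, one = sp.gcdex(f, sp.diff(f, x), x)         # a*f + e*f' = 1, possible since f is square-free

# Build s(x) = x - g_1 f - g_2 f**2 - ... with f(s) divisible by f**r, as in the proof.
s = x
for n in range(1, r):
    q = sp.quo(sp.expand(f.subs(x, s)), f**n, x)  # f(s) = q*f**n by the inductive hypothesis
    g_n = sp.rem(sp.expand(e * q), f, x)          # then q - g_n*f' is divisible by f
    s = sp.expand(s - g_n * f**n)

def poly_at(expr, M):
    """Evaluate the polynomial expr (in x) at the square matrix M (constant c -> c*I)."""
    out = sp.zeros(M.rows, M.cols)
    for c in sp.Poly(expr, x).all_coeffs():       # Horner evaluation, highest degree first
        out = out * M + c * sp.eye(M.rows)
    return out

S = poly_at(s, A)                                 # semi-simple part, a polynomial in A
N = A - S                                         # nilpotent part
print(S)                                          # diag(3, 3, 2)
print(N**r)                                       # the zero matrix: N is nilpotent
print(S*N - N*S)                                  # the zero matrix: S and N commute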
Page 278 :
8. Inner Product Spaces

8.1. Inner Products

Throughout this chapter we consider only real or complex vector spaces, that is, vector spaces over the field of real numbers or the field of complex numbers. Our main object is to study vector spaces in which it makes sense to speak of the 'length' of a vector and of the 'angle' between two vectors. We shall do this by studying a certain type of scalar-valued function on pairs of vectors, known as an inner product. One example of an inner product is the scalar or dot product of vectors in $R^3$. The scalar product of the vectors
$$\alpha = (x_1, x_2, x_3) \quad \text{and} \quad \beta = (y_1, y_2, y_3)$$
in $R^3$ is the real number
$$(\alpha|\beta) = x_1y_1 + x_2y_2 + x_3y_3.$$
Geometrically, this dot product is the product of the length of $\alpha$, the length of $\beta$, and the cosine of the angle between $\alpha$ and $\beta$. It is therefore possible to define the geometric concepts of 'length' and 'angle' in $R^3$ by means of the algebraically defined scalar product.
An inner product on a vector space is a function with properties similar to the dot product in $R^3$, and in terms of such an inner product one can also define 'length' and 'angle.' Our comments about the general notion of angle will be restricted to the concept of perpendicularity (or orthogonality) of vectors. In this first section we shall say what an inner product is, consider some particular examples, and establish a few basic
Page 279 :
properties of inner products. Then we turn to the task of discussing length and orthogonality.

Definition. Let $F$ be the field of real numbers or the field of complex numbers, and $V$ a vector space over $F$. An inner product on $V$ is a function which assigns to each ordered pair of vectors $\alpha, \beta$ in $V$ a scalar $(\alpha|\beta)$ in $F$ in such a way that for all $\alpha, \beta, \gamma$ in $V$ and all scalars $c$
(a) $(\alpha + \beta|\gamma) = (\alpha|\gamma) + (\beta|\gamma)$;
(b) $(c\alpha|\beta) = c(\alpha|\beta)$;
(c) $(\beta|\alpha) = \overline{(\alpha|\beta)}$, the bar denoting complex conjugation;
(d) $(\alpha|\alpha) > 0$ if $\alpha \ne 0$.

It should be observed that conditions (a), (b), and (c) imply that
(e) $(\alpha|c\beta + \gamma) = \bar c(\alpha|\beta) + (\alpha|\gamma)$.
One other point should be made. When $F$ is the field $R$ of real numbers, the complex conjugates appearing in (c) and (e) are superfluous; however, in the complex case they are necessary for the consistency of the conditions. Without these complex conjugates, we would have the contradiction:
$$(\alpha|\alpha) > 0 \quad \text{and} \quad (i\alpha|i\alpha) = -1(\alpha|\alpha) > 0.$$
In the examples that follow and throughout the chapter, $F$ is either the field of real numbers or the field of complex numbers.

Example 1. On $F^n$ there is an inner product which we call the standard inner product. It is defined on $\alpha = (x_1, \ldots, x_n)$ and $\beta = (y_1, \ldots, y_n)$ by
(8-1) $(\alpha|\beta) = \sum_j x_j\bar y_j$.
When $F = R$, this may also be written
$$(\alpha|\beta) = \sum_j x_jy_j.$$
In the real case, the standard inner product is often called the dot or scalar product and denoted by $\alpha \cdot \beta$.

Example 2. For $\alpha = (x_1, x_2)$ and $\beta = (y_1, y_2)$ in $R^2$, let
$$(\alpha|\beta) = x_1y_1 - x_2y_1 - x_1y_2 + 4x_2y_2.$$
Since $(\alpha|\alpha) = (x_1 - x_2)^2 + 3x_2^2$, it follows that $(\alpha|\alpha) > 0$ if $\alpha \ne 0$. Conditions (a), (b), and (c) of the definition are easily verified.

Example 3. Let $V$ be $F^{n\times n}$, the space of all $n \times n$ matrices over $F$. Then $V$ is isomorphic to $F^{n^2}$ in a natural way. It therefore follows from Example 1 that the equation
$$(A|B) = \sum_{j,k} A_{jk}\bar B_{jk}$$
Page 280 :
defines an inner product on $V$. Furthermore, if we introduce the conjugate transpose matrix $B^*$, where $B^*_{kj} = \bar B_{jk}$, we may express this inner product on $F^{n\times n}$ in terms of the trace function:
$$(A|B) = \operatorname{tr}(AB^*) = \operatorname{tr}(B^*A).$$
For
$$\operatorname{tr}(AB^*) = \sum_j (AB^*)_{jj} = \sum_j\sum_k A_{jk}B^*_{kj} = \sum_j\sum_k A_{jk}\bar B_{jk}.$$

Example 4. Let $F^{n\times 1}$ be the space of $n \times 1$ (column) matrices over $F$, and let $Q$ be an $n \times n$ invertible matrix over $F$. For $X$, $Y$ in $F^{n\times 1}$ set
$$(X|Y) = Y^*Q^*QX.$$
We are identifying the $1 \times 1$ matrix on the right with its single entry. When $Q$ is the identity matrix, this inner product is essentially the same as that in Example 1; we call it the standard inner product on $F^{n\times 1}$. The reader should note that the terminology 'standard inner product' is used in two special contexts. For a general finite-dimensional vector space over $F$, there is no obvious inner product that one may call standard.

Example 5. Let $V$ be the vector space of all continuous complex-valued functions on the unit interval, $0 \le t \le 1$. Let
$$(f|g) = \int_0^1 f(t)\overline{g(t)}\,dt.$$
The reader is probably more familiar with the space of real-valued continuous functions on the unit interval, and for this space the complex conjugate on $g$ may be omitted.

Example 6. This is really a whole class of examples. One may construct new inner products from a given one by the following method. Let $V$ and $W$ be vector spaces over $F$ and suppose $(\ |\ )$ is an inner product on $W$. If $T$ is a non-singular linear transformation from $V$ into $W$, then the equation
$$p_T(\alpha, \beta) = (T\alpha|T\beta)$$
defines an inner product $p_T$ on $V$. The inner product in Example 4 is a special case of this situation. The following are also special cases.
(a) Let $V$ be a finite-dimensional vector space, and let
$$\mathcal{B} = \{\alpha_1, \ldots, \alpha_n\}$$
Page 281 :
Inner Products, , Sec. 8.1, , be an ordered basis for V. Let 61, . . . , e, be the standard basis vectors in, Fn, and let T be the linear transformation, from V into Fn such that Tai =, Ej,j = 1,. . ., n. In other words, let T be the ‘natural’, isomorphism of V, onto Fn that is determined by a. If we take the standard inner product, on F”, then, , Thus, for any basis for V there is an inner product on V with the property, (aj]ab) = &; in fact, it is easy to show that there is exactly one such, inner product. Later we shall show that every inner product on V is, determined by some basis @ in the above manner., (b) We look again at Example 5 and take V = W, the space of, continuous functions on the unit interval., Let T be the linear operator, ‘multiplication, by t,’ that is, (Tf) (t) = tf(t), 0 5 t I 1. It is easy to see, that T is linear. Also T is non-singular; for suppose Tf = 0. Then tf(t) = 0, for 0 < t 5 1; hence f(t) = 0 for t > 0. Since f is continuous, we have, f(0) = 0 as well, or f = 0. Now using the inner product of Example 5,, we construct a new inner product on V by setting, m(f,, , 9) = [, , (U)(t)(Q)(t), , dt, , = / o1f(t)g(t)Pczt., We turn now to some general observations, about inner products., Suppose V is a complex vector space with an inner product. Then for all, cu,pin V, , (alP) = Re (c#, , + i Im (aIP), , where Re (~~10) and Im (cx]~) are the real and imaginary, parts of the, complex number (Q/P). If z is a complex number, then Im (x) = Re (-iz)., It follows that, Im (~~10) = Re [-;(a]p)], Thus the inner product, accordance with, 63-2), , blP), , = Re (ar]$)., , is completely, , determined, , = Re (c#, , + i Re (&P)., , by its ‘real part’, , in, , Occasionally it is very useful to know that an inner product on a real, or complex vector space is determined by another function, the so-called, quadratic form determined by the inner product. To define it, we first, denote the positive square root of (CX]CX)by ]]a]]; ]]a]] is called the norm, of O(with respect to the inner product. By looking at the standard inner, products in R1, Cl, R2, and R3, the reader should be able to convince himself that it is appropriate, to think of the norm of a as the ‘length’ or, form, determined by the inner product, ‘magnitude’ of CL The quadratic, , 273
Page 282 :
is the function that assigns to each vector $\alpha$ the scalar $\|\alpha\|^2$. It follows from the properties of the inner product that
$$\|\alpha \pm \beta\|^2 = \|\alpha\|^2 \pm 2\operatorname{Re}(\alpha|\beta) + \|\beta\|^2$$
for all vectors $\alpha$ and $\beta$. Thus in the real case
(8-3) $(\alpha|\beta) = \frac{1}{4}\|\alpha + \beta\|^2 - \frac{1}{4}\|\alpha - \beta\|^2$.
In the complex case we use (8-2) to obtain the more complicated expression
(8-4) $(\alpha|\beta) = \frac{1}{4}\|\alpha + \beta\|^2 - \frac{1}{4}\|\alpha - \beta\|^2 + \frac{i}{4}\|\alpha + i\beta\|^2 - \frac{i}{4}\|\alpha - i\beta\|^2$.
Equations (8-3) and (8-4) are called the polarization identities. Note that (8-4) may also be written as follows:
$$(\alpha|\beta) = \frac{1}{4}\sum_{n=1}^{4} i^n\|\alpha + i^n\beta\|^2.$$
The properties obtained above hold for any inner product on a real or complex vector space $V$, regardless of its dimension. We turn now to the case in which $V$ is finite-dimensional. As one might guess, an inner product on a finite-dimensional space may always be described in terms of an ordered basis by means of a matrix.
Suppose that $V$ is finite-dimensional, that
$$\mathcal{B} = \{\alpha_1, \ldots, \alpha_n\}$$
is an ordered basis for $V$, and that we are given a particular inner product on $V$; we shall show that the inner product is completely determined by the values
(8-5) $G_{jk} = (\alpha_k|\alpha_j)$
it assumes on pairs of vectors in $\mathcal{B}$. If $\alpha = \sum_k x_k\alpha_k$ and $\beta = \sum_j y_j\alpha_j$, then
$$(\alpha|\beta) = \Big(\sum_k x_k\alpha_k\Big|\beta\Big) = \sum_k x_k(\alpha_k|\beta) = \sum_k x_k\sum_j \bar y_j(\alpha_k|\alpha_j) = \sum_{j,k} \bar y_jG_{jk}x_k = Y^*GX$$
where $X$, $Y$ are the coordinate matrices of $\alpha$, $\beta$ in the ordered basis $\mathcal{B}$, and $G$ is the matrix with entries $G_{jk} = (\alpha_k|\alpha_j)$. We call $G$ the matrix of the inner product in the ordered basis $\mathcal{B}$. It follows from (8-5)
Page 283 :
that $G$ is hermitian, i.e., that $G = G^*$; however, $G$ is a rather special kind of hermitian matrix. For $G$ must satisfy the additional condition
(8-6) $X^*GX > 0$, $\quad X \ne 0$.
In particular, $G$ must be invertible. For otherwise there exists an $X \ne 0$ such that $GX = 0$, and for any such $X$, (8-6) is impossible. More explicitly, (8-6) says that for any scalars $x_1, \ldots, x_n$, not all of which are 0,
(8-7) $\sum_{j,k} \bar x_jG_{jk}x_k > 0$.
From this we see immediately that each diagonal entry of $G$ must be positive; however, this condition on the diagonal entries is by no means sufficient to insure the validity of (8-6). Sufficient conditions for the validity of (8-6) will be given later.
The above process is reversible; that is, if $G$ is any $n \times n$ matrix over $F$ which satisfies (8-6) and the condition $G = G^*$, then $G$ is the matrix in the ordered basis $\mathcal{B}$ of an inner product on $V$. This inner product is given by the equation
$$(\alpha|\beta) = Y^*GX$$
where $X$ and $Y$ are the coordinate matrices of $\alpha$ and $\beta$ in the ordered basis $\mathcal{B}$.

Exercises

1. Let $V$ be a vector space and $(\ |\ )$ an inner product on $V$.
(a) Show that $(0|\beta) = 0$ for all $\beta$ in $V$.
(b) Show that if $(\alpha|\beta) = 0$ for all $\beta$ in $V$, then $\alpha = 0$.

2. Let $V$ be a vector space over $F$. Show that the sum of two inner products on $V$ is an inner product on $V$. Is the difference of two inner products an inner product? Show that a positive multiple of an inner product is an inner product.

3. Describe explicitly all inner products on $R^1$ and on $C^1$.

4. Verify that the standard inner product on $F^n$ is an inner product.

5. Let $(\ |\ )$ be the standard inner product on $R^2$.
(a) Let $\alpha = (1, 2)$, $\beta = (-1, 1)$. If $\gamma$ is a vector such that $(\alpha|\gamma) = -1$ and $(\beta|\gamma) = 3$, find $\gamma$.
(b) Show that for any $\alpha$ in $R^2$ we have $\alpha = (\alpha|\epsilon_1)\epsilon_1 + (\alpha|\epsilon_2)\epsilon_2$.

6. Let $(\ |\ )$ be the standard inner product on $R^2$, and let $T$ be the linear operator $T(x_1, x_2) = (-x_2, x_1)$. Now $T$ is 'rotation through 90°' and has the property that $(\alpha|T\alpha) = 0$ for all $\alpha$ in $R^2$. Find all inner products $[\ |\ ]$ on $R^2$ such that $[\alpha|T\alpha] = 0$ for each $\alpha$.

7. Let $(\ |\ )$ be the standard inner product on $C^2$. Prove that there is no non-zero linear operator on $C^2$ such that $(\alpha|T\alpha) = 0$ for every $\alpha$ in $C^2$. Generalize.
Page 284 :
276, , Inner Product Spaces, , Chap. 8, , 8. Let A be a 2 X 2 matrix, , For X, Y in Rwl let, , with real entries., f.4(X,, , Show that f* is an inner product, and det A > 0., , Y) = Y’AX., , on R 2x1 if and only if A = Al, Al1 > 0, An > 0,, , 9. Let V be a real or complex vector space with an inner product. Show that the, law, quadratic, form determined, by the inner product satisfies the parallelogram, lb, , + PIP + lb, , -, , PIP =, , 211412, + w3II’~, , 10. Let ( 1 ) be the inner product, on R2 defined in Example, the standard ordered basis for R2. Find the matrix of this inner, to 63., 11. Show that, , 2, and let @ be, product relative, , the formula, aibk, (z, , ajx’lz, , bkxk), , k, , i, , =, , 2, , j.k3, , ., , +, , k, , +, , 1, , defines an inner product on the space R[z] of polynomials, over the field R. Let W, be the subspace of polynomials, of degree less than or equal to n. Restrict the above, inner product to W, and find the matrix of this inner product on W, relative to the, ordered basis (1, x, x2, . . . , x*}. (Hint: To show that the formula defines an inner, product, observe that, , (fl 9) = Jolfm(t) dt, and work with the integral.), vector space and let @ = {ai, . . . , ah} be a, 12. Let V be a finite-dimensional, basis for V. Let ( 1 ) be an inner product on V. If cl, . . . , c, are any n scalars,, show that there is exactly one vector a! in V such that (alcui) = ci, j = 1, . . . , n., vector space. A function, J from V into V is called a, if J(a + fl) = J(a) + J(p), J(m) = ~J((Y), and J(J(cw)) = cy, for, show that:, all scalars c and all cy, p in V. If J is a conjugation, (a) The set W of all ar in V such that Ja = cr is a vector space over R with, respect to the operations defined in V., (b) For each (Y in V there exist unique vectors ,B, y in W such that (Y = /3 + i-r., , 13. Let, , V be a complex, , conjugation, , 14. Let V be a complex vector space and W a subset of V with the following, properties :, (a) W is a real vector space with respect to the operations, defined in V., (b) For each a in V there exist unique vectors /3, y in W such that (Y = ,B + i-r., Show that the equation Jar = /3 - i-r defines a conjugation, on V such that Ja = a, if and only if (Y belongs to W, and show also that J is the only conjugation, on V, with this property., 15. Find all conjugations, , on Cl and C2., , 16. Let W be a finite-dimensional, Show that W satisfies condition, is also a basis of V., , real subspace of a complex vector space V., (b) of Exercise 14 if and only if every basis of W
Page 285 :
17. Let $V$ be a complex vector space, $J$ a conjugation on $V$, $W$ the set of $\alpha$ in $V$ such that $J\alpha = \alpha$, and $f$ an inner product on $W$. Show that:
(a) There is a unique inner product $g$ on $V$ such that $g(\alpha, \beta) = f(\alpha, \beta)$ for all $\alpha, \beta$ in $W$;
(b) $g(J\alpha, J\beta) = g(\beta, \alpha)$ for all $\alpha, \beta$ in $V$.
What does part (a) say about the relation between the standard inner products on $R^1$ and $C^1$, or on $R^n$ and $C^n$?

8.2. Inner Product Spaces

Now that we have some idea of what an inner product is, we shall turn our attention to what can be said about the combination of a vector space and some particular inner product on it. Specifically, we shall establish the basic properties of the concepts of 'length' and 'orthogonality' which are imposed on the space by the inner product.

Definition. An inner product space is a real or complex vector space, together with a specified inner product on that space.

A finite-dimensional real inner product space is often called a Euclidean space. A complex inner product space is often referred to as a unitary space.

Theorem 1. If $V$ is an inner product space, then for any vectors $\alpha, \beta$ in $V$ and any scalar $c$
(i) $\|c\alpha\| = |c|\,\|\alpha\|$;
(ii) $\|\alpha\| > 0$ for $\alpha \ne 0$;
(iii) $|(\alpha|\beta)| \le \|\alpha\|\,\|\beta\|$;
(iv) $\|\alpha + \beta\| \le \|\alpha\| + \|\beta\|$.

Proof. Statements (i) and (ii) follow almost immediately from the various definitions involved. The inequality in (iii) is clearly valid when $\alpha = 0$. If $\alpha \ne 0$, put
$$\gamma = \beta - \frac{(\beta|\alpha)}{\|\alpha\|^2}\,\alpha.$$
Then $(\gamma|\alpha) = 0$ and
$$0 \le \|\gamma\|^2 = \Big(\beta - \frac{(\beta|\alpha)}{\|\alpha\|^2}\alpha\ \Big|\ \beta - \frac{(\beta|\alpha)}{\|\alpha\|^2}\alpha\Big) = (\beta|\beta) - \frac{(\beta|\alpha)(\alpha|\beta)}{\|\alpha\|^2} = \|\beta\|^2 - \frac{|(\alpha|\beta)|^2}{\|\alpha\|^2}.$$
Page 286 :
Hence $|(\alpha|\beta)|^2 \le \|\alpha\|^2\|\beta\|^2$. Now using (iii) we find that
$$\|\alpha + \beta\|^2 = \|\alpha\|^2 + (\alpha|\beta) + (\beta|\alpha) + \|\beta\|^2 = \|\alpha\|^2 + 2\operatorname{Re}(\alpha|\beta) + \|\beta\|^2 \le \|\alpha\|^2 + 2\|\alpha\|\,\|\beta\| + \|\beta\|^2 = (\|\alpha\| + \|\beta\|)^2.$$
Thus $\|\alpha + \beta\| \le \|\alpha\| + \|\beta\|$. ∎

The inequality in (iii) is called the Cauchy-Schwarz inequality. It has a wide variety of applications. The proof shows that if (for example) $\alpha$ is non-zero, then $|(\alpha|\beta)| < \|\alpha\|\,\|\beta\|$ unless
$$\beta = \frac{(\beta|\alpha)}{\|\alpha\|^2}\,\alpha.$$
Thus, equality occurs in (iii) if and only if $\alpha$ and $\beta$ are linearly dependent.

Example 7. If we apply the Cauchy-Schwarz inequality to the inner products given in Examples 1, 2, 3, and 5, we obtain the following:
(a) $\big|\sum x_k\bar y_k\big| \le \big(\sum |x_k|^2\big)^{1/2}\big(\sum |y_k|^2\big)^{1/2}$
(b) $|x_1y_1 - x_2y_1 - x_1y_2 + 4x_2y_2| \le ((x_1 - x_2)^2 + 3x_2^2)^{1/2}((y_1 - y_2)^2 + 3y_2^2)^{1/2}$
(c) $|\operatorname{tr}(AB^*)| \le (\operatorname{tr}(AA^*))^{1/2}(\operatorname{tr}(BB^*))^{1/2}$
(d) $\Big|\int_0^1 f(x)\overline{g(x)}\,dx\Big| \le \Big(\int_0^1 |f(x)|^2\,dx\Big)^{1/2}\Big(\int_0^1 |g(x)|^2\,dx\Big)^{1/2}$.

Definitions. Let $\alpha$ and $\beta$ be vectors in an inner product space $V$. Then $\alpha$ is orthogonal to $\beta$ if $(\alpha|\beta) = 0$; since this implies $\beta$ is orthogonal to $\alpha$, we often simply say that $\alpha$ and $\beta$ are orthogonal. If $S$ is a set of vectors in $V$, $S$ is called an orthogonal set provided all pairs of distinct vectors in $S$ are orthogonal. An orthonormal set is an orthogonal set $S$ with the additional property that $\|\alpha\| = 1$ for every $\alpha$ in $S$.

The zero vector is orthogonal to every vector in $V$ and is the only vector with this property. It is appropriate to think of an orthonormal set as a set of mutually perpendicular vectors, each having length 1.

Example 8. The standard basis of either $R^n$ or $C^n$ is an orthonormal set with respect to the standard inner product.

Example 9. The vector $(x, y)$ in $R^2$ is orthogonal to $(-y, x)$ with respect to the standard inner product, for
$$((x, y)|(-y, x)) = -xy + yx = 0.$$
However, if $R^2$ is equipped with the inner product of Example 2, then $(x, y)$ and $(-y, x)$ are orthogonal if and only if
Page 287 :
Sec. 8.2, , Inner Product Spaces, y = 3 (-3, , f, , a,,., , EXAMPLE 10. Let V be CnXn, the space of complex n X n matrices,, and let Epq be the matrix whose only non-zero entry is a 1 in row p and, column p. Then the set of all such matrices Epq is orthonormal, with respect, to the inner product given in Example 3. For, (Ep$T), , = tr (Ep*Esr) = 6,, tr (Epr) = 6,,6,,., , EXAMPLE 11. Let V be the space of continuous complex-valued, (or, real-valued), functions on the interval 0 5 2 5 1 with the inner product, , (fls) = l,l f(x)s(x)f&c., Suppose fn(z) = fi, cos 2mx and that g,(x) = fi, sin 27rnx. Then, is, an, infinite, orthonormal, set., In, the, complex case,, U,fl, 91,fzt Q2, . . .>, we may also form the linear combinations, , & (fn + &J,, , n = 1,2,., , ..., , In this way we get a new orthonormal, set S which consists of all functions, of the form, n = fl, f2,., ..., h,(x) = e2rim,, The set S’ obtained from X by adjoining the constant function 1 is also, orthonormal., We assume here that the reader is familiar with the calculation of the integrals in question., The orthonormal, sets given in the examples above are all linearly, independent. We show now that this is necessarily the case., Theorem, , 2. An orthogonal, , set of non-zero vectors is linearly, , inde-, , pendent., Proof. Let X be a finite or infinite orthogonal, set of non-zero, vectors in a given inner product space. Suppose (~1,a2, . . . , (Y, are distinct, vectors in X and that, + * * . + cfna,., p = Clct!l + CZCYZ, , Then, , Since (CQ~CQ)# 0, it follows, , that, , 279
Page 288 :
$$c_k = \frac{(\beta|\alpha_k)}{\|\alpha_k\|^2}, \qquad 1 \le k \le m.$$
Thus when $\beta = 0$, each $c_k = 0$; so $S$ is an independent set. ∎

Corollary. If a vector $\beta$ is a linear combination of an orthogonal sequence of non-zero vectors $\alpha_1, \ldots, \alpha_m$, then $\beta$ is the particular linear combination
(8-8) $\beta = \sum_{k=1}^{m} \dfrac{(\beta|\alpha_k)}{\|\alpha_k\|^2}\,\alpha_k$.

This corollary follows from the proof of the theorem. There is another corollary which, although obvious, should be mentioned. If $\{\alpha_1, \ldots, \alpha_m\}$ is an orthogonal set of non-zero vectors in a finite-dimensional inner product space $V$, then $m \le \dim V$. This says that the number of mutually orthogonal directions in $V$ cannot exceed the algebraically defined dimension of $V$. The maximum number of mutually orthogonal directions in $V$ is what one would intuitively regard as the geometric dimension of $V$, and we have just seen that this is not greater than the algebraic dimension. The fact that these two dimensions are equal is a particular corollary of the next result.

Theorem 3. Let $V$ be an inner product space and let $\beta_1, \ldots, \beta_n$ be any independent vectors in $V$. Then one may construct orthogonal vectors $\alpha_1, \ldots, \alpha_n$ in $V$ such that for each $k = 1, 2, \ldots, n$ the set
$$\{\alpha_1, \ldots, \alpha_k\}$$
is a basis for the subspace spanned by $\beta_1, \ldots, \beta_k$.

Proof. The vectors $\alpha_1, \ldots, \alpha_n$ will be obtained by means of a construction known as the Gram-Schmidt orthogonalization process. First let $\alpha_1 = \beta_1$. The other vectors are then given inductively as follows: Suppose $\alpha_1, \ldots, \alpha_m$ ($1 \le m < n$) have been chosen so that for every $k$, $1 \le k \le m$,
$$\{\alpha_1, \ldots, \alpha_k\}$$
is an orthogonal basis for the subspace of $V$ that is spanned by $\beta_1, \ldots, \beta_k$. To construct the next vector $\alpha_{m+1}$, let
(8-9) $\alpha_{m+1} = \beta_{m+1} - \sum_{k=1}^{m} \dfrac{(\beta_{m+1}|\alpha_k)}{\|\alpha_k\|^2}\,\alpha_k$.
Then $\alpha_{m+1} \ne 0$. For otherwise $\beta_{m+1}$ is a linear combination of $\alpha_1, \ldots, \alpha_m$ and hence a linear combination of $\beta_1, \ldots, \beta_m$. Furthermore, if $1 \le j \le m$,
Page 289 :
Sec. 8.2, , Inner, , ProductSpaces, , Therefore, {cyI, . . . , a,+~} is an orthogonal, set consisting of m + 1 nonzero vectors in the subspace spanned by pl, . . . , pm+l. By Theorem 2,, it is a basis for this subspace. Thus the vectors 01~). . . , CY, may be constructed one after the other in accordance with (8-9). In particular,, when, n = 4, we have, , Corollary., normal basis., , Every finite-dimensional, , inner product space has an ortho-, , Proof. Let V be a finite-dimensional, inner product space and, ., ,, &), a, basis, for, V., Apply, the, Gram-Schmidt, process to construct, {Pl, . ., an orthogonal basis {CQ, . . . , (Y,}. Then to obtain an orthonormal, basis,, 1, simply replace each vector o(k by C%/([Q(kI\., One of the main advantages, which orthonormal, bases have over, arbitrary, bases is that computations, involving, coordinates are simpler., To indicate in general terms why this is true, suppose that V is a finitedimensional inner product space. Then, as in the last section, we may use, Equation, (8-5) to associate a matrix G with every ordered basis @, =, CC,}, of V. Using this matrix, { aI, * . . ,, Gjr, = (a&j),, we may compute inner products in terms of coordinates. If CRis an orthonormal basis, then G is the identity matrix, and for any scalars xj and yk, ‘7, , xjaj[T, , ykak), , =, , 2:, 3, , xjgj., , Thus in terms of an orthonormal, basis, the inner product in V looks like, the standard inner product in Fn., Although it is of limited practical use for computations,, it is interesting to note that the Gram-Schmidt, process may also be used to test, for linear dependence. For suppose &, . . . , & are linearly dependent, vectors in an inner product space V. To exclude a trivial case, assume, that PI f 0. Let m be the largest integer for which PI, . . . , Pm are independent. Then 1 < m < n. Let CY~,. . . , ay, be the vectors obtained by, applying the orthogonalization, process to 01, . . . , Pm. Then the vector, (Y,,.~ given by (S-9) is necessarily 0. For LY,+~is in the subspace spanned
Page 290 :
by $\alpha_1, \ldots, \alpha_m$ and orthogonal to each of these vectors; hence it is 0 by (8-8). Conversely, if $\alpha_1, \ldots, \alpha_m$ are different from 0 and $\alpha_{m+1} = 0$, then $\beta_1, \ldots, \beta_{m+1}$ are linearly dependent.

Example 12. Consider the vectors
$$\beta_1 = (3, 0, 4), \qquad \beta_2 = (-1, 0, 7), \qquad \beta_3 = (2, 9, 11)$$
in $R^3$ equipped with the standard inner product. Applying the Gram-Schmidt process to $\beta_1, \beta_2, \beta_3$, we obtain the following vectors.
$$\alpha_1 = (3, 0, 4)$$
$$\alpha_2 = (-1, 0, 7) - \frac{((-1, 0, 7)|(3, 0, 4))}{25}(3, 0, 4) = (-1, 0, 7) - (3, 0, 4) = (-4, 0, 3)$$
$$\alpha_3 = (2, 9, 11) - \frac{((2, 9, 11)|(3, 0, 4))}{25}(3, 0, 4) - \frac{((2, 9, 11)|(-4, 0, 3))}{25}(-4, 0, 3) = (2, 9, 11) - 2(3, 0, 4) - (-4, 0, 3) = (0, 9, 0).$$
These vectors are evidently non-zero and mutually orthogonal. Hence $\{\alpha_1, \alpha_2, \alpha_3\}$ is an orthogonal basis for $R^3$. To express an arbitrary vector $(x_1, x_2, x_3)$ in $R^3$ as a linear combination of $\alpha_1, \alpha_2, \alpha_3$ it is not necessary to solve any linear equations. For it suffices to use (8-8). Thus
$$(x_1, x_2, x_3) = \frac{3x_1 + 4x_3}{25}\,\alpha_1 + \frac{-4x_1 + 3x_3}{25}\,\alpha_2 + \frac{x_2}{9}\,\alpha_3$$
as is readily verified. In particular,
$$(1, 2, 3) = \tfrac{3}{5}(3, 0, 4) + \tfrac{1}{5}(-4, 0, 3) + \tfrac{2}{9}(0, 9, 0).$$
To put this point in another way, what we have shown is the following: The basis $\{f_1, f_2, f_3\}$ of $(R^3)^*$ which is dual to the basis $\{\alpha_1, \alpha_2, \alpha_3\}$ is defined explicitly by the equations
$$f_1(x_1, x_2, x_3) = \frac{3x_1 + 4x_3}{25}, \qquad f_2(x_1, x_2, x_3) = \frac{-4x_1 + 3x_3}{25}, \qquad f_3(x_1, x_2, x_3) = \frac{x_2}{9}$$
Page 291 :
Inner Product Spaces, , Sec. 8.2, , and these equations, , Finally,, , may be written, , more generally, , in the form, , note that from (~1, CQ, cu3we get the orthonormal, it C-4,, , 6 (3, (441,, , EXAMPLE 13. Let A =, , 0, 31,, , ab, [ c d 1where, , basis, , (0, LO>., , a, b, c, and d are complex num-, , bers. Set PI = (a, b), fit = (c, d), and suppose that /31 # 0. If we apply, the orthogonalization, process to PI, p2, using the standard inner product, in C2, we obtain the following vectors:, (~1 = (a, b), a2, , =, , (c, d) -, , = (c, d) -, , (~,p,i($$)), f$, , ‘+ $, , (a, b), (a, b), , & - &a dcia - cab, = ( lal2 + lb12’ Ial2 + lb12>, = ,ay&2, , M,, , 4., , Now the general theory tells us that a2 # 0 if and only if &, p2 are linearly, independent. On the other hand, the formula for 01~shows that this is the, case if and only if det A # 0., In essence, the Gram-Schmidt, process consists of repeated applications of a basic geometric operation called orthogonal, projection,, and it, is best understood from this point of view. The method of orthogonal, projection also arises naturally in the solution of an important approximation problem., Suppose W is a subspace of an inner product space V, and let p be, an arbitrary vector in V. The problem is to find a best possible approximation to p by vectors in W. This means we want to find a vector o( for which, 1(p - 0~11is as small as possible subject to the restriction, that (Y should, belong to W. Let us make our language precise., top by vectors in W is a vector Q!in W such that, A best approximation, , IIP- 41 I IIP- YII, for every vector y in W., By looking at this problem in R2 or in R3, one sees intuitively, that a, best approximation, to fl by vectors in W ought to be a vector a! in W such, that ,8 - a! is perpendicular, (orthogonal), to W and that there ought to
Page 292 :
be exactly one such α. These intuitive ideas are correct for finite-dimensional subspaces and for some, but not all, infinite-dimensional subspaces. Since the precise situation is too complicated to treat here, we shall prove only the following result.

Theorem 4. Let W be a subspace of an inner product space V and let β be a vector in V.
(i) The vector α in W is a best approximation to β by vectors in W if and only if β − α is orthogonal to every vector in W.
(ii) If a best approximation to β by vectors in W exists, it is unique.
(iii) If W is finite-dimensional and {α1, ..., αn} is any orthonormal basis for W, then the vector

(8-11)  \alpha = \sum_k \frac{(\beta|\alpha_k)}{\|\alpha_k\|^2}\,\alpha_k

is the (unique) best approximation to β by vectors in W.

Proof. First note that if γ is any vector in V, then β − γ = (β − α) + (α − γ) and

\|\beta - \gamma\|^2 = \|\beta - \alpha\|^2 + 2\,\mathrm{Re}\,(\beta - \alpha\,|\,\alpha - \gamma) + \|\alpha - \gamma\|^2.

Now suppose β − α is orthogonal to every vector in W, that γ is in W, and that γ ≠ α. Then, since α − γ is in W, it follows that

\|\beta - \gamma\|^2 = \|\beta - \alpha\|^2 + \|\alpha - \gamma\|^2 > \|\beta - \alpha\|^2.

Conversely, suppose that ||β − γ|| ≥ ||β − α|| for every γ in W. Then from the first equation above it follows that

2\,\mathrm{Re}\,(\beta - \alpha\,|\,\alpha - \gamma) + \|\alpha - \gamma\|^2 \ge 0

for all γ in W. Since every vector in W may be expressed in the form α − γ with γ in W, we see that

2\,\mathrm{Re}\,(\beta - \alpha\,|\,\tau) + \|\tau\|^2 \ge 0

for every τ in W. In particular, if γ is in W and γ ≠ α, we may take

\tau = -\,\frac{(\beta - \alpha\,|\,\alpha - \gamma)}{\|\alpha - \gamma\|^2}\,(\alpha - \gamma).

Then the inequality reduces to the statement

-\,\frac{2\,|(\beta - \alpha\,|\,\alpha - \gamma)|^2}{\|\alpha - \gamma\|^2} + \frac{|(\beta - \alpha\,|\,\alpha - \gamma)|^2}{\|\alpha - \gamma\|^2} \ge 0.

This holds if and only if (β − α | α − γ) = 0. Therefore, β − α is orthogonal to every vector in W. This completes the proof of the equivalence of the two conditions on α given in (i). The orthogonality condition is evidently satisfied by at most one vector in W, which proves (ii).
Page 293 :
Now suppose that W is a finite-dimensional subspace of V. Then we know, as a corollary of Theorem 3, that W has an orthogonal basis. Let {α1, ..., αn} be any orthogonal basis for W and define α by (8-11). Then, by the computation in the proof of Theorem 3, β − α is orthogonal to each of the vectors αk (β − α is the vector obtained at the last stage when the orthogonalization process is applied to α1, ..., αn, β). Thus β − α is orthogonal to every linear combination of α1, ..., αn, i.e., to every vector in W. If γ is in W and γ ≠ α, it follows that ||β − γ|| > ||β − α||. Therefore, α is the best approximation to β that lies in W.

Definition. Let V be an inner product space and S any set of vectors in V. The orthogonal complement of S is the set S⊥ of all vectors in V which are orthogonal to every vector in S.

The orthogonal complement of V is the zero subspace, and conversely {0}⊥ = V. If S is any subset of V, its orthogonal complement S⊥ (S perp) is always a subspace of V. For S⊥ is non-empty, since it contains 0; and whenever α and β are in S⊥ and c is any scalar,

(c\alpha + \beta\,|\,\gamma) = c(\alpha|\gamma) + (\beta|\gamma) = c \cdot 0 + 0 = 0

for every γ in S; thus cα + β also lies in S⊥. In Theorem 4 the characteristic property of the vector α is that it is the only vector in W such that β − α belongs to W⊥.

Definition. Whenever the vector α in Theorem 4 exists it is called the orthogonal projection of β on W. If every vector in V has an orthogonal projection on W, the mapping that assigns to each vector in V its orthogonal projection on W is called the orthogonal projection of V on W.

By Theorem 4, the orthogonal projection of an inner product space on a finite-dimensional subspace always exists. But Theorem 4 also implies the following result.

Corollary. Let V be an inner product space, W a finite-dimensional subspace, and E the orthogonal projection of V on W. Then the mapping

\beta \mapsto \beta - E\beta

is the orthogonal projection of V on W⊥.

Proof. Let β be an arbitrary vector in V. Then β − Eβ is in W⊥, and for any γ in W⊥,

\beta - \gamma = E\beta + (\beta - E\beta - \gamma).

Since Eβ is in W and β − Eβ − γ is in W⊥, it follows that
Page 294 :
Chap. 8, , Inner Product Spaces, , IIP- ~11~, = IIEPII”+ IIP- EP - -ill2, 2 IIP- (0 - Jw)ll”, with strict inequality, when y # /3 - E/3. Therefore,, approximation, to 0 by vectors in WI., m, , /3 - E/3 is the best, , EXAMPLE, 14. Give R3 the standard inner product. Then the orthogonal projection of (-10,2,8), on the subspace W that is spanned by, (3, 12, -1) is the vector, ac =, , 2, Wl(3, 1% -1)), 9+144+1, , ((-1%, , = +J, The orthogonal, defined by, , nz,, , 23), , ’, , 12 -1), ’, , (3, 12, -1)., , projection, , (21,, , (3, , of R3 on W is the linear, 3x1, , +, , .+, , ;;2, , -, , “3), , transformation, , E, , (3, 12, -1)., , (, , The rank of E is clearly, , 1; hence its nullity, E&l,, , is 2. On the other hand,, , x3) = (0, 0, 0), , ~2,, , if and only if 321 + 12~~ - 23 = 0. This is the case if and only if (x1, x2, x3), is in WI. Therefore,, WI is the null space of E, and dim (Wl) = 2., Computing, (Xl,, , x2,, , 23), , 3x1, , -, , +, , ;;;, , -, , 53), , (3, 12, -1), , (, , we see that the orthogonal projection of R3 on WI is the linear transformation I - E that maps the vector (x1, x2, x3) onto the vector, k4 (145x1 - 3622 + 3x3, -36x1, The observations, fashion., , + 10x2 + 12x3, 3x1 + 12x2 + 15323)., , made in Example, , 14 generalize, , in the following, , Theorem, 5. Let W be a finite-dimensional, subspace of an inner product, space V and let E be the orthogonal projection of V on W. Then E is an idempotent linear transformation, of V onto W, WI is the null space of E, and, , V=W@W~., Proof. Let fl be an arbitrary, vector in V. Then Efl is the best, approximation, to p that lies in W. In particular,, E/3 = ,8 when p is in W., Therefore, E(Eb) = Efl for every p in V; that is, E is idempotent:, E2 = E., To prove that E is a linear transformation,, let (Y and /3 be any vectors in
Page 295 :
Inner ProductSpaces, , Sec. 8.2, , V and c an arbitrary, scalar. Then, by Theorem 4, (Y - Ea and p - E/3, are each orthogonal to every vector in W. Hence the vector, c(a - Ea) + (0 - EP) = (ca + P) - (cEa + EP), also belongs to WI., Theorem 4 that, , Since cEa + Efl is a vector, , in W, it follows, , from, , E(ca + /3) = cEa + E&, Of course, one may also prove the linearity of E by using (8-11). Again, let p be any vector in V. Then E@ is the unique vector in W such that, p - Eo is in WI. Thus E/3 = 0 when /3 is in W’-. Conversely, p is in W’, when E/3 = 0. Thus WI is the null space of E. The equation, P=E/3+/3--Eb, W n WI = (0). For if 01 is a, shows that V = W + WI; moreover,, vector in W n WI, then (ala) = 0. Therefore, a! = 0, and V is the direct, sumof Wand WI., 1, Corollary., Under the conditions of the theorem, I - I3 is the orthogmal, projection of V on WI. It is an idempotent linear transformation, of V onto, WI with null space W., Proof. We have already seen that the mapping /3 + p - E/3 is, the orthogonal projection of V on W I. Since E is a linear transformation,, this projection on WI is the linear transformation, I - E. From its geometric properties one sees that I - E is an idempotent, transformation, of V onto W. This also follows from the computation, (I-E)(I-E), , =I-E-E+E2, =I-E., , Moreover, (1 - E)P = 0 if and only if p = Efl, and this is the case if and, only if /3 is in W. Therefore W is the null space of I - E. 1, The Gram-Schmidt, process may now be described geometrically, in, the following way. Given an inner product space V and vectors 61, . . . , fin, in V, let Pk (lc > 1) be the orthogonal projection of V on the orthogonal, complement of the subspace spanned by PI, . . . , /3k+ and set PI = I., Then the vectors .one obtains by applying the orthogonalization, process, by the equations, to, 01,, . . . , Pn are defined, (8-12), Theorem, , ak, , =, , PkPk,, , l<k<n., , 5 implies another result known as Bessel’s, , inequality., , Corollary., Let ((~1, . . . , a,} be an orthogonal set of non-zero vectors, in an inner product space V. If p is any vector in V, then, , 287
Page 296 :
288, , Inner Product Spaces, , Chap. 8, , and equality holds if and only if, , Prod., , Let, , Y = 2 [(PI~/IIQ~~~], , (YE., , Then, , P = y + 6 where, , (~18) = 0. Hence, , llPl12= llYl12+ ll~l12., It now suffices to prove, , that, , llYll2= z y&g., This is straightforward, (aylak) = 0 forj # k., , computation, , in which, , one uses the fact, , that, , 1, , In the special case m which, Bessel’s inequality says that, , {aI, . . . , or,J is an orthonormal, , set,, , The corollary also tells us in this case that p is in the subspace spanned by, QII, . . . , an if and only if, P = F @bk), ak, or if and only if Bessel’s inequality is actually an equality. Of course, in, the event that V is finite dimensional and {al, . . . , Q(~} is an orthogonal, basis for V, the above formula holds for every vector fl in V. In other, words, if {LY~,. . . , LY,} is an orthonormal, basis for V, the kth coordinate, of fi in the ordered basis {aI, . . . , an> is (pjak)., EXAMPLE 15. We shall apply the last corollary, sets described in Example 11. We find that, , (4, , i, k=, , l/bf(t)e-2rii’, , to the orthogonal, , dt12 5 /d lj(t>l” dt, , --n, , (b), , (cl, , / o1(d/z COS2at, , + ti, , sin 4?rt)2 dt = 1 + 1 = 2., , Exercises, 1. Consider R4 with the standard inner product. Let W be the subspace of, R4 consisting of all vectors which are orthogonal to both a! = (1, 0, -1, 1) and, fi = (2, 3, -1,2). Find a basis for W.
Page 297 :
Inner Product Spaces, , Sec. 8.2, , 2. Apply the Gram-Schmidt, process to the vectors fil = (1, 0, l), fi% = (1, 0, -l),, inner, f13 = (0,3,4),, to obtain an orthonormal, basis for R3 with the standard, product., 3. Consider 15’3, with the standard inner product. Find an orthonormal, the subspace spanned by PI = (1, 0, i) and pz = (2, 1, 1 + i)., 4. Let V be an inner product, in V is defined by, , space. The distance, 4%P), , Show that, (4 G,, (b) d(cu,, (cl aa,, (4 4a,, , PI, /3), P), PI, , 2, =, =, I, , between, , basis for, , two vectors Q! and p, , = /IQ! - PII., , 0;, 0 if and only if Q! = p;, w, 4 ;, d(a, Y) + 47, PI., , 5. Let V be an inner product space, and let LY, /3 be vectors, a! = /3 if and only if (cyly) = (Pi-r) for every y in V., , in V. Show that, , 6. Let W be the subspace of R2 spanned by the vector (3,4). Using the standard, projection, of R2 onto W. Find, inner product, let E be the orthogonal, (a) a formula for E(Q, n);, (b) the matrix of E in the standard ordered basis;, Cc) WL;, (d) an orthonormal, basis in which E is represented by the matrix, 1, 0, , 0, 0’, , [ 1, 7. Let V be the inner product space consisting, whose quadratic, form is defined by, 11(x1,22112, , = (Xl -, , x2)2, , of R2 and the inner, + 3s;., , Let E be the orthogonal, projection, of V onto the subspace, vector (3, 4). Now answer the four questions of Exercise 6., 8. Find an inner product, , on R2 such that, , W spanned, , by the, , (Ed, ~2) = 2., , 9. Let V be the subspace of R[z] of polynomials, with the inner product, (fls), , product, , = Jd f(Os(t), , of degree at most 3. Equip, , V, , a., , (a) Find the orthogonal, complement, of the subspace of scalar polynomials., (b) Apply the Gram-Schmidt, process to the basis {1,x, x2, x3}., , V be the vector space of all n X n matrices over C, with the inner product, (AIB) = tr (AB*). Find the orthogonal, complement, of the subspace of diagonal, matrices., , 10. Let, , V be a finite-dimensional, inner product space, and let {CQ . . . , cu,} be, an orthonormal, basis for V. Show that for any vectors (Y, p in V, , 11. Let, , 289
Page 298 :
290, , Inner Product Spaces, , Chap. 8, , 12. Let W be a finite-dimensional subspace of an inner product space V, and let E, be the orthogonal projection of V on W. Prove that (EC@) = ((Y~EP) for all (Y, p, in V., , 13. Let S be a subset of an inner product space V. Show that (S-)‘- contains the, subspace spanned by S. When V is finite-dimensional, show that (Sl)l is the subspace spanned by S., 14. Let V be a finite-dimensional, , inner product space, and let @I = {CX~,. . . , cr,}, be an orthonormal basis for V. Let T be a linear operator on V and A the matrix, of T in the ordered basis 6% Prove that, , 15. Suppose V = WI 0 W2 and that fi and fi are inner products on WI and Wz,, respectively. Show that there is a unique inner product f on V such that, (a) Wz= W$-;, (b) f(a, /3) = fk(~, fl), when (Y, /3 are in Wk, k = 1, 2., 16. Let V be an inner product space and W a finite-dimensional subspace of V., There are (in general) many projections which have W as their range. One of, these, the orthogonal projection on W, has the property that IIEall 2 IIcxII for, every a: in V. Prove that if E is a pro,jection with range W, such that llEol\l 2 ll(~I\, for all (Y in V, then E is the orthogonal projection on W., 17. Let V be the real inner product space consisting of the space of real-valued, continuous functions on the interval, -1 2 t 2 1, with the inner product, , (fig) = y1 f(M9 ck, Let W be the subspace of odd functions, i.e., functions satisfying f( -t), Find the orthogonal complement of W., , 8.3., , Linear, , Functionals, , and, , = -f(t)., , Adjoints, , The first portion of this section treats linear functionals, on an inner, product space and their relation to the inner product. The basic result is, that any linear functional f on a finite-dimensional, inner product space, is ‘inner product with a fixed vector in the space,’ i.e., that such an f has, the formf(ar), = (c+) f or some fixed /3 in V. We use this result to prove, the existence of the ‘adjoint’ of a linear operator T on V, this being a linear, operator T* such that (T&3) = ((rlT*/3) for all a and p in V. Through the, use of an orthonormal, basis, this adjoint operation on linear operators, (passing from T to T*) is identified with the operation of forming the, conjugate transpose of a matrix. We explore slightly the analogy between, the adjoint operation and conjugation on complex numbers., Let V be any inner product space, and let p be some fixed vector inV., We define a function fp from V into the scalar field by
Page 299 :
Sec. 8.3, , Linear Functionals and Adjoints, fs(4, , = (4%, , This function fp is a linear functional on V, because, by its very definition,, (alp) is linear as a function of cy. If I’ is finite-dimensional,, every linear, functional on V arises in this way from some /3., Theorem, 6. Let V be a jinite-dimensional, inner product space, and f a, linear functional on V. Then there exists a unique vector fl in V such that, f(a) = ((YIP) for all (Y in V., , Proof., , Let {q, , 01~). . . , (Y,} be an orthonormal, , basis for V. Put, , P = j, fCarjlori, , (8-13), , and let fb be the linear functional, , defined by, , fs(4, , = (4P)., , Then, , Since this is true for each CQ, it follows that f = f,+ Now suppose y is a, vector in V such that (crib) = (~~17) for all o(. Then (0 - -# - y) = 0, and /3 = y. Thus there is exactly one vector /3 determining the linear func1, tional f in the stated manner., The proof of this theorem can be reworded slightly, in terms of the, representation, of linear functionals, in a basis. If we choose an orthonormal basis {‘Ye, . . . , oc,} for V, the inner product of (Y = X~CQ+ . . . +, X,,CY,and p = ~10~1+ . . . + Y~CY,,will be, (alp), , If f is any linear functional, , =, , 291, , +, , on V, then, , f(a) =, , ClZl, , +, , .*., , f, , +, , 2,%., , has the form, , . . ., , +, , GLxn, , for some fixed scalars cl, . . . , cn determined, by the basis. Of course, cj = f(aj). If we wish to find a vector ,8 in V such that (CX]P) = f(a) for all 01,, then clearly the coordinates yj of /3 must satisfy gj = cj or yi = f(q)., Accordingly,, is the desired vector., Some further comments, we have given is admirably, geometric fact that p lies in, off. Let W be the null space, determined by its values on, of V on WI, then, , are in order. The proof of Theorem 6 that, brief, but it fails to emphasize the essential, the orthogonal complement of the null space, off. Then V = W + WI, and f is completely, WI. In fact, if P is the orthogonal projection, , ,291
Page 300 :
292, , Inner Product, , Chap. 8, , Spaces, , for all a! in V. Suppose f # 0. Then f is of rank 1 and dim (Wl), is any non-zero vector in WI, it follows that, P,=jy+, , = 1. If y, , Y, , for all cyin V. Thus, f(4, , for all a, and P = iXr>lllrllzl, , .- gf2, , = (4r>, , Y., , EXAMPLE 16. We should give one example showing that Theorem 6, is not true without the assumption that V is finite dimensional. Let V be, the vector space of polynomials over the field of complex numbers, with, the inner product, (fig), This inner product, g = Z bkxk, then, , = /o’fm3, , 0%., , can also be defined, , algebraically., , If f = Z akxk and, , (As)= sj + : + 1a&., Let x be a fixed complex, ‘evaluation, at 2:, , number,, L(f), , and let L be the linear, , functional, , = f(4., , Is there a polynomial g such that (fg), no; for suppose we have, , for every f? The answer is, , = L(f), -, , f(z) = /dMt), , 02, , for every f. Let h = x: - x, so that for any f we have (hf)(z), 0 = Jb Wlf(W), for allf., , In particular, , this holds whenf, , = 0. Then, , dt, = Kg so that, , Jo1Iw)l”ls(0l” fdt=, , 0, , and so hg = 0. Since h # 0, it must be that g = 0. But L is not the zero, functional;, hence, no such g exists., One can generalize the example somewhat, to the case where L is a, linear combination, of point evaluations. Suppose we select fixed complex, numbers 21, . . . , xn and scalars cl, . . . , c, and let, , L(f) = Clf(Zl) + . . . + cnf(z,>.
Page 301 :
Linear Functionals and Adjoints, , Sec. 8.3, , 293, , Then L is a linear functional on V, but there is no g with L(f) = (fg),, unless cl = c2 = . . . = c, = 0. Just repeat the above argument, with, h = (x - 21) . . . (5 - 2,)., We turn now to the concept of the adjoint of a linear operator., Theorem, 7. For any linear operator T on a finite-dimensional, inner, product space V, there exists a unique linear operator T* on V such that, , (8-14), , (T&3, , = (4’*P), , for all a, 0 in V., Proof. Let /3 be any vector, functional on V. By Theorem 6 there, (T4), = (40’) f or every a in V. Let, 0’ =, , in V. Then a + (T&3) is a linear, is a unique vector p’ in V such that, T* denote the mapping /3 + 0’:, T”P., , We have (g-14), but we must verify that T* is a linear operator., be in V and let c be a scalar. Then for any 01,, (dT*(cP, , + r>> =, =, =, =, =, =, , (T4cP, (TM, 8T4P), E(aIT*P), (&T*P), (+T*P, , Let p, y, , + r>, + CT+), + (T&l, + (alT*r), + (alT*r), + T*r)., , Thus T*(cP + y) = cT*p + T*y and T* is linear., The uniqueness of T* is clear. For any /3 in V, the vector T*p is, uniquely, determined, as the vector p’ such that (T&I), = (ollp’) for, every o(. 1, Theorem, 8. Let V be a jkite-dimensional, inner product space and let, 03 = {al, . . . ) cr,} be an (ordered) orthonormal, basis for V. Let T be a, linear operator on V and let A be the matrix of T in the ordered basis a. Then, Akj = (Tc+~)., , Proof., , Since ~3 is an orthonormal, , basis, we have, ., , a = 5 (c+L)OLL., k=l, , The matrix A is defined by, , and since
Page 302 :
Inner Product Spaces, , Chap. 8, , Corollary., Let V be a jinite-dimensional, T be a linear operator on V. In any orthonormal, is the conjugate transpose of the matrix of T., , Proof., , inner product space, and let, basis for V, the matrix of T*, , Let (a = {CQ, . . . , cr,} be an orthonormal, , A = [T]a and B = [T*]a., , According, , to Theorem, , basis for V, let, , 8,, , Arj = (TOljlQk), Bkj = (T*ail~~k)., By the definition, , of T* we then have, Bbj = (T*cx~[cx~), =(LYklT*“i), = ‘vG&), =&., 1, , EXAMPLE 17. Let V be a finite-dimensional, inner product space and, E the orthogonal, projection of V on a subspace W. Then for any vectors, (Y and p in V., , = (EC2 + (1 - E)ar]Efl), = (4-M., From the uniqueness of the operator E* it follows that, consider the projection E described in Example 14. Then, A =&[~~, , 215, , E* = E. Now, , -141, , is the matrix of E in the standard orthonormal, basis. Since E = E*, A is, also the matrix of E*, and because A = A*, this does not contradict, the, preceding corollary. On the other hand, suppose, cc!1= (154,0,0), a2 = (145, -36,3), a3 = (-36, 10, 12)., Then, , (0~1,az, (Ye} is a basis, and, ECI~ = (9, 36, -3), Eaz = (0, 0,O), Ears = (0, 0,O)., , Since (9, 36, -3) = -(154,0,0), - (145, -36,, the basis ((~1, (~2,(~3) is defined by the equation, , 3), the matrix, , B of E in
Page 303 :
Linear Functionals and Adjoints, , Sec. 8.3, , B=, , [ 1, -1, , 0, , 0, , -1 0, , 00, , 0, 0., , In this case B # B*, and B* is not the matrix of E* = E in the basis, the corollary, we conclude that {(Ye, (Ye,W} is not, { CYI,012,CYZ). Applying, an orthonormal, basis. Of course this is quite obvious anyway., Dejinition., Let T be a linear operator on an inner product space V., on V if there exists a linear operator T*, Then we say that T has an adjoint, on V such that (TorIP) = (alT*P) for all CI and ,!I in V., , By Theorem 7 every linear operator on a finite-dimensional, inner, product space V has an adjoint on V. In the infinite-dimensional, case this, is not always true. But in any case there is at most one such operator T*;, of T., when it exists, we call it the adjoint, Two comments should be made about the finite-dimensional, case., 1. The adjoint of T depends not only on T but on the inner product, as well., 2. As shown by Example 17, in an arbitrary, ordered basis @, the, relation between [T]a and [T*]a is more complicated than that given in, the corollary above., EXAMPLE 18. Let V be CnX1, the space of complex n X 1 matrices,, with inner product (XIY) = Y*X. If A is an n X n matrix with complex, entries, the adjoint, of the linear operator, X+ AX is the operator, X-b A*X. For, (AXIY), , = Y”AX, , The reader should convince, last corollary., , = (A*Y)*X, , = (XIA*Y)., , himself that this is really a special case of the, , EXAMPLE 19. This is similar to Example 18. Let V be CnXn with, inner product (AIB) = tr (B*A). Let M be a fixed n X n matrix over, The adjoint of left multiplication, by M is left multiplication, by M*., course, ‘left multiplication, by M’ is the linear operator LM defined, L&4), = MA., Gh(A)IB), , =, =, =, =, =, , tr (B*(MA)), tr (MAB”), tr (AB*M), tr (A(M*B)*), (AI-b*(B))., , the, C., Of, by, , 295
Page 304 :
Thus (L_M)* = L_{M*}. In the computation above, we twice used the characteristic property of the trace function: tr(AB) = tr(BA).

EXAMPLE 20. Let V be the space of polynomials over the field of complex numbers, with the inner product

(f|g) = \int_0^1 f(t)\,\overline{g(t)}\,dt.

If f is a polynomial, f = Σ aₖxᵏ, we let f̄ = Σ āₖxᵏ. That is, f̄ is the polynomial whose associated polynomial function is the complex conjugate of that for f:

\bar{f}(t) = \overline{f(t)}, \qquad t \text{ real.}

Consider the operator 'multiplication by f,' that is, the linear operator M_f defined by M_f(g) = fg. Then this operator has an adjoint, namely, multiplication by f̄. For

(M_f(g)|h) = (fg|h)
 = \int_0^1 f(t)g(t)\,\overline{h(t)}\,dt
 = \int_0^1 g(t)\,\overline{\bar{f}(t)h(t)}\,dt
 = (g|\bar{f}h)
 = (g|M_{\bar{f}}(h))

and so (M_f)* = M_{f̄}.

EXAMPLE 21. In Example 20, we saw that some linear operators on an infinite-dimensional inner product space do have an adjoint. As we commented earlier, some do not. Let V be the inner product space of Example 20, and let D be the differentiation operator on C[x]. Integration by parts shows that

(Df|g) = f(1)\overline{g(1)} - f(0)\overline{g(0)} - (f|Dg).

Let us fix g and inquire when there is a polynomial D*g such that (Df|g) = (f|D*g) for all f. If such a D*g exists, we shall have

(f|D^*g) = f(1)\overline{g(1)} - f(0)\overline{g(0)} - (f|Dg)

or

(f|D^*g + Dg) = f(1)\overline{g(1)} - f(0)\overline{g(0)}.

With g fixed, L(f) = f(1)\overline{g(1)} − f(0)\overline{g(0)} is a linear functional of the type considered in Example 16 and cannot be of the form L(f) = (f|h) unless L = 0. If D*g exists, then with h = D*g + Dg we do have L(f) = (f|h), and so g(0) = g(1) = 0. The existence of a suitable polynomial D*g thus implies g(0) = g(1) = 0. Conversely, if g(0) = g(1) = 0, the polynomial D*g = −Dg satisfies (Df|g) = (f|D*g) for all f. If we choose any g for which g(0) ≠ 0 or g(1) ≠ 0, we cannot suitably define D*g, and so we conclude that D has no adjoint.
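The finite-dimensional identities of Examples 18 and 19 are easy to check numerically. The short sketch below is an editorial illustration, not part of the original text; it assumes Python with NumPy, and the matrices A, M, X, Y, B, C are arbitrary random data used only to exercise the formulas (AX|Y) = (X|A*Y) and (L_M(B)|C) = (B|L_{M*}(C)).

import numpy as np

rng = np.random.default_rng(0)
n = 3

# Example 18: on C^{n x 1} with (X|Y) = Y*X, the adjoint of X -> AX is X -> A*X.
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
X = rng.standard_normal((n, 1)) + 1j * rng.standard_normal((n, 1))
Y = rng.standard_normal((n, 1)) + 1j * rng.standard_normal((n, 1))

inner = lambda U, V: (V.conj().T @ U).item()                    # (U|V) = V*U
assert np.isclose(inner(A @ X, Y), inner(X, A.conj().T @ Y))    # (AX|Y) = (X|A*Y)

# Example 19: on C^{n x n} with (A|B) = tr(AB*), the adjoint of
# left multiplication by M is left multiplication by M*.
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
C = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

tr_inner = lambda P, Q: np.trace(P @ Q.conj().T)                # (P|Q) = tr(PQ*)
assert np.isclose(tr_inner(M @ B, C), tr_inner(B, M.conj().T @ C))

Both assertions hold for any choice of the data, since they are just the identities Y*AX = (A*Y)*X and tr(MBC*) = tr(B(M*C)*).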
Page 305 :
Linear Functionals and Adjoints, , Sec. 8.3, , We hope that these examples enhance the reader’s understanding, of, the adjoint of a linear operator. We see that the adjoint operation, passing, on complex numbers., from T to T”, behaves somewhat like conjugation, The following theorem strengthens the analogy., Theorem, 9. Let V be a Jinite-dimensional, inner, and U are linear operators on V and c is a scalar,, , (i), (ii), (iii), (iv), , product space. If T, , (T + U)* = T* + U*;, (CT)* = CT*;, (TU)* = U*T*;, (T*)* = T., Proof., , To prove, , (i), let CYand /3 be any vectors, , in V., , Then, (CT + UbIP), , =, =, =, =, =, , (Ta + U4P), (w3, + (U&9, (4 T*P) + (4 U*P>, (4T”P + u*p>, bl(T* + u*M., , From the uniqueness of the adjoint we have (T + U)* = T* + U”. We, leave the proof of (ii) to the reader. We obtain (iii) and (iv) from the, relations, , (TU@), (T*&), , = (UalT*fl), , = C-1, , = (ajU*T*@), , = (m>, , = (4TP)., , I, , Theorem 9 is often phrased as follows: The mapping T + T* is a, conjugate-linear, anti-isomorphism, of period 2. The analogy with complex, conjugation, which we mentioned, above is, of course, based upon the, observation, that complex conjugation, has the properties, (ZI + 22) =, & + &, (G) = ?&, Z = z. One must be careful to observe the reversal, of order in a product, which the adjoint operation imposes: (UT)* =, T*U*. We shall mention extensions of this analogy as we continue our, study of linear operators on an inner product space. We might mention, something along these lines now. A complex number z is real if and only, if z = P. One might expect that the linear operators T such that T = T*, behave in some way like the real numbers. This is in fact the case. For, complex, inner, example, if T is a linear operator on a finite-dimensional, product space, then, (8-15), , T = U,+iCJ,, , where iI71 = UT and iYJ2= U& Thus, in some sense, T has a ‘real part’ and, an ‘imaginary, part.’ The operators VI and Uz satisfying UI = UT, and, i.7, = Ub, and (8-15) are unique, and are given by, , 297
Page 306 :
298, , Inner, , Chap. 8, , Product Spaces, VI = f (T +, , T*), , U, = $ (T - T*)., A linear, Hermitian)., , operator T such that T = T* is called self-adjoint, If @ is an orthonormal, basis for V, then, , (or, , [T*h = D”ltB, and so T is self-adjoint if and only if its matrix in every orthonormal, basis, is a self-adjoint, matrix. Self-adjoint, operators are important,, not simply, because they provide us with some sort of real and imaginary part for the, general linear operator, but for the following, reasons: (1) Self-adjoint, operators have many special properties. For example, for such an operator, there is an orthonormal, basis of characteristic vectors. (2) Many operators, which arise in practice are self-adjoint., We shall consider the special, properties of self-adjoint, operators later., , Exercises, 1. Let V be the space C2, with the standard inner product. Let T be the linear, operator defined by Tcl = (1, -2), TEZ = (i, -1). If L\I = (x1, zz), find T*cY., 2. Let T be the linear operator on C2 defined by Tel = (1 + i, 2), TQ = (i, i)., Using the standard inner product, find the matrix of T* in the standard ordered, basis. Does T commute with T*?, 3. Let V be C3 with the standard inner product. Let T be the linear operator on, V whose matrix in the standard ordered basis is defined by, Aik = ji+k >, (i” = -1)., Find a basis for the null space of T*., 4. Let V be a finite-dimensional inner product space and T a linear operator on V., Show that the range of T* is the orthogonal complement of the null space of T., 5. Let V be a finite-dimensional inner product space and T a linear operator on V., If T is invertible, show that T* is invertible and (T*)-’ = (T-1) *., 6. Let V be an inner product space and /3, y fixed vectors in V. Show that, Tcz = (arIP) defines a linear operator on V. Show that T has an adjoint, and, describe T* explicitly., Now suppose V is Cn with the standard inner product, @ = (yl, . . . , YA and, Y = (Xl, . . . , z,). What is the j, k entry of the matrix of T in the standard ordered, basis? What is the rank of this matrix?, 7. Show that the product of two self-adjoint operators is self-adjoint if and only, if the two operators commute.
Page 307 :
Sec. 8.4, , Unitary, , 8.. Let V be the vector space of the polynomials, equal to 3, with the inner product, , If t is a real number,, , 9. Let 17 be the inner product, operator on V. Find D*., , over R of degree less than, , g1 in V such that (f/gJ, , find the polynomial, , space of Exercise, , Operators, , = f(t) for allj, , or, , in V., , 8, and let D be the differentiation, , 10. Let V be the space of n X n matrices over the complex numbers, with the, matrix in V, and, inner product, (A, B) = tr (AB*)., Let P be a fixed invertible, let Tp be the linear operator on V defined by T,(A) = P-‘AP. Find the adjoint, of Tp., 11. Let V be a finite-dimensional, inner product space, and let E be an idempotent, if and only if, linear operator on V, i.e., E2 = E. Prove that E is self-adjoint, , EE* = E*E., 12. Let V be a finite-dimensional, linear operator on V. Prove that, every o( in V., , complex, , inner, , T is self-adjoint, , product space, and let T be a, if and only if (TOI] is real for, , 8.4., , Unitary, , Operators, , In this section, we consider the concept of an isomorphism, between, two inner product spaces. If V and W are vector spaces, an isomorphism, of V onto W is a one-one linear transformation, from V onto W, i.e., a, one-one correspondence between the elements of V and those of W, which, $)reserves the vector space operations. Now an inner product space consists of a vector space and a specified inner product on that space. Thus,, when V and W are inner product spaces, we shall require an isomorphism, from V onto W not only to preserve the linear operations, but also to, preserve inner products. An isomorphism of an inner product space onto, itself is called a ‘unitary operator’ on that space. We shall consider various, operators, and establish their basic properties., examples of unitary, Dejinition., Let V and W be inner product spaces over the same jield,, and let T be a linear tran.sformation from V into W. We say that T preserves inner, products, if (TaITfi), = (&3) for all CX, p in V. An isomorphism, of V onto W is a vector space isonzorphism T of V onto W which, also preserves inner products., , If T preserves, sarily, , non-singular., , inner products,, Thus, , then llTcv\l = IIcx~IIand so T is necesfrom, V onto W can also be, from V onto W which preserves inner, of V onto W, then T-l is an isomorphism, , an isomorphism, , defined as a linear transformation, products. If T is an isomorphism, , 299
Page 308 :
of W onto V; hence, when such a T exists, we shall simply say V and W are isomorphic. Of course, isomorphism of inner product spaces is an equivalence relation.

Theorem 10. Let V and W be finite-dimensional inner product spaces over the same field, having the same dimension. If T is a linear transformation from V into W, the following are equivalent.
(i) T preserves inner products.
(ii) T is an (inner product space) isomorphism.
(iii) T carries every orthonormal basis for V onto an orthonormal basis for W.
(iv) T carries some orthonormal basis for V onto an orthonormal basis for W.

Proof. (i) → (ii) If T preserves inner products, then ||Tα|| = ||α|| for all α in V. Thus T is non-singular, and since dim V = dim W, we know that T is a vector space isomorphism.

(ii) → (iii) Suppose T is an isomorphism. Let {α1, ..., αn} be an orthonormal basis for V. Since T is a vector space isomorphism and dim W = dim V, it follows that {Tα1, ..., Tαn} is a basis for W. Since T also preserves inner products, (Tαj|Tαk) = (αj|αk) = δ_jk.

(iii) → (iv) This requires no comment.

(iv) → (i) Let {α1, ..., αn} be an orthonormal basis for V such that {Tα1, ..., Tαn} is an orthonormal basis for W. Then

(T\alpha_j|T\alpha_k) = (\alpha_j|\alpha_k) = \delta_{jk}.

For any α = x1α1 + ... + xnαn and β = y1α1 + ... + ynαn in V, we have

(\alpha|\beta) = \sum_j x_j\bar{y}_j

(T\alpha|T\beta) = \Big(\sum_j x_j T\alpha_j \,\Big|\, \sum_k y_k T\alpha_k\Big) = \sum_{j,k} x_j\bar{y}_k\,(T\alpha_j|T\alpha_k) = \sum_j x_j\bar{y}_j

and so T preserves inner products.

Corollary. Let V and W be finite-dimensional inner product spaces over the same field. Then V and W are isomorphic if and only if they have the same dimension.

Proof. If {α1, ..., αn} is an orthonormal basis for V and {β1, ..., βn} is an orthonormal basis for W, let T be the linear transformation from V into W defined by Tαj = βj. Then T is an isomorphism of V onto W.
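The step (iv) → (i) can also be seen concretely for real spaces. The sketch below is an editorial illustration, not part of the original text; it assumes Python with NumPy, and a QR factorization is used only as a convenient way to manufacture orthonormal bases of Rⁿ.

import numpy as np

rng = np.random.default_rng(1)
n = 4

# Columns of A and of B are two orthonormal bases of R^n.
A, _ = np.linalg.qr(rng.standard_normal((n, n)))
B, _ = np.linalg.qr(rng.standard_normal((n, n)))

# T is the linear map with T a_j = b_j; in the standard basis its matrix is B A^{-1} = B A^t.
T = B @ A.T

# T carries one orthonormal basis onto another, so it preserves inner products (Theorem 10).
x, y = rng.standard_normal(n), rng.standard_normal(n)
assert np.isclose((T @ x) @ (T @ y), x @ y)
assert np.allclose(T.T @ T, np.eye(n))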
Page 309 :
Unitary Operators, , Sec. 8.4, , EXAMPLE 22. If V is an n-dimensional, inner product space, then each, ordered orthonormal, basis @ = {ai, . . . , an} determines an isomorphism, of V onto Fn with the standard inner product. The isomorphism is simply, T(Xlcx1 + * * * + x&J, , = (21, . . . ) 2,)., , There is the superficially, different isomorphism which CBdetermines of V, onto the space F nX1 with (X/Y) = Y*X as inner product. The isomorphism is, , a+ [aIcE, i.e., the transformation, sending cyinto its coordinate matrix in the ordered, basis @. For any ordered basis @, this is a vector space isomorphism;, however, it is an isomorphism of the two inner product spaces if and only, if CBis orthonormal., EXAMPLE 23. Here is a slightly less superficial isomorphism., Let W, be the space of all 3 X 3 matrices A over R which are skew-symmetric,, i.e., At = -A. We equip W with the inner product (AIB) = 3 tr (ABt),, the + being put in as a matter of convenience. Let V be the space R3 with, the standard inner product. Let T be the linear transformation, from V, into W defined by, T(xI, x2, xd = [ -;:, , -i:, , -i’]-, , Then T maps V onto W, and putting, A =[-;;, , -;, , -;:I7, , B =[p:, , -%, , -;:I, , we have, tr (AB “1 =, =, , z3y3, 2(XlYl, , + x2y2 + z3y3 +, + xzyz + z3y3)., , x2y2, , + xlyl, , Note that T, Thus (alp) = (T~rlTfi) and T is, . a vector space isomorphism., carries the standard basis {pi, Q, c3} onto the orthonormal, basis consisting, of the three matrices, , EXAMPLE 24. It is not always particularly, convenient to describe an, isomorphism in terms of orthonormal, bases. For example, suppose G = P*P, where P is an invertible n X n matrix with complex entries. Let V be the, space of complex n X 1 matrices, with the inner product [XIY] = Y*GX., , 301
Page 310 :
Inner Product Spaces, , Chap. 8, , Let W be the same vector space, with the standard inner product (Xl Y) =, Y*X. We know that V and W are isomorphic inner product spaces. It, would seem that the most convenient way to describe an isomorphism, between V and W is the following:, Let T be the linear transformation, from V into W defined by T(X) = PX. Then, (TXITY), , =, =, =, =, =, , (PXIPY), (PY)*(Px), Y*P*Px, Y*GX, [XIY]., , Hence T is an isomorphism., EXAMPLE 25. Let V be the space of all continuous real-valued, tions on the unit interval, 0 5 t 5 1, with the inner product, , [.fbl = j,lS(OdW, Let W be the same vector, , func-, , dt., , space with the inner product, , Let T be the linear transformation, (Tf)(O, , from V into W given by, = tf(O., , Then (TflTg) = Vbl, and so T preserves inner products; however, T is, not an isomorphism of V onto W, because the range of T is not all of W., Of course, this happens because the underlying, vector space is not finitedimensional., Theorem, 11. Let V and W be inner product spaces over the same field,, and let T be a linear transformation, from V into W. Then T preserves inner, products if and only if jjTa/l = j/all for every CYin V., , Proof., , If T preserves inner products, T ‘preserves norms.’ SupposeIIT = Ibll f or every (Y in V. Then llT~lj2 = lla/j2. Now using the, appropriate, polarization, identity, (8-3) or (S-4), and the fact that T is, linear, one easily obtains (alp) = (TalTp) for all LY,p in V. 1, A unitary, operator, of the space onto itself., , DeJinition., , morphism, , on an inner product space is an iso-, , The product of two unitary operators is unitary. For, if U1 and Up, are unitary,, then U2U1 is invertible, and /I U2U41 = II UNI][ = I IoJI for, each o(. Also, the inverse of a unitary operator is unitary, since Ij Ual I =, (Icy(j says II U-‘~ll = /IpI 1, where /? = Ucr. Since the identity operator is
Page 311 :
Unitary Operators, , Sec. 8.4, , clearly unitary, we see that the set of all unitary operators on an inner, product space is a group, under the operation of composition., If V is a finite-dimensional, inner product space and U is a linear, operator on V, Theorem 10 tells us that U is unitary if and only if, (U4UP), = (43 f or each (Y, /3 in V; or, if and only if for some (every), orthonormal, basis {al, . . . , (Ye} it is true that {UCQ, . . . , UOC~} is an, orthonormal, basis., Theorem, , 12., , Then U is unitary, u*u = I., Proof., , Let U be a linear operator on an inner prod --? space V., if and only if the adjoint Ti* of U exists and UU* =, , Suppose U is unitary., (Ualp), , Then U is invertible, , = (UCY uu-l/3), , and, , = (al u-y), , for all (Y, /3. Hence U-l is the adjoint of U., Conversely,, suppose U* exists and UU* = U*U = I. Then, invertible, with U-l = U*. So, we need only show that U preserves, products. We have, , for all a!, 0., , U is, inner, , 1, , EXAMPLE 26. Consider CnX1 with the inner product (X/Y) = Y*X., Let A be an n X n matrix over C, and let U be the linear operator defined, by ii(X) = AX. Then, (UXl, , VU) = (AXIAY), , for all X, Y. Hence, U is unitary, Definition., , = Y*A*AX, , if and only if A*A, , = I., , A complex n X n matrix A is called unitary,, , if A*A, , = I., , Theorem, 13. Let V be a jinite-dimensional, inner product space and, let U be a linear operator on V. Then U is unitary if and only if the matrix, of U in some (or every) ordered orthonormal, basis is a unitary matrix., , Proof. At this point, this is not much, it largely for emphasis. If @ = (011,. . . , a,}, basis for V and A is the matrix of U relative, only if U*U = I. The result now follows from, Let A be an n X n matrix. The statement, means, (A*A)j~ = Sj,, , of a theorem, and we state, is an ordered orthonormal, to a, then A*A = I if and, Theorem 12. 1, that A is unitary, , simply
Page 312 :
304, , Inner Product Spaces, , Chap, 8, , 2 A,jA,E = Sj,., , r=l, , In other words, it means that the columns of A form an orthonormal, set, of column matrices, with respect to the standard inner product (XIY) =, Y*X. Since A*A = I if and only if AA* = I, we see that A is unitary, exactly when the rows of A comprise an orthonormal, set of n-tuples in C,, (with the standard inner product)., So, using standard inner products,, A is unitary if and only if the rows and columns of A are orthonormal, sets., One sees here an example of the power of the theorem which states that a, one-sided inverse for a matrix is a two-sided inverse. Applying this theorem, as we did above, say to real matrices, we have the following: Suppose we, have a square array of real numbers such that the sum of the squares of, the entries in each row is 1 and distinct rows are orthogonal., Then the, sum of the squares of the entries in each column is 1 and distinct columns, are orthogonal., Write down the proof of this for a 3 X 3 array, without, using any knowledge of matrices, and you should be reasonably impressed., , nal,, , Dejinition., A real or complex, if AtA = I., , n X n matrix A is said to be orthogo-, , A real orthogonal, matrix is unitary;, and,, orthogonal if and only if each of its entries is real., EXAMPLE, , 27. We give, , some examples, , a unitary, , of unitary, , matrix, , is, , and orthogonal, , matrices., (a), unitary, ICI = 1,, (b), , A 1 X 1 matrix [c] is orthogonal, if and only if c = &l, and, if and only if EC = 1. The latter condition means (of course) that, or c = eis, where 0 is real., Let, , Then A is orthogonal, , if and only if, At, , =, , A-1, , zz, a&[-:, , The determinant, of any orthogonal, A is orthogonal if and only if, , A=, or, , -3, , matrix, , is easily seen to be f 1. Thus, , [ 1, 1 1, , A = -;, , -b, , ab, , -;, , a
Page 313 :
Sec. 8.4, , Unitary Operators, , where a2 + b2 = 1. The two cases are distinguished by the value of det A., (c) The well-known, relations between the trigonometric, functions, show that the matrix, A0 =, , ;;, [, , ;, , -sin ’, cos e1, , is orthogonal., If 0 is a real number, then Au is the matrix in the standard, ordered basis for R2 of the linear operator US, rotation through the angle 8., The statement that As is a real orthogonal matrix (hence unitary) simply, means that Us is a unitary operator, i.e., preserves dot products., (d) Let, , Then A is unitary, , The determinant, complex number, , if and only if, , B, 6, 6, d, 1 1=a&c[-f -a"]*, , of a unitary matrix has absolute value, of the form eis, 0 real. Thus A is unitary, , where 0 is a real number,, (a(2 + (b[2 = 1., , and a, b are complex, , 1, and is thus a, if and only if, , numbers, , such that, , As noted earlier, the unitary operators on an inner product space, form a group. From this and Theorem 13 it follows that the set U(n) of, all n X n unitary matrices is also a group. Thus the inverse of a unitary, matrix and the product of two unitary matrices are again unitary. Of, course this is easy to see directly. An n X n matrix A with complex entries, is unitary if and only if A-’ = A*. Thus, if A is unitary, we have (A-l)-’, =, A = (A*)-’ = (A-l)*., If A and B are n X n unitary, matrices, then, (A@-’, = B-IA-1 = B*A* = (AB)*., The Gram-Schmidt, process in C” has an interesting, matrices that involves the group U(n)., , corollary, , for, , Theorem, 14. For every invertible complex n X n matrix B there exists, a unique lower-triangular, matrix M with positive entries on the main diagonal, such that MB is unitary., , Prooj. The rows pl, . . . , &, of B form a basis for P. Let (~1,. . . , LY,, be the vectors obtained from p,, . . . , Pn by the Gram-Schmidt, process., Then, for 1 < Ic I n, {aI, . . . , o(k) is an orthogonal basis for the subspace, spanned by {pl, . . . , &}, and, , 305
Page 314 :
Hence, for each k there exist unique scalars c_kj such that

\alpha_k = \beta_k - \sum_{j<k} c_{kj}\,\beta_j.

Let U be the unitary matrix with rows

\frac{\alpha_1}{\|\alpha_1\|}, \;\ldots,\; \frac{\alpha_n}{\|\alpha_n\|}

and M the matrix defined by

M_{kj} = \begin{cases} -\,c_{kj}/\|\alpha_k\|, & \text{if } j < k \\ 1/\|\alpha_k\|, & \text{if } j = k \\ 0, & \text{if } j > k. \end{cases}

Then M is lower-triangular, in the sense that its entries above the main diagonal are 0. The entries M_kk of M on the main diagonal are all > 0, and

\frac{\alpha_k}{\|\alpha_k\|} = \sum_j M_{kj}\,\beta_j, \qquad 1 \le k \le n.

Now these equations simply say that

U = MB.

To prove the uniqueness of M, let T⁺(n) denote the set of all complex n × n lower-triangular matrices with positive entries on the main diagonal. Suppose M1 and M2 are elements of T⁺(n) such that M_iB is in U(n) for i = 1, 2. Then because U(n) is a group,

(M_1B)(M_2B)^{-1} = M_1M_2^{-1}

lies in U(n). On the other hand, although it is not entirely obvious, T⁺(n) is also a group under matrix multiplication. One way to see this is to consider the geometric properties of the linear transformations

X \mapsto MX, \qquad M \text{ in } T^+(n),

on the space of column matrices. Thus M2⁻¹, M1M2⁻¹, and (M1M2⁻¹)⁻¹ are all in T⁺(n). But, since M1M2⁻¹ is in U(n), (M1M2⁻¹)⁻¹ = (M1M2⁻¹)*. The transpose or conjugate transpose of any lower-triangular matrix is an upper-triangular matrix. Therefore, M1M2⁻¹ is simultaneously upper- and lower-triangular, i.e., diagonal. A diagonal matrix is unitary if and only if each of its entries on the main diagonal has absolute value 1; if the diagonal entries are all positive, they must equal 1. Hence M1M2⁻¹ = I and M1 = M2.
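Since the proof of Theorem 14 is constructive, it can be carried out numerically. The sketch below is an editorial illustration, not part of the original text; it assumes Python with NumPy, the helper name lower_triangular_M is introduced here for convenience, and the test matrix B is random data. It follows the row-by-row Gram-Schmidt process of the proof and then reads M off from U = MB.

import numpy as np

def lower_triangular_M(B):
    # Return M (lower-triangular, positive main diagonal) and U = MB (unitary),
    # following the Gram-Schmidt construction in the proof of Theorem 14.
    n = B.shape[0]
    alphas = []
    for k in range(n):
        a = B[k].astype(complex)
        for prev in alphas:
            a = a - (np.vdot(prev, B[k]) / np.vdot(prev, prev)) * prev
        alphas.append(a)
    U = np.array([a / np.linalg.norm(a) for a in alphas])   # rows alpha_k / ||alpha_k||
    M = U @ np.linalg.inv(B)                                # U = MB  =>  M = U B^{-1}
    return M, U

rng = np.random.default_rng(2)
n = 4
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
M, U = lower_triangular_M(B)

assert np.allclose(U.conj().T @ U, np.eye(n))   # MB is unitary
assert np.allclose(np.triu(M, 1), 0)            # M is lower-triangular
assert np.all(M.diagonal().real > 0)            # positive entries on the main diagonal

For a well-conditioned B the assertions hold up to floating-point round-off.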
Page 315 :
Sec. 8.4, , Unitary Operators, , Let GL(n) denote the set of all invertible, complex n X n matrices., Then GL(n) is also a group under matrix multiplication., This group is, linear, group., Theorem 14 is equivalent, to the folcalled the general, lowing result., Corollary., For each B in GL(n), such that N is in T+(n), U is in U(n),, , there exist unique matrices N and U, and, , B=N.U., Proof. By the theorem there is a unique, that MB is in U(n). Let MB = U and N = M-l., B = N . U. On the other hand, if we are given, such that N is in T+(n), U is in U(n), and B =, U(n) and N-l is the unique matrix M which, theorem; furthermore, U is necessarily N-‘B., 1, , matrix M in T+(n) such, Then N is in T+(n) and, any elements N and U, N . U, then N-lB is in, is characterized, by the, , EXAMPLE 28. Let x1 and x2 be real numbers, and $1 # 0. Let, , such that xi + xg = 1, , B=;, 4”, 0., [ 1, 0, , Applying, vectors, , the Gram-Schmidt, , Let U be the matrix, , NOW multiplying, , we find that, , 1, , process to the rows of B, we obtain, , w, , =, , (XI,, , a2, , =, , (0,, , =, , x1(-22,, , a3, , 0, , x2,0>, 1,, , 0), , Xl,, , x2(x1,, , x2,0>, , 0), , = (0, 0, 1)., , with rows CXI,(cY~/x~), 013.Then U is unitary,, , by the inverse of, , and, , the, , 307
Page 316 :
Inner Product Spaces, , Chap. 8, , Let us now consider briefly change of coordinates in an inner product, space. Suppose V is a finite-dimensional, inner product space and that, 63 = {al, . . . , cxn) and ~$3’= {a:, . . . , CXA) are two ordered orthonormal, bases for V. There is a unique (necessarily invertible), n X n matrix P, such that, [al&q = P-l[a]&, for every a! in V. If U is the unique linear operator on V defined, 7Jaj = ol$, then P is the matrix of U in the ordered basis ~3:, CY&= i, , by, , PjkCYj., , j=l, , Since G?,and 6~’ are orthonormal, bases, U is a unitary operator, a unitary matrix. If T is any linear operator on V, then, , CT&w= P-‘[T]&’, , and P is, , = P*[T$$,, , Let A and B be complex n X n matrices. We say that B, to A if there is an n X n unitary matrix P such, equivalent, to A if there, that B = P-‘Al?. We say that B is orthogonally, is an n X n orthogonal matrix P such that B = P-‘AP., , Dejinition., is unitarily, , equivalent, , With this definition,, what we observed above may be stated as, follows: If &Aand a’ are two ordered orthonormal, bases for V, then, for, each linear operator T on V, the matrix [T]~J is unitarily, equivalent, to, the matrix [T]a. In case V is a real inner product space, these matrices, are orthogonally, equivalent,, via a real orthogonal matrix., , Exercises, 1. Find a unitary matrix which is not orthogonal, and find an orthogonal matrix, which is not unitary., , 2. Let V be the space of complex n X n matrices with inner product (AIB) =, tr (AB*). For each M in V, let TM be the linear operator defined by T&A) = MA., Show that TM is unitary if and only if M is a unitary matrix., 3. Let V be the set of complex numbers, regarded as a real vector space., (a) Show that (c#) = Re (c$) defines an inner product on 8., (b) Exhibit an (inner product space) isomorphism of V onto R2 with the, standard inner product., (c) For each y in V, let M, be the linear operator on V defined by M,(cY) = ya., Show that (M,)* = M-,., (d) For which complex numbers y is M, self-adjoint?, (e) For which y is M, unitary?
Page 317 :
Sec. 8.4, (f), (g), (h), (i), for T to, (j), , Unitary Operators, For which y is M, positive?, What is det (M,)?, Find the matrix of M, in the basis {I, i} ., If T is a linear operator on V, find necessary and sufficient, be an M,., Find a unitary operator on V which is not an M,., , 4. Let V be R2, with the standard inner product. If U is a unitary, show that the matrix of U in the standard ordered basis is either, cos e, sin e, , -sin e, cos 8, , 1, , or, , [, , e, , cos, sin 0, , sin e, --cos e, , conditions, , operator, , on V,, , 1, , for some real 0, 0 5 8 < 2a. Let Ue be the linear operator corresponding, to the, first matrix, i.e., Ue is rotation through the angle 0. Now convince yourself that, or reflection about the cl-axis, every unitary, operator on V is either a rotation,, followed by a rotation., (a) What is USU~?, (b) Show that Vi = U-0., (c) Let 4 be a fixed real number,, and let 6~ = {al, cyz) be the orthonormal, basis obtained, by rotating, (Q, Q} through, the angle 4, i.e., ‘pi = U+E~. If 0 is, another real number, what is the matrix of US in the ordered basis a?, 5. Let V be R3, with the standard inner product. Let W be the plane spanned, by a! = (1, 1, 1) and p = (1, 1, -2). Let U be the linear operator defined, geothrough the angle 0, about the straight line, metrically,, as follows: U is rotation, through the origin which is orthogonal, to TV. There are actually two such rotations, -choose, one. Find the matrix of U in the standard ordered basis. (Here is one, way you might proceed. Find (Y~and ay2 which form an orthonormal, basis for W., Let (~3 be a vector of norm 1 which is orthogonal, to W. Find the matrix of U in, the basis {LL~, 02, cyQ}. Perform a change of basis.), 6. Let V be a finite-dimensional, inner product space, and let W be a subspace, of V. Then V = W @ WL, that is, each 01 in V is uniquely expressible in the form, (Y = /3 + y, with /3 in W and y in WI. Define a linear operator U by Ua = p - y., (a) Prove that U is both self-adjoint, and unitary., (b) If V is R3 with the standard inner product and lJ7 is the subspace spanned, by (1, 0, I), find the matrix of U in the standard ordered basis., 7. Let V be a complex inner product space and T a self-adjoint linear, on V. Show that, (a) lj~ll + iToll\ = 11~~- iTaj( for every (Y in V., (b) cy + iTcr = /3 + &T@if and only if Q = p., (c) I + iT is non-singular., (d) 1 - iT is non-singular., and prove that, (e) Now suppose V is finite-dimensional,, , U = (I - iT)(I, is a unitary, , U = f(T),, , operator;, where!(z), , U is called the Cayley, = (1 - iz)/(l, + ix)., , operator, , + iT)-’, transform, , of T. In a certain, , sense,, , 309
Page 318 :
SlO, , Inner, , Product Spaces, , Chap. 8, , 8. If f3 is a real number,, , prove that the following, , [, , ;;;, , -::;I>, , [yy, , matrices, , are unitarily, , equivalent, , e!q*, , 9. Let V be a finite-dimensional, inner product space and T a positive linear, operator on V. Let pT be the inner product on V defined by pT(a, p) = (Ta#?)., Let C be a linear operator on V and U* its adjoint with respect to ( / ). Prove, that U is unitary with respect to the inner product pT if and only if T = U*TU., , V be a finite-dimensional, inner product space. For each (Y, /3 in V, let, Tar,0 be the linear operator on V defined by T&y), = (~]/3)cu. Show that, 10. Let, , (4, (b), Cc), (d), , C,S = TB+., trace (Tad, = (43., Tm&“,,s, = T~,cms., Under what conditions, , is Tu,o self-adjoint?, , inner product space over the field F, and let L(V, V), 11. Let V be an n-dimensional, be the space of linear operators on V. Show that there is a unique inner product, that IITu,o/12 = il,\jzll/?12, for all (Y, @ in V. (T,,p, on L(V, V) with the property, is the operator defined in Exercise 10.) Find an isomorphism, between L(V, V), with this inner product and the space of n X n matrices over F, with the inner, product (AIB) = tr (AB*)., 12. Let V be a finite-dimensional, inner product space., how to construct, some linear operators on V which, unitary., Now prove that there are no others, i.e., that, operator arises from some subspace W as we described, , In Exercise 6, we showed, are both self-adjoint, and, every self-adjoint, unitary, in Exercise 6., , 13. Let V and W be finite-dimensional, inner product, spaces having the same, dimension., Let, U be an isomorphism, of V onto W. Show that:, T + UTU-1 is an isomorphism, of the vector space L(V, V), (a) The mapping, onto the vector space L(W, W)., (b) trace (UTU-1) = trace (T) for each T in L(V, V)., (c) UT,,BU-~ = TuC,ub (T,,b defined in Exercise 10)., = UT*U-l., (d) (UTU-‘)*, (e) If we equip L(V, V) with inner product, (T1/TP) = trace (TIT,*), and, similarly, for L(W, W), then T + UTU-1 is an inner product space isomorphism., 14. If V is an inner product space, a rigid motion, is any function, T from V, into V (not necessarily linear) such that IITcx - T/3\ / = I[a! - PI 1 for all ac, p in V., One example of a rigid motion is a linear unitary, operator. Another example is, translation, by a fixed vector y:, T,(a), , = a + Y, , (a) Let V be R2 with the standard inner product. Suppose T is a rigid motion, of V and that T(0) = 0. Prove that T is linear and a unitary operator., (b) Use the result of part (a) to prove that every rigid motion of R2 is composed of a translation,, followed by a unitary operator., (c) Now show that a rigid motion of R* is either a translation, followed by a, rotation,, or a translation, followed by a reflection followed by a rotation.
Page 319 :
sec. 8.5, , Normal, , 15. A unitary, , operator, , operator on R4 (with the standard, which preserves the quadratic form, 11(x, y, 2, t)p, , inner product), , is simply, , Operators, a linear, , = x2 + y2 + z2 + t2, , that is, a linear operator U such that 11CLU//~ = llo1\2 for all LY in R4. In a certain, part of the theory of relativity,, it is of interest to find the linear operators T which, preserve the form, 11(x, y, 2, t)lI”L = t2 - x2 - y2 - 22., Now Ij 11; does not come from an inner product,, but from something, called, the ‘Lorentz metric’ (which we shall not go into). For that reason, a linear operator, T on R4 such that ~\TcY~\: = ll~u\\& for every a in R4, is called a Lorentz, transformation., , (a) Show that the function, , U(x, Y,, , U defined, 2, t) =, , by, t+x, , 1 y-k, , y+iz, t-x, , 1, , is an isomorphism, of R4 onto the real vector space H of all self-adjoint, 2 X 2, complex matrices., (b) Show that \iali% = det (Ucr)., (c) Suppose T is a (real) linear operator on the space H of 2 X 2 self-adjoint, matrices. Show that L = U-1TU is a linear operator on R4., (d) Let M be any 2 X 2 complex matrix. Show that T&A), = M*AM, defines, a linear operator TM on H. (Be sure you check that TM maps H into H.), (e) If dl is a 2 X 2 matrix such that ldet dlj = 1, show that LM = U-‘T,U, is a Lorentz transformation, on R4., (f) Find a Lorentz transformation, which is not an L.M., , 8.5., The, problem., , principal, objective, in this, If T is a linear operator, , space V, under what conditions, , section, , is the, , solution, , Normal, , Operators, , of the following, , on a finite-dimensional, inner product, does V have an orthonormal, basis con-, , sisting, of characteristic, vectors for T? In other words, when is there an, orthonormal, basis 63 for V, such that the matrix of T in the basis @ is, , diagonal?, We shall begin by deriving some necessary conditions on T, which, we shall subsequently, show are sufficient. Suppose C%= (aI, . . . , cy,} is, an orthonormal, basis for V with the property, (8-16), , Tai = cjai,, , j = 1, . . . ) n., , This simply says that the matrix of Tin the ordered basis @ is, matrix with diagonal entries cl, . . . , en, The adjoint operator, sented in this same ordered basis by the conjugate transpose, the diagonal matrix with diagonal entries i?i, . . . , F,. If V is, , the diagonal, T* is reprematrix, i.e.,, a real inner, , 311
If V is a real inner product space, the scalars c₁, …, cₙ are (of course) real, and so it must be that T = T*. In other words, if V is a finite-dimensional real inner product space and T is a linear operator for which there is an orthonormal basis of characteristic vectors, then T must be self-adjoint. If V is a complex inner product space, the scalars c₁, …, cₙ need not be real, i.e., T need not be self-adjoint. But notice that T must satisfy
(8-17)  TT* = T*T.
For, any two diagonal matrices commute, and since T and T* are both represented by diagonal matrices in the ordered basis ℬ, we have (8-17). It is a rather remarkable fact that in the complex case this condition is also sufficient to imply the existence of an orthonormal basis of characteristic vectors.

Definition. Let V be a finite-dimensional inner product space and T a linear operator on V. We say that T is normal if it commutes with its adjoint, i.e., TT* = T*T.

Any self-adjoint operator is normal, as is any unitary operator. Any scalar multiple of a normal operator is normal; however, sums and products of normal operators are not generally normal. Although it is by no means necessary, we shall begin our study of normal operators by considering self-adjoint operators.

Theorem 15. Let V be an inner product space and T a self-adjoint linear operator on V. Then each characteristic value of T is real, and characteristic vectors of T associated with distinct characteristic values are orthogonal.

Proof. Suppose c is a characteristic value of T, i.e., that Tα = cα for some non-zero vector α. Then
c(α|α) = (cα|α) = (Tα|α) = (α|Tα) = (α|cα) = c̄(α|α).
Since (α|α) ≠ 0, we must have c = c̄. Suppose we also have Tβ = dβ with β ≠ 0. Then
c(α|β) = (Tα|β) = (α|Tβ) = (α|dβ) = d̄(α|β) = d(α|β).
If c ≠ d, then (α|β) = 0. ∎
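Theorem 15 is easy to test numerically. The sketch below, in Python with NumPy, is our own illustration rather than anything from the text; the matrix A is an arbitrary self-adjoint example. It checks that A is normal, that its characteristic values are real, and that characteristic vectors belonging to its two distinct characteristic values are orthogonal.

```python
import numpy as np

# An arbitrary self-adjoint (Hermitian) matrix, viewed as an operator on C^2.
A = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])

# Every self-adjoint operator is normal: A A* = A* A.
assert np.allclose(A @ A.conj().T, A.conj().T @ A)

# Theorem 15: the characteristic values are real ...
values, vectors = np.linalg.eig(A)
assert np.allclose(values.imag, 0)

# ... and characteristic vectors for distinct characteristic values are orthogonal.
v1, v2 = vectors[:, 0], vectors[:, 1]
assert abs(np.vdot(v1, v2)) < 1e-10
```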
It should be pointed out that Theorem 15 says nothing about the existence of characteristic values or characteristic vectors.

Theorem 16. On a finite-dimensional inner product space of positive dimension, every self-adjoint operator has a (non-zero) characteristic vector.

Proof. Let V be an inner product space of dimension n, where n > 0, and let T be a self-adjoint operator on V. Choose an orthonormal basis ℬ for V and let A = [T]_ℬ. Since T = T*, we have A = A*. Now let W be the space of n × 1 matrices over C, with inner product (X|Y) = Y*X. Then U(X) = AX defines a self-adjoint linear operator U on W. The characteristic polynomial, det (xI − A), is a polynomial of degree n over the complex numbers; every polynomial over C of positive degree has a root. Thus, there is a complex number c such that det (cI − A) = 0. This means that A − cI is singular, or that there exists a non-zero X such that AX = cX. Since the operator U (multiplication by A) is self-adjoint, it follows from Theorem 15 that c is real. If V is a real vector space, we may choose X to have real entries. For then A and A − cI have real entries, and since A − cI is singular, the system (A − cI)X = 0 has a non-zero real solution X. It follows that there is a non-zero vector α in V such that Tα = cα. ∎

There are several comments we should make about the proof.
(1) The proof of the existence of a non-zero X such that AX = cX had nothing to do with the fact that A was Hermitian (self-adjoint). It shows that any linear operator on a finite-dimensional complex vector space has a characteristic vector. In the case of a real inner product space, the self-adjointness of A is used very heavily, to tell us that each characteristic value of A is real and hence that we can find a suitable X with real entries.
(2) The argument shows that the characteristic polynomial of a self-adjoint matrix has real coefficients, in spite of the fact that the matrix may not have real entries.
(3) The assumption that V is finite-dimensional is necessary for the theorem; a self-adjoint operator on an infinite-dimensional inner product space need not have a characteristic value.

EXAMPLE 29. Let V be the vector space of continuous complex-valued (or real-valued) functions on the unit interval, 0 ≤ t ≤ 1, with the inner product
(f|g) = ∫₀¹ f(t) \overline{g(t)} dt.
The operator 'multiplication by t,' (Tf)(t) = tf(t), is self-adjoint. Let us suppose that Tf = cf. Then
(t − c)f(t) = 0,   0 ≤ t ≤ 1
and so f(t) = 0 for t ≠ c. Since f is continuous, f = 0. Hence T has no characteristic values (vectors).

Theorem 17. Let V be a finite-dimensional inner product space, and let T be any linear operator on V. Suppose W is a subspace of V which is invariant under T. Then the orthogonal complement of W is invariant under T*.

Proof. We recall that the fact that W is invariant under T does not mean that each vector in W is left fixed by T; it means that if α is in W then Tα is in W. Let β be in W⊥. We must show that T*β is in W⊥, that is, that (α|T*β) = 0 for every α in W. If α is in W, then Tα is in W, so (Tα|β) = 0. But (Tα|β) = (α|T*β). ∎

Theorem 18. Let V be a finite-dimensional inner product space, and let T be a self-adjoint linear operator on V. Then there is an orthonormal basis for V, each vector of which is a characteristic vector for T.

Proof. We are assuming dim V > 0. By Theorem 16, T has a characteristic vector α. Let α₁ = α/||α|| so that α₁ is also a characteristic vector for T and ||α₁|| = 1. If dim V = 1, we are done. Now we proceed by induction on the dimension of V. Suppose the theorem is true for inner product spaces of dimension less than dim V. Let W be the one-dimensional subspace spanned by the vector α₁. The statement that α₁ is a characteristic vector for T simply means that W is invariant under T. By Theorem 17, the orthogonal complement W⊥ is invariant under T* = T. Now W⊥, with the inner product from V, is an inner product space of dimension one less than the dimension of V. Let U be the linear operator induced on W⊥ by T, that is, the restriction of T to W⊥. Then U is self-adjoint, and by the induction hypothesis, W⊥ has an orthonormal basis {α₂, …, αₙ} consisting of characteristic vectors for U. Now each of these vectors is also a characteristic vector for T, and since V = W ⊕ W⊥, we conclude that {α₁, …, αₙ} is the desired basis for V. ∎

Corollary. Let A be an n × n Hermitian (self-adjoint) matrix. Then there is a unitary matrix P such that P⁻¹AP is diagonal (A is unitarily equivalent to a diagonal matrix). If A is a real symmetric matrix, there is a real orthogonal matrix P such that P⁻¹AP is diagonal.

Proof. Let V be Cⁿˣ¹, with the standard inner product, and let T be the linear operator on V which is represented by A in the standard ordered basis. Since A = A*, we have T = T*. Let ℬ = {α₁, …, αₙ} be an ordered orthonormal basis for V, such that Tαⱼ = cⱼαⱼ, j = 1, …, n. If D = [T]_ℬ, then D is the diagonal matrix with diagonal entries c₁, …, cₙ. Let P be the matrix with column vectors α₁, …, αₙ. Then D = P⁻¹AP.
In case each entry of A is real, we can take V to be Rⁿ, with the standard inner product, and repeat the argument. In this case, P will be a unitary matrix with real entries, i.e., a real orthogonal matrix. ∎

Combining Theorem 18 with our comments at the beginning of this section, we have the following: If V is a finite-dimensional real inner product space and T is a linear operator on V, then V has an orthonormal basis of characteristic vectors for T if and only if T is self-adjoint. Equivalently, if A is an n × n matrix with real entries, there is a real orthogonal matrix P such that PᵗAP is diagonal if and only if A = Aᵗ. There is no such result for complex symmetric matrices. In other words, for complex matrices there is a significant difference between the conditions A = Aᵗ and A = A*.

Having disposed of the self-adjoint case, we now return to the study of normal operators in general. We shall prove the analogue of Theorem 18 for normal operators, in the complex case. There is a reason for this restriction. A normal operator on a real inner product space may not have any non-zero characteristic vectors. This is true, for example, of all but two rotations in R².

Theorem 19. Let V be a finite-dimensional inner product space and T a normal operator on V. Suppose α is a vector in V. Then α is a characteristic vector for T with characteristic value c if and only if α is a characteristic vector for T* with characteristic value c̄.

Proof. Suppose U is any normal operator on V. Then ||Uα|| = ||U*α||. For using the condition UU* = U*U one sees that
||Uα||² = (Uα|Uα) = (α|U*Uα) = (α|UU*α) = (U*α|U*α) = ||U*α||².
If c is any scalar, the operator U = T − cI is normal. For (T − cI)* = T* − c̄I, and it is easy to check that UU* = U*U. Thus
||(T − cI)α|| = ||(T* − c̄I)α||
so that (T − cI)α = 0 if and only if (T* − c̄I)α = 0. ∎

Definition. A complex n × n matrix A is called normal if AA* = A*A.

It is not so easy to understand what normality of matrices or operators really means; however, in trying to develop some feeling for the concept, the reader might find it helpful to know that a triangular matrix is normal if and only if it is diagonal.

Theorem 20. Let V be a finite-dimensional inner product space, T a linear operator on V, and ℬ an orthonormal basis for V.
Suppose that the matrix A of T in the basis ℬ is upper-triangular. Then T is normal if and only if A is a diagonal matrix.

Proof. Since ℬ is an orthonormal basis, A* is the matrix of T* in ℬ. If A is diagonal, then AA* = A*A, and this implies TT* = T*T. Conversely, suppose T is normal, and let ℬ = {α₁, …, αₙ}. Then, since A is upper-triangular, Tα₁ = A₁₁α₁. By Theorem 19 this implies T*α₁ = Ā₁₁α₁. On the other hand,
T*α₁ = Σⱼ (A*)ⱼ₁ αⱼ = Σⱼ Ā₁ⱼ αⱼ.
Therefore, A₁ⱼ = 0 for every j > 1. In particular, A₁₂ = 0, and since A is upper-triangular, it follows that
Tα₂ = A₂₂α₂.
Thus T*α₂ = Ā₂₂α₂ and A₂ⱼ = 0 for all j ≠ 2. Continuing in this fashion, we find that A is diagonal. ∎

Theorem 21. Let V be a finite-dimensional complex inner product space and let T be any linear operator on V. Then there is an orthonormal basis for V in which the matrix of T is upper-triangular.

Proof. Let n be the dimension of V. The theorem is true when n = 1, and we proceed by induction on n, assuming the result is true for linear operators on complex inner product spaces of dimension n − 1. Since V is a finite-dimensional complex inner product space, there is a unit vector α in V and a scalar c such that
T*α = cα.
Let W be the orthogonal complement of the subspace spanned by α and let S be the restriction of T to W. By Theorem 17, W is invariant under T. Thus S is a linear operator on W. Since W has dimension n − 1, our inductive assumption implies the existence of an orthonormal basis {α₁, …, α_{n−1}} for W in which the matrix of S is upper-triangular; let αₙ = α. Then {α₁, …, αₙ} is an orthonormal basis for V in which the matrix of T is upper-triangular. ∎

This theorem implies the following result for matrices.

Corollary. For every complex n × n matrix A there is a unitary matrix U such that U⁻¹AU is upper-triangular.

Now combining Theorem 21 and Theorem 20, we immediately obtain the following analogue of Theorem 18 for normal operators.
Theorem 22. Let V be a finite-dimensional complex inner product space and T a normal operator on V. Then V has an orthonormal basis consisting of characteristic vectors for T.

Again there is a matrix interpretation.

Corollary. For every normal matrix A there is a unitary matrix P such that P⁻¹AP is a diagonal matrix.

Exercises

1. For each of the following real symmetric matrices A, find a real orthogonal matrix P such that PᵗAP is diagonal.
[The three 2 × 2 symmetric matrices are illegible in this scan.]

2. Is a complex symmetric matrix self-adjoint? Is it normal?

3. For

    A = [ 1  2  3 ]
        [ 2  3  4 ]
        [ 3  4  5 ]

there is a real orthogonal matrix P such that PᵗAP = D is diagonal. Find such a diagonal matrix D.

4. Let V be C², with the standard inner product. Let T be the linear operator on V which is represented in the standard ordered basis by the matrix
[2 × 2 matrix illegible in this scan].
Show that T is normal, and find an orthonormal basis for V, consisting of characteristic vectors for T.

5. Give an example of a 2 × 2 matrix A such that A² is normal, but A is not normal.

6. Let T be a normal operator on a finite-dimensional complex inner product space. Prove that T is self-adjoint, positive, or unitary according as every characteristic value of T is real, positive, or of absolute value 1. (Use Theorem 22 to reduce to a similar question about diagonal matrices.)

7. Let T be a linear operator on the finite-dimensional inner product space V, and suppose T is both positive and unitary. Prove T = I.

8. Prove T is normal if and only if T = T₁ + iT₂, where T₁ and T₂ are self-adjoint operators which commute.

9. Prove that a real symmetric matrix has a real symmetric cube root; i.e., if A is real symmetric, there is a real symmetric B such that B³ = A.

10. Prove that every positive matrix is the square of a positive matrix.
11. Prove that a normal and nilpotent operator is the zero operator.

12. If T is a normal operator, prove that characteristic vectors for T which are associated with distinct characteristic values are orthogonal.

13. Let T be a normal operator on a finite-dimensional complex inner product space. Prove that there is a polynomial f, with complex coefficients, such that T* = f(T). (Represent T by a diagonal matrix, and see what f must be.)

14. If two normal operators commute, prove that their product is normal.
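As a numerical counterpart to the corollary of Theorem 22, the following sketch (ours, not the text's) diagonalizes a normal matrix by a unitary matrix. The matrix A below is an arbitrary normal matrix which is neither self-adjoint nor unitary; since its characteristic values are distinct, the unit characteristic vectors returned by eig already form the required unitary P.

```python
import numpy as np

# An arbitrary normal matrix that is neither self-adjoint nor unitary.
A = np.array([[1.0, 1j],
              [1j, 1.0]])
assert np.allclose(A @ A.conj().T, A.conj().T @ A)        # A is normal

# Distinct characteristic values, so the unit characteristic vectors
# are orthogonal (Exercise 12) and assemble into a unitary matrix P.
d, P = np.linalg.eig(A)
assert np.allclose(P.conj().T @ P, np.eye(2))              # P is unitary
assert np.allclose(np.linalg.inv(P) @ A @ P, np.diag(d))   # P^{-1} A P is diagonal
```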
9. Operators on Inner Product Spaces

9.1. Introduction

We regard most of the topics treated in Chapter 8 as fundamental, the material that everyone should know. The present chapter is for the more advanced student or for the reader who is eager to expand his knowledge concerning operators on inner product spaces. With the exception of the Principal Axis theorem, which is essentially just another formulation of Theorem 18 on the orthogonal diagonalization of self-adjoint operators, and the other results on forms in Section 9.2, the material presented here is more sophisticated and generally more involved technically. We also make more demands of the reader, just as we did in the later parts of Chapters 5 and 7. The arguments and proofs are written in a more condensed style, and there are almost no examples to smooth the way; however, we have seen to it that the reader is well supplied with generous sets of exercises.
The first three sections are devoted to results concerning forms on inner product spaces and the relation between forms and linear operators. The next section deals with spectral theory, i.e., with the implications of Theorems 18 and 22 of Chapter 8 concerning the diagonalization of self-adjoint and normal operators. In the final section, we pursue the study of normal operators, treating in particular the real case, and in so doing we examine what the primary decomposition theorem of Chapter 6 says about normal operators.
9.2. Forms on Inner Product Spaces

If T is a linear operator on a finite-dimensional inner product space V, the function f defined on V × V by
f(α, β) = (Tα|β)
may be regarded as a kind of substitute for T. Many questions about T are equivalent to questions concerning f. In fact, it is easy to see that f determines T. For if ℬ = {α₁, …, αₙ} is an orthonormal basis for V, then the entries of the matrix of T in ℬ are given by
Aⱼₖ = (Tαₖ|αⱼ) = f(αₖ, αⱼ).
It is important to understand why f determines T from a more abstract point of view. The crucial properties of f are described in the following definition.

Definition. A (sesqui-linear) form on a real or complex vector space V is a function f on V × V with values in the field of scalars such that
f(cα + β, γ) = cf(α, γ) + f(β, γ)
f(α, cβ + γ) = c̄f(α, β) + f(α, γ)
for all α, β, γ in V and all scalars c.

Thus, a sesqui-linear form is a function on V × V such that f(α, β) is a linear function of α for fixed β and a conjugate-linear function of β for fixed α. In the real case, f(α, β) is linear as a function of each argument; in other words, f is a bilinear form. In the complex case, the sesqui-linear form f is not bilinear unless f = 0. In the remainder of this chapter, we shall omit the adjective 'sesqui-linear' unless it seems important to include it.
If f and g are forms on V and c is a scalar, it is easy to check that cf + g is also a form. From this it follows that any linear combination of forms on V is again a form. Thus the set of all forms on V is a subspace of the vector space of all scalar-valued functions on V × V.

Theorem 1. Let V be a finite-dimensional inner product space and f a form on V. Then there is a unique linear operator T on V such that
f(α, β) = (Tα|β)
for all α, β in V, and the map f → T is an isomorphism of the space of forms onto L(V, V).

Proof. Fix a vector β in V. Then α → f(α, β) is a linear function on V. By Theorem 6 there is a unique vector β′ in V such that f(α, β) = (α|β′) for every α. We define a function U from V into V by setting Uβ = β′. Then
(α|U(cβ + γ)) = f(α, cβ + γ) = c̄f(α, β) + f(α, γ) = c̄(α|Uβ) + (α|Uγ) = (α|cUβ + Uγ)
for all α, β, γ in V and all scalars c. Thus U is a linear operator on V and T = U* is an operator such that f(α, β) = (Tα|β) for all α and β. If we also have f(α, β) = (T′α|β), then
(Tα − T′α|β) = 0
for all α and β; so Tα = T′α for all α. Thus for each form f there is a unique linear operator T_f such that
f(α, β) = (T_f α|β)
for all α, β in V. If f and g are forms and c a scalar, then
(cf + g)(α, β) = (T_{cf+g} α|β)
= cf(α, β) + g(α, β)
= c(T_f α|β) + (T_g α|β)
= ((cT_f + T_g)α|β)
for all α and β in V. Therefore,
T_{cf+g} = cT_f + T_g
so f → T_f is a linear map. For each T in L(V, V) the equation
f(α, β) = (Tα|β)
defines a form such that T_f = T, and T_f = 0 if and only if f = 0. Thus f → T_f is an isomorphism. ∎

Corollary. The equation
(f|g) = tr (T_f T_g*)
defines an inner product on the space of forms with the property that
(f|g) = Σ_{j,k} f(αₖ, αⱼ) \overline{g(αₖ, αⱼ)}
for every orthonormal basis {α₁, …, αₙ} of V.

Proof. It follows easily from Example 3 of Chapter 8 that (T, U) → tr (TU*) is an inner product on L(V, V). Since f → T_f is an isomorphism, Example 6 of Chapter 8 shows that
(f|g) = tr (T_f T_g*)
is an inner product. Now suppose that A and B are the matrices of T_f and T_g in the orthonormal basis ℬ = {α₁, …, αₙ}. Then
Aₖⱼ = (T_f αⱼ|αₖ) = f(αⱼ, αₖ)
and Bₖⱼ = (T_g αⱼ|αₖ) = g(αⱼ, αₖ). Since AB* is the matrix of T_f T_g* in the basis ℬ, it follows that
(f|g) = tr (AB*) = Σ_{j,k} Aₖⱼ \overline{Bₖⱼ}. ∎

Definition. If f is a form and ℬ = {α₁, …, αₙ} an arbitrary ordered basis of V, the matrix A with entries
Aⱼₖ = f(αₖ, αⱼ)
is called the matrix of f in the ordered basis ℬ.

When ℬ is an orthonormal basis, the matrix of f in ℬ is also the matrix of the linear transformation T_f, but in general this is not the case.
If A is the matrix of f in the ordered basis ℬ = {α₁, …, αₙ}, it follows that
f(Σₛ xₛαₛ, Σᵣ yᵣαᵣ) = Σ_{r,s} ȳᵣ Aᵣₛ xₛ
for all scalars xₛ and yᵣ (1 ≤ r, s ≤ n). In other words, the matrix A has the property that
(9-1)  f(α, β) = Y*AX
where X and Y are the respective coordinate matrices of α and β in the ordered basis ℬ.
The matrix of f in another basis
α′ⱼ = Σᵢ Pᵢⱼαᵢ   (1 ≤ j ≤ n)
is given by the equation
(9-2)  A′ = P*AP.
For
A′ⱼₖ = f(α′ₖ, α′ⱼ) = f(Σₛ Pₛₖαₛ, Σᵣ Pᵣⱼαᵣ) = Σ_{r,s} P̄ᵣⱼ Aᵣₛ Pₛₖ = (P*AP)ⱼₖ.
Since P* = P⁻¹ for unitary matrices, it follows from (9-2) that results concerning unitary equivalence may be applied to the study of forms.

Theorem 2. Let f be a form on a finite-dimensional complex inner product space V. Then there is an orthonormal basis for V in which the matrix of f is upper-triangular.

Proof. Let T be the linear operator on V such that f(α, β) = (Tα|β) for all α and β. By Theorem 21, there is an orthonormal basis
{α₁, …, αₙ} in which the matrix of T is upper-triangular. Hence
f(αₖ, αⱼ) = (Tαₖ|αⱼ) = 0
when j > k. ∎

Definition. A form f on a real or complex vector space V is called Hermitian if
f(α, β) = \overline{f(β, α)}
for all α and β in V.

If T is a linear operator on a finite-dimensional inner product space V and f is the form
f(α, β) = (Tα|β)
then \overline{f(β, α)} = (α|Tβ) = (T*α|β); so f is Hermitian if and only if T is self-adjoint.
When f is Hermitian, f(α, α) is real for every α, and on complex spaces this property characterizes Hermitian forms.

Theorem 3. Let V be a complex vector space and f a form on V such that f(α, α) is real for every α. Then f is Hermitian.

Proof. Let α and β be vectors in V. We must show that f(α, β) = \overline{f(β, α)}. Now
f(α + β, α + β) = f(α, α) + f(α, β) + f(β, α) + f(β, β).
Since f(α + β, α + β), f(α, α), and f(β, β) are real, the number f(α, β) + f(β, α) is real. Looking at the same argument with α + iβ instead of α + β, we see that −if(α, β) + if(β, α) is real. Having concluded that two numbers are real, we set them equal to their complex conjugates and obtain
f(α, β) + f(β, α) = \overline{f(α, β)} + \overline{f(β, α)}
−if(α, β) + if(β, α) = i\overline{f(α, β)} − i\overline{f(β, α)}.
If we multiply the second equation by i and add the result to the first equation, we obtain
2f(α, β) = 2\overline{f(β, α)}. ∎

Corollary. Let T be a linear operator on a complex finite-dimensional inner product space V. Then T is self-adjoint if and only if (Tα|α) is real for every α in V.

Theorem 4 (Principal Axis Theorem). For every Hermitian form f on a finite-dimensional inner product space V, there is an orthonormal basis of V in which f is represented by a diagonal matrix with real entries.
Proof. Let T be the linear operator such that f(α, β) = (Tα|β) for all α and β in V. Then, since f(α, β) = \overline{f(β, α)} and \overline{(Tβ|α)} = (α|Tβ), it follows that
(Tα|β) = \overline{f(β, α)} = (α|Tβ)
for all α and β; hence T = T*. By Theorem 18 of Chapter 8, there is an orthonormal basis of V which consists of characteristic vectors for T. Suppose {α₁, …, αₙ} is an orthonormal basis and that
Tαⱼ = cⱼαⱼ
for 1 ≤ j ≤ n. Then
f(αₖ, αⱼ) = (Tαₖ|αⱼ) = δₖⱼcₖ
and by Theorem 15 of Chapter 8 each cₖ is real. ∎

Corollary. Under the above conditions
f(Σⱼ xⱼαⱼ, Σₖ yₖαₖ) = Σⱼ cⱼxⱼȳⱼ.

Exercises

1. Which of the following functions f, defined on vectors α = (x₁, x₂) and β = (y₁, y₂) in C², are (sesqui-linear) forms on C²?
(a) f(α, β) = 1.
(b) f(α, β) = (x₁ − ȳ₁)² + x₂ȳ₂.
(c) f(α, β) = (x₁ + ȳ₁)² − (x₁ − ȳ₁)².
(d) f(α, β) = x₁ȳ₂ − x̄₂y₁.

2. Let f be the form on R² defined by
f((x₁, x₂), (y₁, y₂)) = x₁y₁ + x₂y₂.
Find the matrix of f in each of the following bases:
{(1, 0), (0, 1)},  {(1, −1), (1, 1)},  {(1, 2), (3, 4)}.

3. Let
A = [2 × 2 complex matrix illegible in this scan]
and let g be the form (on the space of 2 × 1 complex matrices) defined by g(X, Y) = Y*AX. Is g an inner product?

4. Let V be a complex vector space and let f be a (sesqui-linear) form on V which is symmetric: f(α, β) = f(β, α). What is f?

5. Let f be the form on R² given by
f((x₁, x₂), (y₁, y₂)) = x₁y₁ + 4x₂y₂ + 2x₁y₂ + 2x₂y₁.
Find an ordered basis in which f is represented by a diagonal matrix.

6. Call the form f (left) non-degenerate if 0 is the only vector α such that f(α, β) = 0 for all β. Let f be a form on an inner product space V. Prove that f is
non-degenerate if and only if the associated linear operator T_f (Theorem 1) is non-singular.

7. Let f be a form on a finite-dimensional vector space V. Look at the definition of left non-degeneracy given in Exercise 6. Define right non-degeneracy and prove that the form f is left non-degenerate if and only if f is right non-degenerate.

8. Let f be a non-degenerate form (Exercises 6 and 7) on a finite-dimensional space V. Let L be a linear functional on V. Show that there exists one and only one vector β in V such that L(α) = f(α, β) for all α.

9. Let f be a non-degenerate form on a finite-dimensional space V. Show that each linear operator S has an 'adjoint relative to f,' i.e., an operator S′ such that f(Sα, β) = f(α, S′β) for all α, β.

9.3. Positive Forms

In this section we shall discuss non-negative (sesqui-linear) forms and their relation to a given inner product on the underlying vector space.

Definitions. A form f on a real or complex vector space V is non-negative if it is Hermitian and f(α, α) ≥ 0 for every α in V. The form f is positive if f is Hermitian and f(α, α) > 0 for all α ≠ 0.

A positive form on V is simply an inner product on V. A non-negative form satisfies all of the properties of an inner product except that some non-zero vectors may be 'orthogonal' to themselves.
Let f be a form on the finite-dimensional space V. Let ℬ = {α₁, …, αₙ} be an ordered basis for V, and let A be the matrix of f in the basis ℬ, that is, Aⱼₖ = f(αₖ, αⱼ). If α = x₁α₁ + ⋯ + xₙαₙ, then
f(α, α) = f(Σⱼ xⱼαⱼ, Σₖ xₖαₖ) = Σⱼ Σₖ xⱼx̄ₖ f(αⱼ, αₖ) = Σⱼ Σₖ Aₖⱼxⱼx̄ₖ.
So, we see that f is non-negative if and only if
(9-3)  A = A*  and  Σⱼ Σₖ Aₖⱼxⱼx̄ₖ ≥ 0 for all scalars x₁, …, xₙ.
In order that f should be positive, the inequality in (9-3) must be strict for all (x₁, …, xₙ) ≠ 0. The conditions we have derived state that f is a positive form on V if and only if the function
g(X, Y) = Y*AX
is a positive form on the space of n × 1 column matrices over the scalar field.

Theorem 5. Let F be the field of real numbers or the field of complex numbers. Let A be an n × n matrix over F. The function g defined by
(9-4)  g(X, Y) = Y*AX
is a positive form on the space Fⁿˣ¹ if and only if there exists an invertible n × n matrix P with entries in F such that A = P*P.

Proof. For any n × n matrix A, the function g in (9-4) is a form on the space of column matrices. We are trying to prove that g is positive if and only if A = P*P. First, suppose A = P*P. Then g is Hermitian and
g(X, X) = X*P*PX = (PX)*PX ≥ 0.
If P is invertible and X ≠ 0, then (PX)*PX > 0.
Now, suppose that g is a positive form on the space of column matrices. Then it is an inner product and hence there exist column matrices Q₁, …, Qₙ such that
δⱼₖ = g(Qⱼ, Qₖ) = Qₖ*AQⱼ.
But this just says that, if Q is the matrix with columns Q₁, …, Qₙ, then Q*AQ = I. Since {Q₁, …, Qₙ} is a basis, Q is invertible. Let P = Q⁻¹ and we have A = P*P. ∎

In practice, it is not easy to verify that a given matrix A satisfies the criteria for positivity which we have given thus far. One consequence of the last theorem is that if g is positive then det A > 0, because det A = det (P*P) = det P* det P = |det P|². The fact that det A > 0 is by no means sufficient to guarantee that g is positive; however, there are n determinants associated with A which have this property: If A = A* and if each of those determinants is positive, then g is a positive form.

Definition. Let A be an n × n matrix over the field F. The principal minors of A are the scalars Δₖ(A) defined by

    Δₖ(A) = det [ A₁₁ ⋯ A₁ₖ ]
                [  ⋮       ⋮  ]   ,   1 ≤ k ≤ n.
                [ Aₖ₁ ⋯ Aₖₖ ]

Lemma. Let A be an invertible n × n matrix with entries in a field F. The following two statements are equivalent.
(a) There is an upper-triangular matrix P with Pₖₖ = 1 (1 ≤ k ≤ n) such that the matrix B = AP is lower-triangular.
(b) The principal minors of A are all different from 0.

Proof. Let P be any n × n matrix and set B = AP. Then
Bⱼₖ = Σᵣ AⱼᵣPᵣₖ.
If P is upper-triangular and Pₖₖ = 1 for every k, then
Bⱼₖ = Σ_{r=1}^{k−1} AⱼᵣPᵣₖ + Aⱼₖ,   k > 1.
Now B is lower-triangular provided Bⱼₖ = 0 for j < k. Thus B will be lower-triangular if and only if
(9-5)  Σ_{r=1}^{k−1} AⱼᵣPᵣₖ = −Aⱼₖ,   1 ≤ j ≤ k − 1,  2 ≤ k ≤ n.
So, we see that statement (a) in the lemma is equivalent to the statement that there exist scalars Pᵣₖ, 1 ≤ r < k, 2 ≤ k ≤ n, which satisfy (9-5) and Pₖₖ = 1, 1 ≤ k ≤ n.
In (9-5), for each k > 1 we have a system of k − 1 linear equations for the unknowns P₁ₖ, P₂ₖ, …, Pₖ₋₁,ₖ. The coefficient matrix of that system is

    [ A₁₁     ⋯  A₁,ₖ₋₁   ]
    [  ⋮              ⋮    ]
    [ Aₖ₋₁,₁  ⋯  Aₖ₋₁,ₖ₋₁ ]

and its determinant is the principal minor Δₖ₋₁(A). If each Δₖ₋₁(A) ≠ 0, the systems (9-5) have unique solutions. We have shown that statement (b) implies statement (a) and that the matrix P is unique.
Now suppose that (a) holds. Then, as we shall see,
(9-6)  Δₖ(A) = Δₖ(B) = B₁₁ ⋯ Bₖₖ,   k = 1, …, n.
To verify (9-6), let A₁, …, Aₙ and B₁, …, Bₙ be the columns of A and B, respectively. Then
(9-7)  B₁ = A₁,
Bᵣ = Σ_{j=1}^{r−1} PⱼᵣAⱼ + Aᵣ,   r > 1.
Fix k, 1 ≤ k ≤ n. From (9-7) we see that the r-th column of the matrix [Bᵢⱼ], 1 ≤ i, j ≤ k, is obtained by adding to the r-th column of the matrix [Aᵢⱼ], 1 ≤ i, j ≤ k, a linear combination of its other columns. Such operations do not change determinants. That proves (9-6), except for the trivial observation that because B is triangular Δₖ(B) = B₁₁ ⋯ Bₖₖ. Since A and P are invertible, B is invertible. Therefore,
Δₙ(B) = B₁₁ ⋯ Bₙₙ ≠ 0
and so Δₖ(A) ≠ 0, k = 1, …, n. ∎
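The systems (9-5) translate directly into a computation. The following sketch, ours and not the text's (the function name and the test matrix are arbitrary choices), builds the unit upper-triangular matrix P of the lemma column by column and verifies that B = AP is lower-triangular.

```python
import numpy as np

def unit_upper_factor(A):
    """For A with non-zero principal minors, return the upper-triangular P
    with P_kk = 1 such that B = A P is lower-triangular (systems (9-5))."""
    n = A.shape[0]
    P = np.eye(n)
    for k in range(1, n):
        # Solve A[:k, :k] x = -A[:k, k] for the unknowns P_1k, ..., P_{k-1,k}.
        P[:k, k] = np.linalg.solve(A[:k, :k], -A[:k, k])
    return P

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])     # principal minors 2, 3, 4: all non-zero
P = unit_upper_factor(A)
B = A @ P
assert np.allclose(np.triu(B, 1), 0)   # B is lower-triangular
```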
Theorem 6. Let f be a form on a finite-dimensional vector space V, and let A be the matrix of f in an ordered basis ℬ. Then f is a positive form if and only if A = A* and the principal minors of A are all positive.

Proof. Let's do the interesting half of the theorem first. Suppose that A = A* and Δₖ(A) > 0, 1 ≤ k ≤ n. By the lemma, there exists a (unique) upper-triangular matrix P with Pₖₖ = 1 such that B = AP is lower-triangular. The matrix P* is lower-triangular, so that P*B = P*AP is also lower-triangular. Since A is self-adjoint, the matrix D = P*AP is self-adjoint. A self-adjoint triangular matrix is necessarily a diagonal matrix. By the same reasoning which led to (9-6),
Δₖ(D) = Δₖ(P*B) = Δₖ(B) = Δₖ(A).
Since D is diagonal, its principal minors are
Δₖ(D) = D₁₁ ⋯ Dₖₖ.
From Δₖ(D) > 0, 1 ≤ k ≤ n, we obtain Dₖₖ > 0 for each k.
If A is the matrix of the form f in the ordered basis ℬ = {α₁, …, αₙ}, then D = P*AP is the matrix of f in the basis {α′₁, …, α′ₙ} defined by
α′ⱼ = Σᵢ₌₁ⁿ Pᵢⱼαᵢ.
See (9-2). Since D is diagonal with positive entries on its diagonal, it is obvious that
X*DX > 0,   X ≠ 0
from which it follows that f is a positive form.
Now, suppose we start with a positive form f. We know that A = A*. How do we show that Δₖ(A) > 0, 1 ≤ k ≤ n? Let Vₖ be the subspace spanned by α₁, …, αₖ and let fₖ be the restriction of f to Vₖ × Vₖ.
Evidently fₖ is a positive form on Vₖ and, in the basis {α₁, …, αₖ}, it is represented by the matrix

    [ A₁₁ ⋯ A₁ₖ ]
    [  ⋮       ⋮  ]
    [ Aₖ₁ ⋯ Aₖₖ ].

As a consequence of Theorem 5, we noted that the positivity of a form implies that the determinant of any representing matrix is positive. ∎

There are some comments we should make, in order to complete our discussion of the relation between positive forms and matrices. What is it that characterizes the matrices which represent positive forms? If f is a form on a complex vector space and A is the matrix of f in some ordered basis, then f will be positive if and only if A = A* and
(9-8)  X*AX > 0,   for all complex X ≠ 0.
It follows from Theorem 3 that the condition A = A* is redundant, i.e., that (9-8) implies A = A*. On the other hand, if we are dealing with a real vector space the form f will be positive if and only if A = Aᵗ and
(9-9)  XᵗAX > 0,   for all real X ≠ 0.
We want to emphasize that if a real matrix A satisfies (9-9), it does not follow that A = Aᵗ. One thing which is true is that, if A = Aᵗ and (9-9) holds, then (9-8) holds as well. That is because
(X + iY)*A(X + iY) = (Xᵗ − iYᵗ)A(X + iY) = XᵗAX + YᵗAY + i[XᵗAY − YᵗAX]
and if A = Aᵗ then YᵗAX = XᵗAY.
If A is an n × n matrix with complex entries and if A satisfies (9-8), we shall call A a positive matrix. The comments which we have just made may be summarized by saying this: In either the real or complex case, a form f is positive if and only if its matrix in some (in fact, every) ordered basis is a positive matrix.
Now suppose that V is a finite-dimensional inner product space. Let f be a non-negative form on V. There is a unique self-adjoint linear operator T on V such that
(9-10)  f(α, β) = (Tα|β)
and T has the additional property that (Tα|α) ≥ 0.

Definition. A linear operator T on a finite-dimensional inner product space V is non-negative if T = T* and (Tα|α) ≥ 0 for all α in V. A positive linear operator is one such that T = T* and (Tα|α) > 0 for all α ≠ 0.
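The descriptions of positive matrices collected so far are easy to check numerically. The sketch below is our own illustration; the matrix is an arbitrary example, and the factorization A = P*P is produced here from a Cholesky factor, a device not used in the text. It verifies A = A*, the positivity of the principal minors (Theorem 6), the factorization of Theorem 5, and condition (9-8).

```python
import numpy as np

# An arbitrary self-adjoint matrix with positive principal minors.
A = np.array([[2.0, 1 + 1j, 0.0],
              [1 - 1j, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
assert np.allclose(A, A.conj().T)                                # A = A*

# Theorem 6: the principal minors of A are all positive.
minors = [np.linalg.det(A[:k, :k]).real for k in range(1, 4)]
assert all(m > 0 for m in minors)

# Theorem 5: A = P*P for an invertible P.  Here P = L*, where L is the
# lower-triangular Cholesky factor, so that P*P = L L* = A.
L = np.linalg.cholesky(A)
P = L.conj().T
assert np.allclose(P.conj().T @ P, A)

# Condition (9-8): X*AX > 0 for X != 0, spot-checked on random vectors.
rng = np.random.default_rng(0)
for _ in range(5):
    X = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    assert (X.conj() @ A @ X).real > 0
```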
If V is a finite-dimensional (real or complex) vector space and if ( | ) is an inner product on V, there is an associated class of positive linear operators on V. Via (9-10) there is a one-one correspondence between that class of positive operators and the collection of all positive forms on V. We shall use the exercises for this section to emphasize the relationships between positive operators, positive forms, and positive matrices. The following summary may be helpful.
If A is an n × n matrix over the field of complex numbers, the following are equivalent.
(1) A is positive, i.e., Σⱼ Σₖ Aₖⱼxⱼx̄ₖ > 0 whenever x₁, …, xₙ are complex numbers, not all 0.
(2) (X|Y) = Y*AX is an inner product on the space of n × 1 complex matrices.
(3) Relative to the standard inner product (X|Y) = Y*X on n × 1 matrices, the linear operator X → AX is positive.
(4) A = P*P for some invertible n × n matrix P over C.
(5) A = A*, and the principal minors of A are positive.
If each entry of A is real, these are equivalent to:
(6) A = Aᵗ, and Σⱼ Σₖ Aₖⱼxⱼxₖ > 0 whenever x₁, …, xₙ are real numbers not all 0.
(7) (X|Y) = YᵗAX is an inner product on the space of n × 1 real matrices.
(8) Relative to the standard inner product (X|Y) = YᵗX on n × 1 real matrices, the linear operator X → AX is positive.
(9) There is an invertible n × n matrix P, with real entries, such that A = PᵗP.

Exercises

1. Let V be C², with the standard inner product. For which vectors α in V is there a positive linear operator T such that α = Tε₁?

2. Let V be R², with the standard inner product. If θ is a real number, let Tθ be the linear operator 'rotation through θ,'
Tθ(x₁, x₂) = (x₁ cos θ − x₂ sin θ, x₁ sin θ + x₂ cos θ).
For which values of θ is Tθ a positive operator?

3. Let V be the space of n × 1 matrices over C, with the inner product (X|Y) = Y*GX (where G is an n × n matrix such that this is an inner product). Let A be an n × n matrix and T the linear operator T(X) = AX. Find T*. If Y is a fixed element of V, find the element Z of V which determines the linear functional X → Y*X. In other words, find Z such that Y*X = (X|Z) for all X in V.
4. Let V be a finite-dimensional inner product space. If T and U are positive linear operators on V, prove that (T + U) is positive. Give an example which shows that TU need not be positive.

5. Let
A = [2 × 2 real matrix illegible in this scan].
(a) Show that A is positive.
(b) Let V be the space of 2 × 1 real matrices, with the inner product (X|Y) = YᵗAX. Find an orthonormal basis for V, by applying the Gram-Schmidt process to the basis {X₁, X₂} defined by
X₁, X₂ = [column matrices illegible in this scan].
(c) Find an invertible 2 × 2 real matrix P such that A = PᵗP.

6. Which of the following matrices are positive?
[matrices illegible in this scan]

7. Give an example of an n × n matrix which has all its principal minors positive, but which is not a positive matrix.

8. Does ((x₁, x₂)|(y₁, y₂)) = x₁ȳ₁ + 2x₂ȳ₁ + 2x₁ȳ₂ + x₂ȳ₂ define an inner product on C²?

9. Prove that every entry on the main diagonal of a positive matrix is positive.

10. Let V be a finite-dimensional inner product space. If T and U are linear operators on V, we write T < U if U − T is a positive operator. Prove the following:
(a) T < U and U < T is impossible.
(b) If T < U and U < S, then T < S.
(c) If T < U and 0 < S, it need not be that ST < SU.

11. Let V be a finite-dimensional inner product space and E the orthogonal projection of V onto some subspace.
(a) Prove that, for any positive number c, the operator cI + E is positive.
(b) Express in terms of E a self-adjoint linear operator T such that T² = I + E.

12. Let n be a positive integer and A the n × n matrix

    A = [ 1        1/2        ⋯   1/n       ]
        [ 1/2      1/3        ⋯   1/(n+1)   ]
        [  ⋮                           ⋮     ]
        [ 1/n      1/(n+1)    ⋯   1/(2n−1)  ].

Prove that A is positive.
13. Let A be a self-adjoint n × n matrix. Prove that there is a real number c such that the matrix cI + A is positive.

14. Prove that the product of two positive linear operators is positive if and only if they commute.

15. Let S and T be positive operators. Prove that every characteristic value of ST is positive.

9.4. More on Forms

This section contains two results which give more detailed information about (sesqui-linear) forms.

Theorem 7. Let f be a form on a real or complex vector space V and {α₁, …, α_r} a basis for the finite-dimensional subspace W of V. Let M be the r × r matrix with entries
Mⱼₖ = f(αₖ, αⱼ)
and W′ the set of all vectors β in V such that f(α, β) = 0 for all α in W. Then W′ is a subspace of V, and W ∩ W′ = {0} if and only if M is invertible. When this is the case, V = W + W′.

Proof. If β and γ are vectors in W′ and c is a scalar, then for every α in W
f(α, cβ + γ) = c̄f(α, β) + f(α, γ) = 0.
Hence, W′ is a subspace of V.
Now suppose α = Σ_{k=1}^r xₖαₖ and that β = Σ_{j=1}^r yⱼαⱼ. Then
f(α, β) = Σ_{j,k} ȳⱼMⱼₖxₖ.
It follows from this that W ∩ W′ ≠ {0} if and only if the homogeneous system
Σ_{j=1}^r ȳⱼMⱼₖ = 0,   1 ≤ k ≤ r
has a non-trivial solution (y₁, …, y_r). Hence W ∩ W′ = {0} if and only if M* is invertible. But the invertibility of M* is equivalent to the invertibility of M.
Suppose that M is invertible and let
A = (M*)⁻¹ = (M⁻¹)*.
Define gⱼ on V by the equation
gⱼ(β) = Σₖ Aⱼₖ \overline{f(αₖ, β)}.
Then
gⱼ(cβ + γ) = Σₖ Aⱼₖ \overline{f(αₖ, cβ + γ)} = c Σₖ Aⱼₖ \overline{f(αₖ, β)} + Σₖ Aⱼₖ \overline{f(αₖ, γ)} = cgⱼ(β) + gⱼ(γ).
Hence, each gⱼ is a linear function on V. Thus we may define a linear operator E on V by setting
Eβ = Σ_{j=1}^r gⱼ(β)αⱼ.
Since
gⱼ(αₙ) = Σₖ Aⱼₖ \overline{f(αₖ, αₙ)} = Σₖ Aⱼₖ(M*)ₖₙ = δⱼₙ
it follows that E(αₙ) = αₙ for 1 ≤ n ≤ r. This implies Eα = α for every α in W. Therefore, E maps V onto W and E² = E. If β is an arbitrary vector in V, then
f(αₙ, Eβ) = f(αₙ, Σⱼ gⱼ(β)αⱼ) = Σⱼ (Σₖ Āⱼₖ f(αₖ, β)) f(αₙ, αⱼ).
Since A* = M⁻¹, it follows that
f(αₙ, Eβ) = Σₖ (Σⱼ (M⁻¹)ₖⱼMⱼₙ) f(αₖ, β) = Σₖ δₖₙ f(αₖ, β) = f(αₙ, β).
This implies f(α, Eβ) = f(α, β) for every α in W. Hence
f(α, β − Eβ) = 0
for all α in W and β in V. Thus I − E maps V into W′. The equation
β = Eβ + (I − E)β
shows that V = W + W′.
One final point should be mentioned. Since W ∩ W′ = {0}, every vector in V is uniquely the sum of a vector in W
and a vector in W′. If β is in W′, it follows that Eβ = 0. Hence I − E maps V onto W′. ∎

The projection E constructed in the proof may be characterized as follows: Eβ = α if and only if α is in W and β − α belongs to W′. Thus E is independent of the basis of W that was used in its construction. Hence we may refer to E as the projection of V on W that is determined by the direct sum decomposition
V = W ⊕ W′.
Note that E is an orthogonal projection if and only if W′ = W⊥.

Theorem 8. Let f be a form on a real or complex vector space V and A the matrix of f in the ordered basis {α₁, …, αₙ} of V. Suppose the principal minors of A are all different from 0. Then there is a unique upper-triangular matrix P with Pₖₖ = 1 (1 ≤ k ≤ n) such that
P*AP
is upper-triangular.

Proof. Since Δₖ(A*) = \overline{Δₖ(A)} (1 ≤ k ≤ n), the principal minors of A* are all different from 0. Hence, by the lemma used in the proof of Theorem 6, there exists an upper-triangular matrix P with Pₖₖ = 1 such that A*P is lower-triangular. Therefore, P*A = (A*P)* is upper-triangular. Since the product of two upper-triangular matrices is again upper-triangular, it follows that P*AP is upper-triangular. This shows the existence but not the uniqueness of P. However, there is another more geometric argument which may be used to prove both the existence and uniqueness of P.
Let Wₖ be the subspace spanned by α₁, …, αₖ and W′ₖ the set of all β in V such that f(α, β) = 0 for every α in Wₖ. Since Δₖ(A) ≠ 0, the k × k matrix M with entries
Mᵢⱼ = f(αⱼ, αᵢ)   (1 ≤ i, j ≤ k)
is invertible. By Theorem 7,
V = Wₖ ⊕ W′ₖ.
Let Eₖ be the projection of V on Wₖ which is determined by this decomposition, and set E₀ = 0. Let
βₖ = αₖ − Eₖ₋₁αₖ   (1 ≤ k ≤ n).
Then β₁ = α₁, and Eₖ₋₁αₖ belongs to Wₖ₋₁ for k > 1. Thus when k > 1, there exist unique scalars Pⱼₖ such that
Eₖ₋₁αₖ = −Σ_{j=1}^{k−1} Pⱼₖαⱼ.
Setting Pₖₖ = 1 and Pⱼₖ = 0 for j > k, we then have an n × n upper-triangular matrix P with Pₖₖ = 1 and
βₖ = Σ_{j=1}^k Pⱼₖαⱼ
for k = 1, …, n. Suppose 1 ≤ i < k. Then βᵢ is in Wᵢ and Wᵢ ⊂ Wₖ₋₁. Since βₖ belongs to W′ₖ₋₁, it follows that f(βᵢ, βₖ) = 0. Let B denote the matrix of f in the ordered basis {β₁, …, βₙ}. Then
Bₖᵢ = f(βᵢ, βₖ)
so Bₖᵢ = 0 when k > i. Thus B is upper-triangular. On the other hand,
B = P*AP.
Conversely, suppose P is an upper-triangular matrix with Pₖₖ = 1 such that P*AP is upper-triangular. Set
βₖ = Σⱼ Pⱼₖαⱼ   (1 ≤ k ≤ n).
Then {β₁, …, βₖ} is evidently a basis for Wₖ. Suppose k > 1. Then {β₁, …, βₖ₋₁} is a basis for Wₖ₋₁, and since f(βᵢ, βₖ) = 0 when i < k, we see that βₖ is a vector in W′ₖ₋₁. The equation defining βₖ implies
αₖ = −(Σ_{j=1}^{k−1} Pⱼₖαⱼ) + βₖ.
Now Σ_{j=1}^{k−1} Pⱼₖαⱼ belongs to Wₖ₋₁ and βₖ is in W′ₖ₋₁. Therefore, P₁ₖ, …, Pₖ₋₁,ₖ are the unique scalars such that
Eₖ₋₁αₖ = −Σ_{j=1}^{k−1} Pⱼₖαⱼ
so that P is the matrix constructed earlier. ∎

9.5. Spectral Theory

In this section, we pursue the implications of Theorems 18 and 22 of Chapter 8 concerning the diagonalization of self-adjoint and normal operators.

Theorem 9 (Spectral Theorem). Let T be a normal operator on a finite-dimensional complex inner product space V or a self-adjoint operator on a finite-dimensional real inner product space V. Let c₁, …, cₖ be the distinct characteristic values of T. Let Wⱼ be the characteristic space associated with cⱼ and Eⱼ the orthogonal projection of V on Wⱼ. Then Wⱼ is orthogonal to Wᵢ when i ≠ j, V is the direct sum of W₁, …, Wₖ, and
(9-11)  T = c₁E₁ + ⋯ + cₖEₖ.
Proof. Let α be a vector in Wⱼ, β a vector in Wᵢ, and suppose i ≠ j. Then cⱼ(α|β) = (Tα|β) = (α|T*β) = cᵢ(α|β). Hence (cⱼ − cᵢ)(α|β) = 0, and since cⱼ − cᵢ ≠ 0, it follows that (α|β) = 0. Thus Wⱼ is orthogonal to Wᵢ when i ≠ j. From the fact that V has an orthonormal basis consisting of characteristic vectors (cf. Theorems 18 and 22 of Chapter 8), it follows that V = W₁ + ⋯ + Wₖ. If αⱼ belongs to Wⱼ (1 ≤ j ≤ k) and α₁ + ⋯ + αₖ = 0, then
0 = (αᵢ | Σⱼ αⱼ) = Σⱼ (αᵢ|αⱼ) = ||αᵢ||²
for every i, so that V is the direct sum of W₁, …, Wₖ. Therefore E₁ + ⋯ + Eₖ = I and
T = TE₁ + ⋯ + TEₖ = c₁E₁ + ⋯ + cₖEₖ. ∎

The decomposition (9-11) is called the spectral resolution of T. This terminology arose in part from physical applications which caused the spectrum of a linear operator on a finite-dimensional vector space to be defined as the set of characteristic values for the operator. It is important to note that the orthogonal projections E₁, …, Eₖ are canonically associated with T; in fact, they are polynomials in T.

Corollary. If
eⱼ = Π_{i≠j} (x − cᵢ)/(cⱼ − cᵢ)
then Eⱼ = eⱼ(T) for 1 ≤ j ≤ k.

Proof. Since EᵢEⱼ = 0 when i ≠ j, it follows that
T² = c₁²E₁ + ⋯ + cₖ²Eₖ
and by an easy induction argument that
Tⁿ = c₁ⁿE₁ + ⋯ + cₖⁿEₖ
for every integer n ≥ 0. For an arbitrary polynomial
f = Σ_{n=0}^r aₙxⁿ
we have
f(T) = Σ_{n=0}^r aₙTⁿ = Σ_{n=0}^r aₙ Σ_{j=1}^k cⱼⁿEⱼ = Σ_{j=1}^k (Σ_{n=0}^r aₙcⱼⁿ)Eⱼ = Σ_{j=1}^k f(cⱼ)Eⱼ.
Since eⱼ(cₘ) = δⱼₘ, it follows that eⱼ(T) = Eⱼ. ∎
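To make the corollary concrete, here is a short sketch of ours (the matrix standing in for T is an arbitrary self-adjoint example). It forms each Eⱼ = eⱼ(T) from the Lagrange polynomials and checks the identities of the spectral resolution (9-11).

```python
import numpy as np

# An arbitrary self-adjoint matrix standing in for the operator T.
T = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])

# Distinct characteristic values c_1, ..., c_k of T.
c = np.unique(np.round(np.linalg.eigvalsh(T), 10))

# E_j = e_j(T), where e_j is the Lagrange polynomial with e_j(c_m) = delta_jm.
E = []
for j, cj in enumerate(c):
    Ej = np.eye(3)
    for i, ci in enumerate(c):
        if i != j:
            Ej = Ej @ (T - ci * np.eye(3)) / (cj - ci)
    E.append(Ej)

# Resolution of the identity and the spectral resolution (9-11).
assert np.allclose(sum(E), np.eye(3))                        # E_1 + ... + E_k = I
assert np.allclose(E[0] @ E[1], 0)                           # E_i E_j = 0 for i != j
assert np.allclose(sum(cj * Ej for cj, Ej in zip(c, E)), T)  # T = sum c_j E_j
```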
Because E₁, …, Eₖ are canonically associated with T and
I = E₁ + ⋯ + Eₖ
the family of projections {E₁, …, Eₖ} is called the resolution of the identity defined by T.
There is a comment that should be made about the proof of the spectral theorem. We derived the theorem using Theorems 18 and 22 of Chapter 8 on the diagonalization of self-adjoint and normal operators. There is another, more algebraic proof in which it must first be shown that the minimal polynomial of a normal operator is a product of distinct prime factors. Then one proceeds as in the proof of the primary decomposition theorem (Theorem 12, Chapter 6). We shall give such a proof in the next section.
In various applications, it is necessary to know whether one may compute certain functions of operators or matrices, e.g., square roots. This may be done rather simply for diagonalizable normal operators.

Definition. Let T be a diagonalizable normal operator on a finite-dimensional inner product space and
T = Σ_{j=1}^k cⱼEⱼ
its spectral resolution. Suppose f is a function whose domain includes the spectrum of T that has values in the field of scalars. Then the linear operator f(T) is defined by the equation
(9-12)  f(T) = Σ_{j=1}^k f(cⱼ)Eⱼ.

Theorem 10. Let T be a diagonalizable normal operator with spectrum S on a finite-dimensional inner product space V. Suppose f is a function whose domain contains S that has values in the field of scalars. Then f(T) is a diagonalizable normal operator with spectrum f(S). If U is a unitary map of V onto V′ and T′ = UTU⁻¹, then S is the spectrum of T′ and
f(T′) = Uf(T)U⁻¹.

Proof. The normality of f(T) follows by a simple computation from (9-12) and the fact that
f(T)* = Σⱼ \overline{f(cⱼ)}Eⱼ.
Moreover, it is clear that for every α in Eⱼ(V),
f(T)α = f(cⱼ)α.
Thus, the set f(S) of all f(c) with c in S is contained in the spectrum of f(T). Conversely, suppose α ≠ 0 and that
f(T)α = bα.
Then α = Σⱼ Eⱼα and
f(T)α = Σⱼ f(cⱼ)Eⱼα = Σⱼ bEⱼα.
Hence,
Σⱼ (f(cⱼ) − b)Eⱼα = 0.
Therefore, f(cⱼ) = b or Eⱼα = 0. By assumption α ≠ 0, so there exists an index i such that Eᵢα ≠ 0. It follows that f(cᵢ) = b and hence that f(S) is the spectrum of f(T). Suppose, in fact, that
f(S) = {b₁, …, b_r}
where bₘ ≠ bₙ when m ≠ n. Let Xₘ be the set of indices i such that 1 ≤ i ≤ k and f(cᵢ) = bₘ. Let Pₘ = Σᵢ Eᵢ, the sum being extended over the indices i in Xₘ. Then Pₘ is the orthogonal projection of V on the subspace of characteristic vectors belonging to the characteristic value bₘ of f(T), and
f(T) = Σ_{m=1}^r bₘPₘ
is the spectral resolution of f(T).
Now suppose U is a unitary transformation of V onto V′ and that T′ = UTU⁻¹. Then the equation
Tα = cα
holds if and only if
T′Uα = cUα.
Thus S is the spectrum of T′, and U maps each characteristic subspace for T onto the corresponding subspace for T′. In fact, using (9-12), we see that
T′ = Σⱼ cⱼE′ⱼ,   E′ⱼ = UEⱼU⁻¹
is the spectral resolution of T′. Hence
f(T′) = Σⱼ f(cⱼ)E′ⱼ = Σⱼ f(cⱼ)UEⱼU⁻¹ = U(Σⱼ f(cⱼ)Eⱼ)U⁻¹ = Uf(T)U⁻¹. ∎
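Theorem 10 and definition (9-12) give a simple numerical recipe for f(T) when the characteristic values are distinct: diagonalize and apply f to the diagonal entries. The sketch below is ours (the function name and the matrix are arbitrary); it checks two cases whose answers are known independently, f(z) = z² and f(z) = z̄, which for a normal matrix must give A² and A* respectively.

```python
import numpy as np

def f_of_normal(A, f):
    """Evaluate f(A) for a normal matrix A with distinct characteristic values
    by applying f to the diagonal entries of a diagonalization, as in (9-12)."""
    d, P = np.linalg.eig(A)
    return P @ np.diag(f(d)) @ np.linalg.inv(P)

# An arbitrary normal (in fact real orthogonal) matrix.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]]) + 0j
assert np.allclose(A @ A.conj().T, A.conj().T @ A)

assert np.allclose(f_of_normal(A, lambda d: d**2), A @ A)   # f(z) = z^2 gives A^2
assert np.allclose(f_of_normal(A, np.conj), A.conj().T)     # f(z) = conj(z) gives A*
```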
In thinking about the preceding discussion, it is important for one to keep in mind that the spectrum of the normal operator T is the set
S = {c₁, …, cₖ}
of distinct characteristic values. When T is represented by a diagonal matrix in a basis of characteristic vectors, it is necessary to repeat each value cᵢ as many times as the dimension of the corresponding space of characteristic vectors. This is the reason for the change of notation in the following result.

Corollary. With the assumptions of Theorem 10, suppose that T is represented in the ordered basis ℬ = {α₁, …, αₙ} by the diagonal matrix D with entries d₁, …, dₙ. Then, in the basis ℬ, f(T) is represented by the diagonal matrix f(D) with entries f(d₁), …, f(dₙ). If ℬ′ = {α′₁, …, α′ₙ} is any other ordered basis and P the matrix such that
α′ⱼ = Σᵢ Pᵢⱼαᵢ
then P⁻¹f(D)P is the matrix of f(T) in the basis ℬ′.

Proof. For each index i, there is a unique j such that 1 ≤ j ≤ k, αᵢ belongs to Eⱼ(V), and dᵢ = cⱼ. Hence f(T)αᵢ = f(dᵢ)αᵢ for every i, and
f(T)α′ⱼ = Σᵢ Pᵢⱼ f(T)αᵢ = Σᵢ f(dᵢ)Pᵢⱼαᵢ = Σᵢ (f(D)P)ᵢⱼαᵢ = Σᵢ (f(D)P)ᵢⱼ Σₖ (P⁻¹)ₖᵢα′ₖ = Σₖ (P⁻¹f(D)P)ₖⱼα′ₖ. ∎

It follows from this result that one may form certain functions of a normal matrix. For suppose A is a normal matrix. Then there is an invertible matrix P, in fact a unitary P, such that PAP⁻¹ is a diagonal matrix, say D with entries d₁, …, dₙ. Let f be a complex-valued function which can be applied to d₁, …, dₙ, and let f(D) be the diagonal matrix with entries f(d₁), …, f(dₙ). Then P⁻¹f(D)P is independent of D and just a function of A in the following sense. If Q is another invertible matrix such that QAQ⁻¹ is a diagonal matrix D′, then f may be applied to the diagonal entries of D′ and
P⁻¹f(D)P = Q⁻¹f(D′)Q.

Definition. Under the above conditions, f(A) is defined as P⁻¹f(D)P.

The matrix f(A) may also be characterized in a different way. In doing this, we state without proof some of the results on normal matrices
that one obtains by formulating the matrix analogues of the preceding theorems.

Theorem 11. Let A be a normal matrix and c₁, …, cₖ the distinct complex roots of det (xI − A). Let
eᵢ = Π_{j≠i} (x − cⱼ)/(cᵢ − cⱼ)
and Eᵢ = eᵢ(A) (1 ≤ i ≤ k). Then EᵢEⱼ = 0 when i ≠ j, Eᵢ² = Eᵢ, Eᵢ* = Eᵢ, and
I = E₁ + ⋯ + Eₖ.
If f is a complex-valued function whose domain includes c₁, …, cₖ, then
f(A) = f(c₁)E₁ + ⋯ + f(cₖ)Eₖ;
in particular, A = c₁E₁ + ⋯ + cₖEₖ.

We recall that an operator T on an inner product space V is non-negative if T is self-adjoint and (Tα|α) ≥ 0 for every α in V.

Theorem 12. Let T be a diagonalizable normal operator on a finite-dimensional inner product space V. Then T is self-adjoint, non-negative, or unitary according as each characteristic value of T is real, non-negative, or of absolute value 1.

Proof. Suppose T has the spectral resolution T = c₁E₁ + ⋯ + cₖEₖ; then T* = c̄₁E₁ + ⋯ + c̄ₖEₖ. To say T is self-adjoint is to say T = T*, or
(c₁ − c̄₁)E₁ + ⋯ + (cₖ − c̄ₖ)Eₖ = 0.
Using the fact that EᵢEⱼ = 0 for i ≠ j, and the fact that no Eⱼ is the zero operator, we see that T is self-adjoint if and only if cⱼ = c̄ⱼ, j = 1, …, k. To distinguish the normal operators which are non-negative, let us look at
(Tα|α) = (Σⱼ cⱼEⱼα | Σᵢ Eᵢα) = Σᵢ Σⱼ cⱼ(Eⱼα|Eᵢα) = Σⱼ cⱼ||Eⱼα||².
We have used the fact that (Eⱼα|Eᵢα) = 0 for i ≠ j. From this it is clear that the condition (Tα|α) ≥ 0 is satisfied if and only if cⱼ ≥ 0 for each j. To distinguish the unitary operators, observe that
TT* = c₁c̄₁E₁ + ⋯ + cₖc̄ₖEₖ = |c₁|²E₁ + ⋯ + |cₖ|²Eₖ.
If TT* = I, then I = |c₁|²E₁ + ⋯ + |cₖ|²Eₖ, and operating with Eⱼ,
Eⱼ = |cⱼ|²Eⱼ.
Since Eⱼ ≠ 0, we have |cⱼ|² = 1 or |cⱼ| = 1. Conversely, if |cⱼ|² = 1 for each j, it is clear that TT* = I. ∎

It is important to note that this is a theorem about normal operators. If T is a general linear operator on V which has real characteristic values, it does not follow that T is self-adjoint. The theorem states that if T has real characteristic values, and if T is diagonalizable and normal, then T is self-adjoint. A theorem of this type serves to strengthen the analogy between the adjoint operation and the process of forming the conjugate of a complex number. A complex number z is real or of absolute value 1 according as z = z̄, or zz̄ = 1. An operator T is self-adjoint or unitary according as T = T* or T*T = I.
We are going to prove two theorems now, which are the analogues of these two statements:
(1) Every non-negative number has a unique non-negative square root.
(2) Every complex number is expressible in the form ru, where r is non-negative and |u| = 1. This is the polar decomposition z = re^{iθ} for complex numbers.

Theorem 13. Let V be a finite-dimensional inner product space and T a non-negative operator on V. Then T has a unique non-negative square root, that is, there is one and only one non-negative operator N on V such that N² = T.

Proof. Let T = c₁E₁ + ⋯ + cₖEₖ be the spectral resolution of T. By Theorem 12, each cⱼ ≥ 0. If c is any non-negative real number, let √c denote the non-negative square root of c. Then according to Theorem 11 and (9-12), N = √T is a well-defined diagonalizable normal operator on V. It is non-negative by Theorem 12, and, by an obvious computation, N² = T.
Now let P be a non-negative operator on V such that P² = T. We shall prove that P = N. Let
P = d₁F₁ + ⋯ + d_rF_r
be the spectral resolution of P. Then dⱼ ≥ 0 for each j, since P is non-negative. From P² = T we have
T = d₁²F₁ + ⋯ + d_r²F_r.
Now F₁, …, F_r satisfy the conditions
I = F₁ + ⋯ + F_r,   FᵢFⱼ = 0 for i ≠ j
and no Fⱼ is 0. The numbers d₁², …, d_r² are distinct, because distinct non-negative numbers have distinct squares. By the uniqueness of the spectral resolution of T, we must have r = k, and (perhaps reordering) Fⱼ = Eⱼ, dⱼ² = cⱼ. Thus P = N. ∎
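Theorem 13 is constructive: the square root is obtained by taking non-negative square roots in the spectral resolution. The short sketch below is ours (the function name and the matrix are arbitrary choices); eigh supplies an orthonormal basis of characteristic vectors, from which N = √T is assembled and checked.

```python
import numpy as np

def nonneg_sqrt(T):
    """Non-negative square root of a non-negative (self-adjoint) matrix T,
    assembled from its spectral resolution as in Theorem 13."""
    c, U = np.linalg.eigh(T)   # real characteristic values, orthonormal characteristic vectors
    return U @ np.diag(np.sqrt(np.clip(c, 0.0, None))) @ U.conj().T

# An arbitrary non-negative matrix T = B*B.
B = np.array([[1.0, 2.0],
              [0.0, 1.0]])
T = B.conj().T @ B

N = nonneg_sqrt(T)
assert np.allclose(N, N.conj().T)                  # N is self-adjoint
assert np.all(np.linalg.eigvalsh(N) >= -1e-12)     # with non-negative characteristic values
assert np.allclose(N @ N, T)                       # and N^2 = T
```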
Theorem 14. Let V be a finite-dimensional inner product space and let T be any linear operator on V. Then there exist a unitary operator U on V and a non-negative operator N on V such that T = UN. The non-negative operator N is unique. If T is invertible, the operator U is also unique.

Proof. Suppose we have T = UN, where U is unitary and N is non-negative. Then T* = (UN)* = N*U* = NU*. Thus T*T = NU*UN = N². This shows that N is uniquely determined as the non-negative square root of the non-negative operator T*T.
So, to begin the proof of the existence of U and N, we use Theorem 13 to define N as the unique non-negative square root of T*T. If T is invertible, then so is N because
(Nα|Nα) = (N²α|α) = (T*Tα|α) = (Tα|Tα).
In this case, we define U = TN⁻¹ and prove that U is unitary. Now U* = (TN⁻¹)* = (N⁻¹)*T* = (N*)⁻¹T* = N⁻¹T*. Thus
UU* = TN⁻¹N⁻¹T* = T(N⁻¹)²T* = T(N²)⁻¹T* = T(T*T)⁻¹T* = TT⁻¹(T*)⁻¹T* = I
and U is unitary.
If T is not invertible, we shall have to do a bit more work to define U. We first define U on the range of N. Let α be a vector in the range of N, say α = Nβ. We define Uα = Tβ, motivated by the fact that we want UNβ = Tβ. We must verify that U is well-defined on the range of N; in other words, if Nβ′ = Nβ then Tβ′ = Tβ. We verified above that ||Nγ||² = ||Tγ||² for every γ in V. Thus, with γ = β − β′, we see that N(β − β′) = 0 if and only if T(β − β′) = 0. So U is well-defined on the range of N and is clearly linear where defined. Now if W is the range of N, we are going to define U on W⊥. To do this, we need the following observation. Since T and N have the same null space, their ranges have the same dimension. Thus W⊥ has the same dimension as the orthogonal complement of the range of T. Therefore, there exists an (inner product space) isomorphism U₀ of W⊥ onto T(V)⊥. Now we have defined U on W, and we define U on W⊥ to be U₀.
Let us repeat the definition of U. Since V = W ⊕ W⊥, each α in V is uniquely expressible in the form α = Nβ + γ, where Nβ is in the range W of N, and γ is in W⊥. We define
Uα = Tβ + U₀γ.
This U is clearly linear, and we verified above that it is well-defined. Also
    (Uα|Uα) = (Tβ + U_0γ | Tβ + U_0γ) = (Tβ|Tβ) + (U_0γ|U_0γ) = (Nβ|Nβ) + (γ|γ) = (α|α)
and so U is unitary. We also have UNβ = Tβ for each β.

We call T = UN a polar decomposition for T. We certainly cannot call it the polar decomposition, since U is not unique. Even when T is invertible, so that U is unique, we have the difficulty that U and N may not commute. Indeed, they commute if and only if T is normal. For example, if T = UN = NU, with N non-negative and U unitary, then
    TT* = (NU)(NU)* = NUU*N = N² = T*T.
The general operator T will also have a decomposition T = N_1U_1, with N_1 non-negative and U_1 unitary. Here, N_1 will be the non-negative square root of TT*. We can obtain this result by applying the theorem just proved to the operator T*, and then taking adjoints.
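A numerical sketch of the polar decomposition follows; it is not part of the text, and it assumes T is invertible (which holds with probability 1 for the random matrix used here). N is the non-negative square root of T*T and U = TN⁻¹.

    import numpy as np

    rng = np.random.default_rng(1)
    T = rng.standard_normal((4, 4))                 # assumed invertible

    c, Q = np.linalg.eigh(T.T @ T)                  # spectral resolution of T*T
    N = Q @ np.diag(np.sqrt(c)) @ Q.T               # unique non-negative square root
    U = T @ np.linalg.inv(N)

    assert np.allclose(U.T @ U, np.eye(4))          # U is unitary (orthogonal)
    assert np.allclose(U @ N, T)                    # T = UN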
We turn now to the problem of what can be said about the simultaneous diagonalization of commuting families of normal operators. For this purpose the following terminology is appropriate.

Definitions. Let ℱ be a family of operators on an inner product space V. A function r on ℱ with values in the field F of scalars will be called a root of ℱ if there is a non-zero α in V such that
    Tα = r(T)α
for all T in ℱ. For any function r from ℱ to F, let V(r) be the set of all α in V such that Tα = r(T)α for every T in ℱ.

Then V(r) is a subspace of V, and r is a root of ℱ if and only if V(r) ≠ {0}. Each non-zero α in V(r) is simultaneously a characteristic vector for every T in ℱ.

Theorem 15. Let ℱ be a commuting family of diagonalizable normal operators on a finite-dimensional inner product space V. Then ℱ has only a finite number of roots. If r_1, ..., r_k are the distinct roots of ℱ, then
(i) V(r_i) is orthogonal to V(r_j) when i ≠ j, and
(ii) V = V(r_1) ⊕ ... ⊕ V(r_k).

Proof. Suppose r and s are distinct roots of ℱ. Then there is an operator T in ℱ such that r(T) ≠ s(T). Since characteristic vectors belonging to distinct characteristic values of T are necessarily orthogonal, it follows that V(r) is orthogonal to V(s). Because V is finite-dimensional, this implies ℱ has at most a finite number of roots. Let r_1, ..., r_k be the roots of ℱ. Suppose {T_1, ..., T_m} is a maximal linearly independent subset of ℱ, and let
    {E_i1, E_i2, ...}
be the resolution of the identity defined by T_i (1 ≤ i ≤ m). Then the projections E_ij form a commutative family. For each E_ij is a polynomial in T_i, and T_1, ..., T_m commute with one another. Since
    I = (Σ_{j1} E_{1 j1})(Σ_{j2} E_{2 j2}) ... (Σ_{jm} E_{m jm}),
each vector α in V may be written in the form
(9-13)  α = Σ_{j1, ..., jm} E_{1 j1}E_{2 j2} ... E_{m jm}α.
Suppose j_1, ..., j_m are indices for which β = E_{1 j1}E_{2 j2} ... E_{m jm}α ≠ 0. Let
    β_i = (Π_{n ≠ i} E_{n jn})α.
Then β = E_{i ji}β_i; hence there is a scalar c_i such that
    T_iβ = c_iβ,  1 ≤ i ≤ m.
For each T in ℱ, there exist unique scalars b_i such that
    T = Σ_{i=1}^{m} b_iT_i.
Thus
    Tβ = Σ_i b_iT_iβ = (Σ_i b_ic_i)β.
The function T → Σ_i b_ic_i is evidently one of the roots, say r_t, of ℱ, and β lies in V(r_t). Therefore, each non-zero term in (9-13) belongs to one of the spaces V(r_1), ..., V(r_k). It follows that V is the orthogonal direct sum of V(r_1), ..., V(r_k).

Corollary. Under the assumptions of the theorem, let P_j be the orthogonal projection of V on V(r_j), 1 ≤ j ≤ k. Then P_iP_j = 0 when i ≠ j,
    I = P_1 + ... + P_k,
and every T in ℱ may be written in the form
(9-14)  T = Σ_j r_j(T)P_j.

Definitions. The family of orthogonal projections {P_1, ..., P_k} is called the resolution of the identity determined by ℱ, and (9-14) is the spectral resolution of T in terms of this family.
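For a single diagonalizable normal operator (the family consisting of T alone), the resolution of the identity and the spectral resolution (9-14) can be checked numerically. The sketch below is not from the text; the self-adjoint matrix is chosen so that one characteristic value is repeated.

    import numpy as np

    T = np.array([[2.0, 1.0, 0.0],
                  [1.0, 2.0, 0.0],
                  [0.0, 0.0, 3.0]])            # self-adjoint, with roots 1 and 3
    vals, Q = np.linalg.eigh(T)
    roots = np.unique(np.round(vals, 10))

    P = []
    for r in roots:
        cols = Q[:, np.isclose(vals, r)]       # orthonormal basis of V(r)
        P.append(cols @ cols.T)                # orthogonal projection of V on V(r)

    assert np.allclose(sum(P), np.eye(3))                              # I = P_1 + ... + P_k
    assert np.allclose(sum(r * Pj for r, Pj in zip(roots, P)), T)      # T = sum_j r_j P_j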
Although the projections P_1, ..., P_k in the preceding corollary are canonically associated with the family ℱ, they are generally not in ℱ nor even linear combinations of operators in ℱ; however, we shall show that they may be obtained by forming certain products of polynomials in elements of ℱ.

In the study of any family of linear operators on an inner product space, it is usually profitable to consider the self-adjoint algebra generated by the family.

Definition. A self-adjoint algebra of operators on an inner product space V is a linear subalgebra of L(V, V) which contains the adjoint of each of its members.

An example of a self-adjoint algebra is L(V, V) itself. Since the intersection of any collection of self-adjoint algebras is again a self-adjoint algebra, the following terminology is meaningful.

Definition. If ℱ is a family of linear operators on a finite-dimensional inner product space, the self-adjoint algebra generated by ℱ is the smallest self-adjoint algebra which contains ℱ.

Theorem 16. Let ℱ be a commuting family of diagonalizable normal operators on a finite-dimensional inner product space V, and let A be the self-adjoint algebra generated by ℱ and the identity operator. Let {P_1, ..., P_k} be the resolution of the identity defined by ℱ. Then A is the set of all operators on V of the form
(9-15)  T = Σ_{j=1}^{k} c_jP_j
where c_1, ..., c_k are arbitrary scalars.

Proof. Let C denote the set of all operators on V of the form (9-15). Then C contains the identity operator and the adjoint of each of its members. If T = Σ_j c_jP_j and U = Σ_j d_jP_j, then for every scalar a,
    aT + U = Σ_j (ac_j + d_j)P_j
and
    TU = Σ_{i,j} c_id_jP_iP_j = Σ_j c_jd_jP_j = UT.
Thus C is a self-adjoint commutative algebra containing ℱ and the identity operator. Therefore C contains A.
Now let r_1, ..., r_k be all the roots of ℱ. Then for each pair of indices (i, n) with i ≠ n, there is an operator T_in in ℱ such that r_i(T_in) ≠ r_n(T_in). Let a_in = r_i(T_in) - r_n(T_in) and b_in = r_n(T_in). Then the linear operator
    Q_i = Π_{n ≠ i} a_in⁻¹(T_in - b_inI)
is an element of the algebra A. We will show that Q_i = P_i (1 ≤ i ≤ k). For this, suppose j ≠ i and that α is an arbitrary vector in V(r_j). Then
    T_ijα = r_j(T_ij)α = b_ijα
so that (T_ij - b_ijI)α = 0. Since the factors in Q_i all commute, it follows that Q_iα = 0. Hence Q_i agrees with P_i on V(r_j) whenever j ≠ i. Now suppose α is a vector in V(r_i). Then T_inα = r_i(T_in)α, and
    a_in⁻¹(T_in - b_inI)α = a_in⁻¹[r_i(T_in) - r_n(T_in)]α = α.
Thus Q_iα = α and Q_i agrees with P_i on V(r_i); therefore Q_i = P_i for i = 1, ..., k. From this it follows that A = C.

The theorem shows that the algebra A is commutative and that each element of A is a diagonalizable normal operator. We show next that A has a single generator.

Corollary. Under the assumptions of the theorem, there is an operator T in A such that every member of A is a polynomial in T.

Proof. Let T = Σ_{j=1}^{k} t_jP_j where t_1, ..., t_k are distinct scalars. Then
    T^n = Σ_{j=1}^{k} t_j^nP_j
for n = 1, 2, .... If
    f = Σ_{n} a_nx^n
is any polynomial, it follows that
    f(T) = Σ_n a_nT^n = Σ_{j=1}^{k} (Σ_n a_nt_j^n)P_j = Σ_{j=1}^{k} f(t_j)P_j.
Given an arbitrary
    U = Σ_{j=1}^{k} c_jP_j
in A, there is a polynomial f such that f(t_j) = c_j (1 ≤ j ≤ k), and for any such f, U = f(T).
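The corollary can also be illustrated numerically. In the sketch below (not from the text; the projections and scalars are chosen by hand for simplicity), T = Σ t_jP_j has distinct t_j, and a member U = Σ c_jP_j of the algebra is recovered as an interpolating polynomial in T.

    import numpy as np

    P1 = np.diag([1.0, 1.0, 0.0])          # orthogonal projections with P1 + P2 = I
    P2 = np.diag([0.0, 0.0, 1.0])
    t = np.array([2.0, 5.0])               # distinct scalars
    c = np.array([-1.0, 3.0])
    T = t[0] * P1 + t[1] * P2
    U = c[0] * P1 + c[1] * P2

    coeffs = np.polyfit(t, c, len(t) - 1)  # polynomial f with f(t_j) = c_j, highest power first
    fT = sum(a * np.linalg.matrix_power(T, p)
             for p, a in zip(range(len(coeffs) - 1, -1, -1), coeffs))
    assert np.allclose(fT, U)              # U = f(T)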
Exercises

1. Give a reasonable definition of a non-negative n × n matrix, and then prove that such a matrix has a unique non-negative square root.

2. Let A be an n × n matrix with complex entries such that A* = -A, and let B = e^A. Show that
(a) det B = e^{tr A};
(b) B* = e^{-A};
(c) B is unitary.

3. If U and T are normal operators which commute, prove that U + T and UT are normal.

4. Let T be a linear operator on the finite-dimensional complex inner product space V. Prove that the following ten statements about T are equivalent.
(a) T is normal.
(b) ||Tα|| = ||T*α|| for every α in V.
(c) T = T_1 + iT_2, where T_1 and T_2 are self-adjoint and T_1T_2 = T_2T_1.
(d) If α is a vector and c a scalar such that Tα = cα, then T*α = c̄α.
(e) There is an orthonormal basis for V consisting of characteristic vectors for T.
(f) There is an orthonormal basis B such that [T]_B is diagonal.
(g) There is a polynomial g with complex coefficients such that T* = g(T).
(h) Every subspace which is invariant under T is also invariant under T*.
(i) T = NU, where N is non-negative, U is unitary, and N commutes with U.
(j) T = c_1E_1 + ... + c_kE_k, where I = E_1 + ... + E_k, E_iE_j = 0 for i ≠ j, and E_j² = E_j = E_j*.

5. Use Exercise 3 to show that any commuting family of normal operators (not necessarily diagonalizable ones) on a finite-dimensional inner product space generates a commutative self-adjoint algebra of normal operators.

6. Let V be a finite-dimensional complex inner product space and U a unitary operator on V such that Uα = α implies α = 0. Let
    f(z) = i (1 + z)/(1 - z),  z ≠ 1
and show that
(a) f(U) = i(I + U)(I - U)⁻¹;
(b) f(U) is self-adjoint;
(c) for every self-adjoint operator T on V, the operator
    U = (T - iI)(T + iI)⁻¹
is unitary and such that T = f(U).

7. Let V be the space of complex n × n matrices equipped with the inner product
    (A|B) = tr (AB*).
If B is an element of V, let L_B, R_B, and T_B denote the linear operators on V defined by
(a) L_B(A) = BA.
(b) R_B(A) = AB.
(c) T_B(A) = BA - AB.
Consider the three families of operators obtained by letting B vary over all diagonal matrices. Show that each of these families is a commutative self-adjoint algebra, and find their spectral resolutions.

8. If B is an arbitrary member of the inner product space in Exercise 7, show that L_B is unitarily equivalent to R_{Bᵗ}.

9. Let V be the inner product space in Exercise 7 and G the group of unitary matrices in V. If B is in G, let C_B denote the linear operator on V defined by
    C_B(A) = BAB⁻¹.
Show that
(a) C_B is a unitary operator on V;
(b) C_{B1B2} = C_{B1}C_{B2};
(c) there is no unitary transformation U on V such that
    U L_B U⁻¹ = C_B
for all B in G.

10. Let ℱ be any family of linear operators on a finite-dimensional inner product space V and A the self-adjoint algebra generated by ℱ. Show that
(a) each root of A defines a root of ℱ;
(b) each root r of A is a multiplicative linear function on A, i.e.,
    r(TU) = r(T)r(U)
    r(cT + U) = c r(T) + r(U)
for all T and U in A and all scalars c.

11. Let ℱ be a commuting family of diagonalizable normal operators on a finite-dimensional inner product space V, and let A be the self-adjoint algebra generated by ℱ and the identity operator I. Show that each root of A is different from 0, and that for each root r of ℱ there is a unique root s of A such that s(T) = r(T) for all T in ℱ.

12. Let ℱ be a commuting family of diagonalizable normal operators on a finite-dimensional inner product space V. Let A be the self-adjoint algebra generated by ℱ and A_0 the self-adjoint algebra generated by ℱ and the identity operator I. Show that
(a) A_0 is the set of all operators on V of the form cI + T, where c is a scalar and T an operator in A;
(b) there is at most one root r of A_0 such that r(T) = 0 for all T in A;
(c) if one of the roots of A_0 is 0 on A, the projections P_1, ..., P_k in the resolution of the identity defined by ℱ may be indexed in such a way that A consists of all operators on V of the form
    T = Σ_{j=2}^{k} c_jP_j
where c_2, ..., c_k are arbitrary scalars;
(d) A = A_0 if and only if for each root r of A_0 there exists an operator T in A such that r(T) ≠ 0.

9.6. Further Properties of Normal Operators

In Section 8.5 we developed the basic properties of self-adjoint and normal operators, using the simplest and most direct methods possible. In Section 9.5 we considered various aspects of spectral theory. Here we prove some results of a more technical nature which are mainly about normal operators on real spaces.

We shall begin by proving a sharper version of the primary decomposition theorem of Chapter 6 for normal operators. It applies to both the real and complex cases.

Theorem 17. Let T be a normal operator on a finite-dimensional inner product space V. Let p be the minimal polynomial for T and p_1, ..., p_k its distinct monic prime factors. Then each p_j occurs with multiplicity 1 in the factorization of p and has degree 1 or 2. Suppose W_j is the null space of p_j(T). Then
(i) W_j is orthogonal to W_i when i ≠ j;
(ii) V = W_1 ⊕ ... ⊕ W_k;
(iii) W_j is invariant under T, and p_j is the minimal polynomial for the restriction of T to W_j;
(iv) for every j, there is a polynomial e_j with coefficients in the scalar field such that e_j(T) is the orthogonal projection of V on W_j.

In the proof we use certain basic facts which we state as lemmas.

Lemma 1. Let N be a normal operator on an inner product space W. Then the null space of N is the orthogonal complement of its range.

Proof. Suppose (α|Nβ) = 0 for all β in W. Then (N*α|β) = 0 for all β; hence N*α = 0. By Theorem 19 of Chapter 8, this implies Nα = 0. Conversely, if Nα = 0, then N*α = 0, and
    (N*α|β) = (α|Nβ) = 0
for all β in W.

Lemma 2. If N is a normal operator and α is a vector such that N²α = 0, then Nα = 0.
Proof. Suppose N is normal and that N²α = 0. Then Nα lies in the range of N and also lies in the null space of N. By Lemma 1, this implies Nα = 0.

Lemma 3. Let T be a normal operator and f any polynomial with coefficients in the scalar field. Then f(T) is also normal.

Proof. Suppose f = a_0 + a_1x + ... + a_nx^n. Then
    f(T) = a_0I + a_1T + ... + a_nT^n
and
    f(T)* = ā_0I + ā_1T* + ... + ā_n(T*)^n.
Since T*T = TT*, it follows that f(T) commutes with f(T)*.

Lemma 4. Let T be a normal operator and f, g relatively prime polynomials with coefficients in the scalar field. Suppose α and β are vectors such that f(T)α = 0 and g(T)β = 0. Then (α|β) = 0.

Proof. There are polynomials a and b with coefficients in the scalar field such that af + bg = 1. Thus
    a(T)f(T) + b(T)g(T) = I
and α = g(T)b(T)α. It follows that
    (α|β) = (g(T)b(T)α|β) = (b(T)α|g(T)*β).
By assumption g(T)β = 0. By Lemma 3, g(T) is normal. Therefore, by Theorem 19 of Chapter 8, g(T)*β = 0; hence (α|β) = 0.

Proof of Theorem 17. Recall that the minimal polynomial for T is the monic polynomial of least degree among all polynomials f such that f(T) = 0. The existence of such polynomials follows from the assumption that V is finite-dimensional. Suppose some prime factor p_j of p is repeated. Then p = p_j²g for some polynomial g. Since p(T) = 0, it follows that
    (p_j(T))²g(T)α = 0
for every α in V. By Lemma 3, p_j(T) is normal. Thus Lemma 2 implies
    p_j(T)g(T)α = 0
for every α in V. But this contradicts the assumption that p has least degree among all f such that f(T) = 0. Therefore, p = p_1 ... p_k. If V is a complex inner product space each p_j is necessarily of the form
    p_j = x - c_j
with c_j real or complex. On the other hand, if V is a real inner product space, then p_j = x - c_j with c_j in R or
    p_j = (x - c)(x - c̄)
where c is a non-real complex number.
Now let f_j = p/p_j. Then, since f_1, ..., f_k are relatively prime, there exist polynomials g_j with coefficients in the scalar field such that
(9-16)  1 = Σ_j f_jg_j.
We briefly indicate how such g_j may be constructed. If p_j = x - c_j, then f_j(c_j) ≠ 0, and for g_j we take the scalar polynomial 1/f_j(c_j). When every p_j is of this form, the f_jg_j are the familiar Lagrange polynomials associated with c_1, ..., c_k, and (9-16) is clearly valid. Suppose some p_j = (x - c)(x - c̄) with c a non-real complex number. Then V is a real inner product space, and we take
    g_j = (x - c̄)/s + (x - c)/s̄
where s = (c - c̄)f_j(c). Then
    g_j = [(s + s̄)x - (cs + c̄s̄)] / (ss̄)
so that g_j is a polynomial with real coefficients. If p has degree n, then
    1 - Σ_j f_jg_j
is a polynomial with real coefficients of degree at most n - 1; moreover, it vanishes at each of the n (complex) roots of p, and hence is identically 0.

Now let α be an arbitrary vector in V. Then by (9-16)
    α = Σ_j f_j(T)g_j(T)α
and since p_j(T)f_j(T) = 0, it follows that f_j(T)g_j(T)α is in W_j for every j. By Lemma 4, W_i is orthogonal to W_j whenever i ≠ j. Therefore, V is the orthogonal direct sum of W_1, ..., W_k. If β is any vector in W_j, then
    p_j(T)Tβ = Tp_j(T)β = 0;
thus W_j is invariant under T. Let T_j be the restriction of T to W_j. Then p_j(T_j) = 0, so that p_j is divisible by the minimal polynomial for T_j. Since p_j is irreducible over the scalar field, it follows that p_j is the minimal polynomial for T_j.

Next, let e_j = f_jg_j and E_j = e_j(T). Then for every vector α in V, E_jα is in W_j, and
    α = Σ_j E_jα.
Thus α - E_iα = Σ_{j ≠ i} E_jα; since W_j is orthogonal to W_i when j ≠ i, this implies that α - E_iα is in W_i⊥. It now follows from Theorem 4 of Chapter 8 that E_i is the orthogonal projection of V on W_i.

Definition. We call the subspaces W_j (1 ≤ j ≤ k) the primary components of V under T.
Corollary. Let T be a normal operator on a finite-dimensional inner product space V and W_1, ..., W_k the primary components of V under T. Suppose W is a subspace of V which is invariant under T. Then
    W = Σ_j W ∩ W_j.

Proof. Clearly W contains Σ_j W ∩ W_j. On the other hand, W, being invariant under T, is invariant under every polynomial in T. In particular, W is invariant under the orthogonal projection E_j of V on W_j. If α is in W, it follows that E_jα is in W ∩ W_j, and, at the same time, α = Σ_j E_jα. Therefore W is contained in Σ_j W ∩ W_j.

Theorem 17 shows that every normal operator T on a finite-dimensional inner product space is canonically specified by a finite number of normal operators T_j, defined on the primary components W_j of V under T, each of whose minimal polynomials is irreducible over the field of scalars. To complete our understanding of normal operators it is necessary to study normal operators of this special type.

A normal operator whose minimal polynomial is of degree 1 is clearly just a scalar multiple of the identity. On the other hand, when the minimal polynomial is irreducible and of degree 2 the situation is more complicated.

EXAMPLE 1. Suppose r > 0 and that θ is a real number which is not an integral multiple of π. Let T be the linear operator on R² whose matrix in the standard orthonormal basis is
    A = r [cos θ  -sin θ]
          [sin θ   cos θ].
Then T is a scalar multiple of an orthogonal transformation and hence normal. Let p be the characteristic polynomial of T. Then
    p = det (xI - A)
      = (x - r cos θ)² + r² sin² θ
      = x² - 2r cos θ x + r².
Let a = r cos θ, b = r sin θ, and c = a + ib. Then b ≠ 0, c = re^{iθ}, and p = (x - c)(x - c̄). Hence p is irreducible over R. Since p is divisible by the minimal polynomial for T, it follows that p is the minimal polynomial.
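A quick numerical check of Example 1 follows; it is not in the text, and r and θ are chosen arbitrarily.

    import numpy as np

    r, theta = 2.0, 0.7
    A = r * np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])

    assert np.allclose(A @ A.T, A.T @ A)                       # A is normal
    assert np.isclose(np.trace(A), 2 * r * np.cos(theta))      # p = x^2 - 2r cos(theta) x + r^2
    assert np.isclose(np.linalg.det(A), r ** 2)
    # the discriminant is -4 r^2 sin^2(theta) < 0, so p has no real root
    assert (2 * r * np.cos(theta)) ** 2 - 4 * r ** 2 < 0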
This example suggests the following converse.

Theorem 18. Let T be a normal operator on a finite-dimensional real inner product space V and p its minimal polynomial. Suppose
    p = (x - a)² + b²
where a and b are real and b ≠ 0. Then there is an integer s > 0 such that p^s is the characteristic polynomial for T, and there exist subspaces V_1, ..., V_s of V such that
(i) V_j is orthogonal to V_i when i ≠ j;
(ii) V = V_1 ⊕ ... ⊕ V_s;
(iii) each V_j has an orthonormal basis {α_j, β_j} with the property that
    Tα_j = aα_j + bβ_j
    Tβ_j = -bα_j + aβ_j.

In other words, if r = √(a² + b²) and θ is chosen so that a = r cos θ and b = r sin θ, then V is an orthogonal direct sum of two-dimensional subspaces V_j on each of which T acts as 'r times rotation through the angle θ'.

The proof of Theorem 18 will be based on the following result.

Lemma. Let V be a real inner product space and S a normal operator on V such that S² + I = 0. Let α be any vector in V and β = Sα. Then
(9-17)  S*α = -β,  S*β = α,
(α|β) = 0, and ||β|| = ||α||.

Proof. We have Sα = β and Sβ = S²α = -α. Therefore
    0 = ||Sα - β||² + ||Sβ + α||²
      = ||Sα||² - 2(Sα|β) + ||β||² + ||Sβ||² + 2(Sβ|α) + ||α||².
Since S is normal, it follows that
    0 = ||S*α||² + 2(S*α|β) + ||β||² + ||S*β||² - 2(S*β|α) + ||α||²
      = ||S*α + β||² + ||S*β - α||².
This implies (9-17); hence
    (α|β) = (S*β|β) = (β|Sβ) = (β|-α) = -(α|β)
and (α|β) = 0. Similarly
    ||α||² = (S*β|α) = (β|Sα) = ||β||².
Proof of Theorem 18. Let V_1, ..., V_s be a maximal collection of two-dimensional subspaces satisfying (i) and (iii), and the additional conditions
(9-18)  T*α_j = aα_j - bβ_j
        T*β_j = bα_j + aβ_j,   1 ≤ j ≤ s.
Let W = V_1 + ... + V_s. Then W is the orthogonal direct sum of V_1, ..., V_s. We shall show that W = V. Suppose that this is not the case. Then W⊥ ≠ {0}. Moreover, since (iii) and (9-18) imply that W is invariant under T and T*, it follows that W⊥ is invariant under T* and T = T**. Let S = b⁻¹(T - aI). Then S* = b⁻¹(T* - aI), S*S = SS*, and W⊥ is invariant under S and S*. Since (T - aI)² + b²I = 0, it follows that S² + I = 0. Let α be any vector of norm 1 in W⊥ and set β = Sα. Then β is in W⊥ and Sβ = -α. Since T = aI + bS, this implies
    Tα = aα + bβ
    Tβ = -bα + aβ.
By the lemma, S*α = -β, S*β = α, (α|β) = 0, and ||β|| = 1. Because T* = aI + bS*, it follows that
    T*α = aα - bβ
    T*β = bα + aβ.
But this contradicts the fact that V_1, ..., V_s is a maximal collection of subspaces satisfying (i), (iii), and (9-18). Therefore W = V, and since
    det [x - a    b  ]  = (x - a)² + b²
        [ -b    x - a]
it follows from (i), (ii) and (iii) that
    det (xI - T) = [(x - a)² + b²]^s.

Corollary. Under the conditions of the theorem, T is invertible, and
    T* = (a² + b²)T⁻¹.

Proof. Since
    [a  -b] [ a  b]  =  [a² + b²      0   ]
    [b   a] [-b  a]     [   0     a² + b² ]
it follows from (iii) and (9-18) that TT* = (a² + b²)I. Hence T is invertible and T* = (a² + b²)T⁻¹.

Theorem 19. Let T be a normal operator on a finite-dimensional inner product space V. Then any linear operator that commutes with T also commutes with T*. Moreover, every subspace invariant under T is also invariant under T*.

Proof. Suppose U is a linear operator on V that commutes with T. Let E_j be the orthogonal projection of V on the primary component
W_j (1 ≤ j ≤ k) of V under T. Then E_j is a polynomial in T and hence commutes with U. Thus
    E_jUE_j = UE_j² = UE_j.
Thus U(W_j) is a subset of W_j. Let T_j and U_j denote the restrictions of T and U to W_j. Suppose I_j is the identity operator on W_j. Then U_j commutes with T_j, and if T_j = c_jI_j, it is clear that U_j also commutes with T_j* = c̄_jI_j. On the other hand, if T_j is not a scalar multiple of I_j, then T_j is invertible and there exist real numbers a_j and b_j such that
    T_j* = (a_j² + b_j²)T_j⁻¹.
Since U_jT_j = T_jU_j, it follows that T_j⁻¹U_j = U_jT_j⁻¹. Therefore U_j commutes with T_j* in both cases. Now T* also commutes with E_j, and hence W_j is invariant under T*. Moreover, for every α and β in W_j,
    (T_jα|β) = (Tα|β) = (α|T*β) = (α|T_j*β).
Since T*(W_j) is contained in W_j, this implies T_j* is the restriction of T* to W_j. Thus
    UT*α_j = T*Uα_j
for every α_j in W_j. Since V is the sum of W_1, ..., W_k, it follows that
    UT*α = T*Uα
for every α in V and hence that U commutes with T*.

Now suppose W is a subspace of V that is invariant under T, and let Z_j = W ∩ W_j. By the corollary to Theorem 17, W = Σ_j Z_j. Thus it suffices to show that each Z_j is invariant under T_j*. This is clear if T_j = c_jI_j. When this is not the case, T_j is invertible and maps Z_j into and hence onto Z_j. Thus T_j⁻¹(Z_j) = Z_j, and since
    T_j* = (a_j² + b_j²)T_j⁻¹
it follows that T*(Z_j) is contained in Z_j, for every j.

Suppose T is a normal operator on a finite-dimensional inner product space V. Let W be a subspace invariant under T. Then the preceding theorem shows that W is invariant under T*. From this it follows that W⊥ is invariant under T** = T (and hence under T* as well). Using this fact one can easily prove the following strengthened version of the cyclic decomposition theorem given in Chapter 7.

Theorem 20. Let T be a normal linear operator on a finite-dimensional inner product space V (dim V ≥ 1). Then there exist r non-zero vectors α_1, ..., α_r in V with respective T-annihilators e_1, ..., e_r such that
(i) V = Z(α_1; T) ⊕ ... ⊕ Z(α_r; T);
(ii) if 1 ≤ k ≤ r - 1, then e_{k+1} divides e_k;
(iii) Z(α_j; T) is orthogonal to Z(α_k; T) when j ≠ k. Furthermore, the integer r and the annihilators e_1, ..., e_r are uniquely determined by conditions (i) and (ii) and the fact that no α_k is 0.

Corollary. If A is a normal matrix with real (complex) entries, then there is a real orthogonal (unitary) matrix P such that P⁻¹AP is in rational canonical form.

It follows that two normal matrices A and B are unitarily equivalent if and only if they have the same rational form; A and B are orthogonally equivalent if they have real entries and the same rational form.

On the other hand, there is a simpler criterion for the unitary equivalence of normal matrices and normal operators.

Definitions. Let V and V' be inner product spaces over the same field. A linear transformation
    U: V → V'
is called a unitary transformation if it maps V onto V' and preserves inner products. If T is a linear operator on V and T' a linear operator on V', then T is unitarily equivalent to T' if there exists a unitary transformation U of V onto V' such that
    UTU⁻¹ = T'.

Lemma. Let V and V' be finite-dimensional inner product spaces over the same field. Suppose T is a linear operator on V and that T' is a linear operator on V'. Then T is unitarily equivalent to T' if and only if there is an orthonormal basis B of V and an orthonormal basis B' of V' such that
    [T]_B = [T']_{B'}.

Proof. Suppose there is a unitary transformation U of V onto V' such that UTU⁻¹ = T'. Let B = {α_1, ..., α_n} be any (ordered) orthonormal basis for V. Let α_j' = Uα_j (1 ≤ j ≤ n). Then B' = {α_1', ..., α_n'} is an orthonormal basis for V' and setting
    Tα_j = Σ_k A_kjα_k
we see that
    T'α_j' = UTU⁻¹(Uα_j) = UTα_j = Σ_k A_kjUα_k = Σ_k A_kjα_k'.
Hence [T]_B = A = [T']_{B'}.
Conversely, suppose there is an orthonormal basis B of V and an orthonormal basis B' of V' such that
    [T]_B = [T']_{B'}
and let A = [T]_B. Suppose B = {α_1, ..., α_n} and that B' = {α_1', ..., α_n'}. Let U be the linear transformation of V into V' such that Uα_j = α_j' (1 ≤ j ≤ n). Then U is a unitary transformation of V onto V', and
    UTU⁻¹α_j' = UTα_j = U Σ_k A_kjα_k = Σ_k A_kjα_k'.
Therefore UTU⁻¹α_j' = T'α_j' (1 ≤ j ≤ n), and this implies UTU⁻¹ = T'.

It follows immediately from the lemma that unitarily equivalent operators on finite-dimensional spaces have the same characteristic polynomial. For normal operators the converse is valid.

Theorem 21. Let V and V' be finite-dimensional inner product spaces over the same field. Suppose T is a normal operator on V and that T' is a normal operator on V'. Then T is unitarily equivalent to T' if and only if T and T' have the same characteristic polynomial.

Proof. Suppose T and T' have the same characteristic polynomial f. Let W_j (1 ≤ j ≤ k) be the primary components of V under T, and T_j the restriction of T to W_j. Suppose I_j is the identity operator on W_j. Then
    f = Π_{j=1}^{k} det (xI_j - T_j).
Let p_j be the minimal polynomial for T_j. If p_j = x - c_j, it is clear that
    det (xI_j - T_j) = (x - c_j)^{s_j}
where s_j is the dimension of W_j. On the other hand, if p_j = (x - a_j)² + b_j² with a_j, b_j real and b_j ≠ 0, then it follows from Theorem 18 that
    det (xI_j - T_j) = p_j^{s_j}
where in this case 2s_j is the dimension of W_j. Therefore f = Π_j p_j^{s_j}. Now we can also compute f by the same method using the primary components of V' under T'. Since p_1, ..., p_k are distinct primes, it follows from the uniqueness of the prime factorization of f that there are exactly k primary components W_j' (1 ≤ j ≤ k) of V' under T' and that these may be indexed in such a way that p_j is the minimal polynomial for the restriction T_j' of T' to W_j'. If p_j = x - c_j, then T_j = c_jI_j and T_j' = c_jI_j', where I_j' is the
identity operator on W_j'. In this case it is evident that T_j is unitarily equivalent to T_j'. If p_j = (x - a_j)² + b_j², as above, then using the lemma and Theorem 20, we again see that T_j is unitarily equivalent to T_j'. Thus for each j there are orthonormal bases B_j and B_j' of W_j and W_j', respectively, such that
    [T_j]_{B_j} = [T_j']_{B_j'}.
Now let U be the linear transformation of V into V' that maps each B_j onto B_j'. Then U is a unitary transformation of V onto V' such that UTU⁻¹ = T'.
10. Bilinear Forms

10.1. Bilinear Forms

In this chapter, we treat bilinear forms on finite-dimensional vector spaces. The reader will probably observe a similarity between some of the material and the discussion of determinants in Chapter 5 and of inner products and forms in Chapter 8 and in Chapter 9. The relation between bilinear forms and inner products is particularly strong; however, this chapter does not presuppose any of the material in Chapter 8 or Chapter 9. The reader who is not familiar with inner products would probably profit by reading the first part of Chapter 8 as he reads the discussion of bilinear forms.

This first section treats the space of bilinear forms on a vector space of dimension n. The matrix of a bilinear form in an ordered basis is introduced, and the isomorphism between the space of forms and the space of n × n matrices is established. The rank of a bilinear form is defined, and non-degenerate bilinear forms are introduced. The second section discusses symmetric bilinear forms and their diagonalization. The third section treats skew-symmetric bilinear forms. The fourth section discusses the group preserving a non-degenerate bilinear form, with special attention given to the orthogonal groups, the pseudo-orthogonal groups, and a particular pseudo-orthogonal group, the Lorentz group.

Definition. Let V be a vector space over the field F. A bilinear form on V is a function f, which assigns to each ordered pair of vectors α, β in V a scalar f(α, β) in F, and which satisfies
(10-1)  f(cα_1 + α_2, β) = c f(α_1, β) + f(α_2, β)
        f(α, cβ_1 + β_2) = c f(α, β_1) + f(α, β_2).

If we let V × V denote the set of all ordered pairs of vectors in V, this definition can be rephrased as follows: A bilinear form on V is a function f from V × V into F which is linear as a function of either of its arguments when the other is fixed. The zero function from V × V into F is clearly a bilinear form. It is also true that any linear combination of bilinear forms on V is again a bilinear form. To prove this, it is sufficient to consider linear combinations of the type cf + g, where f and g are bilinear forms on V. The proof that cf + g satisfies (10-1) is similar to many others we have given, and we shall thus omit it. All this may be summarized by saying that the set of all bilinear forms on V is a subspace of the space of all functions from V × V into F (Example 3, Chapter 2). We shall denote the space of bilinear forms on V by L(V, V, F).

EXAMPLE 1. Let V be a vector space over the field F and let L_1 and L_2 be linear functionals on V. Define f by
    f(α, β) = L_1(α)L_2(β).
If we fix β and regard f as a function of α, then we simply have a scalar multiple of the linear functional L_1. With α fixed, f is a scalar multiple of L_2. Thus it is clear that f is a bilinear form on V.

EXAMPLE 2. Let m and n be positive integers and F a field. Let V be the vector space of all m × n matrices over F. Let A be a fixed m × m matrix over F. Define
    f_A(X, Y) = tr (XᵗAY).
Then f_A is a bilinear form on V. For, if X, Y, and Z are m × n matrices over F,
    f_A(cX + Z, Y) = tr [(cX + Z)ᵗAY]
                   = c tr (XᵗAY) + tr (ZᵗAY)
                   = c f_A(X, Y) + f_A(Z, Y).
Of course, we have used the fact that the transpose operation and the trace function are linear. It is even easier to show that f_A is linear as a function of its second argument. In the special case n = 1, the matrix XᵗAY is 1 × 1, i.e., a scalar, and the bilinear form is simply
    f_A(X, Y) = XᵗAY.
We shall presently show that every bilinear form on the space of m × 1 matrices is of this type, i.e., is f_A for some m × m matrix A.
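A short numerical check of Example 2 follows. It is not from the text; the matrices are random and the choice m = 3, n = 2 is arbitrary.

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((3, 3))
    X, Z, Y = (rng.standard_normal((3, 2)) for _ in range(3))
    c = 1.7

    f = lambda X, Y: np.trace(X.T @ A @ Y)
    assert np.isclose(f(c * X + Z, Y), c * f(X, Y) + f(Z, Y))    # linear in the first argument
    assert np.isclose(f(X, c * Y + Z), c * f(X, Y) + f(X, Z))    # and in the second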
EXAMPLE 3. Let F be a field. Let us find all bilinear forms on the space F². Suppose f is such a bilinear form. If α = (x_1, x_2) and β = (y_1, y_2) are vectors in F², then
    f(α, β) = f(x_1ε_1 + x_2ε_2, β)
            = x_1 f(ε_1, β) + x_2 f(ε_2, β)
            = x_1 f(ε_1, y_1ε_1 + y_2ε_2) + x_2 f(ε_2, y_1ε_1 + y_2ε_2)
            = x_1y_1 f(ε_1, ε_1) + x_1y_2 f(ε_1, ε_2) + x_2y_1 f(ε_2, ε_1) + x_2y_2 f(ε_2, ε_2).
Thus f is completely determined by the four scalars A_ij = f(ε_i, ε_j) by
    f(α, β) = A_11x_1y_1 + A_12x_1y_2 + A_21x_2y_1 + A_22x_2y_2
            = Σ_{i,j} A_ij x_iy_j.
If X and Y are the coordinate matrices of α and β, and if A is the 2 × 2 matrix with entries A(i, j) = A_ij = f(ε_i, ε_j), then
(10-2)  f(α, β) = XᵗAY.
We observed in Example 2 that if A is any 2 × 2 matrix over F, then (10-2) defines a bilinear form on F². We see that the bilinear forms on F² are precisely those obtained from a 2 × 2 matrix as in (10-2).

The discussion in Example 3 can be generalized so as to describe all bilinear forms on a finite-dimensional vector space. Let V be a finite-dimensional vector space over the field F and let B = {α_1, ..., α_n} be an ordered basis for V. Suppose f is a bilinear form on V. If
    α = x_1α_1 + ... + x_nα_n  and  β = y_1α_1 + ... + y_nα_n
are vectors in V, then
    f(α, β) = f(Σ_i x_iα_i, β)
            = Σ_i x_i f(α_i, β)
            = Σ_i x_i f(α_i, Σ_j y_jα_j)
            = Σ_i Σ_j x_iy_j f(α_i, α_j).
If we let A_ij = f(α_i, α_j), then
    f(α, β) = Σ_i Σ_j A_ij x_iy_j = XᵗAY
where X and Y are the coordinate matrices of α and β in the ordered basis B. Thus every bilinear form on V is of the type
(10-3)  f(α, β) = [α]_Bᵗ A [β]_B
for some n × n matrix A over F. Conversely, if we are given any n × n matrix A, it is easy to see that (10-3) defines a bilinear form f on V, such that A_ij = f(α_i, α_j).
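The formula (10-3) can be verified numerically. In the sketch below (not part of the text) f is the ordinary dot product on R³ and the columns of B form an arbitrary non-standard ordered basis; the matrix A with entries A_ij = f(α_i, α_j) reproduces f from coordinates.

    import numpy as np

    B = np.array([[1.0, 1.0, 0.0],
                  [0.0, 1.0, 1.0],
                  [1.0, 0.0, 1.0]])            # columns are the basis vectors a_1, a_2, a_3
    f = lambda a, b: a @ b                     # the dot product

    A = np.array([[f(B[:, i], B[:, j]) for j in range(3)] for i in range(3)])

    X = np.array([2.0, -1.0, 3.0])             # coordinates of alpha in this basis
    Y = np.array([0.5, 4.0, 1.0])              # coordinates of beta
    assert np.isclose(f(B @ X, B @ Y), X @ A @ Y)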
Definition. Let V be a finite-dimensional vector space, and let B = {α_1, ..., α_n} be an ordered basis for V. If f is a bilinear form on V, the matrix of f in the ordered basis B is the n × n matrix A with entries A_ij = f(α_i, α_j). At times, we shall denote this matrix by [f]_B.

Theorem 1. Let V be a finite-dimensional vector space over the field F. For each ordered basis B of V, the function which associates with each bilinear form on V its matrix in the ordered basis B is an isomorphism of the space L(V, V, F) onto the space of n × n matrices over the field F.

Proof. We observed above that f → [f]_B is a one-one correspondence between the set of bilinear forms on V and the set of all n × n matrices over F. That this is a linear transformation is easy to see, because
    (cf + g)(α_i, α_j) = c f(α_i, α_j) + g(α_i, α_j)
for each i and j. This simply says that
    [cf + g]_B = c[f]_B + [g]_B.

Corollary. If B = {α_1, ..., α_n} is an ordered basis for V, and B* = {L_1, ..., L_n} is the dual basis for V*, then the n² bilinear forms
    f_ij(α, β) = L_i(α)L_j(β),  1 ≤ i ≤ n, 1 ≤ j ≤ n
form a basis for the space L(V, V, F). In particular, the dimension of L(V, V, F) is n².

Proof. The dual basis {L_1, ..., L_n} is essentially defined by the fact that L_i(α) is the ith coordinate of α in the ordered basis B (for any α in V). Now the functions f_ij defined by
    f_ij(α, β) = L_i(α)L_j(β)
are bilinear forms of the type considered in Example 1. If
    α = x_1α_1 + ... + x_nα_n  and  β = y_1α_1 + ... + y_nα_n,
then
    f_ij(α, β) = x_iy_j.
Let f be any bilinear form on V and let A be the matrix of f in the ordered basis B. Then
    f(α, β) = Σ_{i,j} A_ij x_iy_j
which simply says that
    f = Σ_{i,j} A_ij f_ij.
It is now clear that the n² forms f_ij comprise a basis for L(V, V, F).

One can rephrase the proof of the corollary as follows. The bilinear form f_ij has as its matrix in the ordered basis B the matrix 'unit' E^{i,j},
whose only non-zero entry is a 1 in row i and column j. Since these matrix units comprise a basis for the space of n × n matrices, the forms f_ij comprise a basis for the space of bilinear forms.

The concept of the matrix of a bilinear form in an ordered basis is similar to that of the matrix of a linear operator in an ordered basis. Just as for linear operators, we shall be interested in what happens to the matrix representing a bilinear form, as we change from one ordered basis to another. So, suppose B = {α_1, ..., α_n} and B' = {α_1', ..., α_n'} are two ordered bases for V and that f is a bilinear form on V. How are the matrices [f]_B and [f]_{B'} related? Well, let P be the (invertible) n × n matrix such that
    [α]_B = P[α]_{B'}
for all α in V. In other words, define P by
    α_j' = Σ_{i=1}^{n} P_ij α_i.
For any vectors α, β in V,
    f(α, β) = [α]_Bᵗ [f]_B [β]_B
            = (P[α]_{B'})ᵗ [f]_B P[β]_{B'}
            = [α]_{B'}ᵗ (Pᵗ[f]_B P) [β]_{B'}.
By the definition and uniqueness of the matrix representing f in the ordered basis B', we must have
(10-4)  [f]_{B'} = Pᵗ[f]_B P.

EXAMPLE 4. Let V be the vector space R². Let f be the bilinear form defined on α = (x_1, x_2) and β = (y_1, y_2) by
    f(α, β) = x_1y_1 + x_1y_2 + x_2y_1 + x_2y_2.
Now
    f(α, β) = [x_1, x_2] [1  1] [y_1]
                         [1  1] [y_2]
and so the matrix of f in the standard ordered basis B = {ε_1, ε_2} is
    [f]_B = [1  1]
            [1  1].
Let B' = {ε_1', ε_2'} be the ordered basis defined by ε_1' = (1, -1), ε_2' = (1, 1). In this case, the matrix P which changes coordinates from B' to B is
    P = [ 1  1]
        [-1  1].
Thus
    [f]_{B'} = Pᵗ[f]_B P = [1  -1] [1  1] [ 1  1]
                           [1   1] [1  1] [-1  1]
             = [0  0] [ 1  1]
               [2  2] [-1  1]
             = [0  0]
               [0  4].
What this means is that if we express the vectors α and β by means of their coordinates in the basis B', say
    α = x_1'ε_1' + x_2'ε_2',   β = y_1'ε_1' + y_2'ε_2'
then
    f(α, β) = 4x_2'y_2'.

One consequence of the change of basis formula (10-4) is the following: If A and B are n × n matrices which represent the same bilinear form on V in (possibly) different ordered bases, then A and B have the same rank. For, if P is an invertible n × n matrix and B = PᵗAP, it is evident that A and B have the same rank. This makes it possible to define the rank of a bilinear form on V as the rank of any matrix which represents the form in an ordered basis for V.
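A numerical check of Example 4 and of the change-of-basis rule (10-4) follows; it is not from the text, and it also confirms that congruent matrices have the same rank.

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [1.0, 1.0]])                 # [f]_B in the standard basis
    P = np.array([[1.0, 1.0],
                  [-1.0, 1.0]])                # columns: coordinates of e1', e2'

    A_prime = P.T @ A @ P
    assert np.allclose(A_prime, np.diag([0.0, 4.0]))
    assert np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A_prime) == 1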
It is desirable to give a more intrinsic definition of the rank of a bilinear form. This can be done as follows: Suppose f is a bilinear form on the vector space V. If we fix a vector α in V, then f(α, β) is linear as a function of β. In this way, each fixed α determines a linear functional on V; let us denote this linear functional by L_f(α). To repeat, if α is a vector in V, then L_f(α) is the linear functional on V whose value on any vector β is f(α, β). This gives us a transformation α → L_f(α) from V into the dual space V*. Since
    f(cα_1 + α_2, β) = c f(α_1, β) + f(α_2, β)
we see that
    L_f(cα_1 + α_2) = cL_f(α_1) + L_f(α_2),
that is, L_f is a linear transformation from V into V*.

In a similar manner, f determines a linear transformation R_f from V into V*. For each fixed β in V, f(α, β) is linear as a function of α. We define R_f(β) to be the linear functional on V whose value on the vector α is f(α, β).

Theorem 2. Let f be a bilinear form on the finite-dimensional vector space V. Let L_f and R_f be the linear transformations from V into V* defined by (L_fα)(β) = f(α, β) = (R_fβ)(α). Then rank (L_f) = rank (R_f).

Proof. One can give a 'coordinate free' proof of this theorem. Such a proof is similar to the proof (in Section 3.7) that the row-rank of a matrix is equal to its column-rank. So, here we shall give a proof which proceeds by choosing a coordinate system (basis) and then using the 'row-rank equals column-rank' theorem.

To prove rank (L_f) = rank (R_f), it will suffice to prove that L_f and R_f have the same nullity. Let B be an ordered basis for V, and let A = [f]_B. If α and β are vectors in V, with coordinate matrices X and Y in the ordered basis B, then f(α, β) = XᵗAY. Now R_f(β) = 0 means that f(α, β) = 0 for every α in V, i.e., that XᵗAY = 0 for every n × 1 matrix X. The latter condition simply says that AY = 0. The nullity of R_f is therefore equal to the dimension of the space of solutions of AY = 0.

Similarly, L_f(α) = 0 if and only if XᵗAY = 0 for every n × 1 matrix Y. Thus α is in the null space of L_f if and only if XᵗA = 0, i.e., AᵗX = 0. The nullity of L_f is therefore equal to the dimension of the space of solutions of AᵗX = 0. Since the matrices A and Aᵗ have the same column-rank, we see that
    nullity (L_f) = nullity (R_f).

Definition. If f is a bilinear form on the finite-dimensional space V, the rank of f is the integer r = rank (L_f) = rank (R_f).

Corollary 1. The rank of a bilinear form is equal to the rank of the matrix of the form in any ordered basis.

Corollary 2. If f is a bilinear form on the n-dimensional vector space V, the following are equivalent:
(a) rank (f) = n.
(b) For each non-zero α in V, there is a β in V such that f(α, β) ≠ 0.
(c) For each non-zero β in V, there is an α in V such that f(α, β) ≠ 0.

Proof. Statement (b) simply says that the null space of L_f is the zero subspace. Statement (c) says that the null space of R_f is the zero subspace. The linear transformations L_f and R_f have nullity 0 if and only if they have rank n, i.e., if and only if rank (f) = n.

Definition. A bilinear form f on a vector space V is called non-degenerate (or non-singular) if it satisfies conditions (b) and (c) of Corollary 2.

If V is finite-dimensional, then f is non-degenerate provided f satisfies any one of the three conditions of Corollary 2. In particular, f is non-degenerate (non-singular) if and only if its matrix in some (every) ordered basis for V is a non-singular matrix.

EXAMPLE 5. Let V = Rⁿ, and let f be the bilinear form defined on α = (x_1, ..., x_n) and β = (y_1, ..., y_n) by
    f(α, β) = x_1y_1 + ... + x_ny_n.
Then f is a non-degenerate bilinear form on Rⁿ. The matrix of f in the standard ordered basis is the n × n identity matrix:
    f(X, Y) = XᵗY.
This f is usually called the dot (or scalar) product. The reader is probably familiar with this bilinear form, at least in the case n = 3. Geometrically, the number f(α, β) is the product of the length of α, the length of β, and the cosine of the angle between α and β. In particular, f(α, β) = 0 if and only if the vectors α and β are orthogonal (perpendicular).

Exercises

1. Which of the following functions f, defined on vectors α = (x_1, x_2) and β = (y_1, y_2) in R², are bilinear forms?
(a) f(α, β) = 1.
(b) f(α, β) = (x_1 - y_1)² + x_2y_2.
(c) f(α, β) = (x_1 + y_1)² - (x_1 - y_1)².
(d) f(α, β) = x_1y_2 - x_2y_1.

2. Let f be the bilinear form on R² defined by
    f((x_1, y_1), (x_2, y_2)) = x_1y_1 + x_2y_2.
Find the matrix of f in each of the following bases:
    {(1, 0), (0, 1)},   {(1, -1), (1, 1)},   {(1, 2), (3, 4)}.

3. Let V be the space of all 2 × 3 matrices over R, and let f be the bilinear form on V defined by f(X, Y) = trace (XᵗAY), where A is a fixed 2 × 2 matrix over R. Find the matrix of f in the ordered basis
    {E¹¹, E¹², E¹³, E²¹, E²², E²³}
where E^{ij} is the matrix whose only non-zero entry is a 1 in row i and column j.

4. Describe explicitly all bilinear forms f on R³ with the property that f(α, β) = f(β, α) for all α, β.

5. Describe the bilinear forms on R³ which satisfy f(α, β) = -f(β, α) for all α, β.

6. Let n be a positive integer, and let V be the space of all n × n matrices over the field of complex numbers. Show that the equation
    f(A, B) = n tr (AB) - tr (A) tr (B)
defines a bilinear form f on V. Is it true that f(A, B) = f(B, A) for all A, B?

7. Let f be the bilinear form defined in Exercise 6. Show that f is degenerate (not non-degenerate). Let V_1 be the subspace of V consisting of the matrices of trace 0, and let f_1 be the restriction of f to V_1. Show that f_1 is non-degenerate.
8. Let f be the bilinear form defined in Exercise 6, and let V_2 be the subspace of V consisting of all matrices A such that trace (A) = 0 and A* = -A (A* is the conjugate transpose of A). Denote by f_2 the restriction of f to V_2. Show that f_2 is negative definite, i.e., that f_2(A, A) < 0 for each non-zero A in V_2.

9. Let f be the bilinear form defined in Exercise 6. Let W be the set of all matrices A in V such that f(A, B) = 0 for all B. Show that W is a subspace of V. Describe W explicitly and find its dimension.

10. Let f be any bilinear form on a finite-dimensional vector space V. Let W be the subspace of all β such that f(α, β) = 0 for every α. Show that
    rank f = dim V - dim W.
Use this result and the result of Exercise 9 to compute the rank of the bilinear form defined in Exercise 6.

11. Let f be a bilinear form on a finite-dimensional vector space V. Suppose V_1 is a subspace of V with the property that the restriction of f to V_1 is non-degenerate. Show that rank f ≥ dim V_1.

12. Let f, g be bilinear forms on a finite-dimensional vector space V. Suppose g is non-singular. Show that there exist unique linear operators T_1, T_2 on V such that
    f(α, β) = g(T_1α, β) = g(α, T_2β)
for all α, β.

13. Show that the result given in Exercise 12 need not be true if g is singular.

14. Let f be a bilinear form on a finite-dimensional vector space V. Show that f can be expressed as a product of two linear functionals (i.e., f(α, β) = L_1(α)L_2(β) for L_1, L_2 in V*) if and only if f has rank 1.

10.2. Symmetric Bilinear Forms

The main purpose of this section is to answer the following question: If f is a bilinear form on the finite-dimensional vector space V, when is there an ordered basis B for V in which f is represented by a diagonal matrix? We prove that this is possible if and only if f is a symmetric bilinear form, i.e., f(α, β) = f(β, α). The theorem is proved only when the scalar field has characteristic zero, that is, when, for every positive integer n, the sum 1 + ... + 1 (n times) in F is not 0.

Definition. Let f be a bilinear form on the vector space V. We say that f is symmetric if f(α, β) = f(β, α) for all vectors α, β in V.

If V is finite-dimensional, the bilinear form f is symmetric if and only if its matrix A in some (or every) ordered basis is symmetric, Aᵗ = A. To see this, one inquires when the bilinear form
    f(X, Y) = XᵗAY
is symmetric. This happens if and only if XᵗAY = YᵗAX for all column matrices X and Y. Since XᵗAY is a 1 × 1 matrix, we have XᵗAY = YᵗAᵗX. Thus f is symmetric if and only if YᵗAᵗX = YᵗAX for all X, Y. Clearly this just means that A = Aᵗ. In particular, one should note that if there is an ordered basis for V in which f is represented by a diagonal matrix, then f is symmetric, for any diagonal matrix is a symmetric matrix.

If f is a symmetric bilinear form, the quadratic form associated with f is the function q from V into F defined by
    q(α) = f(α, α).
If F is a subfield of the complex numbers, the symmetric bilinear form f is completely determined by its associated quadratic form, according to the polarization identity
(10-5)  f(α, β) = (1/4)q(α + β) - (1/4)q(α - β).
The establishment of (10-5) is a routine computation, which we omit. If f is the bilinear form of Example 5, the dot product, the associated quadratic form is
    q(x_1, ..., x_n) = x_1² + ... + x_n².
In other words, q(α) is the square of the length of α. For the bilinear form f_A(X, Y) = XᵗAY, the associated quadratic form is
    q_A(X) = XᵗAX = Σ_{i,j} A_ij x_ix_j.

One important class of symmetric bilinear forms consists of the inner products on real vector spaces, discussed in Chapter 8. If V is a real vector space, an inner product on V is a symmetric bilinear form f on V which satisfies
(10-6)  f(α, α) > 0  if  α ≠ 0.
A bilinear form satisfying (10-6) is called positive definite. Thus, an inner product on a real vector space is a positive definite symmetric bilinear form on that space. Note that an inner product is non-degenerate. Two vectors α, β are called orthogonal with respect to the inner product f if f(α, β) = 0. The quadratic form q(α) = f(α, α) takes only non-negative values, and q(α) is usually thought of as the square of the length of α. Of course, these concepts of length and orthogonality stem from the most important example of an inner product, the dot product of Example 5.

If f is any symmetric bilinear form on a vector space V, it is convenient to apply some of the terminology of inner products to f. It is especially convenient to say that α and β are orthogonal with respect to f if f(α, β) = 0. It is not advisable to think of f(α, α) as the square of the length of α; for example, if V is a complex vector space, we may have f(α, α) = √(-1), or on a real vector space, f(α, α) = -2.
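A quick numerical check of the polarization identity (10-5) follows; it is not part of the text, and the symmetric matrix and the vectors are random.

    import numpy as np

    rng = np.random.default_rng(3)
    M = rng.standard_normal((4, 4))
    A = M + M.T                                # symmetric
    f = lambda a, b: a @ A @ b
    q = lambda a: f(a, a)

    a, b = rng.standard_normal(4), rng.standard_normal(4)
    assert np.isclose(f(a, b), 0.25 * q(a + b) - 0.25 * q(a - b))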
We turn now to the basic theorem of this section. In reading the proof, the reader should find it helpful to think of the special case in which V is a real vector space and f is an inner product on V.

Theorem 3. Let V be a finite-dimensional vector space over a field of characteristic zero, and let f be a symmetric bilinear form on V. Then there is an ordered basis for V in which f is represented by a diagonal matrix.

Proof. What we must find is an ordered basis
    B = {α_1, ..., α_n}
such that f(α_i, α_j) = 0 for i ≠ j. If f = 0 or n = 1, the theorem is obviously true. Thus we may suppose f ≠ 0 and n > 1. If f(α, α) = 0 for every α in V, the associated quadratic form q is identically 0, and the polarization identity (10-5) shows that f = 0. Thus there is a vector α in V such that f(α, α) = q(α) ≠ 0. Let W be the one-dimensional subspace of V which is spanned by α, and let W⊥ be the set of all vectors β in V such that f(α, β) = 0. Now we claim that V = W ⊕ W⊥. Certainly the subspaces W and W⊥ are independent. A typical vector in W is cα, where c is a scalar. If cα is also in W⊥, then f(cα, cα) = c²f(α, α) = 0. But f(α, α) ≠ 0, thus c = 0. Also, each vector in V is the sum of a vector in W and a vector in W⊥. For, let γ be any vector in V, and put
    β = γ - [f(γ, α)/f(α, α)] α.
Then
    f(α, β) = f(α, γ) - [f(γ, α)/f(α, α)] f(α, α)
and since f is symmetric, f(α, β) = 0. Thus β is in the subspace W⊥. The expression
    γ = [f(γ, α)/f(α, α)] α + β
shows us that V = W + W⊥.

The restriction of f to W⊥ is a symmetric bilinear form on W⊥. Since W⊥ has dimension (n - 1), we may assume by induction that W⊥ has a basis {α_2, ..., α_n} such that
    f(α_i, α_j) = 0,  i ≠ j  (i ≥ 2, j ≥ 2).
Putting α_1 = α, we obtain a basis {α_1, ..., α_n} for V such that f(α_i, α_j) = 0 for i ≠ j.
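The proof of Theorem 3 is constructive, and the construction can be carried out by symmetric row and column operations. The sketch below is not from the text; it assumes real scalars, uses numpy, and the helper name congruence_diagonalize is my own. It produces an invertible P with PᵗAP diagonal, i.e. an ordered basis in which the form f(X, Y) = XᵗAY is represented by a diagonal matrix.

    import numpy as np

    def congruence_diagonalize(A):
        A = np.array(A, dtype=float)
        n = A.shape[0]
        P = np.eye(n)

        def change(E):                 # pass to the new basis whose coordinate-change matrix is E
            nonlocal A, P
            A = E.T @ A @ E
            P = P @ E

        for i in range(n):
            if np.isclose(A[i, i], 0.0):
                for j in range(i + 1, n):              # look for a usable diagonal entry
                    if not np.isclose(A[j, j], 0.0):
                        E = np.eye(n)
                        E[:, [i, j]] = E[:, [j, i]]    # swap a_i and a_j
                        change(E)
                        break
            if np.isclose(A[i, i], 0.0):
                for j in range(i + 1, n):              # all trailing diagonal entries are 0 here
                    if not np.isclose(A[i, j], 0.0):
                        E = np.eye(n)
                        E[j, i] = 1.0                  # a_i -> a_i + a_j, so f(a_i, a_i) = 2 A_ij != 0
                        change(E)
                        break
            if np.isclose(A[i, i], 0.0):
                continue                               # row i is already zero beyond the diagonal
            E = np.eye(n)
            E[i, i + 1:] = -A[i, i + 1:] / A[i, i]     # a_k -> a_k - [f(a_k, a_i)/f(a_i, a_i)] a_i
            change(E)
        return P, A

    M = np.array([[0.0, 1.0, 2.0],
                  [1.0, 0.0, 3.0],
                  [2.0, 3.0, 0.0]])
    P, D = congruence_diagonalize(M)
    assert np.allclose(P.T @ M @ P, D)
    assert np.allclose(D, np.diag(np.diag(D)))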
Corollary. Let F be a subfield of the complex numbers, and let A be a symmetric n × n matrix over F. Then there is an invertible n × n matrix P over F such that PᵗAP is diagonal.

In case F is the field of real numbers, the invertible matrix P in this corollary can be chosen to be an orthogonal matrix, i.e., Pᵗ = P⁻¹. In other words, if A is a real symmetric n × n matrix, there is a real orthogonal matrix P such that PᵗAP is diagonal; however, this is not at all apparent from what we did above (see Chapter 8).

Theorem 4. Let V be a finite-dimensional vector space over the field of complex numbers. Let f be a symmetric bilinear form on V which has rank r. Then there is an ordered basis B = {β_1, ..., β_n} for V such that
(i) the matrix of f in the ordered basis B is diagonal;
(ii) f(β_j, β_j) = 1 for j = 1, ..., r, and f(β_j, β_j) = 0 for j > r.

Proof. By Theorem 3, there is an ordered basis {α_1, ..., α_n} for V such that
    f(α_i, α_j) = 0  for  i ≠ j.
Since f has rank r, so does its matrix in the ordered basis {α_1, ..., α_n}. Thus we must have f(α_j, α_j) ≠ 0 for precisely r values of j. By reordering the vectors α_j, we may assume that
    f(α_j, α_j) ≠ 0,  j = 1, ..., r.
Now we use the fact that the scalar field is the field of complex numbers. If √(f(α_j, α_j)) denotes any complex square root of f(α_j, α_j), and if we put
    β_j = [1/√(f(α_j, α_j))] α_j,  j = 1, ..., r
    β_j = α_j,  j > r
the basis {β_1, ..., β_n} satisfies conditions (i) and (ii).

Of course, Theorem 4 is valid if the scalar field is any subfield of the complex numbers in which each element has a square root. It is not valid, for example, when the scalar field is the field of real numbers. Over the field of real numbers, we have the following substitute for Theorem 4.

Theorem 5. Let V be an n-dimensional vector space over the field of real numbers, and let f be a symmetric bilinear form on V which has rank r. Then there is an ordered basis {β_1, β_2, ..., β_n} for V in which the matrix of f is diagonal and such that
    f(β_j, β_j) = ±1,  j = 1, ..., r.
Furthermore, the number of basis vectors β_j for which f(β_j, β_j) = 1 is independent of the choice of basis.

Proof. There is a basis {α_1, ..., α_n} for V such that
    f(α_i, α_j) = 0,  i ≠ j
    f(α_j, α_j) ≠ 0,  1 ≤ j ≤ r
    f(α_j, α_j) = 0,  j > r.
Let
    β_j = |f(α_j, α_j)|^{-1/2} α_j,  1 ≤ j ≤ r
    β_j = α_j,  j > r.
Then {β_1, ..., β_n} is a basis with the stated properties.

Let p be the number of basis vectors β_j for which f(β_j, β_j) = 1; we must show that the number p is independent of the particular basis we have satisfying the stated conditions. Let V⁺ be the subspace of V spanned by the basis vectors β_j for which f(β_j, β_j) = 1, and let V⁻ be the subspace spanned by the basis vectors β_j for which f(β_j, β_j) = -1. Now p = dim V⁺, so it is the uniqueness of the dimension of V⁺ which we must demonstrate. It is easy to see that if α is a non-zero vector in V⁺, then f(α, α) > 0; in other words, f is positive definite on the subspace V⁺. Similarly, if α is a non-zero vector in V⁻, then f(α, α) < 0, i.e., f is negative definite on the subspace V⁻. Now let V⊥ be the subspace spanned by the basis vectors β_j for which f(β_j, β_j) = 0. If α is in V⊥, then f(α, β) = 0 for all β in V.

Since {β_1, ..., β_n} is a basis for V, we have
    V = V⁺ ⊕ V⁻ ⊕ V⊥.
Furthermore, we claim that if W is any subspace of V on which f is positive definite, then the subspaces W, V⁻, and V⊥ are independent. For, suppose α is in W, β is in V⁻, γ is in V⊥, and α + β + γ = 0. Then
    0 = f(α, α + β + γ) = f(α, α) + f(α, β) + f(α, γ)
    0 = f(β, α + β + γ) = f(β, α) + f(β, β) + f(β, γ).
Since γ is in V⊥, f(α, γ) = f(β, γ) = 0; and since f is symmetric, we obtain
    0 = f(α, α) + f(α, β)
    0 = f(β, β) + f(α, β);
hence f(α, α) = f(β, β). Since f(α, α) ≥ 0 and f(β, β) ≤ 0, it follows that
    f(α, α) = f(β, β) = 0.
But f is positive definite on W and negative definite on V⁻. We conclude that α = β = 0, and hence that γ = 0 as well.

Since
    V = V⁺ ⊕ V⁻ ⊕ V⊥
and W, V⁻, V⊥ are independent, we see that dim W ≤ dim V⁺. That is, if W is any subspace of V on which f is positive definite, the dimension of W cannot exceed the dimension of V⁺. If B' is another ordered basis for V which satisfies the conditions of the theorem, we shall have corresponding subspaces V'⁺, V'⁻, and V'⊥; and the argument above shows that dim V'⁺ ≤ dim V⁺. Reversing the argument, we obtain dim V⁺ ≤ dim V'⁺, and consequently
    dim V⁺ = dim V'⁺.
There are several comments we should make about the basis {β_1, ..., β_n} of Theorem 5 and the associated subspaces V⁺, V⁻, and V⊥. First, note that V⊥ is exactly the subspace of vectors which are 'orthogonal' to all of V. We noted above that V⊥ is contained in this subspace; but,
    dim V⊥ = dim V − (dim V⁺ + dim V⁻) = dim V − rank f
so every vector α such that f(α, β) = 0 for all β must be in V⊥. Thus the subspace V⊥ is unique. The subspaces V⁺ and V⁻ are not unique; however, their dimensions are unique. The proof of Theorem 5 shows us that dim V⁺ is the largest possible dimension of any subspace on which f is positive definite. Similarly, dim V⁻ is the largest dimension of any subspace on which f is negative definite. Of course, dim V⁺ + dim V⁻ = rank f.

The number
    dim V⁺ − dim V⁻
is often called the signature of f. It is introduced because the dimensions of V⁺ and V⁻ are easily determined from the rank of f and the signature of f.

Perhaps we should make one final comment about the relation of symmetric bilinear forms on real vector spaces to inner products. Suppose V is a finite-dimensional real vector space and that V_1, V_2, V_3 are subspaces of V such that
    V = V_1 ⊕ V_2 ⊕ V_3.
Suppose that f_1 is an inner product on V_1, and f_2 is an inner product on V_2. We can then define a symmetric bilinear form f on V as follows: If α, β are vectors in V, then we can write
    α = α_1 + α_2 + α_3   and   β = β_1 + β_2 + β_3
with α_j and β_j in V_j. Let
    f(α, β) = f_1(α_1, β_1) − f_2(α_2, β_2).
The subspace V⊥ for f will be V_3, V_1 is a suitable V⁺ for f, and V_2 is a suitable V⁻. One part of the statement of Theorem 5 is that every symmetric bilinear form on V arises in this way. The additional content of the theorem is that an inner product is represented in some ordered basis by the identity matrix.
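Exercise 1 below asks for the symmetric bilinear form attached to a given quadratic form; as a reminder of how that recovery goes (a sketch added here for convenience), the polarization identity f(α, β) = ¼q(α + β) − ¼q(α − β) applied to q(x_1, x_2) = ax_1² + bx_1x_2 + cx_2² on R² gives

\[
f\bigl((x_1,x_2),(y_1,y_2)\bigr)=a\,x_1y_1+\tfrac{b}{2}(x_1y_2+x_2y_1)+c\,x_2y_2,
\]

so the matrix of f in the standard ordered basis is

\[
\begin{pmatrix} a & b/2\\ b/2 & c\end{pmatrix}.
\]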
Exercises

1. The following expressions define quadratic forms q on R². Find the symmetric bilinear form f corresponding to each q.
(a) ax_1².
(b) bx_1x_2.
(c) cx_2².
(d) 2x_1² − (1/3)x_1x_2.
(e) x_1² + 9x_2².
(f) 3x_1x_2 − x_2².
(g) 4x_1² + 6x_1x_2 − 3x_2².

2. Find the matrix, in the standard ordered basis, and the rank of each of the bilinear forms determined in Exercise 1. Indicate which forms are non-degenerate.

3. Let q(x_1, x_2) = ax_1² + bx_1x_2 + cx_2² be the quadratic form associated with a symmetric bilinear form f on R². Show that f is non-degenerate if and only if b² − 4ac ≠ 0.

4. Let V be a finite-dimensional vector space over a subfield F of the complex numbers, and let S be the set of all symmetric bilinear forms on V.
(a) Show that S is a subspace of L(V, V, F).
(b) Find dim S.
Let Q be the set of all quadratic forms on V.
(c) Show that Q is a subspace of the space of all functions from V into F.
(d) Describe explicitly an isomorphism T of Q onto S, without reference to a basis.
(e) Let U be a linear operator on V and q an element of Q. Show that the equation (U†q)(α) = q(Uα) defines a quadratic form U†q on V.
(f) If U is a linear operator on V, show that the function U† defined in part (e) is a linear operator on Q. Show that U† is invertible if and only if U is invertible.

5. Let q be the quadratic form on R² given by
    q(x_1, x_2) = ax_1² + 2bx_1x_2 + cx_2²,   a ≠ 0.
Find an invertible linear operator U on R² such that
    (U†q)(x_1, x_2) = ax_1² + (c − b²/a)x_2².
(Hint: To find U⁻¹ (and hence U), complete the square. For the definition of U†, see part (e) of Exercise 4.)

6. Let q be the quadratic form on R² given by
    q(x_1, x_2) = 2bx_1x_2.
Find an invertible linear operator U on R² such that
    (U†q)(x_1, x_2) = 2bx_1² − 2bx_2².

7. Let q be the quadratic form on R³ given by
    q(x_1, x_2, x_3) = x_1x_2 + 2x_1x_3 + x_3².
Find an invertible linear operator U on R³ such that
    (U†q)(x_1, x_2, x_3) = x_1² − x_2² + x_3².
(Hint: Express U as a product of operators similar to those used in Exercises 5 and 6.)
8. Let A be a symmetric n × n matrix over R, and let q be the quadratic form on Rⁿ given by
    q(x_1, ..., x_n) = Σ_{i,j} A_{ij} x_i x_j.
Generalize the method used in Exercise 7 to show that there is an invertible linear operator U on Rⁿ such that
    (U†q)(x_1, ..., x_n) = Σ_{i=1}^{n} c_i x_i²
where c_i is 1, −1, or 0, i = 1, ..., n.

9. Let f be a symmetric bilinear form on Rⁿ. Use the result of Exercise 8 to prove the existence of an ordered basis B such that [f]_B is diagonal.

10. Let V be the real vector space of all 2 × 2 (complex) Hermitian matrices, that is, 2 × 2 complex matrices A which satisfy A_{ij} = Ā_{ji}.
(a) Show that the equation q(A) = det A defines a quadratic form q on V.
(b) Let W be the subspace of V of matrices of trace 0. Show that the bilinear form f determined by q is negative definite on the subspace W.

11. Let V be a finite-dimensional vector space and f a non-degenerate symmetric bilinear form on V. Show that for each linear operator T on V there is a unique linear operator T′ on V such that f(Tα, β) = f(α, T′β) for all α, β in V. Also show that
    (T_1T_2)′ = T_2′T_1′
    (c_1T_1 + c_2T_2)′ = c_1T_1′ + c_2T_2′
    (T′)′ = T.
How much of the above is valid without the assumption that f is non-degenerate?

12. Let F be a field and V the space of n × 1 matrices over F. Suppose A is a fixed n × n matrix over F and f is the bilinear form on V defined by f(X, Y) = XᵗAY. Suppose f is symmetric and non-degenerate. Let B be an n × n matrix over F and T the linear operator on V sending X into BX. Find the operator T′ of Exercise 11.

13. Let V be a finite-dimensional vector space and f a non-degenerate symmetric bilinear form on V. Associated with f is a 'natural' isomorphism of V onto the dual space V*, this isomorphism being the transformation L_f of Section 10.1. Using L_f, show that for each basis B = {α_1, ..., α_n} of V there exists a unique basis B′ = {α_1′, ..., α_n′} of V such that f(α_i, α_j′) = δ_{ij}. Then show that for every vector α in V we have
    α = Σ_i f(α, α_i′)α_i = Σ_i f(α_i, α)α_i′.

14. Let V, f, B, and B′ be as in Exercise 13. Suppose T is a linear operator on V and that T′ is the operator which f associates with T as in Exercise 11. Show that
(a) [T′]_{B′} = [T]ᵗ_B.
(b) tr (T) = tr (T′) = Σ_i f(Tα_i, α_i′).

15. Let V, f, B, and B′ be as in Exercise 13. Suppose [f]_B = A. Show that
    α_j′ = Σ_i (A⁻¹)_{ij} α_i = Σ_i (A⁻¹)_{ji} α_i.
16. Let F be a field and V the space of n × 1 matrices over F. Suppose A is an invertible, symmetric n × n matrix over F and that f is the bilinear form on V defined by f(X, Y) = XᵗAY. Let P be an invertible n × n matrix over F and B the basis for V consisting of the columns of P. Show that the basis B′ of Exercise 13 consists of the columns of the matrix A⁻¹(Pᵗ)⁻¹.

17. Let V be a finite-dimensional vector space over a field F and f a symmetric bilinear form on V. For each subspace W of V, let W⊥ be the set of all vectors α in V such that f(α, β) = 0 for every β in W. Show that
(a) W⊥ is a subspace.
(b) V = {0}⊥.
(c) V⊥ = {0} if and only if f is non-degenerate.
(d) rank f = dim V − dim V⊥.
(e) If dim V = n and dim W = m, then dim W⊥ ≥ n − m. (Hint: Let {β_1, ..., β_m} be a basis of W and consider the mapping
    α → (f(α, β_1), ..., f(α, β_m))
of V into Fᵐ.)
(f) The restriction of f to W is non-degenerate if and only if
    W ∩ W⊥ = {0}.
(g) V = W ⊕ W⊥ if and only if the restriction of f to W is non-degenerate.

18. Let V be a finite-dimensional vector space over C and f a non-degenerate symmetric bilinear form on V. Prove that there is a basis B of V such that B′ = B. (See Exercise 13 for the definition of B′.)

10.3. Skew-Symmetric Bilinear Forms

Throughout this section V will be a vector space over a subfield F of the field of complex numbers. A bilinear form f on V is called skew-symmetric if f(α, β) = −f(β, α) for all vectors α, β in V. We shall prove one theorem concerning the simplification of the matrix of a skew-symmetric bilinear form on a finite-dimensional space V. First, let us make some general observations.

Suppose f is any bilinear form on V. If we let
    g(α, β) = ½[f(α, β) + f(β, α)]
    h(α, β) = ½[f(α, β) − f(β, α)]
then it is easy to verify that g is a symmetric bilinear form on V and h is a skew-symmetric bilinear form on V. Also f = g + h. Furthermore, this expression for f as the sum of a symmetric and a skew-symmetric form is unique. Thus, the space L(V, V, F) is the direct sum of the subspace of symmetric forms and the subspace of skew-symmetric forms.
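As a quick illustration of this decomposition (an added example), consider the bilinear form f on F² whose matrix in the standard ordered basis is the matrix A below; the symmetric part g and the skew-symmetric part h then have matrices ½(A + Aᵗ) and ½(A − Aᵗ):

\[
A=\begin{pmatrix}1&3\\1&2\end{pmatrix}
=\begin{pmatrix}1&2\\2&2\end{pmatrix}+\begin{pmatrix}0&1\\-1&0\end{pmatrix}.
\]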
If V is finite-dimensional, the bilinear form f is skew-symmetric if and only if its matrix A in some (or every) ordered basis is skew-symmetric, Aᵗ = −A. This is proved just as one proves the corresponding fact about symmetric bilinear forms. When f is skew-symmetric, the matrix of f in any ordered basis will have all its diagonal entries 0. This just corresponds to the observation that f(α, α) = 0 for every α in V, since f(α, α) = −f(α, α).

Let us suppose f is a non-zero skew-symmetric bilinear form on V. Since f ≠ 0, there are vectors α, β in V such that f(α, β) ≠ 0. Multiplying α by a suitable scalar, we may assume that f(α, β) = 1. Let γ be any vector in the subspace spanned by α and β, say γ = cα + dβ. Then
    f(γ, α) = f(cα + dβ, α) = d f(β, α) = −d
    f(γ, β) = f(cα + dβ, β) = c f(α, β) = c
and so
(10-7)    γ = f(γ, β)α − f(γ, α)β.
In particular, note that α and β are necessarily linearly independent; for, if γ = 0, then f(γ, α) = f(γ, β) = 0.

Let W be the two-dimensional subspace spanned by α and β. Let W⊥ be the set of all vectors δ in V such that f(δ, α) = f(δ, β) = 0, that is, the set of all δ such that f(δ, γ) = 0 for every γ in the subspace W. We claim that V = W ⊕ W⊥. For, let ε be any vector in V, and
    γ = f(ε, β)α − f(ε, α)β
    δ = ε − γ.
Then γ is in W, and δ is in W⊥, for
    f(δ, α) = f(ε − f(ε, β)α + f(ε, α)β, α)
            = f(ε, α) + f(ε, α)f(β, α)
            = 0
and similarly f(δ, β) = 0. Thus every ε in V is of the form ε = γ + δ, with γ in W and δ in W⊥. From (10-7) it is clear that W ∩ W⊥ = {0}, and so V = W ⊕ W⊥.

Now the restriction of f to W⊥ is a skew-symmetric bilinear form on W⊥. This restriction may be the zero form. If it is not, there are vectors α′ and β′ in W⊥ such that f(α′, β′) = 1. If we let W′ be the two-dimensional subspace spanned by α′ and β′, then we shall have
    V = W ⊕ W′ ⊕ W_0
where W_0 is the set of all vectors δ in W⊥ such that f(α′, δ) = f(β′, δ) = 0. If the restriction of f to W_0 is not the zero form, we may select vectors α″, β″ in W_0 such that f(α″, β″) = 1, and continue.

In the finite-dimensional case it should be clear that we obtain a finite sequence of pairs of vectors,
    (α_1, β_1), (α_2, β_2), ..., (α_k, β_k)
with the following properties:
(a) f(α_j, β_j) = 1, j = 1, ..., k.
(b) f(α_i, α_j) = f(β_i, β_j) = f(α_i, β_j) = 0, i ≠ j.
(c) If W_j is the two-dimensional subspace spanned by α_j and β_j, then
    V = W_1 ⊕ ··· ⊕ W_k ⊕ W_0
where every vector in W_0 is 'orthogonal' to all α_j and β_j, and the restriction of f to W_0 is the zero form.

Theorem 6. Let V be an n-dimensional vector space over a subfield of the complex numbers, and let f be a skew-symmetric bilinear form on V. Then the rank r of f is even, and if r = 2k there is an ordered basis for V in which the matrix of f is the direct sum of the (n − r) × (n − r) zero matrix and k copies of the 2 × 2 matrix
    [  0  1 ]
    [ −1  0 ].

Proof. Let α_1, β_1, ..., α_k, β_k be vectors satisfying conditions (a), (b), and (c) above. Let {γ_1, ..., γ_s} be any ordered basis for the subspace W_0. Then
    B = {α_1, β_1, α_2, β_2, ..., α_k, β_k, γ_1, ..., γ_s}
is an ordered basis for V. From (a), (b), and (c) it is clear that the matrix of f in the ordered basis B is the direct sum of the (n − 2k) × (n − 2k) zero matrix and k copies of the 2 × 2 matrix
(10-8)    [  0  1 ]
          [ −1  0 ].
Furthermore, it is clear that the rank of this matrix, and hence the rank of f, is 2k. ∎
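Here is a small instance of Theorem 6 (an added illustration). On R³ let f be the skew-symmetric form f(X, Y) = x_1y_2 − x_2y_1, whose matrix in the standard ordered basis is

\[
\begin{pmatrix}0&1&0\\-1&0&0\\0&0&0\end{pmatrix}.
\]

The rank is r = 2, so k = 1; the vectors α_1 = ε_1, β_1 = ε_2 satisfy f(α_1, β_1) = 1, and W_0 is spanned by ε_3, on which f restricts to the zero form. The matrix above is already the direct sum of one copy of (10-8) and a 1 × 1 zero block.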
One consequence of the above is that if f is a non-degenerate, skew-symmetric bilinear form on V, then the dimension of V must be even. If dim V = 2k, there will be an ordered basis {α_1, β_1, ..., α_k, β_k} for V such that
    f(α_i, β_j) = δ_{ij}
    f(α_i, α_j) = f(β_i, β_j) = 0.
The matrix of f in this ordered basis is the direct sum of k copies of the 2 × 2 skew-symmetric matrix (10-8). We obtain another standard form for the matrix of a non-degenerate skew-symmetric form if, instead of the ordered basis above, we consider the ordered basis
    {α_1, ..., α_k, β_k, ..., β_1}.

The reader should find it easy to verify that the matrix of f in the latter ordered basis has the block form
    [  0  J ]
    [ −J  0 ]
where J is the k × k matrix with 1's on the 'anti-diagonal' and 0's elsewhere:
    [ 0  ···  0  1 ]
    [ 0  ···  1  0 ]
    [ ·        ·   ]
    [ 1  0  ···  0 ].

Exercises

1. Let V be a vector space over a field F. Show that the set of all skew-symmetric bilinear forms on V is a subspace of L(V, V, F).

2. Find all skew-symmetric bilinear forms on R³.

3. Find a basis for the space of all skew-symmetric bilinear forms on Rⁿ.

4. Let f be a symmetric bilinear form on Cⁿ and g a skew-symmetric bilinear form on Cⁿ. Suppose f + g = 0. Show that f = g = 0.

5. Let V be an n-dimensional vector space over a subfield F of C. Prove the following.
(a) The equation (Pf)(α, β) = ½f(α, β) − ½f(β, α) defines a linear operator P on L(V, V, F).
(b) P² = P, i.e., P is a projection.
(c) rank P = n(n − 1)/2;  nullity P = n(n + 1)/2.
(d) If U is a linear operator on V, the equation (U†f)(α, β) = f(Uα, Uβ) defines a linear operator U† on L(V, V, F).
(e) For every linear operator U, the projection P commutes with U†.

6. Prove an analogue of Exercise 11 in Section 10.2 for non-degenerate, skew-symmetric bilinear forms.

7. Let f be a bilinear form on a vector space V. Let L_f and R_f be the mappings of V into V* associated with f in Section 10.1. Prove that f is skew-symmetric if and only if L_f = −R_f.

8. Prove an analogue of Exercise 17 in Section 10.2 for skew-symmetric forms.

9. Let V be a finite-dimensional vector space and L_1, L_2 linear functionals on V. Show that the equation
    f(α, β) = L_1(α)L_2(β) − L_1(β)L_2(α)
defines a skew-symmetric bilinear form on V. Show that f = 0 if and only if L_1, L_2 are linearly dependent.
10. Let V be a finite-dimensional vector space over a subfield of the complex numbers and f a skew-symmetric bilinear form on V. Show that f has rank 2 if and only if there exist linearly independent linear functionals L_1, L_2 on V such that
    f(α, β) = L_1(α)L_2(β) − L_1(β)L_2(α).

11. Let f be any skew-symmetric bilinear form on R³. Prove that there are linear functionals L_1, L_2 such that
    f(α, β) = L_1(α)L_2(β) − L_1(β)L_2(α).

12. Let V be a finite-dimensional vector space over a subfield of the complex numbers, and let f, g be skew-symmetric bilinear forms on V. Show that there is an invertible linear operator T on V such that f(Tα, Tβ) = g(α, β) for all α, β if and only if f and g have the same rank.

13. Show that the result of Exercise 12 is valid for symmetric bilinear forms on a complex vector space, but is not valid for symmetric bilinear forms on a real vector space.

10.4. Groups Preserving Bilinear Forms

Let f be a bilinear form on the vector space V, and let T be a linear operator on V. We say that T preserves f if f(Tα, Tβ) = f(α, β) for all α, β in V. For any T and f the function g, defined by g(α, β) = f(Tα, Tβ), is easily seen to be a bilinear form on V. To say that T preserves f is simply to say g = f. The identity operator preserves every bilinear form. If S and T are linear operators which preserve f, the product ST also preserves f; for f(STα, STβ) = f(Tα, Tβ) = f(α, β). In other words, the collection of linear operators which preserve a given bilinear form is closed under the formation of (operator) products. In general, one cannot say much more about this collection of operators; however, if f is non-degenerate, we have the following.

Theorem 7. Let f be a non-degenerate bilinear form on a finite-dimensional vector space V. The set of all linear operators on V which preserve f is a group under the operation of composition.

Proof. Let G be the set of linear operators preserving f. We observed that the identity operator is in G and that whenever S and T are in G the composition ST is also in G. From the fact that f is non-degenerate, we shall prove that any operator T in G is invertible, and T⁻¹ is also in G. Suppose T preserves f. Let α be a vector in the null space of T. Then for any β in V we have
    f(α, β) = f(Tα, Tβ) = f(0, Tβ) = 0.
Since f is non-degenerate, α = 0. Thus T is invertible. Clearly T⁻¹ also preserves f; for
    f(T⁻¹α, T⁻¹β) = f(TT⁻¹α, TT⁻¹β) = f(α, β). ∎
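For a concrete instance of such a group (an added example), let f be the dot product f(α, β) = x_1y_1 + x_2y_2 on R² and let T_θ be rotation through the angle θ:

\[
T_\theta(x_1,x_2)=(x_1\cos\theta-x_2\sin\theta,\;x_1\sin\theta+x_2\cos\theta).
\]

A direct computation gives f(T_θα, T_θβ) = x_1y_1 + x_2y_2 = f(α, β), so every T_θ belongs to the group of Theorem 7; this group is the orthogonal group described in Example 6 below.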
If f is a non-degenerate bilinear form on the finite-dimensional space V, then each ordered basis B for V determines a group of matrices 'preserving' f. The set of all matrices [T]_B, where T is a linear operator preserving f, will be a group under matrix multiplication. There is an alternative description of this group of matrices, as follows. Let A = [f]_B, so that if α and β are vectors in V with respective coordinate matrices X and Y relative to B, we shall have
    f(α, β) = XᵗAY.
Let T be any linear operator on V and M = [T]_B. Then
    f(Tα, Tβ) = (MX)ᵗA(MY) = Xᵗ(MᵗAM)Y.
Accordingly, T preserves f if and only if MᵗAM = A. In matrix language then, Theorem 7 says the following: If A is an invertible n × n matrix, the set of all n × n matrices M such that MᵗAM = A is a group under matrix multiplication. If A = [f]_B, then M is in this group of matrices if and only if M = [T]_B, where T is a linear operator which preserves f.
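As an illustration of this matrix criterion (an aside added here; the examples referred to in the next paragraph are the text's own), take A = diag(1, −1), the matrix of the form x_1y_1 − x_2y_2 on R², and for a real number t let

\[
M=\begin{pmatrix}\cosh t&\sinh t\\ \sinh t&\cosh t\end{pmatrix}.
\]

Then

\[
M^{t}AM=\begin{pmatrix}\cosh^{2}t-\sinh^{2}t&0\\0&\sinh^{2}t-\cosh^{2}t\end{pmatrix}
=\begin{pmatrix}1&0\\0&-1\end{pmatrix}=A,
\]

so each such M lies in the matrix group determined by A; matrices of this kind reappear in the pseudo-orthogonal groups of Example 7.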
Before turning to some examples, let us make one further remark. Suppose f is a bilinear form which is symmetric. A linear operator T preserves f if and only if T preserves the quadratic form
    q(α) = f(α, α)
associated with f. If T preserves f, we certainly have
    q(Tα) = f(Tα, Tα) = f(α, α) = q(α)
for every α in V. Conversely, since f is symmetric, the polarization identity
    f(α, β) = ¼q(α + β) − ¼q(α − β)
shows us that T preserves f provided that q(Tγ) = q(γ) for each γ in V. (We are assuming here that the scalar field is a subfield of the complex numbers.)

EXAMPLE 6. Let V be either the space Rⁿ or the space Cⁿ. Let f be the bilinear form
    f(α, β) = Σ_{j=1}^{n} x_j y_j
where α = (x_1, ..., x_n) and β = (y_1, ..., y_n). The group preserving f is called the n-dimensional (real or complex) orthogonal group. The name 'orthogonal group' is more commonly applied to the associated group of matrices in the standard ordered basis. Since the matrix of f in the standard basis is I, this group consists of the matrices M which satisfy MᵗM = I. Such a matrix M is called an n × n (real or complex) orthogonal matrix. The two n × n orthogonal groups are usually denoted O(n, R) and O(n, C). Of course, the orthogonal group is also the group which preserves the quadratic form
    q(x_1, ..., x_n) = x_1² + ··· + x_n².

EXAMPLE 7. Let f be the symmetric bilinear form on Rⁿ with quadratic form
    q(x_1, ..., x_n) = Σ_{j=1}^{p} x_j² − Σ_{j=p+1}^{n} x_j².
Then f is non-degenerate and has signature 2p − n. The group of matrices preserving a form of this type is called a pseudo-orthogonal group. When p = n, we obtain the orthogonal group O(n, R) as a particular type of pseudo-orthogonal group. For each of the n + 1 values p = 0, 1, 2, ..., n, we obtain different bilinear forms f; however, for p = k and p = n − k the forms are negatives of one another and hence have the same associated group. Thus, when n is odd, we have (n + 1)/2 pseudo-orthogonal groups of n × n matrices, and when n is even, we have (n + 2)/2 such groups.

Theorem 8. Let V be an n-dimensional vector space over the field of complex numbers, and let f be a non-degenerate symmetric bilinear form on V. Then the group preserving f is isomorphic to the complex orthogonal group O(n, C).

Proof. Of course, by an isomorphism between two groups, we mean a one-one correspondence between their elements which 'preserves' the group operation. Let G be the group of linear operators on V which preserve the bilinear form f. Since f is both symmetric and non-degenerate, Theorem 4 tells us that there is an ordered basis B for V in which f is represented by the n × n identity matrix. Therefore, a linear operator T preserves f if and only if its matrix in the ordered basis B is a complex orthogonal matrix. Hence
    T → [T]_B
is an isomorphism of G onto O(n, C). ∎

Theorem 9. Let V be an n-dimensional vector space over the field of real numbers, and let f be a non-degenerate symmetric bilinear form on V. Then the group preserving f is isomorphic to an n × n pseudo-orthogonal group.

Proof. Repeat the proof of Theorem 8, using Theorem 5 instead of Theorem 4. ∎
EXAMPLE 8. Let f be the symmetric bilinear form on R⁴ with quadratic form
    q(x, y, z, t) = t² − x² − y² − z².
A linear operator T on R⁴ which preserves this particular bilinear (or quadratic) form is called a Lorentz transformation, and the group preserving f is called the Lorentz group. We should like to give one method of describing some Lorentz transformations.

Let H be the real vector space of all 2 × 2 complex matrices A which are Hermitian, A = A*. It is easy to verify that
    Φ(x, y, z, t) = [ t + x    y + iz ]
                    [ y − iz   t − x  ]
defines an isomorphism Φ of R⁴ onto the space H. Under this isomorphism, the quadratic form q is carried onto the determinant function, that is
    q(x, y, z, t) = det [ t + x    y + iz ]
                        [ y − iz   t − x  ]
or
    q(α) = det Φ(α).
This suggests that we might study Lorentz transformations on R⁴ by studying linear operators on H which preserve determinants.

Let M be any complex 2 × 2 matrix and for a Hermitian matrix A define
    U_M(A) = MAM*.
Now MAM* is also Hermitian. From this it is easy to see that U_M is a (real) linear operator on H. Let us ask when it is true that U_M 'preserves' determinants, i.e., det [U_M(A)] = det A for each A in H. Since the determinant of M* is the complex conjugate of the determinant of M, we see that
    det [U_M(A)] = |det M|² det A.
Thus U_M preserves determinants exactly when det M has absolute value 1.

So now let us select any 2 × 2 complex matrix M for which |det M| = 1. Then U_M is a linear operator on H which preserves determinants. Define
    T_M = Φ⁻¹U_MΦ.
Since Φ is an isomorphism, T_M is a linear operator on R⁴. Also, T_M is a Lorentz transformation; for
    q(T_Mα) = q(Φ⁻¹U_MΦα)
            = det (ΦΦ⁻¹U_MΦα)
            = det (U_MΦα)
            = det (Φα)
            = q(α)
and so T_M preserves the quadratic form q.
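One such computation is sketched below (an added illustration; the particular M is chosen only for convenience). For a positive real number a let M = diag(a, a⁻¹), so that det M = 1. Then

\[
U_M\!\begin{pmatrix}t+x&y+iz\\ y-iz&t-x\end{pmatrix}
=M\begin{pmatrix}t+x&y+iz\\ y-iz&t-x\end{pmatrix}M^{*}
=\begin{pmatrix}a^{2}(t+x)&y+iz\\ y-iz&a^{-2}(t-x)\end{pmatrix},
\]

so T_M leaves y and z fixed and sends (x, t) to (x′, t′), where t′ + x′ = a²(t + x) and t′ − x′ = a⁻²(t − x). Writing a² = e^θ, this is the familiar 'boost' t′ = t cosh θ + x sinh θ, x′ = x cosh θ + t sinh θ, and indeed t′² − x′² = (t′ + x′)(t′ − x′) = t² − x².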
By using specific 2 × 2 matrices M, one can use the method above to compute specific Lorentz transformations. There are two comments which we might make here; they are not difficult to verify.

(1) If M_1 and M_2 are invertible 2 × 2 matrices with complex entries, then U_{M_1} = U_{M_2} if and only if M_2 is a scalar multiple of M_1. Thus, all of the Lorentz transformations exhibited above are obtainable from unimodular matrices M, that is, from matrices M satisfying det M = 1. If M_1 and M_2 are unimodular matrices such that M_1 ≠ M_2 and M_1 ≠ −M_2, then T_{M_1} ≠ T_{M_2}.
(2) Not every Lorentz transformation is obtainable by the above method.

Exercises

1. Let M be a member of the complex orthogonal group O(n, C). Show that Mᵗ, M̄, and M* = M̄ᵗ also belong to O(n, C).

2. Suppose M belongs to O(n, C) and that M′ is similar to M. Does M′ also belong to O(n, C)?

3. Let
    y_j = Σ_{k=1}^{n} M_{jk}x_k
where M is a member of O(n, C). Show that
    Σ_j y_j² = Σ_j x_j².

4. Let M be an n × n matrix over C with columns M_1, M_2, ..., M_n. Show that M belongs to O(n, C) if and only if
    M_jᵗ M_k = δ_{jk}.

5. Let X be an n × 1 matrix over C. Under what conditions does O(n, C) contain a matrix M whose first column is X?

6. Find a matrix in O(3, C) whose first row is (2i, 2i, 3).

7. Let V be the space of all n × 1 matrices over C and f the bilinear form on V given by f(X, Y) = XᵗY. Let M belong to O(n, C). What is the matrix of f in the basis of V consisting of the columns M_1, M_2, ..., M_n of M?

8. Let X be an n × 1 matrix over C such that XᵗX = 1, and let I_j be the jth column of the identity matrix. Show there is a matrix M in O(n, C) such that MX = I_j. If X has real entries, show there is an M in O(n, R) with the property that MX = I_j.

9. Let V be the space of all n × 1 matrices over C, A an n × n matrix over C, and f the bilinear form on V given by f(X, Y) = XᵗAY. Show that f is invariant under O(n, C), i.e., f(MX, MY) = f(X, Y) for all X, Y in V and M in O(n, C), if and only if A commutes with each member of O(n, C).

10. Let S be any set of n × n matrices over C, and S′ the set of all n × n matrices over C which commute with each element of S. Show that S′ is an algebra over C.
11. Let F be a subfield of C, V a finite-dimensional vector space over F, and f a non-singular bilinear form on V. If T is a linear operator on V preserving f, prove that det T = ±1.

12. Let F be a subfield of C, V the space of n × 1 matrices over F, A an invertible n × n matrix over F, and f the bilinear form on V given by f(X, Y) = XᵗAY. If M is an n × n matrix over F, show that M preserves f if and only if A⁻¹MᵗA = M⁻¹.

13. Let g be a non-singular bilinear form on a finite-dimensional vector space V. Suppose T is an invertible linear operator on V and that f is the bilinear form on V given by f(α, β) = g(α, Tβ). If U is a linear operator on V, find necessary and sufficient conditions for U to preserve f.

14. Let T be a linear operator on C² which preserves the quadratic form x_1² − x_2². Show that
(a) det (T) = ±1.
(b) If M is the matrix of T in the standard basis, then M_22 = ±M_11, M_21 = ±M_12, and M_11² − M_12² = 1.
(c) If det M = 1, then there is a non-zero complex number c such that
    M = ½ [ c + 1/c    c − 1/c ]
          [ c − 1/c    c + 1/c ].
(d) If det M = −1, then there is a complex number c such that
    M = ½ [  c + 1/c     c − 1/c ]
          [ −c + 1/c    −c − 1/c ].

15. Let f be the bilinear form on C² defined by
    f((x_1, x_2), (y_1, y_2)) = x_1y_2 − x_2y_1.
Show that
(a) if T is a linear operator on C², then f(Tα, Tβ) = (det T)f(α, β) for all α, β in C².
(b) T preserves f if and only if det T = +1.
(c) What does (b) say about the group of 2 × 2 matrices M such that MᵗAM = A, where
    A = [  0  1 ]
        [ −1  0 ]?

16. Let n be a positive integer, I the n × n identity matrix over C, and J the 2n × 2n matrix given by
    J = [  0  I ]
        [ −I  0 ].
Let M be a 2n × 2n matrix over C of the form
    M = [ A  B ]
        [ C  D ]
where A, B, C, D are n × n matrices over C. Find necessary and sufficient conditions on A, B, C, D in order that MᵗJM = J.

17. Find all bilinear forms on the space of n × 1 matrices over R which are invariant under O(n, R).

18. Find all bilinear forms on the space of n × 1 matrices over C which are invariant under O(n, C).
Appendix

This Appendix separates logically into two parts. The first part, comprising the first three sections, contains certain fundamental concepts which occur throughout the book (indeed, throughout mathematics). It is more in the nature of an introduction for the book than an appendix. The second part is more genuinely an appendix to the text.

Section 1 contains a discussion of sets, their unions and intersections. Section 2 discusses the concept of function, and the related ideas of range, domain, inverse function, and the restriction of a function to a subset of its domain. Section 3 treats equivalence relations. The material in these three sections, especially that in Sections 1 and 2, is presented in a rather concise manner. It is treated more as an agreement upon terminology than as a detailed exposition. In a strict logical sense, this material constitutes a portion of the prerequisites for reading the book; however, the reader should not be discouraged if he does not completely grasp the significance of the ideas on his first reading. These ideas are important, but the reader who is not too familiar with them should find it easier to absorb them if he reviews the discussion from time to time while reading the text proper.

Sections 4 and 5 deal with equivalence relations in the context of linear algebra. Section 4 contains a brief discussion of quotient spaces. It can be read at any time after the first two or three chapters of the book. Section 5 takes a look at some of the equivalence relations which arise in the book, attempting to indicate how some of the results in the book might be interpreted from the point of view of equivalence relations. Section 6 describes the Axiom of Choice and its implications for linear algebra.
A.1. Sets

We shall use the words 'set,' 'class,' 'collection,' and 'family' interchangeably, although we give preference to 'set.' If S is a set and x is an object in the set S, we shall say that x is a member of S, that x is an element of S, that x belongs to S, or simply that x is in S. If S has only a finite number of members, x_1, ..., x_n, we shall often describe S by displaying its members inside braces:
    S = {x_1, ..., x_n}.
Thus, the set S of positive integers from 1 through 5 would be
    S = {1, 2, 3, 4, 5}.

If S and T are sets, we say that S is a subset of T, or that S is contained in T, if each member of S is a member of T. Each set S is a subset of itself. If S is a subset of T but S and T are not identical, we call S a proper subset of T. In other words, S is a proper subset of T provided that S is contained in T but T is not contained in S.

If S and T are sets, the union of S and T is the set S ∪ T, consisting of all objects x which are members of either S or T. The intersection of S and T is the set S ∩ T, consisting of all x which are members of both S and T. For any two sets S and T, the intersection S ∩ T is a subset of the union S ∪ T. This should help to clarify the use of the word 'or' which will prevail in this book. When we say that x is either in S or in T, we do not preclude the possibility that x is in both S and T.

In order that the intersection of S and T should always be a set, it is necessary that one introduce the empty set, i.e., the set with no members. Then S ∩ T is the empty set if and only if S and T have no members in common.

We shall frequently need to discuss the union or intersection of several sets. If S_1, ..., S_n are sets, their union is the set ∪_{j=1}^{n} S_j, consisting of all x which are members of at least one of the sets S_1, ..., S_n. Their intersection is the set ∩_{j=1}^{n} S_j, consisting of all x which are members of each of the sets S_1, ..., S_n. On a few occasions, we shall discuss the union or intersection of an infinite collection of sets. It should be clear how such unions and intersections are defined. The following example should clarify these definitions and a notation for them.

EXAMPLE 1. Let R denote the set of all real numbers (the real line). If t is in R, we associate with t a subset S_t of R, defined as follows: S_t consists of all real numbers x which are not less than t.
(a) S_{t_1} ∪ S_{t_2} = S_t, where t is the smaller of t_1 and t_2.
(b) S_{t_1} ∩ S_{t_2} = S_t, where t is the larger of t_1 and t_2.
(c) Let I be the unit interval, that is, the set of all t in R satisfying 0 ≤ t ≤ 1. Then
    ∪_{t in I} S_t = S_0,   ∩_{t in I} S_t = S_1.

A.2. Functions

A function consists of the following:
(1) a set X, called the domain of the function;
(2) a set Y, called the co-domain of the function;
(3) a rule (or correspondence) f, which associates with each element x of X a single element f(x) of Y.

If (X, Y, f) is a function, we shall also say f is a function from X into Y. This is a bit sloppy, since it is not f which is the function; f is the rule of the function. However, this use of the same symbol for the function and its rule provides one with a much more tractable way of speaking about functions. Thus we shall say that f is a function from X into Y, that X is the domain of f, and that Y is the co-domain of f, all this meaning that (X, Y, f) is a function as defined above. There are several other words which are commonly used in place of the word 'function.' Some of these are 'transformation,' 'operator,' and 'mapping.' These are used in contexts where they seem more suggestive in conveying the role played by a particular function.

If f is a function from X into Y, the range (or image) of f is the set of all f(x), x in X. In other words, the range of f consists of all elements y in Y such that y = f(x) for some x in X. If the range of f is all of Y, we say that f is a function from X onto Y, or simply that f is onto. The range of f is often denoted f(X).

EXAMPLE 2. (a) Let X be the set of real numbers, and let Y = X. Let f be the function from X into Y defined by f(x) = x². The range of f is the set of all non-negative real numbers. Thus f is not onto.
(b) Let X be the Euclidean plane, and Y = X. Let f be defined as follows: If P is a point in the plane, then f(P) is the point obtained by rotating P through 90° (about the origin, in the counterclockwise direction). The range of f is all of Y, i.e., the entire plane, and so f is onto.
(c) Again let X be the Euclidean plane. Coordinatize X as in analytic geometry, using two perpendicular lines to identify the points of X with ordered pairs of real numbers (x_1, x_2).
Let Y be the x_1-axis, that is, all points (x_1, x_2) with x_2 = 0. If P is a point of X, let f(P) be the point obtained by projecting P onto the x_1-axis, parallel to the x_2-axis. In other words, f((x_1, x_2)) = (x_1, 0). The range of f is all of Y, and so f is onto.
(d) Let X be the set of real numbers, and let Y be the set of positive real numbers. Define a function f from X into Y by f(x) = eˣ. Then f is a function from X onto Y.
(e) Let X be the set of positive real numbers and Y the set of all real numbers. Let f be the natural logarithm function, that is, the function defined by f(x) = log x = ln x. Again f is onto, i.e., every real number is the natural logarithm of some positive number.

Suppose that X, Y, and Z are sets, that f is a function from X into Y, and that g is a function from Y into Z. There is associated with f and g a function g ∘ f from X into Z, known as the composition of g and f. It is defined by
    (g ∘ f)(x) = g(f(x)).
For one simple example, let X = Y = Z, the set of real numbers; let f, g, h be the functions from X into X defined by
    f(x) = x²,   g(x) = eˣ,   h(x) = e^(x²)
and then h = g ∘ f. The composition g ∘ f is often denoted simply gf; however, as the above simple example shows, there are times when this may lead to confusion.

One question of interest is the following. Suppose f is a function from X into Y. When is there a function g from Y into X such that g(f(x)) = x for each x in X? If we denote by I the identity function on X, that is, the function from X into X defined by I(x) = x, we are asking the following: When is there a function g from Y into X such that g ∘ f = I? Roughly speaking, we want a function g which 'sends each element of Y back where it came from.' In order for such a g to exist, f clearly must be 1:1, that is, f must have the property that if x_1 ≠ x_2 then f(x_1) ≠ f(x_2). If f is 1:1, such a g does exist. It is defined as follows: Let y be an element of Y. If y is in the range of f, then there is an element x in X such that y = f(x); and since f is 1:1, there is exactly one such x. Define g(y) = x. If y is not in the range of f, define g(y) to be any element of X. Clearly we then have g ∘ f = I.

Let f be a function from X into Y. We say that f is invertible if there is a function g from Y into X such that
(1) g ∘ f is the identity function on X,
(2) f ∘ g is the identity function on Y.
We have just seen that if there is a g satisfying (1), then f is 1:1. Similarly, one can see that if there is a g satisfying (2), the range of f is all of Y, i.e., f is onto. Thus, if f is invertible, f is 1:1 and onto.
Conversely, if f is 1:1 and onto, there is a function g from Y into X which satisfies (1) and (2). Furthermore, this g is unique. It is the function from Y into X defined by this rule: if y is in Y, then g(y) is the one and only element x in X for which f(x) = y.

If f is invertible (1:1 and onto), the inverse of f is the unique function f⁻¹ from Y into X satisfying
(1′) f⁻¹(f(x)) = x, for each x in X,
(2′) f(f⁻¹(y)) = y, for each y in Y.

EXAMPLE 3. Let us look at the functions in Example 2.
(a) If X = Y, the set of real numbers, and f(x) = x², then f is not invertible. For f is neither 1:1 nor onto.
(b) If X = Y, the Euclidean plane, and f is 'rotation through 90°,' then f is both 1:1 and onto. The inverse function f⁻¹ is 'rotation through −90°,' or 'rotation through 270°.'
(c) If X is the plane, Y the x_1-axis, and f((x_1, x_2)) = (x_1, 0), then f is not invertible. For, although f is onto, f is not 1:1.
(d) If X is the set of real numbers, Y the set of positive real numbers, and f(x) = eˣ, then f is invertible. The function f⁻¹ is the natural logarithm function of part (e): log eˣ = x, e^(log y) = y.
(e) The inverse of this natural logarithm function is the exponential function of part (d).

Let f be a function from X into Y, and let f_0 be a function from X_0 into Y_0. We call f_0 a restriction of f (or a restriction of f to X_0) if
(1) X_0 is a subset of X,
(2) f_0(x) = f(x) for each x in X_0.
Of course, when f_0 is a restriction of f, it follows that Y_0 is a subset of Y. The name 'restriction' comes from the fact that f and f_0 have the same rule, and differ chiefly because we have restricted the domain of definition of the rule to the subset X_0 of X.

If we are given the function f and any subset X_0 of X, there is an obvious way to construct a restriction of f to X_0. We define a function f_0 from X_0 into Y by f_0(x) = f(x) for each x in X_0. One might wonder why we do not call this the restriction of f to X_0. The reason is that in discussing restrictions of f we want the freedom to change the co-domain Y, as well as the domain X.

EXAMPLE 4. (a) Let X be the set of real numbers and f the function from X into X defined by f(x) = x². Then f is not an invertible function, but it is if we restrict its domain to the non-negative real numbers. Let X_0 be the set of non-negative real numbers, and let f_0 be the function from X_0 into X_0 defined by f_0(x) = x². Then f_0 is a restriction of f to X_0.
Now f is neither 1:1 nor onto, whereas f_0 is both 1:1 and onto. The latter statement simply says that each non-negative number is the square of exactly one non-negative number. The inverse function f_0⁻¹ is the function from X_0 into X_0 defined by f_0⁻¹(x) = √x.
(b) Let X be the set of real numbers, and let f be the function from X into X defined by f(x) = x³ + x² + 1. The range of f is all of X, and so f is onto. The function f is certainly not 1:1, e.g., f(−1) = f(0). But f is 1:1 on X_0, the set of non-negative real numbers, because the derivative of f is positive for x > 0. As x ranges over all non-negative numbers, f(x) ranges over all real numbers y such that y ≥ 1. If we let Y_0 be the set of all y ≥ 1, and let f_0 be the function from X_0 into Y_0 defined by f_0(x) = f(x), then f_0 is a 1:1 function from X_0 onto Y_0. Accordingly, f_0 has an inverse function f_0⁻¹ from Y_0 onto X_0. Any formula for f_0⁻¹(y) is rather complicated.
(c) Again let X be the set of real numbers, and let f be the sine function, that is, the function from X into X defined by f(x) = sin x. The range of f is the set of all y such that −1 ≤ y ≤ 1; hence, f is not onto. Since f(x + 2π) = f(x), we see that f is not 1:1. If we let X_0 be the interval −π/2 ≤ x ≤ π/2, then f is 1:1 on X_0. Let Y_0 be the interval −1 ≤ y ≤ 1, and let f_0 be the function from X_0 into Y_0 defined by f_0(x) = sin x. Then f_0 is a restriction of f to the interval X_0, and f_0 is both 1:1 and onto. This is just another way of saying that, on the interval from −π/2 to π/2, the sine function takes each value between −1 and 1 exactly once. The function f_0⁻¹ is the inverse sine function:
    f_0⁻¹(y) = sin⁻¹ y = arc sin y.
(d) This is a general example of a restriction of a function. It is much more typical of the type of restriction we shall use in this book than are the examples in (b) and (c) above. The example in (a) is a special case of this one. Let X be a set and f a function from X into itself. Let X_0 be a subset of X. We say that X_0 is invariant under f if for each x in X_0 the element f(x) is in X_0. If X_0 is invariant under f, then f induces a function f_0 from X_0 into itself, by restricting the domain of its definition to X_0. The importance of invariance is that by restricting f to X_0 we can obtain a function from X_0 into itself, rather than simply a function from X_0 into X.

A.3. Equivalence Relations

An equivalence relation is a specific type of relation between pairs of elements in a set. To define an equivalence relation, we must first decide what a 'relation' is.

Certainly a formal definition of 'relation' ought to encompass such familiar relations as 'x = y,' 'x < y,' 'x is the mother of y,' and 'x is older than y.'
If X is a set, what does it take to determine a relation between pairs of elements of X? What it takes, evidently, is a rule for determining whether, for any two given elements x and y in X, x stands in the given relationship to y or not. Such a rule R, we shall call a (binary) relation on X. If we wish to be slightly more precise, we may proceed as follows. Let X × X denote the set of all ordered pairs (x, y) of elements of X. A binary relation on X is a function R from X × X into the set {0, 1}. In other words, R assigns to each ordered pair (x, y) either a 1 or a 0. The idea is that if R(x, y) = 1, then x stands in the given relationship to y, and if R(x, y) = 0, it does not.

If R is a binary relation on the set X, it is convenient to write xRy when R(x, y) = 1. A binary relation R is called
(1) reflexive, if xRx for each x in X;
(2) symmetric, if yRx whenever xRy;
(3) transitive, if xRz whenever xRy and yRz.
An equivalence relation on X is a reflexive, symmetric, and transitive binary relation on X.

EXAMPLE 5. (a) On any set, equality is an equivalence relation. In other words, if xRy means x = y, then R is an equivalence relation. For, x = x; if x = y then y = x; if x = y and y = z then x = z. The relation 'x ≠ y' is symmetric, but neither reflexive nor transitive.
(b) Let X be the set of real numbers, and suppose xRy means x < y. Then R is not an equivalence relation. It is transitive, but it is neither reflexive nor symmetric. The relation 'x ≤ y' is reflexive and transitive, but not symmetric.
(c) Let E be the Euclidean plane, and let X be the set of all triangles in the plane E. Then congruence is an equivalence relation on X, that is, 'T_1 ≅ T_2' (T_1 is congruent to T_2) is an equivalence relation on the set of all triangles in a plane.
(d) Let X be the set of all integers:
    ..., −2, −1, 0, 1, 2, ....
Let n be a fixed positive integer. Define a relation R_n on X by: xR_ny if and only if (x − y) is divisible by n. The relation R_n is called congruence modulo n. Instead of xR_ny, one usually writes
    x ≡ y, mod n   (x is congruent to y modulo n)
when (x − y) is divisible by n. For each positive integer n, congruence modulo n is an equivalence relation on the set of integers.
(e) Let X and Y be sets and f a function from X into Y. We define a relation R on X by: x_1Rx_2 if and only if f(x_1) = f(x_2). It is easy to verify that R is an equivalence relation on the set X. As we shall see, this one example actually encompasses all equivalence relations.
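A tiny illustration of part (e), added here: take X = Y = the set of integers and f(x) = x². Then

\[
x_1\,R\,x_2\iff x_1^{2}=x_2^{2}\iff x_2=\pm x_1,
\]

so the sets lumped together by this relation are the pairs {x, −x} (and the single set {0}); this anticipates the general description of equivalence classes given below.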
Suppose R is an equivalence relation on the set X. If x is an element of X, we let E(x; R) denote the set of all elements y in X such that xRy. This set E(x; R) is called the equivalence class of x (for the equivalence relation R). Since R is an equivalence relation, the equivalence classes have the following properties:

(1) Each E(x; R) is non-empty; for, since xRx, the element x belongs to E(x; R).
(2) Let x and y be elements of X. Since R is symmetric, y belongs to E(x; R) if and only if x belongs to E(y; R).
(3) If x and y are elements of X, the equivalence classes E(x; R) and E(y; R) are either identical or they have no members in common. First, suppose xRy. Let z be any element of E(x; R), i.e., an element of X such that xRz. Since R is symmetric, we also have zRx. By assumption xRy, and because R is transitive, we obtain zRy or yRz. This shows that any member of E(x; R) is a member of E(y; R). By the symmetry of R, we likewise see that any member of E(y; R) is a member of E(x; R); hence E(x; R) = E(y; R). Now we argue that if the relation xRy does not hold, then E(x; R) ∩ E(y; R) is empty. For, if z is in both these equivalence classes, we have xRz and yRz, thus xRz and zRy, thus xRy.

If we let F be the family of equivalence classes for the equivalence relation R, we see that (1) each set in the family F is non-empty, (2) each element x of X belongs to one and only one of the sets in the family F, (3) xRy if and only if x and y belong to the same set in the family F. Briefly, the equivalence relation R subdivides X into the union of a family of non-overlapping (non-empty) subsets. The argument also goes in the other direction. Suppose F is any family of subsets of X which satisfies conditions (1) and (2) immediately above. If we define a relation R by (3), then R is an equivalence relation on X and F is the family of equivalence classes for R.

EXAMPLE 6. Let us see what the equivalence classes are for the equivalence relations in Example 5.
(a) If R is equality on the set X, then the equivalence class of the element x is simply the set {x}, whose only member is x.
(b) If X is the set of all triangles in a plane, and R is the congruence relation, about all one can say at the outset is that the equivalence class of the triangle T consists of all triangles which are congruent to T. One of the tasks of plane geometry is to give other descriptions of these equivalence classes.
(c) If X is the set of integers and R_n is the relation 'congruence modulo n,' then there are precisely n equivalence classes. Each integer x is uniquely expressible in the form x = pn + r, where p and r are integers and 0 ≤ r ≤ n − 1.
This shows that each x is congruent modulo n to exactly one of the n integers 0, 1, 2, ..., n − 1. The equivalence classes are
    E_0 = {..., −2n, −n, 0, n, 2n, ...}
    E_1 = {..., 1 − 2n, 1 − n, 1, 1 + n, 1 + 2n, ...}
    ⋮
    E_{n−1} = {..., n − 1 − 2n, n − 1 − n, n − 1, n − 1 + n, n − 1 + 2n, ...}.
(d) Suppose X and Y are sets, f is a function from X into Y, and R is the equivalence relation defined by: x_1Rx_2 if and only if f(x_1) = f(x_2). The equivalence classes for R are just the largest subsets of X on which f is 'constant.' Another description of the equivalence classes is this. They are in 1:1 correspondence with the members of the range of f. If y is in the range of f, the set of all x in X such that f(x) = y is an equivalence class for R; and this defines a 1:1 correspondence between the members of the range of f and the equivalence classes of R.

Let us make one more comment about equivalence relations. Given an equivalence relation R on X, let F be the family of equivalence classes for R. The association of the equivalence class E(x; R) with the element x defines a function f from X into F (indeed, onto F):
    f(x) = E(x; R).
This shows that R is the equivalence relation associated with a function whose domain is X, as in Example 5(e). What this tells us is that every equivalence relation on the set X is determined as follows. We have a rule (function) f which associates with each element x of X an object f(x), and xRy if and only if f(x) = f(y). Now one should think of f(x) as some property of x, so that what the equivalence relation does (roughly) is to lump together all those elements of X which have this property in common. If the object f(x) is the equivalence class of x, then all one has said is that the common property of the members of an equivalence class is that they belong to the same equivalence class. Obviously this doesn't say much. Generally, there are many different functions f which determine the given equivalence relation as above, and one objective in the study of equivalence relations is to find such an f which gives a meaningful and elementary description of the equivalence relation. In Section A.5 we shall see how this is accomplished for a few special equivalence relations which arise in linear algebra.
A.4. Quotient Spaces

Let V be a vector space over the field F, and let W be a subspace of V. There are, in general, many subspaces W′ which are complementary to W, i.e., subspaces with the property that V = W ⊕ W′. If we have an inner product on V, and W is finite-dimensional, there is a particular subspace which one would probably call the 'natural' complementary subspace for W. This is the orthogonal complement of W. But, if V has no structure in addition to its vector space structure, there is no way of selecting a subspace W′ which one could call the natural complementary subspace for W. However, one can construct from V and W a vector space V/W, known as the 'quotient' of V and W, which will play the role of the natural complement to W. This quotient space is not a subspace of V, and so it cannot actually be a subspace complementary to W; but, it is a vector space defined only in terms of V and W, and has the property that it is isomorphic to any subspace W′ which is complementary to W.

Let W be a subspace of the vector space V. If α and β are vectors in V, we say that α is congruent to β modulo W, if the vector (α − β) is in the subspace W. If α is congruent to β modulo W, we write
    α ≡ β, mod W.
Now congruence modulo W is an equivalence relation on V.
(1) α ≡ α, mod W, because α − α = 0 is in W.
(2) If α ≡ β, mod W, then β ≡ α, mod W. For, since W is a subspace of V, the vector (α − β) is in W if and only if (β − α) is in W.
(3) If α ≡ β, mod W, and β ≡ γ, mod W, then α ≡ γ, mod W. For, if (α − β) and (β − γ) are in W, then α − γ = (α − β) + (β − γ) is in W.

The equivalence classes for this equivalence relation are known as the cosets of W. What is the equivalence class (coset) of a vector α? It consists of all vectors β in V such that (β − α) is in W, that is, all vectors β of the form β = α + γ, with γ in W. For this reason, the coset of the vector α is denoted by
    α + W.
It is appropriate to think of the coset of α relative to W as the set of vectors obtained by translating the subspace W by the vector α. To picture these cosets, the reader might think of the following special case. Let V be the space R², and let W be a one-dimensional subspace of V. If we picture V as the Euclidean plane, W is a straight line through the origin. If α = (x_1, x_2) is a vector in V, the coset α + W is the straight line which passes through the point (x_1, x_2) and is parallel to W.

The collection of all cosets of W will be denoted by V/W. We now define a vector addition and scalar multiplication on V/W as follows:
    (α + W) + (β + W) = (α + β) + W
    c(α + W) = (cα) + W.
In other words, the sum of the coset of α and the coset of β is the coset of (α + β), and the product of the scalar c and the coset of α is the coset of the vector cα.
Now many different vectors in V will have the same coset relative to W, and so we must verify that the sum and product above depend only upon the cosets involved. What this means is that we must show the following:
(1) If α ≡ α′, mod W, and β ≡ β′, mod W, then
    α + β ≡ α′ + β′, mod W.
(2) If α ≡ α′, mod W, then cα ≡ cα′, mod W.
These facts are easy to verify. (1) If α − α′ is in W and β − β′ is in W, then since (α + β) − (α′ + β′) = (α − α′) + (β − β′), we see that α + β is congruent to α′ + β′ modulo W. (2) If α − α′ is in W and c is any scalar, then cα − cα′ = c(α − α′) is in W.

It is now easy to verify that V/W, with the vector addition and scalar multiplication defined above, is a vector space over the field F. One must directly check each of the axioms for a vector space. Each of the properties of vector addition and scalar multiplication follows from the corresponding property of the operations in V. One comment should be made. The zero vector in V/W will be the coset of the zero vector in V. In other words, W is the zero vector in V/W.

The vector space V/W is called the quotient (or difference) of V and W. There is a natural linear transformation Q from V onto V/W. It is defined by Q(α) = α + W. One should see that we have defined the operations in V/W just so that this transformation Q would be linear. Note that the null space of Q is exactly the subspace W. We call Q the quotient transformation (or quotient mapping) of V onto V/W.
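A concrete illustration, added here: let V = R² and let W be the x_1-axis. The coset of α = (a, b) is

\[
(a,b)+W=\{(x,b):x\in\mathbf{R}\},
\]

the horizontal line through (a, b), and two vectors determine the same coset exactly when they have the same second coordinate. The correspondence (a, b) + W ↦ b is an isomorphism of V/W onto R¹, the quotient mapping Q sends (a, b) to the horizontal line through it, and

\[
\dim W+\dim(V/W)=1+1=2=\dim V,
\]

in agreement with the dimension formula noted below.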
The relation between the quotient space V/W and subspaces of V which are complementary to W can now be stated as follows.

Theorem. Let W be a subspace of the vector space V, and let Q be the quotient mapping of V onto V/W. Suppose W′ is a subspace of V. Then V = W ⊕ W′ if and only if the restriction of Q to W′ is an isomorphism of W′ onto V/W.

Proof. Suppose V = W ⊕ W′. This means that each vector α in V is uniquely expressible in the form α = γ + γ′, with γ in W and γ′ in W′. Then Qα = Qγ + Qγ′ = Qγ′, that is α + W = γ′ + W. This shows that Q maps W′ onto V/W, i.e., that Q(W′) = V/W. Also Q is 1:1 on W′; for suppose γ_1′ and γ_2′ are vectors in W′ and that Qγ_1′ = Qγ_2′. Then Q(γ_1′ − γ_2′) = 0 so that γ_1′ − γ_2′ is in W. This vector is also in W′, which is disjoint from W; hence γ_1′ − γ_2′ = 0. The restriction of Q to W′ is therefore a one-one linear transformation of W′ onto V/W.

Suppose W′ is a subspace of V such that Q is one-one on W′ and Q(W′) = V/W. Let α be a vector in V. Then there is a vector γ′ in W′ such that Qγ′ = Qα, i.e., γ′ + W = α + W. This means that α = γ + γ′ for some vector γ in W. Therefore V = W + W′. To see that W and W′ are disjoint, suppose γ is in both W and W′. Since γ is in W, we have Qγ = 0. But Q is 1:1 on W′, and so it must be that γ = 0. Thus we have V = W ⊕ W′. ∎

What this theorem really says is that W′ is complementary to W if and only if W′ is a subspace which contains exactly one element from each coset of W. It shows that when V = W ⊕ W′, the quotient mapping Q 'identifies' W′ with V/W. Briefly, (W ⊕ W′)/W is isomorphic to W′ in a 'natural' way.

One rather obvious fact should be noted. If W is a subspace of the finite-dimensional vector space V, then
    dim W + dim (V/W) = dim V.
One can see this from the above theorem. Perhaps it is easier to observe that what this dimension formula says is
    nullity (Q) + rank (Q) = dim V.

It is not our object here to give a detailed treatment of quotient spaces. But there is one fundamental result which we should prove.

Theorem. Let V and Z be vector spaces over the field F. Suppose T is a linear transformation of V onto Z. If W is the null space of T, then Z is isomorphic to V/W.

Proof. We define a transformation U from V/W into Z by U(α + W) = Tα. We must verify that U is well defined, i.e., that if α + W = β + W then Tα = Tβ. This follows from the fact that W is the null space of T; for, α + W = β + W means α − β is in W, and this happens if and only if T(α − β) = 0. This shows not only that U is well defined, but also that U is one-one.

It is now easy to verify that U is linear and sends V/W onto Z, because T is a linear transformation of V onto Z. ∎
A.5. Equivalence Relations in Linear Algebra

We shall consider some of the equivalence relations which arise in the text of this book. This is just a sampling of such relations.

(1) Let m and n be positive integers and F a field. Let X be the set of all m × n matrices over F. Then row-equivalence is an equivalence relation on the set X. The statement 'A is row-equivalent to B' means that A can be obtained from B by a finite succession of elementary row operations. If we write A ~ B for 'A is row-equivalent to B', then it is not difficult to check the properties (i) A ~ A; (ii) if A ~ B, then B ~ A; (iii) if A ~ B and B ~ C, then A ~ C. What do we know about this equivalence relation? Actually, we know a great deal. For example, we know that A ~ B if and only if A = PB for some invertible m × m matrix P; or, A ~ B if and only if the homogeneous systems of linear equations AX = 0 and BX = 0 have the same solutions. We also have very explicit information about the equivalence classes for this relation. Each m × n matrix A is row-equivalent to one and only one row-reduced echelon matrix. What this says is that each equivalence class for this relation contains precisely one row-reduced echelon matrix R; the equivalence class determined by R consists of all matrices A = PR, where P is an invertible m × m matrix. One can also think of this description of the equivalence classes in the following way. Given an m × n matrix A, we have a rule (function) f which associates with A the row-reduced echelon matrix f(A) which is row-equivalent to A. Row-equivalence is completely determined by f. For, A ~ B if and only if f(A) = f(B), i.e., if and only if A and B have the same row-reduced echelon form. (A computational illustration of this appears after example (3) below.)

(2) Let n be a positive integer and F a field. Let X be the set of all n × n matrices over F. Then similarity is an equivalence relation on X: each n × n matrix A is similar to itself; if A is similar to B, then B is similar to A; if A is similar to B and B is similar to C, then A is similar to C. We know quite a bit about this equivalence relation too. For example, A is similar to B if and only if A and B represent the same linear operator on F^n in (possibly) different ordered bases. But we know something much deeper than this. Each n × n matrix A over F is similar (over F) to one and only one matrix which is in rational form (Chapter 7). In other words, each equivalence class for the relation of similarity contains precisely one matrix which is in rational form. A matrix in rational form is determined by a k-tuple (p1, . . . , pk) of monic polynomials having the property that p_(j+1) divides p_j, j = 1, . . . , k − 1. Thus, we have a function f which associates with each n × n matrix A a k-tuple f(A) = (p1, . . . , pk) satisfying the divisibility condition p_(j+1) divides p_j. And A and B are similar if and only if f(A) = f(B).

(3) Here is a special case of Example 2 above. Let X be the set of 3 × 3 matrices over a field F. We consider the relation of similarity on X. If A and B are 3 × 3 matrices over F, then A and B are similar if and only if they have the same characteristic polynomial and the same minimal polynomial. Attached to each 3 × 3 matrix A, we have a pair (f, p) of monic polynomials satisfying

(a) deg f = 3,
(b) p divides f,

f being the characteristic polynomial for A, and p the minimal polynomial for A. Given monic polynomials f and p over F which satisfy (a) and (b), it is easy to exhibit a 3 × 3 matrix over F having f and p as its characteristic and minimal polynomials, respectively. What all this tells us is the following. If we consider the relation of similarity on the set of 3 × 3 matrices over F, the equivalence classes are in one-one correspondence with ordered pairs (f, p) of monic polynomials over F which satisfy (a) and (b).
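Example (1) lends itself to a direct computation: the function f is realized by reducing a matrix to its row-reduced echelon form. A minimal sketch, assuming Python with sympy, whose Matrix.rref() returns the echelon form together with the pivot columns; the matrices A, P, and C below are chosen only for illustration:

    # Row-equivalence of m x n matrices is decided by comparing row-reduced echelon forms.
    from sympy import Matrix

    A = Matrix([[1, 2, 3],
                [2, 4, 7]])
    P = Matrix([[2, 1],
                [1, 1]])              # an invertible 2 x 2 matrix (determinant 1)
    B = P * A                         # B = PA, so A and B are row-equivalent
    C = Matrix([[1, 2, 3],
                [0, 0, 0]])           # a matrix of smaller rank

    f = lambda M: M.rref()[0]         # the rule f: matrix -> its row-reduced echelon form
    print(f(A) == f(B))               # True:  A ~ B, same equivalence class
    print(f(A) == f(C))               # False: A and C lie in different classes

Here B = PA is row-equivalent to A, so f(A) = f(B); C has a different row-reduced echelon form and therefore lies in a different equivalence class.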
A.6. The Axiom of Choice

Loosely speaking, the Axiom of Choice is a rule (or principle) of thinking which says that, given a family of non-empty sets, we can choose one element out of each set. To be more precise, suppose that we have an index set A and for each α in A we have an associated set Sα, which is non-empty. To 'choose' one member of each Sα means to give a rule f which associates with each α some element f(α) in the set Sα. The Axiom of Choice says that this is possible, i.e., given the family of sets {Sα}, there exists a function f from A into the union of the sets Sα such that f(α) is in Sα for each α. This principle is accepted by most mathematicians, although many situations arise in which it is far from clear how any explicit function f can be found.

The Axiom of Choice has some startling consequences. Most of them have little or no bearing on the subject matter of this book; however, one consequence is worth mentioning: Every vector space has a basis. For example, the field of real numbers has a basis, as a vector space over the field of rational numbers. In other words, there is a subset S of R which is linearly independent over the field of rationals and has the property that each real number is a rational linear combination of some finite number of elements of S. We shall not stop to derive this vector space result from the Axiom of Choice. For a proof, we refer the reader to the book by Kelley in the bibliography.
Bibliography

Halmos, P., Finite-Dimensional Vector Spaces, D. Van Nostrand Co., Princeton, 1958.
Jacobson, N., Lectures in Abstract Algebra, II, D. Van Nostrand Co., Princeton, 1953.
Kelley, John L., General Topology, D. Van Nostrand Co., Princeton, 1955.
MacLane, S. and Birkhoff, G., Algebra, The Macmillan Co., New York, 1967.
Schreier, O. and Sperner, E., Introduction to Modern Algebra and Matrix Theory, 2nd Ed., Chelsea Publishing Co., New York, 1955.
van der Waerden, B. L., Modern Algebra (two volumes), Rev. Ed., Frederick Ungar Publishing Co., New York, 1969.
Index

A

Adjoint: classical, 148, 159; of transformation, 295
Admissible subspace, 232
Algebra, 117: of formal power series, 119; self-adjoint, 345
Algebraically closed field, 138
Alternating n-linear function, 144, 169
Annihilator: of subset, 101; of sum and intersection, 106(Ex. 11); of vector (T-annihilator), 201, 202, 228
Approximation, 283
Associativity, 1: of matrix multiplication, 19, 90; of vector addition, 28
Augmented matrix, 14
Axiom of choice, 400

B

Basis, 41: change of, 92; dual, 99, 165; for module, 164; ordered, 50; orthonormal, 281; standard basis of F^n, 41
Bessel's inequality, 287
Bilinear form, 166, 320, 359: diagonalization of, 370; group preserving, 379; matrix of, 362; non-degenerate (non-singular), 365; positive definite, 368; rank of, 365; signature of, 372; skew-symmetric, 375; symmetric, 367

C

Cauchy-Schwarz inequality, 278
Cayley-Hamilton theorem, 194, 237
Cayley transform, 309(Ex. 7)
Characteristic: of a field, 3; polynomial, 183; space, 182; value, 182, 183; vector, 182
Classical adjoint, 148, 159
Coefficients of polynomial, 120
Cofactor, 158
Column: equivalence, 256; operations, 26, 256; rank, 72, 114
Commutative: algebra, 117; group, 83; ring, 140
Companion matrix, 230
Complementary subspace, 231: orthogonal, 286
Composition, 390
Conductor, 201, 202, 232
Congruence, 139, 393, 396
Conjugate, 271: transpose, 272
Conjugation, 276(Ex. 13)
Coordinates, 50: coordinate matrix, 51
Coset, 177, 396
Cramer's rule, 161
Cyclic: decomposition theorem, 233; subspace, 227; vector, 227

D

Degree: of multilinear form, 166; of polynomial, 119
Dependence, linear, 40, 47
Derivative of polynomial, 129, 266
Determinant function, 144: existence of, 147; for linear transformations, 172; uniqueness of, 152
Determinant rank, 163(Ex. 9)
Diagonalizable: operator, 185; part of linear operator, 222; simultaneously, 207
Diagonalization, 204, 207, 216: of Hermitian form, 323; of normal matrix (operator), 317; of self-adjoint matrix (operator), 314; of symmetric bilinear form, 370; unitary, 317
Differential equations, 223(Ex. 14), 249(Ex. 8)
Dimension, 44: formula, 46
Direct sum, 210: invariant, 214; of matrices, 214; of operators, 214
Disjoint subspaces (see Independent: subspaces)
Distance, 289(Ex. 4)
Division with remainder, 128
Dual: basis, 99, 165; module, 165; space, 98

E

Eigenvalue (see Characteristic: value)
Elementary: column operation, 26, 256; Jordan matrix, 245; matrix, 20, 253; row operation, 6, 252
Empty set, 388
Entries of a matrix, 6
Equivalence relation, 393
Equivalent systems of equations, 4
Euclidean space, 277
Exterior (wedge) product, 175, 177

F

F^(m×n), 29
F^n, 29
Factorization of polynomial, 136
Factors, invariant, 239, 261
Field, 2: algebraically closed, 138; subfield, 2
Finite-dimensional, 41
Finitely generated module, 165
Form: alternating, 169; bilinear, 166, 320, 359; Hermitian, 323; matrix of, 322; multilinear, 166; non-degenerate, 324(Ex. 6); non-negative, 325; normal, 257, 261; positive, 325, 328; quadratic, 273, 368; r-linear, 166; rational, 238; sesqui-linear, 320
Formal power series, 119
Free module, 164
Function, 389: determinant, 144; identity, 390; inverse of, 391; invertible, 390; linear, 67, 97, 291; multilinear, 166; n-linear, 142; polynomial function, 30; range of, 389; restriction of, 391
Fundamental theorem of algebra, 138

G

Gram-Schmidt process, 280, 287
Grassman ring, 180
Greatest common divisor, 133
Group, 82: commutative, 83; general linear, 307; Lorentz, 382; orthogonal, 380; preserving a form, 379; pseudo-orthogonal, 381; symmetric, 153

H

Hermitian (see Self-adjoint)
Hermitian form, 323
Homogeneous system of linear equations, 4
Hyperspace, 101, 109

I

Ideal, 131: principal ideal, 131
Idempotent transformation (see Projection)
Identity: element, 117, 140; function, 390; matrix, 9; resolution of, 337, 344
Independence, linear, 40, 47
Independent: linearly, 40, 47; subspaces, 209
Inner product, 271: matrix of, 274; quadratic form of, 273; space, 277; standard, 271, 272
Integers, 2: positive, 2
Interpolation, 124
Intersection, 388: of subspaces, 36
Invariant: direct sum, 214; factors of a matrix, 239, 261; subset, 392; subspace, 199, 206, 314
Inverse: of function, 391; left, 22; of matrix, 22, 160; right, 22; two-sided, 22
Invertible: function, 390; linear transformation, 79; matrix, 22, 160
Irreducible polynomial, 135
Isomorphism: of inner product spaces, 299; of vector spaces, 84

J

Jordan form of matrix, 247

K

Kronecker delta, 9

L

Lagrange interpolation formula, 124
Laplace expansions, 179
Left inverse, 22
Linear algebra, 117
Linear combination: of equations, 4; of vectors, 31
Linear equations (see System of linear equations)
Linear functional, 97
Linearly dependent (independent), 40, 47
Linear transformation (operator), 67, 76: adjoint of, 295; cyclic decomposition of, 233; determinant of, 172; diagonalizable, 185; diagonalizable part of, 222; invertible, 79; matrix in orthonormal basis, 293; matrix of, 87, 88; minimal polynomial of, 191; nilpotent, 222; non-negative, 329, 341; non-singular, 79; normal, 312; nullity of, 71; orthogonal, 303; polar decomposition of, 343; positive, 329; product of, 76; quotient, 397; range of, 71; rank of, 71; self-adjoint, 298, 314; semi-simple, 263; trace of, 106(Ex. 15); transpose of, 112; triangulable, 202; unitary, 302
Lorentz: group, 382; transformation, 311(Ex. 15), 382

M

Matrix, 6: augmented, 14; of bilinear form, 362; classical adjoint of, 148, 159; coefficient, 6; cofactors, 158; companion, 230; conjugate transpose, 272; coordinate, 51; elementary, 20, 253; elementary Jordan, 245; of form, 322; identity, 9; of inner product, 274; invariant factors of, 239, 261; inverse of, 22, 160; invertible, 22, 160; Jordan form of, 247; of linear transformation, 87, 88; minimal polynomial of, 191; nilpotent, 244; normal, 315; orthogonal, 162(Ex. 4), 380; positive, 329; principal minors of, 326; product, 17, 90; rank of, 114; rational form of, 238; row rank of, 56, 72, 114; row-reduced, 9; row-reduced echelon, 11, 56; self-adjoint (Hermitian), 35, 314; similarity of, 94; skew-symmetric, 162(Ex. 3), 210; symmetric, 35, 210; trace of, 98; transpose of, 114; triangular, 155(Ex. 7); unitary, 163(Ex. 5), 303; upper-triangular, 27; Vandermonde, 125; zero, 12
Minimal polynomial, 191
Module, 164: basis for, 164; dual, 165; finitely generated, 165; free, 164; rank of, 165
Monic polynomial, 120
Multilinear function (form), 166: degree of, 166
Multiplicity, 130

N

n-linear function, 142: alternating, 144, 169
n-tuple, 29
Nilpotent: matrix, 244; operator, 222
Non-degenerate: bilinear form, 365; form, 324(Ex. 6)
Non-negative: form, 325; operator, 329, 341
Non-singular: form (see Non-degenerate); linear transformation, 79
Norm, 273
Normal: form, 257, 261; matrix, 315; operator, 312
Nullity of linear transformation, 71
Null space, 71
Numbers: complex, 2; rational, 3; real, 2

O

Onto, 389
Operator, linear, 76
Ordered basis, 50
Orthogonal: complement, 285; equivalence of matrices, 308; group, 380; linear transformation, 304; matrix, 162(Ex. 4), 380; projection, 285; set, 278; vectors, 278, 368
Orthogonalization, 280
Orthonormal: basis, 281; set, 278

P

Parallelogram law, 276(Ex. 9)
Permutation, 151: even, odd, 152; product of, 153; sign of, 152
Polar decomposition, 343
Polarization identities, 274, 368
Polynomial, 119: characteristic, 183; coefficients of, 120; degree of, 119; derivative of, 129, 266; function, 30; irreducible (prime), 135; minimal, 191; monic, 120; primary decomposition of, 137; prime factorization of, 136; reducible, 135; root of, 129; scalar, 120; zero of, 129
Positive: form, 325, 328; integers, 2; matrix, 329; operator, 329
Positive definite, 368
Power series, 119
Primary components, 351
Primary decomposition: of polynomial, 137; theorem, 220
Prime: factorization of polynomial, 136; polynomial, 135
Principal: axis theorem, 323; ideal, 131; minors, 326
Product: exterior (wedge), 175, 177; of linear transformations, 76; of matrices, 14, 90; of permutations, 153; tensor, 168
Projection, 211
Proper subset, 388
Pseudo-orthogonal group, 381

Q

Quadratic form, 273, 368
Quotient: space, 397; transformation, 397

R

Range, 71
Rank: of bilinear form, 365; column, 72, 114; determinant, 163(Ex. 9); of linear transformation, 71; of matrix, 114; of module, 165; row, 56, 72, 114
Rational form of matrix, 238
Reducible polynomial, 135
Relation, 393: equivalence, 393
Relatively prime, 133
Resolution: of the identity, 337, 344; spectral, 336, 344
Restriction: of function, 391; operator, 199
Right inverse, 22
Rigid motion, 310(Ex. 14)
Ring, 140: Grassman, 180
Root: of family of operators, 343; of polynomial, 129
Rotation, 54, 309(Ex. 4)
Row: operations, 6, 252; rank, 56, 72, 114; space, 39; vectors, 38
Row-equivalence, 7, 58, 253: summary of, 55
Row-reduced matrix, 9: row-reduced echelon matrix, 11, 56

S

Scalar, 2: polynomial, 120
Self-adjoint: algebra, 345; matrix, 35, 314; operator, 298, 314
Semi-simple operator, 263
Separating vector, 243(Ex. 14)
Sequence of vectors, 47
Sesqui-linear form, 320
Set, 388: element of (member of), 388; empty, 388
Shuffle, 171
Signature, 372
Sign of permutation, 152
Similar matrices, 94
Simultaneous: diagonalization, 207; triangulation, 207
Skew-symmetric: bilinear form, 375; matrix, 162(Ex. 3), 210
Solution space, 36
Spectral: resolution, 336, 344; theorem, 335
Spectrum, 336
Square root, 341
Standard basis of F^n, 41
Stuffer (das einstopfende Ideal), 201
Subfield, 2
Submatrix, 163(Ex. 9)
Subset, 388: invariant, 392; proper, 388
Subspace, 34: annihilator of, 101; complementary, 231; cyclic, 227; independent subspaces, 209; invariant, 199, 206, 314; orthogonal complement of, 285; quotient by, 397; spanned by, 36; sum of subspaces, 37; T-admissible, 232; zero, 35
Sum: direct, 210; of subspaces, 37
Symmetric: bilinear form, 367; group, 153; matrix, 35, 210
System of linear equations, 3: homogeneous, 4

T

T-admissible subspace, 232
T-annihilator, 201, 202, 228
T-conductor, 201, 202, 232
Taylor's formula, 129, 266
Tensor, 166: product, 168
Trace: of linear transformation, 106(Ex. 15); of matrix, 98
Transformation: differentiation, 67; linear, 67, 76; zero, 67
Transpose: conjugate, 272; of linear transformation, 112; of matrix, 114
Triangulable linear transformation, 316
Triangular matrix, 155(Ex. 7)
Triangulation, 203, 207, 334

U

Union, 388
Unitary: diagonalization, 317; equivalence of linear transformations, 356; equivalence of matrices, 308; matrix, 163(Ex. 5), 303; operator, 302; space, 277; transformation, kq6
Upper-triangular matrix, 27

V

Vandermonde matrix, 125
Vector space, 28: basis of, 41; dimension of, 44; finite-dimensional, 41; isomorphism of, 84; of n-tuples, 29; of polynomial functions, 30; quotient of, 397; of solutions to linear equations, 36; subspace of, 34

W

Wedge (exterior) product, 175, 177

Z

Zero: matrix, 12; of polynomial, 129