# Matrices and Vectors

### 1.01 Linear transformations and vectors.

In a set of linear equations

η'_{1} = a_{11}η_{1} + a_{12}η_{2} + ..... + a_{1n}η_{n}

η'_{2} = a_{21}η_{1} + a_{22}η_{2} + ..... + a_{2n}η_{n}

................................................................................................................................

η'_{n} = a_{n1}η_{1} + a_{n2}η_{2} + ..... + a_{nn}η_{n}

or

$(1) \ \ \ \ \ \ \underline{\eta'_i = \sum \limits_{j=1}^n a_{ij}\eta_j \ \ \ \ \ \ (i = 1, 2, . . . ,n)}$

the quantities η_{1},η_{2},...., η_{n} may be regarded as the coordinates of a point P in n-space and the point P'(η'_{1}, η'_{2},....,η'_{n}) is then said to be derived from P by the *linear homogeneous transformation* (1). Or, in place of regarding the η's as the coordinates of a point we may look on them as the components of a vector y and consider (1) as defining an operation which transforms y into a new vector y'. We shall be concerned here with the properties of such transformations, sometimes considered abstractly as entities in themselves, and sometimes in conjunction with vectors.
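For readers who wish to experiment, the transformation (1) is exactly what a numerical library computes as a matrix-vector product; the 3×3 coefficients and the vector below are arbitrary sample data, not taken from the text:

```python
import numpy as np

# Arbitrary sample coefficients a_ij and coordinates eta_j (illustration only).
a = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0],
              [4.0, 0.0, 1.0]])
y = np.array([1.0, 1.0, 2.0])

# eta'_i = sum_j a_ij * eta_j, written out coordinate by coordinate as in (1)...
y_prime = np.array([sum(a[i, j] * y[j] for j in range(3)) for i in range(3)])

# ...agrees with the built-in matrix-vector product y' = Ay.
assert np.allclose(y_prime, a @ y)
```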

To prevent misconceptions as to their meaning we shall now define a few terms which are probably already familiar to the reader. By a *scalar* or number we mean an element of the field in which all coefficients of transformations and vectors are supposed to lie; unless otherwise stated the reader may assume that a scalar is an ordinary number real or complex.

A *vector* of order n is defined as a set of n scalars (ξ_{1}, ξ_{2},...., ξ_{n}) given in a definite order. This set, regarded as a single entity, is denoted by a single symbol, say x, and we write

x = (ξ_{1}, ξ_{2},.....,ξ_{n}).

The scalars ξ_{1}, ξ_{2},....., ξ_{n} are called the *coordinates* or *components* of the vector. If y = (η_{1},η_{2},.....,η_{n}) is also a vector, we say that x = y if, and only if, corresponding coordinates are equal, that is, ξ_{i} = η_{i} (i = 1, 2,......, n). The vector

z = (ζ_{1}, ζ_{2},...., ζ_{n}) = (ξ_{1} + η_{1}, ξ_{2} + η_{2},....., ξ_{n} + η_{n})

is called the sum of x and y and is written x + y; it is easily seen that the operation of addition so defined is commutative and associative, and it has a unique inverse if we agree to write 0 for the vector (0, 0,...., 0).

If ρ is a scalar, we shall write

ρx = xρ = (ρξ_{1}, ρξ_{2},...., ρξ_{n}).

This is the only kind of multiplication we shall use regularly in connection with vectors.
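The vector algebra just defined is coordinatewise, and the stated laws can be checked on arbitrary sample vectors (the particular numbers below are ours, chosen only for illustration):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
z = np.array([7.0, 8.0, 9.0])
rho = 2.0

assert np.array_equal(x + y, y + x)               # addition is commutative
assert np.array_equal((x + y) + z, x + (y + z))   # and associative
assert np.array_equal(x + (-x), np.zeros(3))      # 0 = (0, 0, ..., 0) gives the inverse
assert np.array_equal(rho * x, np.array([2.0, 4.0, 6.0]))  # coordinatewise scaling
```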

### 1.02 Linear dependence.

In this section we shall express in terms of vectors the familiar notions of linear dependence. If x_{1}, x_{2},....., x_{r} are vectors and ω_{1}, ω_{2},...., ω_{r} scalars, any vector of the form

(2) x = ω_{1}x_{1} + ω_{2}x_{2} + ..... + ω_{r}x_{r}

is said to be *linearly dependent* on x_{1}, x_{2},...., x_{r}; and these vectors are called linearly independent if an equation which is reducible to the form

0 = ω_{1}x_{1} + ω_{2}x_{2} + ..... + ω_{r}x_{r}

can only be true when each ω_{i} = 0. Geometrically the r vectors determine an r-dimensional subspace of the original n-space and, if x_{1}, x_{2},...., x_{r} are taken as the coordinate axes, ω_{1}, ω_{2},...., ω_{r} in (2) are the coordinates of x.

We shall call the totality of vectors x of the form (2) the *linear set* or *subspace* (x_{1}, x_{2},...., x_{r}) and, when x_{1}, x_{2},...., x_{r} are linearly independent, they are said to form a *basis* of the set. The number of elements in a basis of a set is called the *order* of the set.

Suppose now that (x_{1}, x_{2},...., x_{r}), (y_{1}, y_{2},...., y_{s}) are bases of the same linear set and assume s ≥ r. Since the x's form a basis, each y can be expressed in the form

(3) y_{i} = a_{i1}x_{1} + a_{i2}x_{2} + .... + a_{ir}x_{r} (i = 1, 2,...., s)

and, since the y's form a basis, we may set

x_{i} = b_{i1}y_{1} + b_{i2}y_{2} + ..... + b_{is}y_{s} (i = 1, 2,...., r)

and therefore from (3)

(4) $y_i = \sum \limits_{j=1}^r a_{ij}x_j = \sum \limits_{j=1}^r a_{ij} \sum \limits_{k=1}^s b_{jk}y_k = \sum \limits_{k=1}^s c_{ik}y_k,$

where $c_{ik} = \sum \limits_{j=1}^r a_{ij}b_{jk}$ which may also be written

(5) $c_{ik} = \sum \limits_{j=1}^s a_{ij}b_{jk} \ \ \ \ \ (i = 1,2,...,s)$

if we agree to set a_{ij} = 0 when j > r. Since the y's are linearly independent, (4) can only hold true if c_{ii} = 1, c_{ik} = 0 (i ≠ k) so that the determinant |c_{ik}| = 1. But from the rule for forming the product of two determinants it follows from (5) that |c_{ik}| = | a_{ik}||b_{ik}| which implies (i) that |a_{ik}| ≠ 0 and (ii) that r = s, since otherwise |a_{ik}| contains the column a_{i,r+1} each element of which is 0. The order of a set is therefore independent of the basis chosen to represent it.

It follows readily from the theory of linear equations (or from §1.11 below) that, if |a_{ij}| ≠ 0 in (3), then these equations can be solved for the x's in terms of the y's, so that the conditions established above are sufficient as well as necessary in order that the y's shall form a basis.

If e_{i} denotes the vector whose i-th coordinate is 1 and whose other coordinates are 0, we see immediately that we may write

x = ξ_{1}e_{1} + ξ_{2}e_{2} + .... + ξ_{n}e_{n}

in place of x = (ξ_{1}, ξ_{2},..., ξ_{n}). Hence e_{1}, e_{2},...., e_{n} form a basis of our n-space. We shall call this the *fundamental basis* and the individual vectors e_{i} the fundamental *unit vectors*.

If x_{1}, x_{2},....., x_{r}(r < n) is a basis of a subspace of order r, we can always find n — r vectors x_{r+1},...., x_{n} such that x_{1}, x_{2},...., x_{n} is a basis of the fundamental space. For, if x_{r+1} is any vector not lying in (x_{1}, x_{2},...., x_{r}), there cannot be any relation

ω_{1}x_{1} + ω_{2}x_{2} + .... + ω_{r}x_{r} + ω_{r+1}x_{r+1} = 0

in which ω_{r+1} ≠ 0 (in fact every ω must be 0) and hence the order of (x_{1}, x_{2},...., x_{r},x_{r+1}) is r + 1. Since the order of (e_{1}, e_{2},..., e_{n}) is n, a repetition of this process leads to a basis x_{1}, x_{2},..., x_{r},...., x_{n} of order n after a finite number of steps; a suitably chosen e_{i} may be taken for x_{r+1}. The (n — r)-space (x_{r+1},...., x_{n}) is said to be *complementary* to (x_{1}, x_{2},..., x_{r}); it is of course not unique.
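The completion argument above is effectively an algorithm: keep adjoining unit vectors e_{i} that do not already lie in the span. A sketch, in which the helper name and sample vector are our own and independence is tested by rank:

```python
import numpy as np

def extend_to_basis(vectors, n):
    """Greedily append fundamental unit vectors e_i until the list spans
    n-space, mirroring the completion argument of Sec. 1.02 (a sketch)."""
    basis = [np.asarray(v, dtype=float) for v in vectors]
    for i in range(n):
        e = np.zeros(n)
        e[i] = 1.0
        candidate = basis + [e]
        # keep e_i only if it is linearly independent of what we already have
        if np.linalg.matrix_rank(np.array(candidate)) == len(candidate):
            basis.append(e)
        if len(basis) == n:
            break
    return basis

b = extend_to_basis([[1, 1, 0]], 3)
assert len(b) == 3                                   # a full basis was reached
assert np.linalg.matrix_rank(np.array(b)) == 3       # and it is independent
```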

### 1.03 Linear vector functions and matrices.

The set of linear equations given in §1.01, namely,

(6) $\underline{\eta'_i = \sum \limits_{j=1}^n a_{ij}\eta_j}$

define the vector y' = (η'_{1}, η'_{2},.....,η'_{n}) as a linear homogeneous function of the coordinates of y = (η_{1}, η_{2},..., η_{n}) and in accordance with the usual functional notation it is natural to write y' = A(y); it is usual to omit the brackets and we therefore set in place of (6)

y' = Ay.

The function or operator A when regarded as a single entity is called a matrix; it is completely determined, relatively to the fundamental basis, when the n^{2} numbers a_{ij} are known, in much the same way as the vector y is determined by its coordinates. We call the a_{ij} the coordinates of A and write

(7) $A = \begin{Vmatrix}a_{11} & a_{12} & . . . & a_{1n} \\ a_{21} & a_{22} & . . . & a_{2n} \\ . . & . . & . . . & . \\ . . & . . & . . . & . \\ a_{n1} & a_{n2} & . . . & a_{nn} \end{Vmatrix}$

or, when convenient, A = ||a_{ij}||. It should be noted that in a_{ij} the first suffix denotes the row in which the coordinate occurs while the second gives the column.

If B = ||b_{ij}|| is a second matrix, y" = A(By) is a vector which is a linear homogeneous vector function of y, and from (6) we have

$\eta''_i = \sum \limits_{p=1}^n a_{ip} \sum \limits_{j=1}^n b_{pj} \eta_j = \sum \limits_{j=1}^n d_{ij} \eta_j,$

where

(8) $d_{ij} = \sum \limits_{p=1}^n a_{ip} b_{pj}$

The matrix D = ||d_{ij}|| is called the *product* of A into B and is written AB. The form of (8) should be carefully noted; in it each element of the i-th row of A is multiplied into the corresponding element of the j-th column of B and the terms so formed are added. Since the rows and columns are not interchangeable, AB is in general different from BA; for instance

$\begin{Vmatrix}1 & 0 \\ 2 & 1\end{Vmatrix} \ \ \begin{Vmatrix} a & b \\ c & d\end{Vmatrix} = \begin{Vmatrix} a & b \\ 2a + c & 2b + d\end{Vmatrix} \\ \begin{Vmatrix} a & b \\ c & d\end{Vmatrix} \ \ \begin{Vmatrix}1 & 0 \\ 2 & 1\end{Vmatrix} = \begin{Vmatrix} a + 2b & b \\ c + 2d & d\end{Vmatrix}$
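These two products can be checked numerically; with the hypothetical values a, b, c, d = 5, 6, 7, 8 (our choice, for illustration) the example reads:

```python
import numpy as np

t = np.array([[1, 0],
              [2, 1]])
m = np.array([[5, 6],     # stands for ||a b; c d|| with a, b, c, d = 5, 6, 7, 8
              [7, 8]])

left  = t @ m   # ||a, b; 2a+c, 2b+d||
right = m @ t   # ||a+2b, b; c+2d, d||

assert np.array_equal(left,  [[5, 6], [17, 20]])
assert np.array_equal(right, [[17, 6], [23, 8]])
assert not np.array_equal(left, right)   # AB != BA in general
```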

The product defined by (8) is associative; for if C = ||c_{ij}||, the element in the i-th row and j-th column of (AB)C is

$\sum \limits_{q=1}^n (\sum \limits_{p=1}^n a_{ip} b_{pq}) c_{qj} = \sum \limits_{p=1}^n a_{ip} (\sum \limits_{q=1}^n b_{pq} c_{qj})$

and the term on the right is the (i, j) coordinate of A(BC).

If we add the vectors Ay and By, we get a vector whose i-th coordinate is (cf. (6))

$\eta'_i = \sum \limits_{j=1}^n a_{ij}\eta_j + \sum \limits_{j=1}^n b_{ij}\eta_j = \sum \limits_{j=1}^n c_{ij}\eta_j$

where c_{ij} = a_{ij} + b_{ij}. Hence Ay + By may be written Cy where C = ||c_{ij}||. We define C to be the *sum* of A and B and write C = A + B; two matrices are then added by adding corresponding coordinates just as in the case of vectors. It follows immediately from the definition of sum and product that

A + B = B + A,

(A + B) + C = A + (B + C),

A(B + C) = AB + AC, (B + C)A = BA + CA,

A(x + y) = Ax + Ay,

A, B, C being any matrices and x, y vectors. Also, if k is a scalar and we set y' = Ay, y" = ky', then

y" = ky' = kA(y) = A(ky)

or in terms of the coordinates

$\eta''_i = \sum \limits_{j} ka_{ij}\eta_j$

Hence kA may be interpreted as the matrix derived from A by multiplying each coordinate of A by k.

On the analogy of the unit vectors e_{i} we now define the *fundamental unit matrices* e_{ij} (i, j = 1, 2,..., n). Here e_{ij} is the matrix whose coordinates are all 0 except the one in the i-th row and j-th column whose value is 1. Corresponding to the form ∑ξ_{i}e_{i} for a vector we then have

(9) $A = \sum \limits_{i,j = 1}^n a_{ij} e_{ij}$

Also from the definition of multiplication in (8)

(10) e_{ij}e_{jk} = e_{ik}, e_{ij}e_{pq} = 0, (j ≠ p)

a set of relations which might have been made the basis of the definition of the product of two matrices. It should be noted that it follows from the definition of e_{ij} that

(11) e_{ij}e_{j} = e_{i}, e_{ij}e_{k} = 0 (j ≠ k),

(12) $A e_k = \sum \limits_{i,j} a_{ij} e_{ij} e_k = \sum \limits_{i} a_{ik} e_i$

Hence the coordinates of Ae_{k} are the coordinates of A that lie in the k-th column.
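Equation (12) says that Ae_{k} simply reads off the k-th column of A, which is easy to confirm (the sample matrix is arbitrary):

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
e2 = np.array([0, 1, 0])   # the fundamental unit vector e_2 (index 1 in 0-based terms)

# the coordinates of A e_k are the k-th column of A
assert np.array_equal(a @ e2, a[:, 1])
```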

### 1.04 Scalar matrices.

If k is a scalar, the matrix K defined by Ky = ky is called a *scalar matrix*; from (1) it follows that, if K = ||k_{ij}||, then k_{ii} = k (i = 1, 2,...., n), k_{ij} = 0 (i ≠ j). The scalar matrix for which k = 1 is called the identity matrix of order n; it is commonly denoted by I but, for reasons explained below, we shall here usually denote it by 1, or by 1_{n} if it is desired to indicate the order. When written at length we have

$1_n = \begin{Vmatrix} 1 & & & & & \\ & 1 & & & & \\ & & . & & & \\ & & & . & & \\ & & & & . & \\ & & & & & 1 \end{Vmatrix}, \ \ K = \begin{Vmatrix} k & & & & & \\ & k & & & & \\ & & . & & & \\ & & & . & & \\ & & & & . & \\ & & & & & k \end{Vmatrix}$

A convenient notation for the coordinates of the identity matrix was introduced by Kronecker: if δ_{ij} is the numerical function of the integers i, j defined by

(13) δ_{ii} = 1, δ_{ij} = 0 (i ≠ j)

then 1_{n} = ||δ_{ij}||. We shall use this Kronecker delta function in future without further comment.

**Theorem 1.** * Every matrix is commutative with a scalar matrix.*

Let k be the scalar, K = ||k_{ij}|| = ||kδ_{ij}|| the corresponding scalar matrix, and A = ||a_{ij}|| any matrix; then from the definition of multiplication

$KA = || \sum \limits_p k_{ip} a_{pj}|| = || \sum \limits_p k \delta_{ip} a_{pj}|| = ||k a_{ij}|| \\ AK = || \sum \limits_p a_{ip} k_{pj}|| = || \sum \limits_p k a_{ip} \delta_{pj}|| = ||k a_{ij}||$

so that AK = KA.

If k and h are two scalars and K, H the corresponding scalar matrices, then K + H and KH are the scalar matrices corresponding to k + h and kh. Hence the *one-to-one* correspondence between scalars and scalar matrices is maintained under the operations of addition and multiplication, that is, the two sets are simply isomorphic with respect to these operations. So long therefore as we are concerned only with matrices of given order, there is no confusion introduced if we replace each scalar by its corresponding scalar matrix, just as in the theory of ordinary complex numbers, (a, b) = a + bi, the set of numbers of the form (a, 0) is identified with the real continuum. We shall therefore as a rule denote ||δ_{ij}|| by 1 and ||kδ_{ij}|| by k.
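Theorem 1 and the identification of scalars with scalar matrices can be sketched as follows (k and A are arbitrary sample values):

```python
import numpy as np

k = 3.0
a = np.array([[1.0, 2.0],
              [3.0, 4.0]])
K = k * np.eye(2)          # the scalar matrix ||k delta_ij||

assert np.allclose(K @ a, a @ K)    # Theorem 1: K commutes with every A
assert np.allclose(K @ a, k * a)    # and KA is simply A with each coordinate scaled
```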

### 1.05 Powers of a matrix; adjoint matrices.

Positive integral powers of A = ||a_{ij}|| are readily defined by induction; thus

A^{2} = A • A, A^{3} = A • A^{2}, ...., A^{m} = A • A^{m-1}

With this definition it is clear that A^{r}A^{s} = A^{r+s} for any positive integers r, s. Negative powers, however, require more careful consideration.

Let the determinant formed from the array of coefficients of a matrix be denoted by

|A| = det A

and let α_{qp} be the cofactor of a_{pq} in |A|, so that from the properties of determinants

(14) $\sum \limits_p a_{ip} \alpha_{pj} = |A| \delta_{ij} = \sum \limits_p \alpha_{ip} a_{pj} \ \ \ \ (i,j = 1, 2, ..., n)$

The matrix ||α_{ij}|| is called the *adjoint* of A and is denoted by adj A. In this notation (14) may be written

(15) A (adj A) = |A| = (adj A)A,

so that a matrix and its adjoint are commutative.

If |A| ≠ 0, we define A^{-1} by

(16) A^{-1} = |A|^{-1} adj A.

Negative integral powers are then defined by A^{-r} = (A^{-1})^{r}; evidently A^{-r} = (A^{r})^{-1}. We also set A^{0} = 1, but it will appear later that a different interpretation must be given when |A| = 0. Since AB • B^{-1}A^{-1} = A • BB^{-1} • A^{-1} = AA^{-1} = 1, the reciprocal of the product AB is

(AB)^{-1} = B^{-1}A^{-1}
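Formulas (14) to (16) can be exercised directly. The cofactor routine below is a naive sketch (the function name is ours), intended only to mirror the definitions, not to be an efficient method:

```python
import numpy as np

def adjoint(a):
    """adj A = ||alpha_ij||, where alpha_qp is the cofactor of a_pq (a sketch)."""
    n = a.shape[0]
    adj = np.empty((n, n))
    for p in range(n):
        for q in range(n):
            minor = np.delete(np.delete(a, p, axis=0), q, axis=1)
            # cofactor of a_pq, placed transposed as the definition requires
            adj[q, p] = (-1) ** (p + q) * np.linalg.det(minor)
    return adj

a = np.array([[2.0, 1.0],
              [5.0, 3.0]])
d = np.linalg.det(a)                                   # |A| = 1 here

assert np.allclose(a @ adjoint(a), d * np.eye(2))      # (15): A(adj A) = |A| 1
assert np.allclose(adjoint(a) / d, np.linalg.inv(a))   # (16): A^{-1} = |A|^{-1} adj A
```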

If A and B are matrices, the rule for multiplying determinants, when stated in our notation, becomes

|AB| = |A||B|.

In particular, if AB = 1, then |A||B| = 1; hence, if |A| = 0, there is no matrix B such that AB = 1 or BA = 1. The reader should notice that, if k is a scalar matrix of order n, then |k| = k^{n}.

If |A| = 0, A is said to be *singular*; if |A| ≠ 0, A is *regular* or *non-singular*. When A is regular, A^{-1} is the only solution of AX = 1 or of XA = 1. For, if AX = 1, then

A^{-1} = A^{-1} • 1 = A^{-1}AX = X.

If AX = 0, then either X = 0 or A is singular; for, if A^{-1} exists,

0 = A^{-1}AX = X.

If A^{2} = A ≠ 0, then A is said to be *idempotent*, for example e_{11} and $\begin{Vmatrix} 4 & -2 \\ 6 & -3 \end{Vmatrix}$ are idempotent. A matrix a power of which is 0 is called *nilpotent*. If the lowest power of A which is 0 is A^{r}, r is called the *index* of A; for example, if A = e_{12} + e_{23} + e_{34}, then

A^{2} = e_{13} + e_{24}, A^{3} = e_{14}, A^{4} = 0,

so that the index of A in this case is 4.
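The nilpotent example can be verified coordinate by coordinate; the helper constructing e_{ij} is our own naming:

```python
import numpy as np

def unit(i, j, n=4):
    """The fundamental unit matrix e_ij (1-indexed, as in the text)."""
    m = np.zeros((n, n))
    m[i - 1, j - 1] = 1.0
    return m

a = unit(1, 2) + unit(2, 3) + unit(3, 4)   # A = e_12 + e_23 + e_34

assert np.allclose(np.linalg.matrix_power(a, 2), unit(1, 3) + unit(2, 4))
assert np.allclose(np.linalg.matrix_power(a, 3), unit(1, 4))
assert np.allclose(np.linalg.matrix_power(a, 4), np.zeros((4, 4)))   # index is 4
```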

### 1.06 The transverse of a matrix.

If A = ||a_{ij}||, the matrix ||a'_{ij}|| in which a'_{ij} = a_{ji} is called the *transverse* of A and is denoted by A'. For instance the transverse of

$\begin{Vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{Vmatrix} \ is \ \begin{Vmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{Vmatrix}$

The transverse, then, is obtained by the interchange of corresponding rows and columns. It must be carefully noted that this definition is relative to a particular set of fundamental units and, if these are altered, the transverse must also be changed.

**Theorem 2.** * The transverse of a sum is the sum of the transverses of the separate terms, and the transverse of a product, is the product of the transverses of the separate factors in the reverse order.*

The proof of the first part of the theorem is immediate and is left to the reader. To prove the second it is sufficient to consider two factors. Let A = ||a_{ij}||, B = ||b_{ij}||, C = AB = ||c_{ij}|| and, as above, set a'_{ij} = a_{ji}, b'_{ij} = b_{ji}, c'_{ij} = c_{ji}; then

$c'_{ij} = c_{ji} = \sum \limits_p a_{jp} b_{pi} = \sum \limits_p b'_{ip} a'_{pj}$

whence

(AB)' = C' = B'A'.

The proof for any number of factors follows by induction.

If A = A', A is said to be *symmetric* and, if A = —A', it is called skew-symmetric or *skew*. A scalar matrix k is symmetric and the transverse of kA is kA'.

**Theorem 3.** * Every matrix can be expressed uniquely as the sum of a symmetric and a skew matrix.*

For if A = B + C, B' = B, C' = -C, then A' = B' + C' = B - C and therefore

B = (A + A')/2, C = (A - A')/2.

Conversely 2A = (A + A') + (A — A') and A + A' is symmetric, A — A' skew.
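Theorem 3 in computational form, with an arbitrary sample matrix:

```python
import numpy as np

a = np.array([[1.0, 4.0],
              [2.0, 3.0]])
b = (a + a.T) / 2    # symmetric part B = (A + A')/2
c = (a - a.T) / 2    # skew part      C = (A - A')/2

assert np.allclose(b, b.T)       # B is symmetric
assert np.allclose(c, -c.T)      # C is skew
assert np.allclose(b + c, a)     # Theorem 3: the decomposition recovers A
```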

### 1.07 Bilinear forms.

A scalar bilinear form in two variable vectors, x = ∑ξ_{i}e_{i}, y = ∑η_{i}e_{i}, is a function of the form

(17) $A(x,y) = \sum \limits_{i,j = 1}^n a_{ij} \xi_i \eta_j$

There is therefore a one-to-one correspondence between such forms and matrices, A = ||a_{ij}|| corresponding to A(x, y). The special form for which A = ||δ_{ij}|| = 1 is of very frequent occurrence and we shall denote it by S; it is convenient to omit the brackets and write simply

(18) Sxy = ξ_{1}η_{1} + ξ_{2}η_{2} + ..... + ξ_{n}η_{n}

and, because of the manner in which it appears in vector analysis, we shall call it the *scalar* of xy. Since S is symmetric, Sxy = Syx.

The function (17) can be conveniently expressed in terms of A and S; for we may write A(x, y) in the form

$A(x,y) = \sum \limits_{i = 1}^n \xi_i (\sum \limits_{j = 1}^n a_{ij} \eta_j) = Sx Ay$

It may also be written

$\sum \limits_{j = 1}^n (\sum \limits_{i = 1}^n a_{ij} \xi_i) \eta_j = S A'xy = Sy A'x;$

hence

(19) SxAy = SyA'x,

so that the form (17) is unaltered when x and y are interchanged if at the same time A is changed into A'. This gives another proof of Theorem 2. For

Sx(AB)'y = SyABx = SBxA'y = SxB'A'y,

which gives (AB)' = B'A' since x and y are independent variables.
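Identity (19), SxAy = SyA'x, is easy to spot-check on random data; S here is the ordinary dot product, as in (18):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 4))
x = rng.standard_normal(4)
y = rng.standard_normal(4)

# (19): S x Ay = S y A'x
assert np.isclose(x @ (a @ y), y @ (a.T @ x))
```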

### 1.08 Change of basis.

We shall now investigate more closely the effect of a change in the fundamental basis on the coordinates of a vector or matrix. If f_{1}, f_{2},...,f_{n} is a basis of our n-space, we have seen (§1.02) that the f's are linearly independent. Let

$(20) \ \ \ \ f_i = \sum \limits_{j = 1}^n p_{ji} e_j = P e_i, \ \ \ (i = 1,2,...,n)$

P = ||p_{ij}||.

Since the f's form a basis, the e's are linearly expressible in terms of them, say

$(21) \ \ \ \ e_i = \sum \limits_{j = 1}^n q_{ji} f_j,$

and, if Q = ||q_{ij}||, this may be written

$(22) \ \ \ \ e_i = \sum \limits_j q_{ji} \sum \limits_k p_{kj} e_k = PQ e_i \ \ \ \ (i = 1, 2, 3,..., n)$

Hence PQ = 1, which is only possible if |P| ≠ 0, Q = P^{-1}.

Conversely, if |P| ≠ 0, Q = P^{-1}, and f_{i} = Pe_{i} as in (20), then (22) holds and therefore also (21), that is, the e's, and therefore also any vector x, are linearly expressible in terms of the f's. We have therefore the following theorem.

**Theorem 4.** * If f _{i} = Pe_{i} (i = 1, 2,...., n), the vectors f_{i} form a basis if, and only if |P| ≠ 0.*

If we have fewer than n vectors, say f_{1}, f_{2}, ....,f_{r}, we have seen in 1.02 that we can choose f_{r+1},...., f_{n} so that f_{1}, f_{2},...., f_{n} form a basis. Hence

**Theorem 5.** * If f _{1},f_{2},....,f_{r} are linearly independent, there exists at least one non-singular matrix P such that Pe_{i} = f_{i}; (i = 1, 2,...., r).*

We shall now determine how the form Sxy which was defined relatively to the fundamental basis, is altered by a change of basis. As above let

(23) f_{i} = Pe_{i}, e_{i} = P^{-1}f_{i} = Qf_{i}, |P| ≠ 0, (i = 1, 2,...., n)

be a basis and

x = ∑ξ_{i}e_{i} = ∑ξ'_{i}f_{i}, y = ∑η_{i}e_{i} = ∑η'_{i}f_{i}

variable vectors; then from (23)

x = Q∑ξ_{i}f_{i} = P∑ξ'_{i}e_{i}, y = Q∑η_{i}f_{i} = P∑η'_{i}e_{i}

and

∑ξ'_{i}e_{i} = P^{-1}x = Qx, ∑η'_{i}e_{i} = Qy.

Let us set temporarily S_{e}xy for Sxy and also put S_{f}xy = ∑ξ'_{i}η'_{i}, the corresponding form with reference to the new basis; then

(24) S_{f}xy = S_{e}QxQy = S_{e}xQ'Qy, S_{e}xy = S_{f}PxPy.

Consider now a matrix A = ||a_{ij}|| defined relatively to the fundamental basis and let A_{1} be the matrix which has the same coordinates when expressed in terms of the new basis as A has in the old. From the definition of A and from ξ_{i} = S_{e}e_{i}x we have

$Ax = \sum \limits_{i,j} a_{ij} \xi_j e_i = \sum \limits_{i,j} a_{ij} e_i S_e e_j x$

and hence

(25) $A_1x = \sum a_{ij} f_i S_f f_j x = \sum a_{ij} Q^{-1}e_i S_e Qf_j Qx = Q^{-1}\sum a_{ij} e_i S_e e_j Qx = Q^{-1}AQx$

We have therefore, remembering that Q = P^{-1},

**Theorem 6.** * If f _{i} = Pe_{i}; (i = 1, 2,...., n) is a basis and A any matrix, the matrix PAP^{-1} has the same coordinates when expressed in terms of this basis as A has in terms of the fundamental basis.*

The matrix Q^{-1}AQ is said to be *similar* to A and to be the *transform* of A by Q. Obviously the transform of a product (sum) is the product (sum) of the transforms of the individual factors (terms) with the order unaltered. For instance Q^{-1}ABQ = Q^{-1}AQ • Q^{-1}BQ.
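The closing remark, that transforming commutes with products, can be spot-checked on random matrices (a randomly drawn Q is non-singular with probability 1; we do not verify that here):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal((3, 3))
b = rng.standard_normal((3, 3))
q = rng.standard_normal((3, 3))
qi = np.linalg.inv(q)

# the transform of a product is the product of the transforms
assert np.allclose(qi @ (a @ b) @ q, (qi @ a @ q) @ (qi @ b @ q))
```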

Theorem 6 gives the transformation of the matric units e_{ij} defined in §1.03 which corresponds to the vector transformation (23); the result is that, if f_{ij} is the unit in the new system corresponding to e_{ij}, then

f_{ij} = Pe_{ij}P^{-1}

which is readily verified by setting

A = e_{ij} = e_{i}S_{e}e_{j}( ), A_{1} = f_{ij} = f_{i}S_{f}f_{j}( )

in (25). The effect of the change of basis on the form of the transverse is found as follows. Let A* be defined by

S_{f}xAy = S_{f}yA*x;

then

S_{f}yA*x = S_{f}xAy = S_{e}QxQAy = S_{e}xQ'QAy = S_{e}Qy(Q')^{-1}A'Q'Qx

= S_{f}y(Q'Q)^{-1}A'Q'Qx.

Hence

(26) A* = (Q'Q)^{-1}A'Q'Q.

### 1.09 Reciprocal and orthogonal bases.

With the same notation as in the previous section we have S_{f}f_{i}f_{j} = 0 (i ≠ j), S_{f}f_{i}f_{i} = 1. Hence

δ_{ij} = S_{f}f_{i}f_{j} = S_{e}Qf_{i}Qf_{j} = S_{e}f_{i}Q'Qf_{j}.

If, therefore, we set

(27) f'_{j} = Q'Qf_{j} (j = 1, 2, .... n),

we have, on omitting the subscript e in S_{e},

(28) Sf_{i}f'_{j} = δ_{ij} (i,j = 1,2,...., n).

Since |Q'Q| ≠ 0, the vectors f'_{1}, f'_{2},..., f'_{n} form a basis which we say is *reciprocal* to f_{1}, f_{2},.....,f_{n}. This definition is of course relative to the fundamental basis since it depends on the function S but, apart from this the basis (f'_{i}) is uniquely defined when the basis (f_{i}) is given since the vectors f_{i} determine P and Q = P^{-1}.

The relation between (f'_{i}) and (f_{i}) is a reciprocal one; for

f'_{j} = Q'Qf_{j} = Q'QPe_{j} = Q'e_{j},

and, if R = (Q')^{-1} we have f_{j} = R'Rf'_{j}.
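Equations (27) and (28) translate directly into computation; the particular P below is an arbitrary non-singular sample, with f_{i} = Pe_{i} read off as the columns of P:

```python
import numpy as np

p = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0]])      # |P| = 2, so P is non-singular
q = np.linalg.inv(p)                 # Q = P^{-1}

f = [p[:, i] for i in range(3)]      # f_i = P e_i
f_rec = [q.T @ q @ fi for fi in f]   # (27): f'_j = Q'Q f_j

gram = np.array([[fi @ fj for fj in f_rec] for fi in f])
assert np.allclose(gram, np.eye(3))  # (28): S f_i f'_j = delta_ij
```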

If only the set (f_{1}, f_{2},...., f_{r}) is supposed given originally, and this set of linearly independent vectors is extended by f_{r+1},...., f_{n} to form a basis of the n-space, then f'_{r+1},...., f'_{n} individually depend on the choice of f_{r+1},...., f_{n}. But (28) shows that, if Sf_{i}x = 0 (i = 1, 2,...., r), then x belongs to the linear set (f'_{r+1},....,f'_{n}); hence this linear set is uniquely determined although the individual members of its basis are not. We may therefore without ambiguity call ℑ' = (f'_{r+1},...., f'_{n}) reciprocal to ℑ = (f_{1},f_{2},...., f_{r}); ℑ' is then the set of all vectors x for which Sxy = 0 whenever y belongs to ℑ.

In a later chapter we shall require the following lemma.

**Lemma 1.** * If (f _{1}, f_{2},...., f_{r}) and (f'_{r+1},.....,f'_{n}) are reciprocal, so also are (B^{-1}f_{1}, B^{-1}f_{2},...., B^{-1}f_{r}) and (B'f'_{r+1}, B'f'_{r+2},....., B'f'_{n}) where B is any non-singular matrix. *

For SB'f'_{i}B^{-1}f_{j} = Sf'_{i}BB^{-1}f_{j} = Sf'_{i}f_{j} = δ_{ij}.

Reciprocal bases have a close connection with reciprocal or inverse matrices in terms of which they might have been defined. If P is non-singular and Pe_{i} = f_{i} as above, then P = ∑f_{i}Se_{i}( ) and, if Q = ∑e_{i}Sf'_{i}( ), then

QP = ∑e_{i}Sf'_{i}f_{j}Se_{j}( ) = ∑δ_{ij}e_{i}Se_{j}( ) = ∑e_{i}Se_{i}( ) = 1

so that Q = P^{-1}.

If QQ' = 1, the bases (f_{i}) and (f'_{i}) are identical and Sf_{i}f_{j}= δ_{ij} for all i and j; the basis is then said to be *orthogonal* as is also the matrix Q. The inverse of an orthogonal matrix and the product of two or more orthogonal matrices are orthogonal; for, if RR' = 1,

(RQ)(RQ)' = RQQ'R' = RR' = 1.

Suppose that h_{1}, h_{2},....., h_{r} are real vectors which are linearly independent and for which Sh_{i}h_{j} = 0 (i ≠ j); since h_{i} is real, we have Sh_{i}h_{i} ≠ 0. If r < n, we can always find a real vector x which is not in the linear set (h_{1},....., h_{r}) and, if we put

$h_{r + 1} = x - \sum \limits_1^r \frac{h_i S h_i x}{S h_i h_i}$

then h_{r+1} ≠ 0 and Sh_{i}h_{r+1} = 0 (i = 1, 2,......, r). Hence we can extend the original set to form a basis of the fundamental n-space. If we set f_{i} = h_{i}/(Sh_{i}h_{i})^{1/2}, then Sf_{i}f_{j} = δ_{ij} even when i = j; this modified basis is called an *orthogonal basis* of the set.
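The construction of h_{r+1} is the classical Gram-Schmidt step; a sketch for real vectors, in which the function name and the sample vectors are our own:

```python
import numpy as np

def orthogonalize(vectors):
    """Repeat the extension step of Sec. 1.09 for real vectors: subtract
    from each x its components along the h_i already found (Gram-Schmidt)."""
    hs = []
    for x in vectors:
        h = np.asarray(x, dtype=float)
        for hi in hs:
            h = h - hi * (hi @ h) / (hi @ hi)
        hs.append(h)
    # normalize: f_i = h_i / (S h_i h_i)^(1/2)
    return [h / np.sqrt(h @ h) for h in hs]

f = orthogonalize([[1, 1, 0], [1, 0, 1], [0, 0, 1]])
gram = np.array([[fi @ fj for fj in f] for fi in f])
assert np.allclose(gram, np.eye(3))   # S f_i f_j = delta_ij
```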

If the vectors h_{i} are not necessarily real, it is not evident that x can be chosen so that Sh_{r+1}h_{r+1} ≠ 0 when Sh_{i}h_{i} ≠ 0 (i = 1, 2,...., r). This may be shown as follows. In the first place we cannot have Syh_{r+1} = 0 for every y, and hence Sh_{r+1}h_{r+1} ≠ 0 when r = n — 1. Suppose now that for every choice of x we have Sh_{r+1}h_{r+1} = 0; we can then choose a basis h_{r+1},...., h_{n} supplementary to h_{1},...., h_{r} such that Sh_{i}h_{i} = 0 (i = r + 1,...., n) and Sh_{i}h_{j} = 0 (i = r + 1, ...., n; j = 1, 2,....., r). Since we cannot have Sh_{r+1}h_{i} = 0 for every h_{i} of the basis of the n-space, this scalar must be different from 0 for some value of i > r, say r + k. If we then put h'_{r+1} = h_{r+1} + h_{r+k}; in place of h_{r+1}, we have Sh_{i}h'_{r+1} = 0 (i = 1, 2,...., r) as before and also

Sh'_{r+1}h'_{r+1} = Sh_{r+1}h_{r+1} + Sh_{r+k}h_{r+k} + 2Sh_{r+1}h_{r+k}

= 2Sh_{r+1}h_{r+k} ≠ 0.

We can therefore extend the basis in the manner indicated for real vectors even when the vectors are complex.

When complex coordinates are in question the following lemma is useful; it contains the case discussed above when the vectors used are real.

**Lemma 2.** * When a linear set of order r is given, it is always possible to choose a basis g _{1}, g_{2},...., g_{n} of the fundamental space such that g_{1},...., g_{r} is a basis of the given set and such that Sg_{i}ḡ_{j} = δ_{ij}, where ḡ_{j} is the vector whose coordinates are the conjugates of the coordinates of g_{j} when expressed in terms of the fundamental basis. *

The proof is a slight modification of the one already given for the real case. Suppose that g_{1},....., g_{s} are chosen so that Sg_{i}ḡ_{j} = δ_{ij} (i, j = 1, 2,....,s) and such that (g_{1},...., g_{s}) lies in the given set when s < r; when s ≥ r, then g_{1},...., g_{r} is a basis of this set. We now put

$g'_{s + 1} = x - \sum \limits_1^s \frac{g_i S \bar{g}_i x}{S \bar{g}_i g_i}$

which is not 0 provided x is not in (g_{1},...., g_{s}) and, if s < r, will lie in the given set provided x does. We may then put

g_{s+1} = g'_{s+1}/(Sg'_{s+1}ḡ'_{s+1})^{1/2}

and the lemma follows readily, by induction.

If U is the matrix ∑e_{i}Sg_{i}, then Ū = ∑e_{i}Sḡ_{i} and

(29) UŪ' = 1.

Such a matrix is called a *unitary matrix* and the basis g_{1}, g_{2},....., g_{n} is called a unitary basis. A real unitary matrix is of course orthogonal.

### 1.10 The rank of a matrix.

Let A = ||a_{ij}|| be a matrix and set (cf. (12) §1.03)

h_{i} = Ae_{i} = ∑_{j} a_{ji}e_{j};

then, if

x = ∑ξ_{i}e_{i} = ∑e_{i}Se_{i}x

is any vector, we have

Ax = A∑e_{i}Se_{i}x = ∑Ae_{i}Se_{i}x

or

$(30) \ \ \ \ \ Ax = \sum \limits_1^n h_i S e_i x$

Any expression of the form $Ax = \sum \limits_1^m a_i S b_i x$, where a_{i}, b_{i} are constant vectors, is a linear homogeneous vector function of x. Here (30) shows that it is never necessary to take m > n, but it is sometimes convenient to do so. When we are interested mainly in the matrix and not in x, we may write A = ∑a_{i}Sb_{i}( ) or, omitting the brackets, merely

(31) A = ∑a_{i}Sb_{i}.

It follows readily from the definition of the transverse that

(32) A' = ∑b_{i}Sa_{i}.

No matter what vector x is, Ax, being equal to ∑a_{i}Sb_{i}x is linearly dependent on a_{1}, a_{2},..., a_{m} or, if the form (30) is used, on h_{1}, h_{2},...., h_{n}. When |A| ≠ 0, we have seen in Theorem 4 that the h's are linearly independent but, if A is singular, there are linear relations connecting them, and the order of the linear set (a_{1}, a_{2},...., a_{m}) is less than n.

Suppose in (31) that the a's are not linearly independent, say

a_{s} = α_{1}a_{1} + α_{2}a_{2} + ..... + α_{s-1}a_{s-1},

then on substituting this value of a_{s} in (31) we have

$A = a_1 S (b_1 + \alpha_1 b_s) + ... + a_{s-1} S (b_{s-1} + \alpha_{s-1} b_s) + \sum \limits_{s+1}^m a_i S b_i,$

an expression similar to (31) but having at least one term less. A similar reduction can be carried out if the b's are not linearly independent. After a finite number of repetitions of this process we shall finally reach a form

$(33) \ \ \ \ \ \ A =\sum \limits_1^r c_i S d_i$

in which c_{1}, c_{2},..., c_{r} are linearly independent and also d_{1}, d_{2},..., d_{r}. The integer r is called the *rank* of A.

It is clear that the value of r is independent of the manner in which the reduction to the form (33) is carried out since it is the order of the linear set (Ae_{1}, Ae_{2},...., Ae_{n}). We shall, however, give a proof of this which incidentally yields some important information regarding the nature of A.

Suppose that by any method we have arrived at two forms of A

$A =\sum \limits_1^r c_i S d_i = \sum \limits_1^s p_i S q_i,$

where (c_{1}, c_{2},...., c_{r}) and (d_{1}, d_{2},...., d_{r}) are spaces of order r and (p_{1}, p_{2},....,p_{s}), (q_{1}, q_{2},...., q_{s}) spaces of order s, and let (c'_{r+1}, c'_{r+2},..., c'_{n}),...., (q'_{s+1}, q'_{s+2},...., q'_{n}) be the corresponding reciprocal spaces. Then

$A q'_j = \sum \limits_1^s p_i S q_i q'_j = p_j \ \ \ \ \ \ \ (j = 1, 2, ..., s)$

and also Aq'_{j} = ∑ c_{i}Sd_{i}q'_{j}. Hence each p_{j} lies in (c_{1}, c_{2},...., c_{r}). Similarly each c_{i} lies in (p_{1}, p_{2},..., p_{s}) so that these two subspaces are the same and, in particular, their orders are equal, that is, r = s. The same discussion with A' in place of A shows that (d_{1}, d_{2},...., d_{r}) and (q_{1}, q_{2},...., q_{s}) are the same. We shall call the spaces ℘_{l} = (c_{1}, c_{2},...., c_{r}), ℘_{r} = (d_{1}, d_{2},...., d_{r}) the left and right *grounds* of A, and the total space ℘ = (c_{1},...., c_{r}, d_{1},...., d_{r}) will be called the (total) ground of A.

If x is any vector in the subspace R_{r} = (d'_{r+1}, d'_{r+2},..., d'_{n}) reciprocal to ℘_{r}, then Ax = 0 since Sd_{i}d'_{j} = 0 (i ≠ j). Conversely, if

0 = Ax = ∑ c_{i}Sd_{i}x,

each multiplier Sd_{i}x must be 0 since the c's are linearly independent; hence every solution of Ax = 0 lies in R_{r}. Similarly every solution of A'x = 0 lies in R_{l} = (c'_{r+1}, c'_{r+2},...., c'_{n}). We call R_{r} and R_{l} the right and left *nullspaces* of A; their order, n — r, is called the *nullity* of A.

We may summarize these results as follows.

**Theorem 7.** *If a matrix A is expressed in the form *$\sum \limits_1^r a_i S b_i$, *where ℘ _{l} = (a_{1}, a_{2},...., a_{r}) and ℘_{r} = (b_{1}, b_{2},...., b_{r}) define spaces of order r, then, no matter how the reduction to this form is carried out, the spaces ℘_{r} and ℘_{l} are always the same. Further, if R_{l} and R_{r} are the spaces of order n — r reciprocal to ℘_{l} and ℘_{r}, respectively, every solution of Ax = 0 lies in R_{r} and every solution of A'x = 0 in R_{l}. *

The following theorem is readily deduced from Theorem 7 and its proof is left to the reader.

**Theorem 8.** * If A, B are matrices of rank r, s, the rank of A + B is not greater than r + s and the rank of AB is not greater than the smaller of r and s. *
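The statements about rank and nullity are easy to check numerically. Here the nullspace basis is taken from the singular value decomposition, a standard computational device not used in the text; the sample matrix is our own:

```python
import numpy as np

a = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],    # twice the first row
              [1.0, 0.0, 1.0]])

r = np.linalg.matrix_rank(a)
assert r == 2

# rows of vh beyond the rank span the right nullspace R_r
_, s, vh = np.linalg.svd(a)
null_basis = vh[r:]

assert null_basis.shape[0] == a.shape[1] - r   # nullity = n - r
assert np.allclose(a @ null_basis.T, 0)        # A x = 0 for every x in R_r
```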

### 1.11 Linear dependence.

The definition of the rank of a matrix in the preceding section was made in terms of the linear dependence of vectors associated with the matrix. In this section we consider briefly the theory of linear dependence introducing incidentally a notation which we shall require later.

Let $x_i = \sum \limits_{j=1}^n \xi_{ij} e_j$ (i = 1, 2,...., r; r ≤ n) be a set of r vectors. From the rectangular array of their coordinates

$(34) \ \ \ \ \ \ \ \ \ \ \begin{matrix} \xi_{11} & \xi_{12} & ... & \xi_{1n} \\ \xi_{21} & \xi_{22} & ... & \xi_{2n} \\ . & . & ... & . \\ \xi_{r1} & \xi_{r2} & ... & \xi_{rn} \end{matrix}$

there can be formed n!/r!(n - r)! different determinants of order r by choosing r columns out of (34), these columns being taken in their natural order. If these determinants are arranged in some definite order, we may regard them as the coordinates of a vector in space of order n!/r!(n - r)! and, when this is done, we shall denote this vector by

(35) |x_{1}x_{2}.....x_{r}|

and call it a *pure vector* of *grade* r. It follows from this definition that |x_{1}x_{2}....x_{r}| has many of the properties of a determinant; its sign is changed if two x's are interchanged, it vanishes when two x's are equal and, if λ and μ are scalars,

(36) |(λx_{1} + μx'_{1})x_{2}....x_{r}| = λ|x_{1}x_{2}....x_{r}| + μ|x'_{1}x_{2}.....x_{r}|.

If we replace the x's in (35) by r different units e_{i1}, e_{i2},...., e_{ir}, the result is clearly not 0; we thus obtain ${n \choose r}$ vectors which we shall call the fundamental unit vectors of grade r; and any linear combination of these units, say

∑ ξ_{i1i2....ir}|e_{i1}e_{i2}....e_{ir}|,

is called a vector of grade r. It should be noticed that, except when r equals 1 or n, not every vector of grade r is a pure vector.
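The coordinates of a pure vector |x_{1}x_{2}....x_{r}| are simply the r-rowed minors of the array (34), one for each r-combination of columns in natural order. The Python sketch below illustrates this with two hypothetical vectors of order 3; the function names are my own.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row
    (adequate for the small orders used here)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def pure_vector(xs):
    """Coordinates of |x1 x2 ... xr|: the r-rowed minors of the array of
    coordinates, columns chosen over all r-combinations of 1..n in
    natural order."""
    n, r = len(xs[0]), len(xs)
    return [det([[x[j] for j in cols] for x in xs])
            for cols in combinations(range(n), r)]

x1, x2 = [1, 0, 2], [0, 1, 3]
print(pure_vector([x1, x2]))   # minors of column pairs (1,2), (1,3), (2,3)
print(pure_vector([x2, x1]))   # interchanging x1, x2 changes every sign
```

The second line exhibits the determinant-like antisymmetry noted in the text: interchanging two x's reverses the sign of every coordinate.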

If we replace x_{i} by ∑ ξ_{ij}e_{j} in (35), we get

|x_{1}x_{2}....x_{r}| = ∑ ξ_{1j1}ξ_{2j2}.....ξ_{rjr}|e_{j1}e_{j2}....e_{jr}|

where the summation extends over all permutations j_{1}, j_{2},...., j_{r} of 1, 2,...., n taken r at a time. This summation may be effected by grouping together the sets j_{1}, j_{2},...., j_{r} which are permutations of the same combination i_{1}, i_{2},...., i_{r}, whose members may be taken to be arranged in natural order, and then summing these partial sums over all possible combinations i_{1}, i_{2},...., i_{r}. Taking the first step only we have

$\sum \xi_{1j_1} \xi_{2j_2} ... \xi_{rj_r} | e_{j_1} e_{j_2} ... e_{j_r} | = \sum \delta^{i_1 ... i_r}_{j_1 ... j_r} \xi_{1j_1} ... \xi_{rj_r} | e_{i_1} e_{i_2} ... e_{i_r}|$

where $\delta^{i_1 ... i_r}_{j_1 ... j_r}$ is the sign corresponding to the permutation ${i_1 i_2 ... i_r \choose j_1 j_2 ... j_r}$, and this equals |ξ_{1i1}ξ_{2i2}.....ξ_{rir}| |e_{i1}e_{i2}....e_{ir}|. We have therefore

$(37) \ \ \ \ \ \ | x_1 x_2 ... x_r | = \sum \limits_{(i)}^* |\xi_{1i_1} \xi_{2i_2} . . . \xi_{ri_r}| \ |e_{i_1} e_{i_2} . . . e_{i_r}|$

where the asterisk on ∑ indicates that the sum is taken over all r-combinations of 1, 2, ...., n each combination being arranged in natural order.

**Theorem 9.** *|x_{1}x_{2}....x_{r}| = 0 if, and only if, x_{1}, x_{2},..., x_{r} are linearly dependent.*

The first part of this theorem is an immediate consequence of (36): if the x's are linearly dependent, one of them, say x_{r}, is a linear combination of the others, and expanding |x_{1}x_{2}....x_{r}| by (36) gives a sum of terms each containing a repeated x, every one of which vanishes. To prove the converse it is sufficient to show that, if |x_{1}x_{2}....x_{r-1}| ≠ 0, then there exist scalars α_{1}, α_{2},...., α_{r-1} such that

x_{r} = α_{1}x_{1} + α_{2}x_{2} + ..... + α_{r-1}x_{r-1}.

Let $x_i = \sum \limits_j \xi_{ij} e_j$. Since |x_{1}x_{2}....x_{r-1}| ≠ 0, at least one of its coordinates is not 0, and for convenience we may suppose without loss of generality that

(38) |ξ_{11}ξ_{22}.....ξ_{r-1,r-1}| ≠ 0.

Since |x_{1}x_{2}....x_{r}| = 0, all its coordinates equal 0 and in particular

|ξ_{11}ξ_{22}....ξ_{r-1,r-1}ξ_{ri}| = 0 (i = 1, 2,...., n).

If we expand this determinant according to the elements of its last column, we get a relation of the form

β_{1}ξ_{ri} + β_{2}ξ_{1i} + ...... + β_{r}ξ_{r-1,i} = 0

where the β's are independent of i and β_{1} ≠ 0 by (38). Hence we may write

(39) ξ_{ri} = α_{1}ξ_{1i} + .... + α_{r-1}ξ_{r-1,i} (i = 1, 2,...., n)

the α's being independent of i. Multiplying (39) by e_{i} and summing with regard to i, we have

x_{r} = α_{1}x_{1} + .... + α_{r-1}x_{r-1},

which proves the theorem.
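Theorem 9 is easy to test numerically: a dependent set makes every r-rowed minor vanish, while an independent set leaves at least one minor non-zero. The sketch below uses two hypothetical pairs of vectors of order 3.

```python
from itertools import combinations

def det(m):
    # determinant by cofactor expansion along the first row
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def grade_coords(xs):
    # the coordinates of |x1 x2 ... xr|: all r-rowed minors in natural order
    n, r = len(xs[0]), len(xs)
    return [det([[x[j] for j in c] for x in xs]) for c in combinations(range(n), r)]

dep = [[1, 2, 3], [2, 4, 6]]   # x2 = 2 x1: linearly dependent
ind = [[1, 2, 3], [0, 1, 1]]   # linearly independent
print(grade_coords(dep))       # every coordinate is 0
print(grade_coords(ind))       # some coordinate is non-zero
```

The vanishing of *all* the minors is essential: a single zero minor, such as that of columns 1 and 2 when two vectors agree there, proves nothing about dependence.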

If (a_{1}, a_{2},...., a_{m}) is a linear set of order r, then some set of r of the a's forms a basis, that is, its members are linearly independent while each of the other a's is linearly dependent on them. By a change of notation, if necessary, we may take a_{1}, a_{2},...., a_{r} as this basis and write

$(40) \ \ \ \ \ \ a_{r+i} = \sum \limits_{j=1}^r \beta_{ij} a_j, \ \ \ \ \ \ \ (i = 1, 2, ... , m-r)$

We shall now discuss the general form of all linear relations among the a's in terms of the special relations (40); and in doing so we may assume the order of the space to be equal to or greater than m since we may consider any given space as a subspace of one of arbitrarily higher dimensionality.

Let

$(41) \ \ \ \ \ \ \ \ \sum \limits_1^m \gamma_j a_j = 0$

be a relation connecting the a's and set

$c = \sum \limits_1^m \gamma_j e_j.$

Then (40), considered as a special case of (41), corresponds to setting for c

$(42) \ \ \ \ \ \ c_i = - \sum \limits_{j=1}^r \beta_{ij} e_j + e_{r+i}, \ \ \ \ \ \ \ (i = 1, 2, ... , m-r)$

and there is clearly no linear relation connecting these vectors, so that they define a linear set of order m - r. Using (40) in (41) we have

$\sum \limits_{j=1}^r (\gamma_j + \sum \limits_{i=1}^{m-r} \gamma_{r+i} \beta_{ij} ) a_j = 0$

and, since a_{1}, a_{2},...., a_{r} are linearly independent, we have

$\gamma_j = - \sum \limits_{i=1}^{m-r} \beta_{ij} \gamma_{r+i}, \ \ \ \ \ \ \ (j = 1, 2, ..., r)$

whence

$(43) \ \ \ \ \ \ \ c = \sum \limits_1^m \gamma_j e_j = - \sum \limits_{i=1}^{m-r} \gamma_{r+i} \sum \limits_{j=1}^{r} \beta_{ij} e_j + \sum \limits_{i=1}^{m-r} \gamma_{r+i} e_{r+i} = \sum \limits_{i=1}^{m-r} \gamma_{r+i} c_i,$

so that c is linearly dependent on c_{1}, c_{2},...., c_{m-r}. Conversely, on retracing these steps in the reverse order we see that, if c is linearly dependent on these vectors, so that γ_{r+i} (i = 1, 2,...., m - r) are known, then from (43) the γ_{j} (j = 1, 2,...., r) are defined in such a way that $c = \sum \limits_1^m \gamma_j e_j$ and $\sum \limits_1^m \gamma_j a_j = 0$. We have therefore the following theorem.

**Theorem 10.** *If (a_{1}, a_{2},...., a_{m}) is a linear set of order r, there exist m - r linear relations $\sum \limits_{j=1}^m \gamma_{ij} a_j = 0$ (i = 1, 2,...., m - r) such that (i) the vectors $c_i = \sum \limits_{j=1}^m \gamma_{ij} e_j$ are linearly independent and (ii) if ∑ γ_{j}a_{j} = 0 is any linear relation connecting the a's, and if c = ∑ γ_{j}e_{j}, then c belongs to the linear set (c_{1}, c_{2},...., c_{m-r}).*
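The construction (40)-(42) can be traced on a tiny hypothetical example: take m = 3 vectors of order 2 with a_{3} = a_{1} + a_{2}, so that r = 2 and there is m - r = 1 relation vector c_{1}. The data below are my own illustrative choices.

```python
# Hypothetical set with m = 3, r = 2 in 2-space: a3 = 1*a1 + 1*a2,
# i.e. the coefficients beta of (40) are (1, 1).
a1, a2, a3 = [1, 0], [0, 1], [1, 1]
beta = [1, 1]

# The relation vector of (42): c1 = -beta_1 e1 - beta_2 e2 + e3.
c1 = [-beta[0], -beta[1], 1]

# Check that gamma = c1 really gives a linear relation (41) among the a's:
combo = [c1[0] * u + c1[1] * v + c1[2] * w for u, v, w in zip(a1, a2, a3)]
print(combo)   # [0, 0]
```

By Theorem 10, *every* relation ∑ γ_{j}a_{j} = 0 for this set has (γ_{1}, γ_{2}, γ_{3}) a scalar multiple of c_{1} = (-1, -1, 1).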

This result can be translated immediately into terms concerning the solution of a system of ordinary linear equations, or into the language of matrices. If $a_j = \sum \limits_i a_{ji} e_i$, then (41) may be written

$(44) \ \ \ \ \ \ \ \ \ \ \begin{matrix} a_{11}\gamma_1 + a_{21}\gamma_2 + ... + a_{m1}\gamma_m = 0 \\ . \ \ \ \ . \ \ \ \ ... \ \ \ \ . \\ a_{1n}\gamma_1 + a_{2n}\gamma_2 + ... + a_{mn}\gamma_m = 0 \end{matrix}$

a system of linear homogeneous equations in the unknowns γ_{1}, γ_{2},...., γ_{m}. Hence (44) has solutions for which some γ_{i} ≠ 0 if, and only if, the rank r of the array

$(45) \ \ \ \ \ \ \ \ \ \ \begin{matrix} a_{11} & a_{21} & ... & a_{m1} \\ a_{12} & a_{22} & ... & a_{m2} \\ . & . & ... & . \\ a_{1n} & a_{2n} & ... & a_{mn} \end{matrix}$

is less than m and, when this condition is satisfied, every solution is linearly dependent on the set of m - r solutions given by (42), which are found by the method given in the discussion of Theorem 9.

Again, if we make (45) a square array by the introduction of columns or rows of zeros and set A = ||a_{ij}||, c = ∑ γ_{i}e_{i}, then (41) becomes A'c = 0 and Theorem 10 may therefore be interpreted as giving the properties of the nullspace of A' which were derived in §1.10.
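The matrix form A'c = 0 can be checked on the same kind of toy data. In the hypothetical sketch below, n = 2, m = 3, a_{1} = e_{1}, a_{2} = e_{2}, a_{3} = a_{1} + a_{2}; the array (45), whose column j holds the coordinates of a_{j}, is padded with a zero row to make it square.

```python
# The padded square array (45) = A': column j holds the coordinates of a_j
# for the hypothetical set a1 = (1,0), a2 = (0,1), a3 = (1,1).
A_transpose = [[1, 0, 1],
               [0, 1, 1],
               [0, 0, 0]]   # zero row added to make the array square

# Relation vector c = (gamma_1, gamma_2, gamma_3) for a3 - a1 - a2 = 0.
c = [-1, -1, 1]

# (41) in matrix form: A'c = 0, i.e. c lies in the nullspace of A'.
result = [sum(row[j] * c[j] for j in range(3)) for row in A_transpose]
print(result)   # [0, 0, 0]
```

The relation vector c is thus a point of the nullspace of A' in the sense of §1.10, exactly as Theorem 10 asserts.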