• Matrices as collections of vectors

  • Matrix-vector product

  • Matrix-matrix product

  • Trace, scalar product

  • Some special matrices

Matrices as collections of vectors

Matrices can be viewed simply as a collection of vectors of the same size, that is, as a collection of points in a high-dimensional space.

Matrices as collections of columns

Matrices can be described in column-wise fashion: given n vectors a_1, \ldots, a_n \in \mathbf{R}^m, we can define the m \times n matrix A with the a_j's as columns:
 A = \left[ \begin{array}{ccc} a_1 & \ldots & a_n \end{array} \right].
Geometrically, A represents n points in an m-dimensional space.

Transpose

The notation A_{ij} denotes the element of A sitting in row i and column j. The transpose of an m \times n matrix A, denoted by A^T, is the n \times m matrix with (i,j) element A_{ji}, i=1,\ldots,n, j=1,\ldots,m.

Matrices as collections of rows

Similarly, we can describe a matrix in row-wise fashion: given m vectors b_1, \ldots, b_m \in \mathbf{R}^n, we can define the m \times n matrix B with the transposed vectors b_i^T as rows:
 B = \left[ \begin{array}{c} b_1^T \\ \vdots \\ b_m^T \end{array} \right].
Geometrically, B represents m points in an n-dimensional space.

The notation \mathbf{R}^{m \times n} denotes the set of m \times n matrices.
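The two views can be sketched in NumPy (the vectors and names below are chosen purely for illustration):

```python
import numpy as np

# Three points a_1, a_2, a_3 in R^2, stacked as columns of a 2 x 3 matrix A.
a1 = np.array([1.0, 0.0])
a2 = np.array([0.0, 1.0])
a3 = np.array([1.0, 1.0])
A = np.column_stack([a1, a2, a3])      # A is in R^(2 x 3)

# The same matrix described row-wise: each row is a transposed vector b_i^T.
b1 = A[0, :]                           # first row, a vector in R^3
b2 = A[1, :]
B = np.vstack([b1, b2])                # rebuilds A from its rows

# The transpose exchanges the two views: the columns of A are the rows of A.T.
assert np.array_equal(A.T[0, :], a1)
```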


Matrix-vector product

We define the matrix-vector product between an m \times n matrix A and an n-vector x, denoted Ax, as the m-vector with i-th component
 (Ax)_i = \sum_{j=1}^n A_{ij} x_j, \;\; i=1,\ldots,m.

If the columns of A are given by the vectors a_i, i=1,\ldots,n, so that A = [a_1, \ldots, a_n], then Ax can be interpreted as a linear combination of these columns, with weights given by the vector x:
 Ax = \sum_{i=1}^n x_i a_i.

Alternatively, if the rows of A are the row vectors a_i^T, i=1,\ldots,m:
 A = \left[ \begin{array}{c} a_1^T \\ \vdots \\ a_m^T \end{array} \right],
then Ax is the vector with elements a_i^T x, i=1,\ldots,m:
 Ax = \left[ \begin{array}{c} a_1^T x \\ \vdots \\ a_m^T x \end{array} \right].
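Both interpretations are easy to check numerically; here is a minimal NumPy sketch (matrix entries chosen for illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])        # a 2 x 3 matrix
x = np.array([1.0, 0.0, -1.0])

# Definition: (Ax)_i = sum_j A_ij x_j.
Ax = A @ x

# Column-wise view: Ax is a linear combination of the columns of A,
# with weights given by the entries of x.
col_view = sum(x[j] * A[:, j] for j in range(A.shape[1]))

# Row-wise view: the i-th entry of Ax is the scalar product a_i^T x.
row_view = np.array([A[i, :] @ x for i in range(A.shape[0])])

assert np.allclose(Ax, col_view) and np.allclose(Ax, row_view)
```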


Matrix-matrix product

Definition

We can extend the matrix-vector product to the matrix-matrix product, as follows. If A \in \mathbf{R}^{m \times n} and B \in \mathbf{R}^{n \times p}, the notation AB denotes the m \times p matrix with (i,j) element given by
 (AB)_{ij} = \sum_{k=1}^n A_{ik} B_{kj}.
It can be shown that transposing a product reverses the order of the factors: (AB)^T = B^T A^T.

Column-wise interpretation

If the columns of B are given by the vectors b_i, i=1,\ldots,p, so that B = [b_1, \ldots, b_p], then AB can be written as
 AB = A \left[ \begin{array}{ccc} b_1 & \ldots & b_p \end{array} \right] = \left[ \begin{array}{ccc} Ab_1 & \ldots & Ab_p \end{array} \right].
In other words, AB results from transforming each column b_i of B into Ab_i.

Row-wise interpretation

The matrix-matrix product can also be interpreted as an operation on the rows of A. Indeed, if A is given by its rows a_i^T, i=1,\ldots,m, then AB is the matrix obtained by transforming each one of these rows via B, into a_i^T B, i=1,\ldots,m:
 AB = \left[ \begin{array}{c} a_1^T \\ \vdots \\ a_m^T \end{array} \right] B = \left[ \begin{array}{c} a_1^T B \\ \vdots \\ a_m^T B \end{array} \right].
(Note that the a_i^T B's are indeed row vectors, according to our matrix-vector rules.)
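The column-wise and row-wise interpretations, and the transpose rule, can be verified on random matrices; a NumPy sketch (random seed chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))        # m x n
B = rng.standard_normal((4, 2))        # n x p
AB = A @ B                             # m x p

# Column-wise: the j-th column of AB is A applied to the j-th column of B.
cols = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])

# Row-wise: the i-th row of AB is the i-th row of A, transformed by B.
rows = np.vstack([A[i, :] @ B for i in range(A.shape[0])])

assert np.allclose(AB, cols) and np.allclose(AB, rows)

# Transposing a product reverses the order of the factors.
assert np.allclose((A @ B).T, B.T @ A.T)
```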

Matrix-matrix products by blocks

Matrix algebra generalizes to blocks, provided block sizes are consistent. To illustrate this, consider the matrix-vector product between an m \times n matrix A and an n-vector x, where A and x are partitioned in blocks, as follows:
 A = \left[ \begin{array}{cc} A_1 & A_2 \end{array} \right], \;\; x = \left[ \begin{array}{c} x_1 \\ x_2 \end{array} \right],
where A_i is m \times n_i, x_i \in \mathbf{R}^{n_i}, i=1,2, and n_1+n_2=n. Then
 Ax = A_1 x_1 + A_2 x_2.
Likewise, if an n \times p matrix B is partitioned into two row blocks B_i, each of size n_i \times p, i=1,2, with n_1+n_2=n, then
 AB = \left[ \begin{array}{cc} A_1 & A_2 \end{array} \right] \left[ \begin{array}{c} B_1 \\ B_2 \end{array} \right] = A_1 B_1 + A_2 B_2.
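A quick numerical check of both block identities, with block sizes chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n1, n2, p = 3, 2, 2, 4
A1 = rng.standard_normal((m, n1))
A2 = rng.standard_normal((m, n2))
A = np.hstack([A1, A2])                # m x (n1 + n2)

x1 = rng.standard_normal(n1)
x2 = rng.standard_normal(n2)
x = np.concatenate([x1, x2])

# Block matrix-vector product: Ax = A1 x1 + A2 x2.
assert np.allclose(A @ x, A1 @ x1 + A2 @ x2)

B1 = rng.standard_normal((n1, p))
B2 = rng.standard_normal((n2, p))
B = np.vstack([B1, B2])                # (n1 + n2) x p

# Block matrix-matrix product: AB = A1 B1 + A2 B2.
assert np.allclose(A @ B, A1 @ B1 + A2 @ B2)
```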

Example: Gram matrix.

Trace, scalar product

Trace

The trace of a square n \times n matrix A, denoted by \mbox{\bf Tr}\, A, is the sum of its diagonal elements: \mbox{\bf Tr}\, A = \sum_{i=1}^n A_{ii}.

Scalar product

We can define the scalar product between two m \times n matrices A, B via
 \langle A, B \rangle = \mbox{\bf Tr}\, A^T B = \sum_{i=1}^m \sum_{j=1}^n A_{ij} B_{ij}.
We can interpret the above scalar product as the (vector) scalar product between two long vectors of length mn each, obtained by stacking all the columns of A and B on top of each other.
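The trace formula, the element-wise sum, and the stacked-columns interpretation all agree; a NumPy sketch (random matrices for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 2))
B = rng.standard_normal((3, 2))

# Scalar product: <A, B> = Tr(A^T B) = sum_ij A_ij B_ij.
inner = np.trace(A.T @ B)
assert np.isclose(inner, np.sum(A * B))

# Equivalently: the vector scalar product of the two mn-vectors obtained by
# stacking the columns of A and B (order="F" stacks column by column).
assert np.isclose(inner, A.ravel(order="F") @ B.ravel(order="F"))
```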

Special matrices

Important classes of matrices include the following.

Identity matrix

The n \times n identity matrix (often denoted I_n, or simply I if the context allows), has ones on its diagonal and zeros elsewhere. It is diagonal, symmetric, and orthogonal, and satisfies A I_n = A for every matrix A with n columns (likewise, I_m A = A for every A with m rows).

Square matrices

Square matrices are matrices that have the same number of rows as columns.

Diagonal matrices

Diagonal matrices are square matrices A with A_{ij} = 0 when i \ne j.

Symmetric matrices

Symmetric matrices are square matrices that satisfy A_{ij} = A_{ji} for every pair (i,j). An entire topic is devoted to symmetric matrices.
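These classes are easy to construct and check in NumPy; a brief sketch with values chosen for illustration:

```python
import numpy as np

I3 = np.eye(3)                         # the 3 x 3 identity matrix
D = np.diag([1.0, 2.0, 3.0])           # a diagonal matrix from its diagonal

# A I = A for any matrix A with 3 columns.
A = np.arange(6.0).reshape(2, 3)
assert np.array_equal(A @ I3, A)

# A diagonal matrix is symmetric: D_ij = D_ji for every pair (i, j).
assert np.array_equal(D, D.T)
```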

Orthogonal matrices

Orthogonal matrices are square matrices whose columns form an orthonormal basis. If U = [u_1, \ldots, u_n] is an orthogonal matrix, then
 u_i^T u_j = \left\{ \begin{array}{ll} 1 & \mbox{if } i=j, \\ 0 & \mbox{otherwise.} \end{array} \right.
Thus, U^T U = I_n; since U is square, it also holds that U U^T = I_n.

Orthogonal matrices correspond to bases obtained from the standard basis by a rotation, possibly combined with a reflection. Their effect on a vector is to rotate it, leaving its length (Euclidean norm) invariant: for every vector x,
 \|Ux\|_2^2 = (Ux)^T(Ux) = x^T U^T U x = x^T x = \|x\|_2^2.
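A plane rotation gives a concrete orthogonal matrix; the sketch below (angle chosen arbitrarily) verifies orthonormality and norm invariance:

```python
import numpy as np

# A 2 x 2 rotation matrix by angle theta is orthogonal.
theta = 0.3
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Columns form an orthonormal basis: U^T U = U U^T = I.
assert np.allclose(U.T @ U, np.eye(2))
assert np.allclose(U @ U.T, np.eye(2))

# Multiplying by U rotates x but preserves its Euclidean norm.
x = np.array([3.0, 4.0])
assert np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x))
```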

Example: A 2 times 2 orthogonal matrix.