Consider the $m$ by $n$ matrix $A \in \mathbb{M}(m,n)\equiv \mathbb{R}^{m\times n}$. What operations can we do on it?
We denote the matrix as a whole by $A$ and refer to its individual entries as $a_{ij}$, where $a_{ij}$ is the entry in the $i$-th row and the $j$-th column of $A$.
The matrix addition and subtraction operations take two matrices as inputs (the matrices must have the same dimensions). \[ +: \mathbb{M}, \mathbb{M} \to \mathbb{M}, \qquad -: \mathbb{M}, \mathbb{M} \to \mathbb{M}. \]
The addition and subtraction operations are performed componentwise. For two $m\times n$ matrices $A$ and $B$, their sum is the matrix $C$ with entries: \[ C = A + B \Leftrightarrow c_{ij} = a_{ij} + b_{ij}, \forall i \in [1,\ldots,m], j\in [1,\ldots,n]. \]
Or written out explicitly for $3\times3$ matrices: \[ \left[\begin{array}{ccc} a_{11} & a_{12} & a_{13} \nl a_{21} & a_{22} & a_{23} \nl a_{31} & a_{32} & a_{33} \end{array}\right] + \left[\begin{array}{ccc} b_{11} & b_{12} & b_{13} \nl b_{21} & b_{22} & b_{23} \nl b_{31} & b_{32} & b_{33} \end{array}\right] = \left[\begin{array}{ccc} a_{11}+b_{11} & a_{12}+b_{12} & a_{13}+b_{13} \nl a_{21}+b_{21} & a_{22}+b_{22} & a_{23}+b_{23} \nl a_{31}+b_{31} & a_{32}+b_{32} & a_{33}+b_{33} \end{array}\right]. \]
Given a number $\alpha$ and a matrix $A$, we can scale $A$ by $\alpha$: \[ \alpha A = \alpha \left[\begin{array}{ccc} a_{11} & a_{12} & a_{13} \nl a_{21} & a_{22} & a_{23} \nl a_{31} & a_{32} & a_{33} \end{array}\right] = \left[\begin{array}{ccc} \alpha a_{11} & \alpha a_{12} & \alpha a_{13} \nl \alpha a_{21} & \alpha a_{22} & \alpha a_{23} \nl \alpha a_{31} & \alpha a_{32} & \alpha a_{33} \end{array}\right]. \]
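To make these definitions concrete, here is a minimal sketch using NumPy (the library choice and the example matrices are our own illustration, not part of the text):

```python
import numpy as np

# Two example 3x3 matrices, chosen arbitrarily for illustration.
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
B = np.ones((3, 3))

C = A + B    # entrywise sum: C[i][j] == A[i][j] + B[i][j]
D = A - B    # entrywise difference
E = 2 * A    # scaling by alpha = 2: every entry is doubled
```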
The matrix-vector product of some matrix $A \in \mathbb{R}^{m\times n}$ and a vector $\vec{v} \in \mathbb{R}^n$ consists of computing the dot product between the vector $\vec{v}$ and each of the rows of $A$: \[ \textrm{matrix-vector product} : \mathbb{M}(m,n) \times \mathbb{V}(n) \to \mathbb{V}(m) \] \[ \vec{w} = A\vec{v} \Leftrightarrow w_{i} = \sum_{j=1}^n a_{ij}v_{j}, \forall i \in [1,\ldots,m]. \]
\[ A\vec{v} = \left[\begin{array}{ccc} a_{11} & a_{12} & a_{13} \nl a_{21} & a_{22} & a_{23} \nl a_{31} & a_{32} & a_{33} \end{array}\right] \left[\begin{array}{c} v_{1} \nl v_{2} \nl v_{3} \end{array}\right] = \left[\begin{array}{c} a_{11}v_{1} + a_{12}v_{2} + a_{13}v_{3} \nl a_{21}v_1 + a_{22}v_2 + a_{23}v_3 \nl a_{31}v_1 + a_{32}v_2 + a_{33}v_3 \end{array}\right] \quad \in \mathbb{R}^{3 \times 1}. \]
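The formula $w_i = \sum_j a_{ij}v_j$ translates directly into code. A small example (values chosen arbitrarily):

```python
import numpy as np

A = np.array([[1, 0, 0],
              [0, 2, 0],
              [0, 0, 3]])
v = np.array([1, 1, 1])

# Each entry w[i] is the dot product of the i-th row of A with v.
w = A @ v   # array([1, 2, 3])

# The same result computed explicitly from the definition above:
w_explicit = np.array([A[i, :] @ v for i in range(A.shape[0])])
assert np.array_equal(w, w_explicit)
```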
The matrix multiplication $AB$ of matrices $A \in \mathbb{R}^{m\times n}$ and $B \in \mathbb{R}^{n\times \ell}$ consists of computing the dot product between each of the rows of $A$ and each of the columns of $B$. \[ \textrm{matrix-product} : \mathbb{M}(m,n) \times \mathbb{M}(n,\ell) \to \mathbb{M}(m,\ell) \] \[ C = AB \Leftrightarrow c_{ij} = \sum_{k=1}^n a_{ik}b_{kj}, \forall i \in [1,\ldots,m],j \in [1,\ldots,\ell]. \]
\[ \left[\begin{array}{cc} a_{11} & a_{12} \nl a_{21} & a_{22} \nl a_{31} & a_{32} \end{array}\right] \left[\begin{array}{cc} b_{11} & b_{12} \nl b_{21} & b_{22} \end{array}\right] = \left[\begin{array}{cc} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \nl a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \nl a_{31}b_{11} + a_{32}b_{21} & a_{31}b_{12} + a_{32}b_{22} \end{array}\right] \qquad \in \mathbb{R}^{3 \times 2}. \]
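Here is a sketch of the same row-times-column rule in code (matrices invented for the example):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4],
              [5, 6]])      # 3x2
B = np.array([[1, 2],
              [3, 4]])      # 2x2

C = A @ B                   # 3x2 result: c_ij = sum_k a_ik * b_kj
# Entry (i,j) is the dot product of row i of A with column j of B:
assert C[0, 1] == A[0, :] @ B[:, 1]
```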
The transpose of a matrix $A$ is the matrix $A^T$ defined by $(A^T)_{ij}=a_{ji}$, i.e., we just “flip” the matrix through the diagonal: \[ \textrm{T} : \mathbb{M}(m,n) \to \mathbb{M}(n,m), \] \[ \begin{bmatrix} \alpha_1 & \alpha_2 & \alpha_3 \nl \beta_1 & \beta_2 & \beta_3 \end{bmatrix}^T = \begin{bmatrix} \alpha_1 & \beta_1 \nl \alpha_2 & \beta_2 \nl \alpha_3 & \beta_3 \end{bmatrix}. \]
Note that the entries on the diagonal are not changed by the transpose operation.
The transpose operation satisfies the following properties: \[ \begin{align*} (A+B)^T &= A^T + B^T \nl (AB)^T &= B^TA^T \nl (ABC)^T &= C^TB^TA^T \nl (A^T)^{-1} &= (A^{-1})^T \end{align*} \]
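These identities are easy to check numerically. A quick sanity check with random matrices (shapes chosen so the products are defined):

```python
import numpy as np

rng = np.random.default_rng(42)   # seeded for reproducibility
A = rng.random((2, 3))
B = rng.random((3, 2))
C = rng.random((2, 3))

assert np.allclose((A + C).T, A.T + C.T)   # transpose distributes over sums
assert np.allclose((A @ B).T, B.T @ A.T)   # transpose reverses products
```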
You can think of vectors as special kinds of matrices: a vector $\vec{v}$ can be viewed either as a column vector (an $n\times 1$ matrix) or as a row vector (a $1 \times n$ matrix).
Recall the definition of the dot product or inner product for vectors: \[ \textrm{inner-product} : \mathbb{V}(n) \times \mathbb{V}(n) \to \mathbb{R}. \] Given two $n$-dimensional vectors $\vec{u}$ and $\vec{v}$ with real coefficients, their dot product is computed as follows: $\vec{u}\cdot\vec{v} = \sum_{i=1}^n u_iv_i$.
If we think of these vectors as column vectors, i.e., think of them as $n\times1$ matrices, then we can write the dot product using the transpose operation $T$ and the standard rules of matrix multiplication: \[ \vec{u}\cdot \vec{v} = \vec{u}^T\vec{v} = \left[\begin{array}{ccc} u_{1} & u_{2} & u_{3} \end{array}\right] \left[\begin{array}{c} v_1 \nl v_2 \nl v_3 \end{array}\right] = u_1v_1 + u_2v_2 + u_3v_3. \]
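In code, both views of the dot product give the same number (example vectors made up for illustration):

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

d1 = u @ v                                 # dot product: 1*4 + 2*5 + 3*6 = 32
d2 = u.reshape(1, 3) @ v.reshape(3, 1)     # as a (1x3)(3x1) matrix product
assert d1 == d2[0, 0]
```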
You see that the dot product for vectors is really a special case of matrix multiplication. Alternatively, you could say that matrix multiplication is defined in terms of the dot product.
Consider again two column vectors ($n\times 1$ matrices) $\vec{u}$ and $\vec{v}$. We obtain the inner product if we put the transpose on the first vector: $\vec{u}^T\vec{v}\equiv \vec{u}\cdot \vec{v}$. If instead we put the transpose on the second vector, we obtain the outer product of $\vec{u}$ and $\vec{v}$: \[ \vec{u}\vec{v}^T = \left[\begin{array}{c} u_1 \nl u_2 \nl u_3 \end{array}\right] \left[\begin{array}{ccc} v_{1} & v_{2} & v_{3} \end{array}\right] = \begin{bmatrix} u_1v_1 & u_1v_2 & u_1v_3 \nl u_2v_1 & u_2v_2 & u_2v_3 \nl u_3v_1 & u_3v_2 & u_3v_3 \end{bmatrix} \qquad \in \mathbb{R}^{3 \times 3}. \] The result of the outer product is an $n \times n$ matrix, since it is the product of an $n\times1$ matrix and a $1 \times n$ matrix. More specifically, the outer product is a map that takes two vectors as inputs and gives a matrix as output: \[ \textrm{outer-product} : \mathbb{V}(n) \times \mathbb{V}(n) \to \mathbb{M}(n,n). \] The outer product can be used to build projection matrices. For example, the matrix which corresponds to the projection onto the $x$-axis is given by $M_x = \hat{\imath}\hat{\imath}^T \in \mathbb{R}^{n \times n}$. The $x$-projection of any vector $\vec{v}$ can be computed as a matrix-vector product: $M_x\vec{v} = \hat{\imath}\hat{\imath}^T\vec{v} = \hat{\imath}(\hat{\imath}\cdot\vec{v}) = v_x \hat{\imath}$. The last equality follows from the dot-product formula for calculating the components of vectors.
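Here is a sketch of the outer product and the projection matrix $M_x$ in NumPy (the three-dimensional example is our own):

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

M = np.outer(u, v)          # 3x3 matrix with entries M[i][j] == u[i]*v[j]

# Projection onto the x-axis, built from the unit vector i-hat:
ihat = np.array([1.0, 0.0, 0.0])
Mx = np.outer(ihat, ihat)

w = np.array([7.0, 8.0, 9.0])
assert np.allclose(Mx @ w, [7.0, 0.0, 0.0])   # keeps only the x-component
```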
The inverse matrix $A^{-1}$ has the property that $A A^{-1}=I = A^{-1}A$, where $I$ is the identity matrix which obeys $I\vec{v} = \vec{v}$ for all vectors $\vec{v}$. The inverse matrix $A^{-1}$ has the effect of undoing whatever $A$ did. The cumulative effect of multiplying by $A$ and $A^{-1}$ is equivalent to the identity transformation: \[ A^{-1}(A(\vec{v})) = (A^{-1}A)\vec{v} = I\vec{v} = \vec{v}. \]
We can think of “finding the inverse” $\textrm{inv}(A)=A^{-1}$ as an operation of the form: \[ \textrm{inv} : \mathbb{M}(n,n) \to \mathbb{M}(n,n). \] Note that not every matrix has an inverse; the matrices for which $A^{-1}$ exists are called invertible.
The matrix inverse satisfies the following properties: \[ \begin{align*} (AB)^{-1} &= B^{-1}A^{-1} \nl (ABC)^{-1} &= C^{-1}B^{-1}A^{-1} \nl (A^T)^{-1} &= (A^{-1})^T \end{align*} \] Note that, unlike the transpose, the inverse does not distribute over addition: in general $(A+B)^{-1} \neq A^{-1} + B^{-1}$.
The matrix inverse plays the role of “division by the matrix $A$” in matrix equations. We will discuss the peculiarities associated with matrix equations in the next section.
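A small illustration of “dividing by $A$” (the matrix and right-hand side are invented for the example; note that in numerical practice one calls a solver rather than forming $A^{-1}$ explicitly):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])     # invertible, since det(A) = 1 != 0

A_inv = np.linalg.inv(A)
assert np.allclose(A @ A_inv, np.eye(2))     # A A^{-1} = I

# Solving A x = b plays the role of "dividing b by A":
b = np.array([3.0, 2.0])
x = np.linalg.solve(A, b)    # preferred over A_inv @ b numerically
assert np.allclose(A @ x, b)
```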
The trace of an $n\times n$ matrix, \[ \textrm{Tr} : \mathbb{M}(n,n) \to \mathbb{R}, \] is the sum of the $n$ values on the diagonal of the matrix: \[ \textrm{Tr}\!\left[ A \right] \equiv \sum_{i=1}^n a_{ii}. \]
The trace satisfies the following properties: \[ \begin{align*} \textrm{Tr}\!\left[ A + B\right] &= \textrm{Tr}\!\left[ A \right] + \textrm{Tr}\!\left[ B\right] \nl \textrm{Tr}\!\left[ AB \right] &= \textrm{Tr}\!\left[ BA \right] \nl \textrm{Tr}\!\left[ ABC \right] &= \textrm{Tr}\!\left[ CAB \right] = \textrm{Tr}\!\left[ BCA \right] \nl \textrm{Tr}\!\left[ A \right] &= \sum_{i=1}^{n} \lambda_i \qquad \textrm{ where } \{ \lambda_i\} = \textrm{eig}(A) \textrm{ are the eigenvalues } \nl \textrm{Tr}\!\left[ A^T \right] &= \textrm{Tr}\!\left[ A \right] \end{align*} \]
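The cyclic property and the eigenvalue sum can be checked numerically (random matrices, our own example):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((3, 3))
B = rng.random((3, 3))

assert np.isclose(np.trace(A @ B), np.trace(B @ A))    # cyclic property
assert np.isclose(np.trace(A.T), np.trace(A))          # Tr[A^T] = Tr[A]
# Trace equals the sum of eigenvalues (imaginary parts cancel):
assert np.isclose(np.trace(A), np.sum(np.linalg.eigvals(A)).real)
```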
The determinant of a matrix is a calculation which involves all the entries of the matrix and whose output is a single real number: \[ \textrm{det} : \mathbb{M}(n,n) \to \mathbb{R}. \]
The determinant describes the relative geometry of the vectors that make up the rows of the matrix. More specifically, the determinant of a matrix $A$ tells you the volume of the box with sides given by the rows of $A$.
For example, the determinant of a $2\times2$ matrix is \[ \det(A) = \det\left(\begin{array}{cc}a&b\nl c&d \end{array}\right) =\left|\begin{array}{cc}a&b\nl c&d \end{array}\right| =ad-bc, \] which corresponds to the area of the parallelogram formed by the vectors $(a,b)$ and $(c,d)$. Observe that if the rows of $A$ point in the same direction, $(a,b) = \alpha(c,d)$ for some $\alpha \in \mathbb{R}$, then the area of the parallelogram is zero. Conversely, if the determinant of a matrix is non-zero, then the rows of the matrix must be linearly independent.
The determinant satisfies the following properties: \[ \begin{align*} \textrm{det}\!\left( AB\right) &= \textrm{det}\!\left( A \right)\textrm{det}\!\left( B\right) \nl \textrm{det}\!\left( A \right) &= \prod_{i=1}^{n} \lambda_i \qquad \textrm{ where } \{\lambda_i\} = \textrm{eig}(A) \textrm{ are the eigenvalues } \nl \textrm{det}\!\left( A^T \right) &= \textrm{det}\!\left( A \right) \nl \textrm{det}\!\left( A^{-1}\right) &= \frac{1}{\textrm{det}\!\left( A \right) } \end{align*} \]
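A numerical sanity check of the geometric and algebraic claims above (matrices chosen for the example):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
area = np.linalg.det(A)          # 3*2 - 1*1 = 5: area of the parallelogram

# Linearly dependent rows give a zero determinant:
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])       # second row is 2x the first
assert np.isclose(np.linalg.det(S), 0.0)

# det(AB) = det(A) det(B):
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])
assert np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))
```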
For any invertible matrix $P$ we can define the similarity transformation: \[ \textrm{Sim}_P : \mathbb{M}(n,n) \to \mathbb{M}(n,n), \] which acts as follows: \[ \textrm{Sim}_P(A) = P A P^{-1}. \]
The similarity transformation $A^\prime = P A P^{-1}$ leaves many of the properties of the matrix unchanged, including its trace, determinant, rank, and eigenvalues.
A similarity transformation can be interpreted as a change of basis in which case the matrix $P$ is called the change-of-basis matrix.
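We can verify numerically that a similarity transformation preserves the trace, determinant, and eigenvalues (a random example; a random $P$ is invertible with probability one, which we simply assume here):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((3, 3))
P = rng.random((3, 3))           # assumed invertible

A_prime = P @ A @ np.linalg.inv(P)

assert np.isclose(np.trace(A_prime), np.trace(A))
assert np.isclose(np.linalg.det(A_prime), np.linalg.det(A))
assert np.allclose(np.sort_complex(np.linalg.eigvals(A_prime)),
                   np.sort_complex(np.linalg.eigvals(A)))
```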
In the remainder of this chapter we will learn about various algebraic and geometric interpretations for each of the matrix operations defined above. But first we must begin with an important discussion about matrix equations and how they differ from equations with numbers.