The page you are reading is part of a draft (v2.0) of the "No bullshit guide to math and physics."

The text has since gone through many edits and is now available in print and electronic format. The current edition of the book is v4.0, which is a substantial improvement over this draft version in terms of content and language (I hired a professional editor).

I'm leaving the old wiki content up for the time being, but I highly encourage you to check out the finished book. An extended preview is available here (PDF, 106 pages, 5MB).


Vector coordinates

In the physics chapter we learned how to work with vectors in terms of their components. We can decompose the effects of a force $\vec{F}$ in terms of its $x$ and $y$ components: \[ F_x = \| \vec{F} \| \cos\theta, \qquad F_y = \| \vec{F} \| \sin\theta, \] where $\theta$ is the angle that the vector $\vec{F}$ makes with the $x$ axis. We can write the vector $\vec{F}$ in the following equivalent ways: \[ \vec{F} = F_x\hat{\imath} + F_y \hat{\jmath} = (F_x,F_y)_{\hat{\imath}\hat{\jmath}}, \] in which the vector is expressed as components or coordinates with respect to the basis $\{ \hat{\imath}, \hat{\jmath} \}$ (the $xy$ coordinate system).
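
As a quick numerical check of these formulas, here is a short Python (NumPy) sketch; the magnitude and angle are made-up values for illustration:

```python
import numpy as np

# Made-up example: a force of magnitude 10 N at 30 degrees above the x axis.
F_mag = 10.0
theta = np.deg2rad(30)

F_x = F_mag * np.cos(theta)   # F_x = ||F|| cos(theta) ~ 8.66
F_y = F_mag * np.sin(theta)   # F_y = ||F|| sin(theta) = 5.0

# F = F_x i-hat + F_y j-hat
F = F_x * np.array([1.0, 0.0]) + F_y * np.array([0.0, 1.0])
print(F)   # [8.66025404 5.        ]
```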

The number $F_x$ (the first coordinate of $\vec{F}$) corresponds to the length of the projection of the vector $\vec{F}$ on the $x$ axis. In the last section we formalized the notion of projection and saw that the projection operation on a vector can be represented as a matrix product: \[ F_x\:\hat{\imath} = \Pi_x(\vec{F}) = (\vec{F} \cdot \hat{\imath})\hat{\imath} = \underbrace{\hat{\imath}\,\hat{\imath}^T}_{M_x} \vec{F}, \] where $M_x$ is called “the projection matrix onto the $x$ axis.”
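
We can verify numerically that the outer product $\hat{\imath}\,\hat{\imath}^T$ acts as a projection matrix. A minimal NumPy sketch, with a made-up force vector:

```python
import numpy as np

ihat = np.array([[1.0], [0.0]])   # i-hat as a column vector
M_x = ihat @ ihat.T               # outer product: the projection matrix
# M_x == [[1. 0.]
#         [0. 0.]]

F = np.array([[8.66], [5.0]])     # made-up force vector (column)
print(M_x @ F)                    # [[8.66] [0.  ]] -- only the x part survives
```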

In this section we will discuss in detail the relationship between vectors $\vec{v}$ (directions in space) and their representation in terms of coordinates with respect to a basis.

Definitions

We will discuss the three “quality grades” that exist for bases. For an $n$-dimensional vector space $V$, you could have:

  • A generic basis $B_f=\{ \vec{f}_1, \vec{f}_2, \ldots, \vec{f}_n \}$,

which consists of any set of $n$ linearly independent vectors in $V$.

  • An orthogonal basis $B_{e}=\{ \vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n \}$,

which consists of $n$ mutually orthogonal vectors in $V$: $\vec{e}_i \cdot \vec{e}_j = 0$ for all $i \neq j$.

  • An orthonormal basis $B_{\hat{e}}=\{ \hat{e}_1, \hat{e}_2, \ldots, \hat{e}_n \}$,

which is an orthogonal basis of unit-length vectors: $\hat{e}_i \cdot \hat{e}_j = \delta_{ij}$, and in particular $\| \hat{e}_i \|^2 =1, \ \forall i \in \{ 1,2,\ldots,n\}$.

The main idea is quite simple.

  • Any vector can be expressed as coordinates with respect to a basis:

\[ \vec{v} = v_1 \vec{e}_1 + v_2\vec{e}_2 + \cdots + v_n\vec{e}_n = (v_1, v_2, \ldots, v_n)_{B_e}. \]

However, things can get confusing when we use multiple bases:

  • $\vec{v}$: a vector.
  • $[\vec{v}]_{B_e}=(v_1, v_2, \ldots, v_n)_{B_e}$: the vector $\vec{v}$

expressed in terms of the basis $B_e$.

  • $[\vec{v}]_{B_f}=(v^\prime_1, v^\prime_2, \ldots, v^\prime_n)_{B_f}$: the same vector $\vec{v}$

expressed in terms of the basis $B_f$.

  • $\ _{B_f}[I]_{B_e}$: the change of basis matrix, which converts the components of any vector

from the $B_e$ basis to the $B_f$ basis: $[\vec{v}]_{B_f} = \ _{B_f}[I]_{B_e}\,[\vec{v}]_{B_e}$.

Components with respect to a basis

The notion of “how much of a vector is in a given direction” is what we call the components of the vector $\vec{v}=(v_x,v_y,v_z)_{\hat{\imath}\hat{\jmath}\hat{k}}$, where we have indicated that the components are with respect to the standard orthonormal basis $\{ \hat{\imath}, \hat{\jmath}, \hat{k} \}$. The dot product is used to calculate the components of the vector with respect to this basis: \[ v_x = \vec{v}\cdot \hat{\imath}, \quad v_y = \vec{v}\cdot \hat{\jmath}, \quad v_z = \vec{v} \cdot \hat{k}. \]

We can therefore write down the exact “prescription” for computing the components of a vector as follows: \[ (v_x,v_y,v_z)_{\hat{\imath}\hat{\jmath}\hat{k}} \ \Leftrightarrow \ (\vec{v}\cdot \hat{\imath})\: \hat{\imath} \ + \ (\vec{v}\cdot \hat{\jmath})\: \hat{\jmath} \ + \ (\vec{v} \cdot \hat{k})\: \hat{k}. \]
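
Here is this “prescription” carried out in NumPy for a made-up vector $\vec{v}$:

```python
import numpy as np

v = np.array([4.0, -2.0, 7.0])    # a made-up vector
ihat, jhat, khat = np.eye(3)      # the standard basis vectors i, j, k

v_x, v_y, v_z = v @ ihat, v @ jhat, v @ khat   # components via dot products
assert np.allclose(v, v_x*ihat + v_y*jhat + v_z*khat)   # reconstruction
```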

Let us consider now how this “prescription” can be applied more generally to compute the coordinates with respect to other bases. In particular we will think about an $n$-dimensional vector space $V$ and specify three different types of bases for that space: an orthonormal basis, an orthogonal basis and a generic basis. Recall that a basis for an $n$-dimensional space is any set of $n$ linearly independent vectors in that space.

Orthonormal basis

An orthonormal basis $B_{\hat{e}}=\{ \hat{e}_1, \hat{e}_2, \ldots, \hat{e}_n \}$ consists of a set of mutually orthogonal unit-length vectors: \[ \hat{e}_i \cdot \hat{e}_j = \delta_{ij}. \] The function $\delta_{ij}$ is equal to one whenever $i=j$ and equal to zero otherwise. In particular, for each $i$ we have \[ \hat{e}_i \cdot \hat{e}_i = 1 \qquad \Rightarrow \qquad \| \hat{e}_i \|^2 =1. \]

To compute the components of the vector $\vec{a}$ with respect to an orthonormal basis $B_{\hat{e}}$ we use the standard “prescription” that we used for the $\{ \hat{\imath}, \hat{\jmath}, \hat{k} \}$ basis: \[ (a_1,a_2,\ldots,a_n)_{B_{\hat{e}}} \ \Leftrightarrow \ (\vec{a}\cdot \hat{e}_1)\: \hat{e}_1 \ + \ (\vec{a}\cdot \hat{e}_2)\: \hat{e}_2 \ + \ \cdots \ + \ (\vec{a}\cdot \hat{e}_n)\: \hat{e}_n. \]
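
A quick NumPy check of this prescription, using a made-up orthonormal basis (the standard basis rotated by $45^\circ$):

```python
import numpy as np

# Made-up orthonormal basis: the standard basis rotated by 45 degrees.
t = np.deg2rad(45)
e1 = np.array([np.cos(t), np.sin(t)])
e2 = np.array([-np.sin(t), np.cos(t)])
assert np.isclose(e1 @ e2, 0.0) and np.isclose(e1 @ e1, 1.0)

a = np.array([3.0, 1.0])          # a made-up vector
a1, a2 = a @ e1, a @ e2           # components: a_i = a . e_i
assert np.allclose(a, a1*e1 + a2*e2)   # the prescription reconstructs a
```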

Orthogonal basis

With appropriate normalization factors, you can use a basis of unnormalized vectors as well. Consider a basis which is orthogonal but not orthonormal, $B_{e}=\{ \vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n \}$; then we have \[ (b_1,b_2,\ldots,b_n)_{B_{e}} \ \Leftrightarrow \ \left(\frac{\vec{b}\cdot\vec{e}_1}{\|\vec{e}_1\|^2}\right)\vec{e}_1 \ + \ \left(\frac{\vec{b}\cdot\vec{e}_2}{\|\vec{e}_2\|^2}\right)\vec{e}_2 \ + \ \cdots \ + \ \left(\frac{\vec{b}\cdot\vec{e}_n}{\|\vec{e}_n\|^2}\right)\vec{e}_n. \]

In order to find the coefficients of some vector $\vec{b}$ with respect to the basis $\{ \vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n \}$ we proceed as follows: \[ b_1 = \frac{ \vec{b} \cdot \vec{e}_1 }{ \|\vec{e}_1\|^2 }, \quad b_2 = \frac{ \vec{b} \cdot \vec{e}_2 }{ \|\vec{e}_2\|^2 }, \quad \cdots, \quad b_n = \frac{ \vec{b} \cdot \vec{e}_n }{ \|\vec{e}_n\|^2 }. \]

Observe that each of the coefficients can be computed independently of the coefficients for the other basis vectors. To compute $b_1$, all we need to know is $\vec{b}$ and $\vec{e}_1$; we do not need to know what $\vec{e}_2$ and $\vec{e}_3$ are. This is because the computation of the coefficient corresponds to an orthogonal projection. The coefficient $b_1$ measures how much of $\vec{b}$ lies in the $\vec{e}_1$ direction, and because the basis is orthogonal, the component $b_1\vec{e}_1$ does not depend on the other dimensions.
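
Here is a NumPy sketch of this procedure for a made-up orthogonal (but not orthonormal) basis; note the division by the squared lengths:

```python
import numpy as np

# Made-up orthogonal (but not orthonormal) basis of the plane.
e1 = np.array([2.0, 0.0])         # ||e1|| = 2
e2 = np.array([0.0, 3.0])         # ||e2|| = 3
b = np.array([4.0, 6.0])

b1 = (b @ e1) / (e1 @ e1)         # divide by ||e1||^2, not ||e1||
b2 = (b @ e2) / (e2 @ e2)
assert np.allclose(b, b1*e1 + b2*e2)   # b = 2 e1 + 2 e2
```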

Generic basis

What if we have a generic basis $\{ \vec{f}_1, \vec{f}_2, \vec{f}_3 \}$ for that space? To find the coordinates $(a_1,a_2,a_3)$ of some vector $\vec{a}$ with respect to this basis we need to solve the equation \[ a_1\vec{f}_1+ a_2\vec{f}_2+ a_3\vec{f}_3 = \vec{a}, \] for the three unknowns $a_1$, $a_2$, and $a_3$. Because the vectors $\{ \vec{f}_i \}$ are not orthogonal, the coefficients $a_1$, $a_2$, and $a_3$ cannot be computed independently; they must be determined simultaneously.

Example

Express the vector $\vec{v}=(5,6)_{\hat{\imath}\hat{\jmath}}$ in terms of the basis $B_f = \{ \vec{f}_1, \vec{f}_2 \}$ where $\vec{f}_1 = (1,1)_{\hat{\imath}\hat{\jmath}}$ and $\vec{f}_2 = (3,0)_{\hat{\imath}\hat{\jmath}}$.

We are looking for the coefficients $v_1$ and $v_2$ such that \[ v_1 \vec{f}_1 + v_2\vec{f}_2 = \vec{v} = (5,6)_{\hat{\imath}\hat{\jmath}}. \] To find the coefficients we need to solve the following system of equations simultaneously: \[ \begin{align*} 1v_1 + 3v_2 & = 5 \nl 1v_1 + 0v_2 & = 6. \end{align*} \]

From the second equation we find that $v_1=6$ and substituting into the first equation we find that $v_2 = \frac{-1}{3}$. Thus, the vector $\vec{v}$ written with respect to the basis $\{ \vec{f}_1, \vec{f}_2 \}$ is \[ \vec{v} = 6\vec{f}_1 - \frac{1}{3}\vec{f}_2 = \left(6,\tfrac{-1}{3}\right)_{B_f}. \]
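
Finding the coefficients with respect to a generic basis is just a linear-systems problem, so we can double-check the above calculation with NumPy's solver:

```python
import numpy as np

# The columns of F are the basis vectors f1 = (1,1) and f2 = (3,0).
F = np.array([[1.0, 3.0],
              [1.0, 0.0]])
v = np.array([5.0, 6.0])

coeffs = np.linalg.solve(F, v)    # solve F @ coeffs = v
print(coeffs)                     # [ 6.         -0.33333333]  i.e. (6, -1/3)_{B_f}
```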

Change of basis

We often identify a vector $\vec{v}$ with its components in a certain basis $(v_x,v_y,v_z)$. This is fine for the most part, but it is important to always keep in mind the basis with respect to which the coefficients are taken, and if necessary specify the basis as a subscript $\vec{v}=(v_x,v_y,v_z)_{\hat{\imath}\hat{\jmath}\hat{k}}$.

When performing vector arithmetic operations like $\vec{u}+\vec{v}$, we don't really care which basis the vectors are expressed in, so long as the same basis is used for both $\vec{u}$ and $\vec{v}$.

We sometimes need to use two different bases. Consider for example the basis $B_e=\{ \hat{e}_1, \hat{e}_2, \ldots, \hat{e}_n \}$ and another basis $B_f=\{ \hat{f}_1, \hat{f}_2, \ldots, \hat{f}_n \}$. Suppose we are given the coordinates $v_1,v_2,v_3$ of some vector $\vec{v}$ in terms of the basis $B_e$ (we take $n=3$ to keep the equations concrete): \[ \vec{v} = \left( v_1 , v_2 , v_3 \right)_{ B_e } = v_1 \hat{e}_1 + v_2 \hat{e}_2 + v_3 \hat{e}_3. \] How can we find the coefficients of $\vec{v}$ in terms of the basis $B_f$?

This is called a change-of-basis transformation and can be performed as a matrix multiplication: \[ \left[ \begin{array}{c} v_1^\prime \nl v_2^\prime \nl v_3^\prime \end{array} \right]_{ B_f } = \underbrace{ \left[ \begin{array}{ccc} \hat{f}_1 \cdot \hat{e}_1 & \hat{f}_1 \cdot \hat{e}_2 & \hat{f}_1 \cdot \hat{e}_3 \nl \hat{f}_2 \cdot \hat{e}_1 & \hat{f}_2 \cdot \hat{e}_2 & \hat{f}_2 \cdot \hat{e}_3 \nl \hat{f}_3 \cdot \hat{e}_1 & \hat{f}_3 \cdot \hat{e}_2 & \hat{f}_3 \cdot \hat{e}_3 \end{array} \right] }_{ _{B_f}[I]_{B_e} } \left[ \begin{array}{c} v_1 \nl v_2 \nl v_3 \end{array} \right]_{ B_e }. \] Each of the entries in the “change of basis matrix” describes how each of the $\hat{e}$ basis vectors transforms in terms of the $\hat{f}$ basis.

Note that the matrix doesn't actually do anything, since it doesn't move the vector. The change of basis acts like the identity transformation which is why we use the notation $_{B_f}[I]_{B_e}$. This matrix contains the information about how each of the vectors of the old basis ($B_e$) is expressed in terms of the new basis ($B_f$).

For example, the vector $\hat{e}_1$ gets mapped to \[ \hat{e}_1 = (\hat{f}_1 \cdot \hat{e}_1)\:\hat{f}_1 + (\hat{f}_2 \cdot \hat{e}_1)\:\hat{f}_2 + (\hat{f}_3 \cdot \hat{e}_1)\:\hat{f}_3, \] which is just the generic formula for expressing any vector in terms of the basis $B_f$.

The change of basis operation does not change the vector. The vector $\vec{v}$ stays the same, but we have now expressed it in terms of another basis: \[ \left( v_1^\prime , v_2^\prime , v_3^\prime \right)_{ B_f } = v_1^\prime \: \hat{f}_1 + v_2^\prime \: \hat{f}_2 + v_3^\prime \: \hat{f}_3 = \vec{v} = v_1 \:\hat{e}_1 + v_2 \: \hat{e}_2 + v_3 \: \hat{e}_3 = \left( v_1 , v_2 , v_3 \right)_{ B_e }. \]
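
Here is the whole change-of-basis computation carried out in NumPy, for a made-up pair of orthonormal bases of the plane: $B_e$ is the standard basis and $B_f$ is the standard basis rotated by $30^\circ$:

```python
import numpy as np

# Two made-up orthonormal bases of the plane.
t = np.deg2rad(30)
e = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
f = [np.array([np.cos(t), np.sin(t)]), np.array([-np.sin(t), np.cos(t)])]

# Entry (i, j) of the change of basis matrix is f_i . e_j.
I_fe = np.array([[fi @ ej for ej in e] for fi in f])

v_e = np.array([2.0, 1.0])        # coordinates of v with respect to B_e
v_f = I_fe @ v_e                  # coordinates of the same v with respect to B_f

# Same vector either way: the coordinates times the basis vectors agree.
assert np.allclose(v_e[0]*e[0] + v_e[1]*e[1],
                   v_f[0]*f[0] + v_f[1]*f[1])
```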

Matrix components

So far we have spoken in very mathematical terms about different representations of vectors. What about representations of linear transformations \[ T_A : \mathbb{R}^n \to \mathbb{R}^n? \] Recall that each linear transformation can be represented as a matrix with respect to some basis. The matrix of $T_A$ with respect to the basis $B_{e}=\{ \vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n \}$ is given by \[ \ _{B_e}[A]_{B_e} = \begin{bmatrix} | & | & & | \nl T_A(\vec{e}_1) & T_A(\vec{e}_2) & \dots & T_A(\vec{e}_n) \nl | & | & & | \end{bmatrix}, \] where we assume that the outputs $T_A(\vec{e}_j)$ are given to us as column vectors with respect to $B_{e}$.

The action of $T_A$ on any vector $\vec{v}$ is the same as the matrix-vector multiplication by $\ _{B_e}[A]_{B_e}$ of the coefficients vector $(v_1,v_2,\ldots,v_n)_{B_{e}}$ expressed in the basis $B_e$.
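
To make this concrete, here is a small NumPy sketch that builds the matrix of a made-up linear transformation (reflection through the line $y=x$) from its action on the standard basis vectors:

```python
import numpy as np

# A made-up linear transformation: reflection through the line y = x.
def T(v):
    return np.array([v[1], v[0]])

# The columns of the matrix are the outputs of T on the basis vectors.
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
A = np.column_stack([T(e1), T(e2)])   # [[0, 1], [1, 0]]

v = np.array([3.0, 5.0])
assert np.allclose(A @ v, T(v))   # the matrix product reproduces T's action
```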

A lot of mathematical buzz comes from this kind of parallel structure between worlds. The mathematical term for a one-to-one correspondence between two mathematical objects is an isomorphism. It's the same thing. Everything you know about matrices can be applied to linear transformations, and everything you know about linear transformations can be applied to matrices.

In this case, we can say more precisely that the abstract concept of some linear transformation is represented as the concrete matrix of coefficients with respect to some basis. The matrix $\ _{B_{e}}[A]_{B_{e}}$ is the representation of $T_A$ with respect to the basis $B_{e}$.

What would be the representation of $T_A$ with respect to some other basis $B_{f}$?

Change of basis for matrices

Recall the change of basis matrix $\ _{B_f}[I]_{B_e}$, which can be used to convert a coefficient vector $[\vec{v}]_{B_e}$ into a coefficient vector in a different basis $[\vec{v}]_{B_f}$: \[ [\vec{v}]_{B_f} = \ _{B_f}[I]_{B_e} \ [\vec{v}]_{B_e}. \]

Suppose now that you are given the representation $\ _{B_{e}}[A]_{B_{e}}$ of the linear transformation $T_A$ with respect to $B_e$ and you are asked to find the matrix $\ _{B_{f}}[A]_{B_{f}}$ which is the representation of $T_A$ with respect to the basis $B_f$.

The answer is very straightforward: \[ \ _{B_f}[A]_{B_f} = \ _{B_f}[I]_{B_e} \ _{B_e}[A]_{B_e} \ _{B_e}[I]_{B_f}, \] where $\ _{B_e}[I]_{B_f}$ is the inverse matrix of $\ _{B_f}[I]_{B_e}$ and corresponds to the change of basis from the $B_f$ basis to the $B_e$ basis.

The interpretation of the above three-matrix sandwich is also straightforward. Imagine an input vector $[\vec{v}]_{B_f}$ multiplying the sandwich from the right. In the first step $\ _{B_e}[I]_{B_f}$ will convert it to the $B_e$ basis so that the $\ _{B_e}[A]_{B_e}$ matrix can be applied. In the last step the matrix $\ _{B_f}[I]_{B_e} $ converts the output of $T_A$ to the $B_f$ basis.
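
The following NumPy sketch illustrates the three-matrix sandwich, using made-up matrices for $\ _{B_e}[A]_{B_e}$ and $\ _{B_f}[I]_{B_e}$:

```python
import numpy as np

A_e = np.array([[2.0, 1.0],       # made-up matrix of T_A in the B_e basis
                [0.0, 3.0]])
P = np.array([[1.0, 1.0],         # made-up change of basis matrix _{B_f}[I]_{B_e}
              [0.0, 1.0]])
P_inv = np.linalg.inv(P)          # _{B_e}[I]_{B_f}

A_f = P @ A_e @ P_inv             # the three-matrix sandwich

v_f = np.array([1.0, 2.0])        # a vector given in B_f coordinates
# Convert to B_e, apply A, convert back -- same as applying A_f directly.
assert np.allclose(A_f @ v_f, P @ (A_e @ (P_inv @ v_f)))
```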

A transformation of the form \[ A \to P A P^{-1}, \] where $P$ is any invertible matrix, is called a similarity transformation.

The similarity transformation $A^\prime = P A P^{-1}$ leaves many of the properties of the matrix $A$ unchanged:

  • Trace: $\textrm{Tr}\!\left( A^\prime \right) = \textrm{Tr}\!\left( A \right)$.
  • Determinant: $\textrm{det}\!\left( A^\prime \right) = \textrm{det}\!\left( A \right)$.
  • Rank: $\textrm{rank}\!\left( A^\prime \right) = \textrm{rank}\!\left( A \right)$.
  • Eigenvalues: $\textrm{eig}\!\left( A^\prime \right) = \textrm{eig}\!\left( A \right)$.

In some sense, the basis invariant properties like the trace, the determinant, the rank and the eigenvalues are the only true properties of matrices. Everything else is maya—just one representation out of many.
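
We can spot-check these invariance properties numerically, again with made-up matrices $A$ and $P$:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])        # made-up matrix
P = np.array([[1.0, 2.0],
              [0.0, 1.0]])        # made-up invertible matrix
A_prime = P @ A @ np.linalg.inv(P)

assert np.isclose(np.trace(A_prime), np.trace(A))
assert np.isclose(np.linalg.det(A_prime), np.linalg.det(A))
assert np.linalg.matrix_rank(A_prime) == np.linalg.matrix_rank(A)
assert np.allclose(np.sort(np.linalg.eigvals(A_prime)),
                   np.sort(np.linalg.eigvals(A)))
```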

Links

[ Change of basis explained. ]
http://planetmath.org/ChangeOfBases.html

[ Change of basis example by Salman Khan. ]
http://www.youtube.com/watch?v=meibWcbGqt4

 