In the physics chapter we learned how to work with vectors in terms of their components. We can decompose a force \vec{F} in terms of its x and y components: F_x = \|\vec{F}\|\cos\theta, \quad F_y = \|\vec{F}\|\sin\theta, where \theta is the angle that the vector \vec{F} makes with the x axis. We can write the vector \vec{F} in the following equivalent ways: \vec{F} = F_x\hat{\imath} + F_y \hat{\jmath} = (F_x,F_y)_{\hat{\imath}\hat{\jmath}}, in which the vector is expressed as components or coordinates with respect to the basis \{ \hat{\imath}, \hat{\jmath} \} (the xy coordinate system).
The number F_x (the first coordinate of \vec{F}) corresponds to the length of the projection of the vector \vec{F} onto the x axis. In the last section we formalized the notion of projection and saw that the projection operation on a vector can be represented as a matrix product: F_x\:\hat{\imath} = \Pi_x(\vec{F}) = (\vec{F} \cdot \hat{\imath})\:\hat{\imath} = \underbrace{\hat{\imath}\,\hat{\imath}^T}_{M_x} \ \vec{F}, where M_x is called “the projection matrix onto the x axis.”
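As a quick numerical check, here is a minimal sketch in Python with NumPy (the language used for all code sketches in this section); the force vector (3, 4) is an arbitrary choice for illustration:

    import numpy as np

    # the basis vector i-hat as a column vector
    ihat = np.array([[1.0], [0.0]])

    # projection matrix onto the x axis: the outer product M_x = ihat ihat^T
    M_x = ihat @ ihat.T            # [[1, 0], [0, 0]]

    F = np.array([[3.0], [4.0]])   # an arbitrary force vector
    print(M_x @ F)                 # [[3.], [0.]], i.e. F_x times i-hat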
In this section we will discuss in detail the relationship between vectors \vec{v} (directions in space) and their representation in terms of coordinates with respect to a basis.
We will discuss the three “quality grades” that exist for bases. For an n-dimensional vector space V, you could have a:
generic basis \{ \vec{f}_1, \vec{f}_2, \ldots, \vec{f}_n \}, which consists of any set of n linearly independent vectors in V;
orthogonal basis \{ \vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n \}, which consists of n mutually orthogonal vectors in V: \vec{e}_i \cdot \vec{e}_j = 0 for all i \neq j;
orthonormal basis \{ \hat{e}_1, \hat{e}_2, \ldots, \hat{e}_n \}, which is an orthogonal basis of unit-length vectors: \| \hat{e}_i \|^2 =1, \ \forall i \in \{ 1,2,\ldots,n\}.
The main idea is quite simple. Given a basis B_e = \{ \vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n \}, any vector \vec{v} can be written as a linear combination of the basis vectors:
\vec{v} = v_1 \vec{e}_1 + v_2\vec{e}_2 + \cdots + v_n\vec{e}_n = (v_1, v_2, \ldots, v_n)_{B_e}.
However, things can get confusing when we use multiple bases, so we introduce some notation:
[\vec{v}]_{B_e} denotes the vector \vec{v} expressed in terms of the basis B_e.
[\vec{v}]_{B_f} denotes the same vector \vec{v} expressed in terms of the basis B_f.
\ _{B_f}[I]_{B_e} denotes the change-of-basis matrix from the B_e basis to the B_f basis: [\vec{v}]_{B_f} = \ _{B_f}[I]_{B_e}[\vec{v}]_{B_e}.
The notion of “how much of a vector is in a given direction” is what we call the components of the vector \vec{v}=(v_x,v_y,v_z)_{\hat{\imath}\hat{\jmath}\hat{k}}, where we have indicated that the components are with respect to the standard orthonormal basis \{ \hat{\imath}, \hat{\jmath}, \hat{k} \}. The dot product is used to calculate the components of the vector with respect to this basis: v_x = \vec{v}\cdot \hat{\imath}, \quad v_y = \vec{v}\cdot \hat{\jmath}, \quad v_z = \vec{v} \cdot \hat{k}.
We can therefore write down the exact “prescription” for computing the components of a vector as follows: (v_x,v_y,v_z)_{\hat{\imath}\hat{\jmath}\hat{k}} \ \Leftrightarrow \ (\vec{v}\cdot \hat{\imath})\: \hat{\imath} \ + \ (\vec{v}\cdot \hat{\jmath})\: \hat{\jmath} \ + \ (\vec{v} \cdot \hat{k})\: \hat{k}.
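To make the prescription concrete, here is a short sketch; the vector (2, -1, 5) is an arbitrary example:

    import numpy as np

    # the standard orthonormal basis of R^3
    ihat = np.array([1.0, 0.0, 0.0])
    jhat = np.array([0.0, 1.0, 0.0])
    khat = np.array([0.0, 0.0, 1.0])

    v = np.array([2.0, -1.0, 5.0])   # an arbitrary vector

    # each component is the dot product with the corresponding basis vector
    vx, vy, vz = v @ ihat, v @ jhat, v @ khat

    # recombining the components recovers the original vector
    assert np.allclose(vx*ihat + vy*jhat + vz*khat, v)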
Let us consider now how this “prescription” can be applied more generally to compute the coordinates with respect to other bases. In particular we will think about an n-dimensional vector space V and specify three different types of bases for that space: an orthonormal basis, an orthogonal basis and a generic basis. Recall that a basis for an n-dimensional space is any set of n linearly independent vectors in that space.
An orthonormal basis B_{\hat{e}}=\{ \hat{e}_1, \hat{e}_2, \ldots, \hat{e}_n \} consists of a set of mutually orthogonal unit-length vectors: \hat{e}_i \cdot \hat{e}_j = \delta_{ij}. The Kronecker delta \delta_{ij} is equal to one whenever i=j and equal to zero otherwise. In particular, for each i we have: \hat{e}_i \cdot \hat{e}_i = 1 \qquad \Rightarrow \qquad \| \hat{e}_i \|^2 =1.
To compute the components of the vector \vec{a} with respect to an orthonormal basis B_{\hat{e}} we use the standard “prescription” that we used for the \{ \hat{\imath}, \hat{\jmath}, \hat{k} \} basis: (a_1,a_2,\ldots,a_n)_{B_{\hat{e}}} \ \Leftrightarrow \ (\vec{a}\cdot \hat{e}_1)\: \hat{e}_1 \ + \ (\vec{a}\cdot \hat{e}_2)\: \hat{e}_2 \ + \ \cdots \ + \ (\vec{a}\cdot \hat{e}_n)\: \hat{e}_n.
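The same dot-product prescription works for any orthonormal basis, not just the standard one. A sketch, using the standard basis of \mathbb{R}^2 rotated by 45 degrees as an arbitrary example:

    import numpy as np

    # an orthonormal basis: the standard basis rotated by 45 degrees
    t = np.pi / 4
    e1 = np.array([np.cos(t), np.sin(t)])
    e2 = np.array([-np.sin(t), np.cos(t)])

    a = np.array([5.0, 6.0])       # an arbitrary vector

    a1, a2 = a @ e1, a @ e2        # coefficients with respect to {e1, e2}
    assert np.allclose(a1*e1 + a2*e2, a)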
With appropriate normalization factors, you can use unnormalized vectors as a basis as well. Consider a basis B_{e}=\{ \vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n \} which is orthogonal but not orthonormal. Then we have (b_1,b_2,\ldots,b_n)_{B_{e}} \ \Leftrightarrow \ \left(\frac{\vec{b}\cdot\vec{e}_1}{\|\vec{e}_1\|^2}\right)\vec{e}_1 \ + \ \left(\frac{\vec{b}\cdot\vec{e}_2}{\|\vec{e}_2\|^2}\right)\vec{e}_2 \ + \ \cdots \ + \ \left(\frac{\vec{b}\cdot\vec{e}_n}{\|\vec{e}_n\|^2}\right)\vec{e}_n.
In order to find the coefficients of some vector \vec{b} with respect to the basis \{ \vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n \} we proceed as follows: b_1 = \frac{ \vec{b} \cdot \vec{e}_1 }{ \|\vec{e}_1\|^2 }, \quad b_2 = \frac{ \vec{b} \cdot \vec{e}_2 }{ \|\vec{e}_2\|^2 }, \quad \ldots, \quad b_n = \frac{ \vec{b} \cdot \vec{e}_n }{ \|\vec{e}_n\|^2 }.
Observe that each of the coefficients can be computed independently of the coefficients for the other basis vectors. To compute b_1, all we need to know is \vec{b} and \vec{e}_1; we do not need to know what \vec{e}_2 and \vec{e}_3 are. This is because the computation of the coefficient corresponds to an orthogonal projection: b_1 measures how much of \vec{b} lies in the \vec{e}_1 direction, and because the basis is orthogonal, the component b_1\vec{e}_1 does not depend on the other dimensions.
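Here is a sketch of this computation for an arbitrary orthogonal (but not orthonormal) basis of \mathbb{R}^2; note the division by the squared lengths:

    import numpy as np

    # an orthogonal basis whose vectors have lengths 2 and 3
    e1 = np.array([2.0, 0.0])
    e2 = np.array([0.0, 3.0])

    b = np.array([4.0, 9.0])       # an arbitrary vector

    # each coefficient needs only b and the corresponding basis vector
    b1 = (b @ e1) / (e1 @ e1)      # dividing by ||e1||^2
    b2 = (b @ e2) / (e2 @ e2)      # dividing by ||e2||^2

    assert np.allclose(b1*e1 + b2*e2, b)
    print(b1, b2)                  # 2.0 3.0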
What if we have a generic basis \{ \vec{f}_1, \vec{f}_2, \vec{f}_3 \} for the space? To find the coordinates (a_1,a_2,a_3) of some vector \vec{a} with respect to this basis we need to solve the equation a_1\vec{f}_1+ a_2\vec{f}_2+ a_3\vec{f}_3 = \vec{a} for the three unknowns a_1, a_2 and a_3. Because the vectors \{ \vec{f}_i \} are not orthogonal, the coefficients a_1, a_2 and a_3 must be computed simultaneously.
Express the vector \vec{v}=(5,6)_{\hat{\imath}\hat{\jmath}} in terms of the basis B_f = \{ \vec{f}_1, \vec{f}_2 \} where \vec{f}_1 = (1,1)_{\hat{\imath}\hat{\jmath}} and \vec{f}_2 = (3,0)_{\hat{\imath}\hat{\jmath}}.
We are looking for the coefficients v_1 and v_2 such that v_1 \vec{f}_1 + v_2\vec{f}_2 = \vec{v} = (5,6)_{\hat{\imath}\hat{\jmath}}. To find the coefficients we need to solve the following system of equations simultaneously: \begin{align*} 1v_1 + 3v_2 & = 5 \nl 1v_1 + 0 \ & = 6. \end{align*}
From the second equation we find that v_1=6 and substituting into the first equation we find that v_2 = \frac{-1}{3}. Thus, the vector \vec{v} written with respect to the basis \{ \vec{f}_1, \vec{f}_2 \} is \vec{v} = 6\vec{f}_1 - \frac{1}{3}\vec{f}_2 = \left(6,\tfrac{-1}{3}\right)_{B_f}.
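The same answer can be obtained numerically by putting the basis vectors as the columns of a matrix and solving the linear system:

    import numpy as np

    # columns are the basis vectors f1 = (1, 1) and f2 = (3, 0)
    F = np.array([[1.0, 3.0],
                  [1.0, 0.0]])
    v = np.array([5.0, 6.0])

    # solve F [v1, v2]^T = v for both coefficients simultaneously
    print(np.linalg.solve(F, v))   # [ 6.         -0.33333333]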
We often identify a vector \vec{v} with its components in a certain basis (v_x,v_y,v_z). This is fine for the most part, but it is important to always keep in mind the basis with respect to which the coefficients are taken, and if necessary specify the basis as a subscript \vec{v}=(v_x,v_y,v_z)_{\hat{\imath}\hat{\jmath}\hat{k}}.
When performing vector arithmetic operations like \vec{u}+\vec{v}, we don't really care which basis the vectors are expressed in, so long as the same basis is used for both \vec{u} and \vec{v}.
We sometimes need to use two different bases. Consider for example the basis B_e=\{ \hat{e}_1, \hat{e}_2, \hat{e}_3 \} and another basis B_f=\{ \hat{f}_1, \hat{f}_2, \hat{f}_3 \}. Suppose we are given the coordinates v_1,v_2,v_3 of some \vec{v} in terms of the basis B_e: \vec{v} = \left( v_1 , v_2 , v_3 \right)_{ B_e } = v_1 \hat{e}_1 + v_2 \hat{e}_2 + v_3 \hat{e}_3. How can we find the coefficients of \vec{v} in terms of the basis B_f?
This is called a change-of-basis transformation and can be performed as a matrix multiplication: \left[ \begin{array}{c} v_1^\prime \nl v_2^\prime \nl v_3^\prime \end{array} \right]_{ B_f } = \underbrace{ \left[ \begin{array}{ccc} \hat{f}_1 \cdot \hat{e}_1 & \hat{f}_1 \cdot \hat{e}_2 & \hat{f}_1 \cdot \hat{e}_3 \nl \hat{f}_2 \cdot \hat{e}_1 & \hat{f}_2 \cdot \hat{e}_2 & \hat{f}_2 \cdot \hat{e}_3 \nl \hat{f}_3 \cdot \hat{e}_1 & \hat{f}_3 \cdot \hat{e}_2 & \hat{f}_3 \cdot \hat{e}_3 \end{array} \right] }_{ _{B_f}[I]_{B_e} } \left[ \begin{array}{c} v_1 \nl v_2 \nl v_3 \end{array} \right]_{ B_e }. Each of the entries in the “change of basis matrix” describes how each of the \hat{e} basis vectors transforms in terms of the \hat{f} basis.
Note that the matrix doesn't actually do anything, since it doesn't move the vector. The change of basis acts like the identity transformation, which is why we use the notation _{B_f}[I]_{B_e}. This matrix contains the information about how each of the vectors of the old basis (B_e) is expressed in terms of the new basis (B_f).
For example, the vector \hat{e}_1 gets expressed as: \hat{e}_1 = (\hat{f}_1 \cdot \hat{e}_1)\:\hat{f}_1 + (\hat{f}_2 \cdot \hat{e}_1)\:\hat{f}_2 + (\hat{f}_3 \cdot \hat{e}_1)\:\hat{f}_3, which is just the generic formula for expressing any vector in terms of the basis B_f.
The change of basis operation does not change the vector. The vector \vec{v} stays the same, but we have now expressed it in terms of another basis: \left( v_1^\prime , v_2^\prime , v_3^\prime \right)_{ B_f } = v_1^\prime \: \hat{f}_1 + v_2^\prime \: \hat{f}_2 + v_3^\prime \: \hat{f}_3 = \vec{v} = v_1 \:\hat{e}_1 + v_2 \: \hat{e}_2 + v_3 \: \hat{e}_3 = \left( v_1 , v_2 , v_3 \right)_{ B_e }.
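A sketch of the whole procedure, taking B_e to be the standard basis of \mathbb{R}^2 and B_f the standard basis rotated by 30 degrees (both choices arbitrary):

    import numpy as np

    # old basis B_e and new basis B_f (both orthonormal)
    e = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
    t = np.pi / 6
    f = [np.array([np.cos(t), np.sin(t)]), np.array([-np.sin(t), np.cos(t)])]

    # change-of-basis matrix: entry (i, j) is f_i . e_j
    I_fe = np.array([[fi @ ej for ej in e] for fi in f])

    v_e = np.array([5.0, 6.0])     # coefficients of v in the B_e basis
    v_f = I_fe @ v_e               # coefficients of the same v in the B_f basis

    # the underlying vector is unchanged
    assert np.allclose(v_f[0]*f[0] + v_f[1]*f[1],
                       v_e[0]*e[0] + v_e[1]*e[1])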
So far we have spoken in very mathematical terms about different representations of vectors. What about representations of linear transformations T_A : \mathbb{R}^n \to \mathbb{R}^n? Recall that each linear transformation can be represented as a matrix with respect to some basis. The matrix of T_A with respect to the basis B_{e}=\{ \vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n \} is given by: \ _{B_e}[A]_{B_e} = \begin{bmatrix} | & | &  & | \nl T_A(\vec{e}_1) & T_A(\vec{e}_2) & \cdots & T_A(\vec{e}_n) \nl | & | &  & | \end{bmatrix}, where we assume that the outputs T_A(\vec{e}_j) are given to us as column vectors with respect to B_{e}.
The action of T_A on any vector \vec{v} is the same as the matrix-vector multiplication by \ _{B_e}[A]_{B_e} of the coefficients vector (v_1,v_2,\ldots,v_n)_{B_{e}} expressed in the basis B_e.
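For instance (an arbitrary choice of transformation), the matrix of a 90-degree counterclockwise rotation can be built column by column from the outputs T_A(\vec{e}_j):

    import numpy as np

    def T(v):                       # rotation by 90 degrees counterclockwise
        return np.array([-v[1], v[0]])

    e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

    # the j-th column of the matrix is T(e_j)
    A = np.column_stack([T(e1), T(e2)])    # [[0, -1], [1, 0]]

    v = np.array([3.0, 4.0])
    assert np.allclose(A @ v, T(v))        # multiplying by A = applying T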
A lot of mathematical buzz comes from this kind of parallel structure between worlds. The mathematical term for a one-to-one correspondence between two mathematical objects is isomorphism: it's the same thing. Everything you know about matrices can be applied to linear transformations, and everything you know about linear transformations can be applied to matrices.
In this case, we can say more precisely that the abstract concept of some linear transformation is represented as the concrete matrix of coefficients with respect to some basis. The matrix \ _{B_{e}}[A]_{B_{e}} is the representation of T_A with respect to the basis B_{e}.
What would be the representation of T_A with respect to some other basis B_{f}?
Recall the change-of-basis matrix \ _{B_f}[I]_{B_e}, which can be used to transform a coefficient vector [\vec{v}]_{B_e} into a coefficient vector with respect to a different basis [\vec{v}]_{B_f}: [\vec{v}]_{B_f} = \ _{B_f}[I]_{B_e} \ [\vec{v}]_{B_e}.
Suppose now that you are given the representation \ _{B_{e}}[A]_{B_{e}} of the linear transformation T_A with respect to B_e and you are asked to find the matrix \ _{B_{f}}[A]_{B_{f}} which is the representation of T_A with respect to the basis B_f.
The answer is very straightforward: \ _{B_f}[A]_{B_f} = \ _{B_f}[I]_{B_e} \ _{B_e}[A]_{B_e} \ _{B_e}[I]_{B_f}, where \ _{B_e}[I]_{B_f} is the inverse matrix of \ _{B_f}[I]_{B_e} and corresponds to the change of basis from the B_f basis to the B_e basis.
The interpretation of the above three-matrix sandwich is also straightforward. Imagine an input vector [\vec{v}]_{B_f} multiplying the sandwich from the right. In the first step \ _{B_e}[I]_{B_f} will convert it to the B_e basis so that the \ _{B_e}[A]_{B_e} matrix can be applied. In the last step the matrix \ _{B_f}[I]_{B_e} converts the output of T_A to the B_f basis.
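A sketch of the sandwich, reusing the basis B_f = \{(1,1), (3,0)\} from the earlier example and an arbitrary matrix A_e:

    import numpy as np

    A_e = np.array([[2.0, 1.0],
                    [0.0, 3.0]])        # representation of T_A in B_e (arbitrary)

    # columns of P_ef are the f vectors written in the B_e basis,
    # so P_ef converts B_f coefficients to B_e coefficients
    P_ef = np.array([[1.0, 3.0],
                     [1.0, 0.0]])       # _{B_e}[I]_{B_f}
    P_fe = np.linalg.inv(P_ef)          # _{B_f}[I]_{B_e}

    # convert to B_e, apply the matrix, convert back to B_f
    A_f = P_fe @ A_e @ P_ef

    v_f = np.array([6.0, -1.0/3.0])     # some input vector in the B_f basis
    assert np.allclose(A_f @ v_f, P_fe @ (A_e @ (P_ef @ v_f)))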
A transformation of the form A \to P A P^{-1}, where P is any invertible matrix, is called a similarity transformation.
The similarity transformation A^\prime = P A P^{-1} leaves many of the properties of the matrix A unchanged:
the trace: \textrm{Tr}(A^\prime) = \textrm{Tr}(A);
the determinant: \det(A^\prime) = \det(A);
the rank;
the eigenvalues.
In some sense, the basis invariant properties like the trace, the determinant, the rank and the eigenvalues are the only true properties of matrices. Everything else is maya—just one representation out of many.
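A quick numerical confirmation of these invariants, using an arbitrary matrix and an arbitrary invertible P:

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [0.0, 3.0]])          # an arbitrary matrix
    P = np.array([[1.0, 3.0],
                  [1.0, 0.0]])          # an arbitrary invertible matrix

    A_prime = P @ A @ np.linalg.inv(P)  # similarity transformation

    assert np.isclose(np.trace(A_prime), np.trace(A))
    assert np.isclose(np.linalg.det(A_prime), np.linalg.det(A))
    assert np.allclose(np.sort(np.linalg.eigvals(A_prime)),
                       np.sort(np.linalg.eigvals(A)))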
[ Change of basis explained. ]
http://planetmath.org/ChangeOfBases.html
[ Change of basis example by Salman Khan. ]
http://www.youtube.com/watch?v=meibWcbGqt4