
The page you are reading is part of a draft (v2.0) of the "No bullshit guide to math and physics."

The text has since gone through many edits and is now available in print and electronic format. The current edition of the book is v4.0, which is a substantial improvement in terms of content and language (I hired a professional editor) from the draft version.

I'm leaving the old wiki content up for the time being, but I highly encourage you to check out the finished book. You can check out an extended preview here (PDF, 106 pages, 5MB).


Gram-Schmidt orthogonalization

Suppose you are given a set of $n$ linearly independent vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ taken from an $n$-dimensional space $V$ and you are asked to transform them into an orthonormal basis $\{\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \ldots, \hat{\mathbf{e}}_n\}$ for which:
\[
 \langle \hat{\mathbf{e}}_i, \hat{\mathbf{e}}_j \rangle =
 \begin{cases}
  1 & \text{ if } i = j, \\
  0 & \text{ if } i \neq j.
 \end{cases}
\]
This procedure is known as orthogonalization. In this section, we'll learn an intuitive algorithm for converting any set of vectors into a set of orthonormal vectors. The algorithm is called Gram-Schmidt orthogonalization and it uses repeated projection and subtraction operations.

Definitions

  • $V$: An $n$-dimensional vector space.
  • $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$: A generic basis for the space $V$.
  • $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$: An orthogonal basis for $V$ is one which satisfies $\langle \mathbf{e}_i, \mathbf{e}_j \rangle = 0$ if $i \neq j$.
  • $\{\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \ldots, \hat{\mathbf{e}}_n\}$: An orthonormal basis for $V$ is an orthogonal basis of unit-length vectors.

We assume that the vector space $V$ is equipped with an inner product operation: $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{R}$.

The following operations are defined in terms of the inner product:

  • The length of a vector: $\|\mathbf{v}\| = \sqrt{ \langle \mathbf{v}, \mathbf{v} \rangle }$.
  • The projection operation. The projection of the vector $\mathbf{u}$ onto the subspace spanned by $\mathbf{v}$ is given by:
  \[
   \Pi_{\mathbf{v}}(\mathbf{u}) =  \frac{  \langle \mathbf{u}, \mathbf{v} \rangle }{ \|\mathbf{v}\|^2 } \mathbf{v}.
  \]
  • The projection complement of the projection $\Pi_{\mathbf{v}}(\mathbf{u})$ is the vector
  $\mathbf{w}$ that we need to add to $\Pi_{\mathbf{v}}(\mathbf{u})$
  to get back the complete original vector $\mathbf{u}$:
  \[
   \Pi_{\mathbf{v}}(\mathbf{u}) + \mathbf{w} = \mathbf{u}
   \qquad
   \textrm{or}
   \qquad
   \mathbf{w}  = \mathbf{u} - \Pi_{\mathbf{v}}(\mathbf{u}).
  \]
  Observe that the vector $\mathbf{w}$ is, by construction, orthogonal to the vector $\mathbf{v}$:
  $\langle \mathbf{u} - \Pi_{\mathbf{v}}(\mathbf{u}), \mathbf{v} \rangle = 0$.
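These two operations are easy to check numerically. The following sketch (my own illustration using NumPy, with a function name of my own choosing) computes the projection and its complement for concrete vectors in $\mathbb{R}^2$:

```python
import numpy as np

def project(u, v):
    """Projection of u onto the line spanned by v: (<u,v>/||v||^2) v."""
    return (np.dot(u, v) / np.dot(v, v)) * v

u = np.array([3.0, 4.0])
v = np.array([1.0, 0.0])

Pi = project(u, v)   # component of u along v
w = u - Pi           # projection complement

print(Pi)            # [3. 0.]
print(w)             # [0. 4.]
print(np.dot(w, v))  # 0.0 -- w is orthogonal to v, as claimed
```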

The discussion in this section is in terms of abstract vectors denoted in bold, $\mathbf{u}$, and the operations are performed in an abstract inner product space. Thus, the algorithm described below can be used with vectors $\mathbf{v} \in \mathbb{R}^n$, matrices $M \in \mathbb{R}^{m \times n}$, and polynomials $p \in P_n(x)$. Indeed, we can talk about orthogonality for any vector space for which we can define an inner product operation.
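To make the polynomial case concrete, here is a sketch (my own example, not from the text) of one possible inner product on $P_n(x)$, namely $\langle p, q \rangle = \int_{-1}^{1} p(x)\,q(x)\,dx$, evaluated with NumPy's polynomial tools:

```python
from numpy.polynomial import polynomial as P

def inner(p, q):
    """Inner product <p,q> = integral of p(x)q(x) over [-1, 1].

    Polynomials are given as coefficient arrays, lowest degree first.
    """
    prod = P.polymul(p, q)       # coefficients of the product p(x)q(x)
    antideriv = P.polyint(prod)  # coefficients of its antiderivative
    return P.polyval(1, antideriv) - P.polyval(-1, antideriv)

one = [1.0]       # p(x) = 1
x = [0.0, 1.0]    # q(x) = x

print(inner(one, x))  # 0.0 -- 1 and x are orthogonal for this inner product
print(inner(x, x))    # 0.666... = integral of x^2 from -1 to 1
```

With this inner product in hand, the same projection and Gram-Schmidt formulas apply to polynomials unchanged.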

Orthonormal bases are nice

Recall that a basis for an n-dimensional vector space V is any set of n linearly independent vectors in V. The choice of basis is a big deal because it is with respect to that basis that we write down the coordinates of vectors and matrices. From the theoretical point of view, all bases are equally good, but from a practical point of view orthogonal and orthonormal bases are much easier to work with.

An orthonormal basis is the most useful kind of basis because the coefficients $(c_1, c_2, c_3)$ of a vector $\mathbf{c}$ can be obtained simply using the inner product: $c_1 = \langle \mathbf{c}, \hat{\mathbf{e}}_1 \rangle$, $c_2 = \langle \mathbf{c}, \hat{\mathbf{e}}_2 \rangle$, $c_3 = \langle \mathbf{c}, \hat{\mathbf{e}}_3 \rangle$.

Indeed, we can write down any vector $\mathbf{v}$ as
\[
 \mathbf{v} = \langle \mathbf{v}, \hat{\mathbf{e}}_1 \rangle \hat{\mathbf{e}}_1
 + \langle \mathbf{v}, \hat{\mathbf{e}}_2 \rangle \hat{\mathbf{e}}_2
 + \langle \mathbf{v}, \hat{\mathbf{e}}_3 \rangle \hat{\mathbf{e}}_3.
\]
This formula is a generalization of the usual formula for coefficients with respect to the standard basis $\{\hat{\imath}, \hat{\jmath}, \hat{k}\}$:
\[
 \mathbf{v} = (\mathbf{v} \cdot \hat{\imath})\hat{\imath}
 + (\mathbf{v} \cdot \hat{\jmath})\hat{\jmath}
 + (\mathbf{v} \cdot \hat{k})\hat{k}.
\]
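We can verify this reconstruction formula numerically. In the sketch below (my own example basis, not from the text), the orthonormal basis is the standard basis of $\mathbb{R}^3$ rotated by $45^\circ$ in the $xy$-plane:

```python
import numpy as np

# A hypothetical orthonormal basis for R^3:
s = 1.0 / np.sqrt(2)
e1 = np.array([ s,   s,   0.0])
e2 = np.array([-s,   s,   0.0])
e3 = np.array([ 0.0, 0.0, 1.0])

v = np.array([1.0, 2.0, 3.0])

# The coefficients are just inner products with the basis vectors:
c1, c2, c3 = np.dot(v, e1), np.dot(v, e2), np.dot(v, e3)

# Rebuild v from its coefficients:
v_rebuilt = c1 * e1 + c2 * e2 + c3 * e3
print(np.allclose(v, v_rebuilt))  # True
```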

Orthogonalization

As we said earlier, the “best” kind of basis for computational purposes is an orthonormal one like $\{\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \ldots, \hat{\mathbf{e}}_n\}$. A common task in linear algebra is to upgrade some general set of $n$ linearly independent vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ into an orthonormal basis $\{\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \ldots, \hat{\mathbf{e}}_n\}$, where the vectors $\{\hat{\mathbf{e}}_i\}$ are all formed as linear combinations of the vectors $\{\mathbf{v}_i\}$. Note that the vector space spanned by both these sets of vectors is the same:
\[
 V \equiv \operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}
 = \operatorname{span}\{\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \ldots, \hat{\mathbf{e}}_n\},
\]
but the basis $\{\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \ldots, \hat{\mathbf{e}}_n\}$ is easier to work with since we can compute the vector coefficients using the inner product: $u_i = \langle \mathbf{u}, \hat{\mathbf{e}}_i \rangle$.

The technical term for distilling a high-quality basis from a low-quality basis is orthogonalization. Note that it is not called orthonormalization, which would be 1) way too long for a word (in German it would be okay, I guess) and 2) over-complicated for nothing. You see, the actual work is in getting a set of vectors $\{\mathbf{e}_i\}$ which are orthogonal to each other: $\langle \mathbf{e}_i, \mathbf{e}_j \rangle = 0$ for all $i \neq j$. Converting these into an orthonormal basis is then done simply by dividing each vector by its length: $\hat{\mathbf{e}}_i = \frac{\mathbf{e}_i}{\|\mathbf{e}_i\|}$.

Let's now see how this is done.

Gram-Schmidt orthogonalization

The Gram-Schmidt orthogonalization procedure converts a set of arbitrary vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ into an orthonormal set of vectors $\{\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \ldots, \hat{\mathbf{e}}_n\}$. The main idea is to take the directions of the vectors $\{\mathbf{v}_i\}$ one at a time and each time define a new vector $\mathbf{e}_i$ as the orthogonal complement to all the previously chosen vectors $\mathbf{e}_1$, $\mathbf{e}_2$, $\ldots$, $\mathbf{e}_{i-1}$. The orthogonalization algorithm consists of $n$ steps:
\[
\begin{array}{ll}
 \mathbf{e}_1 = \mathbf{v}_1,
  & \hat{\mathbf{e}}_1 = \frac{\mathbf{v}_1}{\|\mathbf{v}_1\|}, \\[1mm]
 \mathbf{e}_2 = \mathbf{v}_2 - \Pi_{\hat{\mathbf{e}}_1}(\mathbf{v}_2),
  & \hat{\mathbf{e}}_2 = \frac{\mathbf{e}_2}{\|\mathbf{e}_2\|}, \\[1mm]
 \mathbf{e}_3 = \mathbf{v}_3 - \Pi_{\hat{\mathbf{e}}_1}(\mathbf{v}_3) - \Pi_{\hat{\mathbf{e}}_2}(\mathbf{v}_3),
  & \hat{\mathbf{e}}_3 = \frac{\mathbf{e}_3}{\|\mathbf{e}_3\|}, \\[1mm]
 \mathbf{e}_4 = \mathbf{v}_4 - \Pi_{\hat{\mathbf{e}}_1}(\mathbf{v}_4) - \Pi_{\hat{\mathbf{e}}_2}(\mathbf{v}_4) - \Pi_{\hat{\mathbf{e}}_3}(\mathbf{v}_4),
  & \hat{\mathbf{e}}_4 = \frac{\mathbf{e}_4}{\|\mathbf{e}_4\|}, \\[1mm]
 \qquad \vdots & \qquad \vdots \\[1mm]
 \mathbf{e}_n = \mathbf{v}_n - \sum_{i=1}^{n-1} \Pi_{\hat{\mathbf{e}}_i}(\mathbf{v}_n),
  & \hat{\mathbf{e}}_n = \frac{\mathbf{e}_n}{\|\mathbf{e}_n\|}.
\end{array}
\]
In the $j$th step of the procedure, we compute a vector $\mathbf{e}_j$ by starting from $\mathbf{v}_j$ and subtracting all the projections onto the previous vectors $\hat{\mathbf{e}}_i$ for all $i < j$. In other words, $\mathbf{e}_j$ is the part of $\mathbf{v}_j$ that is orthogonal to all the vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_{j-1}$.
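The steps above translate directly into code. Here is a minimal sketch of the procedure in NumPy for vectors in $\mathbb{R}^n$ (the function name `gram_schmidt` is my own, and the standard dot product plays the role of the inner product):

```python
import numpy as np

def gram_schmidt(vectors):
    """Turn linearly independent rows of `vectors` into an orthonormal basis."""
    basis = []
    for v in vectors:
        e = np.array(v, dtype=float)
        # Subtract the component along each previously found unit vector;
        # since ||q|| = 1, the projection is simply <e, q> q.
        for q in basis:
            e -= np.dot(e, q) * q
        basis.append(e / np.linalg.norm(e))  # normalize: e_hat = e / ||e||
    return np.array(basis)

V = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
Q = gram_schmidt(V)
print(np.round(Q @ Q.T, 8))  # identity matrix: the rows are orthonormal
```

Note that the loop subtracts each projection from the running remainder `e` rather than from the original `v`; in exact arithmetic this is equivalent to the formulas above, and it behaves better under floating-point roundoff.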

The above procedure is known as orthogonalization because it splits the vector space $V$ into orthogonal subspaces $V_1, V_2, \ldots, V_n$:
\[
 V_j = \operatorname{span}\!\Big\{ \mathbf{v} \in V \ \Big| \ \mathbf{v} = \sum_{i=1}^{j} \alpha_i \mathbf{v}_i \Big\}
 \setminus
 \operatorname{span}\!\Big\{ \mathbf{v} \in V \ \Big| \ \mathbf{v} = \sum_{i=1}^{j-1} \alpha_i \mathbf{v}_i \Big\}.
\]
Recall that the symbol $\setminus$ denotes the set-minus operation. The set $A \setminus B$ consists of all elements that are in $A$ but not in $B$.

Observe that the subspaces $V_1, V_2, \ldots, V_n$ are, by construction, mutually orthogonal: given any vector $\mathbf{u} \in V_i$ and another vector $\mathbf{v} \in V_j$, $j \neq i$, then $\langle \mathbf{u}, \mathbf{v} \rangle = 0$.

The vector space $V$ is the sum of these subspaces:
\[
 V = V_1 \oplus V_2 \oplus V_3 \oplus \cdots \oplus V_n.
\]
The notation $\oplus$ means orthogonal sum.

Discussion

The main point about orthogonalization that I want you to know is that it can be done. Any “low quality” basis (a set of $n$ linearly independent vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ in an $n$-dimensional space) can be converted into a “high quality” orthonormal basis $\{\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \ldots, \hat{\mathbf{e}}_n\}$ by using the Gram-Schmidt procedure.

In the next section we will learn how to think about this orthogonalization procedure in terms of matrices, where the Gram-Schmidt procedure is known as the QR decomposition.

 