The page you are reading is part of a draft (v2.0) of the "No bullshit guide to math and physics."

The text has since gone through many edits and is now available in print and electronic format. The current edition of the book is v4.0, which is a substantial improvement in terms of content and language (I hired a professional editor) from the draft version.

I'm leaving the old wiki content up for the time being, but I highly encourage you to check out the finished book. You can check out an extended preview here (PDF, 106 pages, 5MB).


Vector spaces

We will now discuss no vector in particular, but rather the set of all possible vectors. In three dimensions this is the space $(\mathbb{R},\mathbb{R},\mathbb{R}) \equiv \mathbb{R}^3$. We will also discuss vector subspaces of $\mathbb{R}^3$ like lines and planes through the origin.

In this section we develop the vocabulary needed to talk about vector spaces. Using this language will allow us to say some interesting things about matrices. We will formally define the fundamental subspaces for a matrix $A$: the column space $\mathcal{C}(A)$, the row space $\mathcal{R}(A)$, and the null space $\mathcal{N}(A)$.

Definitions

Vector space

A vector space $V \subseteq \mathbb{R}^n$ consists of a set of vectors and all possible linear combinations of these vectors. The notion of all possible linear combinations is very powerful. In particular it has the following two useful properties. We say that vector spaces are closed under addition, which means the sum of any two vectors taken from the vector space is a vector in the vector space. Mathematically, we write: \[ \vec{v}_1+\vec{v}_2 \in V, \qquad \forall \vec{v}_1, \vec{v}_2 \in V. \] A vector space is also closed under scalar multiplication: \[ \alpha \vec{v} \in V, \qquad \forall \alpha \in \mathbb{R},\ \vec{v} \in V. \]

Span

Given a vector $\vec{v}_1$, we can define the following vector space: \[ V_1 = \textrm{span}\{ \vec{v}_1 \} \equiv \{ \vec{v} \in V \ | \vec{v} = \alpha \vec{v}_1 \textrm{ for some } \alpha \in \mathbb{R} \}. \] We say $V_1$ is the space spanned by $\vec{v}_1$ which means that it is the set of all possible multiples of $\vec{v}_1$. The shape of $V_1$ is an infinite line.

Given two vectors $\vec{v}_1$ and $\vec{v}_2$ we can define a vector space: \[ V_{12} = \textrm{span}\{ \vec{v}_1, \vec{v}_2 \} \equiv \{ \vec{v} \in V \ | \vec{v} = \alpha \vec{v}_1 + \beta\vec{v}_2 \textrm{ for some } \alpha,\beta \in \mathbb{R} \}. \] The vector space $V_{12}$ contains all vectors that can be written as a linear combination of $\vec{v}_1$ and $\vec{v}_2$. This is a two-dimensional vector space which has the shape of an infinite plane.

Note that the same space $V_{12}$ can be obtained as the span of different vectors: $V_{12} = \textrm{span}\{ \vec{v}_1, \vec{v}_{2^\prime} \}$, where $\vec{v}_{2^\prime} = \vec{v}_2 + 30\vec{v}_1$. Indeed, $V_{12}$ can be written as the span of any two linearly independent vectors contained in $V_{12}$. This is precisely what is cool about vector spaces: you can talk about the space as a whole without necessarily having to talk about the vectors in it.
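
As a concrete check, take $\vec{v}_1=(1,0,0)$ and $\vec{v}_2=(0,1,0)$ (these particular vectors are an assumption chosen only for illustration), so that $\vec{v}_{2^\prime} = \vec{v}_2 + 30\vec{v}_1 = (30,1,0)$. Any linear combination of $\vec{v}_1$ and $\vec{v}_2$ can be rewritten as a linear combination of $\vec{v}_1$ and $\vec{v}_{2^\prime}$: \[ \alpha\vec{v}_1 + \beta\vec{v}_2 = (\alpha - 30\beta)\vec{v}_1 + \beta\vec{v}_{2^\prime}, \] because $\beta\vec{v}_{2^\prime} = \beta\vec{v}_2 + 30\beta\vec{v}_1$. For instance, $(2,3,0) = 2\vec{v}_1 + 3\vec{v}_2 = -88\vec{v}_1 + 3\vec{v}_{2^\prime}$.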

As a special case, consider the situation when $\vec{v}_1 = \gamma\vec{v}_2$, for some nonzero $\gamma \in \mathbb{R}$. In this case, the vector space $V_{12} = \textrm{span}\{ \vec{v}_1, \vec{v}_2 \}=\textrm{span}\{ \vec{v}_1 \}$ is actually one-dimensional since $\vec{v}_2$ can be written as a multiple of $\vec{v}_1$.

Vector subspaces

A subset $W$ of the vector space $V$ is called a subspace if:

  1. It is closed under addition: $\vec{w}_1 + \vec{w}_2 \in W$, for all $\vec{w}_1,\vec{w}_2 \in W$.
  2. It is closed under scalar multiplication: $\alpha \vec{w} \in W$, for all $\alpha \in \mathbb{R}$ and all $\vec{w} \in W$.

This means that if you take any linear combination of vectors in $W$, the result will also be a vector in $W$. We use the notation $W \subseteq V$ to indicate that $W$ is a subspace of $V$.

An important fact about subspaces is that they always contain the zero vector $\vec{0}$. This is implied by the second property, since any vector becomes the zero vector when multiplied by the scalar $\alpha=0$: $\alpha \vec{w} = \vec{0}$.
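
This fact gives a quick test for ruling out candidate subspaces. As a hypothetical example chosen for illustration, consider the line \[ L = \{ (x,y) \in \mathbb{R}^2 \ | \ y = x + 1 \}. \] The point $(0,0)$ does not satisfy $y = x+1$, so $\vec{0} \notin L$ and $L$ is not a subspace of $\mathbb{R}^2$. Indeed, $(0,1) \in L$ but $0\cdot(0,1) = (0,0) \notin L$, so $L$ is not closed under scalar multiplication.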

Constraints

One way to define a vector subspace $W$ is to start with a larger space $(x,y,z) \in V$ and describe a set of constraints that must be satisfied by all points $(x,y,z)$ in the subspace $W$. For example, the $xy$-plane can be defined as the set of points $(x,y,z) \in \mathbb{R}^3$ that satisfy \[ (0,0,1) \cdot (x,y,z) = 0. \] More formally, we define the $xy$-plane as follows: \[ P_{xy} = \{ (x,y,z) \in \mathbb{R}^3 \ | \ (0,0,1) \cdot (x,y,z) = 0 \}. \] The vector $\hat{k}\equiv(0,0,1)$ is perpendicular to all the vectors that lie in the $xy$-plane so another description for the $xy$-plane is “the set of all vectors perpendicular to the vector $\hat{k}$.” In this definition, the parent space is $V=\mathbb{R}^3$, and the subspace $P_{xy}$ is defined as the set of points that satisfy the constraint $(0,0,1) \cdot (x,y,z) = 0$.

Another way to represent the $xy$-plane would be to describe it as the span of two linearly independent vectors in the plane: \[ P_{xy} = \textrm{span}\{ (1,0,0), (1,1,0) \}, \] which is equivalent to saying: \[ P_{xy} = \{ \vec{v} \in \mathbb{R}^3 \ | \ \vec{v} = \alpha (1,0,0) + \beta(1,1,0) \textrm{ for some } \alpha,\beta \in \mathbb{R} \}. \] This last expression is called an explicit parametrization of the space $P_{xy}$ and $\alpha$ and $\beta$ are the two parameters. Each point in the plane corresponds to a unique pair $(\alpha,\beta)$. The explicit parametrization of an $m$-dimensional vector space requires $m$ parameters.
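
To see how a point determines its parameters, take the point $(3,2,0)$, which is chosen here purely for illustration. We need $\alpha$ and $\beta$ such that \[ \alpha(1,0,0) + \beta(1,1,0) = (\alpha+\beta,\ \beta,\ 0) = (3,2,0). \] The second component gives $\beta = 2$, and then the first component gives $\alpha = 3 - \beta = 1$. No other pair $(\alpha,\beta)$ produces this point.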

Matrix subspaces

Consider the following subspaces which are associated with a matrix $M \in \mathbb{R}^{m\times n}$. These are sometimes referred to as the fundamental subspaces of the matrix $M$.

* The row space $\mathcal{R}(M)$ is the span of the rows of the matrix.
  Note that computing a given linear combination of the rows of a matrix can be
  done by multiplying the matrix //on the left// with an $m$-vector:
  \[
    \mathcal{R}(M) \equiv \{ \vec{v} \in \mathbb{R}^n \ | \ \vec{v} = \vec{w}^T M \textrm{ for some } \vec{w} \in \mathbb{R}^{m} \},
  \]
  where we used the transpose $T$ to make $\vec{w}$ into a row vector.
* The null space $\mathcal{N}(M)$ of a matrix $M \in \mathbb{R}^{m\times n}$
  consists of all the vectors that the matrix $M$ sends to the zero vector:
  \[
    \mathcal{N}(M) \equiv \{ \vec{v} \in \mathbb{R}^n \ | \ M\vec{v} = \vec{0} \}.
  \]
  The null space is also known as the //kernel// of the matrix.
* The column space $\mathcal{C}(M)$ is the span of the columns of the matrix.
  The column space consists of all the possible output vectors that the matrix can produce
  when multiplied by a vector on the right:
  \[
    \mathcal{C}(M) \equiv \{ \vec{w} \in \mathbb{R}^m 
    \ | \ 
    \vec{w} = M\vec{v} \textrm{ for some } \vec{v} \in \mathbb{R}^{n} \}.
  \]
* The left null space $\mathcal{N}(M^T)$ is the null space of the matrix $M^T$.
  We say //left// null space
  because it consists of the vectors that give the zero vector when they multiply the matrix on the left:
  \[
    \mathcal{N}(M^T) \equiv \{ \vec{w} \in \mathbb{R}^m \ | \ \vec{w}^T M = \vec{0}^T \}.
  \]
  The notation $\mathcal{N}(M^T)$ is suggestive of the fact that we can
  rewrite the condition $\vec{w}^T M = \vec{0}^T$ as $M^T\vec{w} = \vec{0}$.
  Hence the left null space of $M$ is equivalent to the null space of $M^T$.
  The left null space consists of all the vectors $\vec{w} \in \mathbb{R}^m$
  that are orthogonal to the columns of $M$.

The matrix-vector product $M \vec{x}$ can be thought of as the action of a vector function (a linear transformation $T_M:\mathbb{R}^n \to \mathbb{R}^m$) on an input vector $\vec{x}$. The column space $\mathcal{C}(M)$ plays the role of the image of the linear transformation $T_M$, and the null space $\mathcal{N}(M)$ is the set of zeros (roots) of the function $T_M$. The row space $\mathcal{R}(M)$ is the part of the input space that $T_M$ maps one-to-one onto the column space $\mathcal{C}(M)$: to every point in $\mathcal{R}(M)$ (input vector) corresponds exactly one point (output vector) in $\mathcal{C}(M)$, and vice versa. This means the column space and the row space must have the same dimension. We call this dimension the rank of the matrix $M$: \[ \textrm{rank}(M) = \dim\left(\mathcal{R}(M) \right) = \dim\left(\mathcal{C}(M) \right). \] The rank is the number of linearly independent rows, which is also equal to the number of independent columns.

We can characterize the domain of $M$ (the space of $n$-vectors) as the orthogonal sum ($\oplus$) of the row space and the null space: \[ \mathbb{R}^n = \mathcal{R}(M) \oplus \mathcal{N}(M). \] Basically a vector either has non-zero product with at least one of the rows of $M$ or it has zero product with all of them. In the latter case, the output will be the zero vector – which means that the input vector was in the null space.

If we think of the dimensions involved in the above equation: \[ \dim(\mathbb{R}^n) = \dim(\mathcal{R}(M)) + \dim( \mathcal{N}(M)), \] we obtain an important fact: \[ n = \textrm{rank}(M) + \dim( \mathcal{N}(M)), \] where $\dim( \mathcal{N}(M))$ is called the nullity of $M$.
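
The relation between rank and nullity can be checked on a small matrix. The following is a minimal sketch in plain Python, not a method taken from this text: the helper names `rref` and `null_space_basis` and the example matrix `M` are assumptions made for illustration, and exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction


def rref(rows):
    """Return (R, pivots): the reduced row echelon form and its pivot columns."""
    R = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(R), len(R[0])
    pivots = []
    r = 0  # row where the next pivot will be placed
    for c in range(n_cols):
        # look for a row at or below r with a non-zero entry in column c
        p = next((i for i in range(r, n_rows) if R[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column: it is a free column
        R[r], R[p] = R[p], R[r]
        pivot = R[r][c]
        R[r] = [x / pivot for x in R[r]]  # scale so the pivot equals 1
        for i in range(n_rows):
            if i != r:
                factor = R[i][c]
                R[i] = [a - factor * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
    return R, pivots


def null_space_basis(rows):
    """Build one basis vector of N(M) for each free (non-pivot) column."""
    R, pivots = rref(rows)
    n_cols = len(rows[0])
    basis = []
    for f in (c for c in range(n_cols) if c not in pivots):
        v = [Fraction(0)] * n_cols
        v[f] = Fraction(1)  # set this free variable to 1, the others to 0
        for row_index, p in enumerate(pivots):
            v[p] = -R[row_index][f]  # solve for the pivot variables
        basis.append(v)
    return basis


# Hypothetical example matrix: the second row is twice the first.
M = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]

R, pivots = rref(M)
rank = len(pivots)            # dimension of the row space
kernel = null_space_basis(M)  # basis for the null space
n = len(M[0])                 # dimension of the input space

print(rank, len(kernel), n)   # prints: 2 1 3
```

Each free column of the RREF contributes one basis vector to the null space, and each pivot column contributes one to the rank, which is why the two numbers add up to $n$.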

Linear independence

The set of vectors $\{\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_n \}$ is linearly independent if the only solution to the equation \[ \sum\limits_i\lambda_i\vec{v}_i= \lambda_1\vec{v}_1 + \lambda_2\vec{v}_2 + \cdots + \lambda_n\vec{v}_n = \vec{0} \] is $\lambda_i=0$ for all $i$.

The above condition guarantees that none of the vectors can be written as a linear combination of the other vectors. To understand the importance of the “all zeros” solution, let's consider an example where a non-zero solution exists. Suppose we have a set of three vectors $\{\vec{v}_1, \vec{v}_2, \vec{v}_3 \}$ which satisfy $\lambda_1\vec{v}_1 + \lambda_2\vec{v}_2 + \lambda_3\vec{v}_3 = \vec{0}$ with $\lambda_1=-1$, $\lambda_2=1$, and $\lambda_3=2$. This means that \[ \vec{v}_1 = 1\vec{v}_2 + 2\vec{v}_3, \] which shows that $\vec{v}_1$ can be written as a linear combination of $\vec{v}_2$ and $\vec{v}_3$, hence the vectors are not linearly independent.
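
One way to test a set of vectors for linear independence is to count how many pivots elimination produces. The sketch below is an illustration in plain Python under stated assumptions: the function names `count_independent` and `is_linearly_independent` are hypothetical, and the concrete vectors are invented so that $\vec{v}_1 = 1\vec{v}_2 + 2\vec{v}_3$ holds as in the example.

```python
from fractions import Fraction


def count_independent(vectors):
    """Count the pivots found by forward elimination on the rows `vectors`."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    count = 0  # number of pivots found so far
    for col in range(len(rows[0])):
        # look for a row at or below `count` with a non-zero entry in this column
        pivot_row = next(
            (i for i in range(count, len(rows)) if rows[i][col] != 0), None
        )
        if pivot_row is None:
            continue
        rows[count], rows[pivot_row] = rows[pivot_row], rows[count]
        for i in range(count + 1, len(rows)):
            factor = rows[i][col] / rows[count][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[count])]
        count += 1
    return count


def is_linearly_independent(vectors):
    """The set is independent exactly when every vector contributes a pivot."""
    return count_independent(vectors) == len(vectors)


# Hypothetical vectors chosen so that v1 = 1*v2 + 2*v3.
v2 = (1, 0, 1)
v3 = (0, 1, 0)
v1 = tuple(a + 2 * b for a, b in zip(v2, v3))  # (1, 2, 1)

print(is_linearly_independent([v1, v2, v3]))  # False
print(is_linearly_independent([v2, v3]))      # True
```

The three vectors yield only two pivots, which reflects the fact that one of them is redundant.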

Basis

In order to carry out calculations with vectors in a vector space $V$, we need to know a basis $B=\{ \vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n \}$ for that space. A basis for an $n$-dimensional vector space $V$ is a set of $n$ linearly independent vectors in $V$. Intuitively, a basis is a set of vectors that can be used as a coordinate system for a vector space.

A basis $B=\{ \vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n \}$ for the vector space $V$ has the following two properties:

* **Spanning property**.
  Any vector $\vec{v} \in V$ can be expressed as a linear combination of the basis elements:
  \[
   \vec{v} = v_1\vec{e}_1 + v_2\vec{e}_2 + \cdots +  v_n\vec{e}_n.
  \]
  This property guarantees that the vectors in the basis $B$ are //sufficient// to represent any vector in $V$.
* **Linear independence property**. 
  The vectors that form the basis $B = \{ \vec{e}_1,\vec{e}_2, \ldots, \vec{e}_n \}$ are linearly independent.
  The linear independence of the vectors in the basis guarantees that none of the vectors $\vec{e}_i$ is redundant.

If a set of vectors $B=\{ \vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n \}$ satisfies both properties, we say $B$ is a basis for $V$. In other words $B$ can serve as a coordinate system for $V$. Using the basis $B$, we can represent any vector $\vec{v} \in V$ as a unique tuple of coordinates \[ \vec{v} = v_1\vec{e}_1 + v_2\vec{e}_2 + \cdots + v_n\vec{e}_n \qquad \Leftrightarrow \qquad (v_1,v_2, \ldots, v_n)_B. \] The coordinates of $\vec{v}$ are calculated with respect to the basis $B$.
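
As a small worked example (the basis and the vector are assumptions chosen for illustration), let $B = \{ (1,1), (1,-1) \}$ be a basis for $\mathbb{R}^2$ and let $\vec{v} = (3,1)$. The coordinates $(v_1,v_2)_B$ must satisfy \[ v_1(1,1) + v_2(1,-1) = (v_1+v_2,\ v_1-v_2) = (3,1). \] Adding the two component equations gives $2v_1 = 4$, so $v_1 = 2$ and $v_2 = 1$. The same vector is therefore written $(3,1)$ with respect to the standard basis and $(2,1)_B$ with respect to $B$.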

The dimension of a vector space is defined as the number of vectors in a basis for that vector space. A basis for an $n$-dimensional vector space contains exactly $n$ vectors. Any set of fewer than $n$ vectors would not satisfy the spanning property. Any set of more than $n$ vectors from $V$ cannot be linearly independent. To form a basis for a vector space, the set of vectors must be “just right”: it must contain a sufficient number of vectors but not too many so that the coefficients of each vector will be uniquely determined.

Distilling a basis

A basis for an $n$-dimensional vector space $V$ consists of exactly $n$ vectors. Any set of vectors $\{ \vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n \}$ in $V$ can serve as a basis as long as they are linearly independent and there are exactly $n$ of them.

Sometimes an $n$-dimensional vector space $V$ will be specified as the span of more than $n$ vectors: \[ V = \textrm{span}\{ \vec{v}_1, \vec{v}_2, \ldots, \vec{v}_m \}, \quad m > n. \] Since there are $m>n$ of the $\vec{v}$-vectors, they are too many to form a basis. We say this set of vectors is over-complete. They cannot all be linearly independent since there can be at most $n$ linearly independent vectors in an $n$-dimensional vector space.

If we want to have a basis for the space $V$, we'll have to reject some of the vectors. Given the set of vectors $\{ \vec{v}_1, \vec{v}_2, \ldots, \vec{v}_m \}$, our task is to distill a set of $n$ linearly independent vectors $\{ \vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n \}$ from them.

We can use the Gauss–Jordan elimination procedure to distill a set of linearly independent vectors. Actually, you know how to do this already! You can write the set of $m$ vectors as the rows of a matrix and then do row operations on this matrix until you find the reduced row echelon form (RREF). Since row operations do not change the row space of the matrix, the $n$ non-zero rows of the final RREF form a basis for $V$. We will learn more about this procedure in the next section.
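
The distillation procedure can be sketched in a few lines of plain Python. This is an illustrative sketch, not the book's implementation: the function name `distill_basis` and the over-complete set of four vectors are assumptions, and exact fractions are used so that zero rows are detected reliably.

```python
from fractions import Fraction


def distill_basis(vectors):
    """Return the non-zero rows of the RREF of the matrix whose rows are `vectors`."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    pivot_row = 0  # row where the next pivot will be placed
    for col in range(len(rows[0])):
        # look for a row at or below pivot_row with a non-zero entry in this column
        candidate = next(
            (i for i in range(pivot_row, len(rows)) if rows[i][col] != 0), None
        )
        if candidate is None:
            continue
        rows[pivot_row], rows[candidate] = rows[candidate], rows[pivot_row]
        pivot = rows[pivot_row][col]
        rows[pivot_row] = [x / pivot for x in rows[pivot_row]]  # make the pivot 1
        for i in range(len(rows)):
            if i != pivot_row:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[pivot_row])]
        pivot_row += 1
    # rows that were linear combinations of the others have become all zeros
    return [row for row in rows if any(x != 0 for x in row)]


# Hypothetical over-complete set: four vectors spanning the xy-plane (m = 4, n = 2).
vectors = [(1, 0, 0), (1, 1, 0), (2, 1, 0), (0, 3, 0)]
basis = distill_basis(vectors)

print(len(basis))  # prints: 2
```

Note that the rows returned by this sketch are the RREF rows, which span the same space as the input but are not necessarily members of the original set; here they are $(1,0,0)$ and $(0,1,0)$.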

Examples

Example 1

Describe the set of vectors which are perpendicular to the vector $(0,0,1)$ in $\mathbb{R}^3$.
Sol: We need to find all the vectors $(x,y,z)$ such that $(x,y,z)\cdot (0,0,1) = 0$. This dot product equals $z$, so the constraint forces $z=0$ and places no restriction on the $x$ and $y$ components. The set of vectors perpendicular to $(0,0,1)$ is therefore $\textrm{span}\{ (1,0,0), (0,1,0) \}$, which is the $xy$-plane.

 