Inner product spaces

An inner product space is an abstract vector space $(V,\mathbb{R},+,\cdot)$ for which we define an abstract inner product operation: \[ \langle \cdot, \cdot \rangle : V \times V \to \mathbb{R}. \]

Any operation can be used as the inner product, so long as it satisfies the following properties for all $\mathbf{u}, \mathbf{v}, \mathbf{v}_1,\mathbf{v}_2\in V$ and $\alpha,\beta \in \mathbb{R}$.

  1. Symmetric: $\langle \mathbf{u},\mathbf{v}\rangle =\langle \mathbf{v},\mathbf{u}\rangle$.
  2. Linear: $\langle \mathbf{u},\alpha\mathbf{v}_1+\beta\mathbf{v}_2\rangle =\alpha\langle \mathbf{u},\mathbf{v}_1\rangle +\beta\langle \mathbf{u},\mathbf{v}_2\rangle$.
  3. Positive definite: $\langle \mathbf{u},\mathbf{u}\rangle \geq0$ for all $\mathbf{u}\in V$, and $\langle \mathbf{u},\mathbf{u}\rangle =0$ if and only if $\mathbf{u}=\mathbf{0}$.

The above properties are inspired by the properties of the standard inner product (dot product) for vectors in $\mathbb{R}^n$: \[ \langle \vec{u}, \vec{v}\rangle \equiv \vec{u} \cdot \vec{v} = \sum_{i=1}^n u_i v_i = \vec{u}^T \vec{v}. \] In this section, we generalize the idea of the dot product to abstract vectors $\mathbf{u}, \mathbf{v} \in V$ by defining an inner product operation $\langle \mathbf{u},\mathbf{v}\rangle$ appropriate for the elements of $V$. We will define an inner product for matrices $\langle M,N\rangle$, polynomials $\langle \mathbf{p},\mathbf{q}\rangle$ and functions $\langle f,g \rangle$. This inner product will in turn allow us to talk about orthogonality between abstract vectors, \[ \mathbf{u} \textrm{ and } \mathbf{v} \textrm{ are orthogonal } \quad \Leftrightarrow \quad \langle \mathbf{u},\mathbf{v}\rangle = 0, \] the length of an abstract vector, \[ \| \mathbf{u} \| \equiv \sqrt{ \langle \mathbf{u},\mathbf{u}\rangle }, \] and the distance between two abstract vectors: \[ d(\mathbf{u},\mathbf{v}) \equiv \| \mathbf{u}-\mathbf{v} \| =\sqrt{ \langle (\mathbf{u}-\mathbf{v}),(\mathbf{u}-\mathbf{v})\rangle }. \]

Let's get started.

Definitions

We will be dealing with vectors from an abstract vector space $(V,\mathbb{R},+,\cdot)$ where:

  1. $V$ is the set of vectors in the vector space.
  2. $\mathbb{R}=F$ is the field of real numbers. The coefficients of the generalized vectors are taken from that field.
  3. $+$ is the addition operation defined for elements of $V$.
  4. $\cdot$ is the scalar multiplication operation between an element of the field $\alpha \in \mathbb{R}$ and a vector $\mathbf{u} \in V$. Scalar multiplication is usually denoted implicitly $\alpha \mathbf{u}$ so as not to be confused with the dot product.

We define a new operation called inner product for that space: \[ \langle \cdot, \cdot \rangle : V \times V \to \mathbb{R}, \] which takes as inputs two abstract vectors $\mathbf{u}, \mathbf{v} \in V$ and returns a real number $\langle \mathbf{u},\mathbf{v}\rangle$.

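We define the following related quantities in terms of the inner product operation:

  1. The norm or length of an abstract vector $\mathbf{u} \in V$: $\|\mathbf{u}\| \equiv \sqrt{\langle \mathbf{u},\mathbf{u}\rangle}$.
  2. The distance between two abstract vectors $\mathbf{u},\mathbf{v} \in V$: $d(\mathbf{u},\mathbf{v}) \equiv \|\mathbf{u}-\mathbf{v}\|$.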

Orthogonality

Recall that two vectors $\vec{u}, \vec{v} \in \mathbb{R}^n$ are said to be orthogonal if their dot product is zero. This definition follows from the geometric interpretation of the dot product: \[ \vec{u}\cdot \vec{v} = \|\vec{u}\| \|\vec{v}\| \cos\theta, \] where $\theta$ is the angle between $\vec{u}$ and $\vec{v}$. Orthogonal means “at a right angle to.” Indeed, if $\vec{u}$ and $\vec{v}$ are nonzero and $\vec{u}\cdot \vec{v}=0$, the angle between them must be $90^\circ$ or $270^\circ$, since $\cos\theta = 0$ only for those angles.

In analogy with the above reasoning, we now define the notion of orthogonality between abstract vectors in terms of the inner product: \[ \mathbf{u} \textrm{ and } \mathbf{v} \textrm{ are orthogonal } \quad \Leftrightarrow \quad \langle \mathbf{u},\mathbf{v}\rangle = 0. \]

Norm

Every definition of an inner product for an abstract vector space $(V,\mathbb{R},+,\cdot)$ induces a norm on that vector space: \[ \| \cdot \| : V \to \mathbb{R}. \] The norm is defined in terms of the inner product: \[ \|\mathbf{u}\|=\sqrt{\langle \mathbf{u},\mathbf{u}\rangle }. \] The norm $\|\mathbf{u}\|$ of a vector $\mathbf{u}$ corresponds, in some sense, to the “length” of the vector.

Important properties of norms:

  1. Triangle inequality: \[ \|\mathbf{u}+\mathbf{v}\|\leq\|\mathbf{u}\|+\|\mathbf{v}\|. \]
  2. Cauchy-Schwarz inequality: \[ | \langle \mathbf{x} , \mathbf{y} \rangle | \leq \|\mathbf{x} \|\: \| \mathbf{y} \|. \] Equality holds if and only if $\mathbf{x}$ and $\mathbf{y}$ are linearly dependent.
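The two inequalities can be spot-checked numerically for the standard dot product on $\mathbb{R}^n$. The following is a minimal sketch, not a proof: the helper names (`dot`, `norm`, `check_norm_inequalities`) and the tolerance are illustrative choices, and random sampling can only fail to find a counterexample.

```python
import math
import random

def dot(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Norm induced by the dot product."""
    return math.sqrt(dot(u, u))

def check_norm_inequalities(u, v, tol=1e-12):
    """Return (triangle_ok, cauchy_schwarz_ok) for one pair of vectors."""
    u_plus_v = [a + b for a, b in zip(u, v)]
    triangle_ok = norm(u_plus_v) <= norm(u) + norm(v) + tol
    cauchy_schwarz_ok = abs(dot(u, v)) <= norm(u) * norm(v) + tol
    return triangle_ok, cauchy_schwarz_ok

# Try many random pairs of vectors in R^4 (fixed seed for repeatability).
random.seed(0)
all_hold = True
for _ in range(1000):
    u = [random.uniform(-1, 1) for _ in range(4)]
    v = [random.uniform(-1, 1) for _ in range(4)]
    all_hold = all_hold and all(check_norm_inequalities(u, v))

# Equality case of Cauchy-Schwarz: y = 2x is linearly dependent on x,
# so the gap below should vanish up to floating-point rounding.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
gap = norm(x) * norm(y) - abs(dot(x, y))
```

The small tolerance only guards against rounding error in the floating-point comparison.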

Distance

The distance between two points $p$ and $q$ in $\mathbb{R}^n$ is equal to the length of the vector that goes from $p$ to $q$: $d(p,q)=\| q - p \|$. We can similarly define a distance function between pairs of vectors in an abstract vector space $V$: \[ d : V \times V \to \mathbb{R}. \] The distance between two abstract vectors is the norm of their difference: \[ d(\mathbf{u},\mathbf{v}) \equiv \| \mathbf{u}-\mathbf{v} \| =\sqrt{ \langle (\mathbf{u}-\mathbf{v}),(\mathbf{u}-\mathbf{v})\rangle }. \]
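As a rough illustration of how the norm and the distance are both built from the inner product, the sketch below takes the inner product as a parameter. The function names and the list-of-numbers representation of vectors are assumptions made for this example only.

```python
import math

def dot(u, v):
    """Standard inner product on R^n, used here as the default."""
    return sum(a * b for a, b in zip(u, v))

def norm(u, inner=dot):
    """Norm induced by the given inner product: sqrt(<u, u>)."""
    return math.sqrt(inner(u, u))

def distance(u, v, inner=dot):
    """Distance induced by the inner product: the norm of u - v."""
    difference = [a - b for a, b in zip(u, v)]
    return norm(difference, inner)

p = [1.0, 1.0]
q = [4.0, 5.0]
d_pq = distance(p, q)  # length of the vector going from p to q
d_qp = distance(q, p)  # same value, since <w, w> = <-w, -w>
```

Passing a different function as `inner` changes the norm and the distance together, which is the sense in which an inner product induces both.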

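Important properties of distances:

  1. Symmetry: $d(\mathbf{u},\mathbf{v}) = d(\mathbf{v},\mathbf{u})$.
  2. Non-negativity: $d(\mathbf{u},\mathbf{v}) \geq 0$, with $d(\mathbf{u},\mathbf{v}) = 0$ if and only if $\mathbf{u}=\mathbf{v}$.
  3. Triangle inequality: $d(\mathbf{u},\mathbf{w}) \leq d(\mathbf{u},\mathbf{v}) + d(\mathbf{v},\mathbf{w})$.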

Examples

Matrix inner product

The Hilbert-Schmidt inner product for real matrices is \[ \langle A, B \rangle_{\textrm{HS}} = \textrm{Tr}\!\left[ A^T B \right]. \]

We can use this inner product to talk about orthogonality properties of matrices. In the last section we defined the set of $2\times2$ symmetric matrices \[ \mathbb{S}(2,2) = \{ A \in \mathbb{M}(2,2) \ | \ A = A^T \}, \] and gave an explicit basis for this space: \[ \mathbf{v}_1 = \begin{bmatrix} 1 & 0 \nl 0 & 0 \end{bmatrix}, \ \ \mathbf{v}_2 = \begin{bmatrix} 0 & 1 \nl 1 & 0 \end{bmatrix}, \ \ \mathbf{v}_3 = \begin{bmatrix} 0 & 0 \nl 0 & 1 \end{bmatrix}. \]

It is easy to show that these vectors are all mutually orthogonal with respect to the Hilbert-Schmidt inner product $\langle \cdot , \cdot \rangle_{\textrm{HS}}$: \[ \langle \mathbf{v}_1 , \mathbf{v}_2 \rangle_{\textrm{HS}}=0, \quad \langle \mathbf{v}_1 , \mathbf{v}_3 \rangle_{\textrm{HS}}=0, \quad \langle \mathbf{v}_2 , \mathbf{v}_3 \rangle_{\textrm{HS}}=0. \] Verify each of these by hand on a piece of paper right now. The above equations certify that the set $\{ \mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3 \}$ is an orthogonal basis for the vector space $\mathbb{S}(2,2)$.
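After doing the calculation by hand, you can double-check it with a few lines of code. This is a minimal sketch that represents matrices as nested lists. The helper names (`transpose`, `matmul`, `trace`, `hs_inner`) are illustrative and not part of any library.

```python
def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(column) for column in zip(*A)]

def matmul(A, B):
    """Matrix product A B."""
    return [[sum(a * b for a, b in zip(row, column)) for column in zip(*B)]
            for row in A]

def trace(A):
    """Sum of the diagonal entries of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))

def hs_inner(A, B):
    """Hilbert-Schmidt inner product Tr[A^T B]."""
    return trace(matmul(transpose(A), B))

# The basis of the 2x2 symmetric matrices given above.
v1 = [[1, 0], [0, 0]]
v2 = [[0, 1], [1, 0]]
v3 = [[0, 0], [0, 1]]

# All three pairwise inner products should be zero.
pairwise = [hs_inner(v1, v2), hs_inner(v1, v3), hs_inner(v2, v3)]
```

Note that $\textrm{Tr}\!\left[ A^T B \right] = \sum_{i,j} a_{ij} b_{ij}$, so the Hilbert-Schmidt inner product is the ordinary dot product applied to the matrix entries.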

Hilbert-Schmidt norm

The Hilbert-Schmidt inner product induces the Hilbert-Schmidt norm: \[ \|A\|_{\textrm{HS}} \equiv \sqrt{ \langle A, A \rangle_{\textrm{HS}} } = \sqrt{ \textrm{Tr}\!\left[ A^T A \right] } = \left[ \sum_{i,j=1}^{n} |a_{ij}|^2 \right]^{\frac{1}{2}}. \]

We can therefore talk about the norm or length of a matrix. To continue with the above example, we can obtain an orthonormal basis $\{ \hat{\mathbf{v}}_1, \hat{\mathbf{v}}_2, \hat{\mathbf{v}}_3 \}$ for $\mathbb{S}(2,2)$ as follows: \[ \hat{\mathbf{v}}_1 = \mathbf{v}_1, \quad \hat{\mathbf{v}}_2 = \frac{ \mathbf{v}_2 }{ \|\mathbf{v}_2\|_{\textrm{HS}} } = \frac{1}{\sqrt{2}}\mathbf{v}_2, \quad \hat{\mathbf{v}}_3 = \mathbf{v}_3. \] Verify that $\|\hat{\mathbf{v}}_2\|_{\textrm{HS}}=1$.
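The normalization step can be sketched in code using the entrywise formula for the norm. The names `hs_norm` and `normalize` are hypothetical helpers, and the result is only equal to $1$ up to floating-point rounding.

```python
import math

def hs_norm(A):
    """Hilbert-Schmidt norm: square root of the sum of squared entries."""
    return math.sqrt(sum(a * a for row in A for a in row))

def normalize(A):
    """Divide every entry by the norm to obtain a matrix of unit length."""
    length = hs_norm(A)
    return [[a / length for a in row] for row in A]

v2 = [[0.0, 1.0], [1.0, 0.0]]
v2_hat = normalize(v2)

length_before = hs_norm(v2)      # sqrt(2)
length_after = hs_norm(v2_hat)   # 1, up to rounding
```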

Function inner product

Consider two functions $\mathbf{f}=f(t)$ and $\mathbf{g}=g(t)$ and define their inner product as follows: \[ \langle f,g\rangle =\int_{-\infty}^\infty f(t)g(t)\; dt. \] The above formula is the continuous-variable version of the inner product formula for vectors $\vec{u}\cdot\vec{v}=\sum_i u_i v_i$. Instead of a summation we have an integral, but otherwise the idea is again to measure how strongly $\mathbf{f}$ and $\mathbf{g}$ overlap.

Example

Consider the function inner product on the interval $[-1,1]$ as defined by the formula: \[ \langle f,g\rangle =\int_{-1}^1 f(t)g(t)\; dt. \]

Verify that the following polynomials, known as the Legendre polynomials $P_n(x)$, are mutually orthogonal with respect to the above inner product. \[ P_0(x)=1, \quad P_1(x)=x, \quad P_2(x)=\frac{1}{2}(3x^2-1), \quad P_3(x)=\frac{1}{2}(5x^3-3x), \] \[ \quad P_4(x)=\frac{1}{8}(35x^4-30x^2+3), \quad P_5(x)=\frac{1}{8}(63x^5-70x^3+15x). \]
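One way to check the orthogonality claim without any numerical error is to integrate the products exactly. The sketch below assumes a polynomial is stored as a list of coefficients in increasing powers of $x$ and uses the fact that $\int_{-1}^{1} x^k \, dx$ equals $\frac{2}{k+1}$ for even $k$ and $0$ for odd $k$. All function names are illustrative.

```python
from fractions import Fraction as F

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    product = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            product[i + j] += a * b
    return product

def integrate_minus1_to_1(p):
    """Exact integral over [-1, 1]; odd powers integrate to zero."""
    return sum(c * F(2, k + 1) for k, c in enumerate(p) if k % 2 == 0)

def inner(p, q):
    """Function inner product <p, q> on the interval [-1, 1]."""
    return integrate_minus1_to_1(poly_mul(p, q))

# Coefficients of P_0, ..., P_5, lowest power first.
legendre = [
    [F(1)],
    [F(0), F(1)],
    [F(-1, 2), F(0), F(3, 2)],
    [F(0), F(-3, 2), F(0), F(5, 2)],
    [F(3, 8), F(0), F(-30, 8), F(0), F(35, 8)],
    [F(0), F(15, 8), F(0), F(-70, 8), F(0), F(63, 8)],
]

# Matrix of all inner products <P_m, P_n>.
gram = [[inner(p, q) for q in legendre] for p in legendre]
```

The off-diagonal entries of `gram` are all zero, which is the orthogonality statement. The diagonal entries are $\langle P_n, P_n \rangle = \frac{2}{2n+1}$, so these polynomials are orthogonal but not of unit length.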

TODO: Maybe add to math section on polynomials with intuitive expl: the product of any two of these: half above x axis, half below

Generalized dot product

We can think of the regular dot product for vectors as the following matrix product: \[ \vec{u} \cdot \vec{v} = \vec{u}^T \vec{v}= \vec{u}^T I \vec{v}. \]

In fact, we can insert any symmetric and positive definite matrix $M$ between the vectors to obtain the generalized inner product: \[ \langle \vec{x}, \vec{y} \rangle_M \equiv \vec{x}^T M \vec{y}. \] The matrix $M$ is called the metric for this inner product, and it encodes the relative contributions of the different components of the vectors to the length.

The requirement that $M$ be a symmetric matrix stems from the symmetry requirement of the inner product: $\langle \mathbf{u},\mathbf{v}\rangle =\langle \mathbf{v},\mathbf{u}\rangle$. The requirement that the matrix be positive definite comes from the positive definiteness requirement of the inner product: $\langle \mathbf{u},\mathbf{u}\rangle = \vec{u}^T M \vec{u} \geq 0$ for all $\mathbf{u}\in V$, with equality only when $\mathbf{u}=\mathbf{0}$.

We can always obtain a symmetric and positive semidefinite matrix $M$ by setting $M = A^TA$ for some matrix $A$. This $M$ is positive definite, and therefore defines an inner product, whenever the columns of $A$ are linearly independent. To understand why we might want to construct $M$ in this way, recall that we can think of the matrix $A$ as performing some linear transformation $T_A(\vec{u})=A\vec{u}$. An inner product $\langle \vec{u},\vec{v}\rangle_M$ can be interpreted as the inner product in the image space of $T_A$: \[ \langle \vec{u}, \vec{v} \rangle_M = \vec{u}^T M \vec{v}= \vec{u}^T A^T A \vec{v}= (A\vec{u})^T (A \vec{v})= T_A(\vec{u}) \cdot T_A(\vec{v}). \]
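The identity $\vec{u}^T M \vec{v} = (A\vec{u}) \cdot (A\vec{v})$ can be checked on a small example. In this sketch the matrices `A` and `A_singular` and all helper names are arbitrary choices made for illustration. The second matrix has linearly dependent columns, which shows how $M = A^TA$ can assign zero length to a nonzero vector.

```python
def dot(u, v):
    """Standard dot product."""
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    """Matrix-vector product A x."""
    return [dot(row, x) for row in A]

def transpose(A):
    return [list(column) for column in zip(*A)]

def matmul(A, B):
    return [[dot(row, column) for column in zip(*B)] for row in A]

def inner_M(x, y, M):
    """Generalized inner product x^T M y."""
    return dot(x, matvec(M, y))

# An invertible A gives a symmetric, positive definite metric M = A^T A.
A = [[2, 0], [1, 1]]
M = matmul(transpose(A), A)

u = [1, 2]
v = [3, -1]
via_metric = inner_M(u, v, M)                 # u^T M v
via_image = dot(matvec(A, u), matvec(A, v))   # (A u) . (A v)

# A with dependent columns gives an M that is only positive semidefinite:
# the nonzero vector w satisfies w^T M w = 0.
A_singular = [[1, 1], [1, 1]]
M_singular = matmul(transpose(A_singular), A_singular)
w = [1, -1]
length_squared_w = inner_M(w, w, M_singular)
```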

Standard inner product

Why is the standard inner product for vectors $\langle \vec{u}, \vec{v} \rangle = \vec{u} \cdot \vec{v} = \sum_i u_i v_i$ called the “standard” inner product? If we are free to define ….

TODO: copy from paper… maybe move below next par

To be an inner product space

A standard question that profs like to ask on exams is whether some weird definition of an inner product forms an inner product space. Recall that any operation can be used as the inner product so long as it satisfies the symmetry, linearity, and positive definiteness requirements. Thus, what you are supposed to do is check whether the weird definition you are given satisfies the three axioms. Alternatively, you can show that the vector space $(V,\mathbb{R},+,\cdot)$ with inner product $\langle \mathbf{u}, \mathbf{v} \rangle$ is not an inner product space by finding one or more vectors $\mathbf{u},\mathbf{v} \in V$ that violate one of the axioms.
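The counterexample strategy can be sketched as a small search over sample vectors in $\mathbb{R}^2$. This is only an illustration: the function `violated_axioms`, the two candidate operations, and the sample vectors are all made up for this example. A search like this can refute an axiom by exhibiting a counterexample, but an empty result proves nothing, so a real proof still has to be done by hand.

```python
import itertools

def violated_axioms(inner, vectors, scalars, tol=1e-9):
    """Return the names of the axioms refuted by the given samples."""
    failed = set()
    # Symmetry: <u, v> = <v, u>.
    for u, v in itertools.product(vectors, repeat=2):
        if abs(inner(u, v) - inner(v, u)) > tol:
            failed.add("symmetry")
    # Linearity in the second argument.
    for u, v1, v2 in itertools.product(vectors, repeat=3):
        for a, b in itertools.product(scalars, repeat=2):
            combo = [a * x + b * y for x, y in zip(v1, v2)]
            expected = a * inner(u, v1) + b * inner(u, v2)
            if abs(inner(u, combo) - expected) > tol:
                failed.add("linearity")
    # Positive definiteness: <u, u> > 0 unless u is the zero vector.
    for u in vectors:
        value = inner(u, u)
        is_zero_vector = all(x == 0 for x in u)
        if value < -tol or (abs(value) <= tol and not is_zero_vector):
            failed.add("positive definiteness")
    return failed

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def minus_candidate(u, v):
    """Hypothetical candidate with a minus sign on the second component."""
    return u[0] * v[0] - u[1] * v[1]

def shifted_candidate(u, v):
    """Hypothetical candidate that adds a constant to the dot product."""
    return dot(u, v) + 1

samples = [[1, 0], [0, 1], [1, 1], [1, -1], [2, 3]]
scalars = [-1, 0, 2]

report_dot = violated_axioms(dot, samples, scalars)
report_minus = violated_axioms(minus_candidate, samples, scalars)
report_shifted = violated_axioms(shifted_candidate, samples, scalars)
```

For instance, the vector $(0,1)$ has $\langle \mathbf{u},\mathbf{u}\rangle = -1$ under `minus_candidate`, which is enough to rule it out as an inner product.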

Discussion

This has been another one of those sections where we learn no new linear algebra but simply generalize what we already know about standard vectors $\vec{v} \in \mathbb{R}^n$ to more general vector-like things $\mathbf{v} \in V$. You can now talk about inner products, orthogonality, and norms of matrices, polynomials, and other functions.