The page you are reading is part of a draft (v2.0) of the "No bullshit guide to math and physics."

The text has since gone through many edits and is now available in print and electronic format. The current edition of the book is v4.0, which is a substantial improvement over this draft version in terms of content and language (I hired a professional editor).

I'm leaving the old wiki content up for the time being, but I highly encourage you to check out the finished book. An extended preview is available here (PDF, 106 pages, 5MB).


Optimization algorithm

In this section we show and explain the details of the algorithm for finding the maximum of a function. This is called optimization, as in finding the optimal value(s).

Say you have the function $f(x)$ that represents a real world phenomenon. For example, $f(x)$ could represent how much fun you have as a function of alcohol consumed during one evening. We all know that too much $x$ and the fun stops and you find yourself, like the Irish say, “talking to God on the big white phone.” Too little $x$ and you might not have enough Dutch courage to chat up that girl/guy from the table across the room. To have as much fun as possible, you want to find the alcohol consumption $x^*$ where $f$ takes on its maximum value.

This is one of the prominent applications of calculus (optimization, not alcohol consumption). This is why you have been learning about all those limits, derivative formulas, and differentiation rules in the previous sections.

Definitions

  • $x$: the variable we have control over.
  • $[x_i,x_f]$: some interval of values where $x$ can be chosen from, i.e., $x_i \leq x \leq x_f$. These are the constraints on the optimization problem. (For the drinking optimization problem $x\geq 0$ since you can't drink negative alcohol, and probably $x<2$ (in litres of hard booze) because roughly around there you will die from alcohol poisoning. So we can say we are searching for the optimal amount of alcohol $x$ in the interval $[0,2]$.)
  • $f(x)$: the function we want to optimize. This function has to be differentiable, meaning that we can take its derivative.
  • $f'(x)$: The derivative of $f(x)$. The derivative contains the information about the slope of $f(x)$.
  • maximum: A place where the function reaches a peak. When there are multiple peaks, we call the highest of them the global maximum; all the others are called local maxima.
  • minimum: A place where the function reaches a low point: the bottom of a valley. The global minimum is the lowest point overall, whereas a local minimum is only the minimum in some neighbourhood.
  • extremum: An extremum is a general term that includes maximum and minimum.
  • saddle point: A place where $f'(x)=0$ but that point is neither a max nor a min. Ex: $f(x)=x^5$ when $x=0$.

Suppose some function $f(x)$ has a global maximum at $x^*$ and the value of that maximum is $f(x^*)=M$. The following mathematical notations apply:

  • $\mathop{\text{argmax}}_x \ f(x)=x^*$, to refer to the location (the argument) where the maximum occurs.
  • $\max_x \ f(x) = M$, to refer to the maximum value.
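To see the difference between the two notations concretely, here is a short Python sketch (my addition, not part of the original text) that samples a function on a grid: `max` picks out the maximum value $M$, while `argmax` picks out the location $x^*$ where it occurs.

```python
import numpy as np

# Sample f(x) = -(x - 1)^2 + 5 on a fine grid.
# The true maximum value is M = 5, attained at x* = 1.
x = np.linspace(-3, 3, 601)
f = -(x - 1)**2 + 5

M = f.max()             # max_x f(x): the maximum VALUE
x_star = x[f.argmax()]  # argmax_x f(x): the LOCATION where it occurs

print(x_star, M)
```
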

Algorithm for finding extrema

Input: Some function $f(x)$ and a constraint region $C=[x_i,x_f]$.
Output: The location and value of all maxima and minima of $f(x)$.

You should proceed as follows to find the extrema of a function:

  1. First look at $f(x)$. If you can, plot it. If not, just try to imagine it.
  2. Find the derivative $f'(x)$.
  3. Solve the equation $f'(x)=0$. There will usually be multiple solutions. Make a list of them. We will call this the list of candidates.
  4. For each candidate $x^*$ in the list, check whether it is a max, a min, or a saddle point.
    • If $f'(x^*-0.1)$ is positive and $f'(x^*+0.1)$ is negative, then the point $x^*$ is a max.

The function goes up, flattens at $x^*$, then goes down after $x^*$. Therefore $x^*$ must be a peak.

    • If $f'(x^*-0.1)$ is negative and $f'(x^*+0.1)$ is positive, then the point $x^*$ is a min.

The function goes down, flattens then goes up, so the point must be a minimum.

    • If $f'(x^*-0.1)$ and $f'(x^*+0.1)$ have the same sign, then the point $x^*$ is a saddle point. Remove it from the list of candidates.
  5. Now go through the list one more time and reject all candidates $x^*$ that do not satisfy the constraints $C$. In other words, if $x^* \in [x_i,x_f]$ it stays, but if $x^* \not\in [x_i,x_f]$ we remove it, since it is not feasible. For example, if a candidate solution in the alcohol consumption problem says you should drink 5[L] of booze, you have to reject it, because otherwise you would die.
  6. Add $x_i$ and $x_f$ to the list of candidates. These are the boundaries of the constraint region and should also be considered. If no constraint was specified, use the default constraint $x \in \mathbb{R}\equiv(-\infty,\infty)$ and add $-\infty$ and $\infty$ to the list.
  7. For each candidate $x^*$, calculate the function value $f(x^*)$.

The resulting list is a list of local extrema: maxima, minima and endpoints. The global maximum is the largest value from the list of local maxima. The global minimum is the smallest of the local minima.

Note that in dealing with points at infinity like $x^*=\infty$, you are not actually calculating a value but the limit $\lim_{x\to\infty}f(x)$. Usually the function either blows up $f(\infty)=\infty$ (like $x$, $x^2$, $e^x$, $\ldots$), drops down indefinitely $f(\infty)=-\infty$ (like $-x$, $-x^2$, $-e^x$, $\ldots$), or reaches some value (like $\lim_{x\to\infty} \frac{1}{x}=0, \ \lim_{x\to\infty} e^{-x}=0$). If a function goes to positive $\infty$ it doesn't have a global maximum: it simply keeps growing indefinitely. Similarly, functions that go towards negative $\infty$ don't have a global minimum.
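The whole procedure can be sketched in code. The Python sketch below is my own translation of Steps 4 through 7, under the assumption that Step 3 (solving $f'(x)=0$) has already been done by hand, so the critical points are passed in as a list; the names `classify` and `find_extrema` are made up for illustration.

```python
def classify(fprime, x_star, eps=0.1):
    """Step 4: use the sign of f' just left and right of x_star."""
    left, right = fprime(x_star - eps), fprime(x_star + eps)
    if left > 0 and right < 0:
        return "max"
    if left < 0 and right > 0:
        return "min"
    return "saddle"

def find_extrema(f, fprime, candidates, x_i, x_f):
    """Steps 4-7 of the algorithm. `candidates` are the solutions
    of f'(x) = 0 (Step 3), found by hand or by a solver."""
    results = []
    for x_star in candidates:
        kind = classify(fprime, x_star)
        if kind == "saddle":            # Step 4: discard saddle points
            continue
        if not (x_i <= x_star <= x_f):  # Step 5: enforce the constraints
            continue
        results.append((x_star, f(x_star), kind))
    for x_b in (x_i, x_f):              # Step 6: boundaries are candidates too
        results.append((x_b, f(x_b), "endpoint"))
    return results                      # Step 7: location-value pairs

# e.g. f(x) = x^2 on [-1, 2], with the single critical point x = 0:
print(find_extrema(lambda x: x*x, lambda x: 2*x, [0.0], -1.0, 2.0))
```
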

Example 1

Find all the maxima and minima of the function \[ f(x)=x^4-8x^2+356. \]

Since no interval is specified we will use the default interval $x \in \mathbb{R}\equiv(-\infty,\infty)$. Let's go through the steps of the algorithm.

  1. We don't know what an $x^4$ function looks like, but it is probably similar to $x^2$ – it goes up to infinity on the far left and the far right.
  2. Taking the derivative is simple for polynomials:

\[ f'(x)=4x^3-16x. \]

  3. Now we have to solve

\[ 4x^3-16x=0, \]

  which is the same as
  \[
    4x(x^2-4)=0,
  \]
  which is the same as
  \[
    4x(x-2)(x+2)=0.
  \]
  So our list of candidates is $\{ x=-2, x=0, x=2 \}$.
  4. For each of these we have to check whether it is a max, a min, or a saddle point.
    • For $x=-2$, we check $f'(-2.1)=4(-2.1)(-2.1-2)(-2.1+2) < 0$ and
      $f'(-1.9)=4(-1.9)(-1.9-2)(-1.9+2) > 0$, so $x=-2$ must be a minimum.
    • For $x=0$, we try $f'(-0.1)=4(-0.1)(-0.1-2)(-0.1+2) > 0$ and
      $f'(0.1)=4(0.1)(0.1-2)(0.1+2) < 0$, so we have a maximum.
    • For $x=2$, we check $f'(1.9)=4(1.9)(1.9-2)(1.9+2) < 0$
      and $f'(2.1)=4(2.1)(2.1-2)(2.1+2) > 0$, so $x=2$ must be a minimum.
  5. We don't have any constraints, so all of the above candidates make the cut.
  6. We add the two constraint boundaries $-\infty$ and $\infty$ to the list of candidates. At this point our final shortlist of candidates contains $\{ x=-\infty, x=-2, x=0, x=2, x=\infty \}$.
  7. We now evaluate the function $f(x)$ for each of the values to get location-value pairs $(x,f(x))$ like so: $\{ (-\infty,\infty),$ $(-2,340),$ $(0,356),$ $(2,340),$ $(\infty,\infty) \}$. Note that $f(\infty)=\lim_{x\to\infty} f(x) = \infty^4 - 8\infty^2+356 = \infty$, and similarly $f(-\infty)=\infty$.

We are done now. The function has no global maximum since it goes up to infinity. It has a local maximum at $x=0$ with value $356$ and two global minima at $x=-2$ and $x=2$ both of which have value $340$. Thank you, come again.
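As a sanity check (my addition, not part of the text), the whole example can be verified numerically in a few lines of Python:

```python
def f(x):
    return x**4 - 8*x**2 + 356

def fprime(x):
    return 4*x**3 - 16*x

# the candidates from Step 3: the solutions of f'(x) = 0
for x_star in (-2, 0, 2):
    assert fprime(x_star) == 0    # each candidate really solves f'(x) = 0
    left, right = fprime(x_star - 0.1), fprime(x_star + 0.1)
    kind = "max" if (left > 0 and right < 0) else "min"  # no saddles here
    print(x_star, f(x_star), kind)
# prints: -2 340 min / 0 356 max / 2 340 min
```
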

Alternate algorithm

Instead of checking nearby points to the left and to the right of each critical point, we can use an alternate Step 4 of the algorithm known as the second derivative test. Recall that the second derivative tells you the curvature of the function: if the second derivative is positive at a critical point $x^*$, then $x^*$ must be a minimum. If, on the other hand, the second derivative at the critical point is negative, then $x^*$ must be a maximum. If the second derivative is zero, the test is inconclusive.

Alternate Step 4

  • For each candidate $x^*$ in the list, check whether it is a max, a min, or a saddle point.
    • If $f^{\prime\prime}(x^*) < 0$ then $x^*$ is a max.
    • If $f^{\prime\prime}(x^*) > 0$ then $x^*$ is a min.
    • If $f^{\prime\prime}(x^*) = 0$, the test is inconclusive: revert back to checking the nearby values $f'(x^*-\epsilon)$ and $f'(x^*+\epsilon)$ to determine whether $x^*$ is a max, a min, or a saddle point.
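In code, the alternate step might look like this (my own sketch; `classify2` is a made-up name, and the fallback is the same nearby-value check used in the original Step 4):

```python
def classify2(fprime, fsecond, x_star, eps=0.1):
    """Second derivative test, with a fallback to the
    first-derivative sign check when f''(x*) = 0."""
    curvature = fsecond(x_star)
    if curvature < 0:
        return "max"
    if curvature > 0:
        return "min"
    # inconclusive: look at the slope just left and right of x*
    left, right = fprime(x_star - eps), fprime(x_star + eps)
    if left > 0 and right < 0:
        return "max"
    if left < 0 and right > 0:
        return "min"
    return "saddle"

# f(x) = x^5 has f'(x) = 5x^4 and f''(x) = 20x^3; at x = 0 the test is
# inconclusive and the fallback correctly reports a saddle point:
print(classify2(lambda x: 5*x**4, lambda x: 20*x**3, 0))  # prints: saddle
```
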

Limitations

The above optimization algorithm applies to differentiable functions of a single variable. It just so happens that most functions you will face in life are of this kind, so what you have learned is very general. Not all functions are differentiable, however. Functions with sharp corners, like the absolute value function $|x|$, are not differentiable everywhere, so we cannot use the algorithms above on them. Functions with jumps in them (like the Heaviside step function) are not continuous, and therefore not differentiable, so the algorithms cannot be used on them either.

There are also more general kinds of functions and optimization scenarios. We can optimize functions of multiple variables $f(x,y)$. You will learn how to do this in multivariable calculus. The techniques will be very similar to the above, but with more variables and intricate constraint regions.

Finally, I want to comment on the fact that you can only maximize one function at a time. Say a Chicago crime boss wanted to maximize both his funds $f(x)$ and his gangster street cred $g(x)$. This is not a well-posed problem: either you maximize $f(x)$ or you maximize $g(x)$, but you can't do both. There is no reason why a single $x^*$ should give the highest value of both $f(x)$ and $g(x)$. If both functions are important to you, you can make a new function that combines them, $F(x)=f(x)+g(x)$, and maximize $F(x)$. If gangster street cred is three times more important to you than funds, you could optimize $F(x)=f(x)+3g(x)$, but it is mathematically and logically impossible to maximize two things at the same time.
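To make the weighted-sum idea concrete, here is a toy Python example (my addition; the particular functions $f$ and $g$ are invented for illustration, not taken from the text):

```python
def f(x):   # "funds": a toy objective that peaks at x = 1
    return -(x - 1)**2

def g(x):   # "street cred": a toy objective that peaks at x = 3
    return -(x - 3)**2

def F(x):   # combined objective, cred weighted 3x more than funds
    return f(x) + 3*g(x)

# F'(x) = -2(x - 1) - 6(x - 3) = -8x + 20, so F'(x) = 0 at x* = 2.5:
# closer to g's peak than to f's, because g carries more weight.
x_star = 2.5
print(F(x_star))  # prints: -3.0
```
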

Exercises

The function $f(x)=x^3-2x^2+x$ has a local maximum on the interval $x \in [0,1]$. Find where this maximum occurs and the value of $f$ at that point. ANS:$\left(\frac{1}{3},\frac{4}{27}\right)$.

 