The Mean Value Theorem

Let's say we embark on a flight from Boston to Los Angeles. The flight lasts $6$ hours, covering roughly $3000 \text{ mi}$. The Mean Value Theorem (MVT) states that at some point during this flight, we're traveling at the average speed:

$$\dfrac{3000 \text{ mi}}{6 \text{ hr}} = 500 ~\frac{\text{mi}}{\text{hr}}$$

At various points in the flight we might have traveled faster than $500 ~\frac{\text{mi}}{\text{hr}}$, and at other points slower. MVT guarantees that, regardless, there is at least one moment during the flight when we were traveling at exactly $500 ~\frac{\text{mi}}{\text{hr}}$. The theorem is called the "Mean Value Theorem" because of this role played by the average (the mean).
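To make the flight example concrete, here is a minimal Python sketch. The function `position` is a hypothetical smooth flight profile (the plane starts and ends at rest); it is an assumption chosen only so that the trip covers 3000 mi in 6 hr, not data from a real flight. The helper `bisect` is likewise written just for this illustration. The sketch locates a moment at which the instantaneous speed equals the 500 mi/hr average.

```python
TOTAL_MILES = 3000.0
TOTAL_HOURS = 6.0

def position(t):
    """Hypothetical miles traveled after t hours (assumed smooth profile)."""
    u = t / TOTAL_HOURS
    return TOTAL_MILES * (3 * u**2 - 2 * u**3)

def speed(t):
    """Derivative of position: instantaneous speed in mi/hr."""
    u = t / TOTAL_HOURS
    return TOTAL_MILES * 6 * u * (1 - u) / TOTAL_HOURS

def bisect(func, lo, hi, tol=1e-10):
    """Find a root of func in [lo, hi], assuming func changes sign there."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid  # root is in the upper half
        else:
            hi = mid               # root is in the lower half
    return (lo + hi) / 2

average_speed = (position(TOTAL_HOURS) - position(0.0)) / TOTAL_HOURS

# In this profile the speed rises from 0 at takeoff to its peak at t = 3,
# so it must cross the average somewhere in between.
t_star = bisect(lambda t: speed(t) - average_speed, 0.0, TOTAL_HOURS / 2)

print(f"average speed: {average_speed:.1f} mi/hr")
print(f"speed equals the average at t = {t_star:.4f} hr")
```

Because this particular profile is symmetric, the speed also equals the average once more on the way back down, which is consistent with the theorem promising at least one such moment rather than exactly one.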

Stating the theorem formally:

Mean Value Theorem. Let $f(x)$ be a function where the following conditions are true:

  1. $f(x)$ is continuous when $a \leq x \leq b$.
  2. $f(x)$ is differentiable when $a < x < b$.

Then there exists a number $c$ where $a < c < b$ such that:

$$f'(c) = \dfrac{f(b) - f(a)}{b - a}$$

Or, equivalently:

$$f(b) - f(a) = f'(c)(b - a)$$

The first condition says that the graph is one unbroken curve linking the point at $a$ to the point at $b$. If $f$ were discontinuous, the graph could jump, and the values at $a$ and $b$ would no longer constrain what happens in between. The second condition says that we can determine the speed, or more generally the slope, at each point between $a$ and $b$.
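As a small worked instance of the theorem, the sketch below uses $f(x) = x^3$ on the interval from $0$ to $2$. Both the function and the interval are assumptions picked for illustration; for this $f$ the equation $f'(c) = \text{slope}$ can be solved by hand, so no root finder is needed.

```python
import math

def f(x):
    return x**3

def f_prime(x):
    return 3 * x**2

a, b = 0.0, 2.0
secant_slope = (f(b) - f(a)) / (b - a)

# Solve f'(c) = secant_slope by hand for this f: 3c^2 = slope, with c > 0.
c = math.sqrt(secant_slope / 3)

print(f"secant slope = {secant_slope}")
print(f"c = {c:.6f}, f'(c) = {f_prime(c):.6f}")
print("c lies strictly inside (a, b):", a < c < b)
```

Here $f$ is continuous on the closed interval and differentiable on the open one, so both conditions hold and a suitable $c$ is guaranteed to exist.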

The intuition behind MVT is best gained by illustration:

[Figure: Geometric interpretation]

In the diagram above, the blue line is the secant line between the points $(a, f(a))$ and $(b, f(b))$. This line has the slope:

$$\dfrac{f(b) - f(a)}{b - a}$$

What's the relationship between this slope and $f'(c)$? The red line is drawn parallel to the blue secant line, so it has the same slope. Initially, it's somewhere far from the blue secant line, but as we shift it closer and closer to the secant line, keeping it parallel, it will eventually touch the graph. The point where it touches the graph is the point $(c, f(c))$. There the red line is tangent to the graph, so its slope is $f'(c)$. And because the red line is parallel to the blue line, the two slopes are equal:

$$f'(c) = \dfrac{f(b) - f(a)}{b - a}$$

A few side points: First, when we apply MVT, we're focusing only on the interval $(a, b)$. We're not considering anything outside this interval. Second, the parallel line doesn't have to touch the graph at just one point. MVT merely guarantees that it touches at some point.[1]

But what happens if $f(x)$ looked something like this instead:

[Figure: Hypothetical]

In this case, we wouldn't get a tangent line by shifting the parallel line up from the bottom. Fortunately, there's a simple fix. Instead of having the parallel lines come from the bottom, have them come from the top:

[Figure: Parallel lines]

A critical issue we should always consider when applying MVT is whether $f$ is differentiable on the open interval $(a,b)$. All it takes is one non-differentiable point to prevent us from applying MVT. For example, consider a function that looks like:

[Figure: Error]

Shifting the parallel line up, we would never get a tangent line. There is no derivative at the red point. For MVT to apply, $f'$ must exist at all points $a < x < b$.
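A concrete function with this kind of corner is $f(x) = |x|$. The sketch below is an illustration under assumed choices (the interval from $-1$ to $2$ and the 999 sample points are arbitrary): it shows that the secant slope is never matched by the derivative anywhere the derivative exists.

```python
def f(x):
    return abs(x)

def f_prime(x):
    """Derivative of |x|; undefined at the corner x = 0."""
    if x == 0:
        raise ValueError("|x| is not differentiable at 0")
    return 1.0 if x > 0 else -1.0

a, b = -1.0, 2.0
secant_slope = (f(b) - f(a)) / (b - a)  # (2 - 1) / 3

# Sample interior points, skipping the corner at x = 0.
samples = [a + (b - a) * k / 1000 for k in range(1, 1000)]
slopes = {f_prime(x) for x in samples if x != 0}

print("secant slope:", secant_slope)
print("derivative values found:", sorted(slopes))
print("any match:", any(abs(s - secant_slope) < 1e-9 for s in slopes))
```

The derivative only ever takes the values $-1$ and $1$, so no point has slope $\tfrac{1}{3}$. The single non-differentiable point at $x = 0$ is enough to break the conclusion.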

Let's take a closer look at the formula:

$$\dfrac{f(b) - f(a)}{b - a} = f'(c)$$

Rewriting the equation:

$$f(b) - f(a) = [f'(c)](b-a)$$

And rewriting one more time:

$$f(b) = f(a) + [f'(c)](b-a)$$

Suppose that $a < b$. This means that $b - a$ is positive:

$$\begin{aligned} a & < b \\ 0 & < b - a \end{aligned}$$

This in turn means that the factor $(b-a)$ in our rewritten equation is positive:

$$f(b) = f(a) + [f'(c)]\textcolor{teal}{(b-a)}$$

This leads to several inferences. First, if $f'(c)$ is positive $(f'(c) > 0)$, then the term $[f'(c)](b-a)$ is also positive:

$$f(b) = f(a) + \textcolor{teal}{[f'(c)](b-a)}$$

And given the equation, it follows that:

$$f(b) > f(a)$$

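If $f'$ is positive at every point of an interval, this holds for every pair $a < b$ in that interval, whichever $c$ MVT happens to produce. From this analysis, we can conclude that:

$$(f'(c) > 0) \implies \text{$f$ is increasing}$$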

Following the same thought process, if $f'(c) < 0$, then the term $[f'(c)](b-a)$ is negative:

$$f(b) = f(a) + \textcolor{firebrick}{[f'(c)](b-a)}$$

Which leads to the conclusion:

$$(f'(c) < 0) \implies \text{$f$ is decreasing}$$

Finally, we have the case where $f'(c) = 0$. Where this occurs, we have:

$$\begin{aligned} f(b) &= f(a) + [f'(c)](b-a) \\ &= f(a) + [0](b-a) \\ &= f(a) + 0 \\ &= f(a) \end{aligned}$$

Which implies the conclusion:

$$(f'(c) = 0) \implies \text{$f$ is constant}$$

Now, because we can apply the mean value theorem on any interval of the function $f$ (provided the theorem's conditions are met), our inferences apply to $f$ more generally. They are powerful and useful propositions, so we state them as a lemma:

Lemma. Where the conditions of the mean value theorem hold on an interval, the following propositions hold on that interval:

  1. If $f' > 0$, then $f$ is increasing.
  2. If $f' < 0$, then $f$ is decreasing.
  3. If $f' = 0$, then $f$ is constant.
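The first two propositions can be spot-checked numerically. The sketch below uses $f(x) = x^3 - 3x$, an assumed example whose derivative $3x^2 - 3$ is negative between $-1$ and $1$ and positive beyond $1$. The helper names and sample ranges are choices made for this illustration; sampling is a sanity check, not a proof.

```python
def f(x):
    return x**3 - 3 * x

def f_prime(x):
    return 3 * x**2 - 3

def sample(lo, hi, n=200):
    """n + 1 evenly spaced points from lo to hi."""
    return [lo + (hi - lo) * k / n for k in range(n + 1)]

def is_increasing(values):
    return all(u < v for u, v in zip(values, values[1:]))

def is_decreasing(values):
    return all(u > v for u, v in zip(values, values[1:]))

# For this particular f: f' < 0 on (-1, 1) and f' > 0 on (1, 3).
falling = sample(-0.99, 0.99)
rising = sample(1.01, 3.0)

print(all(f_prime(x) < 0 for x in falling), is_decreasing([f(x) for x in falling]))
print(all(f_prime(x) > 0 for x in rising), is_increasing([f(x) for x in rising]))
```

On each sampled range the sign of the derivative and the direction of the function agree, as the lemma predicts.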

With these inferences drawn, we now point out a few nuances with MVT. MVT is rarely, if ever, actually used to find some value $c$. This is because MVT never tells you how many such values of $c$ there are. It could be a hundred, it could be just one. Instead, MVT is more often used as a tool for drawing larger conclusions, particularly through the inferences we stated in the lemma above.

In applied mathematics, MVT is used to show that an average rate of change lies between two extremes. For example, suppose $v_{min}$ is the minimum velocity of some vehicle over the time interval $a \leq t \leq b$, and $v_{max}$ is its maximum velocity over that interval. Then suppose the vehicle's position is $s(a)$ at time $a$ and $s(b)$ at time $b$.

MVT tells us that the average velocity equals $s'(c) = v(c)$ for some $c$ between $a$ and $b$, which allows us to conclude that:

$$v_{min} \leq \dfrac{s(b) - s(a)}{b - a} \leq v_{max}$$

where $v_{min}$ and $v_{max}$ are taken over $a \leq t \leq b$ (since the minimum and maximum can occur at the endpoints). In essence, MVT's greatest value lies in what it implies, rather than what it states. To examine this value, let's consider inequalities.

Suppose we're given the following inequality:

$$e^x > 1 + x ~\text{ where }~ x > 0$$

Question: Is this proposition true? Let's start by moving everything to one side and naming the result as a function:

$$\begin{aligned} e^x &> 1 + x \\ e^x - (1+x) &> 0 \\ f(x) &= e^x - (1+x) \end{aligned}$$

We have the function $f(x) = e^x - (1+x)$, and the proposition becomes $f(x) > 0$. Where $x = 0$, we have:

$$\begin{aligned} f(0) &= e^0 - (1+0) \\ &= 1 - (1) \\ &= 0 \end{aligned}$$

If we differentiate $f$, we get:

$$\begin{aligned} f(x) &= e^x - (1+x) \\ f'(x) &= (e^x)' - (1)' - (x)' \\ &= e^x - 0 - 1 \\ &= e^x - 1 \end{aligned}$$
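The result $f'(x) = e^x - 1$ can be sanity-checked numerically. This sketch compares a central-difference estimate of the derivative against $e^x - 1$ at a few sample points; the sample points and the step size `h` are arbitrary choices made for illustration.

```python
import math

def f(x):
    """f(x) = e^x - (1 + x)."""
    return math.exp(x) - (1 + x)

def f_prime(x):
    """Claimed derivative: e^x - 1."""
    return math.exp(x) - 1

def central_difference(func, x, h=1e-6):
    """Numerical estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

for x in [0.1, 0.5, 1.0, 2.0, 5.0]:
    estimate = central_difference(f, x)
    print(f"x = {x:4.1f}  estimate = {estimate:12.6f}  e^x - 1 = {f_prime(x):12.6f}")
```

The two columns should agree to several decimal places, and every value is positive for the sampled $x > 0$.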

We know that $e^x > 1$ where $x > 0$. Thus, it follows that $f'(x) > 0$ where $x > 0$. By the lemma, $f$ is increasing there, which means that:

$$f(x) > f(0) ~\text{ where }~ x > 0$$

Plugging in our function definition, we get:

$$\begin{aligned} f(x) &> f(0) ~\text{ where }~ x > 0 \\ e^x - (1 + x) &> 0 \\ e^x &> 1 + x \end{aligned}$$

This proves that the proposition $e^x > 1 + x$ where $x > 0$ is true. Now consider another problem:

Problem. Is $e^x$ greater than $1+x+\dfrac{x^2}{2}$ where $x > 0$?

Abstracting the problem statement, we're asked to determine whether the proposition:

$$e^x > 1 + x + \dfrac{x^2}{2} ~\text{ where }~ x > 0$$

is true. Once more, we use the same technique. We move everything to one side and name the result as a function:

$$\begin{aligned} e^x &> 1 + x + \dfrac{x^2}{2} \\[1em] e^x - \left(1 + x + \dfrac{x^2}{2}\right) &> 0 \\[1em] g(x) &= e^x - \left(1 + x + \dfrac{x^2}{2}\right) \end{aligned}$$

Where $x = 0$, we have:

$$\begin{aligned} g(x) &= e^x - \left(1 + x + \dfrac{x^2}{2}\right) \\[1em] g(0) &= e^0 - \left(1 + 0 + \dfrac{0^2}{2}\right) \\[1em] &= 1 - (1 + 0 + 0) \\[1em] &= 1 - 1 \\[1em] &= 0 \end{aligned}$$

Then we differentiate $g(x)$:

$$\begin{aligned} g'(x) &= (e^x)' - (1)' - (x)' - \left( \dfrac{x^2}{2} \right)' \\[1em] &= e^x - 0 - 1 - x \\[1em] &= e^x - (1 + x) \end{aligned}$$

Given the derivative $g'(x)$, we know that:

$$g'(x) = e^x - (1 + x) > 0 ~\text{ where }~ x > 0$$

from the previous problem. So $g$ is increasing where $x > 0$, which tells us that $g(x) > g(0) = 0$ there. This implies that the proposition:

$$e^x > 1 + x + \dfrac{x^2}{2} ~\text{ where }~ x > 0$$

is true. It turns out that we can keep going. For $x > 0$ and every positive integer $n$:

$$e^x > 1 + x + \dfrac{x^2}{2} + \dfrac{x^3}{3 \cdot 2} + \dfrac{x^4}{4 \cdot 3 \cdot 2} + \ldots + \dfrac{x^n}{n!}$$

What this tells us is that the left-hand side of the inequality, $e^x$, grows faster than any of these finite sums, but if we go infinitely far, the right-hand side eventually catches up: the infinite sum equals $e^x$.
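The "catching up" can be watched numerically. The sketch below compares $e^x$ with the sum of the first few terms on the right-hand side; the value $x = 2$ and the chosen term counts are arbitrary, and `partial_sum` is a helper written for this illustration.

```python
import math

def partial_sum(x, n):
    """1 + x + x^2/2! + ... + x^n/n!"""
    return sum(x**k / math.factorial(k) for k in range(n + 1))

x = 2.0
for n in [1, 2, 4, 8, 16]:
    s = partial_sum(x, n)
    print(f"n = {n:2d}  partial sum = {s:.10f}  gap = {math.exp(x) - s:.3e}")
```

For this positive $x$ the gap stays positive at every step while shrinking toward zero as more terms are included.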

Footnotes

  1. In mathematics, the word "some" is shorthand for "at least one." It could be a million points, a thousand, or just one. In all of those cases, there is some point.