Derivative Manipulation

We can think of derivative formulas as falling into two categories: (1) the specific, and (2) the general. A specific formula tells us how to differentiate a particular form of function. For example, f(x)=xn{f(x) = x^n} has the derivative fβ€²(x)=nxnβˆ’1.{f'(x) = nx^{n-1}.}

In contrast, general formulas are those that apply to all differentiable functions. For example:

  • (f+g)β€²=fβ€²+gβ€²{(f + g)' = f' + g'}
  • Cβ‹…u=C(uβ€²){C \cdot u = C(u')}

In the first formula above: The derivative of the sum of two functions is the sum of the derivatives of the two functions. And in the second: The derivative of a constant multiplied by a function is the constant multiplied by the derivative of the function. Both specific and general formulas are needed to differentiate polynomials.

The Sum Rule

What is the derivative of f(x)=3x3+2x2?{f(x) = 3x^3 + 2x^2?} To compute this derivative, we must apply the sum rule for derivatives. To do so, let's derive the general rule. The function f(x)=3x3+2x2{f(x) = 3x^3 + 2x^2} can be written as the sum of two functions: Suppose g(x)=3x3{g(x) = 3x^3} and j(x)=2x2.{j(x) = 2x^2.} Then f(x)=g(x)+j(x).{f(x) = g(x) + j(x).} With these in place, we apply the definition of a derivative:

fβ€²(x)=lim⁑hβ†’0f(x+h)βˆ’f(x)h f'(x) = \lim\limits_{h \to 0} \dfrac{f(x + h) - f(x)}{h}

Now we substitute f(x+h)=g(x+h)+j(x+h){f(x + h) = g(x + h) + j(x + h)} and f(x)=g(x)+j(x):{f(x) = g(x) + j(x):}

fβ€²(x)=lim⁑hβ†’0(g(x+h)+j(x+h))βˆ’(g(x)+j(x))h f'(x) = \lim\limits_{h \to 0} \dfrac{(g(x + h) + j(x + h)) - (g(x) + j(x))}{h}

Rearranging:

fβ€²(x)=lim⁑hβ†’0(g(x+h)βˆ’g(x)h+j(x+h)βˆ’j(x)h) f'(x) = \lim\limits_{h \to 0} \left( \dfrac{g(x+h) - g(x)}{h} + \dfrac{j(x + h) - j(x)}{h} \right)

And applying the sum law for limits and the definition of a derivative:

fβ€²(x)=lim⁑hβ†’0(g(x+h)βˆ’g(x)h)+lim⁑hβ†’0(j(x+h)βˆ’j(x)h)=gβ€²(x)+jβ€²(x) \begin{aligned} f'(x) &= \lim\limits_{h \to 0} \left( \dfrac{g(x+h)-g(x)}{h} \right) + \lim\limits_{h \to 0}\left( \dfrac{j(x+h)-j(x)}{h} \right) \\[1em] &= g'(x) + j'(x) \end{aligned}

The above derivation yields the following:

Sum Rule for Derivatives. Let f(x){f(x)} and g(x){g(x)} be differentiable functions. Then the derivative of the sum of f{f} and g{g} is the sum of the derivative of f{f} and the derivative of g:{g:}

ddx(f(x)+g(x))=ddx(f(x))+ddx(g(x)) \dfrac{d}{dx}(f(x) + g(x)) = \dfrac{d}{dx}(f(x)) + \dfrac{d}{dx}(g(x))

Alternatively, given the differentiable functions r(x),{r(x),} s(x),{s(x),} and t(x),{t(x),} and r(x)=s(x)+t(x),{r(x) = s(x) + t(x),} then:

rβ€²(x)=sβ€²(x)+tβ€²(x) r'(x) = s'(x) + t'(x)

Thus, returning to our original question, the derivative of f(x)=3x3+2x2{f(x) = 3x^3 + 2x^2} is simply the sum of the derivatives of its terms:

fβ€²(x)=9x2+4x,{f'(x) = 9x^2 + 4x,} by the sum rule and the power rule.
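
Written out term by term, the sum rule splits the derivative across the two terms:

fβ€²(x)=(3x3)β€²+(2x2)β€²=9x2+4x f'(x) = (3x^3)' + (2x^2)' = 9x^2 + 4x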

The Difference Rule

Now let's consider the opposite of addition: What is the derivative of p(x)=2x5βˆ’x3?{p(x) = 2x^5 - x^3?} Here too we can think of p(x){p(x)} as consisting of two separate functions: Given q(x)=2x5{q(x) = 2x^5} and r(x)=x3,{r(x) = x^3,} p(x)=q(x)βˆ’r(x).{p(x) = q(x) - r(x).} Again we apply the definition of a derivative:

pβ€²(x)=lim⁑hβ†’0p(x+h)βˆ’p(x)h p'(x) = \lim\limits_{h \to 0} \dfrac{p(x + h) - p(x)}{h}

Now substitute: p(x+h)=q(x+h)βˆ’r(x+h){p(x + h) = q(x + h) - r(x + h)} and p(x)=q(x)βˆ’r(x):{p(x) = q(x) - r(x):}

pβ€²(x)=lim⁑hβ†’0(q(x+h)βˆ’r(x+h))βˆ’(q(x)βˆ’r(x))h p'(x) = \lim\limits_{h \to 0} \dfrac{(q(x + h) - r(x + h)) - (q(x) - r(x))}{h}

Then rearranging:

pβ€²(x)=lim⁑hβ†’0(q(x+h)βˆ’q(x)hβˆ’r(x+h)βˆ’r(x)h) p'(x) = \lim\limits_{h \to 0} \left( \dfrac{q(x + h) - q(x)}{h} - \dfrac{r(x + h) - r(x)}{h} \right)

Applying the difference law for limits and the definition of a derivative:

pβ€²(x)=lim⁑hβ†’0(q(x+h)βˆ’q(x)h)βˆ’lim⁑hβ†’0(r(x+h)βˆ’r(x)h)=qβ€²(x)βˆ’rβ€²(x) \begin{aligned} p'(x) &= \lim\limits_{h \to 0} \left(\dfrac{q(x + h) - q(x)}{h}\right) - \lim\limits_{h \to 0} \left(\dfrac{r(x + h) - r(x)}{h}\right) \\[1em] &= q'(x) - r'(x) \end{aligned}

From our analysis, we have the following rule:

Difference Rule for Derivatives. Let f(x){f(x)} and g(x){g(x)} be differentiable functions. Then the derivative of the difference of f{f} and g{g} is the difference of the derivative of f{f} and the derivative of g:{g:}

ddx(f(x)βˆ’g(x))=ddx(f(x))βˆ’ddx(g(x)) \dfrac{d}{dx}(f(x) - g(x)) = \dfrac{d}{dx}(f(x)) - \dfrac{d}{dx}(g(x))

Alternatively, given the differentiable functions r(x),{r(x),} s(x),{s(x),} and t(x),{t(x),} and that r(x)=s(x)βˆ’t(x):{r(x) = s(x) - t(x):}

rβ€²(x)=sβ€²(x)βˆ’tβ€²(x) r'(x) = s'(x) - t'(x)

Applying our newly derived rule, the derivative of p(x)=2x5βˆ’x3{p(x) = 2x^5 - x^3} is:

pβ€²(x)=10x4βˆ’3x2,{p'(x) = 10x^4 - 3x^2,} by the difference rule and the power rule.

Constant Multiple Rule

What is the derivative of f(x)=(x3βˆ’x2)(Ο€2)?{f(x) = (x^3 - x^2)\left(\dfrac{\pi}{2}\right)?} This may seem complex, but we know enough to handle it. To begin with, the term Ο€2{\dfrac{\pi}{2}} is just a constant. Thus, we can really write this function as f(x)=C(g(x)),{f(x) = C(g(x)),} where g(x)=x3βˆ’x2{g(x) = x^3 - x^2} and C=Ο€2.{C = \dfrac{\pi}{2}.} Let's come up with a rule for computing the derivative of f(x)=C(g(x)).{f(x) = C(g(x)).}

Because we can factor constants out of limits, applying the derivative definition is straightforward. First, note that the derivative fβ€²(x){f'(x)} is the derivative (C(g(x)))β€².{(C(g(x)))'.} Thus:

fβ€²(x)=(C(g(x)))β€²=lim⁑hβ†’0Cg(x+h)βˆ’Cg(x)h=Clim⁑hβ†’0g(x+h)βˆ’g(x)h=Cgβ€²(x) \begin{aligned} f'(x) = (C(g(x)))' &= \lim\limits_{h \to 0} \dfrac{Cg(x + h) - Cg(x)}{h} \\[1em] &= C \lim\limits_{h \to 0} \dfrac{g(x+h) - g(x)}{h} \\[1em] &= Cg'(x) \end{aligned}

From our derivation above, we have the following rule:

Constant Multiple Rule. Given the function Cf(x),{Cf(x),} where f(x){f(x)} is differentiable and C{C} is a constant:

ddxCf(x)=Cβ‹…ddxf(x) \dfrac{d}{dx}Cf(x) = C \cdot \dfrac{d}{dx}f(x)

Thus, the derivative of f(x)=(x3βˆ’x2)(Ο€2){f(x) = (x^3 - x^2)\left(\dfrac{\pi}{2}\right)} is a simple application of the constant multiple rule, the difference rule, and the power rule:

fβ€²(x)=(3x2βˆ’2x)(Ο€2) f'(x) = (3x^2 - 2x)\left(\dfrac{\pi}{2}\right)
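
As a quick check, we can also distribute the constant first and differentiate term by term; the result is the same:

fβ€²(x)=ddx(Ο€2x3βˆ’Ο€2x2)=Ο€2(3x2)βˆ’Ο€2(2x)=(3x2βˆ’2x)(Ο€2) f'(x) = \dfrac{d}{dx}\left(\dfrac{\pi}{2}x^3 - \dfrac{\pi}{2}x^2\right) = \dfrac{\pi}{2}(3x^2) - \dfrac{\pi}{2}(2x) = (3x^2 - 2x)\left(\dfrac{\pi}{2}\right)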

The Product Rule

It may be tempting to think that the derivative of a product is the product of the derivatives. This is wrong, and it's easy to see why. Given the function f(x)=x2,{f(x) = x^2,} the derivative of f{f} is fβ€²(x)=2x.{f'(x) = 2x.} However, the function f(x)=x2{f(x) = x^2} can be written as f(x)=xβ‹…x.{f(x) = x \cdot x.} If it were the case that the derivative of a product were the product of the derivatives, then we would have ddx(x2)=ddx(x)β‹…ddx(x)=1β‹…1=1.{\frac{d}{dx} (x^2) = \frac{d}{dx}(x) \cdot \frac{d}{dx}(x) = 1 \cdot 1 = 1.} Wrong. So what is the rule for products?

Well, let's apply the definition of a derivative. Suppose we have two functions, f(x){f(x)} and g(x),{g(x),} both of which are differentiable. Now let's say that the function k(x){k(x)} is the product of these two functions:

k(x)=f(x)g(x) k(x) = f(x)g(x)

Now apply the definition of a derivative to the function k(x):{k(x):}

kβ€²(x)=lim⁑hβ†’0f(x+h)g(x+h)βˆ’f(x)g(x)h k'(x) = \lim\limits_{h \to 0} \dfrac{f(x + h)g(x + h) - f(x)g(x)}{h}

Let's apply the old trick of adding zero. We add and subtract f(x)g(x+h){f(x)g(x + h)} in the numerator:

kβ€²(x)=lim⁑hβ†’0f(x+h)g(x+h)βˆ’f(x)g(x+h)+f(x)g(x+h)βˆ’f(x)g(x)h k'(x) = \lim\limits_{h \to 0} \dfrac{f(x + h)g(x + h) - f(x)g(x + h) + f(x)g(x + h) - f(x)g(x)}{h}

Splitting the fraction and applying the sum law for limits:

kβ€²(x)=lim⁑hβ†’0(f(x+h)g(x+h)βˆ’f(x)g(x+h)h)+lim⁑hβ†’0(f(x)g(x+h)βˆ’f(x)g(x)h) k'(x) = \lim\limits_{h \to 0} \left( \dfrac{f(x + h)g(x + h) - f(x)g(x + h)}{h} \right) + \lim\limits_{h \to 0} \left( \dfrac{f(x)g(x + h) - f(x)g(x)}{h} \right)

Factoring g(x+h){g(x + h)} out of the first numerator and f(x){f(x)} out of the second:

kβ€²(x)=lim⁑hβ†’0(f(x+h)βˆ’f(x)hβ‹…g(x+h))+lim⁑hβ†’0(g(x+h)βˆ’g(x)hβ‹…f(x)) k'(x) = \lim\limits_{h \to 0} \left( \dfrac{f(x + h) - f(x)}{h} \cdot g(x + h) \right) + \lim\limits_{h \to 0} \left( \dfrac{g(x + h) - g(x)}{h} \cdot f(x) \right)

Now, we know that g(x){g(x)} is differentiable (by assumption), so it follows that g(x){g(x)} is continuous. And since g(x){g(x)} is continuous, lim⁑hβ†’0g(x+h)=g(x).{\lim\limits_{h \to 0}g(x + h) = g(x).} Hence:

kβ€²(x)=lim⁑hβ†’0f(x+h)βˆ’f(x)h⏟fβ€²(x)β‹…lim⁑hβ†’0g(x+h)⏟g(x)+lim⁑hβ†’0g(x+h)βˆ’g(x)h⏟gβ€²(x)β‹…lim⁑hβ†’0f(x)⏟f(x)=fβ€²(x)g(x)+gβ€²(x)f(x) \begin{aligned} k'(x) &= \underbrace{\cancel{\lim\limits_{h \to 0} \dfrac{f(x + h) - f(x)}{h}}}_{f'(x)} \cdot \underbrace{\cancel{\lim\limits_{h \to 0} g(x + h)}}_{g(x)} + \underbrace{\cancel{\lim\limits_{h \to 0} \dfrac{g(x + h) - g(x)}{h}}}_{g'(x)} \cdot \underbrace{\cancel{\lim\limits_{h \to 0} f(x)}}_{f(x)} \\ &= f'(x)g(x) + g'(x)f(x) \end{aligned}

We now have a general derivative formula, the product rule:

Product Rule. Given differentiable functions f{f} and g,{g,} the derivative of fg,{fg,} denoted (fg)β€²,{(fg)',} is the sum of the derivative of the function f,{f,} denoted fβ€²,{f',} times g,{g,} and the derivative of the function g,{g,} denoted gβ€²,{g',} times f.{f.} I.e.,

(fg)β€²=fβ€²g+fgβ€² (fg)' = f'g + fg'

For example, suppose we are given the functions f(x)=xn{f(x) = x^n} and g(x)=sin⁑x.{g(x) = \sin x.} The product of these functions:

(fβ‹…g)(x)=xnsin⁑x (f \cdot g)(x) = x^n \sin x

What is the derivative of (fβ‹…g)?{(f \cdot g)?} We apply the product rule:

(fβ‹…g)β€²=fβ€²g+fgβ€²=(xn)β€²sin⁑x+(xn)(sin⁑x)β€²=(nxnβˆ’1)(sin⁑x)+(xn)(cos⁑x) \begin{aligned} (f \cdot g)' &= f'g + fg' \\[1em] &= (x^n)' \sin x + (x^n) (\sin x)' \\[1em] &= (nx^{n-1})(\sin x) + (x^n)(\cos x) \end{aligned}
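
As a sanity check, the product rule also resolves the earlier xβ‹…x{x \cdot x} example correctly:

(xβ‹…x)β€²=(x)β€²x+x(x)β€²=1β‹…x+xβ‹…1=2x (x \cdot x)' = (x)'x + x(x)' = 1 \cdot x + x \cdot 1 = 2x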

Quotient Rule

What is the derivative of t(x)=5x+13xβˆ’4?{t(x) = \dfrac{5x + 1}{3x - 4}?} For this, we need another rule. Never, ever, ever think that ddxf(x)g(x)=ddx(f(x))ddx(g(x)).{\frac{d}{dx} \dfrac{f(x)}{g(x)} = \dfrac{\frac{d}{dx}(f(x))}{\frac{d}{dx}(g(x))}.} This is absolutely wrong, and it is one of the most common mistakes among calculus newcomers.

As always, let's generalize our problem and construct a rule. Suppose f(x)=5x+1{f(x) = 5x + 1} and g(x)=3xβˆ’4.{g(x) = 3x - 4.} Thus, t(x)=f(x)g(x).{t(x) = \dfrac{f(x)}{g(x)}.} Accordingly, we want to compute (fg)β€².{\left(\dfrac{f}{g}\right)'.} Applying the definition of the derivative:

(fg)β€²=lim⁑hβ†’0f(x+h)g(x+h)βˆ’f(x)g(x)h=lim⁑hβ†’01hf(x+h)g(x)βˆ’f(x)g(x+h)g(x+h)g(x)=lim⁑hβ†’01hf(x+h)g(x)βˆ’f(x)g(x)+f(x)g(x)βˆ’f(x)g(x+h)g(x+h)g(x)=lim⁑hβ†’01g(x+h)g(x)f(x+h)g(x)βˆ’f(x)g(x)+f(x)g(x)βˆ’f(x)g(x+h)h=lim⁑hβ†’01g(x+h)g(x)(f(x+h)g(x)βˆ’f(x)g(x)h+f(x)g(x)βˆ’f(x)g(x+h)h)=lim⁑hβ†’01g(x+h)g(x)(g(x)f(x+h)βˆ’f(x)hβˆ’f(x)g(x+h)βˆ’g(x)h)=1(lim⁑hβ†’0g(x+h))(lim⁑hβ†’0g(x))((lim⁑hβ†’0g(x))(lim⁑hβ†’0f(x+h)βˆ’f(x)h)βˆ’(lim⁑hβ†’0f(x))(lim⁑hβ†’0g(x+h)βˆ’g(x)h))=1(lim⁑hβ†’0g(x+h))⏟g(x)(lim⁑hβ†’0g(x))⏟g(x)((lim⁑hβ†’0g(x))⏟g(x)(lim⁑hβ†’0f(x+h)βˆ’f(x)h)⏟fβ€²(x)βˆ’(lim⁑hβ†’0f(x))⏟f(x)(lim⁑hβ†’0g(x+h)βˆ’g(x)h)⏟gβ€²(x))=1g(x)g(x)(g(x)fβ€²(x)βˆ’f(x)gβ€²(x))=fβ€²gβˆ’fgβ€²g2 \small \begin{aligned} \left( \dfrac{f}{g} \right)' &= \lim\limits_{h \to 0} \dfrac{ \dfrac{f(x+h)}{g(x+h)} - \dfrac{f(x)}{g(x)} }{h} \\[1em] &= \lim\limits_{h \to 0} \dfrac{1}{h} \dfrac{f(x+h)g(x) - f(x)g(x+h)}{g(x+h)g(x)} \\[1em] &= \lim\limits_{h \to 0} \dfrac{1}{h} \dfrac{f(x+h)g(x) - f(x)g(x) + f(x)g(x) - f(x)g(x+h)}{g(x+h)g(x)} \\[1em] &= \lim\limits_{h \to 0} \dfrac{1}{g(x+h)g(x)} \dfrac{f(x+h)g(x) - f(x)g(x) + f(x)g(x) - f(x)g(x+h)}{h} \\[1em] &= \lim\limits_{h \to 0} \dfrac{1}{g(x+h)g(x)} \left( \dfrac{f(x+h)g(x) - f(x)g(x)}{h} + \dfrac{f(x)g(x) - f(x)g(x+h)}{h} \right) \\[1em] &= \lim\limits_{h \to 0} \dfrac{1}{g(x + h)g(x)} \left( g(x) \dfrac{f(x + h) - f(x)}{h} - f(x) \dfrac{g(x+h) - g(x)}{h} \right) \\[1em] &= \dfrac {1} {\left(\lim\limits_{h \to 0}g(x+h)\right)\left(\lim\limits_{h \to 0}g(x)\right)} \left(\left(\lim\limits_{h \to 0}g(x)\right) \left(\lim\limits_{h \to 0} \dfrac {f(x+h) - f(x)} {h}\right) - \left(\lim\limits_{h \to 0}f(x)\right) \left( \lim\limits_{h \to 0} \dfrac{g(x + h) - g(x)}{h}\right)\right) \\[1em] &= \dfrac{1}{\underbrace{\cancel{\left(\lim\limits_{h \to 0}g(x+h)\right)}}_{g(x)}\underbrace{\cancel{\left(\lim\limits_{h \to 0}g(x)\right)}}_{g(x)}} \left(\underbrace{\cancel{\left(\lim\limits_{h \to 0}g(x)\right)}}_{g(x)} \underbrace{\cancel{\left(\lim\limits_{h \to 0} \dfrac {f(x+h) - f(x)} {h}\right)}}_{f'(x)} - \underbrace{\cancel{\left(\lim\limits_{h \to 0}f(x)\right)}}_{f(x)} \underbrace{\cancel{\left( \lim\limits_{h \to 0} \dfrac{g(x + h) - g(x)}{h}\right)}}_{g'(x)}\right) \\[1em] &= \dfrac{1}{g(x)g(x)} \left( g(x)f'(x) - f(x)g'(x) \right) \\[1em] &= \dfrac{f'g - fg'}{g^2} \end{aligned}

From the derivation above, we have the following rule:

Quotient Rule. Given differentiable functions f{f} and g{g} with g(x)β‰ 0,{g(x) \neq 0,} the derivative of fg,{\dfrac{f}{g},} denoted (fg)β€²,{\left(\dfrac{f}{g}\right)',} is provided by the following:

(fg)β€²=fβ€²gβˆ’fgβ€²g2 \left(\dfrac{f}{g}\right)' = \dfrac{f'g - fg'}{g^2}

Accordingly, the derivative of t(x)=5x+13xβˆ’4{t(x) = \dfrac{5x + 1}{3x - 4}} is:

tβ€²(x)=ddx(5x+1)β‹…(3xβˆ’4)βˆ’(5x+1)β‹…ddx(3xβˆ’4)(3xβˆ’4)2=(5)(3xβˆ’4)βˆ’(5x+1)(3)(3xβˆ’4)2=15xβˆ’20βˆ’15xβˆ’3(3xβˆ’4)2=βˆ’23(3xβˆ’4)2 \begin{aligned} t'(x) &= \dfrac{\frac{d}{dx}(5x + 1) \cdot (3x - 4) - (5x + 1) \cdot \frac{d}{dx}(3x - 4)}{(3x - 4)^2} \\[1em] &= \dfrac{(5)(3x - 4) - (5x + 1)(3)}{(3x - 4)^2} \\[1em] &= \dfrac{15x - 20 - 15x - 3}{(3x - 4)^2} \\[1em] &= - \dfrac{23}{(3x - 4)^2} \end{aligned}
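
The rule also gives a quick way to differentiate reciprocals. For instance, taking f(x)=1{f(x) = 1} and g(x)=x:{g(x) = x:}

ddx(1x)=(0)(x)βˆ’(1)(1)x2=βˆ’1x2 \dfrac{d}{dx}\left(\dfrac{1}{x}\right) = \dfrac{(0)(x) - (1)(1)}{x^2} = -\dfrac{1}{x^2}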

Chain Rule

In many of the previous examples, we abstracted the function's terms to arrive at a more general rule. For example, given the function f(x)=x3βˆ’(1/x),{f(x) = x^3 - (1/x),} we would write f(x)=t(x)βˆ’s(x),{f(x) = t(x) - s(x),} where t(x)=x3{t(x) = x^3} and s(x)=1/x.{s(x) = 1/x.} We would then arrive at a derivative from this more abstracted form. This technique results from two observations: First, every function can be expressed as the combination of smaller functions. Borrowing from computer science, we might call these smaller functions helper functions. For example, silly as it may be, the function f(x)=x2{f(x) = x^2} can be written as f(x)=g(x),{f(x) = g(x),} where g(x)=x2.{g(x) = x^2.} Or it can be written as f(x)=(g(x))2{f(x) = (g(x))^2}, where g(x)=x.{g(x) = x.} Second, there are really only three ways to combine functions.

First, we can add functions. For example, the function f(x)=sin⁑x+x2{f(x) = \sin x + x^2} can be written as f(x)=g(x)+h(x),{f(x) = g(x) + h(x),} where g(x)=sin⁑x{g(x) = \sin x} and h(x)=x2.{h(x) = x^2.} The ability to add functions implies the ability to subtract functions, since subtraction is really just the addition of a negative. For example, the function a(x)=x3βˆ’x2{a(x) = x^3 - x^2} can be written as a(x)=b(x)+c(x),{a(x) = b(x) + c(x),} where b(x)=x3{b(x) = x^3} and c(x)=βˆ’x2.{c(x) = -x^2.}

Second, we can multiply functions. The function Ξ±(x)=(sin⁑x)(x2){\alpha(x) = (\sin x)(x^2)} can be written as Ξ±(x)=Ξ²(x)Ξ³(x),{\alpha(x) = \beta(x) \gamma(x),} where Ξ²(x)=sin⁑x{\beta(x) = \sin x} and Ξ³(x)=x2.{\gamma(x) = x^2.} The ability to multiply functions implies the ability to divide functions, since division is just multiplication with the reciprocal. The function L(x)=sin⁑xx5{L(x) = \dfrac{\sin x}{x^5}} can be written as L(x)=M(x)β‹…N(x){L(x) = M(x) \cdot N(x)} where M(x)=sin⁑x{M(x) = \sin x} and N(x)=1x5.{N(x) = \dfrac{1}{x^5}.}

Third, we can compose, or nest, functions. For example, the function f(x)=(x2βˆ’1)2{f(x) = (x^2 - 1)^2} can be written as f(x)=(g(x))2,{f(x) = (g(x))^2,} where g(x)=x2βˆ’1.{g(x) = x^2 - 1.} Similarly, the function ΞΊ(x)=sin⁑(x3){\kappa(x) = \sin (x^3)} can be written as ΞΊ(x)=sin⁑(Ξ»(x)),{\kappa(x) = \sin (\lambda(x)),} where Ξ»(x)=x3.{\lambda(x) = x^3.}

Borrowing again from computer science, these are the three means of combination. The remarkable aspect of functions is that every function can be written and rewritten, or expressed, with just these three means.

So far, we've used our abstraction technique for the first two means: addition and multiplication. What we haven't used it for, however, is composition. This limitation leads to some problems. We haven't yet seen a way to apply the technique to a function like s(a)=(a2βˆ’1)2.{s(a) = (a^2 - 1)^2.} Fortunately, we can simply expand this function: s(a)=a4βˆ’2a2+1,{s(a) = a^4 - 2a^2 + 1,} then apply our familiar rules.
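
Differentiating the expanded form term by term:

sβ€²(a)=dda(a4βˆ’2a2+1)=4a3βˆ’4a s'(a) = \dfrac{d}{da}(a^4 - 2a^2 + 1) = 4a^3 - 4a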

Expansion, however, begins falling apart once we have something like (29a4βˆ’17a3+32a2βˆ’17)15.{(29a^4 - 17a^3 + 32a^2 - 17)^{15}.} This is a polynomial that would be messy to expand. We'd sooner exhaust ourselves before seeing the polynomial's final form.

Worse, neither expansion nor our original method of generalization would help with functions like f(x)=cos⁑(x2){f(x) = \cos (x^2)} or t(x)=tan⁑(12x3βˆ’x).{t(x) = \tan \left(\dfrac{1}{2x^3 - x}\right).} With the rules we have thus far, it isn't exactly clear how we would compute the derivatives for these functions. All these functions are called composite functions β€” functions combined through composition, or nesting. The most efficient way to compute their derivatives is with a new theorem: the chain rule.

To understand the chain rule, we must first review how composite functions work. Composite functions are simply compositions of functions, and our original method of generalization applies to them as well. With the composite function y=(x3βˆ’1)100,{y = (x^3 - 1)^{100},} we generalize x3βˆ’1{x^3 - 1} to u=x3βˆ’1.{u = x^3 - 1.} Following this generalization, y=u100.{y = u^{100}.} Now, let's compute dydu:{\dfrac{dy}{du}:}

y=u100β€…β€ŠβŸΉβ€…β€Šdydu=100u99 y = u^{100} \implies \dfrac{dy}{du} = 100u^{99}

That was a straightforward application of the power rule. Now let's compute the derivative dudx:{\dfrac{du}{dx}:}

u=x3βˆ’1β€…β€ŠβŸΉβ€…β€Šdudx=uβ€²=3x2 u = x^3 - 1 \implies \dfrac{du}{dx} = u' = 3x^2

So now we have these two results:

dydu=100u99  dudx=uβ€²=3x2 \dfrac{dy}{du} = 100u^{99} \qquad \dfrac{du}{dx} = u' = 3x^2

But, what we ultimately want to find is dydx.{\dfrac{dy}{dx}.} Now, notice the way Leibniz's notation looks:

dydududx \dfrac{dy}{du} \dfrac{du}{dx}

Those look an awful lot like fractions, and it's tempting to multiply them and cancel:

dyduβ‹…dudx=dydx \dfrac{dy}{\cancel{du}} \cdot \dfrac{\cancel{du}}{dx} = \dfrac{dy}{dx}

Let's put a pin in this observation for now; we'll revisit it shortly. What we should do instead is think a little more carefully about what it means to compute the derivative of a nested function, say f(g(x)).{f(g(x)).} If we suppose that y=g(x),{y = g(x),} then f(g(x))=f(y).{f(g(x)) = f(y).} Looking at it this way, we can see that g(x){g(x)} outputs a y,{y,} and that y{y} is an input for f.{f.} Now let's say that z=f(y)=f(g(x)).{z = f(y) = f(g(x)).} Viewing it this way, when we ask for the derivative of f(g(x)),{f(g(x)),} what we're really asking is: how quickly (or slowly) does z{z} change as x{x} changes?

The answer to that question is the derivative. We know that a derivative is simply the limit of the Newton quotient. Thus, what we want to do is consider the Newton quotient of the composite function f∘g:{f \circ g:}

Suppose f{f} has a derivative at g(x){g(x)} and g{g} has a derivative at x.{x.} Then, by Newton's quotient:

(f∘g)(x+h)βˆ’(f∘g)(x)h=f(g(x+h))βˆ’f(g(x))h \dfrac{(f \circ g)(x + h) - (f \circ g)(x)}{h} = \dfrac{f(g(x + h)) - f(g(x))}{h}

This expression is a bit of an eyesore, so we'll replace some of the terms with simpler variables. Suppose:

u=g(x)  k=g(x+h)βˆ’g(x) u = g(x) \qquad k = g(x + h) - g(x)

Notice that with the above substitutions, k{k} depends on h.{h.} As h{h} approaches 0, k{k} tends to 0. Bearing this in mind, we can substitute the corresponding terms and arrive at the following:

f(u+k)βˆ’f(u)h \dfrac{f(u + k) - f(u)}{h}

This looks very similar to the derivative fβ€²(u),{f'(u),} except that the denominator is h{h} rather than k.{k.} Recall that k{k} depends on h,{h,} and because of that dependence, we have two cases to consider: k=0{k = 0} and kβ‰ 0.{k \neq 0.}

The easier (and more common) case is when kβ‰ 0,{k \neq 0,} so let's deal with it first. Suppose kβ‰ 0.{k \neq 0.} Then we can multiply and divide our expression by k:{k:}

f(u+k)βˆ’f(u)kβ‹…kh=f(u+k)βˆ’f(u)kβ‹…g(x+h)βˆ’g(x)h \dfrac{f(u + k) - f(u)}{k} \cdot \dfrac{k}{h} = \dfrac{f(u + k) - f(u)}{k} \cdot \dfrac{g(x + h) - g(x)}{h}

Examining the resulting equation, when h{h} approaches 0,{0,} several things occur. First, the quotient:

g(x+h)βˆ’g(x)h \dfrac{g(x + h) - g(x)}{h}

tends to gβ€²(x).{g'(x).} Second, kβ†’0{k \to 0} (β€œk{k} tends towards 0”) as hβ†’0{h \to 0} because k=g(x+h)βˆ’g(x){k = g(x + h) - g(x)} and g{g} is continuous at x{x} (by our initial assumptions). And because kβ†’0,{k \to 0,} the quotient:

f(u+k)βˆ’f(u)k \dfrac{f(u + k) - f(u)}{k}

approaches f′(u){f'(u)} as h→0.{h \to 0.} Hence, we have the following rule:

(f∘g)β€²(x)=fβ€²(g(x))β‹…gβ€²(x) (f \circ g)'(x) = f'(g(x)) \cdot g'(x)

The above rule, however, only applies when kβ‰ 0.{k \neq 0.} We must still consider the case where k=0.{k = 0.} This is somewhat of an edge case, but because the possibility exists, we must address it. Otherwise, our preceding derivation wouldn't apply in general.

First, say we have a number u{u} at which f{f} is differentiable. By the limit of the Newton quotient, we know that:

lim⁑kβ†’0f(u+k)βˆ’f(u)k=fβ€²(u) \lim\limits_{k \to 0} \dfrac{f(u + k) - f(u)}{k} = f'(u)

Now define Ο†(k){\varphi(k)} as the difference between this quotient and fβ€²(u):{f'(u):}

Ο†(k)=f(u+k)βˆ’f(u)kβˆ’fβ€²(u) \varphi(k) = \dfrac{f(u + k) - f(u)}{k} - f'(u)

It follows that, as k{k} approaches 0, Ο†(k){\varphi(k)} approaches 0. In other words:

lim⁑kβ†’0Ο†(k)=lim⁑kβ†’0(f(u+k)βˆ’f(u)kβˆ’fβ€²(u))=0 \lim\limits_{k \to 0} \varphi(k) = \lim\limits_{k \to 0}\left( \dfrac{f(u + k) - f(u)}{k} - f'(u) \right) = 0

Multiplying both sides of the definition of Ο†(k){\varphi(k)} by k,{k,} we have:

kβ‹…Ο†(k)=f(u+k)βˆ’f(u)βˆ’kfβ€²(u) k \cdot \varphi(k) = f(u + k) - f(u) - kf'(u)

We can rewrite this as:

f(u+k)βˆ’f(u)=kβ‹…fβ€²(u)+kβ‹…Ο†(k) f(u + k) - f(u) = k \cdot f'(u) + k \cdot \varphi(k)

Consider what happens when kβ‰ 0.{k \neq 0.} The equation holds. But if k=0,{k = 0,} the definition of Ο†(k){\varphi(k)} breaks down, because it leaves a 0 in the denominator. We can avoid this by defining Ο†(0)=0,{\varphi(0) = 0,} in which case the equation also holds when k=0,{k = 0,} since substituting k=0{k = 0} would simply yield:

f(u)βˆ’f(u)=0 f(u) - f(u) = 0

Of course, this is true. With this in mind, let's say u=g(x){u = g(x)} and k=g(x+h)βˆ’g(x).{k = g(x + h) - g(x).} Then as h{h} tends towards 0, so too does k.{k.} Now, recall that the Newton quotient for the function f∘g{f \circ g} is:

f(g(x+h))βˆ’f(g(x))h=f(u+k)βˆ’f(u)h \dfrac{f(g(x + h)) - f(g(x))}{h} = \dfrac{f(u + k) - f(u)}{h}

Based on our previous analysis, it follows that:

f(g(x+h))βˆ’f(g(x))h=f(u+k)βˆ’f(u)h=kβ‹…fβ€²(u)+kβ‹…Ο†(k)h \begin{aligned} \dfrac{f(g(x + h)) - f(g(x))}{h} &= \dfrac{f(u + k) - f(u)}{h} \\ &= \dfrac{k \cdot f'(u) + k \cdot \varphi(k)}{h} \end{aligned}

Substituting the value for k,{k,} we have:

f(g(x+h))βˆ’f(g(x))h=f(u+k)βˆ’f(u)h=kβ‹…fβ€²(u)+kβ‹…Ο†(k)h=g(x+h)βˆ’g(x)hfβ€²(u)+g(x+h)βˆ’g(x)hΟ†(k) \begin{aligned} \dfrac{f(g(x + h)) - f(g(x))}{h} &= \dfrac{f(u + k) - f(u)}{h} \\ &= \dfrac{k \cdot f'(u) + k \cdot \varphi(k)}{h} \\ &= \dfrac{g(x + h) - g(x)}{h} f'(u) + \dfrac{g(x + h) - g(x)}{h} \varphi(k) \end{aligned}

Now we take the limit as hβ†’0.{h \to 0.} Applying this limit, we see that the first term approaches gβ€²(x)fβ€²(u).{g'(x)f'(u).} Given that the limit of Ο†(k){\varphi(k)} as hβ†’0{h \to 0} or kβ†’0{k \to 0} is 0, applying the limit to the second term results in:

lim⁑hβ†’0g(x+h)βˆ’g(x)hΟ†(k)=gβ€²(x)β‹…0=0 \lim\limits_{h \to 0} \dfrac{g(x + h) - g(x)}{h} \varphi(k) = g'(x) \cdot 0 = 0

Accordingly, we have the rule:

(f∘g)β€²(x)=fβ€²(g(x))gβ€²(x) (f \circ g)'(x) = f'(g(x))g'(x)

Formally stating this rule:

The Chain Rule. Suppose f{f} and g{g} are functions. For all x{x} in the domain of g{g} for which g{g} is differentiable at x{x} and f{f} is differentiable at g(x),{g(x),} the derivative of the composite function h(x)=(f∘g)(x)=f(g(x)){h(x) = (f \circ g)(x) = f(g(x))} is given by:

hβ€²(x)=fβ€²(g(x))gβ€²(x) h'(x) = f'(g(x))g'(x)

Alternatively, if y{y} is a function of u,{u,} and u{u} is a function of x,{x,} then:

dydx=dyduβ‹…dudx \dfrac{dy}{dx} = \dfrac{dy}{du} \cdot \dfrac{du}{dx}

Once more we see that the chain rule applies even when k=0.{k = 0.} This proves that the chain rule holds in general. And having proved the chain rule, we can rest easy about our earlier temptation to cancel:

dyduβ‹…dudx=dydx \dfrac{dy}{\cancel{du}} \cdot \dfrac{\cancel{du}}{dx} = \dfrac{dy}{dx}

This is because the chain rule can be expressed by the formula:

d(f∘g)dx=dfduβ‹…dudx \dfrac{d(f \circ g)}{dx} = \dfrac{df}{du} \cdot \dfrac{du}{dx}

With the chain rule, derivatives of composite functions behave as if we could cancel du.{du.} For example, suppose we're asked to compute the following derivative:

ddx(3x2βˆ’4)100 \dfrac{d}{dx} (3x^2 - 4)^{100}

We can compute this derivative quickly with the chain rule. First, we need the two terms, dydu{\dfrac{dy}{du}} and dudx.{\dfrac{du}{dx}.} The variable u{u} embodies the inner function: u=3x2βˆ’4,{u = 3x^2 - 4,} so that y=u100.{y = u^{100}.} Thus:

ddx(3x2βˆ’4)100=dyduβ‹…dudx=ddu(u100)β‹…ddx(3x2βˆ’4)=100u99β‹…6x=100(3x2βˆ’4)99β‹…6x=600x(3x2βˆ’4)99 \begin{aligned} \dfrac{d}{dx} (3x^2 - 4)^{100} &= \dfrac{dy}{du} \cdot \dfrac{du}{dx} \\[1em] &= \dfrac{d}{du}(u^{100}) \cdot \dfrac{d}{dx}(3x^2 - 4) \\[1em] &= 100u^{99} \cdot 6x \\[1em] &= 100(3x^2 - 4)^{99} \cdot 6x \\[1em] &= 600x(3x^2 - 4)^{99} \end{aligned}
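
The same procedure handles the earlier composite ΞΊ(x)=sin⁑(x3).{\kappa(x) = \sin(x^3).} With u=x3{u = x^3} (and knowing that the derivative of sin⁑u{\sin u} is cos⁑u{\cos u}):

ΞΊβ€²(x)=ddu(sin⁑u)β‹…ddx(x3)=cos⁑(x3)β‹…3x2=3x2cos⁑(x3) \kappa'(x) = \dfrac{d}{du}(\sin u) \cdot \dfrac{d}{dx}(x^3) = \cos(x^3) \cdot 3x^2 = 3x^2 \cos(x^3)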

Higher-order Derivatives

The mathematician Hugo Rossi famously quipped, β€œIn the fall of 1972, President Nixon announced that the rate of increase of inflation was decreasing. This was the first time a sitting president used the third derivative to advance his case for reelection.” What is this third derivative? It's simply a derivative of a derivative β€” a higher-order derivative.

Higher derivatives are denoted ordinally: the β€œsecond derivative,” the β€œthird derivative,” and so on. For example, velocity is the first derivative of the position vector with respect to time, representing the change in an object's position over time. Acceleration is the second derivative of the position vector with respect to time, representing the moving object's change in velocity (e.g., how quickly does this car go from 0 mph to 100 mph?). Jerk is the third derivative of the position vector with respect to time, representing the change in acceleration. Somewhat humorously (and without standardization): snap is the fourth derivative, representing the change in jerk; crackle is the fifth derivative, representing the change in snap; and pop is the sixth derivative, representing the change in crackle.

The rules we've covered thus far equally apply to the computation of higher-order derivatives. For example, given u(x)=sin⁑x,{u(x) = \sin x,} the first derivative is uβ€²=cos⁑x.{u' = \cos x.} The second derivative is (uβ€²)β€²=(cos⁑x)β€²=βˆ’sin⁑x.{(u')' = (\cos x)' = - \sin x.}
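
Continuing the pattern, the higher derivatives of sin⁑x{\sin x} cycle with period four:

uβ€²β€²β€²=(βˆ’sin⁑x)β€²=βˆ’cos⁑x,u(4)=(βˆ’cos⁑x)β€²=sin⁑x u''' = (-\sin x)' = -\cos x, \qquad u^{(4)} = (-\cos x)' = \sin x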

There are several forms of notation for higher derivatives:

  • fβ€²(x)≑Df≑dfdx≑ddxf{f'(x) \equiv D f \equiv \dfrac{df}{dx} \equiv \dfrac{d}{dx} f}

  • fβ€²β€²(x)≑D2f≑d2fdx2≑(ddx)2f{f''(x) \equiv D^2 f \equiv \dfrac{d^2f}{dx^2} \equiv (\dfrac{d}{dx})^2 f}

  • fβ€²β€²β€²(x)≑D3f≑d3fdx3≑(ddx)3f{f'''(x) \equiv D^3 f \equiv \dfrac{d^3f}{dx^3} \equiv (\dfrac{d}{dx})^3 f}

  • f(n)(x)≑Dnf≑dnfdxn≑(ddx)nf{f^{(n)}(x) \equiv D^n f \equiv \dfrac{d^nf}{dx^n} \equiv (\dfrac{d}{dx})^n f}

The symbols D{D} and ddx{\dfrac{d}{dx}} are operators we can apply to functions. Applying such an operator to a function returns the function's derivative.
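
For instance, applying D{D} repeatedly to x3:{x^3:}

Dx3=3x2,D2x3=6x,D3x3=6,D4x3=0 Dx^3 = 3x^2, \quad D^2x^3 = 6x, \quad D^3x^3 = 6, \quad D^4x^3 = 0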