\(\require{cancel}\newcommand{\highlight}[1]{{\color{blue}{#1}}} \newcommand{\apex}{A\kern -1pt \lower -2pt\mbox{P}\kern -4pt \lower .7ex\mbox{E}\kern -1pt X} \newcommand{\colorlinecolor}{blue!95!black!30} \newcommand{\bwlinecolor}{black!30} \newcommand{\thelinecolor}{\colorlinecolor} \newcommand{\colornamesuffix}{} \newcommand{\linestyle}{[thick, \thelinecolor]} \newcommand{\bmx}[1]{\left[\hskip -3pt\begin{array}{#1} } \newcommand{\emx}{\end{array}\hskip -3pt\right]} \newcommand{\ds}{\displaystyle} \newcommand{\fp}{f'} \newcommand{\fpp}{f''} \newcommand{\lz}[2]{\frac{d#1}{d#2}} \newcommand{\lzn}[3]{\frac{d^{#1}#2}{d#3^{#1}}} \newcommand{\lzo}[1]{\frac{d}{d#1}} \newcommand{\lzoo}[2]{{\frac{d}{d#1}}{\left(#2\right)}} \newcommand{\lzon}[2]{\frac{d^{#1}}{d#2^{#1}}} \newcommand{\lzoa}[3]{\left.{\frac{d#1}{d#2}}\right|_{#3}} \newcommand{\plz}[2]{\frac{\partial#1}{\partial#2}} \newcommand{\plzoa}[3]{\left.{\frac{\partial#1}{\partial#2}}\right|_{#3}} \newcommand{\inflim}[1][n]{\lim\limits_{#1 \to \infty}} \newcommand{\infser}[1][1]{\sum_{n=#1}^\infty} \newcommand{\Fp}{F\primeskip'} \newcommand{\Fpp}{F\primeskip''} \newcommand{\yp}{y\primeskip'} \newcommand{\gp}{g\primeskip'} \newcommand{\dx}{\Delta x} \newcommand{\dy}{\Delta y} \newcommand{\ddz}{\Delta z} \newcommand{\thet}{\theta} \newcommand{\norm}[1]{\left\lVert#1\right\rVert} \newcommand{\vnorm}[1]{\left\lVert\vec #1\right\rVert} \newcommand{\snorm}[1]{\left|\left|\ #1\ \right|\right|} \newcommand{\la}{\left\langle} \newcommand{\ra}{\right\rangle} \newcommand{\dotp}[2]{\vec #1 \cdot \vec #2} \newcommand{\proj}[2]{\text{proj}_{\,\vec #2}{\,\vec #1}} \newcommand{\crossp}[2]{\vec #1 \times \vec #2} \newcommand{\veci}{\vec i} \newcommand{\vecj}{\vec j} \newcommand{\veck}{\vec k} \newcommand{\vecu}{\vec u} \newcommand{\vecv}{\vec v} \newcommand{\vecw}{\vec w} \newcommand{\vecx}{\vec x} \newcommand{\vecy}{\vec y} \newcommand{\vrp}{\vec r\, '} \newcommand{\vsp}{\vec s\, '} \newcommand{\vrt}{\vec r(t)} \newcommand{\vst}{\vec 
s(t)} \newcommand{\vvt}{\vec v(t)} \newcommand{\vat}{\vec a(t)} \newcommand{\px}{\partial x} \newcommand{\py}{\partial y} \newcommand{\pz}{\partial z} \newcommand{\pf}{\partial f} \newcommand{\mathN}{\mathbb{N}} \newcommand{\zerooverzero}{\ds \raisebox{8pt}{\text{``\ }}\frac{0}{0}\raisebox{8pt}{\textit{ ''}}} \newcommand{\deriv}[2]{\myds\frac{d}{dx}\left(#1\right)=#2} \newcommand{\myint}[2]{\myds\int #1\ dx= {\ds #2}} \DeclareMathOperator{\sech}{sech} \DeclareMathOperator{\csch}{csch} \newcommand{\primeskip}{\hskip.75pt} \newcommand{\plotlinecolor}{blue} \newcommand{\colorone}{blue} \newcommand{\colortwo}{red} \newcommand{\coloronefill}{blue!15!white} \newcommand{\colortwofill}{red!15!white} \newcommand{\abs}[1]{\left\lvert #1\right\rvert} \newcommand{\lt}{<} \newcommand{\gt}{>} \newcommand{\amp}{&} \)

Section 2.5 The Chain Rule

We have covered almost all of the derivative rules that deal with combinations of two (or more) functions. The operations of addition, subtraction, multiplication (including by a constant) and division led to the Sum/Difference Rule, the Constant Multiple Rule, the Power Rule, the Product Rule and the Quotient Rule. To complete the list of differentiation rules, we look at the last way two (or more) functions can be combined: the process of composition (i.e. one function “inside” another).

One example of a composition of functions is \(f(x) = \cos(x^2)\text{.}\) We currently do not know how to compute this derivative. If forced to guess, one might guess \(\fp(x) = -\sin(2x)\text{,}\) where we recognize \(-\sin(x)\) as the derivative of \(\cos(x)\) and \(2x\) as the derivative of \(x^2\text{.}\) However, this is not the case; \(\fp(x)\neq -\sin(2x)\text{.}\) One way to see this is to examine the graph of \(y=\cos\mathopen{}\left(x^2\right)\mathclose{}\) in Figure 2.5.1 and its tangent line at \(x=\pi/2\text{.}\) Clearly the slope of the tangent line there is nonzero, but the guess gives \(-\sin(2\cdot\pi/2)=-\sin(\pi)=0\text{.}\) So it can't be correct to say that \(y'=-\sin(2x)\text{.}\)


Figure 2.5.1 A graph of \(y=\cos(x^2)\) and a tangent line at \(x=\pi/2\)

In Example 2.5.7 we'll see the correct way to compute the derivative of \(\cos\mathopen{}\left(x^2\right)\mathclose{}\text{,}\) which employs the new rule this section introduces, the Chain Rule.

Before we define this new rule, recall the notation for composition of functions. We write \((f \circ g)(x)\) or \(f(g(x))\text{,}\) read as “\(f\) of \(g\) of \(x\text{,}\)” to denote composing \(f\) with \(g\text{.}\) In shorthand, we simply write \(f \circ g\) or \(f(g)\) and read it as “\(f\) of \(g\text{.}\)” Before giving the corresponding differentiation rule, we note that the rule extends to multiple compositions like \(f(g(h(x)))\) or \(f(g(h(j(x))))\text{,}\) etc.

To motivate the rule, let's look at three derivatives we can already compute.

Example 2.5.2 Exploring similar derivatives

Find the derivatives of \(F_1(x) = (1-x)^2\text{,}\) \(F_2(x) = (1-x)^3\text{,}\) and \(F_3(x) = (1-x)^4\text{.}\) (We'll see later why we are using subscripts for different functions and an uppercase \(F\text{.}\))

Solution
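
One way to work this with the rules already available is to expand each power and differentiate term by term. This is a sketch; the factored forms on the right are included only to make the pattern visible. \begin{align*} F_1(x) \amp = 1-2x+x^2 \amp F_1'(x) \amp = -2+2x = -2(1-x)\\ F_2(x) \amp = 1-3x+3x^2-x^3 \amp F_2'(x) \amp = -3+6x-3x^2 = -3(1-x)^2\\ F_3(x) \amp = 1-4x+6x^2-4x^3+x^4 \amp F_3'(x) \amp = -4+12x-12x^2+4x^3 = -4(1-x)^3 \end{align*}

In each case the result looks like the Power Rule applied to \((1-x)^n\text{,}\) multiplied by an extra factor of \(-1\text{,}\) which is the derivative of \(1-x\text{.}\)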

Here is the Chain Rule in words:

The derivative of the outside function, evaluated at the inside function, multiplied by the derivative of the inside function.
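
In symbols, assuming \(y = f(g(x))\) with \(f\) and \(g\) differentiable (the same form that appears in Subsection 2.5.1), the sentence above reads \begin{equation*} y' = \fp(g(x))\cdot g'(x). \end{equation*}

As a small illustration, take \(f(x)=x^2\) as the outside function and \(g(x)=1-x\) as the inside function. Then \(\fp(x)=2x\) and \(g'(x)=-1\text{,}\) so \(y' = 2(1-x)\cdot(-1) = -2(1-x)\text{.}\)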

To help understand the Chain Rule, we return to Example 2.5.2.

Example 2.5.4 Using the Chain Rule

Use the Chain Rule to find the derivatives of the following functions, as given in Example 2.5.2.

Solution
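
A sketch of how the Chain Rule applies to \(F_1(x) = (1-x)^2\text{,}\) \(F_2(x) = (1-x)^3\text{,}\) and \(F_3(x) = (1-x)^4\text{,}\) taking \(g(x) = 1-x\) as the inside function (so \(g'(x) = -1\)) and \(f(x) = x^n\) as the outside function for \(n = 2, 3, 4\text{:}\) \begin{align*} F_1'(x) \amp = 2(1-x)^{1}\cdot(-1) = -2(1-x)\\ F_2'(x) \amp = 3(1-x)^{2}\cdot(-1) = -3(1-x)^2\\ F_3'(x) \amp = 4(1-x)^{3}\cdot(-1) = -4(1-x)^3 \end{align*}

Expanding each power and differentiating term by term produces the same three derivatives, which is a useful check on the rule.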

Example 2.5.4 demonstrated a particular pattern: when \(y=f(g(x))\) and \(f(x)=x^n\text{,}\) then \(y' =n\cdot (g(x))^{n-1}\cdot g'(x)\text{.}\) This is called the Generalized Power Rule.

This allows us to quickly find the derivative of functions like \(y = (3x^2-5x+7+\sin(x) )^{20}\text{.}\) While it may look intimidating, the Generalized Power Rule states that \begin{equation*} y' = 20(3x^2-5x+7+\sin(x) )^{19}\cdot (6x-5+\cos(x) ). \end{equation*}

Treat the derivative-taking process step-by-step. In the example just given, first multiply by \(20\text{,}\) then rewrite the inside of the parentheses, raising it all to the \(19\)th power. Then think about the derivative of the expression inside the parentheses, and multiply by that.

We now consider more examples that employ the Chain Rule.

Example 2.5.6 Using the Chain Rule

Find the derivatives of the following functions:

  1. \(y = \sin(2x)\)

  2. \(y= \ln(4x^3-2x^2)\)

  3. \(y = e^{-x^2}\)

Solution
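
A sketch of each derivative, identifying the outside and inside functions in turn:

  1. For \(y = \sin(2x)\text{,}\) the outside function is \(\sin(x)\) and the inside function is \(2x\text{,}\) so \(y' = \cos(2x)\cdot 2 = 2\cos(2x)\text{.}\)

  2. For \(y = \ln(4x^3-2x^2)\text{,}\) the outside function is \(\ln(x)\) and the inside function is \(4x^3-2x^2\text{,}\) so \begin{equation*} y' = \frac{1}{4x^3-2x^2}\cdot(12x^2-4x) = \frac{4x(3x-1)}{2x^2(2x-1)} = \frac{2(3x-1)}{x(2x-1)}. \end{equation*}

  3. For \(y = e^{-x^2}\text{,}\) the outside function is \(e^x\) and the inside function is \(-x^2\text{,}\) so \(y' = e^{-x^2}\cdot(-2x) = -2xe^{-x^2}\text{.}\)
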
Example 2.5.7 Using the Chain Rule to find a tangent line

Let \(f(x) = \cos(x^2)\text{.}\) Find the equation of the line tangent to the graph of \(f\) at \(x=1\text{.}\)

Solution
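
A sketch of the computation: the tangent line passes through \((1, f(1))\) with slope \(\fp(1)\text{.}\) The outside function is \(\cos(x)\) and the inside function is \(x^2\text{,}\) so the Chain Rule gives \begin{equation*} \fp(x) = -\sin(x^2)\cdot 2x = -2x\sin(x^2). \end{equation*}

With \(x\) measured in radians, \(f(1) = \cos(1) \approx 0.54\) and \(\fp(1) = -2\sin(1) \approx -1.68\text{.}\) The tangent line is therefore \begin{equation*} y = -2\sin(1)(x-1) + \cos(1) \approx -1.68(x-1) + 0.54. \end{equation*}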

The Chain Rule is used often in taking derivatives. Because of this, one can become familiar with the basic process and learn patterns that facilitate finding derivatives quickly. For instance, \begin{equation*} \lzoo{x}{\ln(\text{anything})} = \frac{1}{\text{anything}}\cdot\lzoo{x}{\text{anything}} = \frac{\lzoo{x}{\text{anything}}}{\text{anything}}. \end{equation*}

A concrete example of this is \begin{equation*} \lzoo{x}{\ln(3x^{15}-\cos(x) +e^x)} = \frac{45x^{14}+\sin(x) +e^x}{3x^{15}-\cos(x) +e^x}. \end{equation*}

While the derivative may look intimidating at first, look for the pattern. The denominator is the same as what was inside the natural log function; the numerator is simply its derivative.

This pattern recognition process can be applied to lots of functions. In general, instead of writing “anything”, we use \(u\) as a generic function of \(x\text{.}\) We then say \begin{equation*} \lzoo{x}{\ln(u)} = \frac{u'}{u}. \end{equation*}

The following is a short list of how the Chain Rule can be quickly applied to familiar functions.

  1. \(\ds\lzoo{x}{u^n} = n\cdot u^{n-1}\cdot u'\text{.}\)

  2. \(\ds\lzoo{x}{e^u} = e^u \cdot u'\text{.}\)

  3. \(\ds\lzoo{x}{\sin(u)} = \cos(u) \cdot u'\text{.}\)

  4. \(\ds\lzoo{x}{\cos(u)} = -\sin(u)\cdot u'\text{.}\)

  5. \(\ds\lzoo{x}{\tan(u)} = \sec^2(u) \cdot u'\text{.}\)

Of course, the Chain Rule can be applied in conjunction with any of the other rules we have already learned. We practice this next.

Example 2.5.9 Using the Product, Quotient and Chain Rules

Find the derivatives of the following functions.

  1. \(f(x) = x^5 \sin(2x^3)\)

  2. \(f(x) = \dfrac{5x^3}{e^{-x^2}}\)

Solution
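
A sketch of both derivatives, applying the Product or Quotient Rule first and using the Chain Rule wherever a composition appears:

  1. With the Product Rule, and the Chain Rule for \(\sin(2x^3)\text{,}\) \begin{equation*} \fp(x) = x^5\cdot\left(\cos(2x^3)\cdot 6x^2\right) + 5x^4\cdot\sin(2x^3) = 6x^7\cos(2x^3) + 5x^4\sin(2x^3). \end{equation*}

  2. With the Quotient Rule, and the Chain Rule for \(e^{-x^2}\text{,}\) \begin{equation*} \fp(x) = \frac{e^{-x^2}\cdot 15x^2 - 5x^3\cdot\left(-2xe^{-x^2}\right)}{\left(e^{-x^2}\right)^2} = \frac{e^{-x^2}\left(10x^4+15x^2\right)}{e^{-2x^2}} = e^{x^2}\left(10x^4+15x^2\right). \end{equation*}
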

A key to correctly working these problems is to break the problem down into smaller, more manageable pieces. For instance, when using the Product Rule and Chain Rule together, just consider the first part of the Product Rule at first: \(f(x)g'(x)\text{.}\) Just rewrite \(f(x)\text{,}\) then find \(g'(x)\text{.}\) Then move on to the \(\fp(x)g(x)\) part. Don't attempt to figure out both parts at once.

Likewise, using the Quotient Rule, approach the numerator in two steps and handle the denominator after completing that. Only simplify afterward.

We can also employ the Chain Rule itself several times, as shown in the next example.

Example 2.5.10 Using the Chain Rule multiple times

Find the derivative of \(y = \tan^5(6x^3-7x)\text{.}\)

Solution
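
A sketch of the computation: rewrite the function as \(y = \left(\tan(6x^3-7x)\right)^5\) to expose three layers, namely the fifth power on the outside, the tangent in the middle, and the polynomial \(6x^3-7x\) on the inside. Working from the outside in, \begin{align*} y' \amp = 5\left(\tan(6x^3-7x)\right)^4\cdot\lzoo{x}{\tan(6x^3-7x)}\\ \amp = 5\tan^4(6x^3-7x)\cdot\sec^2(6x^3-7x)\cdot(18x^2-7). \end{align*}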

It is a traditional mathematical exercise to find the derivatives of arbitrarily complicated functions just to demonstrate that it can be done. Just break everything down into smaller pieces.

Example 2.5.11 Using the Product, Quotient and Chain Rules

Find the derivative of \(f(x) = \frac{x\cos(x^{-2})-\sin^2(e^{4x})}{\ln(x^2+5x^4)}\text{.}\)

Solution
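
A sketch of one way to organize the work, differentiating the pieces separately before assembling the Quotient Rule. For the two terms of the numerator, the Product and Chain Rules give \begin{align*} \lzoo{x}{x\cos(x^{-2})} \amp = \cos(x^{-2}) + x\cdot\left(-\sin(x^{-2})\right)\cdot\left(-2x^{-3}\right) = \cos(x^{-2}) + 2x^{-2}\sin(x^{-2}),\\ \lzoo{x}{\sin^2(e^{4x})} \amp = 2\sin(e^{4x})\cdot\cos(e^{4x})\cdot 4e^{4x} = 8e^{4x}\sin(e^{4x})\cos(e^{4x}). \end{align*}

For the denominator, \begin{equation*} \lzoo{x}{\ln(x^2+5x^4)} = \frac{2x+20x^3}{x^2+5x^4}. \end{equation*}

The Quotient Rule then assembles these pieces: \begin{equation*} \fp(x) = \frac{\ln(x^2+5x^4)\cdot\left[\cos(x^{-2}) + 2x^{-2}\sin(x^{-2}) - 8e^{4x}\sin(e^{4x})\cos(e^{4x})\right] - \left[x\cos(x^{-2})-\sin^2(e^{4x})\right]\cdot\dfrac{2x+20x^3}{x^2+5x^4}}{\left(\ln(x^2+5x^4)\right)^2}. \end{equation*}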

The Chain Rule also has theoretical value: it can be used to find the derivatives of functions that we do not yet know how to differentiate, as in the following example.

Example 2.5.12 The Chain Rule and exponential functions

Use the Chain Rule to find the derivative of \(y= a^x\) where \(a>0\text{,}\) \(a\neq 1\) is constant.

Solution
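
A sketch of the argument, assuming the familiar identity \(a = e^{\ln(a)}\) for \(a>0\text{:}\) rewrite the function with base \(e\) as \begin{equation*} y = a^x = \left(e^{\ln(a)}\right)^x = e^{x\ln(a)}. \end{equation*}

The outside function is \(e^x\) and the inside function is \(x\ln(a)\text{,}\) whose derivative is the constant \(\ln(a)\text{.}\) The Chain Rule then gives \begin{equation*} y' = e^{x\ln(a)}\cdot\ln(a) = a^x\ln(a). \end{equation*}

As a check, taking \(a=e\) gives \(\ln(e)=1\) and recovers \(\lzoo{x}{e^x} = e^x\text{.}\)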

The previous example produced a result worthy of special note.

Subsection 2.5.1 Alternate Chain Rule Notation

It is instructive to understand what the Chain Rule “looks like” using “\(\lz{y}{x}\)” notation instead of \(y'\) notation. Suppose that \(y=f(u)\) is a function of \(u\text{,}\) where \(u=g(x)\) is a function of \(x\text{,}\) as stated in Theorem 2.5.3. Then, through the composition \(f \circ g\text{,}\) we can think of \(y\) as a function of \(x\text{,}\) as \(y=f(g(x))\text{.}\) Thus the derivative of \(y\) with respect to \(x\) makes sense; we can talk about \(\lz{y}{x}\text{.}\) This leads to an interesting progression of notation: \begin{align*} y' \amp = \fp(g(x))\cdot g'(x)\\ \lz{y}{x} \amp = y'(u) \cdot u'(x)\amp\amp \text{ since } y=f(u) \text{ and }u=g(x)\\ \lz{y}{x} \amp = \lz{y}{u} \cdot \lz{u}{x}\amp\amp \text{(using “fractional notation” for the derivative)} \end{align*}

Here the “fractional” aspect of the derivative notation stands out. On the right hand side, it seems as though the “\(du\)” terms cancel out, leaving \begin{equation*} \frac{dy}{dx} = \frac{dy}{dx}. \end{equation*}

It is important to realize that we are not canceling these terms; the derivative notation of \(\lz{y}{x}\) is one symbol. It is equally important to realize that this notation was chosen precisely because of this behavior. It makes applying the Chain Rule easy with multiple variables. For instance, \begin{equation*} \lz{y}{t} = \lz{y}{\bigcirc} \cdot \lz{\bigcirc}{\triangle} \cdot \lz{\triangle}{t}, \end{equation*} where \(\bigcirc\) and \(\triangle\) are any variables you'd like to use.

One of the most common ways of “visualizing” the Chain Rule is to consider a set of gears, as shown in Figure 2.5.14. The gears have \(36\text{,}\) \(18\text{,}\) and \(6\) teeth, respectively. That means for every revolution of the \(x\) gear, the \(u\) gear revolves twice. That is, the rate at which the \(u\) gear makes a revolution is twice as fast as the rate at which the \(x\) gear makes a revolution.

Using the terminology of calculus, the rate of \(u\)-change, with respect to \(x\text{,}\) is \(\lz{u}{x} = 2\text{.}\)

Likewise, every revolution of \(u\) causes \(3\) revolutions of \(y\text{:}\) \(\lz{y}{u} = 3\text{.}\) How does \(y\) change with respect to \(x\text{?}\) For each revolution of \(x\text{,}\) \(y\) revolves \(6\) times; that is, \begin{equation*} \frac{dy}{dx} = \frac{dy}{du}\cdot \frac{du}{dx} = 3\cdot 2 = 6. \end{equation*}

We can then extend the Chain Rule with more variables by adding more gears to the picture.


Figure 2.5.14 A series of gears to demonstrate the Chain Rule. Note how \(\lz{y}{x} = \lz{y}{u}\cdot\lz{u}{x}\)

It is difficult to overstate the importance of the Chain Rule. So often the functions that we deal with are compositions of two or more functions, requiring us to use this rule to compute derivatives. It is often used in practice when actual functions are unknown. Rather, through measurement, we can calculate \(\lz{y}{u}\) and \(\lz{u}{x}\text{.}\) With our knowledge of the Chain Rule, finding \(\lz{y}{x}\) is straightforward.

In Section 2.6, we use the Chain Rule to justify another differentiation technique. There are many curves that we can draw in the plane that fail the “vertical line test.” For instance, consider \(x^2+y^2=1\text{,}\) which describes the unit circle. We may still be interested in finding slopes of tangent lines to the circle at various points. Section 2.6 shows how we can find \(\lz{y}{x}\) without first “solving for \(y\text{.}\)” While we can in this instance, in many other instances solving for \(y\) is impossible. In these situations, implicit differentiation is indispensable.

Subsection 2.5.2 Exercises

In the following exercises, find the equations of tangent and normal lines to the graph of the function at the given point. Note: the functions here are the same as in Exercises 2.5.2.7 through 2.5.2.10.