MA101.20 Generalisation of the chain rule to partials

Consider the following simple example. Let f be a function of two variables defined by:

f(x,y)=x^2 y where x,y \in \mathbb{R}.

By substituting

x = u \cos{v}
y = u + 3v

let us define a function F of two variables by F(u,v) = u^2 \cos{^2}{v}(u + 3v).

Then we can calculate

\frac{\partial{F}}{\partial{u}} = 3u\cos{^2}{v}(u+2v)

With the obvious meanings for the partial derivatives \frac{\partial{f}}{\partial{x}}, \frac{\partial{f}}{\partial{y}}, \frac{\partial{x}}{\partial{u}}, \frac{\partial{y}}{\partial{u}} we can also calculate

\displaystyle \begin{aligned} &\frac{\partial{f}}{\partial{x}} \cdot \frac{\partial{x}}{\partial{u}}+\frac{\partial{f}}{\partial{y}} \cdot \frac{\partial{y}}{\partial{u}} \\&= 2xy\cos{v}+x^2\cdot 1 \\&= 2u\cos{v}(u+3v)\cos{v} + u^2\cos{^2}{v} \\&= 3u\cos{^2}{v}(u+2v) \\&= \frac{\partial{F}}{\partial{u}} \end{aligned}

It can also be checked that

\displaystyle \begin{aligned} &\frac{\partial{f}}{\partial{x}} \cdot \frac{\partial{x}}{\partial{v}}+\frac{\partial{f}}{\partial{y}} \cdot \frac{\partial{y}}{\partial{v}} = \frac{\partial{F}}{\partial{v}} \end{aligned}
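As a sketch of that check for the example above: here \frac{\partial{x}}{\partial{v}} = -u\sin{v} and \frac{\partial{y}}{\partial{v}} = 3, so

\displaystyle \begin{aligned} &\frac{\partial{f}}{\partial{x}} \cdot \frac{\partial{x}}{\partial{v}}+\frac{\partial{f}}{\partial{y}} \cdot \frac{\partial{y}}{\partial{v}} \\&= 2xy\cdot(-u\sin{v})+x^2\cdot 3 \\&= -2u^2\cos{v}\sin{v}(u+3v) + 3u^2\cos{^2}{v} \end{aligned}

which is what the product rule gives on differentiating u^2\cos{^2}{v}(u+3v) with respect to v directly.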

These two results are indeed true for a general function f(x,y) and substitution x=x(u,v), y=y(u,v). They are regarded as a generalisation of the chain rule for one variable. Again \frac{\partial{F}}{\partial{u}} and \frac{\partial{F}}{\partial{v}} are often confusingly written as \frac{\partial{f}}{\partial{u}} and \frac{\partial{f}}{\partial{v}}.

The rule is then

\displaystyle \frac{\partial{f}}{\partial{u}}  = \frac{\partial{f}}{\partial{x}} \cdot \frac{\partial{x}}{\partial{u}}+\frac{\partial{f}}{\partial{y}} \cdot \frac{\partial{y}}{\partial{u}}

\displaystyle \frac{\partial{f}}{\partial{v}}  = \frac{\partial{f}}{\partial{x}} \cdot \frac{\partial{x}}{\partial{v}}+\frac{\partial{f}}{\partial{y}} \cdot \frac{\partial{y}}{\partial{v}}

It is very important to remember what is meant by all these items.
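One way to keep the meanings straight is to evaluate each item numerically. The following is a minimal sketch for the example f(x,y) = x^2 y, x = u\cos{v}, y = u + 3v, assuming central finite differences are an acceptable stand-in for the exact partial derivatives. The helper names partial and chain_rule and the sample point (u0, v0) are illustrative choices, not part of the notes.

```python
import math

def f(x, y):
    return x**2 * y

def x_of(u, v):
    return u * math.cos(v)

def y_of(u, v):
    return u + 3 * v

def F(u, v):
    # F is f with the substitution carried out: a function of (u, v).
    return f(x_of(u, v), y_of(u, v))

def partial(func, point, i, h=1e-6):
    """Central-difference estimate of the partial derivative of func
    with respect to its i-th argument, evaluated at point."""
    up, dn = list(point), list(point)
    up[i] += h
    dn[i] -= h
    return (func(*up) - func(*dn)) / (2 * h)

def chain_rule(u, v, i):
    """Right-hand side of the rule: f_x * x_w + f_y * y_w, where w is
    u (i = 0) or v (i = 1). f_x and f_y are evaluated at (x(u,v), y(u,v))."""
    x, y = x_of(u, v), y_of(u, v)
    return (partial(f, (x, y), 0) * partial(x_of, (u, v), i)
            + partial(f, (x, y), 1) * partial(y_of, (u, v), i))

u0, v0 = 1.3, 0.7  # arbitrary sample point
dF_du = partial(F, (u0, v0), 0)
dF_dv = partial(F, (u0, v0), 1)
rhs_u = chain_rule(u0, v0, 0)
rhs_v = chain_rule(u0, v0, 1)
closed_form_u = 3 * u0 * math.cos(v0)**2 * (u0 + 2 * v0)

print(dF_du, rhs_u, closed_form_u)
print(dF_dv, rhs_v)
```

The point the code makes explicit is that \frac{\partial{f}}{\partial{x}} and \frac{\partial{f}}{\partial{y}} are derivatives of f as a function of (x,y), evaluated at the substituted point, whereas the left-hand sides are derivatives of the composite F as a function of (u,v).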

Note the chain rule can be used both ways.

That is, let f(x,y), x=x(u,v), y=y(u,v) define F(u,v).

We have

\displaystyle \frac{\partial{F}}{\partial{u}}  = \frac{\partial{f}}{\partial{x}} \cdot \frac{\partial{x}}{\partial{u}}+\frac{\partial{f}}{\partial{y}} \cdot \frac{\partial{y}}{\partial{u}}

But also F(u,v) defines f(x,y) (substituting for u and v in terms of x and y), so

\displaystyle \frac{\partial{f}}{\partial{x}}  = \frac{\partial{F}}{\partial{u}} \cdot \frac{\partial{u}}{\partial{x}}+\frac{\partial{F}}{\partial{v}} \cdot \frac{\partial{v}}{\partial{x}}

and, of course, in practice F is written as f throughout.
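A small worked instance of this reverse direction, using an illustrative substitution that is not taken from the example above: suppose F(u,v) = uv with u = x + y and v = x - y, so that f(x,y) = (x+y)(x-y) = x^2 - y^2. Then

\displaystyle \begin{aligned} \frac{\partial{F}}{\partial{u}} \cdot \frac{\partial{u}}{\partial{x}}+\frac{\partial{F}}{\partial{v}} \cdot \frac{\partial{v}}{\partial{x}} &= v \cdot 1 + u \cdot 1 \\&= (x-y)+(x+y) \\&= 2x \end{aligned}

which agrees with differentiating x^2 - y^2 with respect to x directly.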

These results are special cases of the so-called general chain rule.

If we have a function f(x_{1},\ldots,x_{n}) where the x_i are given by substitutions

\displaystyle x_1 = \phi_1 (u_{1},\ldots,u_{n})
\displaystyle x_2 = \phi_2 (u_{1},\ldots,u_{n})
\displaystyle \vdots
\displaystyle x_n = \phi_n (u_{1},\ldots,u_{n})

Then \displaystyle \frac{\partial{f}}{\partial{u_1}} = \frac{\partial{f}}{\partial{x_1}} \cdot \frac{\partial{x_1}}{\partial{u_1}}+ \frac{\partial{f}}{\partial{x_2}} \cdot \frac{\partial{x_2}}{\partial{u_1}}+\cdots+ \frac{\partial{f}}{\partial{x_n}} \cdot \frac{\partial{x_n}}{\partial{u_1}}
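The sum in the general rule can be sketched in code. The following is an illustration only, again assuming central finite differences in place of exact derivatives. The function f and the substitutions in phis are hypothetical examples chosen for the demonstration, and chain_rule_partial is an illustrative helper name.

```python
import math

def partial(func, point, i, h=1e-6):
    """Central-difference estimate of d(func)/d(argument i) at point."""
    up, dn = list(point), list(point)
    up[i] += h
    dn[i] -= h
    return (func(*up) - func(*dn)) / (2 * h)

def chain_rule_partial(f, phis, u, j):
    """Sum over i of (df/dx_i) * (dx_i/du_j), with x_i = phis[i](*u)."""
    x = [phi(*u) for phi in phis]
    return sum(partial(f, x, i) * partial(phi, u, j)
               for i, phi in enumerate(phis))

# Hypothetical f and substitutions, with n = 3.
def f(x1, x2, x3):
    return x1 * x2 + math.sin(x3)

phis = [
    lambda u1, u2, u3: u1 + u2 * u3,
    lambda u1, u2, u3: u1 * u1 - u3,
    lambda u1, u2, u3: u2 * math.exp(u1),
]

def composite(u1, u2, u3):
    # f with every x_i replaced by phi_i(u1, u2, u3).
    return f(*(phi(u1, u2, u3) for phi in phis))

u = (0.4, -1.1, 2.0)  # arbitrary sample point
direct = [partial(composite, u, j) for j in range(3)]
via_chain = [chain_rule_partial(f, phis, u, j) for j in range(3)]

for d, c in zip(direct, via_chain):
    print(d, c)
```

The same sum applies to each u_j in turn, not only to u_1; the loop over j above evaluates all three.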

Special cases

i) \displaystyle f=f(x,y), x=x(t), y=y(t)

\displaystyle \frac{df}{dt}  = \frac{\partial{f}}{\partial{x}} \cdot \frac{dx}{dt}+\frac{\partial{f}}{\partial{y}} \cdot \frac{dy}{dt}

ii) \displaystyle f=f(x,y), y=y(x), x=x

\displaystyle \frac{df}{dx}  = \frac{\partial{f}}{\partial{x}} +\frac{\partial{f}}{\partial{y}} \cdot \frac{dy}{dx}
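As an illustration of case (ii), take f(x,y) = x^2 y and, as an assumed example, y = x^3. After substitution the function of x alone is x^5, with derivative 5x^4. The rule gives the same result:

\displaystyle \begin{aligned} \frac{\partial{f}}{\partial{x}} +\frac{\partial{f}}{\partial{y}} \cdot \frac{dy}{dx} &= 2xy + x^2 \cdot 3x^2 \\&= 2x^4 + 3x^4 \\&= 5x^4 \end{aligned}

Note that the term \frac{\partial{f}}{\partial{x}} = 2xy on its own is not the derivative of the substituted function; it treats y as held fixed.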

MA101.19 The Chain Rule – One Variable

Consider the function f:\mathbb{R} \to \mathbb{R} given by f(x) = \sin{x}. We are used to writing such things as:

i) f'
ii) f'(x)
iii) \displaystyle \frac{df}{dx}.  For example, we would write f'(x) = \cos{x}.

Equally well, of course, it would be true to write f'(u) = \cos{u}.

The meanings of (i) and (ii) are mathematically precise. f' means the derived function and f'(x) means the value of f' at x.

The meaning of \displaystyle \frac{df}{dx} can be more devious.

It can simply be taken as synonymous with f'(x). That is \displaystyle \frac{df}{dx} = f'(x).

When such is the intention it would be indisputable that \displaystyle \frac{df}{du} means f'(u).

But there are other more shady uses as we will see.

Now consider substituting x = u^2 in f(x) = \sin{x} to define a function F given by F(u) = \sin{u^2}.

The chain rule says

\begin{aligned} \displaystyle \frac{dF}{du} &= \frac{df}{dx} \cdot \frac{dx}{du} \\&= \cos{x} \cdot 2u \\&= 2u\cos{u^2} \end{aligned}

where here \frac{dF}{du} means F'(u).

Note that F is not equal to f, but mathematicians frequently write the chain rule as,

\displaystyle \frac{df}{du} = \frac{df}{dx} \cdot \frac{dx}{du}.

Here \frac{df}{du} does not mean f'(u), which is after all \cos{u}.

To see the chain rule in a more precise and unambiguous form think of x=u^2 as defining a function g given by g(u)=u^2, then F = f \circ g and we see the chain rule as saying

(f \circ g)'(u) = f'(g(u)) \cdot g'(u)

Of course the u here is an entirely dummy symbol.
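The distinction between F'(u) and f'(u) can be checked numerically. The following is a minimal sketch for f(x) = \sin{x} and g(u) = u^2, assuming a central finite difference as an approximation to the derivative. The helper name derivative and the sample point u0 are illustrative.

```python
import math

def f(x):
    return math.sin(x)

def g(u):
    return u ** 2

def F(u):
    # F is the composite f o g, i.e. F(u) = sin(u^2).
    return f(g(u))

def derivative(func, t, h=1e-6):
    """Central-difference estimate of func'(t)."""
    return (func(t + h) - func(t - h)) / (2 * h)

u0 = 0.9  # arbitrary sample point

F_prime = derivative(F, u0)              # (f o g)'(u0)
chain = math.cos(g(u0)) * 2 * u0         # f'(g(u0)) * g'(u0)
f_prime_at_u = math.cos(u0)              # f'(u0), a different quantity

print(F_prime, chain, f_prime_at_u)
```

The first two printed values agree to within the finite-difference error, while the third is visibly different, which is why reading \frac{df}{du} as f'(u) goes wrong.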