MA101.20 Generalisation of the chain rule to partials

Consider the following simple example. Let f be a function of two variables defined by:

f(x,y)=x^2 y where x,y \in \mathbb{R}.

By substituting

x = u \cos{v}
y = u + 3v

let us define a function F of two variables by F(u,v) = u^2 \cos{^2}{v}(u + 3v).

Then we can calculate

\frac{\partial{F}}{\partial{u}} = 3u\cos{^2}{v}(u+2v)

With the obvious meanings for the partial derivatives \frac{\partial{f}}{\partial{x}}, \frac{\partial{f}}{\partial{y}}, \frac{\partial{x}}{\partial{u}}, \frac{\partial{y}}{\partial{u}} we can also calculate

\displaystyle \begin{aligned} &\frac{\partial{f}}{\partial{x}} \cdot \frac{\partial{x}}{\partial{u}}+\frac{\partial{f}}{\partial{y}} \cdot \frac{\partial{y}}{\partial{u}} \\&= 2xy\cos{v}+x^2\cdot 1 \\&= 2u\cos{v}(u+3v)\cos{v} + u^2\cos{^2}{v} \\&= u\cos{^2}{v}(2u+6v+u) \\&= 3u\cos{^2}{v}(u+2v) \\&= \frac{\partial{F}}{\partial{u}} \end{aligned}

It can also be checked that

\displaystyle \begin{aligned} &\frac{\partial{f}}{\partial{x}} \cdot \frac{\partial{x}}{\partial{v}}+\frac{\partial{f}}{\partial{y}} \cdot \frac{\partial{y}}{\partial{v}} = \frac{\partial{F}}{\partial{v}} \end{aligned}
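As a quick numerical sanity check of the two identities above, the following Python sketch compares the chain-rule expression with a central-difference estimate of \frac{\partial{F}}{\partial{u}} and \frac{\partial{F}}{\partial{v}}. The sample point (u,v) = (1.3, 0.7), the step size h and the helper names are illustrative assumptions, not part of the notes.

```python
import math

def f(x, y):
    return x**2 * y

def substitution(u, v):
    # x = u cos v, y = u + 3v, as in the example above
    return u * math.cos(v), u + 3 * v

def F(u, v):
    x, y = substitution(u, v)
    return f(x, y)

def chain_rule(u, v):
    """Return (dF/du, dF/dv) assembled from the chain rule."""
    x, y = substitution(u, v)
    df_dx, df_dy = 2 * x * y, x**2
    dx_du, dy_du = math.cos(v), 1.0
    dx_dv, dy_dv = -u * math.sin(v), 3.0
    return (df_dx * dx_du + df_dy * dy_du,
            df_dx * dx_dv + df_dy * dy_dv)

def central_difference(g, t, h=1e-6):
    # Symmetric finite-difference estimate of g'(t); h is an assumed step size
    return (g(t + h) - g(t - h)) / (2 * h)

u0, v0 = 1.3, 0.7  # arbitrary sample point
dF_du, dF_dv = chain_rule(u0, v0)
numeric_du = central_difference(lambda u: F(u, v0), u0)
numeric_dv = central_difference(lambda v: F(u0, v), v0)
closed_form_du = 3 * u0 * math.cos(v0)**2 * (u0 + 2 * v0)

print(dF_du, numeric_du, closed_form_du)
print(dF_dv, numeric_dv)
```

The three numbers on the first printed line should agree to several decimal places, as should the two on the second line.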

These two results are indeed true for a general function f(x,y) and substitution x=x(u,v), y=y(u,v). They are regarded as a generalisation of the chain rule for one variable. Again \frac{\partial{F}}{\partial{u}} and \frac{\partial{F}}{\partial{v}} are often confusingly written as \frac{\partial{f}}{\partial{u}} and \frac{\partial{f}}{\partial{v}}.

The rule is then

\displaystyle \frac{\partial{f}}{\partial{u}}  = \frac{\partial{f}}{\partial{x}} \cdot \frac{\partial{x}}{\partial{u}}+\frac{\partial{f}}{\partial{y}} \cdot \frac{\partial{y}}{\partial{u}}

\displaystyle \frac{\partial{f}}{\partial{v}}  = \frac{\partial{f}}{\partial{x}} \cdot \frac{\partial{x}}{\partial{v}}+\frac{\partial{f}}{\partial{y}} \cdot \frac{\partial{y}}{\partial{v}}

It is very important to remember what is meant by all these items.

Note the chain rule can be used both ways.

i.e. let f(x,y), x=x(u,v), y=y(u,v) define F(u,v).

We have

\displaystyle \frac{\partial{F}}{\partial{u}}  = \frac{\partial{f}}{\partial{x}} \cdot \frac{\partial{x}}{\partial{u}}+\frac{\partial{f}}{\partial{y}} \cdot \frac{\partial{y}}{\partial{u}}

But also F(u,v) defines f(x,y) (substituting for u and v in terms of x and y), giving


\displaystyle \frac{\partial{f}}{\partial{x}}  = \frac{\partial{f}}{\partial{u}} \cdot \frac{\partial{u}}{\partial{x}}+\frac{\partial{f}}{\partial{v}} \cdot \frac{\partial{v}}{\partial{x}}

and, of course, we put F as f throughout.

These results are special cases of the so called general chain rule.

If we have a function f(x_{1},\ldots,x_{n}) where the x_i are given by substitutions

\displaystyle x_1 = \phi_1 (u_{1},\ldots,u_{n})
\displaystyle x_2 = \phi_2 (u_{1},\ldots,u_{n})
\displaystyle \vdots
\displaystyle x_n = \phi_n (u_{1},\ldots,u_{n})

Then \displaystyle \frac{\partial{f}}{\partial{u_1}} = \frac{\partial{f}}{\partial{x_1}} \cdot \frac{\partial{x_1}}{\partial{u_1}}+ \frac{\partial{f}}{\partial{x_2}} \cdot \frac{\partial{x_2}}{\partial{u_1}}+\cdots+ \frac{\partial{f}}{\partial{x_n}} \cdot \frac{\partial{x_n}}{\partial{u_1}}
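The sum in the general rule can be sketched numerically as follows. This is only an illustration under stated assumptions: every partial derivative is estimated by central differences, and the names partial, chain_rule_partial and direct_partial are hypothetical helpers introduced here. The test functions are the ones from the two-variable example earlier in this section.

```python
import math

def partial(g, point, i, h=1e-6):
    """Central-difference estimate of the i-th partial derivative of g at point."""
    up, down = list(point), list(point)
    up[i] += h
    down[i] -= h
    return (g(up) - g(down)) / (2 * h)

def chain_rule_partial(f, phis, u, j):
    """Sum over i of (df/dx_i) * (dx_i/du_j), where x_i = phis[i](u)."""
    x = [phi(u) for phi in phis]
    return sum(partial(f, x, i) * partial(phis[i], u, j)
               for i in range(len(phis)))

def direct_partial(f, phis, u, j):
    """Differentiate the composite function of u directly, for comparison."""
    return partial(lambda w: f([phi(w) for phi in phis]), u, j)

# f = x1^2 x2 with x1 = u1 cos u2 and x2 = u1 + 3 u2
f = lambda x: x[0]**2 * x[1]
phis = [lambda u: u[0] * math.cos(u[1]),
        lambda u: u[0] + 3 * u[1]]
u = [1.3, 0.7]  # arbitrary sample point

for j in range(len(u)):
    print(j, chain_rule_partial(f, phis, u, j), direct_partial(f, phis, u, j))
```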

Special cases

i) \displaystyle f=f(x,y), x=x(t), y=y(t)

\displaystyle \frac{df}{dt}  = \frac{\partial{f}}{\partial{x}} \cdot \frac{dx}{dt}+\frac{\partial{f}}{\partial{y}} \cdot \frac{dy}{dt}

ii) \displaystyle f=f(x,y), y=y(x), x=x

\displaystyle \frac{df}{dx}  = \frac{\partial{f}}{\partial{x}} +\frac{\partial{f}}{\partial{y}} \cdot \frac{dy}{dx}

MA101.19 The Chain Rule – One Variable

Consider the function f:\mathbb{R} \to \mathbb{R} given by f(x) = \sin{x}. We are used to writing such things as:

i) f'
ii) f'(x)
iii) \displaystyle \frac{df}{dx}.  For example we would write f'(x) = \cos{x}.

Equally well, of course, it would be true to write f'(u) = \cos{u}.

The meanings of (i) and (ii) are mathematically precise: f' means the derived function and f'(x) means the value of f' at x.

The meaning of \displaystyle \frac{df}{dx} can be more devious.

It can simply be taken as synonymous with f'(x). That is \displaystyle \frac{df}{dx} = f'(x).

When such is the intention it would be indisputable that \displaystyle \frac{df}{du} means f'(u).

But there are other more shady uses as we will see.

Now consider substituting x = u^2 in f(x) = \sin{x} to define a function F given by F(u) = \sin{u^2}.

The chain rule says

\begin{aligned} \displaystyle \frac{dF}{du} &= \frac{df}{dx} \cdot \frac{dx}{du} \\&= \cos{x} \cdot 2u \\&= 2u\cos{u^2} \end{aligned}

where here \frac{dF}{du} means F'(u).

Note that F is not equal to f, but mathematicians frequently write the chain rule as,

\displaystyle \frac{df}{du} = \frac{df}{dx} \cdot \frac{dx}{du}.

Here \frac{df}{du} does not mean f'(u) which is after all \cos{u}.

To see the chain rule in a more precise and unambiguous form think of x=u^2 as defining a function g given by g(u)=u^2, then F = f \circ g and we see the chain rule as saying

(f \circ g)'(u) = f'(g(u)) \cdot g'(u)

Of course the u here is an entirely dummy symbol.
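The distinction between F'(u) and f'(u) can be checked numerically. The sketch below evaluates f'(g(u)) \cdot g'(u) for f = \sin and g(u) = u^2, compares it with a central-difference estimate of F'(u), and prints f'(u) = \cos{u} alongside to show that it is a different number. The sample point and step size are arbitrary assumptions.

```python
import math

def f(x):
    return math.sin(x)

def f_prime(x):
    return math.cos(x)

def g(u):
    return u**2

def g_prime(u):
    return 2 * u

def F(u):
    # F is the composition of f with g
    return f(g(u))

def F_prime(u):
    # Chain rule in its precise form: (f o g)'(u) = f'(g(u)) * g'(u)
    return f_prime(g(u)) * g_prime(u)

def central_difference(func, t, h=1e-6):
    return (func(t + h) - func(t - h)) / (2 * h)

u0 = 0.9  # arbitrary sample point
print("chain rule F'(u0):  ", F_prime(u0))
print("numerical F'(u0):   ", central_difference(F, u0))
print("f'(u0), not the same:", f_prime(u0))
```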

MA101.18 Partial Differentiation

Consider the curve in which the surface z=f(x,y) meets the plane y=c, where c is a constant.

In this plane z=f(x,c), and \displaystyle \frac{dz}{dx} would be a formula for the gradient of the tangent to the curve.

If we differentiate f(x,y) with respect to x, treating y as if it were a constant (some say holding y constant), then the derivative obtained is called the partial derivative of f(x,y) with respect to x and we write \displaystyle \frac{\partial f}{\partial x} or f_x or D_x f. Similarly we have \displaystyle \frac{\partial f}{\partial y}.

With functions of more than two variables one differentiates with respect to one of the variables by holding all the other variables constant.


\displaystyle \frac{\partial}{\partial x} (x^2 y^2 + \tan{x}) = 2xy^2 + \sec{^2}{x}.

\displaystyle \frac{\partial}{\partial y} (x^2 y^2 + \tan{x}) = 2x^2 y

\displaystyle \frac{\partial}{\partial x} (2x+\sin{xy}) = 2 + y \cos{xy}

\displaystyle \frac{\partial}{\partial y} (2x+\sin{xy}) = x \cos{xy}

In the obvious way we can have higher order partial derivatives.

\displaystyle \begin{aligned} f(x,y) &= x^2 y + \sin{x} \\ \frac{\partial ^2 f}{\partial x^2} &= \frac{\partial}{\partial x}\bigg[\frac{\partial}{\partial x}(x^2y + \sin{x})\bigg] \\ &= \frac{\partial}{\partial x}\bigg[2xy+\cos{x}\bigg] \\ &= 2y - \sin{x}\end{aligned}

\displaystyle \begin{aligned} \frac{\partial ^2 f}{\partial y \partial x} &= f_{yx} \\ &= \frac{\partial}{\partial y}\bigg[\frac{\partial}{\partial x}(x^2y + \sin{x})\bigg] \\ &= 2x \end{aligned}

\displaystyle \begin{aligned} \frac{\partial ^2 f}{\partial x \partial y} &= f_{xy} \\ &= \frac{\partial}{\partial x}\bigg[\frac{\partial}{\partial y}(x^2y + \sin{x})\bigg] \\ &= 2x \end{aligned}

\displaystyle \begin{aligned} \frac{\partial ^2 f}{\partial y^2} &= 0 \end{aligned}

Similarly for V = \pi r^2 h we have:

\displaystyle \frac{\partial ^2 V}{\partial r^2} = V_{rr} = 2 \pi h

\displaystyle \frac{\partial ^2 V}{\partial r \partial h} = V_{rh} = 2 \pi r

\displaystyle \frac{\partial ^2 V}{\partial h \partial r} = V_{hr} = 2 \pi r

\displaystyle \frac{\partial ^2 V}{\partial h^2} = V_{hh} = 0

For commonly encountered functions f(x,y) we have that

\displaystyle \frac{\partial ^2 f}{\partial x \partial y} = \frac{\partial ^2 f}{\partial y \partial x}

From here on we may assume that all the mixed derivatives are equal.
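The equality of the mixed derivatives can be illustrated numerically for the example f(x,y) = x^2 y + \sin{x} used above. In this sketch partial_x and partial_y are hypothetical helpers that turn a function into a central-difference estimate of its partial derivative; the step size and sample point are assumptions.

```python
import math

def f(x, y):
    return x**2 * y + math.sin(x)

def partial_x(g, h=1e-4):
    """Return a function estimating dg/dx by central differences."""
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2 * h)

def partial_y(g, h=1e-4):
    """Return a function estimating dg/dy by central differences."""
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2 * h)

x_then_y = partial_y(partial_x(f))  # differentiate in x first, then in y
y_then_x = partial_x(partial_y(f))  # differentiate in y first, then in x

x0, y0 = 0.8, -1.5  # arbitrary sample point
print(x_then_y(x0, y0), y_then_x(x0, y0), 2 * x0)
```

Both estimates should be close to 2x, the value found by hand above.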

Note, the normal rules (sum, product, quotient, function of a function) of differentiation apply to partial differentiation.


MA101.17 Continuity of functions of two variables

We say that f(x,y)\to L as (x,y) \to (a,b), or \displaystyle \lim_{(x,y)\to (a,b)} f(x,y) = L, if, intuitively speaking, by taking (x,y) close enough to (a,b) we can get f(x,y) as close as we like to L.

f(x,y) is said to be continuous at (a,b) if \displaystyle \lim_{(x,y)\to (a,b)} f(x,y) = f(a,b).

Just as with one variable limits from below and above may be different, with two variables we may get various limits coming in from different paths.


\displaystyle f(x,y)=\frac{xy}{x^2+y^2} \qquad (x,y) \neq (0,0)

Consider coming in to (0,0) along the line y=mx (m fixed). Along this line we have:

\displaystyle f(x,y) = f(x,mx) = \frac{xmx}{x^2 + m^2x^2} = \frac{m}{1+m^2}

Thus along y=mx,

\displaystyle \lim_{(x,y)\to (0,0)} f(x,y) = \lim_{x\to 0} \frac{m}{1+m^2} = \frac{m}{1+m^2}

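If we come in along the curve y=x^2,

\displaystyle \lim_{\mbox{along } y=x^2} f(x,y) = \lim_{x\to 0} \frac{x^3}{x^2+x^4} = \lim_{x\to 0}\frac{1}{\frac{1}{x}+x} = 0.

Since different paths into (0,0) give different values, \displaystyle \lim_{(x,y)\to (0,0)} f(x,y) does not exist.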
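The path dependence can be seen numerically by evaluating f at points approaching (0,0) along different paths. The helper names and the sample values of m and x below are illustrative assumptions.

```python
def f(x, y):
    # Defined for (x, y) != (0, 0)
    return x * y / (x**2 + y**2)

def along_line(m, x):
    """Value of f at the point (x, mx) on the line y = mx."""
    return f(x, m * x)

def along_parabola(x):
    """Value of f at the point (x, x^2) on the curve y = x^2."""
    return f(x, x**2)

for x in (1e-1, 1e-3, 1e-6):
    print(x, along_line(1, x), along_line(2, x), along_parabola(x))
```

Along y = x the values stay at 1/2, along y = 2x they stay at 2/5, and along y = x^2 they shrink towards 0, so no single limiting value fits all paths.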

MA101.16 Functions of two variables or more

For a function of two (real) variables each element of its domain is an ordered pair of real numbers (x,y) only, and each element of its range is a real number, z, say.

x and y are called independent variables, and z the dependent variable.

To specify such functions we must have both a rule, and a domain.


i) \begin{aligned} z=\sqrt{1-x^2-y^2} \qquad&\mbox{domain: }&&\{(x,y) | x^2 +y^2 \leq 1\} \\&\mbox{range: }&&[0,1]\end{aligned}

ii) \begin{aligned} f(x,y)=\sqrt{2-x}+\sqrt{9-y^2} \qquad&\mbox{domain: }&&\{(x,y) | x \leq 2, -3\leq y\leq3\} \\&\mbox{range: }&&[0,\infty)\end{aligned}

If no domain is specified we again take it to be the largest subset of \mathbb{R}^2 for which the rule makes sense.

A graph of z=f(x,y) may be drawn in the usual way (i.e. (x,y) in the horizontal plane, called the xy plane, and z measured along a third, vertical axis).

For obvious reasons informative graphs are often quite difficult to draw. It’s often useful to draw profiles of f, that is the curves where the surface meets planes parallel to the coordinate planes.


Sketch z=e^{-xy}

Consider the intersection with x=constant, y=constant and z=constant.

\begin{aligned} x=0 \qquad&z=1 \qquad\qquad&&y=0 &&&&z=1\\x=1 \qquad&z=e^{-y} \qquad\qquad&&y=1 &&&&z=e^{-x}\\x=2 \qquad&z=e^{-2y} \qquad\qquad&&y=2 &&&&z=e^{-2x} \end{aligned}

\begin{aligned} z=\frac{1}{2} &\Rightarrow \frac{1}{2}=e^{-xy} \Rightarrow e^{xy}=2 \Rightarrow xy=\ln{2} \\ z=e^{-1} &\Rightarrow e^{1} = e^{xy} \Rightarrow xy=1 \end{aligned}

Another useful diagram of f(x,y) is that provided by drawing the contour lines (or level curves). That is, we sketch in the xy plane the curves f(x,y)=z for various fixed values of z.

Look at z=e^{-xy} again.

-xy = \ln{z} so xy =-\ln{z}


\begin{aligned}&z=1 \qquad&&xy=0 \\&z=e&&xy=-1 \\&z=e^4&& xy=-4 \\&z=e^{-1}&&xy=1 \\&z=e^{-4}&&xy=4 \end{aligned}
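A short check of the table above: every point on the hyperbola xy = c should have the same height z = e^{-c}. The helper point_on_hyperbola and the sample x values are assumptions made for illustration.

```python
import math

def z(x, y):
    return math.exp(-x * y)

def point_on_hyperbola(c, x):
    """Return the point (x, c/x) on the curve xy = c; x must be nonzero."""
    return x, c / x

for c in (-4, -1, 1, 4):
    heights = [z(*point_on_hyperbola(c, x)) for x in (-2.0, 0.5, 3.0)]
    # Each entry of heights should match exp(-c), the level for this contour
    print(c, heights, math.exp(-c))
```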


Sketch z=x^2+y^2

For a function w=f(x,y,z) we cannot sketch profiles, and the best we can do is sketch level surfaces.



For w=2 \Rightarrow 2-x-y=z etc…

MA101.14 The hyperbolic functions

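We define:

\begin{aligned} \sinh{x} &= \frac{1}{2}(e^{x}-e^{-x}) \\ \cosh{x} &= \frac{1}{2}(e^{x}+e^{-x}) \end{aligned}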



Note that:

\sinh{(-x)}=\frac{1}{2}(e^{-x}-e^x)=-\sinh{x}, so sinh is an odd function.

\cosh{(-x)}=\frac{1}{2}(e^{-x}+e^x)=\cosh{x}, so cosh is an even function.

Also: \cosh{x}-\sinh{x}=e^{-x}\rightarrow 0 as x \to \infty.

We also define:

\begin{aligned} \tanh{x}&=\frac{\sinh{x}}{\cosh{x}} \\ \coth{x}&=\frac{\cosh{x}}{\sinh{x}}, x\neq 0 \\ \mbox{sech }x &=\frac{1}{\cosh{x}} \\ \mbox{cosech }x &=\frac{1}{\sinh{x}}, x\neq 0 \end{aligned}


\begin{aligned} \tanh{x} &= \frac{e^x-e^{-x}}{e^x+e^{-x}} \\&= \frac{1-e^{-2x}}{1+e^{-2x}} \rightarrow 1 \;\mbox{from below as } x\to\infty \\ &= \frac{e^{2x}-1}{e^{2x}+1} \rightarrow -1 \;\mbox{from above as } x\to -\infty \end{aligned}
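The properties noted so far can be checked with a few lines of Python, taking the exponential forms \frac{1}{2}(e^x-e^{-x}) and \frac{1}{2}(e^x+e^{-x}) as the definitions of sinh and cosh. The sample values of x are arbitrary; this is a numerical illustration, not a proof.

```python
import math

def sinh(x):
    return 0.5 * (math.exp(x) - math.exp(-x))

def cosh(x):
    return 0.5 * (math.exp(x) + math.exp(-x))

def tanh(x):
    return sinh(x) / cosh(x)

for x in (0.5, 1.0, 2.0):
    odd_check = sinh(-x) + sinh(x)                  # should be about 0 (odd)
    even_check = cosh(-x) - cosh(x)                 # should be about 0 (even)
    gap = cosh(x) - sinh(x) - math.exp(-x)          # should be about 0
    print(x, odd_check, even_check, gap)

for x in (1.0, 3.0, 5.0):
    # tanh(x) creeps up towards 1, tanh(-x) creeps down towards -1
    print(x, tanh(x), tanh(-x))
```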

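Derivative of sinh and cosh

\begin{aligned} \frac{d}{dx}\sinh{x} &= \frac{1}{2}(e^{x}+e^{-x}) = \cosh{x} \\ \frac{d}{dx}\cosh{x} &= \frac{1}{2}(e^{x}-e^{-x}) = \sinh{x} \end{aligned}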



Find the derivatives of tanh, coth, sech and cosech.

Inverse functions

sinh is an injection and so we have an inverse function denoted by \sinh^{-1}{x}.

Domain \sinh^{-1}{x} = range of \sinh{x} = \mathbb{R}
Range \sinh^{-1}{x} = domain of \sinh{x} = \mathbb{R}

\sinh^{-1}{x} means the real number whose sinh is x.

tanh is an injection and so we have an inverse function denoted by \tanh^{-1}{x}.

Domain \tanh^{-1}{x} = range of \tanh{x} = (-1,1)
Range \tanh^{-1}{x} = domain of \tanh{x} = \mathbb{R}

\tanh^{-1}{x} means the real number whose tanh is x.

Note that \cosh{x} is not an injection, so we need to cut its domain down to the non-negative reals (x \geq 0) to make it one.

\cosh^{-1}{x} means the non-negative real number whose cosh is x.

Derivatives of the inverse hyperbolic functions

\begin{aligned} y &= \cosh^{-1}{x} \\ x &= \cosh{y} \\ \frac{dx}{dy}&=\sinh{y} \\ \frac{dy}{dx} &= \frac{1}{\sinh{y}} \\ &= \frac{1}{\sqrt{x^2-1}} \end{aligned}

\mbox{Note: }\forall x: \cosh{^2}{x}-\sinh{^2}{x}=1.

\begin{aligned} y &= \sinh^{-1}{x} \\ x & = \sinh{y} \\ \frac{dx}{dy} &= \cosh{y} \\ \frac{dy}{dx} &= \frac{1}{\cosh{y}}  \\&= \frac{1}{\sqrt{1+x^2}} \end{aligned}
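These two derivative formulas can be compared against finite-difference estimates, using math.asinh and math.acosh from the Python standard library as the inverse functions. The step size, the sample points and the helper names are assumptions; the \cosh^{-1} formula is only evaluated for x > 1.

```python
import math

def central_difference(g, x, h=1e-6):
    return (g(x + h) - g(x - h)) / (2 * h)

def d_asinh(x):
    # Formula derived above for the derivative of sinh^{-1}
    return 1.0 / math.sqrt(1.0 + x * x)

def d_acosh(x):
    # Formula for the derivative of cosh^{-1}; needs x > 1
    return 1.0 / math.sqrt(x * x - 1.0)

for x in (1.5, 2.0, 5.0):
    print(x,
          d_asinh(x), central_difference(math.asinh, x),
          d_acosh(x), central_difference(math.acosh, x))
```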

MA101.13 Trigonometrical functions

Series definitions for the sine and cosine functions are:

\sin{x} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - ...
\cos{x} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - ...

These converge \forall x \in \mathbb{R}.

If we differentiate these term by term we can see that:

\frac{d(\sin{x})}{dx} = \cos{x}
\frac{d(\cos{x})}{dx} = -\sin{x}

Many other properties can be deduced from these power series.
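To see how quickly the series converge, the partial sums can be computed directly and compared with the library functions. In this sketch each term is built from the previous one to avoid large factorials; the number of terms (20) and the sample points are arbitrary assumptions.

```python
import math

def sin_series(x, terms=20):
    """Partial sum x - x^3/3! + x^5/5! - ... with the given number of terms."""
    total, term = 0.0, x
    for k in range(terms):
        total += term
        # Next term: multiply by -x^2 / ((2k+2)(2k+3))
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total

def cos_series(x, terms=20):
    """Partial sum 1 - x^2/2! + x^4/4! - ... with the given number of terms."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        # Next term: multiply by -x^2 / ((2k+1)(2k+2))
        term *= -x * x / ((2 * k + 1) * (2 * k + 2))
    return total

for x in (0.5, 1.0, 3.0):
    print(x, sin_series(x), math.sin(x), cos_series(x), math.cos(x))
```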

The graph of y = \sin{x} is as shown:

We can see that sin is not an injection (on the domain \mathbb{R}) and so there is no inverse. However the function f:[-\frac{\pi}{2},\frac{\pi}{2}]\to[-1,1] given by f:x\to \sin{x} [called the cut down sine], has the graph:

and this is an injection, and has an inverse function f^{-1} with domain [-1,1] and range [-\frac{\pi}{2},\frac{\pi}{2}]. f^{-1}(x) is the unique real number (angle) in [-\frac{\pi}{2},\frac{\pi}{2}] whose sine is x. The function f^{-1} is symbolised by \sin^{-1}{x} or \arcsin{x}.

The graph of \arcsin{x} is a reflection of y=\sin{x} (cut down) in the line y=x.

Knowing y=\arcsin{x} is true, you may deduce x=\sin{y} is true.

Example: \frac{\pi}{4}=\arcsin(\frac{1}{\sqrt{2}}) \Rightarrow \sin(\frac{\pi}{4}) = \frac{1}{\sqrt{2}}.

Knowing x = \sin(y) is true, you may not deduce y = \arcsin(x).

Example: \frac{1}{\sqrt{2}}=\sin(\frac{3\pi}{4}) \nRightarrow \frac{3\pi}{4} = \arcsin(\frac{1}{\sqrt{2}}).
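The one-way nature of this implication shows up directly in Python, where math.asin always returns a value in [-\frac{\pi}{2},\frac{\pi}{2}]. The helper in_principal_range is a hypothetical name introduced for this sketch.

```python
import math

def in_principal_range(y):
    """True if y lies in [-pi/2, pi/2], the range of arcsin."""
    return -math.pi / 2 <= y <= math.pi / 2

y = 3 * math.pi / 4
x = math.sin(y)            # equals 1/sqrt(2) up to rounding
recovered = math.asin(x)   # lands on pi/4, not on 3*pi/4

print(x, 1 / math.sqrt(2))
print(recovered, math.pi / 4, y)
print(in_principal_range(y), in_principal_range(recovered))
```

Starting from an angle outside the cut down domain, arcsin returns the angle inside it that has the same sine.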


\frac{d(arcsin(x))}{dx} = \frac{1}{\sqrt{1-x^2}}


\begin{aligned} \mbox{Let: } y &= arcsin(x) \\ \mbox{then } x &= sin(y) \\ \frac{dx}{dy} &= cos(y) \\ \frac{dy}{dx} &= \frac{1}{cos(y)} \\ &= \frac{1}{\sqrt{1-sin^2(y)}} \\ &= \frac{1}{\sqrt{1-x^2}} \end{aligned}

We take the positive square root because for -\frac{\pi}{2} \leq y \leq \frac{\pi}{2} we have \cos{y} \geq 0.

Similarly we define the ‘so called’ cut down cosine – this is cosine but with the domain [0,\pi], and the cut down tangent with domain (-\frac{\pi}{2}, \frac{\pi}{2}). The inverses of these functions are arccos and arctan.