# Intro to Partial Derivatives

When we want to find the rate of change of a function of multiple variables, e.g. $f(x,y) = x^2 + y^2$, we are no longer looking for a tangent *line*; we are now looking for a tangent *plane*. We start by taking *partial derivatives*, which are found by holding all but one of the variables constant and then differentiating the remaining single-variable function. Each partial derivative is the slope of the tangent to the corresponding _section function_, which is the single-variable function obtained from the original by holding all but one of the input variables constant.

# Notation

## First partial derivatives

Suppose we have some function $f(x,y)$. Then the partial derivative of $f$ with respect to $x$ is commonly written:

- $\frac{\partial f}{\partial x}$, or
- $f_x$

These also have "operator" forms for when the function doesn't have a name, so we can say

- $\frac{\partial}{\partial x}\bigl(y\cos x\bigr) = -y\sin x$
- $\bigl(y\cos x\bigr)_x = -y\sin x$
- $\partial_x\bigl(y\cos x\bigr) = -y\sin x$

Personally I think some restraint with the second form is probably in order.

## Second partial derivatives

If you have a function with $n$ input variables, it will (obviously) have $n$ first partial derivatives. If we take the partial derivative of something that is itself already a partial derivative, the notation looks as follows:

$$
\begin{align*}
\frac{\partial}{\partial x}\frac{\partial f}{\partial x} &= \frac{\partial^2 f}{\partial x^2}\\
\left(f_x\right)_x &= f_{xx}\\
\frac{\partial}{\partial y}\frac{\partial f}{\partial x} &= \frac{\partial^2 f}{\partial y\,\partial x}\\
\left(f_x\right)_y &= f_{xy}
\end{align*}
$$

(Note the order: subscript notation reads left to right in the order of differentiation, while the Leibniz form reads right to left. For the smooth functions we usually meet, the mixed partials are equal anyway.) ...and the operator forms work accordingly.

## Higher partial derivatives

... in a single variable are written

$$
\frac{\partial^n f}{\partial x^n}
$$

and (I think - at time of writing I haven't actually seen this in a textbook etc)

$$
f_{x^n}
$$

...and you can infer from that how mixed higher partial derivatives might work.

### Example

Take the partial derivatives of $f(x,y) = x^2 + y^2$ with respect to $x$ and $y$.

$$
\begin{align*}
\frac{\partial f}{\partial x} &= \frac{\partial}{\partial x} \left(x^2+y^2\right) \\
&= 2x \\[10pt]
\frac{\partial f}{\partial y} &= \frac{\partial}{\partial y} \left(x^2+y^2\right) \\
&= 2y
\end{align*}
$$

### Example 2

Take the partial derivatives of $f(x,y) = 3x^2y^3$ with respect to $x$ and $y$.

$$
\begin{align*}
\frac{\partial f}{\partial x} &= 3y^3\left(\frac{\partial}{\partial x} x^2\right) \\
&= 3y^3(2x) \\
&= 6xy^3 \\[10pt]
\frac{\partial f}{\partial y} &= 3x^2\left(\frac{\partial}{\partial y} y^3\right) \\
&= 3x^2(3y^2) \\
&= 9x^2y^2
\end{align*}
$$

Here are [Paul's online math notes about partial derivatives](https://tutorial.math.lamar.edu/Classes/CalcIII/PartialDerivsIntro.aspx) for more problems and details.

# Gradient

If we take the partial derivative of a function with respect to each of its input variables and collect the results, we get a vector known as the [[Gradient]]. This is often written $\operatorname{\mathbf{grad}} f = \mathbf{\nabla}f$, which is pronounced "nabla f" or sometimes "del f", and

$$
\mathbf{\nabla}f(\mathbf{x}) =
\begin{pmatrix}
f_{x_1}\\
f_{x_2}\\
\vdots\\
f_{x_n}
\end{pmatrix}
$$

...where $f_{x_n}=\frac{\partial f}{\partial x_n}$. Note that as $\mathbf{\nabla}f$ is a vector function it should always be typeset in bold (`\mathbf{\nabla}f` in LaTeX) or underlined in handwritten work.
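Not from any textbook, just a sanity check: if SymPy is available, the worked examples above and the gradient-as-vector-of-partials definition can be verified symbolically. A minimal sketch (the variable names are my own, not standard notation):

```python
# A minimal sketch (not part of the notes above): checking the worked examples
# and building the gradient with SymPy. Assumes `sympy` is installed.
import sympy as sp

x, y = sp.symbols("x y")

# Example 1: f(x, y) = x^2 + y^2
f = x**2 + y**2
print(sp.diff(f, x))   # 2*x
print(sp.diff(f, y))   # 2*y

# Example 2: g(x, y) = 3*x^2*y^3
g = 3 * x**2 * y**3
print(sp.diff(g, x))   # 6*x*y**3
print(sp.diff(g, y))   # 9*x**2*y**2

# Mixed second partials agree for smooth functions like these:
print(sp.diff(g, x, y) == sp.diff(g, y, x))   # True

# Gradient as the column vector of partial derivatives:
grad_f = sp.Matrix([sp.diff(f, v) for v in (x, y)])
print(grad_f)          # Matrix([[2*x], [2*y]])
```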
# Slope and maximum slope

Say we have a function $f(x,y)$. The slope of $f$ at a given point in a certain direction is the component of the gradient at that point in that direction. So the slope of $f$ at some point $(a,b)$ is given by $\mathbf{\nabla}f(a,b)\cdot\widehat{\mathbf{d}}$, where $\widehat{\mathbf{d}}$ is a unit vector in the direction we want to measure. Suppose $\widehat{\mathbf{d}}$ makes an angle $\alpha$ with the positive $x$-axis.

$$
\begin{align*}
\text{Since}~\mathbf{\nabla}f &= f_x \widehat{\mathbf{i}} + f_y\widehat{\mathbf{j}},\\
\text{and}~\widehat{\mathbf{d}} &= \cos \alpha\, \widehat{\mathbf{i}} + \sin \alpha\, \widehat{\mathbf{j}},\\
\text{it follows that}~\mathbf{\nabla}f(a,b)\cdot\widehat{\mathbf{d}} &= f_x(a,b) \cos \alpha + f_y(a,b) \sin \alpha \tag{1}
\end{align*}
$$

Additionally, if $\theta$ is the angle between $\mathbf{\nabla}f(a,b)$ and $\widehat{\mathbf{d}}$, then by the geometric definition of the dot product,

$$
\begin{align*}
\mathbf{\nabla}f(a,b)\cdot\widehat{\mathbf{d}} &= |\mathbf{\nabla}f(a,b)|\, |\widehat{\mathbf{d}}|\cos \theta\\
&= |\mathbf{\nabla}f(a,b)| \cos \theta \tag{2}
\end{align*}
$$

It follows from (2) that at $(a,b)$ the slope is maximised when $\theta=0$, so that $\cos \theta=1$: the direction of maximum slope is just $\mathbf{\nabla}f(a,b)=f_x(a,b)\widehat{\mathbf{i}} + f_y(a,b)\widehat{\mathbf{j}}$, and the magnitude of the maximum slope is $|\mathbf{\nabla}f(a,b)|$.

Notice also that the direction of maximum slope is normal to the direction of the contour curve through $(a,b)$, which has slope 0. This is perhaps obvious, but worth saying: a contour curve has zero slope by definition, and by (2) the slope is zero exactly when $\cos \theta=0$, i.e. when $\widehat{\mathbf{d}}$ is perpendicular to $\mathbf{\nabla}f(a,b)$.
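Again, not part of the notes, just a check of the claims above: a short SymPy sketch evaluating the slope formula (1) for $f(x,y)=x^2+y^2$ at the (arbitrarily chosen) point $(a,b)=(1,2)$, confirming that the slope along $\mathbf{\nabla}f(a,b)$ is $|\mathbf{\nabla}f(a,b)|$ and that the slope along the contour direction is zero.

```python
# A small sketch (not from the notes): checking the slope formula and the
# maximum-slope claim for f(x, y) = x^2 + y^2 at (a, b) = (1, 2).
# Assumes `sympy` is installed; names are my own.
import sympy as sp

x, y = sp.symbols("x y")
f = x**2 + y**2
a, b = 1, 2

# Partial derivatives evaluated at (a, b): f_x(1, 2) = 2, f_y(1, 2) = 4.
fx = sp.diff(f, x).subs({x: a, y: b})
fy = sp.diff(f, y).subs({x: a, y: b})

# Slope in the direction of a unit vector d = (dx, dy), i.e. grad f(a, b) . d,
# equation (1): the components dx, dy play the role of cos(alpha), sin(alpha).
def slope(dx, dy):
    return fx * dx + fy * dy

# Along the direction of grad f(a, b), the slope equals |grad f(a, b)|:
norm = sp.sqrt(fx**2 + fy**2)           # 2*sqrt(5)
print(slope(fx / norm, fy / norm))      # 2*sqrt(5)

# Along the perpendicular (contour) direction, the slope is zero:
print(slope(-fy / norm, fx / norm))     # 0

# Any other unit direction gives a smaller slope, e.g. alpha = pi/3:
print(slope(sp.cos(sp.pi / 3), sp.sin(sp.pi / 3)).evalf())   # ~4.46 < 2*sqrt(5) ~ 4.47
```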