Computation graphs represent a formula as a sequence of simple steps. In [[back propagation]], a computation graph decomposes a function of arbitrary complexity into primitive functions whose derivatives are known (e.g., $x^n$, $e^x$). The derivative of the complex function can then be calculated efficiently using the [[chain rule]].

Let's use a simple example to illustrate the basics. Say we have the function

$$
F(a, b, c) = 3 \times (a + b) \times c
$$

The computation graph can be represented as

```mermaid
graph LR
    A[a]
    B[b]
    C[c]
    A --> U["u=a+b"]
    B --> U
    U --> Mul["v=u*c"]
    C --> Mul
    Mul --> F["F=3*v"]
```

Note that in some representations, only the operator (e.g., $+$ or $*$) is shown in each node.

First, we must forward propagate some initial values through the graph. Let's use $a=2$, $b=4$, and $c=8$.

```mermaid
graph LR
    A[a=2]
    B[b=4]
    C[c=8]
    A --> U["u=a+b=6"]
    B --> U
    U --> Mul["v=u*c=48"]
    C --> Mul
    Mul --> F["F=3*v=144"]
```

For back propagation, we ultimately want to calculate the sensitivity of $F$ to changes in the input variables $a$, $b$, and $c$. Taking $a$ as an example, what is $\frac{\partial F}{\partial a}$? From the chain rule, we can break this down as

$$
\frac{\partial F}{\partial a} = \frac{\partial F}{\partial v} \times \frac{\partial v}{\partial u} \times \frac{\partial u}{\partial a}
$$

What impact would a $0.001$ increase in $a$ have on $F$? The new value would be

$$
F = 3 \times (2.001 + 4) \times 8 = 144.024
$$

so $F$ changes by $0.024$, which is $24$ times the change in $a$ ($0.024 / 0.001 = 24$).

Now let's compute the partial derivative of each step in the computation graph, which gives

$$
\begin{align}
\frac{\partial F}{\partial v} = 3 && \frac{\partial v}{\partial u} = c = 8 && \frac{\partial u}{\partial a} = 1
\end{align}
$$

Therefore

$$
\frac{\partial F}{\partial a} = \frac{\partial F}{\partial v} \times \frac{\partial v}{\partial u} \times \frac{\partial u}{\partial a} = 3 \times 8 \times 1 = 24
$$

This matches the factor of $24$ found by nudging $a$, and shows how the chain rule works for this example.

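
The worked example above can be checked numerically. The following is a minimal sketch in plain Python, not taken from any library; the function name `forward` and the variable names are assumptions chosen to mirror the node labels in the graph. It runs the forward pass, multiplies the three local derivatives, and compares the result against the $0.001$ nudge in $a$.

```python
def forward(a, b, c):
    """Evaluate F(a, b, c) = 3 * (a + b) * c one graph node at a time."""
    u = a + b   # node u = a + b
    v = u * c   # node v = u * c
    F = 3 * v   # node F = 3 * v
    return u, v, F


# Forward pass with the values used in the example.
a, b, c = 2.0, 4.0, 8.0
u, v, F = forward(a, b, c)
print(u, v, F)  # 6.0 48.0 144.0

# Local derivative of each step, as derived in the text.
dF_dv = 3.0   # F = 3 * v
dv_du = c     # v = u * c
du_da = 1.0   # u = a + b

# Chain rule: multiply the local derivatives along the path from F to a.
dF_da = dF_dv * dv_du * du_da
print(dF_da)  # 24.0

# Sanity check: nudge a by 0.001 and measure the change in F.
eps = 0.001
_, _, F_nudged = forward(a + eps, b, c)
approx = (F_nudged - F) / eps
print(approx)  # close to 24, up to floating-point rounding
```

Because $F$ is linear in $a$, the nudge estimate agrees with the chain rule result up to rounding error. For a nonlinear function, the two would only agree approximately for a small nudge.
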
To use the computation graph, start from the right side of the graph and work backwards, filling in the interim value at each step of the computation. At each step, you are calculating the partial derivative of $F$ with respect to the variable associated with that step in the graph (for completeness, we technically start with $\frac{\partial F}{\partial F} = 1$).

To recap, to use a computation graph, follow these steps.

1. Convert your complex formula into a computation graph.
2. Forward feed the initial values.
3. Find the partial derivative at each step.
4. Starting from the right side of the graph with the value 1, plug in the necessary values for each partial derivative to get the partial derivative of $F$ with respect to the variable associated with that step in the graph.

> [!Tip]- Additional Resources
> - [DeepLearning.ai | Derivatives with Computation Graphs](https://youtu.be/nJyUyKN-XBQ?si=AODv0fRICvFc9SXW)
> - [MITOpenCourseware | Differentiation on Computational Graphs](https://youtu.be/r9_5dxtDTOk?si=J5cJ6yyjte6EwNRg)
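
The four recap steps can be sketched as a hand-written backward pass for the example $F(a, b, c) = 3 \times (a + b) \times c$. This is an illustrative sketch in plain Python under the assumption that each node's local derivative is written out by hand; `forward`, `backward`, and `grads` are hypothetical names, not part of any framework. It extends the example in the text by also computing the derivatives with respect to $b$ and $c$.

```python
def forward(a, b, c):
    """Steps 1 and 2: the graph as code, evaluated on the initial values."""
    u = a + b   # node u = a + b
    v = u * c   # node v = u * c
    F = 3 * v   # node F = 3 * v
    return u, v, F


def backward(a, b, c):
    """Steps 3 and 4: walk the graph right to left, applying the chain rule."""
    u, v, F = forward(a, b, c)

    dF_dF = 1.0             # start at the output with value 1
    dF_dv = dF_dF * 3.0     # F = 3 * v, so dF/dv = 3
    dF_du = dF_dv * c       # v = u * c, so dv/du = c
    dF_dc = dF_dv * u       # v = u * c, so dv/dc = u
    dF_da = dF_du * 1.0     # u = a + b, so du/da = 1
    dF_db = dF_du * 1.0     # u = a + b, so du/db = 1

    return {"a": dF_da, "b": dF_db, "c": dF_dc}


grads = backward(2.0, 4.0, 8.0)
print(grads)  # {'a': 24.0, 'b': 24.0, 'c': 18.0}
```

Note that `dF_du` is computed once and reused for both `a` and `b`. Reusing interim values in this way is what makes working backwards through the graph efficient compared with differentiating with respect to each input separately.
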