Automatic Differentiation

16 Jun 2026 · Stone Liu

The Chain Rule

Recall that the definition of the derivative of a function at a given input is

Let , and . You can also simply rewrite this as: . I find it very helpful to remember what the derivative is actually measuring. Consider the composite function . Let in otherwords, is the output to . The derivative of is saying that if you move the input to which is by some very small change, call it delta , how much does the output of which is change?

Now let us assign symbols to the rest of the values of our functions:

What is the derivative of ? that would be

The chain rule says that it is a product of two derivatives. The more standardized notation would be

Using this, we find that

Computational Graphs

Let us define a function of two variables .

Let us see how the computation flows from .

Assigning variables to each of our nodes now we have

Forward Autodifferentiation

What we want to do now is to differentiate with respect to each of our input variables for each of our respective nodes .

We do the same with since that is another input variable.

But of course we computed all of these symbolically this is quite inefficient when doing computation so thus we want to be able to reuse previous expressions. If we consider forward autodifferentiation with respect to we get:

If we differentiate with respect to then we just have to swap our seed values of .

  def forward_auto_diff(x, y):
vector = []
n1 = x
n2 = y
n3 = x + 2 * y
n4 = x * y
n5 = (x + 2 * y) ** 2
n6 = math.sin(x * y)
n7 = n5 * n6

# Autodif with respect to n1
dn1 = 1
dn2 = 0
dn3 = 1
dn4 = n1 * dn2 + n2 * dn1
dn5 = 2 * (n3) * dn3
dn6 = math.cos(n4) * dn4
dn7 = n5 * dn6 + n6 * dn5

vector.append(dn7)

# Autodif with respect to n2
dn1 = 0
dn2 = 1
dn3 = 2
dn4 = n1 * dn2 + n2 * dn1
dn5 = 2 * (n3) * dn3
dn6 = math.cos(n4) * dn4
dn7 = n5 * dn6 + n6 * dn5

vector.append(dn7)
# return the output of f aswell as the gradient.
return n7, np.array(vector)

So really forward autodifferentiation is just a way to reuse past computations so that it is more efficient.

TBD (To Be Continued)…