Auto-differentiation: make a julia package

We have talked about auto-differentiation before, but you can’t over-hype this programming technique. Everything is made easy if calculation of a function and its exact derivative can be done simultaneously at low cost. In every modern language there are auto-diff packages that can be imported and used without much fuss, but I contend that writing your own gives you unparalleled  control over the process.

Let’s put together a julia module for dual numbers (see the previous dual numbers post) and overload all of the arithmetic and mathematical operators in the julia Base package. This illustrates how methods are added to functions, how modules import the functions that are being modified, and how to export the modified functions.

We will create a struct called Dn (dual number) that has an infinitismal part as well as a primary value. The coefficient of the infinitismal part will be one, and we will never actually use this part of the number. We will need to import all operators and functions from Base that methods will be added to so that all of these functions can be applied to instances of Dn.

module DualNum
import Base: *, +, -, /, log,exp,sqrt,sin,cos,tan,sinh,cosh,tanh,log,exp,^

struct Dn
val
eps
end

Dn(in::Float64) = Dn(in,1.0)
function Valu(X::Dn)
X.val
end
function Deriv(X::Dn)
X.eps
end

We have an instantiator function Dn, to create instances of a dual number from an ordinary number, and two functions Valu and Deriv that will input a Dn type and produce its principle part and the coefficient of its infinitismal part. Next we add methods to the arithmetic operations

function *(X::Dn, Y::Dn)
Dn(X.val*Y.val, X.val*Y.eps+Y.val*X.eps)
end
function *(X::Dn, Y::Float64)
Dn(X.val*Y,X.eps*Y)
end
function *(X::Float64, Y::Dn)
Dn(Y.val*X, Y.eps*X)
end
function /(X::Dn, Y::Dn)
Dn(X.val/Y.val, (-X.val*Y.eps+Y.val*X.eps)/Y.val^2)
end
function /(X::Dn, Y::Float64)
Dn(X.val/Y, X.eps/Y)
end
function /(X::Float64, Y::Dn)
Dn(X/Y.val, -(X/Y.val^2)*Y.eps)
end
function +(X::Dn, Y::Dn)
Dn(X.val+Y.val, Y.eps+X.eps)
end
function +(X::Dn, Y::Float64)
Dn(X.val+Y, X.eps)
end
function +(X::Float64, Y::Dn)
Dn(Y.val+X,Y.eps)
# and so on and so forth, until we have all possible combinations

Finally we overload (add methods to) the basic functions, so they can input Dn’s and output Dn’s

function sqrt(X::Dn)
Dn(sqrt(X.val), X.eps/(2*sqrt(X.val)))
end
function log(X::Dn)
Dn(log(X.val), X.eps/X.val)
end
function exp(X::Dn)
Dn(exp(X.val), X.eps*exp(X.val))
end
function sin(X::Dn)
Dn(sin(X.val), X.eps*cos(X.val))
end
# and so forth. 

Finally we export all of these things and end the module

export Dn, Valu,Deriv,*, +, -, /, log,exp,sqrt,sin,cos,tan,sinh,cosh,tanh,log,exp,^
end

So if we take function suxh as x^4, the dual number for x is x+\epsilon, and for x^4 it will be x^4+4x^3\epsilon, in other words (x+\epsilon)^4 but we only keep the principle value and the derivative term in the expansion of (x+\epsilon)^4.
We could now expand the module to be able to compute gradients of multi-variate functions, and use this in our DIY neural net programs.

Physics 715 example: Legendre transforms
Suppose that we have a Helmholtz free energy F(t,\eta) for say the Van Der Waals gas near the critical point, in which

    \[t={T_c-T\over T_c}, \quad \eta={V-V_c\over V_c}\]

which in naive VdW theory will be a non-convex quartic function. In 715 we use the double-tangent method to “fix” the free energy and make it convex. This is required because the equilibrium state minimizes G, not F, because of the presence of multiple phases. So what if we want a plot of G versus P?
You can’t solve for V in terms of P,T (P=-\Big({\partial F\over \partial V}\Big)_T) because the equation defining P ans a function of V will be cubic. Good luck inverting that. But we can easily make a vector of V values, compute F and P for each at fixed T, and then get

    \[G=F+PV=F- \Big({\partial F\over \partial V}\Big)_T V\]

and reorder the whole table of V,F,P,G values according to increasing P, and plot whatever we wish. So we use DataFrames, a marvelous gadget from the world of R, now available for julia and python.

include("DualNum.jl")
using .DualNum
V=Dn.([0.01*j for j in -200:200])
t=-0.1
f(x)=-1.0*x*(1+4.0*t)+3.0*t*x^2+(3.0/8.0)*x^4
using DataFrames, Gaston
dat=DataFrame(V=Valu.(V),F=Valu.(f.(V)), P=-Deriv.(f.(V)))
dat.G=dat.F+dat.P .* dat.V
# to plot F versus V
plot(dat.V,dat.F,w=:l,lc=:blue,legend=:F_vs_η,Axes(xlabel=:η, ylabel=:F))
# to plot G versus P
dat2=sort(dat,3) # sort on pressure
plot(dat2.P,dat.G,legend=:G_vs_P,w=:l,lc=:red,Axes(xlabel=:P, ylabel=:G))

That’s it. I urge you to try it. Of course you can wait until it is assigned in 715… You can get DualNum.jl here: abadonna.physics.wisc.edu/download/dualnum/

Leave a Reply

Your email address will not be published. Required fields are marked *

Home 2.0
error: Content is protected !!