We have talked about auto-differentiation before, but you can’t over-hype this programming technique. Everything is made easy if calculation of a function and its exact derivative can be done simultaneously at low cost. In every modern language there are auto-diff packages that can be imported and used without much fuss, but I contend that writing your own gives you unparalleled control over the process.
Let’s put together a julia module for dual numbers (see the previous dual numbers post) and overload all of the arithmetic and mathematical operators in the julia Base package. This illustrates how methods are added to functions, how modules import the functions that are being modified, and how to export the modified functions.
We will create a struct called Dn (dual number) that has an infinitismal part as well as a primary value. The coefficient of the infinitismal part will be one, and we will never actually use this part of the number. We will need to import all operators and functions from Base that methods will be added to so that all of these functions can be applied to instances of Dn.
module DualNum
import Base: *, +, -, /, log,exp,sqrt,sin,cos,tan,sinh,cosh,tanh,log,exp,^
struct Dn
val
eps
end
Dn(in::Float64) = Dn(in,1.0)
function Valu(X::Dn)
X.val
end
function Deriv(X::Dn)
X.eps
end
We have an instantiator function Dn, to create instances of a dual number from an ordinary number, and two functions Valu and Deriv that will input a Dn type and produce its principle part and the coefficient of its infinitismal part. Next we add methods to the arithmetic operations
function *(X::Dn, Y::Dn)
Dn(X.val*Y.val, X.val*Y.eps+Y.val*X.eps)
end
function *(X::Dn, Y::Float64)
Dn(X.val*Y,X.eps*Y)
end
function *(X::Float64, Y::Dn)
Dn(Y.val*X, Y.eps*X)
end
function /(X::Dn, Y::Dn)
Dn(X.val/Y.val, (-X.val*Y.eps+Y.val*X.eps)/Y.val^2)
end
function /(X::Dn, Y::Float64)
Dn(X.val/Y, X.eps/Y)
end
function /(X::Float64, Y::Dn)
Dn(X/Y.val, -(X/Y.val^2)*Y.eps)
end
function +(X::Dn, Y::Dn)
Dn(X.val+Y.val, Y.eps+X.eps)
end
function +(X::Dn, Y::Float64)
Dn(X.val+Y, X.eps)
end
function +(X::Float64, Y::Dn)
Dn(Y.val+X,Y.eps)
# and so on and so forth, until we have all possible combinations
Finally we overload (add methods to) the basic functions, so they can input Dn’s and output Dn’s
function sqrt(X::Dn)
Dn(sqrt(X.val), X.eps/(2*sqrt(X.val)))
end
function log(X::Dn)
Dn(log(X.val), X.eps/X.val)
end
function exp(X::Dn)
Dn(exp(X.val), X.eps*exp(X.val))
end
function sin(X::Dn)
Dn(sin(X.val), X.eps*cos(X.val))
end
# and so forth.
Finally we export all of these things and end the module
export Dn, Valu,Deriv,*, +, -, /, log,exp,sqrt,sin,cos,tan,sinh,cosh,tanh,log,exp,^
end
So if we take function suxh as , the dual number for is , and for it will be , in other words but we only keep the principle value and the derivative term in the expansion of .
We could now expand the module to be able to compute gradients of multi-variate functions, and use this in our DIY neural net programs.
Physics 715 example: Legendre transforms
Suppose that we have a Helmholtz free energy for say the Van Der Waals gas near the critical point, in which
which in naive VdW theory will be a non-convex quartic function. In 715 we use the double-tangent method to “fix” the free energy and make it convex. This is required because the equilibrium state minimizes , not , because of the presence of multiple phases. So what if we want a plot of versus ?
You can’t solve for in terms of () because the equation defining ans a function of will be cubic. Good luck inverting that. But we can easily make a vector of values, compute and for each at fixed , and then get
and reorder the whole table of values according to increasing , and plot whatever we wish. So we use DataFrames, a marvelous gadget from the world of R, now available for julia and python.
include("DualNum.jl")
using .DualNum
V=Dn.([0.01*j for j in -200:200])
t=-0.1
f(x)=-1.0*x*(1+4.0*t)+3.0*t*x^2+(3.0/8.0)*x^4
using DataFrames, Gaston
dat=DataFrame(V=Valu.(V),F=Valu.(f.(V)), P=-Deriv.(f.(V)))
dat.G=dat.F+dat.P .* dat.V
# to plot F versus V
plot(dat.V,dat.F,w=:l,lc=:blue,legend=:F_vs_η,Axes(xlabel=:η, ylabel=:F))
# to plot G versus P
dat2=sort(dat,3) # sort on pressure
plot(dat2.P,dat.G,legend=:G_vs_P,w=:l,lc=:red,Axes(xlabel=:P, ylabel=:G))
That’s it. I urge you to try it. Of course you can wait until it is assigned in 715… You can get DualNum.jl here: abadonna.physics.wisc.edu/download/dualnum/