Recall Newton’s method for finding roots of a function:
Suppose that you think is a root of ie , but it is only close to zero. A better guess may be
Solve for ;
which can be repeated until you get as close as you wish (http://abadonna.physics.wisc.edu/2021/08/05/root-finding/).
The same idea can be used to construct nonlinear least square fits to data.
Suppose the data is , and the fit function .
depends on parameters . Create the functions useful for expanding
where .
The least-squares measure of how far the data deviates from the fit is
Then
or
and
If this is zero for each , we get the values that minimize
Think of
as an matrix,
as the component of a matrix,
is the component of an -vector,
is the component of an -vector.
gives the increment in the parameters that fit better than the original ones.
The process is pretty simple:
Make a plot of versus and come up with an estimate of the parameters , ,
compute ,
compute , and update the original guess to .
Repeat until is as small as you want.
These types of algorithms are sensitive to the initial guess, so do a good job of creating one, and the algorithm converges very nicely.
Let’s implement this in R. We will make a fit for and find the best .
install.packages('matlib')
library(matlib)
# first we make some data, a vector of t values x=R0 values for some A,B,C
t <- seq(1,20,by=1)
x<- 88+1000*exp(-1.23*t)
A<- 96
B<- 1100
C <- 1.5
R0<-sum((x-(A+B*exp(-C*t)))^2)
R0
# Now we code the algorithm, we run 10 iterations
# hand-encoding df/dA=1, df/dB=exp(-C*t), etc
for (j in 1:10){
J<-matrix(c(rep(1,length(t)),exp(-C*t),-B*t*exp(-C*t)),nrow=length(t))
Jt<-t(J)
M <- Jt%*%J
dM <- diag(diag(M),3,3)
iDen=inv(M)
dp <- (iDen%*%Jt)%*%(x-A-B*exp(-C*t))
A<-A+dp[1,1]
B<-B+dp[2,1]
C<-C+dp[3,1]
R<-sum((x-(A+B*exp(-C*t)))^2)
R0<-R
}
Run the code. Note that before the run , and after the run in exact agreement with the input data.
Below is an R library function that fits data to a Lorentzian peak (such as a real spectral line). This is part of my “Physics307” data analysis package.
lorentzian <- deriv(~ w^2/(w^2+(x-b)^2), namevec = c("b", "w"), function.arg = c("x", "b", "w"))
LorentzFit <- function(ch,cntdat,b0,w0,iter){
#' Simple fit to Lorentzian distribution y ~ A w^2/(w^2+(x-b)^2).
#' @param ch A vector of bin/channel labels.
#' @param cntdat A vector of bin occupancies/MCA channel counts.
#' @param b0 A first guess of peak channel location.
#' @param w0 A first guess of peak width in channels.
#' @param iter Number of iterations.
#' @return Returns a list of initial R=R0, final R=R, b,w, A.
#' @examples
#' fit<- LorentFit(x,y,300,60,5)
cnt<-lorentzian(ch,b0,w0)
grad<-attr(cnt,"gradient")
val<- cnt[1:length(ch)]
Y<- cntdat/max(cntdat)
R0<- sum((Y-val)^2)
for (i in 1:iter){
J<-t(grad)
Q<-t(grad)%*%grad
Qi<-inv(Q)
params<-Qi%*%J%*%(Y-val)
b0<-b0+params[1];w0<-w0+params[2]
cnt<-lorentzian(ch,b0,w0)
grad<-attr(cnt,"gradient")
val<- cnt[1:length(ch)]
R<- sum((Y-val)^2)}
c(R0,R,b0,w0,max(cntdat))
}