Algorithms: Nonlinear fits

Recall Newton’s method for finding roots of a function:
Suppose you think x is a root of f(x), i.e. f(x)=0, but f(x) is only close to zero. A better guess may be y, where

    \[f(y)=f(x)+(y-x){\partial f\over \partial x}+\cdots\approx f(x)+(y-x)f'(x)\approx 0\]

Solving for y gives

    \[y=x-{f(x)\over f'(x)}\]

which can be iterated until the estimate is as close to a root as you wish (http://abadonna.physics.wisc.edu/2021/08/05/root-finding/).
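As a concrete illustration (my own toy example, not from the original derivation), here is the Newton iteration in base R for f(x)=x^2-2, whose positive root is \sqrt{2}:

```r
# Newton's method for f(x) = x^2 - 2, root at sqrt(2)
f  <- function(x) x^2 - 2
fp <- function(x) 2*x        # derivative f'(x)

x <- 1                       # initial guess
for (i in 1:8) {
  x <- x - f(x)/fp(x)        # y = x - f(x)/f'(x)
}
x                            # close to 1.414214
```

Convergence is quadratic, so a handful of iterations is plenty from a reasonable starting point.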

The same idea can be used to construct nonlinear least-squares fits to data.
Suppose the data consist of the points (t_1,t_2,\cdots, t_N), (x_1,x_2,\cdots, x_N), and the fit function f(p,t) depends on M parameters p_\alpha. Create the M functions {\partial f(t)\over \partial p_\alpha}, which are useful for expanding

    \[f(p_1+h_1, p_2+h_2,\cdots, p_M+h_M,t)=f(p_1, p_2,\cdots, p_M,t)+\sum_\alpha {\partial f(t)\over \partial p_\alpha} h_\alpha+\cdots\]

    \[=f+\bm{J}\cdot\bm{h}\]

where \bm{h}=(h_1, \cdots, h_M).

The least-squares measure of how far the data x deviates from the fit is

    \[R(p)=\sum_{i=1}^N(x_i-f(p,t_i))^2\]

Then

    \[R(p+h)=\sum_{i=1}^N(x_i-f(p,t_i)-\sum_\alpha {\partial f(t_i)\over \partial p_\alpha} h_\alpha+\cdots)^2\]

or

    \[R(p+h)=\sum_{i=1}^N(x_i-f(p,t_i)-\sum_\alpha J_\alpha(t_i) h_\alpha+\cdots)^2\]

and

    \[{\partial R(p+h)\over \partial h_\beta}=-2\sum_i (x_i-f(p,t_i)-\sum_\alpha J_\alpha(t_i) h_\alpha+\cdots)(\sum_\gamma J_\gamma(t_i){\partial h_\gamma\over \partial h_\beta})\]

    \[=-2\sum_i (x_i-f(p,t_i)-\sum_\alpha J_\alpha(t_i) h_\alpha+\cdots)(\sum_\gamma J_\gamma(t_i)\delta_{\beta,\gamma})\]

    \[=-2\sum_i (x_i-f(p,t_i)-\sum_\alpha J_\alpha(t_i) h_\alpha+\cdots)J_\beta(t_i)\]

Setting this to zero for each \beta gives the h values that minimize R:

    \[\sum_i (x_i-f(p,t_i))J_\beta(t_i)=\sum_\alpha \sum_i J_\alpha(t_i)J_\beta(t_i) h_\alpha\]

Think of
\bullet J_\alpha(t_i) as an M\times N matrix,
\bullet \sum_i J_\alpha(t_i)J_\beta(t_i)=Q_{\alpha,\beta} as the \alpha,\beta component of a M\times M matrix,
\bullet v_i=(x_i-f(p,t_i)) is the i^{th} component of an N-vector,
\bullet h_\alpha is the \alpha^{th} component of an M-vector.

    \[\bm{J}\cdot\bm{v}=\bm{Q}\cdot\bm{h}, \quad \bm{h}=\bm{Q}^{-1}\cdot \bm{J}\cdot\bm{v}\]

gives the increment in the parameters, yielding a better fit than the original guess.
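To make the matrix bookkeeping concrete, here is a toy check in base R (my own illustration; it uses base solve() rather than a matrix-inverse call). For a model that is linear in the parameters, f(p,t)=p_1+p_2 t, the Jacobian does not depend on p, so a single step lands on the exact least-squares solution:

```r
# toy model f(p,t) = p1 + p2*t with exact parameters p = (2, 3)
t <- 1:5
x <- 2 + 3*t                     # "data" with no noise
p <- c(0, 0)                     # initial guess

# Jacobian: column alpha holds df/dp_alpha evaluated at each t_i (N x M)
J <- cbind(rep(1, length(t)), t)
v <- x - (p[1] + p[2]*t)         # residual vector v_i = x_i - f(p, t_i)
Q <- t(J) %*% J                  # Q = J^T J, an M x M matrix
h <- solve(Q, t(J) %*% v)        # increment h = Q^{-1} J^T v
p <- p + as.vector(h)            # one step recovers (2, 3) exactly
```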

The process is pretty simple:
\bullet Make a plot of x versus t and come up with an estimate of the parameters p_\alpha, \alpha=1,2,\cdots,M,
\bullet compute R,
\bullet compute h_\alpha, and update the original guess to p_\alpha\rightarrow p_\alpha+h_\alpha.
\bullet Repeat until R is as small as you want.

These types of algorithms are sensitive to the initial guess, so take care constructing a good one; given a good starting point, the algorithm converges very nicely.

Let’s implement this in R. We will make a fit for x=A+B e^{-Ct} and find the best A,B,C.

install.packages('matlib')   # provides inv(); only needed once
library(matlib)
# First make some noiseless data: a vector of t values and the
# corresponding x values for known A, B, C
t <- seq(1, 20, by = 1)
x <- 88 + 1000*exp(-1.23*t)
# initial guesses for the parameters
A <- 96
B <- 1100
C <- 1.5

R0 <- sum((x - (A + B*exp(-C*t)))^2)   # initial residual
R0

# Now we code the algorithm and run 10 iterations,
# hand-coding df/dA = 1, df/dB = exp(-C*t), df/dC = -B*t*exp(-C*t)
for (j in 1:10){
  J  <- matrix(c(rep(1, length(t)), exp(-C*t), -B*t*exp(-C*t)), nrow = length(t))
  Jt <- t(J)
  Q  <- Jt %*% J
  Qi <- inv(Q)
  dp <- (Qi %*% Jt) %*% (x - A - B*exp(-C*t))
  A  <- A + dp[1,1]
  B  <- B + dp[2,1]
  C  <- C + dp[3,1]
  R  <- sum((x - (A + B*exp(-C*t)))^2)
}
R   # final residual

Run the code. Before the run A=96, B=1100, C=1.5; after the run A=88, B=1000, C=1.23, in exact agreement with the parameters used to generate the data.
Below is an R function that fits data to a Lorentzian peak (such as a real spectral line). It is part of my “Physics307” data analysis package.
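As a cross-check (my addition, not part of the original), base R's built-in nls() performs the same kind of fit. A little noise is added to the data here because the nls() documentation warns against fitting artificial zero-residual data:

```r
set.seed(1)
t <- seq(1, 20, by = 1)
x <- 88 + 1000*exp(-1.23*t) + rnorm(length(t), sd = 0.5)

# same model and starting guess as in the hand-rolled loop above
fit <- nls(x ~ A + B*exp(-C*t), start = list(A = 96, B = 1100, C = 1.5))
coef(fit)   # close to A = 88, B = 1000, C = 1.23
```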

lorentzian <- deriv(~ w^2/(w^2+(x-b)^2), namevec = c("b", "w"), function.arg = c("x", "b", "w"))
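The deriv() call above builds the Jacobian automatically: it returns a function whose value carries the analytic gradient as a "gradient" attribute, which is what LorentzFit reads off instead of hand-coding the partials. A minimal illustration (my own example):

```r
# deriv() returns a function whose result holds a "gradient" attribute
g   <- deriv(~ a*x^2, namevec = "a", function.arg = c("x", "a"))
out <- g(x = 2, a = 3)
as.numeric(out)            # value: 3 * 2^2 = 12
attr(out, "gradient")      # d/da (a*x^2) = x^2 = 4
```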

LorentzFit <- function(ch, cntdat, b0, w0, iter){
#' Simple fit to a Lorentzian distribution y ~ A w^2/(w^2+(x-b)^2).
#' @param ch A vector of bin/channel labels.
#' @param cntdat A vector of bin occupancies/MCA channel counts.
#' @param b0 A first guess of the peak channel location.
#' @param w0 A first guess of the peak width in channels.
#' @param iter Number of iterations.
#' @return A vector c(R0, R, b, w, A): initial and final residuals, fit parameters, amplitude.
#' @examples
#' fit <- LorentzFit(x, y, 300, 60, 5)

  cnt  <- lorentzian(ch, b0, w0)
  grad <- attr(cnt, "gradient")
  val  <- cnt[1:length(ch)]
  Y    <- cntdat/max(cntdat)      # normalize so the amplitude A drops out of the fit
  R0   <- sum((Y - val)^2)
  for (i in 1:iter){
    J      <- t(grad)
    Q      <- t(grad) %*% grad
    Qi     <- inv(Q)              # inv() from matlib
    params <- Qi %*% J %*% (Y - val)
    b0 <- b0 + params[1]; w0 <- w0 + params[2]
    cnt  <- lorentzian(ch, b0, w0)
    grad <- attr(cnt, "gradient")
    val  <- cnt[1:length(ch)]
    R    <- sum((Y - val)^2)
  }
  c(R0, R, b0, w0, max(cntdat))
}