# Introduction

The R package OptimModel provides various nonlinear (or linear) curve-fitting functions that use stats::optim() as its base. The packages contains many commonly-used curves and also permits the user to create a new curve function as well. To estimate curve parameters, the user may call upon ordinary least squares (OLS), weighted least squares (WLS), iterative reweighted least squares (IRWLS), or maximum likelihood estimate (MLE). For WLS, the user provides a fixed set of weights. For IRWLS and MLE, the user may choose among a variety of weight functions. Finally, there is also a function for robust outlier detection (ROUT), based on the work of Motulsky and Brown (2006).

# Pre-specified models

In the OptimModel package, data are assumed to stem from the model

$y_i = f(\boldsymbol{\theta}, x_i) + e_i, i = 1, \ldots, n$ where $$x_i$$ is a concentration or dose, $$y_i$$ is the response, $$\boldsymbol{\theta}$$ is a vector of mean-model parameters, and $$e_i \sim N(0, \sigma^2 g^2(\boldsymbol{\theta}, \boldsymbol{\phi}))$$. The mean of $$y_i | x_i$$ is $$f(\boldsymbol{\theta}, x_i)$$ and the variance is $$\sigma^2 g^2(\boldsymbol{\theta}, \boldsymbol{\phi}))$$ with variance-model parameters $$\boldsymbol{\phi}$$. The default is $$g(\boldsymbol{\theta}, \boldsymbol{\phi}) = 1$$ so that $$e_i \sim N(0, \sigma^2)$$. The user, however, may instead provide non-negative valued numerical weights $$\{w_1, \ldots, w_n \}$$ so that $$e_i \sim N(0, \sigma^2 \sqrt{w_i})$$.

A number of pre-defined models for the mean and variance are provided in OptimModel. User-defined mean models and/or variance models may be supplied to the main function, optim_fit().

## Mean-model functions

Mean models are functions of the form $$f(\theta, x)$$, where $$\theta$$ is a parameter vector and $$x$$ is an R object that is processed by the function. Usually $$x$$ is a vector of doses or concentrations; however, $$x$$ may also be a matrix, data.frame, or list. Examples of user-defined mean models are provided at the end of this section.

Although it is not a requirement, each of the mean-model functions included in OptimModel is given several attributes, shown in the table below, where f_model takes the place of the model function.

Attribute Purpose
attr(f_model, ‘backsolve’) Find x such that f_model(theta, x) = y
attr(f_model, ‘start’) Starting values for optimization

If attr(f_model, ‘gradient’) is missing, the gradient of f_model will be approximated. If attr(f_model, ‘start’) is missing, the user must supply a starting value.

### Exponential decay (first order)

The first-order exponential decay model is given by $f(\boldsymbol{\theta}, x) = A + B \times exp(-K x),$ where $$x$$ is the concentration or dose, $$\boldsymbol{\theta} = (A, B, lK)$$, $$A$$ is the minimum value, $$A+B$$ is the maximum value, and $$K=exp(lK)$$ is the shape parameter.

## Exponential decay (first order) with plateau

The first-order exponential decay model with plateau is an extension of the exponential decay model and given as $$f(\boldsymbol{\theta}, x) = y_{max}$$ if $$x \le X_0$$ and $$f(\boldsymbol{\theta}, x) = y_{min} + (y_{max} - y_{min}) \times exp(-K (x-X0))$$ if $$x > X_0$$, where $$\boldsymbol{\theta} = (x_0, y_{min}, y_{max}, lK)$$, $$X_0=exp(x_0)$$ is the inflection point between plateau and exponential decay curve, $$y_{min}$$ is the minimum value, $$y_{max}$$ is the maximum value, and $$K=exp(lK)$$ is the shape parameter.

## Exponential decay (second order)

The second-order exponential decay model is given by $f(\boldsymbol{\theta}, x) = A + B \times \{ p \times \exp(-K_1 x) + (1-p) \times \exp(-K_2 x) \},$ where $$x$$ is the concentration or dose, $$\boldsymbol{\theta} = (A, B, lK_1, lK_2, p)$$, $$A$$ is the minimum value, $$A+B$$ is the maximum value, $$K_1=exp(lK_1)$$ and $$K_2 = exp(lK_2)$$ are shape parameters, and $$p$$ is the proportion of signal from the first exponential term.

### Gompertz model

The four-parameter Gompertz model is a sigmoidal shaped curve, given by

$f(\boldsymbol{\theta}, x) = A + (B - A)\times \exp( -\exp( m*(x- x_0)) ),$ where $$x$$ is the concentration or dose, $$\boldsymbol{\theta} = (A, B, m, x_0)$$, $$A$$ is the minimum value, $$A + (B-A)\times exp(-exp(-m x_0 ))$$ is the maximum value, $$m$$ is the shape parameter, and $$x_0$$ is an offset that shifts the curve on the $$x$$-axis.

### Four-parameter Hill (4PL) model

The Hill equation (Hill, 1910), which is equivalent to the four-parameter logistic curve (Verhulst, 1845), is a sigmoidal-shaped curve with meaningful parameters. The mean model is

$f(\boldsymbol{\theta}, x) = e_{min} + \frac{e_{max} - e_{min}}{ 1 + \exp( m \times( lec50 - log(x) ) )},$ where $$x$$ is the concentration or dose, $$\boldsymbol{\theta}=(e_{min}, e_{max}, lec50, m)$$, $$e_{min}$$ is the minimum asymptote, $$e_{max}$$ is the maximum asymptote, $$lec50$$ is the natural log EC50, and $$m$$ is the shape parameter. Note that $$f(\boldsymbol{\theta}, EC50) = \frac{1}{2}( e_{min} + e_{max})$$ and $$m$$ is often called the slope or Hill parameter. Refer to Hill (1910) or Verhulst (1845) for more information.

### Five-parameter Hill model

The 5-parameter Hill equation is an extension of the 4-parameter model that is asymmetric around the EC50. The model is given by

$f(\boldsymbol{\theta}, x) = e_{min} + \frac{e_{max} - e_{min}}{ \{1 + exp( m \times( lec50 - log(x) ) )\}^s},$ where $$x$$ is the concentration or dose, $$\boldsymbol{\theta}=(e_{min}, e_{max}, lec50, m, ls)$$, $$e_{min}$$ is the minimum asymptote, $$e_{max}$$ is the maximum asymptote, $$lec50$$ is the natural log EC50, $$m$$ is the shape parameter, and $$s=exp(ls)$$ is the symmetry parameter. Note that $$f(\boldsymbol{\theta}, EC50) = \frac{1}{2^s}( e_{min} + e_{max})$$ and $$m$$ is often called the slope or Hill parameter.

The Hill model with a quadratic term is a five-parameter equation for biphasic data. The mean model is given by

$f(\boldsymbol{\theta}, x) = e_{min} + \frac{e_{max} - e_{min}}{ \{1 + exp( -\{a + b z + c z^2 \})},$ where $$z=log(x)$$, $$x$$ is the concentration or dose, $$\boldsymbol{\theta}=(e_{min}, e_{max}, a, b, c)$$, $$e_{min}$$ is the minimum asymptote, $$e_{max}$$ is the maximum asymptote, and $$(a, b, c)$$ are respectively the intercept, linear, and quadratic terms.

Note that if $$c = 0$$, this model is equivalent to the four-parameter Hill model (hill.model). Also, the $$EC50$$ is defined where $$a + b z + c z^2 = 0$$. If the roots of the quadratic equation are real, then the $$EC50$$ is given by $$(-b +/- \sqrt{b^2 - 4 a c})/(2 c)$$. The user decides which of the roots is meaningful.

### Hill switchpoint model

The Hill switchpoint is a biphasic equation that extends the four-parameter Hill model. The mean model is

$f(\boldsymbol{\theta}, x) = e_{min} + \frac{e_{max} - e_{min}}{ 1 + exp( m f(s, x) \times( lec50 - log(x) ) )},$ where $$x$$ is the concentration or dose, $$\boldsymbol{\theta}=(e_{min}, e_{max}, lec50, m, ls)$$, $$e_{min}$$ is the minimum asymptote, $$e_{max}$$ is the maximum asymptote, $$lec50$$ is the natural log EC50, and $$m$$ is the shape parameter. Note that $$f(\boldsymbol{\theta}, EC50) = \frac{1}{2}( e_{min} + e_{max})$$ and $$m$$ is often called the slope or Hill parameter.

### Beta model

The Beta model is a biphasic curve based on the Beta function. Found in MCPMod::betaMod(), the curve is given by

$f(\boldsymbol{\theta}, x) = e_{min} + e_{max} \times exp \left( log\{ \beta(\delta_1, \delta_2) \} + \delta_1 \times log(x) + \delta_2 \times log(sc - x) - (\delta_1 + \delta_2) \times log(sc) \right),$ where $$\beta(\delta_1, \delta_2) = (\delta_1+\delta_2)^{(\delta_1+\delta_2)} / ( \delta_1^{\delta_1} \times \delta_2^{\delta^2} )$$, $$e_{min}$$ is a lower aymptote, $$e_{max}$$ is an upper asymptote, $$sc$$ is a scaling parameter, and $$\delta_1$$ and $$\delta_2$$ are power parameters.

### User defined mean model

An example of a user-defined mean model is from simple linear regression. Code is provided below in which theta = c( intercept, slope ) and $$x$$ is a vector.

slr_model = function(theta, x){  theta[1] + theta[2]*x }

theta = c( 2, -5 )   ## Intercept, slope
x = seq(1, 10, by=0.5)
slr_model(theta, x)


The next example shows $$x$$ as a matrix for multivariate linear regression.

mlr_model = function(theta, x){  theta[1] + theta[2]*x[,1] + theta[3]*x[,2] }

theta = c( 2, -5, 0.1 )   ## Intercept, slope1, slope2
x1 = seq(1, 10, by=0.5)
x2 = seq(10, 1, by=-0.5)
mlr_model(theta, x=cbind(x1, x2))


## Weight/Variance functions

The package OptimModel also provides several pre-defined weight/variance functions. The default function is weights_varIdent(), which sets $$g(\boldsymbol{\theta}, \boldsymbol{\phi}) = 1$$ for all observations (i.e., no weights). The user has no reason to call weights_varIdent().

The following table shows the forms of $$g(\boldsymbol{\theta}, \boldsymbol{\phi})$$ for pre-defined functions, where $$mu = f(\boldsymbol{\theta}, x)$$ and $$resid = y - mu$$.

Function Code
weights_varIdent(phi, mu) rep(1, length(mu))
weights_varExp(phi, mu) exp(phi*mu)
weights_varPower(phi, mu) abs(mu)^(phi)
weights_varConstPower(phi, mu) phi[1]+abs(mu)^(phi[2])
weights_tukey_bw(phi=4.685, resid) see below
weights_huber(phi=1.345, resid) see below

### Tukey bi-weight

The weight function is $$w = (1-(r/\phi)^2)^2$$ if $$r \le \phi$$ and $$w=0$$ otherwise, where $$r = |resid|/sig$$ and sig = mad(resid, center=0). Traditionally, $$phi = 4.685$$.

### Huber weights

The weight function is $$w = min(1, \phi / r)$$, where $$r = |resid|/sig$$ and sig = mad(resid, center=0). Traditionally, $$phi = 1.345$$. This is also the default weight used in MASS::rlm().

### User-defined weights

The user may provide a weight function with arguments phi and mu. For example,

weights_user1 = function(phi, mu){  ifelse( mu < 5, (1/mu), (1/mu)^phi ) }

mu = 1:10
phi = 2.4
weights_user1(phi, mu)

## Calling optim_fit

Usage of the optim_fit() function will be illustrated with a data set generated from the hill_model(). An optim_fit() object is associated with S3-type functionality for print, fitted, and residuals.

### OLS and WLS

Data are generated from hill_model(). For OLS and WLS, the call to optim_fit() is bare boned. The print.optim_fit() function provides parameter estimates, standard errors, and 95% confidence intervals. It also provides the rMSE (sigma), an $$r^2$$ value, and BIC value.


library(OptimModel)

set.seed(456)
x = rep( c(0, 2^(-4:4)), each=4 )
theta = c(0, 100, log(.5), 2)
y = hill_model(theta, x) + rnorm( length(x), sd=2 )

## No need to include a starting value.
fit = optim_fit(theta0=NULL, f.model=hill_model, x=x, y=y)
print(fit)
#>
#> Fit method = 'ols'
#> Variance computation method = 'hessian'
#> Weights = weights_varIdent
#> Convergence : Achieved
#>
#> 95% Wald CI for parameters
#>       par     standard.error lower   upper
#> emin  -0.0458 0.5974         -1.2574   1.1658
#> emax  99.2786 0.6993         97.8603 100.6969
#> lec50 -0.6503 0.0184         -0.6877  -0.6130
#> m      2.1594 0.0774          2.0024   2.3165
#> sigma = 2.0426 on 36 degrees of freedom
#> r-squared = 0.9979
#> BIC =  184.8841   (smaller is better)

## Coefficients
coef(fit)
#>        emin        emax       lec50           m
#> -0.04582586 99.27856807 -0.65033795  2.15943681

## Fitted values
fitted(fit)
#>  [1] 99.27856807 99.27856807 99.27856807 99.27856807 98.27320258 98.27320258
#>  [7] 98.27320258 98.27320258 94.93948372 94.93948372 94.93948372 94.93948372
#> [13] 82.44415089 82.44415089 82.44415089 82.44415089 51.91021938 51.91021938
#> [19] 51.91021938 51.91021938 19.53345242 19.53345242 19.53345242 19.53345242
#> [25]  5.12854738  5.12854738  5.12854738  5.12854738  1.16123128  1.16123128
#> [31]  1.16123128  1.16123128  0.22693902  0.22693902  0.22693902  0.22693902
#> [37]  0.01536124  0.01536124  0.01536124  0.01536124

## Residuals
residuals(fit)
#>  [1] -1.965610882  1.964983041  2.323181264 -2.056352891 -1.240377833
#>  [6] -0.459786220  1.569621876  0.689431687  1.192867878  0.324632706
#> [11] -2.653457701  1.800358234 -0.466698260  0.863706482 -5.325761356
#> [16]  1.450561979  1.563652969 -1.135252733  2.649848560  1.165547240
#> [21] -0.482660037 -2.968070025 -2.387113040  0.883019419  0.682133203
#> [26]  3.022374689 -0.171904379  0.097037498  3.346309206 -1.801525553
#> [31] -0.680358172 -0.810355452 -3.835665264  0.754472293  0.503416542
#> [36]  3.793470677 -1.239006478 -0.198304151 -0.765758483  0.004728182
#> attr(,"label")
#> [1] "Residuals"

The predict() function may be used to obtain mean-model estimates at specific x values. It may also provide standard errors, confidence intervals, and prediction intervals. A graph of the data with 95% confidence bands is provided.

predict(fit, x=2:5)
#>      x     y.hat
#> [1,] 2 5.1285474
#> [2,] 3 2.1775014
#> [3,] 4 1.1612313
#> [4,] 5 0.7031703

predict(fit, x=2:5, se.fit=TRUE)
#>      x     y.hat    se.fit
#> [1,] 2 5.1285474 0.5130136
#> [2,] 3 2.1775014 0.4853637
#> [3,] 4 1.1612313 0.5104829
#> [4,] 5 0.7031703 0.5326739

## 95% confidence interval
predict(fit, x=2:5, interval="confidence")
#>      x     y.hat    se.fit      lower    upper
#> [1,] 2 5.1285474 0.5130136  4.0881076 6.168987
#> [2,] 3 2.1775014 0.4853637  1.1931383 3.161864
#> [3,] 4 1.1612313 0.5104829  0.1259241 2.196538
#> [4,] 5 0.7031703 0.5326739 -0.3771424 1.783483

## 90% confidence interval
predict(fit, x=2:5, interval="confidence", level=0.9)
#>      x     y.hat    se.fit      lower    upper
#> [1,] 2 5.1285474 0.5130136  4.2624277 5.994667
#> [2,] 3 2.1775014 0.4853637  1.3580630 2.996940
#> [3,] 4 1.1612313 0.5104829  0.2993843 2.023078
#> [4,] 5 0.7031703 0.5326739 -0.1961418 1.602482

## 95% prediciton interval for next observation (K=1)
predict(fit, x=2:5, interval="prediction", K=1)
#>      x     y.hat    se.fit      lower    upper
#> [1,] 2 5.1285474 0.5130136  0.8572438 9.399851
#> [2,] 3 2.1775014 0.4853637 -2.0804899 6.435493
#> [3,] 4 1.1612313 0.5104829 -3.1088250 5.431288
#> [4,] 5 0.7031703 0.5326739 -3.5780205 4.984361

d = data.frame(x=x, y=y)
pred = as.data.frame(predict(fit, x=2^seq(-6, 4, length=25), interval="confidence"))

library(ggplot2)
ggplot( d, aes(x=x, y=y) ) + geom_point() +
geom_line( data=pred, aes(x, y.hat)) +
geom_ribbon( data=pred, aes(x=x, y=y.hat, ymin=lower, ymax=upper), alpha=0.3) +
scale_x_log10() + ggtitle("OLS")
#> Warning: Transformation introduced infinite values in continuous x-axis


## Fixed weights.  In this example, weights are increasing step-wise with x
w = ifelse(x < 0.1, 0.5, ifelse(x < 2, 1, 2))
## No need to include a starting value.
fit = optim_fit(theta0=NULL, f.model=hill_model, x=x, y=y, wts=w)
print(fit)
#>
#> Fit method = 'ols'
#> Variance computation method = 'hessian'
#> Weights = user.defined
#> Convergence : Achieved
#>
#> 95% Wald CI for parameters
#>       par     standard.error lower   upper
#> emin  -0.0430 0.5042         -1.0655   0.9796
#> emax  99.3303 1.0216         97.2584 101.4022
#> lec50 -0.6496 0.0220         -0.6942  -0.6049
#> m      2.1396 0.0862          1.9648   2.3145
#> sigma = 2.333 on 36 degrees of freedom
#> r-squared = 0.9978
#> BIC =  189.9727   (smaller is better)

pred = as.data.frame(predict(fit, x=2^seq(-6, 4, length=25), interval="confidence"))

ggplot( d, aes(x=x, y=y) ) + geom_point() +
geom_line( data=pred, aes(x, y.hat)) +
geom_ribbon( data=pred, aes(x=x, y=y.hat, ymin=lower, ymax=upper), alpha=0.3) +
scale_x_log10() + ggtitle("WLS")
#> Warning: Transformation introduced infinite values in continuous x-axis

### MLE

The same curve may be fitted to the data via maximum likelihood. In the following code output, note that the starting value needs to include log of the standard deviation (‘lsigma’) as the last parameter.


fit = optim_fit(theta0=NULL, f.model=hill_model, x=x, y=y, fit.method="mle")
print(fit)
#>
#> Fit method = 'mle'
#> Variance computation method = 'hessian'
#> Weights = weights_varIdent
#> Convergence : Achieved
#>
#> 95% Wald CI for parameters
#>        par     standard.error lower   upper
#> emin   -0.0458 0.5668         -1.1566   1.0651
#> emax   99.2788 0.6634         97.9785 100.5791
#> lec50  -0.6503 0.0175         -0.6846  -0.6161
#> m       2.1594 0.0735          2.0155   2.3034
#> lsigma  0.6616 0.1118          0.4424   0.8807
#> sigma = 1.9378 on Inf degrees of freedom
#> r-squared = 0.9979
#> BIC =  188.573   (smaller is better)
#> var/weight param(s):
#> [1] none

Maximum likelihood can also be used with variance model functions. The following code illustrates the method with a power-of-the-mean variance model (weights_varPower) with $$\sigma=0.7$$ and $$\phi = 0.5$$.


set.seed(876)

## Standard deviation = sigma*( Mean )^0.5
y2 = hill_model(theta, x) + rnorm( length(x), sd=0.7*sqrt(hill_model(theta, x)) )

fit = optim_fit(theta0=NULL, f.model=hill_model, x=x, y=y2, wts=weights_varPower,
fit.method="mle", phi.fixed=FALSE)
print(fit)
#>
#> Fit method = 'mle'
#> Variance computation method = 'hessian'
#> Weights = weights_varPower
#> Convergence : Achieved
#>
#> 95% Wald CI for parameters
#>        par      standard.error lower   upper
#> emin    -0.3626 0.1237         -0.6050  -0.1202
#> emax   102.9510 1.7656         99.4906 106.4114
#> lec50   -0.7299 0.0383         -0.8049  -0.6549
#> m        1.8806 0.0775          1.7286   2.0326
#> v1       0.4190 0.0426          0.3355   0.5024
#> lsigma  -0.2895 0.1543         -0.5919   0.0128
#> sigma = 0.7486 on Inf degrees of freedom
#> r-squared = 0.9997
#> BIC =  196.694   (smaller is better)
#> var/weight param(s):
#>   phi
#> 0.419

### IRWLS

The same variance model may be fitted to the data using iterative reweighted least squares.


fit = optim_fit(theta0=NULL, f.model=hill_model, x=x, y=y2, wts=weights_varPower,
fit.method="irwls", phi.fixed=FALSE)
#> Warning in nlm(.trkfunc, phi.start, hessian = FALSE, typsize = rep(1,
#> length(phi)), : NA/Inf replaced by maximum positive value
print(fit)
#>
#> Fit method = 'irwls'
#> Variance computation method = 'hessian'
#> Weights = weights_varPower
#> Convergence : Achieved
#>
#> 95% Wald CI for parameters
#>       par      standard.error lower   upper
#> emin    0.0502 0.1314         -0.2162   0.3166
#> emax  102.4983 1.8791         98.6872 106.3094
#> lec50  -0.7251 0.0401         -0.8065  -0.6437
#> m       1.9330 0.0773          1.7763   2.0897
#> sigma = 0.6392 on 36 degrees of freedom
#> r-squared = 0.9998
#> BIC =  186.3115   (smaller is better)
#> var/weight param(s):
#>    phi
#> 0.4714

d = data.frame(x=x, y=y2)
pred = as.data.frame(predict(fit, x=2^seq(-6, 4, length=25), interval="confidence"))

ggplot( d, aes(x=x, y=y) ) + geom_point() +
geom_line( data=pred, aes(x, y.hat)) +
geom_ribbon( data=pred, aes(x=x, y=y.hat, ymin=lower, ymax=upper), alpha=0.3) +
scale_x_log10()
#> Warning: Transformation introduced infinite values in continuous x-axis

### Robust model fit

Iterative reweighted least squares may be implemented with a robust weighting scheme, such as Huber weighting, to down-weight outliers. With OLS, the curve gets pulled slightly downward by the outliers, but with the Huber weighting, the outliers are virtually ignored.


set.seed(456)
x = rep( c(0, 2^(-4:4)), each=4 )
theta = c(0, 100, log(.5), 2)
y = hill_model(theta, x) + rnorm( length(x), sd=2 )

## Create outliers
y[c(8, 11)] = y[c(8, 11)] + -30

## Ordinary least squares
fit = optim_fit(theta0=NULL, f.model=hill_model, x=x, y=y)

d = data.frame(x=x, y=y)
pred = as.data.frame(predict(fit, x=2^seq(-6, 4, length=25), interval="confidence"))

ggplot( d, aes(x=x, y=y) ) + geom_point() +
geom_line( data=pred, aes(x, y.hat)) +
geom_ribbon( data=pred, aes(x=x, y=y.hat, ymin=lower, ymax=upper), alpha=0.3) +
scale_x_log10() + ggtitle("OLS fit")
#> Warning: Transformation introduced infinite values in continuous x-axis


## IRWLS with Huber weights
fit = optim_fit(theta0=NULL, f.model=hill_model, x=x, y=y, wts=weights_huber,
fit.method="irwls")

d = data.frame(x=x, y=y)
pred = as.data.frame(predict(fit, x=2^seq(-6, 4, length=25), interval="confidence"))

ggplot( d, aes(x=x, y=y) ) + geom_point() +
geom_line( data=pred, aes(x, y.hat)) +
geom_ribbon( data=pred, aes(x=x, y=y.hat, ymin=lower, ymax=upper), alpha=0.3) +
scale_x_log10() + ggtitle("IRWLS + Huber weights fit")
#> Warning: Transformation introduced infinite values in continuous x-axis

### Multiple starting values

As a convenience, optim_fit() can be called upon to try starting values in a neighborhood of the original starting value (either user-supplied theta0 or the attr(f_model, ‘start’) value). The following code illustrates its usage. The model is initially fitted with OLS and a singular starting value. Although the fitted object claims that convergence is achieved, the $$r^2=0.50$$ value is contradictory. When a grid of neighboring starting value is tried, one of them achieves a much lower BIC. A graph of the model fit is shown with a single starting value (red) and the fixed (green) and random (blue) neighborhoods of starting values. Note that the green and blue curves overlap.


set.seed(123)
theta = c(0, 100, log(0.25), -3, -2)
y = hill_switchpoint_model(theta, x) + rnorm( length(x), mean=0, sd=5 )

## Try OLS, but original starting value does not provide convergence
fit1 = optim_fit(NULL, hill_switchpoint_model, x=x, y=y)
print(fit1)
#>
#> Fit method = 'ols'
#> Variance computation method = 'hessian'
#> Weights = weights_varIdent
#> Convergence : Achieved
#>
#> 95% Wald CI for parameters
#>      par     standard.error lower      upper
#> emin  0.6472    3.3471         -6.1478    7.4421
#> emax 30.2027    3.7421         22.6058   37.7995
#> ec50 12.1289 1125.6173      -2272.9957 2297.2536
#> m     3.1558  270.7225       -546.4402  552.7517
#> lsp  -0.6482    1.4261         -3.5434    2.2469
#> sigma = 14.9684 on 35 degrees of freedom
#> r-squared = 0.5047
#> BIC =  346.7826   (smaller is better)

## Fixed grid of equally-spaced starting values
fit2 = optim_fit(NULL, hill_switchpoint_model, x=x, y=y, ntry=50,
start.method="fixed", until.converge=FALSE)
#> Warning in sqrt(diag(fit$varBeta)): NaNs produced #> Warning in sqrt(diag(fit$varBeta)): NaNs produced

#> Warning in sqrt(diag(fit$varBeta)): NaNs produced #> Error in solve.default(0.5 * fit$hessian) :
#>   system is computationally singular: reciprocal condition number = 1.00216e-116
#> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): Invalid
#> variance-covariance matrix.

#> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): NaNs produced

#> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): NaNs produced

#> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): NaNs produced

#> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): NaNs produced
#> Error in solve.default(0.5 * fit$hessian) : #> system is computationally singular: reciprocal condition number = 7.17357e-17 #> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): Invalid #> variance-covariance matrix. #> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): NaNs produced #> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): NaNs produced #> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): NaNs produced #> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): NaNs produced #> Error in solve.default(0.5 * fit$hessian) :
#>   system is computationally singular: reciprocal condition number = 1.86509e-36
#> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): Invalid
#> variance-covariance matrix.
print(fit2)
#>
#> Fit method = 'ols'
#> Variance computation method = 'hessian'
#> Weights = weights_varIdent
#> Convergence : Achieved
#>
#> 95% Wald CI for parameters
#>      par      standard.error lower     upper
#> emin   0.7125   0.9154         -1.1458   2.5709
#> emax  50.3381   1.5857         47.1191  53.5572
#> ec50  -0.7070   0.0928         -0.8954  -0.5187
#> m    -28.8584 191.3471       -417.3136 359.5968
#> lsp   -2.7548   0.1192         -2.9967  -2.5128
#> sigma = 4.4844 on 35 degrees of freedom
#> r-squared = 0.9555
#> BIC =  250.3562   (smaller is better)

## Random grid of starting values
fit3 = optim_fit(NULL, hill_switchpoint_model, x=x, y=y, ntry=50,
start.method="fixed", until.converge=FALSE)
#> Warning in sqrt(diag(fit$varBeta)): NaNs produced #> Warning in sqrt(diag(fit$varBeta)): NaNs produced

#> Warning in sqrt(diag(fit$varBeta)): NaNs produced #> Error in solve.default(0.5 * fit$hessian) :
#>   system is computationally singular: reciprocal condition number = 1.00216e-116
#> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): Invalid
#> variance-covariance matrix.

#> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): NaNs produced

#> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): NaNs produced

#> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): NaNs produced

#> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): NaNs produced
#> Error in solve.default(0.5 * fit$hessian) : #> system is computationally singular: reciprocal condition number = 7.17357e-17 #> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): Invalid #> variance-covariance matrix. #> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): NaNs produced #> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): NaNs produced #> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): NaNs produced #> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): NaNs produced #> Error in solve.default(0.5 * fit$hessian) :
#>   system is computationally singular: reciprocal condition number = 1.86509e-36
#> Warning in .get.fit.stats(fit[[i]], x, y, conf.level): Invalid
#> variance-covariance matrix.
print(fit3)
#>
#> Fit method = 'ols'
#> Variance computation method = 'hessian'
#> Weights = weights_varIdent
#> Convergence : Achieved
#>
#> 95% Wald CI for parameters
#>      par      standard.error lower     upper
#> emin   0.7125   0.9154         -1.1458   2.5709
#> emax  50.3381   1.5857         47.1191  53.5572
#> ec50  -0.7070   0.0928         -0.8954  -0.5187
#> m    -28.8584 191.3471       -417.3136 359.5968
#> lsp   -2.7548   0.1192         -2.9967  -2.5128
#> sigma = 4.4844 on 35 degrees of freedom
#> r-squared = 0.9555
#> BIC =  250.3562   (smaller is better)

d = data.frame(x=x, y=y)
x.seq = 2^seq(-6, 4, length=25)
pred = data.frame(x=rep(x.seq, 3),
Fit=rep(c("1 starting value", "Fixed grid", "Random grid"), each=25),
mu=c(predict(fit1, x=x.seq)[,"y.hat"],
predict(fit2, x=x.seq)[,"y.hat"],
predict(fit3, x=x.seq)[,"y.hat"])
)

ggplot( d, aes(x=x, y=y) ) + geom_point() +
geom_line( data=pred, aes(x, mu, col=Fit)) +
scale_x_log10() + theme(legend.position="top") +
guides(col=guide_legend(ncol=2))
#> Warning: Transformation introduced infinite values in continuous x-axis

### User-defined model

To illustrate a simple user-defined model, a three-parameter Hill model is constructed with $$e_{max}=100$$.


set.seed(456)
x = rep( c(0, 2^(-4:4)), each=4 )
theta = c(0, log(.5), 2)

hill3_model = function(theta, x){
## Parameters are theta = (emin, lec50, m).  Assumes emax = 100.
mu = theta[1] + (100 - theta[1])/(1+exp(theta[3]*log(x) - theta[3]*theta[2]))
return(mu)
}

y = hill3_model(theta, x) + rnorm( length(x), sd=2 )

fit = optim_fit(theta0=c(emin=-1, lec50=log(0.5), m=1), f.model=hill3_model, x=x, y=y)
print(fit)
#>
#> Fit method = 'ols'
#> Variance computation method = 'hessian'
#> Weights = weights_varIdent
#> Convergence : Achieved
#>
#> 95% Wald CI for parameters
#>       par     standard.error lower   upper
#> emin  -0.1492 0.5925         -1.3496  1.0513
#> lec50 -0.6590 0.0165         -0.6924 -0.6257
#> m      2.1188 0.0646          1.9878  2.2498
#> sigma = 2.0441 on 37 degrees of freedom
#> r-squared = 0.9978
#> BIC =  182.3478   (smaller is better)

### Functions of parameters

As a convenience function, the standard error (using the delta method) and confidence interval for a function of model parameters may be obtained with get_se_func() as demonstrated below. The hill_model parameters are $$(e_{min}, e_{max}, lec50, m)$$ and the function of interest is $$lec50 + (1/m) log( \frac{0.5 e_{max}}{ 0.5 e_{max} - e_{min}} )$$.


set.seed(456)
x = rep( c(0, 2^(-4:4)), each=4 )
theta = c(0, 100, log(.5), 2)
y = hill_model(theta, x) + rnorm( length(x), sd=2 )

fit = optim_fit(theta0=NULL, f.model=hill_model, x=x, y=y)

myfunc = function(theta){ theta[3] + (1/theta[4])*log( 0.5*theta[2]/(0.5*theta[2] - theta[1])) }
myfunc( coef(fit) )
#>      lec50
#> -0.6507653
get_se_func(object=fit, Func=myfunc)
#>              est         SE     lower      upper
#> lec50 -0.6507653 0.01722887 -0.685707 -0.6158235

## Detecting outliers with ROUT

Instead of down-weighting outliers, sometimes it is more desirable to detect and eliminate them. The Robust Outlier (ROUT) detection method of Motulsky and Brown (2008) has been implemented in the OptimModel package. The Huber weights data example is repeated, but with the rout_fitter().


set.seed(456)
x = rep( c(0, 2^(-4:4)), each=4 )
theta = c(0, 100, log(.5), 2)
y = hill_model(theta, x) + rnorm( length(x), sd=2 )

## Create outliers
y[c(8, 11)] = y[c(8, 11)] + -30

## ROUT
fit_r = rout_fitter(theta0=NULL, f.model=hill_model, x=x, y=y)
print(fit_r)
#>
#>
#> ROUT Fitted Model Statistics
#>
#> Convergence : Achieved
#>
#> Parameter estimates
#>        emin        emax       lec50           m lsig.68.27%
#>     -0.2608     99.3133     -0.6281      2.2006      0.0731
#>
#> RSDR = 2.1934 from 40 obsevations and 4 parameter(s)
#>
#> r-sq (without FDR correction) = 0.9983
#> r-sq (with FDR correction) = 0.9977
#>
#> Q = 0.01
#> # of outliers detected (without FDR correction) = 3
#> # of outliers detected (with FDR correction) = 2

## Get the outliers and remove them
print(which(fit_r$outlier.adj)) #> [1] 8 11 index = !fit_r$outlier.adj
x.clean = x[index]
y.clean = y[index]

## Refit model with OLS, but outliers removed
fit = optim_fit(NULL, hill_model, x=x.clean, y=y.clean)

d = data.frame(x=x, y=y)
pred = as.data.frame(predict(fit, x=2^seq(-6, 4, length=25), interval="confidence"))

ggplot( d, aes(x=x, y=y) ) + geom_point() +
geom_line( data=pred, aes(x, y.hat)) +
geom_ribbon( data=pred, aes(x=x, y=y.hat, ymin=lower, ymax=upper), alpha=0.3) +
scale_x_log10() + ggtitle("OLS fit after ROUT")
#> Warning: Transformation introduced infinite values in continuous x-axis

## Summary

The OptimModel package provides a wide variety of curve-fitting options. From an optim_fit() object, the user may obtain a printed summary of the model fit, residuals, predicted values with standard error, confidence intervals, and prediction intervals. The standard error and confidence intervals may also be obtained for functions of the model parameters. The user may opt to fit the model with OLS, MLE, or IRWLS. The ROUT outlier detection method is also available.

## References

1. Motulsky, H.J. and Brown, R.E. (2006) Detecting Outliers When Fitting Data with Nonlinear Regression: A New Method Based on Robust Nonlinear Regression and the False Discovery Rate. BMC Bioinformatics, 7, 123.

2. Hill AV. (1910) The possible effects of the aggregation of the molecules of haemoglobin on its dissociation curves. J Physiol (Lond.) 40:iv–vii.

3 Verhulst, Pierre-François (1845). “Recherches mathématiques sur la loi d’accroissement de la population” [Mathematical Researches into the Law of Population Growth Increase]. Nouveaux Mémoires de l’Académie Royale des Sciences et Belles-Lettres de Bruxelles. 18: 8. Retrieved 18 February 2013. Nous donnerons le nom de logistique à la courbe [We will give the name logistic to the curve]