Because of the length of time needed to run the vignettes, only static vignettes have been included with this package.
The original of the vignettes and the code can be obtained from the GitHub site at https://github.com/cschwarz-stat-sfu-ca/BTSPAS
The two-sample capture-recapture experiment is one of the simplest possible studies with a long history. The standard Lincoln-Petersen estimator used to estimate abundance is \[ \widehat{N} = \frac{n_1 n_2}{m_2}\] where \(n_1\) is the number of animals captured, marked and released at the first capture event; \(n_2\) is the number of animals captured at the second capture event; and \(m_2\) is the number of animals from \(n_1\) that were recaptured at the second event (i.e. the number of recaptured marked animals).
A key assumption of the Lincoln-Petersen estimator is that there is no correlation in capture-probabilities at the two events. The most common way in which this can occur is if the probability of capture is equal for all animals at either the first or second event.
If capture probabilities are heterogeneous, then estimates can be biased. One way to account for heterogeneous capture-probabilities is through stratification. For example, if males and females have different catchabilities, then separate Lincoln-Petersen estimators can be computed for each sex, and the estimated abundance for the entire population is found by summing the two estimated abundances (one for each sex).
Stratification can be based on animal attributes (e.g. sex), geographic location (e.g. parts of a study area), or temporal (e.g. when captured). The \(BTSPAS\) package deals with temporal stratification.
Consider an experiment to estimate the number of outgoing smolts on a small river. The run of smolts extends over several weeks. As smolts migrate, they are captured and marked with individually numbered tags and released at the first capture location using, for example, a fishwheel. The migration continues, and a second fishwheel takes a second sample several kilometers down stream. At the second fishwheel, the captures consist of a mixture of marked (from the first fishwheel) and unmarked fish.
The efficiency of the fishwheels varies over time in response to stream flow, run size passing the wheel and other uncontrollable events. So it is unlikely that the capture probabilities are equal over time at either location, i.e. are heterogeneous over time.
We suppose that we can temporally stratify the data into, for example, weeks, where the capture-probabilities are (mostly) homogeneous at each wheel in each week. Furthermore, suppose that fish captured and marked in each week tend to migrate together so that they are captured in a single subsequent stratum. For example, suppose that in each julian week \(j\), \(n1[j]\) fish are marked and released above the rotary screw trap. Of these, \(m2[j]\) are recaptured. All recaptures take place in the week of release, i.e. the matrix of releases and recoveries is diagonal. The \(n1[j]\) and \(m2[j]\) establish the capture efficiency of the second trap in julian week \(j\).
At the same time, \(u2[j]\) unmarked fish are captured at the screw trap.
This implies that the data can be structured as a diagonal array similar to:
Recovery Stratum
tagged rs1 rs2 rs3 ... rsk
Marking ms1 n1[1] m2[1] 0 0 ... 0
Stratum ms2 n1[2] 0 m2[2] 0 ... 0
ms3 n1[3] 0 0 m2[3] ... 0
...
msk n1[k] 0 0 0 ... m2[k]
Newly
Untagged u2[1] u2[2] u2[3] ... u2[k]
captured
Here the tagging and recapture events have been stratified into \(k\) temporal strata. Marked fish from one stratum tend to move at similar rates and so are recaptured together with unmarked fish. Recaptures of marked fish take place along the “diagonal.”
Because the matrix is diagonal, and because the \(u2\) vector is the same length as the \(n1\) and \(m2\) vectors, the data can be entered as several columns. Here is an example of some raw data:
<- textConnection(
demo.data.csv 'jweek, n1, m2, u2
9 ,0, 0, 4135
10,1465, 51, 10452
11,1106,121, 2199
12, 229, 25, 655
13, 20, 0, 308
14, 177, 17, 719
15, 702, 74, 973
16, 633, 94, 972
17,1370, 62, 2386
18, 283, 10, 469
19, 647, 32, 897
20, 276, 11, 426
21, 277, 13, 407
22, 333, 15, 526
23,3981,242, 39969
24,3988, 55, 17580
25,2889,115, 7928
26,3119,198, 6918
27,2478, 80, 3578
28,1292, 71, 1713
29,2326,153, 4212
30,2528,156, 5037
31,2338,275, 3315
32,1012,101, 1300
33, 729, 66, 989
34, 333, 44, 444
35, 269, 33, 339
36, 77, 7, 107
37, 62, 9, 79
38, 26, 3, 41
39, 20, 1, 23
40,4757,188, 35118
41,2876, 8, 34534
42,3989, 81, 14960
43,1755, 27, 3643
44,1527, 30, 1811
45, 485, 14, 679
46, 115, 4, 154')
<- read.csv(demo.data.csv, header=TRUE, as.is=TRUE, strip.white=TRUE)
demo.data
print(demo.data)
#> jweek n1 m2 u2
#> 1 9 0 0 4135
#> 2 10 1465 51 10452
#> 3 11 1106 121 2199
#> 4 12 229 25 655
#> 5 13 20 0 308
#> 6 14 177 17 719
#> 7 15 702 74 973
#> 8 16 633 94 972
#> 9 17 1370 62 2386
#> 10 18 283 10 469
#> 11 19 647 32 897
#> 12 20 276 11 426
#> 13 21 277 13 407
#> 14 22 333 15 526
#> 15 23 3981 242 39969
#> 16 24 3988 55 17580
#> 17 25 2889 115 7928
#> 18 26 3119 198 6918
#> 19 27 2478 80 3578
#> 20 28 1292 71 1713
#> 21 29 2326 153 4212
#> 22 30 2528 156 5037
#> 23 31 2338 275 3315
#> 24 32 1012 101 1300
#> 25 33 729 66 989
#> 26 34 333 44 444
#> 27 35 269 33 339
#> 28 36 77 7 107
#> 29 37 62 9 79
#> 30 38 26 3 41
#> 31 39 20 1 23
#> 32 40 4757 188 35118
#> 33 41 2876 8 34534
#> 34 42 3989 81 14960
#> 35 43 1755 27 3643
#> 36 44 1527 30 1811
#> 37 45 485 14 679
#> 38 46 115 4 154
The first stratum was defined as julian week 9. At this time, 0 fish were captured, tagged, and released, but 4135 unmarked fish were recovered in the first recovery stratum. In the next week, 1465 fish were captured, tagged, and released, with 51 fish recaptured, and 10452 unmarked fish recovered.
A pooled-Petersen estimator would add all of the marked, recaptured and unmarked fish to give an estimate of 10,602,698,039 which seems unbelievable!
What went wrong? Let us first examine a plot of the estimated capture efficiency at the second trap for each (recovery) julian week.
There are several unusual features
Similarly, let us look at the pattern of unmarked fish captured at the second trap:
There are two julian weeks where the number of unmarked fish captured suddenly jumps by several orders of magnitude (remember the above plot is on the log() scale). These jumps correspond to releases of hatchery fish into the system.
Finally, let us look at the individual estimates for each stratum found by computing a Petersen estimator for the total number of unmarked fish for each individual stratum:
The \(BTSPAS\) package attempts to strike a balance between the completely pooled Petersen estimator and the completely stratified Petersen estimator. In the former, capture probabilities are assumed to equal for all fish in all strata, while in the latter, capture probabilities are allowed to vary among strata in no structured way. Furthermore, fish populations often have a general structure to the run, rather than arbitrarily jumping around from stratum to stratum.
Bonner and Schwarz (2011) developed a suite of models that add structure. In the basic model,
The model also allows the user to use covariates to explain some of the variation in the capture probabilities. Bonner and Schwarz (2011) also developed models where fish are recaptured in more than stratum.
The \(BTSPAS\) package also has additional features and options:
I find it easiest if bad recaptures (a value of \(m2\)) result in zeroing out both \(n1\) and \(m2\) for that stratum.
The \(BTSPAS\) function also allows you specify
We already read in the data above. Here we set the rest of the parameters. Don’t forget to set the working directory as appropriate
library("BTSPAS")
# After which weeks is the spline allowed to jump?
<- c(22,39) # julian weeks after which jump occurs
demo.jump.after
# Which julian weeks have "bad" recapture values. These will be set to missing and estimated.
<- c(41) # list julian weeks with bad m2 values. This is used in the Trinity Example
demo.bad.m2 <- c(11) # list julian weeks with bad u2 values. [This was arbitrary to demostrate the feature.]
demo.bad.u2 <- c(38) # list julian weeks with bad n1 values. [This was arbitrary to demonstrate the feature.]
demo.bad.n1
# The prefix for the output files:
<- "demo-JC-2003-CH-TSPDE"
demo.prefix
# Title for the analysis
<- "Junction City 2003 Chinook "
demo.title
cat("*** Starting ",demo.title, "\n\n")
#> *** Starting Junction City 2003 Chinook
# Make the call to fit the model and generate the output files
<- TimeStratPetersenDiagError_fit(
demo.fit title=demo.title,
prefix=demo.prefix,
time=demo.data$jweek,
n1=demo.data$n1,
m2=demo.data$m2,
u2=demo.data$u2,
jump.after=demo.jump.after,
bad.n1=demo.bad.n1,
bad.m2=demo.bad.m2,
bad.u2=demo.bad.u2,
InitialSeed=890110,
debug=TRUE, # this generates only 10,000 iterations of the MCMC chain for checking.
save.output.to.files=FALSE)
#>
#>
#> *** Start of call to JAGS
#> Working directory: /Users/cschwarz/Dropbox/SPAS-Bayesian/BTSPAS/vignettes
#> Initial seed for JAGS set to: 890110
#> Random number seed for chain 127579
#> Random number seed for chain 693284
#> Random number seed for chain 289944
#> Compiling model graph
#> Resolving undeclared variables
#> Allocating nodes
#> Graph information:
#> Observed stochastic nodes: 72
#> Unobserved stochastic nodes: 104
#> Total graph size: 1665
#>
#> Initializing model
#>
#>
|
| | 0%
|
|++ | 4%
|
|++++ | 8%
|
|++++++ | 12%
|
|++++++++ | 16%
|
|++++++++++ | 20%
|
|++++++++++++ | 24%
|
|++++++++++++++ | 28%
|
|++++++++++++++++ | 32%
|
|++++++++++++++++++ | 36%
|
|++++++++++++++++++++ | 40%
|
|++++++++++++++++++++++ | 44%
|
|++++++++++++++++++++++++ | 48%
|
|++++++++++++++++++++++++++ | 52%
|
|++++++++++++++++++++++++++++ | 56%
|
|++++++++++++++++++++++++++++++ | 60%
|
|++++++++++++++++++++++++++++++++ | 64%
|
|++++++++++++++++++++++++++++++++++ | 68%
|
|++++++++++++++++++++++++++++++++++++ | 72%
|
|++++++++++++++++++++++++++++++++++++++ | 76%
|
|++++++++++++++++++++++++++++++++++++++++ | 80%
|
|++++++++++++++++++++++++++++++++++++++++++ | 84%
|
|++++++++++++++++++++++++++++++++++++++++++++ | 88%
|
|++++++++++++++++++++++++++++++++++++++++++++++ | 92%
|
|++++++++++++++++++++++++++++++++++++++++++++++++ | 96%
|
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100%
#>
|
| | 0%
|
|** | 4%
|
|**** | 8%
|
|****** | 12%
|
|******** | 16%
|
|********** | 20%
|
|************ | 24%
|
|************** | 28%
|
|**************** | 32%
|
|****************** | 36%
|
|******************** | 40%
|
|********************** | 44%
|
|************************ | 48%
|
|************************** | 52%
|
|**************************** | 56%
|
|****************************** | 60%
|
|******************************** | 64%
|
|********************************** | 68%
|
|************************************ | 72%
|
|************************************** | 76%
|
|**************************************** | 80%
|
|****************************************** | 84%
|
|******************************************** | 88%
|
|********************************************** | 92%
|
|************************************************ | 96%
|
|**************************************************| 100%
#>
#>
#> *** Finished JAGS ***
The final parameter (save.output.to.files) can be set to automatically to save plots and reports in files with the appropriate prefix in the working directory.
The output object contains all of the results and can be saved for later interrogations. This is useful if the run takes considerable time (e.g. overnight) and you want to save the results for later processing. Notice that I didn’t save the results below as part of this vignette.
# save the results in a data dump that can be read in later using the load() command.
#save(list=c("demo.fit"), file="demo-fit-saved.Rdata") # save the results from this run
The final object has many components
names(demo.fit)
#> [1] "n.chains" "n.iter" "n.burnin"
#> [4] "n.thin" "n.keep" "n.sims"
#> [7] "sims.array" "sims.list" "sims.matrix"
#> [10] "summary" "mean" "sd"
#> [13] "median" "root.short" "long.short"
#> [16] "dimension.short" "indexes.short" "last.values"
#> [19] "program" "model.file" "isDIC"
#> [22] "DICbyR" "pD" "DIC"
#> [25] "model" "parameters.to.save" "plots"
#> [28] "runTime" "report" "data"
The plots sub-object contains many plots:
names(demo.fit$plots)
#> [1] "init.plot" "fit.plot" "logitP.plot"
#> [4] "acf.Utot.plot" "post.UNtot.plot" "gof.plot"
#> [7] "trace.logitP.plot" "trace.logU.plot"
In particular, it contains plots of the initial spline fit (init.plot), the final fitted spline (fit.plot), the estimated capture probabilities (on the logit scale) (logitP.plot), plots of the distribution of the posterior sample for the total unmarked and marked fish (post.UNtot.plot) and model diagnostic plots (goodness of fit (gof.plot), trace (trace…plot), and autocorrelation plots (act.Utot.plot).
These plots are all created using the \(ggplot2\) packages, so the user can modify the plot (e.g. change titles etc).
The \(BTSPAS\) program also creates a report, which includes information about the data used in the fitting, the pooled- and stratified-Petersen estimates, a test for pooling, and summaries of the posterior. Only the first few lines are shown below:
head(demo.fit$report)
#> [1] "Time Stratified Petersen with Diagonal recaptures and error in smoothed U - Sun Oct 24 15:37:06 2021"
#> [2] "Version: 2021-11-02 "
#> [3] ""
#> [4] ""
#> [5] ""
#> [6] " Junction City 2003 Chinook Results "
Here is the fitted spline curve to the number of unmarked fish available in each recovery sample
$plots$fit.plot demo.fit
The jump in the spline when hatchery fish are released is evident. The actual number of unmarked fish is allowed to vary around the spline as shown below.
The distribution of the posterior sample for the total number unmarked and total abundance is available:
$plots$post.UNtot.plot demo.fit
A plot of the \(logit(P)\) is
$plots$logitP.plot demo.fit
In cases where there is no information, \(BTSPAS\) has interpolated based on the distribution of catchability in the other strata and so the credible interval is very wide (e.g. julian weeks 7, 13, and 41).
A summary of the posterior for each parameter is also available. In particular, here are the summary statistics on the posterior sample for the total number unmarked and total abundance:
$summary[ row.names(demo.fit$summary) %in% c("Ntot","Utot"),]
demo.fit#> mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
#> Ntot 5834690 520650.8 5091090 5498342 5765732 6066297 6949601 1.004049 500
#> Utot 5784225 520650.8 5040625 5447877 5715267 6015832 6899136 1.004053 500
This also includes the Rubin-Brooks-Gelman statistic (\(Rhat\)) on mixing of the chains and the effective sample size of the posterior (after accounting for autocorrelation).
The estimated total abundance from \(BTSPAS\) is 5,834,690 (SD 520,651 ) fish.
Samples from the posterior are also included in the sims.matrix, sims.array and sims.list elements of the results object.
It is always important to do model assessment before accepting the results from the model fit. Please contact me for details on how to interpret the goodness of fit, trace, and autocorrelation plots.
In some cases, the second trap is not running and so there are no recaptures of tagged fish and no captures of untagged fish.
We need to set the p’s in these strata to 0 rather than letting BTSPAS impute a value.
Here is an example of some raw data that is read in:
<- textConnection(
demo2.data.csv 'jweek, n1, m2, u2
9 ,0, 0, 4135
10,1465, 51, 10452
11,1106,121, 2199
12, 229, 25, 655
13, 20, 0, 308
14, 177, 17, 719
15, 702, 74, 973
16, 633, 94, 972
17,1370, 62, 2386
18, 283, 10, 469
19, 647, 32, 897
20, 276, 11, 426
21, 277, 13, 407
22, 333, 15, 526
23,3981,242, 39969
24,3988, 55, 17580
25,2889,115, 7928
26,3119,198, 6918
27,2478, 80, 3578
28,1292, 71, 1713
29,2326,153, 4212
30,2528,156, 5037
31,2338,275, 3315
32,1012,101, 1300
33, 729, 66, 989
34, 333, 44, 444
35, 269, 33, 339
36, 77, 7, 107
37, 62, 0, 0
38, 26, 0, 0
39, 20, 0, 0
40,4757,188, 35118
41,2876, 8, 34534
42,3989, 81, 14960
43,1755, 27, 3643
44,1527, 30, 1811
45, 485, 14, 679
46, 115, 0, 0')
<- read.csv(demo2.data.csv, header=TRUE, as.is=TRUE, strip.white=TRUE)
demo2.data
print(demo2.data)
#> jweek n1 m2 u2
#> 1 9 0 0 4135
#> 2 10 1465 51 10452
#> 3 11 1106 121 2199
#> 4 12 229 25 655
#> 5 13 20 0 308
#> 6 14 177 17 719
#> 7 15 702 74 973
#> 8 16 633 94 972
#> 9 17 1370 62 2386
#> 10 18 283 10 469
#> 11 19 647 32 897
#> 12 20 276 11 426
#> 13 21 277 13 407
#> 14 22 333 15 526
#> 15 23 3981 242 39969
#> 16 24 3988 55 17580
#> 17 25 2889 115 7928
#> 18 26 3119 198 6918
#> 19 27 2478 80 3578
#> 20 28 1292 71 1713
#> 21 29 2326 153 4212
#> 22 30 2528 156 5037
#> 23 31 2338 275 3315
#> 24 32 1012 101 1300
#> 25 33 729 66 989
#> 26 34 333 44 444
#> 27 35 269 33 339
#> 28 36 77 7 107
#> 29 37 62 0 0
#> 30 38 26 0 0
#> 31 39 20 0 0
#> 32 40 4757 188 35118
#> 33 41 2876 8 34534
#> 34 42 3989 81 14960
#> 35 43 1755 27 3643
#> 36 44 1527 30 1811
#> 37 45 485 14 679
#> 38 46 115 0 0
Notice that there are no recaptures of marked fish and no recaptures of unmarked fish in julian weeks 37, 38, 39, and 46 when the trap was not operating. Notice that this differs from the case where a small number of tagged fish released with no recaptures but captures of tagged fish occur such as in julian week 13.
Two additional arguments to the BTSPAS allow you specify the julian weeks in which the capture probability is fixed to a known (typically zero) value.
<- c(37,38,39, 46)
demo2.logitP.fixed <- rep(-10, length(demo2.logitP.fixed)) demo2.logitP.fixed.values
The strata where the value of p is to be fixed is specified along with the value (on the logit scale) at which the capture probabilities are fixed. Technically, the \(logit(0.0)\) is negative infinity, but \(-10\) is ``close enough’’.
The rest of the call is basically the same – don’t forget to specify the additional arguments in the call
library("BTSPAS")
# After which weeks is the spline allowed to jump?
<- c(22,39) # julian weeks after which jump occurs
demo2.jump.after
# Which julian weeks have "bad" recapture values. These will be set to missing and estimated.
<- c(41) # list julian weeks with bad m2 values. This is used in the Trinity Example
demo2.bad.m2 <- c(11) # list julian weeks with bad u2 values. [This was arbitrary to demostrate the feature.]
demo2.bad.u2 <- c(38) # list julian weeks with bad n1 values. [This was arbitrary to demonstrate the feature.]
demo2.bad.n1
# The prefix for the output files:
<- "demo2-JC-2003-CH-TSPDE"
demo2.prefix
# Title for the analysis
<- "Junction City 2003 Chinook with p fixed "
demo2.title
cat("*** Starting ",demo2.title, "\n\n")
#> *** Starting Junction City 2003 Chinook with p fixed
# Make the call to fit the model and generate the output files
<- TimeStratPetersenDiagError_fit(
demo2.fit title=demo2.title,
prefix=demo2.prefix,
time=demo2.data$jweek,
n1=demo2.data$n1,
m2=demo2.data$m2,
u2=demo2.data$u2,
jump.after=demo2.jump.after,
logitP.fixed=demo2.logitP.fixed, # ***** NEW ****8
logitP.fixed.values=demo2.logitP.fixed.values, # ***** NEW *****
bad.n1=demo2.bad.n1,
bad.m2=demo2.bad.m2,
bad.u2=demo2.bad.u2,
InitialSeed=890110,
debug=TRUE, # this generates only 10,000 iterations of the MCMC chain for checking.
save.output.to.files=FALSE)
#>
#>
#> *** Start of call to JAGS
#> Working directory: /Users/cschwarz/Dropbox/SPAS-Bayesian/BTSPAS/vignettes
#> Initial seed for JAGS set to: 890110
#> Random number seed for chain 127579
#> Random number seed for chain 693284
#> Random number seed for chain 289944
#> Compiling model graph
#> Resolving undeclared variables
#> Allocating nodes
#> Graph information:
#> Observed stochastic nodes: 72
#> Unobserved stochastic nodes: 100
#> Total graph size: 1630
#>
#> Initializing model
#>
#>
|
| | 0%
|
|++ | 4%
|
|++++ | 8%
|
|++++++ | 12%
|
|++++++++ | 16%
|
|++++++++++ | 20%
|
|++++++++++++ | 24%
|
|++++++++++++++ | 28%
|
|++++++++++++++++ | 32%
|
|++++++++++++++++++ | 36%
|
|++++++++++++++++++++ | 40%
|
|++++++++++++++++++++++ | 44%
|
|++++++++++++++++++++++++ | 48%
|
|++++++++++++++++++++++++++ | 52%
|
|++++++++++++++++++++++++++++ | 56%
|
|++++++++++++++++++++++++++++++ | 60%
|
|++++++++++++++++++++++++++++++++ | 64%
|
|++++++++++++++++++++++++++++++++++ | 68%
|
|++++++++++++++++++++++++++++++++++++ | 72%
|
|++++++++++++++++++++++++++++++++++++++ | 76%
|
|++++++++++++++++++++++++++++++++++++++++ | 80%
|
|++++++++++++++++++++++++++++++++++++++++++ | 84%
|
|++++++++++++++++++++++++++++++++++++++++++++ | 88%
|
|++++++++++++++++++++++++++++++++++++++++++++++ | 92%
|
|++++++++++++++++++++++++++++++++++++++++++++++++ | 96%
|
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100%
#>
|
| | 0%
|
|** | 4%
|
|**** | 8%
|
|****** | 12%
|
|******** | 16%
|
|********** | 20%
|
|************ | 24%
|
|************** | 28%
|
|**************** | 32%
|
|****************** | 36%
|
|******************** | 40%
|
|********************** | 44%
|
|************************ | 48%
|
|************************** | 52%
|
|**************************** | 56%
|
|****************************** | 60%
|
|******************************** | 64%
|
|********************************** | 68%
|
|************************************ | 72%
|
|************************************** | 76%
|
|**************************************** | 80%
|
|****************************************** | 84%
|
|******************************************** | 88%
|
|********************************************** | 92%
|
|************************************************ | 96%
|
|**************************************************| 100%
#>
#>
#> *** Finished JAGS ***
Here is the fitted spline curve to the number of unmarked fish available in each recovery sample. Note how the spline interpolates across the julian weeks when no unmarked fish were captured in julian weeks 37, 38, 39, and 46 but the uncertainty is much larger.
$plots$fit.plot demo2.fit
The jump in the spline when hatchery fish are released is evident.
The distribution of the posterior sample for the total number unmarked and total abundance is available as before:
$plots$post.UNtot.plot demo2.fit
A plot of the \(logit(P)\) is
$plots$logitP.plot demo2.fit
Notice how the fixed values at \(-10\) (on the logit scale) occur.
A summary of the posterior for each parameter is also available. In particular, here are the summary statistics on the posterior sample for the total number unmarked and total abundance:
$summary[ row.names(demo2.fit$summary) %in% c("Ntot","Utot"),]
demo2.fit#> mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
#> Ntot 5851224 506714.6 5140894 5516261 5769217 6062302 7073362 1.007011 300
#> Utot 5800759 506714.6 5090429 5465796 5718752 6011837 7022897 1.007020 300
The estimated total abundance from \(BTSPAS\) is 5,851,224 (SD 506,715 ) fish.
BTSPAS also allows you to model the p’s with additional covariates, such a temperature, stream flow, etc. It is not possible to use covariates to model the total number of unmarked fish.
Here is an example of some raw data that includes the covariate \(log(flow)\):
<- textConnection(
demo3.data.csv 'jweek, n1, m2, u2, logflow
9, 0, 0, 4135, 6.617212
10, 1465, 51, 10452, 6.51217
11, 1106, 121, 2199, 7.193686
12, 229, 25, 655, 6.960754
13, 20, 0, 308, 7.008376
14, 177, 17, 719, 6.761573
15, 702, 74, 973, 6.905753
16, 633, 94, 972, 7.062314
17, 1370, 62, 2386, 7.600188
18, 283, 10, 469, 8.246509
19, 647, 32, 897, 8.110298
20, 276, 11, 426, 8.035001
21, 277, 13, 407, 7.859965
22, 333, 15, 526, 7.774255
23, 3981, 242, 39969, 7.709116
24, 3988, 55, 17580, 7.653766
25, 2889, 115, 7928, 7.622105
26, 3119, 198, 6918, 7.593734
27, 2478, 80, 3578, 7.585063
28, 1292, 71, 1713, 7.291072
29, 2326, 153, 4212, 6.55556
30, 2528, 156, 5037, 6.227665
31, 2338, 275, 3315, 6.278789
32, 1012, 101, 1300, 6.273685
33, 729, 66, 989, 6.241111
34, 333, 44, 444, 6.687999
35, 269, 33, 339, 7.222566
36, 77, 7, 107, 7.097194
37, 62, 9, 79, 6.949993
38, 26, 3, 41, 6.168714
39, 20, 1, 23, 6.113682
40, 4757, 188, 35118, 6.126557
41, 2876, 8, 34534, 6.167217
42, 3989, 81, 14960, 5.862413
43, 1755, 27, 3643, 5.696614
44, 1527, 30, 1811, 5.763847
45, 485, 14, 679, 5.987528
46, 115, 4, 154, 5.912344')
<- read.csv(demo3.data.csv, header=TRUE, as.is=TRUE, strip.white=TRUE)
demo3.data
print(demo3.data)
#> jweek n1 m2 u2 logflow
#> 1 9 0 0 4135 6.617212
#> 2 10 1465 51 10452 6.512170
#> 3 11 1106 121 2199 7.193686
#> 4 12 229 25 655 6.960754
#> 5 13 20 0 308 7.008376
#> 6 14 177 17 719 6.761573
#> 7 15 702 74 973 6.905753
#> 8 16 633 94 972 7.062314
#> 9 17 1370 62 2386 7.600188
#> 10 18 283 10 469 8.246509
#> 11 19 647 32 897 8.110298
#> 12 20 276 11 426 8.035001
#> 13 21 277 13 407 7.859965
#> 14 22 333 15 526 7.774255
#> 15 23 3981 242 39969 7.709116
#> 16 24 3988 55 17580 7.653766
#> 17 25 2889 115 7928 7.622105
#> 18 26 3119 198 6918 7.593734
#> 19 27 2478 80 3578 7.585063
#> 20 28 1292 71 1713 7.291072
#> 21 29 2326 153 4212 6.555560
#> 22 30 2528 156 5037 6.227665
#> 23 31 2338 275 3315 6.278789
#> 24 32 1012 101 1300 6.273685
#> 25 33 729 66 989 6.241111
#> 26 34 333 44 444 6.687999
#> 27 35 269 33 339 7.222566
#> 28 36 77 7 107 7.097194
#> 29 37 62 9 79 6.949993
#> 30 38 26 3 41 6.168714
#> 31 39 20 1 23 6.113682
#> 32 40 4757 188 35118 6.126557
#> 33 41 2876 8 34534 6.167217
#> 34 42 3989 81 14960 5.862413
#> 35 43 1755 27 3643 5.696614
#> 36 44 1527 30 1811 5.763847
#> 37 45 485 14 679 5.987528
#> 38 46 115 4 154 5.912344
A preliminary plot of the empirical logit (excluding those weeks when the trap was not running) shows an approximate quadratic fit to \(log(flow)\), but the uncertainty in each week is enormous!
We need to create a matrix with the covariate values. We will need three columns - one for the intercept, the value of log(flow) and the square of its values. In practice, it is often advisable to standardize covariates to prevent numerical difficulties, but in this case, the values are small enough that standardization is not really needed.
<- cbind(1, demo3.data$logflow, demo3.data$logflow^2)
demo3.logitP.cov head(demo3.logitP.cov)
#> [,1] [,2] [,3]
#> [1,] 1 6.617212 43.78749
#> [2,] 1 6.512170 42.40836
#> [3,] 1 7.193686 51.74912
#> [4,] 1 6.960754 48.45210
#> [5,] 1 7.008376 49.11733
#> [6,] 1 6.761573 45.71887
The rest of the call is basically the same – don’t forget to specify the additional arguments in the call
library("BTSPAS")
# After which weeks is the spline allowed to jump?
<- c(22,39) # julian weeks after which jump occurs
demo3.jump.after
# Which julian weeks have "bad" recapture values. These will be set to missing and estimated.
<- c(41) # list julian weeks with bad m2 values. This is used in the Trinity Example
demo3.bad.m2 <- c(11) # list julian weeks with bad u2 values. [This was arbitrary to demostrate the feature.]
demo3.bad.u2 <- c(38) # list julian weeks with bad n1 values. [This was arbitrary to demonstrate the feature.]
demo3.bad.n1
# The prefix for the output files:
<- "demo3-JC-2003-CH-TSPDE"
demo3.prefix
# Title for the analysis
<- "Junction City 2003 Chinook with covariates for p "
demo3.title
cat("*** Starting ",demo3.title, "\n\n")
#> *** Starting Junction City 2003 Chinook with covariates for p
# Make the call to fit the model and generate the output files
<- TimeStratPetersenDiagError_fit(
demo3.fit title=demo3.title,
prefix=demo3.prefix,
time=demo3.data$jweek,
n1=demo3.data$n1,
m2=demo3.data$m2,
u2=demo3.data$u2,
jump.after=demo3.jump.after,
logitP.cov = demo3.logitP.cov, # ***** NEW *****
bad.n1=demo3.bad.n1,
bad.m2=demo3.bad.m2,
bad.u2=demo3.bad.u2,
InitialSeed=890110,
debug=TRUE, # this generates only 10,000 iterations of the MCMC chain for checking.
save.output.to.files=FALSE)
#>
#>
#> *** Start of call to JAGS
#> Working directory: /Users/cschwarz/Dropbox/SPAS-Bayesian/BTSPAS/vignettes
#> Initial seed for JAGS set to: 890110
#> Random number seed for chain 127579
#> Random number seed for chain 693284
#> Random number seed for chain 289944
#> Compiling model graph
#> Resolving undeclared variables
#> Allocating nodes
#> Graph information:
#> Observed stochastic nodes: 72
#> Unobserved stochastic nodes: 106
#> Total graph size: 1825
#>
#> Initializing model
#>
#>
|
| | 0%
|
|++ | 4%
|
|++++ | 8%
|
|++++++ | 12%
|
|++++++++ | 16%
|
|++++++++++ | 20%
|
|++++++++++++ | 24%
|
|++++++++++++++ | 28%
|
|++++++++++++++++ | 32%
|
|++++++++++++++++++ | 36%
|
|++++++++++++++++++++ | 40%
|
|++++++++++++++++++++++ | 44%
|
|++++++++++++++++++++++++ | 48%
|
|++++++++++++++++++++++++++ | 52%
|
|++++++++++++++++++++++++++++ | 56%
|
|++++++++++++++++++++++++++++++ | 60%
|
|++++++++++++++++++++++++++++++++ | 64%
|
|++++++++++++++++++++++++++++++++++ | 68%
|
|++++++++++++++++++++++++++++++++++++ | 72%
|
|++++++++++++++++++++++++++++++++++++++ | 76%
|
|++++++++++++++++++++++++++++++++++++++++ | 80%
|
|++++++++++++++++++++++++++++++++++++++++++ | 84%
|
|++++++++++++++++++++++++++++++++++++++++++++ | 88%
|
|++++++++++++++++++++++++++++++++++++++++++++++ | 92%
|
|++++++++++++++++++++++++++++++++++++++++++++++++ | 96%
|
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100%
#>
|
| | 0%
|
|** | 4%
|
|**** | 8%
|
|****** | 12%
|
|******** | 16%
|
|********** | 20%
|
|************ | 24%
|
|************** | 28%
|
|**************** | 32%
|
|****************** | 36%
|
|******************** | 40%
|
|********************** | 44%
|
|************************ | 48%
|
|************************** | 52%
|
|**************************** | 56%
|
|****************************** | 60%
|
|******************************** | 64%
|
|********************************** | 68%
|
|************************************ | 72%
|
|************************************** | 76%
|
|**************************************** | 80%
|
|****************************************** | 84%
|
|******************************************** | 88%
|
|********************************************** | 92%
|
|************************************************ | 96%
|
|**************************************************| 100%
#>
#>
#> *** Finished JAGS ***
Here is the fitted spline curve to the number of unmarked fish available in each recovery sample.
$plots$fit.plot demo3.fit
The jump in the spline when hatchery fish are released is evident.
The distribution of the posterior sample for the total number unmarked and total abundance is available as before:
$plots$post.UNtot.plot demo3.fit
A plot of the \(logit(P)\) is
$plots$logitP.plot demo3.fit