Parameter estimation for wildlife population models!

We have now gone through the basic mechanics of population modeling. But we have barely discussed where the parameter estimates come from!

This is the focus of the current lecture!

Data requirements

Scalar models (no age structure)

  • Initial abundance (\(N_0\))
  • Mean population growth rate (\(r\) or \(r_{max}\))
  • Variation in population growth rate (standard deviation for a normal distribution representing environmental stochasticity)
  • Carrying capacity of the population (\(K\); equilibrium abundance of logistic model)

Stage-structured (life history) models

  • Initial abundance (\(\mathbf{N_0}\)) (vector of initial abundances for all stages)
  • Stage-specific vital rates (fecundity, survival; fill in a transition matrix)
  • Temporal variation in stage-specific vital rates (environmental stochasticity)
  • Dependence of vital rates on density (e.g., how fecundity decreases as population gets more crowded)
    • Which vital rates are density-dependent?

Metapopulation (spatial) models

  • Spatial distributions of suitable habitat patches (define patches)
  • Spatial variation in vital rates (e.g., variation in habitat quality among different patches)
  • Correlation in environmental stochasticity among patches
  • Dispersal rates among habitat patches
  • Habitat requirements of different life stages
  • For “classical” metapopulation models: rates of colonization and extinction in habitat patches

Data sources!

Life tables

As you recall, classical life tables represent a single cohort that has been followed over time until the last one has died.

NOTE: in practice, it is nearly impossible to follow a single cohort over time in the wild (although life tables are commonly available for captive populations). Therefore, in practice, most published life tables for wild populations use the static life table, which compares population size from multiple different cohorts (across the entire range of ages), at a single point in time. Static life tables assume the population has a stable age structure —- that is, the proportion of individuals in each age class does not change from generation to generation. This assumption is often violated in natural populations, but in the absence of better data, this method can be better than nothing!

x S(x) b(x) l(x) g(x)
0 500 0 1.0 0.80
1 400 2 0.8 0.50
2 200 3 0.4 0.25
3 50 1 0.1 0.00
4 0 0 0.0 NA

PROS - Summary of survival and fecundity schedules, often available in the literature. Be careful when using data from captive populations!

CONS - Makes unrealistic assumptions (especially the static life table method). - Can not be used to estimate environmental stochasticity - Can not be used to estimate density dependence - Ignores imperfect detection and the fact that some age classes are less detectable than others
- Remember, it’s not always perfectly straightforward to translate into a matrix-based (age-structured) population model- but it can be done!

Census data!

Census data is ultimately what we all wish we had! It means we follow a population over time, and we know exactly how many individuals of each stage are in the population each year. We have perfect detection!!!!

In this case, computing survival and fecundity (and variation in these parameters, and density-dependence) is relatively straightforward!

PROS - Can estimate survival and fecundity, env. stochasticity, density dependence- pretty much everything we want for a population model!

CONS - Very rarely (if ever) available!! - Ignores imperfect detection (assumes perfect detection), which is almost always not realistic.

Capture-mark-recapture (CMR)

PROS - Can estimate survival, variation in survival, lambda, recruitment, dispersal rates. - Probably the most widely used source of data for population models!

CONS - Few downsides, although the analytical techniques can be difficult to master - Mounting a proper CMR study can be very costly, and require many years of data. - Estimating landscape-scale movements can require an even more costly and time-consuming study. - Emigration and survival can be difficult to tease apart in many cases.

See below for more! Also this is the main focus of the Population Dynamics course (NRES 488, the next course in the Wildlife Ecology and Conservation sequence!)

Data on spatial structure/habitat

  • See links on Final projects page

What if there’s “no data”?

Remember the “Stone Soup” analogy! - even when you think there’s no data, there probably is more than there initially appears!

  • Use algebra to construct a full age-structured transition matrix from available information
    • E.g., we are missing information on hatchling survival. We only know:
      • Juvenile and adult survival rates
      • Nesting success
      • Population growth (lambda) is 1.09
      • Now we can solve for hatchling survival!
      • For an example of this, see Congdon et al 1993
  • Simplify! (Models are always simplified representations of reality)
    • Ignore age structure? (i.e., use scalar model)
    • Ignore density-dependence?
    • Ignore trophic interactions (we usually make this simplification anyway!)
    • Ignore abundance entirely (e.g., use classical metapopulation model)
  • Conservative strategies (use worst case scenario)!
    • Density-independent model is conservative, so if you don’t have data on D-D, then maybe just ignore it!
    • With parameter uncertainty, use the worst case scenario
    • Under decline trends, use the worst case scenario
  • Use data from similar species!
    • E.g., tamarin species have similar life histories, so use data on golden lion tamarins to model golden-headed lion tamarins.
  • Expert Opinion
    • See below…
  • National Databases
    • See links on Final projects page
  • Allometries
    • e.g., Fenchel’s allometry
    • This type of thinking (looking for broad patterns across species) is often called “Macroecology”

Aside- is expert opinion okay to use???

  • Not ideal, because it is hard or impossible to validate, and hard to document, but …
  • That is what will be done in any case!
  • And it is better to use it than to do nothing
  • And it is better to document that expert opinion was used than to proceed with conservation planning in the absence of stating sources and assumptions
  • It is a starting point (and sometimes a reasonable one)

Capture-mark-recapture (CMR) analysis

PVA Parameters estimable from CMR data

  • Survival rate (possibly age or size structured)
  • Fecundity (entry of new individuals in the population)
  • Recruitment (entry of new individuals into the adult population)
  • Abundance
  • Lambda (finite rate of growth)
  • Environmental influences on survival rates and fecundity
  • Temporal process variance (env. stochasticity)
  • Dispersal rates

The data needed for CMR analysis: capture histories

Consider a project designed to monitor a population of alligators. These alligators were monitored for four years, from 1976 to 1979.

Each row in the following table represents a unique possible history of captures:

A “1” indicates that an animal was successfully captured in a given year, and subsequently released.

A “0” indicates that an animal was not successfully captured in a given year.

A “2” indicates that an animal was successfully captured in a given year but was not released back into the population (probably died due to handling or capture).

Two main types of CMR analyses

Closed population models

We assume that the population is closed (no BIDE processes!). That is, abundance does not change! We attempt to estimate abundance!

  • No mortality
  • No births
  • No immigration
  • No emigration
  • All individuals are observable (but not necessarily observed…)

Parameters estimated:

  • Abundance

M = the number of individuals marked in the first sample
C = total number of individuals captured in 2nd sample
R = number of individuals in 2nd sample that are marked

We can use the following formula to estimate abundance (the lincoln-peterson estimator):

\(N = \frac{M \times C}{R}\)

Open population models

We assume that the population is open to one or more of the BIDE processes. That is, abundance CAN change! We attempt to estimate the processes driving abundance change (often, survival rates)

  • Populations open to birth, death, and possibly even migration (abundance can change during the study).
  • Enables estimation of the drivers of population dynamics over extended time periods
  • Often of great interest to ecologists and managers.

Maximum likelihood: a framework for statistical inference!

FOR CMR ANALYSIS: - What value of survival maximizes the probability of generating the observed capture histories?

EXAMPLE:

Consider the following capture history for a single individual:

1 0 1 1

This individual was marked and released upon initial capture. It was not captured in the next survey, but then was captured in each of the next two subsequent surveys.

What is the probability of observing this capture history?

[(Probability of surviving from time 1 to 2) X (Probability of not being seen at time 2)] X [(Probability of surviving from time 2 to 3) X (Probability of being seen at time 3)] X [(Probability of surviving from time 3 to 4) X (Probability of being seen at time 4)]

This can be written:

\(L_1 = \phi_1(1-p_2) \cdot \phi_2p_3 \cdot \phi_3p_4\)

How about the following capture history for a single individual:

1 0 1 0

What is the probability of observing this capture history?

[(Probability of surviving from time 1 to 2) X (Probability of not being seen at time 2)] X [(Probability of surviving from time 2 to 3) X (Probability of being seen at time 3)] X

– either –

[(Probability of surviving from time 3 to 4) X (Probability of not being seen at time 4)]

– or –

[(Probability of NOT surviving from time 3 to 4)

This can be written:

\(L_1 = \phi_1(1-p_2) \cdot \phi_2p_3 \cdot \left \{(1-\phi_3)+\phi_3(1-p_4) \right \}\)

Q: if survival were 100% and capture probability were 100%, what is the probability of observing the above capture histories?

Q: what about if survival was 100% and capture probability was 75%?

Maximum likelihood estimation is the process of finding those values of the parameters \(\phi\) and \(p\) that would be most likely to generate the observed capture histories!

This model is known as the Cormack-Jolly-Seber model (CJS), and is the most common analysis performed by Program MARK.

Q: Why is \(\phi\) also known as “apparent” survival? Why is it not “true” survival???

Key assumptions of the CJS model

  • All individuals in population are equally detectible at each sampling occasion (each capture session represents a totally random sample from the population)
  • Marks are not lost or missed by surveyors
  • All emigration is permanent (equivalent to a mortality)

Program MARK

MARK is a numerical maximum-likelihood engine designed for mark-recapture analysis. You input a capture history dataset and MARK will output results such as survival rate and capture probability!!

Example of an open-population mark-recapture analysis!

Let’s run through a CJS analysis in R! To follow along, please save this script and load it up in Rstudio.

NOTE: this analysis includes both males and females (unlike the example in Lab 7), so the results will look somewhat different!

The European dipper data is THE classic example of a CMR dataset. Let’s look at it!

###########
# Cormack-Jolly-Seber (CJS) model in R
###########

library(marked)      # install the 'marked' package if you haven't already done this!
## Warning: package 'marked' was built under R version 3.6.3
## Loading required package: lme4
## Loading required package: Matrix
## Loading required package: parallel
## This is marked 1.2.6
data("dipper")
head(dipper,10)
##         ch    sex
## 1  0000001 Female
## 2  0000001 Female
## 3  0000001 Female
## 4  0000001 Female
## 5  0000001 Female
## 6  0000001 Female
## 7  0000001 Female
## 8  0000001 Female
## 9  0000001 Female
## 10 0000001 Female

Here we use the “marked” package in R (instead of MARK) to do the ML parameter estimation!

##########
# load data!

data(dipper)

#############
# Process data

dipper.proc=process.data(dipper,model="cjs",begin.time=1)  # Helper function- process the data for CJS model
## 255 capture histories collapsed into 53
dipper.ddl=make.design.data(dipper.proc)    # another helper function- process data!

##########
# Fit models

# fit time-varying cjs model


capture.output(suppressMessages(   # note: this is just to suppress messages to avoid cluttering the website...
      mod.Phit.pt <-  crm(dipper.proc,dipper.ddl,model.parameters=list(Phi=list(formula=~time),p=list(formula=~time)),method="Nelder-Mead",hessian = T)
),file="temp.txt") 

mod.Phit.pt   # print out model
## 
## crm Model Summary
## 
## Npar :  12
## -2lnL:  657.0644
## AIC  :  681.0644
## 
## Beta
##                    Estimate        se        lcl       ucl
## Phi.(Intercept)  0.75646636 0.6570806 -0.5314116 2.0443444
## Phi.time2       -0.99255899 0.7541098 -2.4706142 0.4854962
## Phi.time3       -0.84151202 0.6991672 -2.2118797 0.5288557
## Phi.time4       -0.22522075 0.7051952 -1.6074034 1.1569619
## Phi.time5       -0.36893542 0.6974964 -1.7360284 0.9981576
## Phi.time6        0.10115137 1.5063624 -2.8513189 3.0536216
## p.(Intercept)    1.00571207 0.8006508 -0.5635635 2.5749876
## p.time3          1.49463757 1.3119503 -1.0767850 4.0660601
## p.time4          1.37853596 1.0927336 -0.7632220 3.5202939
## p.time5          1.12053506 0.9914942 -0.8227937 3.0638638
## p.time6          1.57229762 1.0715540 -0.5279483 3.6725435
## p.time7          0.07823848 1.7821210 -3.4147187 3.5711956
mod.Phit.pt$results$AIC       # extract AIC
## [1] 681.0644
########
# fit time-invariant cjs model

capture.output(suppressMessages(
  mod.Phidot.pdot <- crm(dipper.proc,dipper.ddl,model.parameters = list(Phi=list(formula=~1),p=list(formula=~1)),method="Nelder-Mead",hessian = TRUE)
),file="temp.txt")

mod.Phidot.pdot
## 
## crm Model Summary
## 
## Npar :  2
## -2lnL:  666.8377
## AIC  :  670.8377
## 
## Beta
##                  Estimate        se        lcl       ucl
## Phi.(Intercept) 0.2422903 0.1020150 0.04234102 0.4422397
## p.(Intercept)   2.2261889 0.3251153 1.58896300 2.8634148
mod.Phidot.pdot$results$AIC
## [1] 670.8377
########
# fit sex-dependent cjs model

capture.output(suppressMessages(
  mod.Phisex.psex <- crm(dipper.proc,dipper.ddl,model.parameters = list(Phi=list(formula=~sex),p=list(formula=~sex)),method="Nelder-Mead",hessian = TRUE)
),file="temp.txt")

mod.Phisex.psex
## 
## crm Model Summary
## 
## Npar :  4
## -2lnL:  666.1518
## AIC  :  674.1518
## 
## Beta
##                   Estimate        se         lcl       ucl
## Phi.(Intercept) 0.22338784 0.1440981 -0.05904451 0.5058202
## Phi.sexMale     0.04138178 0.2041878 -0.35882631 0.4415899
## p.(Intercept)   2.01072100 0.4210671  1.18542943 2.8360126
## p.sexMale       0.47587463 0.6630165 -0.82363774 1.7753870
mod.Phisex.psex$results$AIC
## [1] 674.1518
###########
# compare all models with AIC
###########

######
# Set up models to run (must have either "Phi." or "p." in the name)
Phi.dot <- list(formula=~1)       
Phi.time <- list(formula=~time)
Phi.sex <- list(formula=~sex)
Phi.timesex <- list(formula=~sex+time)
p.dot <- list(formula=~1)
p.time <- list(formula=~time)
p.sex <- list(formula=~sex)
p.timesex <- list(formula=~sex+time)

cml=create.model.list(c("Phi","p"))    # create list of all models to run

######
# Run all models

capture.output(suppressMessages(suppressWarnings(
  allmodels <- crm.wrapper(cml,data=dipper.proc, ddl=dipper.ddl,external=FALSE,accumulate=FALSE,method="Nelder-Mead",hessian=TRUE)
)),file="temp.txt")

######
# AIC model selection

allmodels
##                             model npar      AIC  DeltaAIC       weight
## 1                    Phi(~1)p(~1)    2 670.8377  0.000000 0.3792707172
## 2                  Phi(~1)p(~sex)    3 672.1934  1.355756 0.1925531715
## 5                  Phi(~sex)p(~1)    3 672.6762  1.838541 0.1512568696
## 9                 Phi(~time)p(~1)    7 673.7301  2.892463 0.0893015564
## 6                Phi(~sex)p(~sex)    4 674.1518  3.314151 0.0723253541
## 10              Phi(~time)p(~sex)    8 675.1913  4.353613 0.0430104833
## 13          Phi(~sex + time)p(~1)    8 675.6617  4.824056 0.0339952989
## 14        Phi(~sex + time)p(~sex)    9 677.1681  6.330419 0.0160072363
## 3                 Phi(~1)p(~time)    7 678.4804  7.642778 0.0083050299
## 4           Phi(~1)p(~sex + time)    8 679.9497  9.112045 0.0039837661
## 7               Phi(~sex)p(~time)    8 680.4921  9.654404 0.0030375415
## 11             Phi(~time)p(~time)   12 681.0644 10.226756 0.0022815888
## 12       Phi(~time)p(~sex + time)   13 681.7032 10.865580 0.0016577483
## 8         Phi(~sex)p(~sex + time)    9 681.8896 11.051978 0.0015102292
## 15       Phi(~sex + time)p(~time)   13 682.9670 12.129303 0.0008812611
## 16 Phi(~sex + time)p(~sex + time)   14 683.6633 12.825655 0.0006221479
##     neg2lnl convergence
## 1  666.8377           0
## 2  666.1934           0
## 5  666.6762           0
## 9  659.7301           0
## 6  666.1518           0
## 10 659.1913           0
## 13 659.6617           0
## 14 659.1681           0
## 3  664.4804           0
## 4  663.9497           0
## 7  664.4921           0
## 11 657.0644           0
## 12 655.7032           0
## 8  663.8896           0
## 15 656.9670           0
## 16 655.6633           1
#######
# get parameter estimates and confidence intervals for best model

allmodels[[1]]
## 
## crm Model Summary
## 
## Npar :  2
## -2lnL:  666.8377
## AIC  :  670.8377
## 
## Beta
##                  Estimate        se        lcl       ucl
## Phi.(Intercept) 0.2422903 0.1020150 0.04234102 0.4422397
## p.(Intercept)   2.2261889 0.3251153 1.58896300 2.8634148
allmodels[[11]]
## 
## crm Model Summary
## 
## Npar :  12
## -2lnL:  657.0644
## AIC  :  681.0644
## 
## Beta
##                    Estimate        se        lcl       ucl
## Phi.(Intercept)  0.75646636 0.6570806 -0.5314116 2.0443444
## Phi.time2       -0.99255899 0.7541098 -2.4706142 0.4854962
## Phi.time3       -0.84151202 0.6991672 -2.2118797 0.5288557
## Phi.time4       -0.22522075 0.7051952 -1.6074034 1.1569619
## Phi.time5       -0.36893542 0.6974964 -1.7360284 0.9981576
## Phi.time6        0.10115137 1.5063624 -2.8513189 3.0536216
## p.(Intercept)    1.00571207 0.8006508 -0.5635635 2.5749876
## p.time3          1.49463757 1.3119503 -1.0767850 4.0660601
## p.time4          1.37853596 1.0927336 -0.7632220 3.5202939
## p.time5          1.12053506 0.9914942 -0.8227937 3.0638638
## p.time6          1.57229762 1.0715540 -0.5279483 3.6725435
## p.time7          0.07823848 1.7821210 -3.4147187 3.5711956
#######
# make predictions and plot them. 

predict(allmodels[[1]])$Phi
##   occ estimate         se       lcl       ucl
## 1   6 0.560278 0.02513307 0.5105837 0.6087926
Phi_by_year <- predict(allmodels[[11]])$Phi    # predict Phi for all years (based on the best Phi(t) model)

suppressWarnings( suppressMessages( library(Hmisc,quietly = T) ))    #load Hmisc package- has a nice error bar function
plot(1:nrow(Phi_by_year),Phi_by_year$estimate,xlab="Year",ylab="Survival",ylim=c(0,1),main="Variability in Survival, dipper demo")
errbar(1:nrow(Phi_by_year),Phi_by_year$estimate,Phi_by_year$ucl,Phi_by_year$lcl,add=T)

What is our estimate for mean survival?

#######
# Compute the mean survival rate for the population

mean(Phi_by_year$estimate)
## [1] 0.5880352

What is the environmental stochasticity?

#######
# Compute the environmental variablility in annual survival rates. 

sd(Phi_by_year$estimate)
## [1] 0.1066587

Program MARK!

MARK is free software, and can be downloaded from here

The demo from Lab 7 can be found here

–go to next lecture–