Run reporting rate models to assess the change in species occurrence over time.
Usage
reportingRateModel(
taxa,
site,
time_period,
list_length = FALSE,
site_effect = FALSE,
species_to_include = unique(taxa),
overdispersion = FALSE,
verbose = FALSE,
family = "Binomial",
print_progress = FALSE
)
Arguments
- taxa
A character vector of taxon names, as long as the number of observations.
- site
A character vector of site names, as long as the number of observations.
- time_period
A numeric vector of user defined time periods, or a date vector, as long as the number of observations.
- list_length
Logical, if TRUE then list length is added to the models as a fixed effect. Note that since list_length is a property of each visit, the model will run as a Bernoulli model rather than as a binomial model.
- site_effect
Logical, if TRUE then site is added to the models as a random effect.
- species_to_include
A character vector giving the names of the species to model. By default all species will be modelled.
- overdispersion
Logical, if TRUE then overdispersion is modelled. Default is FALSE.
- verbose
Logical, if TRUE then the models are run in verbose mode, allowing the iterations of each model to be viewed.
- family
The type of model to be used. Can be "Binomial" or "Bernoulli". Note that if list_length is TRUE, family defaults to "Bernoulli".
- print_progress
Logical, if TRUE then progress is printed to the console when running models. Default is FALSE.
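The difference between the two families is easier to see on a small example. The base R sketch below shapes a handful of made-up records both ways. It is only an illustration under stated assumptions, not the package's internal code: the `records` data frame, the definition of a visit as a unique site and date combination, and the aggregation of visits per site and year are all assumptions made for the example.

```r
# Hypothetical records: one row per species observation
records <- data.frame(
  taxa = c("a", "b", "a", "c", "b", "a"),
  site = c("S1", "S1", "S1", "S2", "S2", "S2"),
  date = as.Date(c("2010-05-01", "2010-05-01", "2010-07-10",
                   "2010-06-02", "2010-06-02", "2011-06-15")),
  stringsAsFactors = FALSE
)
records$year  <- as.integer(format(records$date, "%Y"))

# Assumption: a visit is a unique site/date combination
records$visit <- paste(records$site, records$date, sep = "_")
first_row <- !duplicated(records$visit)
visit_id  <- records$visit[first_row]

# List length = number of distinct taxa recorded on each visit
ll  <- tapply(records$taxa, records$visit, function(x) length(unique(x)))

# Was the focal species recorded on each visit?
focal <- "a"
obs <- tapply(records$taxa == focal, records$visit, any)

# Bernoulli shape: one row per visit with a 0/1 response,
# so a per-visit covariate such as list length can be kept
bernoulli <- data.frame(
  visit       = visit_id,
  site        = records$site[first_row],
  year        = records$year[first_row],
  list_length = as.integer(ll[visit_id]),
  observed    = as.integer(obs[visit_id]),
  stringsAsFactors = FALSE
)

# Binomial shape: visits aggregated per site and year,
# which discards per-visit information such as list length
bernoulli$n_visits <- 1L
binomial <- aggregate(cbind(observed, n_visits) ~ site + year,
                      data = bernoulli, FUN = sum)

print(bernoulli)
print(binomial)
```

In the aggregated table each row summarises several visits, so there is no single list length to attach to it. That is why a per-visit covariate requires the one-row-per-visit layout.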
Value
A data frame of results is returned. Each row gives the results for a
single species, with the species name given in the first column, species_name.
For each of the following columns, the prefix (before the ".") gives the covariate and the
suffix (after the ".") gives the parameter of that covariate.
number_observations gives the number of visits where the species of interest
was observed. If any of the models encountered an error, this will be given in the
column error_message. If a model does encounter an error, the values for most
columns will be NA.
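As a rough illustration of the "covariate.parameter" naming pattern, the sketch below splits a few column names into their two parts. The names in `cols` are hypothetical examples chosen to fit the pattern described above; check `names()` on your own output for the columns actually returned.

```r
# Hypothetical column names following the "covariate.parameter" pattern
cols <- c("intercept.estimate", "year.estimate", "year.stderror", "year.pvalue")

# Split on the literal "." (fixed = TRUE avoids regex interpretation)
parts <- strsplit(cols, ".", fixed = TRUE)

parsed <- data.frame(
  column    = cols,
  covariate = vapply(parts, `[`, character(1), 1),  # text before the "."
  parameter = vapply(parts, `[`, character(1), 2),  # text after the "."
  stringsAsFactors = FALSE
)
print(parsed)

# Select every column that belongs to one covariate
year_cols <- cols[startsWith(cols, "year.")]
```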
The data.frame has a number of attributes:
intercept_year - The year used for the intercept (i.e. the year whose value is set to 0). Setting the intercept to the median year helps to increase model stability.
min_year and max_year - The earliest and latest year in the dataset (after years have been centered on intercept_year).
nVisits - The total number of visits that were in the dataset.
model_formula - The model used; this will vary depending on the combination of arguments used.
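The attributes can be read with base R's `attr()`. The sketch below uses a stand-in data frame, with attributes set by hand, to show how centering on the median year works and how a centered year can be converted back to a calendar year. The `years` vector and the `res` object are made up for illustration; a real result would come from `reportingRateModel()`.

```r
# Hypothetical visit years
years <- c(2010, 2011, 2011, 2013, 2014, 2016, 2019)

# Centre years on the median so that the intercept year has value 0
intercept_year <- median(years)
centred <- years - intercept_year

# Stand-in for a results data frame, with attributes set by hand
res <- data.frame(species_name = c("a", "b"), stringsAsFactors = FALSE)
attr(res, "intercept_year") <- intercept_year
attr(res, "min_year") <- min(centred)
attr(res, "max_year") <- max(centred)
attr(res, "nVisits") <- length(years)

# Read the attributes back
attr(res, "intercept_year")
attr(res, "nVisits")

# Convert the centred range back to calendar years
calendar_range <- c(attr(res, "min_year"), attr(res, "max_year")) +
  attr(res, "intercept_year")
print(calendar_range)
```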
References
Roy, H.E., Adriaens, T., Isaac, N.J.B. et al. (2012) Invasive alien predator causes rapid declines of native European ladybirds. Diversity & Distributions, 18: 717-725.
Isaac, N.J.B. et al. (2014) Extracting robust trends in species' distributions from unstructured opportunistic data: a comparison of methods. bioRxiv 006999, https://doi.org/10.1101/006999.
Examples
if (FALSE) {
# Create data
n <- 3000 # size of dataset
nyr <- 10 # number of years in data
nSamples <- 30 # number of distinct visit dates
nSites <- 15 # number of sites
# Create some dates
first <- as.POSIXct(strptime("2010/01/01", "%Y/%m/%d"))
last <- as.POSIXct(strptime(paste(2010+(nyr-1),"/12/31", sep=''), "%Y/%m/%d"))
dt <- last-first
rDates <- first + (runif(nSamples)*dt)
# taxa are set as random letters
taxa <- sample(letters, size = n, TRUE)
# sites are visited at random
site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE)
# the date of visit is selected at random from those created earlier
time_period <- sample(rDates, size = n, TRUE)
# combine this to a dataframe (adding a final row of 'bad' data)
df <- data.frame(taxa = c(taxa,'bad'),
site = c(site,'A1'),
time_period = c(time_period, as.POSIXct(strptime("1200/01/01", "%Y/%m/%d"))))
# Run the model
RR_out <- reportingRateModel(df$taxa, df$site, df$time_period, print_progress = TRUE)
head(RR_out)
}