
Run reporting rate models to assess the change in species occurrence over time.

Usage

reportingRateModel(
  taxa,
  site,
  time_period,
  list_length = FALSE,
  site_effect = FALSE,
  species_to_include = unique(taxa),
  overdispersion = FALSE,
  verbose = FALSE,
  family = "Binomial",
  print_progress = FALSE
)

Arguments

taxa

A character vector of taxon names, as long as the number of observations.

site

A character vector of site names, as long as the number of observations.

time_period

A numeric vector of user defined time periods, or a date vector, as long as the number of observations.

list_length

Logical, if TRUE then list length is added to the models as a fixed effect. Note that since list_length is a property of each visit, the model will run as a Bernoulli model rather than as a binomial model.
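To make the per-visit nature of list length concrete, the base R sketch below builds a visit-level table from toy records. It assumes a visit is one site in one time period and that list length is the number of distinct taxa recorded on that visit. The data, the `visit` key and the `focal` species are hypothetical, and this is not necessarily how the function computes these values internally.

```r
# Toy records: one row per observation (hypothetical data)
obs <- data.frame(
  taxa = c("a", "b", "c", "a", "b", "b"),
  site = c("A1", "A1", "A1", "A2", "A2", "A2"),
  time_period = c(1, 1, 1, 1, 2, 2),
  stringsAsFactors = FALSE
)

# Assumption: a visit is one site in one time period
obs$visit <- paste(obs$site, obs$time_period, sep = "_")

# List length: number of distinct taxa recorded on each visit
list_length <- tapply(obs$taxa, obs$visit, function(x) length(unique(x)))

# Visit-level layout for one focal species: one row per visit, 0/1 response
focal <- "a"
visits <- data.frame(
  visit = names(list_length),
  list_length = as.vector(list_length),
  observed = as.integer(names(list_length) %in% obs$visit[obs$taxa == focal])
)
print(visits)
```

Each row of `visits` carries its own list length, which is why the covariate needs the data at visit level rather than aggregated across visits.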

site_effect

Logical, if TRUE then site is added to the models as a random effect.

species_to_include

A character vector giving the names of the species to model. By default all species will be modelled.

overdispersion

Logical, if TRUE then overdispersion is modelled. Default is FALSE.

verbose

Logical, if TRUE then models are run in verbose mode, allowing the iterations of each model to be viewed.

family

The type of model to be used. Can be "Binomial" or "Bernoulli". Note that if list_length is TRUE, family defaults to Bernoulli.

print_progress

Logical, if TRUE then progress is printed to the console when running models. Default is FALSE.

Value

A data.frame of results is returned to R. Each row gives the results for a single species, with the species name given in the first column, species_name. For each of the following columns the prefix (before the ".") gives the covariate and the suffix (after the ".") gives the parameter of that covariate.

number_observations gives the number of visits where the species of interest was observed. If any of the models encountered an error, this will be given in the column error_message. If a model does encounter an error, the values for most columns will be NA.

The data.frame has a number of attributes:

  • intercept_year - The year used for the intercept (i.e. the year whose value is set to 0). Setting the intercept to the median year helps to increase model stability

  • min_year and max_year - The earliest and latest year in the dataset (after years have been centered on intercept_year)

  • nVisits - The total number of visits that were in the dataset

  • model_formula - The model used, this will vary depending on the combination of arguments used
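As an illustration of how these attributes might be used, the sketch below reads them with base R's `attr()` and converts the centered year range back to calendar years. The `RR_out` object here is a stand-in built by hand, and every value in it is hypothetical. Only the attribute names come from the list above.

```r
# Stand-in for a reportingRateModel() result; all values below are hypothetical
RR_out <- data.frame(species_name = c("a", "b"))
attr(RR_out, "intercept_year") <- 2014
attr(RR_out, "min_year") <- -4
attr(RR_out, "max_year") <- 5
attr(RR_out, "nVisits") <- 450

# Attributes are read with attr()
intercept_year <- attr(RR_out, "intercept_year")

# min_year and max_year are centered on intercept_year,
# so adding intercept_year recovers the calendar years
year_range <- intercept_year +
  c(attr(RR_out, "min_year"), attr(RR_out, "max_year"))
print(year_range)
```

With a real result, the same `attr()` calls apply directly to the returned data.frame.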

References

Roy, H.E., Adriaens, T., Isaac, N.J.B. et al. (2012) Invasive alien predator causes rapid declines of native European ladybirds. Diversity & Distributions, 18: 717-725.

Isaac, N.J.B. et al. (2014) Extracting robust trends in species' distributions from unstructured opportunistic data: a comparison of methods. bioRxiv 006999, https://doi.org/10.1101/006999.

Examples

if (FALSE) {

# Create data
n <- 3000 # size of dataset
nyr <- 10 # number of years in data
nSamples <- 30 # set number of dates
nSites <- 15 # set number of sites

# Create some dates
first <- as.POSIXct(strptime("2010/01/01", "%Y/%m/%d")) 
last <- as.POSIXct(strptime(paste(2010+(nyr-1),"/12/31", sep=''), "%Y/%m/%d")) 
dt <- last-first 
rDates <- first + (runif(nSamples)*dt)

# taxa are set as random letters
taxa <- sample(letters, size = n, TRUE)

# sites are visited at random (nSites sites in total)
site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE)

# the date of visit is selected at random from those created earlier
time_period <- sample(rDates, size = n, TRUE)

# combine this to a dataframe (adding a final row of 'bad' data)
df <- data.frame(taxa = c(taxa,'bad'),
                 site = c(site,'A1'),
                 time_period = c(time_period, as.POSIXct(strptime("1200/01/01", "%Y/%m/%d"))))

# Run the model
RR_out <- reportingRateModel(df$taxa, df$site, df$time_period, print_progress = TRUE)
head(RR_out)

}