This takes occurrence data in the form of vectors of taxon names, sites and surveys (usually dates) and converts them into the form needed for occupancy models (see the Value section).
Usage
formatOccData(
  taxa,
  site,
  survey,
  replicate = NULL,
  closure_period = NULL,
  includeJDay = FALSE
)
Arguments
- taxa
- A character vector of taxon names, as long as the number of observations. 
- site
- A character vector of site names, as long as the number of observations. 
- survey
- A vector as long as the number of observations. This must be a Date if either closure_period is not supplied or if includeJDay = TRUE.
- replicate
- An optional vector to identify replicate samples (visits) per survey. Need not be globally unique (e.g. can be 1, 2, ..., n within surveys).
- closure_period
- An optional vector of integers specifying the closure period. If FALSE then closure_period will be extracted as the year from the survey.
- includeJDay
- Logical. If TRUE, a Julian day column is returned in the occDetData object.
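The argument requirements can be illustrated with a few lines of base R. This is a minimal sketch on made-up records, not the package's internal code: the names `closure_year`, `july1` and `jday_centered` are hypothetical, and the day offset relative to 1 July is only an assumption about how a Julian day column might be centered.

```r
# Toy observations: each vector has one element per record
taxa   <- c("a", "b", "a", "c")
site   <- c("A1", "A1", "A2", "A2")
survey <- as.Date(c("2010-05-03", "2010-05-03", "2011-07-01", "2011-08-15"))

# taxa, site and survey must all be as long as the number of observations
stopifnot(length(taxa) == length(site), length(site) == length(survey))

# When no closure_period is supplied, the year of the survey date
# is what identifies the closure period
closure_year <- as.integer(format(survey, "%Y"))

# Illustrative day offset relative to 1 July of the same year
# (assumption: the package's exact centering may differ)
july1 <- as.Date(paste0(format(survey, "%Y"), "-07-01"))
jday_centered <- as.integer(survey - july1)
```

Both derived vectors need `survey` to be a Date, which is why a Date is required whenever closure_period is not supplied or includeJDay = TRUE.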
Value
A list of length 2. The first element, 'spp_vis', is a data.frame with visit
 (unique combination of site and time period) in the first column and taxa in all
 the following columns. Values in taxa columns are either TRUE or
FALSE depending on whether the taxon was observed on that visit. The second
 element, 'occDetData', is a data.frame giving the site, list length (the number of
 species observed on a visit) and year (or time period) for each visit. Optionally this also includes
 a Julian day column, centered on 1 July.
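To make the shape of the two elements concrete, the following base R sketch builds comparable tables from a handful of toy records. It is an illustration under stated assumptions rather than the function's implementation: the visit label format and the names `spp_vis_sketch`, `occ_det_sketch`, `list_length` and `year` are hypothetical and may not match the names used in the real output.

```r
# Toy records (hypothetical data): one row per observation
obs <- data.frame(
  taxa   = c("a", "b", "a", "c", "a"),
  site   = c("A1", "A1", "A2", "A2", "A1"),
  survey = as.Date(c("2010-05-03", "2010-05-03", "2010-06-10",
                     "2011-07-01", "2010-05-03")),
  stringsAsFactors = FALSE
)

# A visit is a unique site/survey combination (label format is an assumption)
obs$visit  <- paste(obs$site, obs$survey, sep = "_")
visits     <- sort(unique(obs$visit))
taxa_names <- sort(unique(obs$taxa))

# One logical column per taxon: was it recorded on each visit?
present <- vapply(
  taxa_names,
  function(tx) visits %in% obs$visit[obs$taxa == tx],
  logical(length(visits))
)

# Analogue of 'spp_vis': visit in the first column, then one column per taxon
spp_vis_sketch <- data.frame(visit = visits, present, stringsAsFactors = FALSE)

# Analogue of 'occDetData': site, list length and year for each visit
first_row <- match(visits, obs$visit)
occ_det_sketch <- data.frame(
  visit       = visits,
  site        = obs$site[first_row],
  list_length = unname(rowSums(present)),
  year        = as.integer(format(obs$survey[first_row], "%Y")),
  stringsAsFactors = FALSE
)
```

Duplicate records of the same taxon on the same visit (the first and last rows of `obs`) collapse to a single TRUE, so list length counts distinct taxa per visit.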
References
Isaac, N.J.B., van Strien, A.J., August, T.A., de Zeeuw, M.P. and Roy, D.B. (2014). Statistics for citizen science: extracting signals of change from noisy ecological data. Methods in Ecology and Evolution, 5 (10), 1052-1060.
van Strien, A.J., Termaat, T., Groenendijk, D., Mensing, V. & Kéry, M. (2010). Site-occupancy models may offer new opportunities for dragonfly monitoring based on daily species lists. Basic and Applied Ecology, 11, 495-503.
Examples
if (FALSE) {
# Create data
n <- 15000 #size of dataset
nyear <- 20 # number of years in data
nSurveys <- 100 # set number of dates
nSites <- 50 # set number of sites
# Create some dates
first <- as.Date(strptime("2010/01/01", format="%Y/%m/%d")) 
last <- as.Date(strptime(paste(2010+(nyear-1),"/12/31", sep=''), format="%Y/%m/%d")) 
dt <- last-first 
rDates <- first + (runif(nSurveys)*dt)
# taxa are set as random letters
taxa <- sample(letters, size = n, TRUE)
# sites are visited at random (nSites possible sites)
site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE)
# the date of visit is selected at random from those created earlier
survey <- sample(rDates, size = n, TRUE)
# format these data, including a Julian day column
formatted_data <- formatOccData(taxa = taxa,
                                site = site,
                                survey = survey,
                                includeJDay = TRUE)
}
if (FALSE) {
# Create data with coarser survey information
n <- 1500 # number of species observations in dataset
np <- 10 # number of closure periods in data
nSurveys <- 100 # set number of surveys
nSites <- 20 # set number of sites
# taxa are set as random letters
taxa <- sample(letters, size = n, TRUE)
# sites are visited at random (nSites possible sites)
site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE)
# each observation is assigned a survey at random (an integer ID, not a Date)
survey <- sample(nSurveys, size = n, TRUE)
# allocate the surveys randomly to closure periods 
cp <- sample(1:np, nSurveys, TRUE)
closure_period <- cp[survey]
# format these data using the supplied closure periods
formatted_data <- formatOccData(taxa = taxa,
                                site = site,
                                survey = survey,
                                closure_period = closure_period)
 
# OR format the unicorns data
formatted_data <- formatOccData(taxa = unicorns$species,
                               survey = unicorns$start_date,
                               site = unicorns$site)
}
