This function uses the method outlined in Roy et al. (2012) and Isaac et al. (2014) to select well-sampled sites from a dataset, using list length and number of years as selection criteria.

siteSelection(taxa, site, time_period, minL, minTP, LFirst = TRUE)

Arguments

taxa

A character vector of taxon names, as long as the number of observations.

site

A character vector of site names, as long as the number of observations.

time_period

A numeric vector of user-defined time periods, or a date vector, as long as the number of observations.

minL

Numeric. The minimum number of taxa recorded at a site in a given time period (the list length) for the visit to be considered well sampled.

minTP

Numeric. The minimum number of time periods (or, if time_period is a date, the minimum number of years) in which a site must be sampled for it to be considered well sampled.

LFirst

Logical. If TRUE, the data are filtered first by list length and then by number of time periods; if FALSE, first by number of time periods and then by list length.

Value

A data.frame containing the observations that fulfil the selection criteria.

References

needed

Examples

# Create data
n <- 150 # size of dataset
nyr <- 8 # number of years in data
nSamples <- 20 # set number of dates

# Create some dates
first <- as.POSIXct(strptime("2003/01/01", "%Y/%m/%d"))
last <- as.POSIXct(strptime(paste(2003+(nyr-1),"/12/31", sep=''), "%Y/%m/%d"))
dt <- last-first
rDates <- first + (runif(nSamples)*dt)

# taxa are set as random letters
taxa <- sample(letters, size = n, TRUE)

# three sites are visited randomly
site <- sample(c('one', 'two', 'three'), size = n, TRUE)

# the date of visit is selected at random from those created earlier
time_period <- sample(rDates, size = n, TRUE)

# combine this to a dataframe
df <- data.frame(taxa, site, time_period)
head(df)
#>   taxa  site         time_period
#> 1    f   one 2004-10-14 08:25:06
#> 2    d three 2004-10-16 02:35:43
#> 3    p   two 2008-01-15 15:39:28
#> 4    p   one 2003-05-10 08:34:09
#> 5    d three 2007-06-20 08:15:56
#> 6    j   one 2004-11-28 21:52:20

# Use the site selection function on this simulated data
dfSEL <- siteSelection(df$taxa, df$site, df$time_period, minL = 4, minTP = 3)
#> Warning: 9 out of 150 observations will be removed as duplicates
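
To make the two criteria and the effect of LFirst more concrete, the following is a minimal base-R sketch of the filtering logic as described in the Arguments section. It is not the package's implementation. The helper name sketchSiteSelection and the toy data are hypothetical, and the sketch assumes that a visit is one site in one time period as supplied, and that dates are reduced to calendar years only when counting time periods.

```r
# Hypothetical helper (not part of the package): illustrates the two
# selection criteria described in the Arguments section.
sketchSiteSelection <- function(taxa, site, time_period, minL, minTP,
                                LFirst = TRUE) {
  # Drop duplicated taxon/site/time-period records before counting
  dat <- unique(data.frame(taxa, site, time_period, stringsAsFactors = FALSE))

  # Keep visits (assumed: site x time period as supplied) that have
  # at least minL distinct taxa
  byListLength <- function(d) {
    if (nrow(d) == 0) return(d)
    visit <- paste(d$site, d$time_period, sep = "|")
    n <- tapply(d$taxa, visit, function(x) length(unique(x)))
    d[as.vector(n[visit] >= minL), , drop = FALSE]
  }

  # Keep sites sampled in at least minTP distinct time periods
  # (assumed: calendar years when time_period is a date)
  byTimePeriods <- function(d) {
    if (nrow(d) == 0) return(d)
    tp <- d$time_period
    if (inherits(tp, c("Date", "POSIXt"))) tp <- format(tp, "%Y")
    n <- tapply(tp, d$site, function(x) length(unique(x)))
    d[as.vector(n[as.character(d$site)] >= minTP), , drop = FALSE]
  }

  if (LFirst) byTimePeriods(byListLength(dat)) else byListLength(byTimePeriods(dat))
}

# Toy data: site "one" has list lengths 3, 2 and 3 in periods 1, 2 and 3;
# site "two" has a single record
toy <- data.frame(
  taxa = c("a", "b", "c", "a", "b", "a", "b", "c", "a"),
  site = c(rep("one", 8), "two"),
  time_period = c(1, 1, 1, 2, 2, 3, 3, 3, 1)
)

selLFirst  <- with(toy, sketchSiteSelection(taxa, site, time_period,
                                            minL = 3, minTP = 3, LFirst = TRUE))
selTPFirst <- with(toy, sketchSiteSelection(taxa, site, time_period,
                                            minL = 3, minTP = 3, LFirst = FALSE))
nrow(selLFirst)   # 0: period 2 is dropped first, leaving only 2 periods
nrow(selTPFirst)  # 6: site "one" passes on 3 periods, then period 2 is dropped
```

In this sketch the order of the two filters changes the result: filtering by list length first can leave a site with too few time periods, whereas filtering by time periods first counts visits that are later removed for having short lists.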