This function takes the records for a single recorder and calculates the list length metrics. These metrics are built around the idea of a 'list', defined as the set of species recorded at a single location (often a 1 km square) on a single day by an individual recorder.
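As a rough illustration of the 'list' definition above, the sketch below groups a small, made-up data.frame by location and date for one recorder and counts the distinct species in each group. The toy data and the use of `tapply()` are assumptions for illustration only; they show the idea, not necessarily how `listLength` is implemented internally.

```r
# Hypothetical toy records using the default column names from the usage
records <- data.frame(
  recorders       = c("A", "A", "A", "A", "B"),
  kmsq            = c("SU1234", "SU1234", "SU1234", "SU5678", "SU1234"),
  date_start      = as.Date(c("2020-05-01", "2020-05-01", "2020-05-01",
                              "2020-05-02", "2020-05-01")),
  preferred_taxon = c("Pieris rapae", "Aglais io", "Pieris rapae",
                      "Aglais io", "Aglais io"),
  stringsAsFactors = FALSE
)

# Keep only the records belonging to the recorder of interest
one_recorder <- records[records$recorders == "A", ]

# A list is one location on one day, so build an identifier from both
list_id <- paste(one_recorder$kmsq, one_recorder$date_start)

# List length = number of distinct species in each list
# (counting distinct species is an assumption of this sketch)
list_lengths <- tapply(one_recorder$preferred_taxon, list_id,
                       function(sp) length(unique(sp)))
list_lengths
```

Here recorder "A" has two lists: one from SU1234 on 2020-05-01 and one from SU5678 on 2020-05-02.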

listLength(recorder_name, data, threshold = 10, plot = FALSE,
  sp_col = "preferred_taxon", date_col = "date_start",
  recorder_col = "recorders", location_col = "kmsq")



  • recorder_name - the name of the recorder for whom you want to calculate the metrics

  • data - the data.frame of recording information

  • threshold - the minimum number of lists required before the metrics are calculated. If this is not met, NA is reported for all metrics except n_lists

  • plot - should a histogram of list lengths be plotted?

  • sp_col - the name of the column that contains the species names

  • date_col - the name of the column that contains the date. This must be formatted as a date

  • recorder_col - the name of the column that contains the recorder names

  • location_col - the name of the column that contains the location. This is a character value, such as a grid reference, and should be representative of the scale at which recording is done over a single day; typically a 1 km square is used


A data.frame with seven columns

  • recorder - The name of the recorder, as given in the recorder_name argument

  • mean_LL - The mean number of species recorded across all lists

  • median_LL - The median number of species recorded across all lists

  • variance - The variance in the number of species recorded across all lists

  • p1 - The proportion of lists that had a single species recorded

  • p4 - The proportion of lists that had four or more species recorded

  • n_lists - The number of lists this recorder recorded
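The columns above can be sketched from a plain vector of list lengths. The helper `summarise_lists()` below is hypothetical (it is not part of the package), and treating the threshold as "fewer than `threshold` lists returns NA" is an assumption about the boundary case.

```r
# Hypothetical helper: summarise a vector of list lengths for one recorder
summarise_lists <- function(list_lengths, threshold = 10) {
  n_lists <- length(list_lengths)

  # Too few lists: report NA for every metric except n_lists
  # (assumes "not met" means strictly fewer lists than the threshold)
  if (n_lists < threshold) {
    return(data.frame(mean_LL = NA, median_LL = NA, variance = NA,
                      p1 = NA, p4 = NA, n_lists = n_lists))
  }

  data.frame(
    mean_LL   = mean(list_lengths),        # mean species per list
    median_LL = median(list_lengths),      # median species per list
    variance  = var(list_lengths),         # sample variance of list length
    p1        = mean(list_lengths == 1),   # share of single-species lists
    p4        = mean(list_lengths >= 4),   # share of lists with 4+ species
    n_lists   = n_lists
  )
}

# Made-up list lengths for ten lists
example_lengths <- c(1, 1, 2, 4, 5, 1, 3, 6, 1, 2)
summarise_lists(example_lengths, threshold = 10)
```

Taking the mean of a logical vector, as in `mean(list_lengths == 1)`, is a compact way to get a proportion in R.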


# load example data

# Location might be a site name column in your data or a unique combination of lat and long
# Our data is missing a location column so we will use lat and long
# It might be more sensible to convert lat long to a grid reference and 
# use a 1 km square grid reference to represent a site 
cit_sci_data$location <- paste(round(cit_sci_data$lat, 4), round(cit_sci_data$long, 4))

# run for one recorder
LL <- listLength(data = cit_sci_data,
                 recorder_name = 3007,
                 threshold = 10,
                 plot = FALSE,
                 sp_col = 'species',
                 date_col = 'date',
                 recorder_col = 'recorder',
                 location_col = 'location')

# Run the metric for all recorders
LL_all <- lapply(unique(cit_sci_data$recorder),
                 FUN = listLength,
                 data = cit_sci_data,
                 threshold = 10,
                 plot = FALSE,
                 sp_col = 'species',
                 date_col = 'date',
                 recorder_col = 'recorder',
                 location_col = 'location')

# summarise as one table
LL_all_sum <- do.call(rbind, LL_all)

hist(LL_all_sum$n_lists, breaks = 80)