Recorder metrics can be biased if there are annual breaks in data collection. In these cases it is better to include only data from the recording period (typically summer). This function provides an objective way to identify that period.

summerData(input_data, probs = c(0.025, 0.975),
  date_col = "date_start")

Arguments

input_data

the data.frame of recording information

probs

A vector of two proportions giving the positions of the start and end of summer. The default values of 0.025 and 0.975 mean that the central 95 percent of days in any year are classed as the recording period.

date_col

the name of the column that contains the date. This column must be formatted as a date.
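
If the dates arrive as text, they can be converted before the data is passed in. The snippet below is a minimal sketch that assumes "formatted as a date" means an object of class Date; the data frame `raw` and its values are hypothetical and are not part of the package.

```r
# Hypothetical input where the dates were read in as character strings
raw <- data.frame(
  species = c("a", "b", "c"),
  date_start = c("2015-06-01", "2015-07-15", "2016-08-30"),
  stringsAsFactors = FALSE
)

# Convert to class Date, stating the expected format explicitly
raw$date_start <- as.Date(raw$date_start, format = "%Y-%m-%d")

# Check that the column is now a Date and that every value parsed
inherits(raw$date_start, "Date")
anyNA(raw$date_start)
```

Values that do not match the format string become NA, so the `anyNA()` check is a cheap way to catch parsing problems early.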

Value

Only data identified as falling within the primary recording period (e.g. summer) is returned. Three additional columns are added:

  • Jday - The day of the year as a numeric value, the first day of the year being 1, the second 2 and so on

  • year - The year of the record in the format YYYY

  • summer - Logical, does this record fall in the summer period (i.e. the annual period of heightened recording)

The returned object has an attribute cutoffs which details the days (Jday) used as the first and last days of summer in each year.
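
The idea behind the returned columns and the cutoffs attribute can be illustrated with base R. This is only a sketch of a per-year quantile approach under stated assumptions, not the package's actual implementation: the toy data frame `records`, the use of `quantile()` with its default settings, and the matrix layout of `cutoffs` are all assumptions made for illustration.

```r
# Hypothetical toy data: 500 records spread at random over 2010 and 2011
set.seed(1)
records <- data.frame(
  date = as.Date("2010-01-01") + sample(0:729, 500, replace = TRUE)
)

# Day of the year (1 = 1 January) and four-digit year
records$Jday <- as.numeric(format(records$date, "%j"))
records$year <- as.numeric(format(records$date, "%Y"))

# Per-year cutoffs: the quantiles of Jday at the chosen proportions
probs <- c(0.025, 0.975)
cutoffs <- do.call(
  rbind,
  lapply(split(records$Jday, records$year), quantile, probs = probs)
)

# Flag records that fall between their own year's cutoffs (inclusive)
lower <- cutoffs[as.character(records$year), 1]
upper <- cutoffs[as.character(records$year), 2]
records$summer <- records$Jday >= lower & records$Jday <= upper

# Keep only the flagged records and attach the cutoffs as an attribute
summer_only <- records[records$summer, ]
attr(summer_only, "cutoffs") <- cutoffs
```

Because the cutoffs are computed separately for each year, a year with an unusually early or late recording season gets its own start and end days rather than sharing a fixed calendar window.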

Examples

# NOT RUN {
# Look at the first rows of the example data
head(cit_sci_data)

# Subset this data to summer periods only
SD <- summerData(input_data = cit_sci_data,
                 probs = c(0.025, 0.975),
                 date_col = 'date')

head(SD)

# Data not in the summer period is removed
nrow(cit_sci_data)
nrow(SD)

# The cutoffs used to define summer are also returned
attr(SD, which = 'cutoffs')
# }