summerData.Rd
Recorder metrics can be biased if there are annual breaks in data collection. In these cases it is better to ensure that only data from the recording period (typically summer) is included. This function provides an objective way to identify the recording period.
summerData(input_data, probs = c(0.025, 0.975), date_col = "date_start")
input_data | The data.frame of recording information. |
---|---|
probs | A vector of two proportions giving the positions of the start and end of summer. The default values of 0.025 and 0.975 mean that the central 95 percent of days in any year is classed as the recording period. |
date_col | The name of the column that contains the date. This must be formatted as a date. |
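
Because date_col must hold dates, a column read in as text needs converting first. The lines below are a minimal sketch of that step using base R only; the `records` data frame and its values are made up for illustration, and the "%Y-%m-%d" format is an assumption about how the text dates are written.

```r
# Hypothetical records with dates stored as text (made-up data)
records <- data.frame(
  species = c("a", "b", "c"),
  date_start = c("2010-06-01", "2010-07-15", "2011-08-30"),
  stringsAsFactors = FALSE
)

# Convert the text column to Date class before passing the data on
records$date_start <- as.Date(records$date_start, format = "%Y-%m-%d")

# Check that the column is now a Date and that nothing failed to parse
inherits(records$date_start, "Date")
anyNA(records$date_start)
```

Dates that do not match the stated format become NA, so checking with anyNA() before going further is a cheap safeguard.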
Only data identified as coming from the primary recording period (e.g. summer) is returned. Three additional columns are returned.
Jday
- The day of the year as a numeric value, the first day of the year being 1, the second 2 and so on
year
- The year of the record in the format YYYY
summer
- Logical, does this record fall in the summer period (i.e. the annual period of heightened recording)
The returned object has an attribute cutoffs, which details the days (Jday) used as the first and last days of summer in each year.
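
The return value described above can be illustrated with a rough sketch in base R. It assumes that the cutoffs are per-year quantiles of Jday taken at probs, which is one reading of the argument description and not a confirmed account of how the package does it. The helper `sketch_summer()` and the `toy` data are hypothetical and exist only for this example.

```r
# Illustrative sketch only: sketch_summer() is a hypothetical helper,
# not the package's own code.
sketch_summer <- function(input_data, probs = c(0.025, 0.975),
                          date_col = "date_start") {
  dates <- input_data[[date_col]]
  stopifnot(inherits(dates, "Date"))

  # Day of year (1 = 1 January) and four-digit year
  input_data$Jday <- as.numeric(format(dates, "%j"))
  input_data$year <- as.numeric(format(dates, "%Y"))

  # Assumed rule: lower and upper quantiles of Jday within each year
  years <- sort(unique(input_data$year))
  q <- vapply(years, function(y) {
    quantile(input_data$Jday[input_data$year == y],
             probs = probs, names = FALSE)
  }, numeric(2))
  cutoffs <- data.frame(year = years, start = q[1, ], end = q[2, ])

  # Flag records that fall inside their own year's window
  idx <- match(input_data$year, cutoffs$year)
  input_data$summer <- input_data$Jday >= cutoffs$start[idx] &
    input_data$Jday <= cutoffs$end[idx]

  # Keep only flagged records and attach the cutoffs as an attribute
  out <- input_data[input_data$summer, , drop = FALSE]
  attr(out, "cutoffs") <- cutoffs
  out
}

# Made-up dates spread over 2010 and 2011
set.seed(1)
toy <- data.frame(
  date = as.Date("2010-01-01") + sample(0:729, 500, replace = TRUE)
)
res <- sketch_summer(toy, date_col = "date")
attr(res, "cutoffs")
```

In this sketch every returned row has summer equal to TRUE, because rows outside the window are dropped before the result is returned.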
# NOT RUN {
# load example data
head(cit_sci_data)

# Subset this data to summer periods only
SD <- summerData(input_data = cit_sci_data,
                 probs = c(0.025, 0.975),
                 date_col = 'date')
head(SD)

# Data not in the summer period is removed
nrow(cit_sci_data)
nrow(SD)

# The cutoffs used to define summer are also returned
attr(SD, which = 'cutoffs')
# }