This function replicates the analyses presented in August et al XXXX, so that you can extract values for the four axes (recording intensity, spatial extent, recording potential, and rarity recording) from your own data. It applies the same centring and scaling values as August et al and performs the same pre-analysis log transformations. Note that the axis values, while comparable to those in August et al, may not be optimal for your data. You should also extract the raw metrics and run your own PCA to check whether the same axes explain the variation observed in your data.

predictAxes(data, recorders = NULL, verbose = TRUE,
  recorder_col = "recorder", date_col = "date", y_col = "lat",
  x_col = "long", square_km_col = "km_sq", active_days_limit = 10,
  crs = "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs",
  new_crs = "+proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +ellps=airy +datum=OSGB36 +units=m +no_defs",
  sp_col = "species")

Arguments

data

The data.frame of recording information. See `head(cit_sci_data)` for an example format. This should contain all observations made to your citizen science project.

recorders

Optional. A vector of recorders (as named in `recorder_col`) for which you want to calculate values.

verbose

Should progress be reported?

recorder_col

The name of the column that contains the recorder names.

date_col

The name of the column that contains the date. This column must be formatted as a date.

y_col

The name of the column that contains the y coordinate (e.g. latitude) of the observation. This should be numeric.

x_col

The name of the column that contains the x coordinate (e.g. longitude) of the observation. This should be numeric.

square_km_col

The name of the column that contains the 1km-square of each record (i.e. a grid reference). List lengths are calculated by defining a recorder's location as the 1km-square in which they recorded, so providing the 1km-square here makes results comparable to August et al.

active_days_limit

If a recorder has fewer than this number of active days, NA values are returned for the metrics. August et al use 10; changing this value will produce metrics that are not comparable to August et al.

crs

The proj4 string that describes the projection your data are using. For GPS latitude and longitude this is "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs". You can find more at http://spatialreference.org/

new_crs

The proj4 string that describes the coordinate system your data should be reprojected to. THIS IS IMPORTANT. Your data must be on a projection that has units in meters so that results are comparable to other studies. An appropriate system in the UK is the UK national grid "+proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +ellps=airy +datum=OSGB36 +units=m +no_defs". If your original crs (given in the argument `crs`) already has units in meters, set `new_crs = NULL`. WARNING: if you set this to NULL but your coordinate system is not in units of meters, you will likely get errors.

sp_col

The name of the column that contains the species names.
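
The arguments above place several requirements on the input data: a date column, a 1km-square column, a minimum number of active days, and a coordinate system in meters. The following is a minimal base-R sketch of how such data might be prepared and checked before calling the function. The toy data, the `easting`/`northing` columns, the `km_sq` labelling scheme, and the helper `units_are_metres()` are all hypothetical, and counting distinct dates is only an assumed reading of "active days"; none of this is taken from the package itself.

```r
# Toy observations; every value here is made up for illustration
obs <- data.frame(
  recorder = c(rep("rec_a", 12), rep("rec_b", 3)),
  date     = format(as.Date("2016-01-01") + c(0:11, 0, 0, 1)),
  easting  = 451000 + seq(0, 1400, by = 100),
  northing = 207000 + seq(0, 1400, by = 100),
  stringsAsFactors = FALSE
)

# date_col: convert character dates to class Date
obs$date <- as.Date(obs$date, format = "%Y-%m-%d")

# square_km_col: label each record with the 1 km square it falls in,
# assuming the coordinates are already in meters (hypothetical ID scheme,
# not an official grid reference)
obs$km_sq <- paste(floor(obs$easting / 1000),
                   floor(obs$northing / 1000), sep = "_")

# active_days_limit: count distinct recording dates per recorder
# (assumed definition of an "active day")
active_days <- tapply(obs$date, obs$recorder,
                      function(d) length(unique(d)))
enough_days <- names(active_days)[active_days >= 10]

# crs / new_crs: crude text check of a proj4 string (hypothetical helper).
# Returns FALSE for longlat, TRUE when "+units=m" is explicit, NA otherwise.
units_are_metres <- function(proj4) {
  if (grepl("+proj=longlat", proj4, fixed = TRUE)) return(FALSE)
  if (grepl("\\+units=m([[:space:]]|$)", proj4)) return(TRUE)
  NA
}

wgs84 <- "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"
osgb  <- paste("+proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717",
               "+x_0=400000 +y_0=-100000 +ellps=airy +datum=OSGB36",
               "+units=m +no_defs")

units_are_metres(wgs84)  # longlat, so reprojection via new_crs is needed
units_are_metres(osgb)   # already in meters, so new_crs = NULL may apply
```

Recorders missing from `enough_days` would be expected to receive NA metrics under the default `active_days_limit`. The string check is deliberately conservative: an `NA` result means the units could not be determined from the text alone and should be verified by hand.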

Examples

# NOT RUN {
# load example data
head(cit_sci_data)

# Run for 10 recorders
metrics_axes <- predictAxes(data = cit_sci_data,
                            recorders = unique(cit_sci_data$recorder)[1:10])

# The returned object is a list of the metrics...
metrics_axes$recorder_metrics

# ...and the axes values
metrics_axes$axes

# Run the metrics for all recorders. NOTE: this takes a long time
metrics_axes <- predictAxes(data = cit_sci_data)

# }
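
The description recommends running your own PCA on the raw metrics to check whether the same axes matter for your data. A minimal sketch of that step using base R's `prcomp()` is given here. The `raw_metrics` table and its `metric_*` columns are hypothetical stand-ins for a table such as `metrics_axes$recorder_metrics`, and the choice of which columns to log-transform is illustrative only, not the transformation used in August et al.

```r
# Hypothetical stand-in for the raw metrics table (one row per recorder)
set.seed(1)
raw_metrics <- data.frame(
  recorder = paste0("rec_", 1:30),
  metric_a = rlnorm(30),   # right-skewed, strictly positive
  metric_b = rlnorm(30),   # right-skewed, strictly positive
  metric_c = rnorm(30)
)

# Keep the numeric columns and drop recorders with missing values,
# since prcomp() cannot handle NA
num_cols  <- vapply(raw_metrics, is.numeric, logical(1))
complete  <- complete.cases(raw_metrics[, num_cols])
pca_input <- raw_metrics[complete, num_cols]

# Log-transform the skewed metrics (illustrative choice)
pca_input$metric_a <- log(pca_input$metric_a)
pca_input$metric_b <- log(pca_input$metric_b)

# Centre and scale each metric, then run the PCA
pca <- prcomp(pca_input, center = TRUE, scale. = TRUE)

summary(pca)    # proportion of variance explained by each component
pca$rotation    # loadings: how each metric contributes to each component
```

Comparing the loadings with the four axes named in the description gives a rough indication of whether the published axes are a reasonable summary of your own recorders.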