A function for using Frescalo (Hill, 2011), a tool for analysing occurrence data when recording effort is not known. This function returns the output from Frescalo to the R session and saves it to the path specified by sinkdir. If plot_fres is set to TRUE, maps of the results are also saved. Plotting the returned object gives a useful summary.

frescalo(Data, frespath, time_periods, site_col, sp_col, year_col = NULL,
  start_col = NULL, end_col = NULL, species_to_include = NULL,
  sinkdir = NULL, plot_fres = FALSE, Fres_weights = "LCGB",
  non_benchmark_sp = NULL, fres_site_filter = NULL, phi = 0.74,
  alpha = 0.27, trend_option = "arithmetic", NYears = 10,
  ignore.ireland = F, ignore.channelislands = F)

Arguments

Data

A dataframe object. This should consist of rows of observations and columns indicating the species and location as well as either the year of the observation or columns specifying the start and end dates of the observation. It is important that date columns are in a date format.
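As a minimal sketch of the layout described above, the toy data frame below uses hypothetical column names (`site`, `species`, `start_date`, `end_date`) and invented records. Only the requirement that date columns be in a date format comes from the description; everything else is illustrative.

```r
# Toy occurrence data; all values and column names are illustrative only
obs <- data.frame(
  site       = c("SP50", "SP50", "TL28"),
  species    = c("Species A", "Species B", "Species A"),
  start_date = c("1985-06-01", "1992-07-15", "1996-05-20"),
  end_date   = c("1985-06-01", "1992-07-31", "1996-05-20"),
  stringsAsFactors = FALSE
)

# Convert the character columns to Date class before passing the data on
obs$start_date <- as.Date(obs$start_date, format = "%Y-%m-%d")
obs$end_date   <- as.Date(obs$end_date, format = "%Y-%m-%d")

# Inspect the column classes
str(obs)
```

With a table like this, the column names would be supplied through site_col, sp_col, start_col and end_col.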

frespath

The path to the Frescalo .exe file. This can be downloaded from http://www.brc.ac.uk/biblio/frescalo-computer-program-analyse-your-biological-records. It is currently only available for Windows. The directory where the .exe is saved should be writeable.

time_periods

A dataframe object with two columns. The first column contains the start year of each time period and the second column contains the end year of each time period. Time periods should not overlap.
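A short sketch of building such a table and checking it for overlaps before use. The column names `start` and `end` and the check itself are illustrative assumptions, not part of the function.

```r
# Two decades as time periods: first column start year, second column end year
time_periods <- data.frame(start = c(1980, 1990), end = c(1989, 1999))

# Sort by start year, then flag any period that begins on or before
# the end year of the period preceding it
tp <- time_periods[order(time_periods$start), ]
overlaps <- tp$start[-1] <= tp$end[-nrow(tp)]

# TRUE here would mean at least one pair of periods overlaps
any(overlaps)
```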

site_col

The name of the site column in Data

sp_col

The name of the species column in Data

year_col

The name of the year column in Data

start_col

The name of the start date column in Data

end_col

The name of the end date column in Data

species_to_include

Optionally a character vector listing the names of species to be used. Species not in your list are ignored. This is useful if you are only interested in a subset of species.

sinkdir

String giving the output directory for results

plot_fres

Logical, if TRUE maps are produced by Frescalo. Default is FALSE. CURRENTLY ONLY WORKS FOR UK GRID-REFERENCE DATA

Fres_weights

'LC*' specifies a weights file based on landcover data. The suffix specifies the extent ('LCUK', 'LCNI' or 'LCGB'). 'VP' uses a weights file based on vascular plant data for the UK. Both are included in the package. Alternatively, a custom weights file can be given as a data.frame. This must have three columns: target cell, neighbour cell, weight. Default is 'LCGB'
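A custom weights table might be laid out as in the sketch below. The column names, cell names and weight values are invented for illustration; only the three-column order (target cell, neighbour cell, weight) is taken from the description above.

```r
# Hypothetical custom weights: one row per target/neighbour pair
custom_weights <- data.frame(
  target    = c("SP50", "SP50", "TL28", "TL28"),
  neighbour = c("SP50", "TL28", "TL28", "SP50"),
  weight    = c(1.0, 0.4, 1.0, 0.4),
  stringsAsFactors = FALSE
)

# Basic shape checks before handing the table to Fres_weights
stopifnot(ncol(custom_weights) == 3)
stopifnot(is.numeric(custom_weights$weight))
```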

non_benchmark_sp

a character vector, giving the names of species not to be used as benchmarks in Frescalo. Default is NULL and all species are used. See Hill, 2011 for reasons why some species may not be suitable benchmarks.

fres_site_filter

Optionally a character vector giving the names of sites to be used in the trend analysis. Sites not included in this list are not used for estimating TFactors. Default is NULL and all sites are used.

phi

Target frequency of frequency-weighted mean frequency. Default is 0.74 as in Hill (2011). If set to NULL, phi will start at 0.74 and will be increased if the value is smaller than the 98.5th percentile of input phi, limited to a maximum of 0.95.

alpha

the proportion of the expected number of species in a cell to be treated as benchmarks. Default is 0.27 as in Hill (2011). This is limited to 0.08 to 0.50.

trend_option

Set the method by which you wish to calculate percentage change. This can currently be set to either 'arithmetic' (default) or 'geometric'. Arithmetic calculates percentage change in a linear fashion, such that an increase of 50% over 50 years is equal to 10% in 10 years. For the same example a geometric trend would be about 8.45% every 10 years, as this works on a compound rate.
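The difference between the two options can be sketched with a small helper. The function `pct_change` is hypothetical and written only for this illustration; it assumes a plain linear split for 'arithmetic' and a plain compound rate for 'geometric', which may differ in detail from the package's internal calculation.

```r
# Hypothetical helper: convert a total proportional change over span_years
# into a change per n_years, linearly or by compounding
pct_change <- function(total_change, span_years, n_years = 10,
                       option = c("arithmetic", "geometric")) {
  option <- match.arg(option)
  if (option == "arithmetic") {
    # Spread the total change evenly across the years
    total_change / span_years * n_years
  } else {
    # Per-period rate that compounds to the total change over span_years
    (1 + total_change)^(n_years / span_years) - 1
  }
}

# A 50% increase over 50 years, expressed per decade
pct_change(0.50, 50, option = "arithmetic")
pct_change(0.50, 50, option = "geometric")
```

The geometric figure is the smaller of the two for an increase because each decade's growth builds on the previous decade's total.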

NYears

The number of years over which you want the percentage change to be calculated (i.e. 10 gives a decadal change). Default = 10

ignore.ireland

Logical, if TRUE Irish hectads are removed. Default is FALSE

ignore.channelislands

Logical, if TRUE channel island hectads are removed. Default is FALSE

Value

Results are saved to file and most are returned in a list to R.

The list object returned is comprised of the following:

$paths

This list of file paths provides the locations of the raw data files for $log, $stat, $freq and $trend, in that order

$trend

This dataframe provides the list of time factors for each species

- Species: Name of species
- Time: Time period, specified as a class (e.g. 1970); times need not be numeric and are indexed as character strings
- TFactor: Time factor, the estimated relative frequency of species at the time
- St_Dev: Standard deviation of the time factor, given that Xspt (defined below) is a weighted sum of binomial variates
- X: Number of occurrences of species at the time period
- Xspt: Number of occurrences, given reduced weight of locations having very low sampling effort
- Xest: Estimated number of occurrences; this should be equal to Xspt if the algorithm has converged
- N>0.00: Number of locations with non-zero probability of the species occurring
- N>0.98: Number of locations for which the probability of occurrence was estimated as greater than 0.98

$stat

Location report

- Location: Name of location; in this case locations are hectads of the GB National Grid
- Loc_no: Numbering (added) of locations in alphanumeric order
- No_spp: Number of species at that location; the actual number, which may be zero
- Phi_in: Initial value of phi, the frequency-weighted mean frequency
- Alpha: Sampling effort multiplier (to achieve standard value of phi)
- Wgt_n2: Effective number N2 for the neighbourhood weights; this is small if there are few floristically similar hectads close to the target hectad. It is (sum weights)^2 / (sum weights^2)
- Phi_out: Value of phi after rescaling; constant, if the algorithm has converged
- Spnum_in: Sum of neighbourhood frequencies before rescaling
- Spnum_out: Estimated species richness, i.e. sum of neighbourhood frequencies after rescaling
- Iter: Number of iterations for algorithm to converge
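
The Wgt_n2 formula given in the list above, (sum weights)^2 / (sum weights^2), can be tried on a toy vector. The helper name `effective_n` and the weight values are invented for this sketch.

```r
# Effective number of neighbours for a vector of neighbourhood weights
effective_n <- function(w) {
  sum(w)^2 / sum(w^2)
}

# Equal weights: the effective number equals the number of neighbours
effective_n(rep(1, 5))

# Unequal weights: the effective number drops below the raw count
effective_n(c(1.0, 0.8, 0.5, 0.5, 0.2))
```

The value shrinks as the weights become more uneven, which is why a target hectad with few similar neighbours has a small Wgt_n2.
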
$freq

Listing of rescaled species frequencies

- Location: Name of location
- Species: Name of species
- Pres: Record of species in location (1 = recorded, 0 = not recorded)
- Freq: Frequency of species in neighbourhood of location
- Freq_1: Estimated probability of occurrence, i.e. frequency of species after rescaling
- SD_Frq1: Standard error of Freq_1, calculated on the assumption that Freq is a binomial variate with standard error sqrt(Freq*(1-Freq)/Wgt_n2), where Wgt_n2 is as defined under $stat
- Rank: Rank of frequency in neighbourhood of location
- Rank_1: Rescaled rank, defined as Rank/Estimated species richness
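
The binomial standard error quoted for Freq, sqrt(Freq*(1-Freq)/Wgt_n2), can be evaluated directly. The input values below are invented, and the sketch covers only this one expression, not the full calculation behind SD_Frq1.

```r
# Illustrative inputs: a neighbourhood frequency and an effective number
freq   <- 0.30
wgt_n2 <- 25

# Binomial standard error of Freq under the stated assumption
se_freq <- sqrt(freq * (1 - freq) / wgt_n2)
se_freq
```

A larger Wgt_n2 gives a smaller standard error, as expected when more neighbours effectively contribute to the frequency.
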
$log

This records all the output sent to the console when running frescalo

$lm_stats

The results of linear modelling of TFactors

- SPECIES: Name of species used internally by frescalo
- NAME: Name of species as it appears in the raw data
- b: The slope of the model
- a: The intercept
- b_std_err: Standard error of the slope
- b_tval: t-value for a test of significance of the slope
- b_pval: p-value for a test of significance of the slope
- a_std_err: Standard error of the intercept
- a_tval: t-value for a test of significance of the intercept
- a_pval: p-value for a test of significance of the intercept
- adj_r2: Adjusted R-squared of the model
- r2: R-squared of the model
- F_val: F-value of the model
- F_num_df: Numerator degrees of freedom from the F-statistic
- F_den_df: Denominator degrees of freedom from the F-statistic
- Ymin: The earliest year in the dataset
- Ymax: The latest year in the dataset
- change_...: The percentage change, dependent on the values given to trend_option and NYears.

The following columns are only produced when there are only two time periods:

- Z_VAL: Z-value for the significance test of the trend
- SIG_95: A logical statement indicating if the trend is significant (TRUE) or non-significant (FALSE)
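
To show where statistics of this kind come from, the sketch below fits a linear model to invented time factors with base R's lm() and collects the corresponding values. The data and the object names are assumptions, and this is not claimed to be the package's own code.

```r
# Invented time factors for one species at four time-period midpoints
tf <- data.frame(
  year    = c(1975, 1985, 1995, 2005),
  TFactor = c(0.80, 0.97, 1.08, 1.25)
)

# Linear model of time factor against year
fit   <- lm(TFactor ~ year, data = tf)
s     <- summary(fit)
coefs <- coef(s)

# Collect values analogous to the columns listed above
stats <- c(
  b         = coefs["year", "Estimate"],
  a         = coefs["(Intercept)", "Estimate"],
  b_std_err = coefs["year", "Std. Error"],
  b_tval    = coefs["year", "t value"],
  b_pval    = coefs["year", "Pr(>|t|)"],
  r2        = s$r.squared,
  adj_r2    = s$adj.r.squared,
  F_val     = s$fstatistic[["value"]],
  F_num_df  = s$fstatistic[["numdf"]],
  F_den_df  = s$fstatistic[["dendf"]]
)
stats
```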

References

Hill, Mark. Local frequency as a key to interpreting species occurrence data when recording effort is not known. 2011. Methods in Ecology and Evolution, 3 (1), 195-205.

Examples

# NOT RUN {
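# frescalo() and the unicorns example data are provided by the sparta package
library(sparta)

# Load data
data(unicorns)

# Path to the Frescalo executable. This is a placeholder (hypothetical path);
# replace it with the location of the .exe on your machine
myFresPath <- 'C:/path/to/Frescalo.exe'

# Run frescalo (results are saved to the working directory as sinkdir is not given)
fres_out <- frescalo(Data = unicorns,
                     frespath = myFresPath,
                     time_periods = data.frame(start = c(1980, 1990), end = c(1989, 1999)),
                     site_col = 'hectad',
                     sp_col = 'CONCEPT',
                     start_col = 'TO_STARTDATE',
                     end_col = 'Date')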
# }