Frescalo trend analysis — frescalo • sparta

A function for using Frescalo (Hill, 2011), a tool for analysing occurrence data when recording effort is not known. This function returns the output from Frescalo to the R session and saves it to the path specified by sinkdir. Setting plot_fres to TRUE also saves maps of the results. Plotting the returned object gives a useful summary.

Usage

frescalo(
  Data,
  frespath,
  time_periods,
  site_col,
  sp_col,
  year_col = NULL,
  start_col = NULL,
  end_col = NULL,
  species_to_include = NULL,
  sinkdir = NULL,
  plot_fres = FALSE,
  Fres_weights = "LCGB",
  non_benchmark_sp = NULL,
  fres_site_filter = NULL,
  phi = 0.74,
  alpha = 0.27,
  trend_option = "arithmetic",
  NYears = 10,
  ignore.ireland = F,
  ignore.channelislands = F
)

Arguments

Data: A dataframe object. This should consist of rows of observations and columns indicating the species and location as well as either the year of the observation or columns specifying the start and end dates of the observation. It is important that date columns are in a date format.
frespath: the path to the frescalo .exe file. This can be downloaded from http://www.brc.ac.uk/biblio/frescalo-computer-program-analyse-your-biological-records. It is currently only available for Windows. The directory where the .exe is saved should be writeable.
time_periods: A dataframe object with two columns. The first column contains the start year of each time period and the second column contains the end year of each time period. Time periods should not overlap.
site_col: The name of the site column in Data
sp_col: The name of the species column in Data
year_col: The name of the year column in Data
start_col: The name of the start date column in Data
end_col: The name of the end date column in Data
species_to_include: Optionally a character vector listing the names of species to be used. Species not in your list are ignored. This is useful if you are only interested in a subset of species.
sinkdir: String giving the output directory for results
plot_fres: Logical, if TRUE maps are produced by Frescalo. Default is FALSE. Currently this only works for UK grid-reference data.
Fres_weights: 'LC*' specifies a weights file based on landcover data. The suffix specifies the extent ('LCUK', 'LCNI' or 'LCGB'). 'VP' uses a weights file based on vascular plant data for the UK. Both are included in the package. Alternatively, a custom weights file can be given as a data.frame. This must have three columns: target cell, neighbour cell, weight. Default is 'LCGB'
non_benchmark_sp: a character vector, giving the names of species not to be used as benchmarks in Frescalo. Default is NULL and all species are used. See Hill, 2011 for reasons why some species may not be suitable benchmarks.
fres_site_filter: Optionally a character vector giving the names of sites to be used in the trend analysis. Sites not included in this list are not used for estimating TFactors. Default is NULL and all sites are used.
phi: Target frequency of frequency-weighted mean frequency. Default is 0.74 as in Hill (2011). If set to NULL, phi will start at 0.74 and will be increased if the value is smaller than the 98.5th percentile of input phi, limited to a maximum of 0.95.
alpha: the proportion of the expected number of species in a cell to be treated as benchmarks. Default is 0.27 as in Hill (2011). This is limited to 0.08 to 0.50.
trend_option: Set the method by which you wish to calculate percentage change. This can currently be set to either 'arithmetic' (default) or 'geometric'. Arithmetic calculates percentage change in a linear fashion such that a decline of 50% over 50 years is equal to 10% in 10 years. Using the same example, a geometric trend would be 8.44% every 10 years as this works on a compound rate.
NYears: The number of years over which you want the percentage change to be calculated (i.e. 10 gives a decadal change). Default = 10
ignore.ireland: Logical, if TRUE Irish hectads are removed. Default is FALSE
ignore.channelislands: Logical, if TRUE channel island hectads are removed. Default is FALSE
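
The difference between the two trend_option settings can be sketched with a little arithmetic. The helper below, pct_change(), is hypothetical and is not part of sparta; it assumes the arithmetic option shares the total change out evenly across years and the geometric option uses the usual compound-rate formula. Under that assumed formula, a 50% increase over 50 years works out at roughly 8.45% per decade and a 50% decline at roughly 12.94% per decade.

```r
# Hypothetical helper (not a sparta function): rescale a total percentage
# change observed over `span` years to a window of `n_years` years.
pct_change <- function(total_pct, span, n_years = 10,
                       option = c("arithmetic", "geometric")) {
  option <- match.arg(option)
  if (option == "arithmetic") {
    # Linear: the total change is shared out evenly across the years
    total_pct * n_years / span
  } else {
    # Compound: the per-window rate that multiplies up to the total ratio
    ratio <- 1 + total_pct / 100
    (ratio^(n_years / span) - 1) * 100
  }
}

arith_decline <- pct_change(-50, span = 50, option = "arithmetic")
geom_decline  <- pct_change(-50, span = 50, option = "geometric")
geom_increase <- pct_change(50, span = 50, option = "geometric")

print(c(arith_decline, geom_decline, geom_increase))
```

Changing n_years here plays the role of the NYears argument, for example n_years = 1 would give an annual rate under the same assumptions.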

Value

Results are saved to file and most are returned in a list to R.

The list object returned is comprised of the following:

$paths: This list of file paths provides the locations of the raw data files for $log, $stat, $freq and $trend, in that order
$trend: This dataframe provides the list of time factors for each species

-	`Species`	Name of species
-	`Time`	Time period, specified as a class (e.g. 1970); times need not be numeric and are indexed as character strings
-	`TFactor`	Time factor, the estimated relative frequency of species at the time
-	`St_Dev`	Standard deviation of the time factor, given that Xspt (defined below) is a weighted sum of binomial variates
-	`X`	Number of occurrences of species at the time period
-	`Xspt`	Number of occurrences, given reduced weight of locations having very low sampling effort
-	`Xest`	Estimated number of occurrences; this should be equal to Xspt if the algorithm has converged
-	`N>0.00`	Number of locations with non-zero probability of the species occurring
-	`N>0.98`	Number of locations for which the probability of occurrence was estimated as greater than 0.98

$stat: Location report

-	`Location`	Name of location; in this case locations are hectads of the GB National Grid
-	`Loc_no`	Numbering (added) of locations in alphanumeric order
-	`No_spp`	Number of species recorded at that location (the actual count, which may be zero)
-	`Phi_in`	Initial value of phi, the frequency-weighted mean frequency
-	`Alpha`	Sampling effort multiplier (to achieve standard value of phi)
-	`Wgt_n2`	Effective number N2 for the neighbourhood weights; this is small if there are few floristically similar hectads close to the target hectad. It is (sum weights)^2 / (sum weights^2)
-	`Phi_out`	Value of phi after rescaling; constant, if the algorithm has converged
-	`Spnum_in`	Sum of neighbourhood frequencies before rescaling
-	`Spnum_out`	Estimated species richness, i.e. sum of neighbourhood frequencies after rescaling
-	`Iter`	Number of iterations for algorithm to converge
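
As a small worked example of the Wgt_n2 formula above, the sketch below computes (sum weights)^2 / (sum weights^2) for two made-up weight vectors. The weights and the helper name effective_n2() are illustrative assumptions, not values or functions taken from Frescalo.

```r
# Effective number N2 for a vector of neighbourhood weights
effective_n2 <- function(w) sum(w)^2 / sum(w^2)

# Made-up weights for one target hectad
w_even   <- rep(1, 10)            # ten equally weighted neighbours
w_skewed <- c(1, rep(0.01, 9))    # one dominant neighbour, nine negligible

n2_even   <- effective_n2(w_even)     # equals the number of neighbours
n2_skewed <- effective_n2(w_skewed)   # only slightly above 1

print(c(n2_even, n2_skewed))
```

Equal weights give an effective number equal to the neighbour count, while a single dominant weight pulls it down towards 1, which is why a small Wgt_n2 signals a poorly supported neighbourhood.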

$freq: Listing of rescaled species frequencies

-	`Location`	Name of location
-	`Species`	Name of species
-	`Pres`	Record of species in location (1 = recorded, 0 = not recorded)
-	`Freq`	Frequency of species in neighbourhood of location
-	`Freq_1`	Estimated probability of occurrence, i.e. frequency of species after rescaling
-	`SD_Frq1`	Standard error of Freq_1, calculated on the assumption that Freq is a binomial variate with standard error sqrt(Freq*(1-Freq)/Wgt_n2), where Wgt_n2 is as defined under $stat above
-	`Rank`	Rank of frequency in neighbourhood of location
-	`Rank_1`	Rescaled rank, defined as Rank/Estimated species richness

$log: This records all the output sent to the console when running frescalo
$lm_stats: The results of linear modelling of TFactors

-	`SPECIES`	Name of species used internally by frescalo
-	`NAME`	Name of species as appears in raw data
-	`b`	The slope of the model
-	`a`	The intercept
-	`b_std_err`	Standard error of the slope
-	`b_tval`	t-value for a test of significance of the slope
-	`b_pval`	p-value for a test of significance of the slope
-	`a_std_err`	Standard error of the intercept
-	`a_tval`	t-value for a test of significance of the intercept
-	`a_pval`	p-value for a test of significance of the intercept
-	`adj_r2`	Adjusted R-squared of the model
-	`r2`	R-squared of the model
-	`F_val`	F-value of the model
-	`F_num_df`	Numerator degrees of freedom from the F-statistic (the degrees of freedom of the model)
-	`F_den_df`	Denominator degrees of freedom from the F-statistic
-	`Ymin`	The earliest year in the dataset
-	`Ymax`	The latest year in the dataset
-	`change_...`	The percentage change dependent on the values given to `trend_option` and `NYears`.
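
To show where statistics of this kind come from, the sketch below fits an ordinary linear model to made-up time factors with base R's lm() and pulls out quantities matching several of the columns above. The data are invented, and the assumption that an unweighted regression of TFactor on time is used is mine; the package's internal model may differ in detail.

```r
# Made-up time factors for one species across four time periods
tf <- data.frame(
  Time    = c(1975, 1985, 1995, 2005),
  TFactor = c(0.80, 0.72, 0.61, 0.55)
)

fit <- lm(TFactor ~ Time, data = tf)
s   <- summary(fit)
co  <- coef(s)  # rows: (Intercept), Time; columns: estimate, SE, t, p

lm_row <- data.frame(
  b         = co["Time", "Estimate"],
  a         = co["(Intercept)", "Estimate"],
  b_std_err = co["Time", "Std. Error"],
  b_tval    = co["Time", "t value"],
  b_pval    = co["Time", "Pr(>|t|)"],
  adj_r2    = s$adj.r.squared,
  r2        = s$r.squared,
  F_val     = s$fstatistic[["value"]],
  F_num_df  = s$fstatistic[["numdf"]],
  F_den_df  = s$fstatistic[["dendf"]]
)

print(lm_row)
```

With four time periods and one predictor, the F-statistic has 1 numerator and 2 denominator degrees of freedom, which shows how few points typically underlie these tests.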

The following columns are only produced when there are exactly two time periods:

-	`Z_VAL`	Z-value for the significance test of the trend
-	`SIG_95`	A logical statement indicating if the trend is significant (TRUE) or non-significant (FALSE)
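
One common way to test a difference between two estimates with known standard deviations is a z-test, sketched below with made-up numbers. That this is the formula behind Z_VAL and SIG_95 is an assumption on my part, so treat it as an illustration of the idea and not as the package's exact calculation.

```r
# Made-up time factors and standard deviations for two time periods
tf1 <- 0.80; sd1 <- 0.05   # first period
tf2 <- 0.60; sd2 <- 0.06   # second period

# z-score for the difference, assuming the two estimates are independent
z_val <- (tf2 - tf1) / sqrt(sd1^2 + sd2^2)

# Two-sided test at the 95% level
sig_95 <- abs(z_val) > qnorm(0.975)

print(list(Z_VAL = z_val, SIG_95 = sig_95))
```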

References

Hill, Mark (2011). Local frequency as a key to interpreting species occurrence data when recording effort is not known. Methods in Ecology and Evolution, 3 (1), 195-205.

Examples

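if (FALSE) {
# Load the package and the example data
library(sparta)
data(unicorns)

# Run Frescalo over two non-overlapping time periods.
# frespath must point to the frescalo .exe file (Windows only).
fres_out <- frescalo(Data = unicorns,
                     frespath = file.path(getwd(), "frescalo.exe"),
                     time_periods = data.frame(start = c(1980, 1990),
                                               end = c(1989, 1999)),
                     site_col = 'site',
                     sp_col = 'species',
                     start_col = 'start_date',
                     end_col = 'end_date')
}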
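
Because time_periods must not contain overlapping periods, it can help to check the data frame before passing it in. The sketch below builds the same two periods used in the example above and runs a simple base R check; the column names start and end and the check itself are illustrative assumptions, not something frescalo requires or performs in this form.

```r
# Two time periods: start year in the first column, end year in the second
time_periods <- data.frame(start = c(1980, 1990),
                           end   = c(1989, 1999))

# Sort by start year so that consecutive rows can be compared
tp <- time_periods[order(time_periods$start), ]

# Each period must end no earlier than it starts
valid_order <- all(tp$start <= tp$end)

# Each period must start after the previous one has ended
no_overlap <- all(tp$start[-1] > tp$end[-nrow(tp)])

stopifnot(valid_order, no_overlap)
```

If either condition fails, stopifnot() raises an error naming the failed check, which is easier to diagnose than an unexpected result further down the analysis.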