A function for using Frescalo (Hill, 2011), a tool for analysing occurrence data when
recording effort is not known. This function returns the output from Frescalo to the
R session and saves it to the path specified by sinkdir. By setting plot_fres to TRUE,
maps of the results will also be saved. Plotting the returned object gives a useful summary.
Usage
frescalo(
Data,
frespath,
time_periods,
site_col,
sp_col,
year_col = NULL,
start_col = NULL,
end_col = NULL,
species_to_include = NULL,
sinkdir = NULL,
plot_fres = FALSE,
Fres_weights = "LCGB",
non_benchmark_sp = NULL,
fres_site_filter = NULL,
phi = 0.74,
alpha = 0.27,
trend_option = "arithmetic",
NYears = 10,
ignore.ireland = F,
ignore.channelislands = F
)
Arguments
- Data
A dataframe object. This should consist of rows of observations and columns indicating the species and location as well as either the year of the observation or columns specifying the start and end dates of the observation. It is important that date columns are in a date format.
- frespath
the path to the frescalo .exe file. This can be downloaded from http://www.brc.ac.uk/biblio/frescalo-computer-program-analyse-your-biological-records. It is currently only available for Windows. The directory where the .exe is saved should be writeable.
- time_periods
A dataframe object with two columns. The first column contains the start year of each time period and the second column contains the end year of each time period. Time periods should not overlap.
- site_col
The name of the site column in Data
- sp_col
The name of the species column in Data
- year_col
The name of the year column in Data
- start_col
The name of the start date column in Data
- end_col
The name of the end date column in Data
- species_to_include
Optionally a character vector listing the names of species to be used. Species not in your list are ignored. This is useful if you are only interested in a subset of species.
- sinkdir
String giving the output directory for results
- plot_fres
Logical, if TRUE maps are produced by Frescalo. Default is FALSE. CURRENTLY ONLY WORKS FOR UK GRID-REFERENCE DATA
- Fres_weights
'LC*' specifies a weights file based on landcover data. The suffix specifies the extent ('LCUK', 'LCNI' or 'LCGB'). 'VP' uses a weights file based on vascular plant data for the UK. Both are included in the package. Alternatively, a custom weights file can be given as a data.frame. This must have three columns: target cell, neighbour cell, weight. Default is 'LCGB'
- non_benchmark_sp
A character vector giving the names of species not to be used as benchmarks in Frescalo. Default is NULL and all species are used. See Hill (2011) for reasons why some species may not be suitable benchmarks.
- fres_site_filter
Optionally a character vector giving the names of sites to be used in the trend analysis. Sites not included in this list are not used for estimating TFactors. Default is NULL and all sites are used.
- phi
Target frequency of frequency-weighted mean frequency. Default is 0.74 as in Hill (2011). If set to NULL, phi will start at 0.74 and will be increased if the value is smaller than the 98.5 percentile of input phi, limited to a maximum of 0.95.
- alpha
The proportion of the expected number of species in a cell to be treated as benchmarks. Default is 0.27 as in Hill (2011). This is limited to 0.08 to 0.50.
- trend_option
Set the method by which you wish to calculate percentage change. This can currently be set to either 'arithmetic' (default) or 'geometric'. Arithmetic calculates percentage change in a linear fashion, such that an increase of 50% over 50 years is equal to 10% in 10 years. Using the same example, a geometric trend would be roughly 8.4% every 10 years, as this works on a compound rate.
- NYears
The number of years over which you want the percentage change to be calculated (i.e. 10 gives a decadal change). Default = 10
- ignore.ireland
Logical, if TRUE Irish hectads are removed. Default is FALSE
- ignore.channelislands
Logical, if TRUE Channel Island hectads are removed. Default is FALSE
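The inputs described above can be prepared along the following lines. This is a minimal sketch rather than an official example: the column names, site codes, weights and the path to the executable are all made up for illustration, and `periods_overlap_free()` is a hypothetical helper that is not part of the package. The call to `frescalo()` is only attempted if the function and the executable are actually available.

```r
# Hypothetical occurrence records: one row per observation.
# The column names "site", "species" and "year" are assumptions.
set.seed(1)
records <- data.frame(
  site    = sample(c("SP50", "SP51", "SP60"), 30, replace = TRUE),
  species = sample(c("sp_a", "sp_b", "sp_c"), 30, replace = TRUE),
  year    = sample(1980:2009, 30, replace = TRUE),
  stringsAsFactors = FALSE
)

# time_periods: first column is the start year, second the end year.
time_periods <- data.frame(start = c(1980, 1990, 2000),
                           end   = c(1989, 1999, 2009))

# Hypothetical helper: TRUE if no two time periods overlap.
periods_overlap_free <- function(tp) {
  tp <- tp[order(tp[[1]]), ]
  n <- nrow(tp)
  if (n < 2) return(TRUE)
  # Each period must start after the previous one has ended.
  all(tp[[1]][-1] > tp[[2]][-n])
}
stopifnot(periods_overlap_free(time_periods))

# Custom weights: target cell, neighbour cell, weight (made-up values).
custom_weights <- data.frame(
  target    = c("SP50", "SP50", "SP51"),
  neighbour = c("SP50", "SP51", "SP51"),
  weight    = c(1, 0.5, 1)
)

# Placeholder location for the Frescalo executable (an assumption).
frespath <- file.path(tempdir(), "Frescalo.exe")

# Only run if both the function and the executable are present.
if (exists("frescalo") && file.exists(frespath)) {
  res <- frescalo(Data = records, frespath = frespath,
                  time_periods = time_periods,
                  site_col = "site", sp_col = "species", year_col = "year",
                  Fres_weights = custom_weights, sinkdir = tempdir())
}
```

A data set this small is unlikely to give meaningful results. The point is only the shape of each input.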
Value
Results are saved to file and most are returned in a list to R.
The list object returned is comprised of the following:
- $paths
This list of file paths provides the locations of the raw data files for $log, $stat, $freq and $trend, in that order
- $trend
This dataframe provides the list of time factors for each species
- | Species | Name of species |
- | Time | Time period, specified as a class (e.g. 1970); times need not be numeric and are indexed as character strings |
- | TFactor | Time factor, the estimated relative frequency of species at the time |
- | St_Dev | Standard deviation of the time factor, given that Xspt (defined below) is a weighted sum of binomial variates |
- | X | Number of occurrences of species at the time period |
- | Xspt | Number of occurrences, given reduced weight of locations having very low sampling effort |
- | Xest | Estimated number of occurrences; this should be equal to Xspt if the algorithm has converged |
- | N>0.00 | Number of locations with non-zero probability of the species occurring |
- | N>0.98 | Number of locations for which the probability of occurrence was estimated as greater than 0.98 |
- $stat
Location report
- | Location | Name of location; in this case locations are hectads of the GB National Grid |
- | Loc_no | Numbering (added) of locations in alphanumeric order |
- | No_spp | Number of species at that location; the actual number which may be zero |
- | Phi_in | Initial value of phi, the frequency-weighted mean frequency |
- | Alpha | Sampling effort multiplier (to achieve standard value of phi) |
- | Wgt_n2 | effective number N2 for the neighbourhood weights; this is small if there are few floristically similar hectads close to the target hectad. It is (sum weights)^2 / (sum weights^2) |
- | Phi_out | Value of phi after rescaling; constant, if the algorithm has converged |
- | Spnum_in | Sum of neighbourhood frequencies before rescaling |
- | Spnum_out | Estimated species richness, i.e. sum of neighbourhood frequencies after rescaling |
- | Iter | Number of iterations for algorithm to converge |
- $freq
Listing of rescaled species frequencies
- | Location | Name of location |
- | Species | Name of species |
- | Pres | Record of species in location (1 = recorded, 0 = not recorded) |
- | Freq | Frequency of species in neighbourhood of location |
- | Freq_1 | Estimated probability of occurrence, i.e. frequency of species after rescaling |
- | SD_Frq1 | Standard error of Freq_1, calculated on the assumption that Freq is a binomial variate with standard error sqrt(Freq*(1-Freq)/ Wgt_n2), where Wgt_n2 is as defined under $stat |
- | Rank | Rank of frequency in neighbourhood of location |
- | Rank_1 | Rescaled rank, defined as Rank/Estimated species richness |
- $log
This records all the output sent to the console when running frescalo
- $lm_stats
The results of linear modelling of TFactors
- | SPECIES | Name of species used internally by frescalo |
- | NAME | Name of species as appears in raw data |
- | b | The slope of the model |
- | a | The intercept |
- | b_std_err | Standard error of the slope |
- | b_tval | t-value for a test of significance of the slope |
- | b_pval | p-value for a test of significance of the slope |
- | a_std_err | Standard error of the intercept |
- | a_tval | t-value for a test of significance of the intercept |
- | a_pval | p-value for a test of significance of the intercept |
- | adj_r2 | Adjusted R-squared of the model |
- | r2 | R-squared of the model |
- | F_val | F-value of the model |
- | F_num_df | Degrees of freedom of the model |
- | F_den_df | Denominator degrees of freedom from the F-statistic |
- | Ymin | The earliest year in the dataset |
- | Ymax | The latest year in the dataset |
- | change_... | The percentage change, dependent on the values given to trend_option and NYears. |
The following columns are only produced when there are only two time periods
- | Z_VAL | Z-value for the significance test of the trend |
- | SIG_95 | A logical statement indicating if the trend is significant (TRUE) or non-significant (FALSE) |
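Two of the quantities described above can be reproduced with a few lines of R: the effective number N2 reported as Wgt_n2, and the arithmetic and geometric percentage change controlled by trend_option and NYears. The sketch below is illustrative only. `effective_n2()` and `pct_change()` are hypothetical helpers written for this example, and the package's internal calculation may differ in detail.

```r
# Effective number N2 of a set of neighbourhood weights:
# (sum of weights)^2 / (sum of squared weights).
effective_n2 <- function(w) sum(w)^2 / sum(w^2)

# Percentage change over `n_years`, given values at the start and end
# of a span of `span` years. Illustrative only, not the package's code.
pct_change <- function(y_start, y_end, span, n_years = 10,
                       option = c("arithmetic", "geometric")) {
  option <- match.arg(option)
  if (option == "arithmetic") {
    # Linear: total proportional change shared equally across years.
    100 * ((y_end - y_start) / y_start) * (n_years / span)
  } else {
    # Compound: constant rate that reproduces the total change.
    100 * ((y_end / y_start)^(n_years / span) - 1)
  }
}

effective_n2(c(1, 1, 1, 1))        # four equal weights give N2 = 4
effective_n2(c(1, 0.1, 0.1, 0.1))  # one dominant weight gives a small N2

# A 50% increase over 50 years, expressed per decade:
pct_change(1, 1.5, span = 50, option = "arithmetic")  # 10
pct_change(1, 1.5, span = 50, option = "geometric")   # about 8.45
```

N2 equals the number of weights when they are all equal and shrinks towards 1 as a single weight dominates, which is why a small Wgt_n2 indicates few similar neighbours.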