frescalo.Rd
A function for using Frescalo (Hill, 2011), a tool for analysing occurrence data when
recording effort is not known. This function returns the output from Frescalo to the
R session and saves it to the path specified by sinkdir. By setting plot_fres to TRUE,
maps of the results will also be saved. Plotting the returned object gives a useful summary.
frescalo(Data, frespath, time_periods, site_col, sp_col, year_col = NULL, start_col = NULL, end_col = NULL, species_to_include = NULL, sinkdir = NULL, plot_fres = FALSE, Fres_weights = "LCGB", non_benchmark_sp = NULL, fres_site_filter = NULL, phi = 0.74, alpha = 0.27, trend_option = "arithmetic", NYears = 10, ignore.ireland = F, ignore.channelislands = F)
Data | A dataframe object. This should consist of rows of observations and columns indicating the species and location as well as either the year of the observation or columns specifying the start and end dates of the observation. It is important that date columns are in a date format. |
---|---|
frespath | the path to the frescalo .exe file. This can be downloaded from http://www.brc.ac.uk/biblio/frescalo-computer-program-analyse-your-biological-records. It is currently only available for Windows. The directory where the .exe is saved should be writeable. |
time_periods | A dataframe object with two columns. The first column contains the start year of each time period and the second column contains the end year of each time period. Time periods should not overlap. |
site_col | The name of the site column in Data |
sp_col | The name of the species column in Data |
year_col | The name of the year column in Data |
start_col | The name of the start date column in Data |
end_col | The name of the end date column in Data |
species_to_include | Optionally a character vector listing the names of species to be used. Species not in your list are ignored. This is useful if you are only interested in a subset of species. |
sinkdir | String giving the output directory for results |
plot_fres | Logical, if TRUE maps of the results are also saved. Default is FALSE |
Fres_weights | 'LC*' specifies a weights file based on landcover data. The suffix specifies the extent ('LCUK', 'LCNI' or 'LCGB'). 'VP' uses a weights file based on vascular plant data for the UK. Both are included in the package. Alternatively, a custom weights file can be given as a data.frame. This must have three columns: target cell, neighbour cell, weight. Default is 'LCGB' |
non_benchmark_sp | A character vector giving the names of species not to be used as benchmarks in Frescalo. Default is NULL |
fres_site_filter | Optionally a character vector giving the names of sites to be used in the trend analysis. Sites not included in this list are not used for estimating TFactors. Default is NULL |
phi | Target frequency of frequency-weighted mean frequency. Default is 0.74 as in
Hill (2011). If set to |
alpha | the proportion of the expected number of species in a cell to be treated as benchmarks. Default is 0.27 as in Hill (2011). This is limited to 0.08 to 0.50. |
trend_option | Set the method by which you wish to calculate percentage change. This can currently
be set to either |
NYears | The number of years over which you want the percentage change to be calculated (i.e. 10 gives a decadal change). Default = 10 |
ignore.ireland | Logical, if |
ignore.channelislands | Logical, if |
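
As a sketch of how the time_periods argument might be prepared before calling frescalo, the snippet below builds the two-column data frame described above and checks that no periods overlap. The helper `periods_overlap` is hypothetical and not part of the package; it only illustrates the "time periods should not overlap" requirement.

```r
# Two time periods: first column = start year, second column = end year
time_periods <- data.frame(start = c(1980, 1990),
                           end   = c(1989, 1999))

# Hypothetical helper (not part of the package): TRUE if any periods overlap
periods_overlap <- function(tp) {
  tp <- tp[order(tp[[1]]), ]   # sort periods by start year
  n <- nrow(tp)
  if (n < 2) return(FALSE)
  # a period overlaps its predecessor if it starts on or before that one's end
  any(tp[[1]][-1] <= tp[[2]][-n])
}

periods_overlap(time_periods)
```

A check like this can be run before the call so that an invalid set of periods is caught early rather than inside the Frescalo run.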
Results are saved to file and most are returned in a list to R.
The list object returned is comprised of the following:
This list of file paths provides the locations of the raw data files for $log, $stat, $freq and $trend, in that order
This dataframe provides the list of time factors for each species

Species | Name of species |
---|---|
Time | Time period, specified as a class (e.g. 1970); times need not be numeric and are indexed as character strings |
TFactor | Time factor, the estimated relative frequency of species at the time |
St_Dev | Standard deviation of the time factor, given that spt (defined below) is a weighted sum of binomial variates |
X | Number of occurrences of species at the time period |
Xspt | Number of occurrences, given reduced weight of locations having very low sampling effort |
Xest | Estimated number of occurrences; this should be equal to spt if the algorithm has converged |
N>0.00 | Number of locations with non-zero probability of the species occurring |
N>0.98 | Number of locations for which the probability of occurrence was estimated as greater than 0.98 |

Location report

Location | Name of location; in this case locations are hectads of the GB National Grid |
---|---|
Loc_no | Numbering (added) of locations in alphanumeric order |
No_spp | Number of species at that location; the actual number, which may be zero |
Phi_in | Initial value of phi, the frequency-weighted mean frequency |
Alpha | Sampling effort multiplier (to achieve standard value of phi) |
Wgt_n2 | Effective number N2 for the neighbourhood weights; this is small if there are few floristically similar hectads close to the target hectad. It is (sum weights)^2 / (sum weights^2) |
Phi_out | Value of phi after rescaling; constant, if the algorithm has converged |
Spnum_in | Sum of neighbourhood frequencies before rescaling |
Spnum_out | Estimated species richness, i.e. sum of neighbourhood frequencies after rescaling |
Iter | Number of iterations for algorithm to converge |

Listing of rescaled species frequencies

Location | Name of location |
---|---|
Species | Name of species |
Pres | Record of species in location (1 = recorded, 0 = not recorded) |
Freq | Frequency of species in neighbourhood of location |
Freq_1 | Estimated probability of occurrence, i.e. frequency of species after rescaling |
SD_Frq1 | Standard error of Freq_1, calculated on the assumption that Freq is a binomial variate with standard error sqrt(Freq*(1-Freq)/ Wgt_n2), where Wgt_n2 is as defined for samples.txt in section (b) |
Rank | Rank of frequency in neighbourhood of location |
Rank_1 | Rescaled rank, defined as Rank/Estimated species richness |

This records all the output sent to the console when running frescalo
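
The two formulas quoted above, for Wgt_n2 and for the binomial standard error of Freq, can be illustrated with a small sketch. The weights and frequency are made-up values and the function names `effective_n2` and `binomial_se` are hypothetical; the code only evaluates the expressions as written in the tables and is not taken from Frescalo itself.

```r
# Effective number of neighbours: (sum of weights)^2 / (sum of squared weights)
effective_n2 <- function(w) sum(w)^2 / sum(w^2)

# Binomial standard error of a neighbourhood frequency: sqrt(Freq*(1-Freq)/Wgt_n2)
binomial_se <- function(freq, n2) sqrt(freq * (1 - freq) / n2)

w  <- c(1, 1, 1, 1)         # made-up, equal neighbourhood weights
n2 <- effective_n2(w)       # with equal weights N2 equals the number of neighbours
se <- binomial_se(0.5, n2)  # standard error for a made-up frequency of 0.5
```

When one neighbour dominates the weights, `effective_n2` falls towards 1, which in turn inflates the standard error.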
The results of linear modelling of TFactors

SPECIES | Name of species used internally by frescalo |
---|---|
NAME | Name of species as appears in raw data |
b | The slope of the model |
a | The intercept |
b_std_err | Standard error of the slope |
b_tval | t-value for a test of significance of the slope |
b_pval | p-value for a test of significance of the slope |
a_std_err | Standard error of the intercept |
a_tval | t-value for a test of significance of the intercept |
a_pval | p-value for a test of significance of the intercept |
adj_r2 | Adjusted R-squared of the model |
r2 | R-squared of the model |
F_val | F-value of the model |
F_num_df | Numerator degrees of freedom from the F-statistic |
F_den_df | Denominator degrees of freedom from the F-statistic |
Ymin | The earliest year in the dataset |
Ymax | The latest year in the dataset |
change_... | The percentage change dependent on the values given to trend_option and NYears. |
Z_VAL | Z-value for the significance test of the trend |
SIG_95 | A logical statement indicating if the trend is significant (TRUE) or non-significant (FALSE) |

Hill, Mark. Local frequency as a key to interpreting species occurrence data when recording effort is not known. 2011. Methods in Ecology and Evolution, 3 (1), 195-205.
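
To show where the slope, intercept, R-squared and F-statistic columns listed for the linear modelling of TFactors come from, here is a minimal sketch using base R's `lm()`. The time factors are made-up values, and regressing TFactor on a single representative year per time period is an assumption; the exact model fitted inside frescalo may differ.

```r
# Made-up time factors for one species over four time periods
tf <- data.frame(year    = c(1975, 1985, 1995, 2005),
                 TFactor = c(0.80, 0.95, 0.98, 1.12))

fit <- lm(TFactor ~ year, data = tf)   # ordinary least-squares fit
s   <- summary(fit)
co  <- s$coefficients                  # rows: (Intercept), year

b         <- co["year", "Estimate"]          # slope
a         <- co["(Intercept)", "Estimate"]   # intercept
b_std_err <- co["year", "Std. Error"]        # standard error of the slope
r2        <- s$r.squared                     # R-squared
adj_r2    <- s$adj.r.squared                 # adjusted R-squared
F_num_df  <- unname(s$fstatistic["numdf"])   # numerator degrees of freedom
F_den_df  <- unname(s$fstatistic["dendf"])   # denominator degrees of freedom
```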
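# NOT RUN {
# Load data
data(unicorns)

# Run frescalo (data are saved to the working directory as sinkdir is not given)
# frespath has no default; the value below is a placeholder to be replaced
# with the location of your Frescalo executable
fres_out <- frescalo(Data = unicorns,
                     frespath = 'path/to/Frescalo.exe',
                     time_periods = data.frame(start = c(1980, 1990),
                                               end = c(1989, 1999)),
                     site_col = 'hectad',
                     sp_col = 'CONCEPT',
                     start_col = 'TO_STARTDATE',
                     end_col = 'Date')
# }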