| Title: | An Extreme 'vegan' Package of Experimental Code |
|---|---|
| Description: | Random code that is too experimental or too weird to be included in the vegan package. |
| Authors: | Jari Oksanen [cre, aut] |
| Maintainer: | Jari Oksanen <[email protected]> |
| License: | MIT + file LICENCE |
| Version: | 0.3 |
| Built: | 2026-06-03 06:50:53 UTC |
| Source: | https://github.com/jarioksa/natto |
Function casts a Delaunay triangulation from
deldir (deldir package) to a
dist object, with numeric values for distances
between neighbours and NA for other distances.
## S3 method for class 'deldir' as.dist(m, diag = FALSE, upper = FALSE)## S3 method for class 'deldir' as.dist(m, diag = FALSE, upper = FALSE)
m |
A |
diag |
logical value indicating whether the diagonal of the distance
matrix should be printed by |
upper |
logical value indicating whether the upper triangle of the
distance matrix should be printed by |
A "dist" object with numeric value for distances
between neighbours and NA for other distances.
Jari Oksanen
Function casts a result object of unconstrained
rda to a prcomp result object.
as.prcomp(x) ## S3 method for class 'rda' as.prcomp(x)as.prcomp(x) ## S3 method for class 'rda' as.prcomp(x)
x |
An unconstrained |
A prcomp result object.
Jari Oksanen
Jaccard dissimilarity for presence/absence data can be seen as the mode of Beta distributed variate. The function calculates the dissimilarity as a random sample of Beta distribution, or alternatively as its expected value or mode.
bayesjaccard(x, method = c("rbeta", "expected", "mode"))bayesjaccard(x, method = c("rbeta", "expected", "mode"))
x |
Community data, will be treated as binary. |
method |
Dissimilarity as a random sample from Beta distribution or as its expected value or mode. |
In often-used 2x2 contingency table notation, is the number
of species shared by two compared communities, and and
are the numbers of species occurring only in one of the
compared communities. Assuming uniform prior in (0, 1), the species
will “sample” the dissimilarity of two communities with
posterior Beta(, ). This will have mode
and expected value . The
mode is directly the Jaccard dissimilarity for binary
(presence/absence) data, and the expected value adds 1 in numerator
and 2 in denominator from the prior. The importance of prior will
diminish when the number of species grows, but it will
protect from claiming complete identity or complete difference when
we only have a few species.
Function bayesjaccard estimates Jaccard dissimilarity as
Beta-distributed random variate, and will return random sample from
its posterior distribution. It can also return the expected value
or the mode which are constant in function calls.
The function is intended to be used to assess the effect of random
sampling variation in community analysis. The natto package
provided three examples of its application: clsupport
finds the “support” of branches in hierarchic clustering by
function hclust, bjNMDS the variation
of ordination scores in non-metric multidimensional scaling by
functions metaMDS and
monoMDS, and bjdbrda the
variation of ordination scores of constrained component of
distance-based RDA by function dbrda. All
these functions find the basic result from the expected value of
the dissimilarity, and produce a set of random samples from the
Beta distribution to assess the variation in the result.
Dissimilarity object inheriting from classes "dist"
and "designdist".
Jari Oksanen
Function is a wrapper to designdist.
data(spurn) ## the effect of prior plot(bayesjaccard(spurn, "mode"), bayesjaccard(spurn, "expected")) abline(0, 1, col=2) ## one random sample of dissimilarities plot(bayesjaccard(spurn, "expected"), bayesjaccard(spurn), asp=1) abline(0, 1, col=2)data(spurn) ## the effect of prior plot(bayesjaccard(spurn, "mode"), bayesjaccard(spurn, "expected")) abline(0, 1, col=2) ## one random sample of dissimilarities plot(bayesjaccard(spurn, "expected"), bayesjaccard(spurn), asp=1) abline(0, 1, col=2)
Habitat types of Barro Colorado Island Forest Dynamics Plots classified in 25 grid cells in each 1-ha plot (Harms et al. 2001).
data("BCI.env2")data("BCI.env2")
A data frame with 50 observations on the area (in ha) of following habitat types:
YoungYoung forests of ca. 100 years.
StreamStreamside vegetation.
SwampSwamp.
OldHighOld forests on flat surface above 152 m altitude.
OldLowOld forests on flat surface at or below 152 m altitude.
OldSlopeOld forests on steep slopes (> 7 degrees).
MixedMixed habitat.
The data set provides habitat type classification for the Barro
Colorado Island data BCI in the vegan
package. The vegan package provides alternative environmental
data BCI.env which has variables derived from
this data set: Dominant habitat type excluding Stream,
presence of Stream grid cells in the plot, and habitat type
diversity of each plot (see Examples).
The data were extracted from Harms et al. (2001: Fig. 1) who give a map of 50 forest plots divided into 25 grid cells, and the areas are expressed in ha (at 0.04 ha resolution).
Harms et al. (2001), Fig. 1.
Harms K.E., Condit R., Hubbell S.P. & Foster R.B. (2001) Habitat associations of trees and shrubs in a 50-ha neotropical forest plot. J. Ecol. 89, 947–959.
## Derive variables in BCI.env data in vegan data(BCI.env2) ## 1. The dominant habitat type in each plot k <- apply(BCI.env2[,-2], 1, which.max) Habitat <- (names(BCI.env2)[-2])[k] ## 2. Presence of streamside habitats in the plot Stream <- with(BCI.env2, ifelse(Stream > 0, "Yes", "No")) ## 3. Habitat type diversity (needs vegan) if (require(vegan)) { EnvHet <- diversity(BCI.env2, "simpson") } ## The combined variables in BCI.env and the original area variables in ## BCI.env2 give nearly equal constraints. if (require(vegan)) { data(BCI) ## habitat areas per 1-ha plot ord <- cca(BCI ~ ., BCI.env2) ## dominant habitat type of each 1-ha plot ord0 <- cca(BCI ~ Habitat) plot(procrustes(ord, ord0)) }## Derive variables in BCI.env data in vegan data(BCI.env2) ## 1. The dominant habitat type in each plot k <- apply(BCI.env2[,-2], 1, which.max) Habitat <- (names(BCI.env2)[-2])[k] ## 2. Presence of streamside habitats in the plot Stream <- with(BCI.env2, ifelse(Stream > 0, "Yes", "No")) ## 3. Habitat type diversity (needs vegan) if (require(vegan)) { EnvHet <- diversity(BCI.env2, "simpson") } ## The combined variables in BCI.env and the original area variables in ## BCI.env2 give nearly equal constraints. if (require(vegan)) { data(BCI) ## habitat areas per 1-ha plot ord <- cca(BCI ~ ., BCI.env2) ## dominant habitat type of each 1-ha plot ord0 <- cca(BCI ~ Habitat) plot(procrustes(ord, ord0)) }
Classification of BCI species based on
APG IV (2016) and Chase & Reveal (2009).
data("BCI.taxon")data("BCI.taxon")
A data frame with 225 observations on the following 7 variables.
genusGenus.
familyAPG IV family.
orderAPG IV order.
superorderSuperorder (Chase & Reveal, 2009).
unrankEither eudicotyledons,
Magnoliidae or monocotyledons which correspond to
traditional major clades, but have no formal status in the
classification of the subclass of Angiosperms.
APG IV [Angiosperm Phylogeny Group] (2016) An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG IV. Bot. J. Linnean Soc. 181: 1–20.
Chase, M.W. & Reveal, J. L. (2009) A phylogenetic classification of the land plants to accompany APG III. Bot. J. Linnean Soc. 161: 122–127.
data(BCI.taxon)data(BCI.taxon)
Function performs distance-based RDA on expected Beta Jaccard dissimilarities, and then reruns the analysis on given number of random Beta Jaccard dissimilarities.
bjdbrda(formula, data, n = 100, ...) ## S3 method for class 'bjdbrda' scores( x, choices = 1:2, display = c("wa", "lc", "bp", "cn"), scaling = "species", const, expected = TRUE, ... ) ## S3 method for class 'bjdbrda' plot( x, choices = 1:2, wa = "p", lc = "n", cn = "hull", bp = "wedge", wa.par = list(), lc.par = list(), cn.par = list(), bp.par = list(), scaling = "species", type = "t", ... ) ## S3 method for class 'bjdbrda' boxplot( x, kind = c("eigen", "correlation"), points = "red", pch = 16, xlab = "Axis", ylab, ... )bjdbrda(formula, data, n = 100, ...) ## S3 method for class 'bjdbrda' scores( x, choices = 1:2, display = c("wa", "lc", "bp", "cn"), scaling = "species", const, expected = TRUE, ... ) ## S3 method for class 'bjdbrda' plot( x, choices = 1:2, wa = "p", lc = "n", cn = "hull", bp = "wedge", wa.par = list(), lc.par = list(), cn.par = list(), bp.par = list(), scaling = "species", type = "t", ... ) ## S3 method for class 'bjdbrda' boxplot( x, kind = c("eigen", "correlation"), points = "red", pch = 16, xlab = "Axis", ylab, ... )
formula, data
|
Model definition of type |
n |
Number of Jaccard dissimilarity matrices sampled from Beta distribution. |
... |
Other parameters passed to functions; passed
|
x |
|
choices |
Selected ordination axes. |
display, scaling, const
|
Kind of scores, the scaling of scores
(axes), and scaling constant with similar definitions as in
|
expected |
Return scores of the expected ordination instead of ordination based on random samples from Jaccard dissimilarity. |
wa, lc, cn, bp
|
Display of corresponding scores. |
wa.par, lc.par, cn.par, bp.par
|
List of arguments to modify the plotting parameters of the corresponding shape. |
type |
Add |
kind |
Draw boxplots of eigenvalues or pairwise axis correlations for each random sample. |
points, pch
|
Add points of given color and shape to the eigenvalue boxplot. |
xlab, ylab
|
Change labelling of axes in boxplot. |
Function works only with constrained and partial constrained
ordination, and only saves the randomized results of constrained
ordination. To maintain the eigenvalues, function does not rotate
the random results to the reference ordination, but it will reflect
axes with reversed signs. The eigenvalues and magnitudes of
axiswise correlations to the reference ordination can be inspected
with boxplot.
The display type in plot can be specified independently to
each type of score (or the score can be skipped). The default
display can be modified using a similarly named list of graphical
argument values that will replace the defaults. For instance, to
display linear combination scores as convex hull, use lc =
"hull", and modify its parameters with list lc.par. For
available graphical parameters, see bjpolygon for
filled shapes, bjstars for stars,
points for points, and ordilabel
for text. Alternatives "p" and "t" will only show the
scores of the reference solution.
Function returns dbrda result object amended
with item BayesJaccard that is a list of randomized
results with following items for each random sample:
tot.chi total inertia
eig eigenvalues of constrained axes
r absolute value of correlation coefficient of
randomized axis and the reference axis
u, wa, biplot, centroids ordination scores for linear
combination scores, weighted averages scores, biplot scores and centroid
of factor constraints; these are unscaled and are usually accessed with
scores.bjdbrda for appropriate scaling
Jari Oksanen
if (require(vegan)) { data(dune, dune.env) m <- bjdbrda(dune ~ A1 + Management + Moisture, dune.env) ## eigenvalues and correlations showing axis stability boxplot(m) boxplot(m, kind = "correlation") ## Default plot plot(m) ## modify plot plot(m, cn = "star", wa = "ellipse", cn.par = list(col=gl(2,4)), wa.par=list(col = dune.env$Management, keep = 0.607)) }if (require(vegan)) { data(dune, dune.env) m <- bjdbrda(dune ~ A1 + Management + Moisture, dune.env) ## eigenvalues and correlations showing axis stability boxplot(m) boxplot(m, kind = "correlation") ## Default plot plot(m) ## modify plot plot(m, cn = "star", wa = "ellipse", cn.par = list(col=gl(2,4)), wa.par=list(col = dune.env$Management, keep = 0.607)) }
Function performs metaMDS on expected Beta
Jaccard dissimilarity with multiple starts and then a number of
monoMDS runs on random Beta Jaccard
dissimilarities starting from the expected configuration, and
Procrustes rotates the random solutions to the expected one.
bjNMDS( x, n = 100, trymax = 500, maxit = 1000, smin = 1e-04, sfgrmin = 1e-07, sratmax = 0.999999, parallel = 2, trace = FALSE, ... ) ## S3 method for class 'bjnmds' plot( x, choices = 1:2, kind = c("hull", "ellipse", "wedge", "star"), keep = 0.9, type = "t", ... )bjNMDS( x, n = 100, trymax = 500, maxit = 1000, smin = 1e-04, sfgrmin = 1e-07, sratmax = 0.999999, parallel = 2, trace = FALSE, ... ) ## S3 method for class 'bjnmds' plot( x, choices = 1:2, kind = c("hull", "ellipse", "wedge", "star"), keep = 0.9, type = "t", ... )
x |
Community data to be analysed and treated as binary
( |
n |
Number of random samples of Beta Distribution. |
trymax |
Maximum number of random starts in
|
maxit, smin, sfgrmin, sratmax
|
Convergence parameters in
|
parallel |
Number of parallel tries in |
trace |
Trace iterations in |
... |
Other parameters passed to functions. |
choices |
Axes to be plotted. |
kind |
Shape to be plotted to show the scatter of coordinates
of sampled distances; see |
keep |
Proportion of points to be enclosed by shape; see
|
type |
Type of the plot: |
Current function is designed for visual inspection of random
variation of ordination, and no numerical analysis functions are
(yet) available. The plot can show the scatter of points as
convex hulls or ellipsoid hulls containing a given proportion of
random points, or as stars that connect the expected point to
randomized ones. With option "wedge" the convex hull is
extended to include the observed point if necessary. See
bjpolygon and bjstars for technical
details.
Missing functionality includes adding species scores, adding fitted environmental vectors and factors and numeric summaries of the randomization results.
A metaMDS result object amended with item
rscores that is a three-dimensional array of coordinates
from random bayesjaccard ordinations.
Jari Oksanen
data(spurn) m <- bjNMDS(spurn) plot(m, keep = 0.607, col = "skyblue")data(spurn) m <- bjNMDS(spurn) plot(m, keep = 0.607, col = "skyblue")
These are low-level functions that are used by plot
functions to draw shapes enclosing a specified proportion of
randomized points, or connecting certain proportion of them to the
reference points. The functions are not usually called directly by
users, but the plots can be modified with described arguments.
bjpolygon( xarr, x0, keep = 0.9, kind = c("hull", "ellipse"), criterion = "area", linetopoint = TRUE, col = "gray", alpha = 0.3, observed = TRUE, type = c("t", "p", "n"), ... ) bjstars(xarr, x0, keep = 0.9, col = "gray", type = c("t", "p", "n"), ...)bjpolygon( xarr, x0, keep = 0.9, kind = c("hull", "ellipse"), criterion = "area", linetopoint = TRUE, col = "gray", alpha = 0.3, observed = TRUE, type = c("t", "p", "n"), ... ) bjstars(xarr, x0, keep = 0.9, col = "gray", type = c("t", "p", "n"), ...)
xarr |
3-D array of coordinates of sampling units times two axes by random samples. |
x0 |
2-D array of sampling units times two axes treated as
constant for all samples in |
keep |
Proportion of points enclosed in the shape; passed to
|
kind |
Shape is either a convex hull ( |
criterion |
Criterion to remove extreme points from the hull
in |
linetopoint |
Draw line from the centre of the shape to the
coordinates in |
col |
Colour of the shape; can be a vector of colours. |
alpha |
Transparency of shapes; 0 is completely transparent, and 1 is non-transparent. |
observed |
After finding the convex hull, extend hull to
enclose fixed point |
type |
Mark coordinate of |
... |
Other parameters passed to to marker of |
Functions require output of three-dimensional array xarr of
randomized scores, where the last dimension is for the random
samples, and a two-dimensional matrix of references scores
x0.
Function bjpolygon uses peelhull or
peelellipse to find a convex hull
(chull) or an ellipsoid hull
(ellipsoidhull) containing keep
proportion of points. The reference coordinates can also be added
to the plot, either as point or a text label
(ordiellipse) and optionally connected to the shape
centre (not to the centre of points). With argument observed
the convex hull can be extended to include the reference point when
that is outside the hull; this is used to draw wedges for arrows
using the origin as the reference point.
Function bjstars connects the reference point to keep
proportion of closest points. If the reference point is not
specified, the centre of points prior to selection will be used.
Functions return invisibly NULL.
Jari Oksanen
peelhull, peelellipse,
polygon for basic functions and
plot.bjnmds and plot.bjnmds and
plot.bjdbrda for programmatic interface.
Function is a storehouse for dissimilarity indices that can be
called by their vernacular names. The function is based on
designdist (vegan package).
canneddist(x, method, help = FALSE)canneddist(x, method, help = FALSE)
x |
Input data. |
method |
Vernacular name for a dissimilarity index. |
help |
List available indices and their definitions instead of calculating dissimilarities. |
Function wraps popular dissimilarity indices for
designdist allowing these to be called by
their popular names. It can have synonymous names for one
dissimilarity index. Use argument help=TRUE to see the
current selection of indices and their definitions.
The function uses the main notation of
designdist where terms are based on sums and
paired minima or sum of squares and crossproducts of pairs of
sampling units. For vectors x and y the
"quadratic" terms are J = sum(x*y), A =
sum(x^2), B = sum(y^2), and "minimum" terms are
J = sum(pmin(x,y)), A = sum(x) and B = sum(y),
and "binary" terms are either of these after transforming
data into binary form (number of shared species J, and
number of species for each row, A, B.). Number of columns
(species) is notated as P, and the number of sampling units
is N.
There is a huge number of indices, and the current selection is far
from complete (but it can easily expanded). See References for
sources. Many sources use different notation, but they were changed
to the notation described above. For instance, in popular (but
strange) 2x2 contingency table notation for binary data a =
J, b = A-J, c = B-J, d = P-A-B+J. Some of
formulae may be surprising, but they are mathematically equivalent
to traditional ones. I challenge you to inspect Euclidean distance,
and once you see how it is derived, try Chord distance.
Function returns a dist object of dissimilarities.
Jari Oksanen
Barkman, J.J. (1958). Phytosociology and ecology of cryptogamic epiphytes, including taxonomic survey and description of vegetation units in Europe. Van Gorcum.
Clarke, K.R., Somerfield, P.J. & Chapman, M.G. (2006).
On resemblance measures for ecological studies,
including taxonomic dissimilarities and a zero-adjusted
Bray-Curtis coefficient for denuded assemblages.
J. Exp. Marine Biol. & Ecol. 330, 55-80. (bray0).
Goodall, D.W. (1973). Sample similarity and species correlation. Handbook of Vegetation Science 5, 105-156.
Hubálek, Z. (1982). Coefficients of association and similarity, based on binary (presence-absence) data: an evaluation. Biological Review 57, 669–689.
Legendre, P. & Legendre, L. (2012). Numerical Ecology. 3rd English Ed., Elsevier.
Yue, J.C. & Clayton, M.K. (2005). A similarity measure based on species proportions. Communications in Statistics Theory and Methods 34, 2123–2131. doi:10.1080/STA-200066418.
Function is a wrapper to
designdist. vegan function
betadiver is a similar collection of canned
indices for beta diversity, and many of these are well-known
dissimilarity indices.
data(spurn) canneddist(help = TRUE) ## Ochiai dissimilarity d <- canneddist(spurn, "ochiai")data(spurn) canneddist(help = TRUE) ## Ochiai dissimilarity d <- canneddist(spurn, "ochiai")
Function performs hierarchic clustering (hclust) with
the expected value of Jaccard dissimilarity as Beta random variate,
and compares the stability of each cluster branch in random samples
from Beta distribution.
clsupport(x, n = 1000, method = "average", softmatch = FALSE, plot = TRUE, ...) clsets(hclus)clsupport(x, n = 1000, method = "average", softmatch = FALSE, plot = TRUE, ...) clsets(hclus)
x |
Community data, will be treated as binary. |
n |
Number of random samples from Beta distribution. |
method |
Clustering method, passed to |
softmatch |
Estimate cluster similarity as Jaccard similarity or as a hard exact match between two clusters. |
plot |
Plot the |
... |
Other parameters; passed to |
hclus |
|
Function is basically a graphical tool that plots an
hclust dendrogram and adds the count of similar
clusters in randomized cluster dendrograms, but it can also be
called only for the numeric result without plot.
The count of similar clusters in trees is based on randomized form Jaccard Beta dissimilarities, and is called here “support”. The support is calculated alternatively as the number of exactly similar branches in randomized trees or as a sum of maximum Jaccard similarities in observed and randomized trees (“softmatch”). The exact matches are very sensitive to single wandering sampling units, and often give very low “support” for large classes, whereas Jaccard-based “support” may give too high values for the smallest classes. For exact matches the “support” is given as count, and for soft maches (Jaccard-based) as a rounded integer per 1000.
Usually called to draw a plot, but will return the
“support”; the values are returned in the order of
merges in the agglomerative hierarchic clustering. Function
clsets returns a list with indices of items (sampling
units) of each branch in their merge order.
Jari Oksanen
data(spurn) clsupport(spurn) # exact match clsupport(spurn, softmatch = TRUE)data(spurn) clsupport(spurn) # exact match clsupport(spurn, softmatch = TRUE)
Constraints are estimated as non-metric MDS of constrained
distances (distconstrain). These are similar as the
constraints (LC) scores of distance-based RDA
(dbrda), but with non-metric mapping. Site
ordination is rank-order similar to the constraints and community
points are added to the constraints using vegan function
MDSaddpoints with original unconstrained
community dissimilarities.
cNMDS(formula, data, k = 2, add = FALSE) ## S3 method for class 'cNMDS' scores(x, display = "sites", ...) ## S3 method for class 'cNMDS' plot(x, display = "sites", type = "p", ...)cNMDS(formula, data, k = 2, add = FALSE) ## S3 method for class 'cNMDS' scores(x, display = "sites", ...) ## S3 method for class 'cNMDS' plot(x, display = "sites", type = "p", ...)
formula |
Model formula where the left-hand side must be dissimilarities, and right-hand side the constraints. |
data |
Data frame to find the terms on the right-hand side of the formula. |
k |
Number of dimensions in NMDS. |
add |
Type of additive constant used to avoid negative squared
distances in |
x |
|
display |
Kind of scores to display. In |
... |
Other arguments passed to graphics functions. |
type |
Either |
The steps of the algorithm are:
Find constrained dissimilarities for given constraints
with distconstrain.
Find NMDS of constrained dissimilarities with
metaMDS.
Add points to constrained ordination using unconstrained
community dissimilarities with MDSaddpoints.
The algorithms creates analogous result to constrained eigenvector
ordination distance-based RDA (db-RDA, dbrda).
The constrained component is similar to linear combination (LC)
scores and the site NMDS analogous to to weighted average (WA) or
site scores of db-RDA. The cNMDS site scores are found by
adding the sites with their community dissimilarities as new points
to the ordination of constraints. Instead of raw dissimilarities,
both the constrained ordination and the site ordination
("new points") are based on ranks of dissimilarities (with
ties). The ranks give the same result as raw dissimilarities within
one set of dissimilarities (constrained, community), but with them
constrained and community components are rank-order similar.
Although the idea of the algorithm is simple and obvious, the results are often far from satisfactory. Quite often the community dissimilarities and constrained dissimilarities differ from each other so strongly that the site points (community) are expelled from the constrained configuration and are located at the outskirts of the ordination instead of being mixed with the constrained points. The community (site) points and constraints mix well only when the constraints describe the community dissimilarities sufficiently well.
Function returns an object of class "cNMDS" that
inherits from vegan function
monoMDS. but it has rudimentary
scores and plot methods. The plot only
shows one set of scores, and it is best to build graphics with
pipes (see Examples). The result object also contains biplot
arrows for continuous constraints and centroids for factor
constraints. These are based on community ordination (sites)
instead of constraints, and they are added with
envfit. Species scores are not available.
The function is provided as an exhibit of the algorithm. The results are often poor, and the function should not be used for any other purposes than inspecting the method and as an inspiration for alternative implementations.
Jari Oksanen
distconstrain, dbrda,
monoMDS, MDSaddpoints.
if (require(vegan)) { # vegan needed for data data(mite, mite.env) dis <- canneddist(mite, "chord") mod <- cNMDS(dis ~ WatrCont + SubsDens + Topo + Shrub, mite.env) print(mod) plot(mod, type = "p") |> points("constraints", pch = 16, col = 2) |> text("biplot", col = 4) |> text("centroids", col = 4) legend("topleft", c("LC Scores", "NMDS Scores", "Constraints"), col = c(2,1,NA), pch = c(16,1,NA), text.col = c(1,1,4)) }if (require(vegan)) { # vegan needed for data data(mite, mite.env) dis <- canneddist(mite, "chord") mod <- cNMDS(dis ~ WatrCont + SubsDens + Topo + Shrub, mite.env) print(mod) plot(mod, type = "p") |> points("constraints", pch = 16, col = 2) |> text("biplot", col = 4) |> text("centroids", col = 4) legend("topleft", c("LC Scores", "NMDS Scores", "Constraints"), col = c(2,1,NA), pch = c(16,1,NA), text.col = c(1,1,4)) }
Function constrains dissimilarities by external variables, or
alternatively removes effects of constraining variables and returns
residual dissimilarities. The analysis is based on McArdle &
Anderson (2001), and the metric multidimensional scaling of
constrained dissimilarities is equal to distance-based Redundancy
Analysis (dbrda).
distconstrain(formula, data, add = FALSE, residuals = FALSE, squared = FALSE)distconstrain(formula, data, add = FALSE, residuals = FALSE, squared = FALSE)
formula |
The left-hand-side must be dissimilarities and the right-hand-side should list the constraining variables. |
data |
Data frame containing the constrainging variables in
the |
add |
an additive constant is added to the non-diagonal
dissimilarities such that all |
residuals |
Return residuals after constraints. |
squared |
Return squared dissimilarities instead of
dissimilarities. This allows handling negative squared distances by
the user instead of setting them |
Function uses the method of McArdle & Anderson (2001) to
constrain dissimilarities by external variables, or alternatively,
to find residual dissimilarities after constraints. With Euclidean
distances, the method is equal to performing linear regressions on
each column in the raw data and then calculationg the distances,
but works directly on distances. With other methods, there is no
similar direct connection to the raw data, but it is possible to
work with non-Euclidean metrics. The same basic method is used
within db-RDA (dbrda in vegan), but
this function exposes the internal calculations to users.
Non-Euclidean indices can produce negative eigenvalues in
db-RDA. Would negative eigenvalues be produced, this function can
return negative squared distances resulting in NaN when
taking the square root. Db-RDA works with the internal presentation
of the dissimilarities, and its analysis does not suffer from the
imaginary distances, but these can ruin the analysis of
dissimilarities returned from this function.
A "dist" object of constrained or residual distances.
Jari Oksanen
McArdle, B.H. & Anderson, M.J. (2001). Fitting multivariate models to community data: a comment on distance-based redundancy analysis. Ecology 82, 290–297.
if(require(vegan, quietly=TRUE)) { data(dune, dune.env, package = "vegan") d <- canneddist(dune, "chord") dcon <- distconstrain(d ~ Management + Moisture, dune.env) mod <- pco(dcon) # constrained component of dbrda mod0 <- dbrda(d ~ Management + Moisture, dune.env) } ## PCoA with constrained dissimilarities mod ## vegan::dbrda - constrained component similar as with distconstrain mod0if(require(vegan, quietly=TRUE)) { data(dune, dune.env, package = "vegan") d <- canneddist(dune, "chord") dcon <- distconstrain(d ~ Management + Moisture, dune.env) mod <- pco(dcon) # constrained component of dbrda mod0 <- dbrda(d ~ Management + Moisture, dune.env) } ## PCoA with constrained dissimilarities mod ## vegan::dbrda - constrained component similar as with distconstrain mod0
Mean of distances is defined as the distance of each point to the mean or expected value of coordinates generating the distances.
distMeans(...) ## Default S3 method: distMeans(d, addcentre = FALSE, label = "centroid", ...) ## S3 method for class 'formula' distMeans(formula, data, ...)distMeans(...) ## Default S3 method: distMeans(d, addcentre = FALSE, label = "centroid", ...) ## S3 method for class 'formula' distMeans(formula, data, ...)
... |
Other parameters (ignored). |
d |
Distances as a |
addcentre |
Add distances to the centroid as the first item in
the distance matrix. If |
label |
Label for the centroid when |
formula, data
|
Formula where the left-hand-side is the
dissimilarity structure, and right-hand-side defines the mean
from which the dissimilarities are calculated. The terms in the
right-hand-side can be given in |
Function is analagous to colMeans or
rowMeans and returns values that are at the mean
of distances of each row or column of a symmetric distance
matrix. Alternatively, the use of formula calculates mean
distances to the fitted values.
Means of distances cannot be directly found as marginal means
of distance matrix, but they must be found after Gower double
centring (Gower 1966). After double centring, the means are
zero, and when backtransformed to original distances, these
give the mean distances. When added to the original distances,
the metric properties are preserved. For instance, adding
centres to distances will not influence results of metric
scaling, or the rank of spatial Euclidean distances. The method
is based on Euclidean geometry, but also works for
non-Euclidean dissimilarities. However, the means of very
strongly non-Euclidean indices may be imaginary, and given as
NaN.
Average mean distances can be regarded as a measure of beta
diversity, and formula interface allows analysis of beta
diversity within factor levels or with covariates. Such
analysis is preferable to conventional averaging of
dissimilarities or regression analysis of dissimilarities.
Analysis of mean distances is consistent with directly grouping
observed rectangular data, and overall beta diversity can be
decomposed into components defined by the formula, and handles
inflating observations to
dissimilarities in analysis.
Distances to all other points from a point that is in the centroid or fitted value (with formula interface) of the coordinates generating the distances. Default method allows returning the input dissimilarity matrix where the mean distances are added as the first observation.
Jari Oksanen.
Gower, J.C. (1966) Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika 53, 325-328.
## Euclidean distances to the mean of coordinates ... xy <- matrix(runif(5*2), 5, 2) dist(rbind(xy, "mean" = colMeans(xy))) ## ... are equal to distMeans ... distMeans(dist(xy)) ## ... but different from mean of distances colMeans(as.matrix(dist(xy))) ## adding mean distance does not influence PCoA of non-Euclidean ## distances (or other metric properties) data(spurn) d <- canneddist(spurn, "bray") m0 <- cmdscale(d, eig = TRUE) mcent <- cmdscale(distMeans(d, addcentre=TRUE), eig = TRUE) ## same non-zero eigenvalues zapsmall(m0$eig) zapsmall(mcent$eig) ## distMeans are at the origin of ordination head(mcent$points)## Euclidean distances to the mean of coordinates ... xy <- matrix(runif(5*2), 5, 2) dist(rbind(xy, "mean" = colMeans(xy))) ## ... are equal to distMeans ... distMeans(dist(xy)) ## ... but different from mean of distances colMeans(as.matrix(dist(xy))) ## adding mean distance does not influence PCoA of non-Euclidean ## distances (or other metric properties) data(spurn) d <- canneddist(spurn, "bray") m0 <- cmdscale(d, eig = TRUE) mcent <- cmdscale(distMeans(d, addcentre=TRUE), eig = TRUE) ## same non-zero eigenvalues zapsmall(m0$eig) zapsmall(mcent$eig) ## distMeans are at the origin of ordination head(mcent$points)
diverclust forms hierarchic clustering so that each fusion
minimizes beta diversity or gamma diversity.
The distance between two sampling units is the increase in
deversity when these are merged together, or the beta diversity
between fused units. diverdist finds such distances for all
pairs of sampling units using any Rényi diversity.
diverclust( x, renyi = 1, equalize = c("renyi", "total", "no"), beta = TRUE, hill = FALSE, trace = FALSE ) diverdist( x, renyi = 1, equalize = c("renyi", "total", "no"), beta = TRUE, hill = FALSE )diverclust( x, renyi = 1, equalize = c("renyi", "total", "no"), beta = TRUE, hill = FALSE, trace = FALSE ) diverdist( x, renyi = 1, equalize = c("renyi", "total", "no"), beta = TRUE, hill = FALSE )
x |
Community data. |
renyi |
The scale of Renyi diversity index as defined in
|
equalize |
Equalize data rows so that they can be meaningfully
pooled and averaged. If this is |
beta |
Use beta diversities: the average alpha diversity of
cluster members is subtracted from the pooled diversity. If this is
|
hill |
Use Hill numbers instead of Renyi diversity (see
|
trace |
Trace calculations. Either logical or an integer:
|
diverclust forms clusters so that pooled diversity or
its change to the baseline alpha diversity is minimized at each
level of clustering. The change of pooled diversity with respect to
the baseline alpha diversity is called beta diversity, and the
overall diversity is called gamma diversity. For beta diversity,
the clustering implies an additive partitioning.
Function diverdist finds beta diversities (or optionally
gamma diversities) for pairs of sampling units. These are used as
starting values in diverclust, but after merging sampling
units into clusters, beta diversities are calculated for clusters
and cannot be found directly from these dissimilarities.
Both diverdist and diverclust work by pooling
together sampling units and averaging species abundances. The units
may be equalized before pooling together to remove the effects of
variable totals in sampling units. Rényi diversities are based on
species proportions and most popular equalization is to use species
proportions by sampling units instead of raw data (argument
equalize="total"). With alternative "renyi"
equalization depends on therenyi scale. With renyi=2
(Simpson type) sums of squared proportions or norms are made equal,
and with renyi=Inf, species maxima are made equal in
compared units, and with renyi=1 (Shannon type),
"total". Equalization has no effect with scale=0
(Species richness type).
For more detailed description, see vignette "diverclust".
diverclust returns an hclust object.
diverdist returns a dissimilarity object inheriting
from dist.
Jari Oksanen.
hclust for cluster analysis and its support
methods, renyi for estimating Renyi
diversities and Hill numbers, and adipart
for related additive partitioning of beta diversity.
data(spurn) (cl <- diverclust(spurn)) plot(cl) # beta diversity based on Shannon diversity (cl <- diverclust(spurn, renyi=0, hill=TRUE)) if (require(vegan, quietly=TRUE)) tabasco(spurn, cl) plot(cl) # increase in species richness when pooling sites ## increase in Shannon diversity when pooling two sites d <- diverdist(spurn, renyi=1, equalize="no") # raw data de <- diverdist(spurn, renyi=1, equalize="total") plot(d, de, xlab = "Shannon distance, no equalization", ylab = "Shannon distance, equalized")data(spurn) (cl <- diverclust(spurn)) plot(cl) # beta diversity based on Shannon diversity (cl <- diverclust(spurn, renyi=0, hill=TRUE)) if (require(vegan, quietly=TRUE)) tabasco(spurn, cl) plot(cl) # increase in species richness when pooling sites ## increase in Shannon diversity when pooling two sites d <- diverdist(spurn, renyi=1, equalize="no") # raw data de <- diverdist(spurn, renyi=1, equalize="total") plot(d, de, xlab = "Shannon distance, no equalization", ylab = "Shannon distance, equalized")
Function is similar to standard hclust, but the
leaves are drawn as fans with base proportional to weights (or
sizes) of leaves.
ginkgogram( x, labels = NULL, check = TRUE, axes = TRUE, frame.plot = FALSE, ann = TRUE, main = "Cluster Ginkgogram", sub = NULL, xlab = NULL, ylab = "Height", w, col, leaf = Inf, ... )ginkgogram( x, labels = NULL, check = TRUE, axes = TRUE, frame.plot = FALSE, ann = TRUE, main = "Cluster Ginkgogram", sub = NULL, xlab = NULL, ylab = "Height", w, col, leaf = Inf, ... )
x |
An object of the type produced by |
labels |
A character vector of labels for te leaves of the tree. |
check |
Logical indicating if the |
axes, frame.plot, ann
|
Logical flags as in
|
main, sub, xlab, ylab
|
Character strings to replace default annotation. |
w |
Weights of leaves. |
col |
Colour of the fill of leaves. |
leaf |
Maximum height of leaf triangle. Leaf height is
smaller of this value and the height of the branch, and the
default value ( |
... |
Further graphical arguments |
Function ginkgogram has two new arguments to
hclust: w for weights that give the base
width of leaves, and col that gives the colour of the
fill of leaves. The weights w must be numeric and
non-negative, and zero weight guarantees that the leaf is drawn
as a simple line similarly as in hclust.
Function is used to modify a plot of hclust tree.
Jari Oksanen
## need vegan data sets if (require(vegan)) { data(dune, dune.phylodis, dune.taxon) cl <- hclust(dune.phylodis) ginkgogram(cl, w = colMeans(dune), col = factor(dune.taxon$Superorder)) ## limit heights of the polygons to the Cretaceous ginkgogram(cl, w = colMeans(dune), col = factor(dune.taxon$Superorder), leaf = 65) }## need vegan data sets if (require(vegan)) { data(dune, dune.phylodis, dune.taxon) cl <- hclust(dune.phylodis) ginkgogram(cl, w = colMeans(dune), col = factor(dune.taxon$Superorder)) ## limit heights of the polygons to the Cretaceous ginkgogram(cl, w = colMeans(dune), col = factor(dune.taxon$Superorder), leaf = 65) }
Non-linear regression (nls) selfStart
model generalized to use exponential family error
distributions.
gnlmod(formula, data, family = gaussian, wts, ...) ## S3 method for class 'gnlmod' predict(object, newdata, type = "response", ...) ## S3 method for class 'gnlmod' summary(object, dispersion = NULL, correlation = FALSE, ...)gnlmod(formula, data, family = gaussian, wts, ...) ## S3 method for class 'gnlmod' predict(object, newdata, type = "response", ...) ## S3 method for class 'gnlmod' summary(object, dispersion = NULL, correlation = FALSE, ...)
formula |
|
data |
Data for formula. |
family |
Error distribution of the exponential
|
wts |
Prior weights |
... |
Other parameters passed to |
object |
|
newdata |
New independent data for prediction. |
type |
Type of predicted values; only |
dispersion |
The dispersion parameter for the family
used. Either a single numerical value of |
correlation |
logical; if |
Function combines three R functions for fitting non-linear models with exponential family errors:
Non-linear regression is defined by
selfStart like in nls. In addition to
defining the nonlinear function, these also provide initial values and
derivatives of model parameters.
family functions are used to define the error
distribution allowing use of other than least squares models. The
family functions provide the log-likelihood function for fitting,
and allows finding the partial derivatives of model parameters for the
log-likelihood function (see McCullagh & Nelder 1989, p. 41).
The log-likelihood function (with partial derivatives) is optimized
with nlm.
The result is mostly similar to glm object, and can be
handled with many glm method functions, except those that assume
linear fit.
Function returns the nlm result object, but
it is amended with glm object related to
family.
Jari Oksanen
McCullagh, P & Nelder, J.A. (1989) Generalized Linear Models, 2nd ed. Chapman & Hall.
## Compare to vegan species-area models require(vegan) || stop("needs vegan") data(sipoo, sipoo.map) S <- specnumber(sipoo) ## Arrhenius model in vegan with least squares nls(S ~ SSarrhenius(area, k, z), sipoo.map) ## Assume Poisson error (mod <- gnlmod(S ~ SSarrhenius(area, k, z), sipoo.map, family=poisson)) ## Arrhenius should be identical to Poisson glm with log-link glm(S ~ log(area), sipoo.map, family=poisson) # (Intercept) = log(k) ## some method functions fitted(mod) predict(mod) predict(mod, newdata = list(area = seq(1, 250, len=31))) summary(mod, correlation = TRUE)## Compare to vegan species-area models require(vegan) || stop("needs vegan") data(sipoo, sipoo.map) S <- specnumber(sipoo) ## Arrhenius model in vegan with least squares nls(S ~ SSarrhenius(area, k, z), sipoo.map) ## Assume Poisson error (mod <- gnlmod(S ~ SSarrhenius(area, k, z), sipoo.map, family=poisson)) ## Arrhenius should be identical to Poisson glm with log-link glm(S ~ log(area), sipoo.map, family=poisson) # (Intercept) = log(k) ## some method functions fitted(mod) predict(mod) predict(mod, newdata = list(area = seq(1, 250, len=31))) summary(mod, correlation = TRUE)
Rescale Ecological Gradient to Constant Community Heterogeneity
gradrescale(grad, comm)gradrescale(grad, comm)
grad |
Observed numeric gradient. |
comm |
Community data matrix. |
Observed environmental gradient is rescaled in similar way as
species axis in rdecorana and
decorana. The goal is that weighted averages
of species have equal dispersion along the rescaled gradient, but
retain the order of sites along environmental gradient, unlike in
[r]decorana.
Axis rescaling in [r]decorana is often poorly
understood. Although rescaling is based on site (sample) scores, it
is primarily performed on species scores, and site (sample) scores
are finally found as weighted averages of rescaled species
scores. With this the rank order of site (sample) scores can
change. [r]decorana usually performs several sweeps (default
is four), and when new species scores are based on rescaled site
(sample) scores, their rank order can also change from the
original. Changing the order is not acceptable in rescaling observed
gradients, and therefore input gradient is handled like species
scores in [r]decorana and only one rescaling sweep is made.
Rescaled gradient scaled in units of heterogeneity.
Jari Oksanen
if(require(vegan, quietly=TRUE)) { data(mite, mite.env) WatrCont.rescaled <- with(mite.env, gradrescale(WatrCont, mite)) plot(WatrCont.rescaled ~ WatrCont, data=mite.env) abline(lm(WatrCont.rescaled ~ WatrCont, data=mite.env), col=4) legend("topleft", c("Nonlinear rescaling", "Linear scaling"), lty=c(NA,1), pch=c(1,NA), col=c(1,4)) }if(require(vegan, quietly=TRUE)) { data(mite, mite.env) WatrCont.rescaled <- with(mite.env, gradrescale(WatrCont, mite)) plot(WatrCont.rescaled ~ WatrCont, data=mite.env) abline(lm(WatrCont.rescaled ~ WatrCont, data=mite.env), col=4) legend("topleft", c("Nonlinear rescaling", "Linear scaling"), lty=c(NA,1), pch=c(1,NA), col=c(1,4)) }
Function humpfit fits a no-interaction model for species
richness vs. biomass data (Oksanen 1996). This is a null model that
produces a hump-backed response as an artifact of plant size and
density.
humpfit(mass, spno, family = poisson, start) ## S3 method for class 'humpfit' lines(x, segments = 101, ...) ## S3 method for class 'humpfit' plot( x, xlab = "Biomass", ylab = "Species Richness", lwd = 2, l.col = "blue", p.col = 1, type = "b", ... ) ## S3 method for class 'humpfit' points(x, ...) ## S3 method for class 'humpfit' predict(object, newdata = NULL, ...) ## S3 method for class 'humpfit' profile(fitted, parm = 1:3, alpha = 0.01, maxsteps = 20, del = zmax/5, ...) ## S3 method for class 'humpfit' summary(object, ...) ## S3 method for class 'humpfit' lines(x, segments = 101, ...) ## S3 method for class 'humpfit' plot( x, xlab = "Biomass", ylab = "Species Richness", lwd = 2, l.col = "blue", p.col = 1, type = "b", ... ) ## S3 method for class 'humpfit' points(x, ...) ## S3 method for class 'humpfit' predict(object, newdata = NULL, ...) ## S3 method for class 'humpfit' profile(fitted, parm = 1:3, alpha = 0.01, maxsteps = 20, del = zmax/5, ...) ## S3 method for class 'humpfit' summary(object, ...)humpfit(mass, spno, family = poisson, start) ## S3 method for class 'humpfit' lines(x, segments = 101, ...) ## S3 method for class 'humpfit' plot( x, xlab = "Biomass", ylab = "Species Richness", lwd = 2, l.col = "blue", p.col = 1, type = "b", ... ) ## S3 method for class 'humpfit' points(x, ...) ## S3 method for class 'humpfit' predict(object, newdata = NULL, ...) ## S3 method for class 'humpfit' profile(fitted, parm = 1:3, alpha = 0.01, maxsteps = 20, del = zmax/5, ...) ## S3 method for class 'humpfit' summary(object, ...) ## S3 method for class 'humpfit' lines(x, segments = 101, ...) ## S3 method for class 'humpfit' plot( x, xlab = "Biomass", ylab = "Species Richness", lwd = 2, l.col = "blue", p.col = 1, type = "b", ... ) ## S3 method for class 'humpfit' points(x, ...) ## S3 method for class 'humpfit' predict(object, newdata = NULL, ...) ## S3 method for class 'humpfit' profile(fitted, parm = 1:3, alpha = 0.01, maxsteps = 20, del = zmax/5, ...) ## S3 method for class 'humpfit' summary(object, ...)
mass |
Biomass. |
spno |
Species richness. |
family |
Family of error distribution. Any
|
start |
Vector of starting values for all three parameters. |
x |
Fitted result object. |
segments |
Number of segments used for lines. |
... |
Other parameters to functions. |
xlab, ylab
|
Axis labels. |
lwd |
Line width. |
l.col, p.col
|
Line and point colours. |
type |
Type of ype of |
object |
Fitted result object. |
newdata |
Values of |
fitted |
Fitted result object. |
parm |
Profiled parameters. |
alpha, maxsteps, del
|
Parameters for profiling range and density (see Details). |
The no-interaction model assumes that the humped species richness
pattern along biomass gradient is an artifact of plant size and
density (Oksanen 1996). For low-biomass sites, it assumes that
plants have a fixed size, and biomass increases with increasing
number of plants. When the sites becomes crowded, the number of
plants and species richness reaches the maximum. Higher biomass is
reached by increasing the plant size, and then the number of plants
and species richness will decrease. At biomasses below the hump,
plant number and biomass are linearly related, and above the hump,
plant number is proportional to inverse squared biomass. The number
of plants is related to the number of species by the relationship
(link function) from Fisher's log-series (Fisher et
al. 1943).
The parameters of the model are:
hump: the location of the hump on the biomass gradient.
scale: an arbitrary multiplier to translate the biomass
into virtual number of plants.
alpha: Fisher's to translate the
virtual number of plants into number of species.
The parameters scale and alpha are intermingled and
this function should not be used for estimating Fisher's
. Probably the only meaningful and interesting
parameter is the location of the hump.
Function may be very difficult to fit and easily gets trapped into
local solutions, or fails with non-Poisson families, and function
profile should be used to inspect the fitted models. You can
use plot.profile, pairs.profile for graphical
inspection of the profiles, and confint for the profile
based confidence intervals.
The original model intended to show that there is no need to speculate about “competition” and “stress” (Al-Mufti et al. 1977), but humped response can be produced as an artifact of using fixed plot size for varying plant sizes and densities.
The function returns an object of class "humpfit"
inheriting from class "glm". The result object has specific
summary, predict, plot, points and
lines methods. In addition, it can be accessed by the
following methods for glm objects: AIC,
extractAIC, deviance,
coef, residuals.glm (except type
= "partial"), fitted, and perhaps some others. In
addition, function ellipse.glm (package
ellipse) can be used to draw approximate confidence ellipses
for pairs of parameters, if the normal assumptions look
appropriate.
The function is a replacement for the original GLIM4
function at the archive of Journal of Ecology. There the function
was represented as a mixed glm with one non-linear
parameter (hump) and a special one-parameter link function
from Fisher's log-series. The current function directly applies
non-linear maximum likelihood fitting using function
nlm. Some expected problems with the current
approach are:
The function is discontinuous at hump and may be
difficult to optimize in some cases (the lines will always join, but
the derivative jumps).
The function does not try very hard to find sensible starting
values and can fail. The user may supply starting values in
argument start if fitting fails.
The estimation is unconstrained, but both scale and
alpha should always be positive. Perhaps they should be
fitted as logarithmic. Fitting Gamma
family models might become easier, too.
Jari Oksanen
Al-Mufti, M.M., Sykes, C.L, Furness, S.B., Grime, J.P & Band, S.R. (1977) A quantitative analysis of shoot phenology and dominance in herbaceous vegetation. Journal of Ecology 65,759–791.
Fisher, R.A., Corbet, A.S. & Williams, C.B. (1943) The relation between the number of species and the number of individuals in a random sample of of an animal population. Journal of Animal Ecology 12, 42–58.
Oksanen, J. (1996) Is the humped relationship between species richness and biomass an artefact due to plot size? Journal of Ecology 84, 293–295.
fisherfit, profile.glm,
confint.glm.
## Data of Al-Mufti et al. (1977) mass <- c(140,230,310,310,400,510,610,670,860,900,1050,1160,1900,2480) spno <- c(1, 4, 3, 9, 18, 30, 20, 14, 3, 2, 3, 2, 5, 2) (sol <- humpfit(mass, spno)) summary(sol) # Almost infinite alpha... plot(sol) ## Parameter 3 (alpha) is too extreme for profile and confint, and we ## must use only "hump" and "scale". plot(profile(sol, parm=1:2)) confint(sol, parm=c(1,2))## Data of Al-Mufti et al. (1977) mass <- c(140,230,310,310,400,510,610,670,860,900,1050,1160,1900,2480) spno <- c(1, 4, 3, 9, 18, 30, 20, 14, 3, 2, 3, 2, 5, 2) (sol <- humpfit(mass, spno)) summary(sol) # Almost infinite alpha... plot(sol) ## Parameter 3 (alpha) is too extreme for profile and confint, and we ## must use only "hump" and "scale". plot(profile(sol, parm=1:2)) confint(sol, parm=c(1,2))
Function finds the number of companion species or the average
species richness of other species for each species in a
community. This is the quality index of Index of
Atmospheric Purity (IAP) in lichen bioindication (LeBlanc & De
Sloover 1970). The non-randomness of low or high values is
found by randomization.
iapq(comm, freq.min = 5, permutations = 999) iap(comm, iapq) ## S3 method for class 'iapq' plot(x, type = c("p", "t"), optimize = FALSE, bg, labels, cex = 0.7, ...) ## S3 method for class 'iapq' summary(object, plimit = 0.1, ...)iapq(comm, freq.min = 5, permutations = 999) iap(comm, iapq) ## S3 method for class 'iapq' plot(x, type = c("p", "t"), optimize = FALSE, bg, labels, cex = 0.7, ...) ## S3 method for class 'iapq' summary(object, plimit = 0.1, ...)
comm |
The community data frame. |
freq.min |
Minimum number of occurrences for analysed species. |
permutations |
Number of permutations to assess the randomized number of companion species. |
iapq |
Result of |
x |
|
type |
Plot |
optimize |
Optimize location of text labels to minimize overplotting. |
bg |
Background colour of labels for text. |
labels |
Text labels to replace original species names. |
cex |
Character expansion for text and points. |
... |
Other arguments to the function. |
object |
|
plimit |
Show species at or below |
Index of Atmospheric Purity (IAP) is used in bioindication
with epiphytic lichens and bryophytes (LeBlanc & De Sloover
1970). It derives species indicator scores () as the number
of other species in sampling units where each focal species is
present, and then finds the IAP values for each sampling unit as
scaled weighted sum of species indicator values.
Function iapq finds the values for all species in a
community data set. Function iap applies these values for a
community data set to evalute the IAP values for each site. The
species are matched by names. LeBlanc & De Sloover (1970) used
scaled abundance values and divided the weighted sum by 10, but
this is not done in the current function, but this is left to the
user.
Function iapq is a general measure of indicator value for
species richness and can well be used outside lichen
bioindication. The value is the average species richness in
sampling units where the species is present, excluding the species
itself from the richness. For rare species, is based on
small sample size, and is therefore more variable than for common
species. The iapq function assesses the non-randomness
(‘significance’) of by taking random samples of the
same size as the frequency (number of occurrences) of the focal
species and finding the average richness (without the focal
species) in these samples. Because species are more likely to be
present in species-rich sampling units than in species-poor, the
random sampling uses the observed species richness (with the
focal species) as weights in random sampling. Testing is two-sided
and the number of greater or less random values is multiplied with
two. The observed value of is included in the random sample
of species richness values both in assessing the -value and
in estimating the quantiles.
The summary for the results of iapq shows only the
most “significant” species, and plot shows species
values against species frequencies with lines for expected
and its 2.5% and 97.5% quantiles as estimated from
randomization.
iapq returns a matrix with columns of species
frequency, estimated quality and based on randomization
its standardized effect size (SES), expected value
(E(Q)), 2.5% and 97.5% quantiles and
“significance”. summary methods returns the same
iap object but ordered by and only for selected
taxa. iap returns a vector of estimated IAP values for
each site.
Jari Oksanen
LeBlanc, S.C. & De Sloover, J. (1970) Relation between industrialization and the distribution and growth of epiphytic lichens and mosses in Montreal. Can. J. Bot. 48, 1485–1496.
data(spurn) iq <- iapq(spurn) summary(iq) plot(iq, type="t", optimize = TRUE, bg = "white") iap(spurn, iq)data(spurn) iq <- iapq(spurn) summary(iq) plot(iq, type="t", optimize = TRUE, bg = "white") iap(spurn, iq)
Function performs hierarchical clustering of binary ecological communities based on information analysis as defined by Williams et al. (1966) and Lance & Williams (1966).
infoclust(x, delta = TRUE)infoclust(x, delta = TRUE)
x |
Community data |
delta |
Use increase in information ( |
Function performs information analysis of binary ecological communities (Williams et al. 1966, Lance & Williams 1966). The current implementation is based on Legendre & Legendre (2012).
The information of a collection of sampling units
with species is defined as
where is the frequency count of each species in the
collection. The method works by merging either the units that give
the lowest increase ( when delta = TRUE), or
the units that are most homogeneous (lowest when
delta = FALSE). After merging sampling units or clusters,
the community data matrix is updated by actually merging the data
units and re-evaluating their information distance to all other
units. The information content of all non-merged clusters is , and for clusters of several sampling units the constant
species (completely absent or always present) do not contribute to
the information. The largest increase in information is made by
species with 0.5 relative frequency, so that the analysis tries to
build clusters where species is either always present or always
absent. This often gives easily interpretable clusters.
Function returns an object of class "infoclust" that
inherits from hclust. It uses all "hclust"
methods, but some may fail or work in unexpected ways because the
analysis is not based on dissimilarities but on binary data matrix.
Jari Oksanen
Williams, W.T., Lambert, J.M. & Lance, G.N. (1966). Multivariate methods in plant ecology. V. Similarity analyses and information-analysis. J. Ecol. 54, 427–445.
Lance, G.N. & Williams, W.T. (1966). Computer programs for hierarchical polythetic classification (“similarity analyses”). Comp. J. 9, 60–64.
Legendre, P. & Legendre, L. (2012). Numerical Ecology. 3rd English Ed., Elsevier.
## example used to demonstrate the calculation of ## information analysis by Legendre & Legendre (2012, p. 372). data(pond) cl <- infoclust(pond) plot(cl, hang = -1) ## Lance & Williams suggest a limit below which clustering is ## insignificant and should not be interpreted abline(h=qchisq(0.95, ncol(pond)), col=2) ## Spurn Point Scrub data data(spurn) cl <- infoclust(spurn) plot(cl, hang = -1) if (require(vegan)) { tabasco(spurn, cl) ## apply information clustering on species tabasco(spurn, cl, infoclust(t(spurn))) }## example used to demonstrate the calculation of ## information analysis by Legendre & Legendre (2012, p. 372). data(pond) cl <- infoclust(pond) plot(cl, hang = -1) ## Lance & Williams suggest a limit below which clustering is ## insignificant and should not be interpreted abline(h=qchisq(0.95, ncol(pond)), col=2) ## Spurn Point Scrub data data(spurn) cl <- infoclust(spurn) plot(cl, hang = -1) if (require(vegan)) { tabasco(spurn, cl) ## apply information clustering on species tabasco(spurn, cl, infoclust(t(spurn))) }
Function calculates the mean rank shift between decreasing rank orders of species (columns). The averaging is done by the number of species occurring in both compared sites, and the results are returned as dissimilarities.
mrankdist(x, missing = TRUE)mrankdist(x, missing = TRUE)
x |
The input data. |
missing |
If |
Function returns a "dist" object of mean rank shifts.
Jari Oksanen
Collins, S.L, Suding, K.N., Cleland, E.E., Batty, M., Pennings, S.C., Gross, K.L., Grace, J.B., Gough, L., Fargione, J.E. & Clark, C.M. (2008). Rank clocks and plant community dynamics. Ecology 89, 3534–3541.
Occurrences of five plant species on 167 sample plots on Mt Field,
Tasmania, and the Altitude above sea level (in meters) of the sample
plots. The data are the well drained subset of the complete of
Minchin (1989). These data were also analysed by Oksanen & Minchin
(2002). The species are Epacris serpyllifolia (abbreviated as
Epacserp), Cyathodes petiolaris (Cyatpeti),
Nothofagus cunninghamiana (Nothcunn), Poa
gunnii (Poagunn), and Bauera rubioides
(Bauerubi).
data("mtfield5")data("mtfield5")
Minchin, P.R. (1989). Montane vegetation of the Mt. Field massif, Tasmania: a test of some hypotheses about properties of community patterns. Vegetatio 83, 97–110.
Oksanen, J. & Minchin, P.R. (2002). Continuum theory revisited: what shape are species responses along ecological gradients? Ecological Modelling 157, 119–129.
Functions find a small convex hull or ellipse enclosing a given proportion of points. The functions work by removing points from the hull or ellipse one by one. The points are selected either so that the area of the remaining hull is reduced as much as possible at every step (de Smith, Goodrich & Longley 2007), or removing the point that has the maximal total distance to all remaining points or longest Mahalanobis distance to the group.
peelhull(pts, keep = 0.9, criterion = c("area", "distance", "mahalanobis")) peelellipse(pts, keep = 0.9, criterion = c("area", "mahalanobis"))peelhull(pts, keep = 0.9, criterion = c("area", "distance", "mahalanobis")) peelellipse(pts, keep = 0.9, criterion = c("area", "mahalanobis"))
pts |
Coordinates of points, a two-column matrix |
keep |
Proportion of points kept |
criterion |
Criterion to remove a point on the
hull. |
Reduction of area is the only criterion that really is based the area. The other methods are based on the distances to all points within and on the hull or on the Mahalanobis distance. The functions only work in 2D.
The algorithms are naïve, and stepwise removal of single points
does not guarantee smallest possible final hull. Two outlier points
close to each other can protect each other from removal, but
removing both together would give a large reduction of the
hull. The "distance" criterion produces circular hulls, but
"mahalanobis" will better preserve the original elongation
of the configuration, although it rarely gives smaller areas than
"area" criterion.
A two-column matrix of coordinates of peeled hull or
ellipse defining that can be plotted with
polygon, with attributes
"centre" and "area".
Jari Oksanen
de Smith, M.J., Goodchild, M.F. & Longley, P.A. (2007). Geospatial analysis: A comprehensive guide to principles, techniques and software tools. Matador.
x <- matrix(c(rnorm(100), rnorm(100, sd=2)), nrow=100, ncol=2) op <- par(mar=c(0,0,1,0), mfrow=c(2,2)) ## show first ten dropped points and hull keeping 50% of points for (crit in c("area", "distance", "mahalanobis")) { plot(x, asp = 1, main = crit, xlab="", ylab="", axes=FALSE) for (p in c(100:90, 50)/100) polygon(peelhull(x, keep=p, crit=crit), col = adjustcolor("blue", alpha.f=0.04)) } plot(x, asp = 1, main = "ellipse", xlab="", ylab="", axes=FALSE) for (p in c(100:90, 50)/100) polygon(peelellipse(x, keep=p), col=adjustcolor("blue", alpha.f=0.04)) par(op) # original graphical parametersx <- matrix(c(rnorm(100), rnorm(100, sd=2)), nrow=100, ncol=2) op <- par(mar=c(0,0,1,0), mfrow=c(2,2)) ## show first ten dropped points and hull keeping 50% of points for (crit in c("area", "distance", "mahalanobis")) { plot(x, asp = 1, main = crit, xlab="", ylab="", axes=FALSE) for (p in c(100:90, 50)/100) polygon(peelhull(x, keep=p, crit=crit), col = adjustcolor("blue", alpha.f=0.04)) } plot(x, asp = 1, main = "ellipse", xlab="", ylab="", axes=FALSE) for (p in c(100:90, 50)/100) polygon(peelellipse(x, keep=p), col=adjustcolor("blue", alpha.f=0.04)) par(op) # original graphical parameters
Polar or Bray-Curtis ordination is a historic ordination method that could be performed without computers with simple hand calculations or even with ruler and compass (Bray & Curtis 1957). Ordination axis is found by selecting two extreme points and projecting all points between these end points. The current function follows Beals (1984) in selecting the endpoints, projection of points on axis, and defining the residual distances for later axes.
polarord(d, k = 2, endpoints, metric = c("euclidean", "manhattan")) ## S3 method for class 'polarord' plot(x, choices = c(1, 2), type = "t", display, ...) ## S3 replacement method for class 'polarord' sppscores(object) <- value ## S3 method for class 'polarord' eigenvals(x, ...)polarord(d, k = 2, endpoints, metric = c("euclidean", "manhattan")) ## S3 method for class 'polarord' plot(x, choices = c(1, 2), type = "t", display, ...) ## S3 replacement method for class 'polarord' sppscores(object) <- value ## S3 method for class 'polarord' eigenvals(x, ...)
d |
Dissimilarities or distances: a |
k |
Number of dimensions. |
endpoints |
Indices (not names) of endpoints for axes. This can be a single integer for the first endpoint, and the second endpoint and later axes are found automatically. If this is a vector, its non-zero values are taken as indices of endpoints of sequential axes. |
metric |
Use either |
x |
|
choices |
Axes shown. |
type |
Type of graph which may be |
display |
Items displayed: |
... |
Other arguments to the function (passed to
|
object |
|
value |
Community data to find the species scores. |
The implementation follows McCune & Grace (2002, Chapter 17). The
endpoints are found using the variance-regression method of Beals
(1984). The first endpoint has the highest variance of distances to
other points. This guarantees that the point is at the margin of
the multivariate cloud but is not an outlier, since outliers have
long distances to all points and hence low variance of
distances. The second endpoint has the lowest (most negative)
regression coefficient between distances from the first and second
point to all other points. This selects a point at the margin of
the main cloud of points, opposite to the first endpoint. All
points are projected on the axis between the endpoints, and this
gives the scores on a polar ordination axis. Then the effect of the
axis are removed by calculating residual distances. The projection
of points and calculation of residuals is based either on Euclidean
or Manhattan geometry, depending on the argument metric.
Ecological indices are usually semimetric, and negative residual
distances can emerge, but these are taken as zero in the current
function.
The eigenvalues are estimated from the dispersion of points, and
they are not necessarily in descending order. Usually only some
first axes are reliable, and too high numbers of dimensions should
be avoided. The inertia and eigenvalues give the average
dispersion, and they are consistent with dbrda
and capscale.
Polar ordination is a historical method that is little used today,
but several authors claim that it is a powerful method (see McCune
& Grace 2002). Although the basic operations can be easily
performed by hand or graphically, the later developments of
endpoint selection require more extensive calculations. With modern
numerical utilities, the polar ordination is not faster than metric
multidimensional scaling (cmdscale,
wcmdscale).
It is possible to use predefined endpoints instead of automatic selection. This can be useful for confirmatory analysis (McCune & Grace 2002). If only one endpoint is given, the others are selected automatically.
McCune & Grace (2002) suggest that Polar Ordination can be
alternatively performed with Manhattan or City-Block methods for
projecting points onto axes and calculating residual ordination,
although they say that this is "yet largerly untested". This can be
implemented by setting metric = "manhattan". However,
two-dimensional configuration cannot be recovered from its
Manhattan distances with this method, but it can be exactly
recovered from its Euclidean distances and Euclidean metric (see
Examples). Ordination is similar to the original two-dimensional
configuration only if the endpoints were chosen so that the
ordination axes are parallel to the dimensions in the original
configuration. Further, ordinations with Manhattan metric are not
invariant when rotated: Manhattan distances are calculated with
respect to the axes, and they change if axes are rotated. This also
means that fitted vectors cannot be used for environmental
variables, and the interpretation of the ordination graph needs
special and rare intuition. I warn against use of
metric = "manhattan".
Function sppscores can be used to add species scores
to the ordination result.
The function returns an object of class "polarord" with the
following elements:
points: The ordination scores.
inertia: Total inertia of the input dissimilarities.
eig: Eigenvalues of axes. These do not usually add up to
total inertia and may not be in strictly descending order.
endpoints: The indices (not the names) of the endpoints for
each axis.
call: Function call.
Jari Oksanen
Beals, E. W. (1984) Bray-Curtis ordination: an effective strategy for analysis of ecological multivariate data. Advances in Ecological Research 14, 1–55.
Bray, J. R. & Curtis, J. T. (1957) An ordination of the upland forest communities in southern Wisconsin. Ecological Monographs 27, 325–349.
McCune, B. & Grace, J. B. (2002) Analysis of Ecological Communities. MjM Software Design.
data(spurn) dis <- dist(spurn, method = "binary") ## Jaccard index ord <- polarord(dis) ord summary(eigenvals(ord)) ## add species scores sppscores(ord) <- spurn plot(ord, optimize = TRUE) ## Focus on plot X14: select it as a one endpoint ord14 <- polarord(dis, endpoints = 14) ord14 plot(ord14) ## Two-dimensional configuration recovered with Euclidean metric if (require(vegan)) { ## Procrustes analysis set.seed(1009) x <- matrix(runif(50), 25, 2) # 2-dim matrix plot(procrustes(x, polarord(dist(x)))) # Euclidean: exact plot(procrustes(x, polarord(dist(x, "man"), metric="man"))) # diverges rm(.Random.seed) # reset }data(spurn) dis <- dist(spurn, method = "binary") ## Jaccard index ord <- polarord(dis) ord summary(eigenvals(ord)) ## add species scores sppscores(ord) <- spurn plot(ord, optimize = TRUE) ## Focus on plot X14: select it as a one endpoint ord14 <- polarord(dis, endpoints = 14) ord14 plot(ord14) ## Two-dimensional configuration recovered with Euclidean metric if (require(vegan)) { ## Procrustes analysis set.seed(1009) x <- matrix(runif(50), 25, 2) # 2-dim matrix plot(procrustes(x, polarord(dist(x)))) # Euclidean: exact plot(procrustes(x, polarord(dist(x, "man"), metric="man"))) # diverges rm(.Random.seed) # reset }
Relative abundances in scale 0 to 5 for un-named zooplankton species in ponds.
data("pond")data("pond")
A data frame with 5 ponds (observations) on 8 species (variables).
This small data set was used to demonstrate dissimilarity and classification methods by Legendre & Legendre (2012).
Legendre, P. & Legendre, L. (2012). Numerical Ecology. 3rd English Ed. Elsevier, page 290.
data(pond)data(pond)
Position Vector Ordination (PVO) finds ordination axes that go through sample points in species space and maximize the projections of other points onto them. These are similar to Principal Components. Species Vector Ordination (SVO) finds a set of species that maximize the projections of other species on species axes. This also can be similar to Principal Components and can be used to select a subset of species that cover most of the variance in the data.
posvectord(x, scale = FALSE) spvectord(x, scale = FALSE) ## S3 method for class 'posvectord' plot(x, choices = c(1, 2), type = "t", display = c("sites", "species"), ...) ## S3 replacement method for class 'posvectord' sppscores(object) <- value ## S3 method for class 'posvectord' scores(x, display = c("sites", "species"), choices = 1:2, ...)posvectord(x, scale = FALSE) spvectord(x, scale = FALSE) ## S3 method for class 'posvectord' plot(x, choices = c(1, 2), type = "t", display = c("sites", "species"), ...) ## S3 replacement method for class 'posvectord' sppscores(object) <- value ## S3 method for class 'posvectord' scores(x, display = c("sites", "species"), choices = 1:2, ...)
x |
Input data. |
scale |
Scale variables to unit variance. |
choices |
Axes shown. |
type |
Type of plot: |
display |
Scores shown. |
... |
Other arguments (passed to |
object |
Ordination object to which species scores should be added. |
value |
Community for which species scores should be added. |
Position Vector Ordination (PVO, function posvectord) is a
simple educational method that resembles Principal Component
Analysis, PCA (Orlóci 1966, 1973a). Function posvectord
positions vectors from the data centroid (origin) to sample points
in species space and evaluates the projections of other sample
vectors on these positioned axes, and selects the one with highest
total projection as the ordination axis. Then the effects of that
selected axis are removed from the covariance matrix, zeroing the
row and column of selected sampling unit, and the process is
repeated. Principal Components maximize this projection, but PVO
axes can be close to the Principal Component, in particular in the
large data sets with many observations. The method was proposed as
a computationally light approximation to PCA suitable for the
computers of 1960s (Orlóci 1966). Now it can only be regarded as
historically interesting, and also as an educational tool in
introducing PCA.
Species Vector Ordination (SVO; function spvectord) is
similar, but it picks the species vector that maximizes the species
projected onto that vector (Orlóci 1973b). The method picks the
species or variable that explains largest proportion of variance
and uses the centred values of this variable as the axis. Then it
orthogonalizes all species or variables to that selected axis and
repeats the selection process. PCA finds a linear combination of
species or variables that minimizes the residual variance, but in
SVO only one species or variable is used. The axes are named by the
species or variables, and the axis scores are the centred
(residual) observed values. The resulting plot has orthogonalized
set of species as axes. SVO can be similar to PCA, in particular
when some few species contribute most of the the total
variance. The method only has educational use in explaining PCA in
species space. Orlóci (1973b) suggesed SVO as a method of selecting
a subset of species or variables that contributes most of the
variance in the data. The current spvectord function can
also be used for this purpose, although it is mainly geared for
ordination. dave package has function orank
specifically written for variable or species selection using the
same algorithm.
Function sppscores can add species scores to the ordination
results. The species scores should be regarded as arrow heads of
vectors. Although the axes of spvectord are the species, the
corresponding arrow is not usually parallel to the axis after axis
1. the axis shows the residual abundance of the species after
previous axes whereas the species score shows the correlated
response with other species.
posvectord returns an object of class
"posvectord", and spvectord returns an object of
class "spvectord" that inherits from
"posvectord". Result objects have the following
elements:
points: The ordination scores. In SVO, these are
named by the species (variable) the axes is based on, the
numerical scores are the centred (residual) values of
observed data. In PVO, they are called as PVO, and they are
scaled similarly as in PCA and can reconstitute
(a low-rank approximation of) covariances.
totvar: The total variance in the input data.
eig: Eigenvalues of axes.
posits: Indices of points used to position axes in
posvectord.
Jari Oksanen
Orlóci, L. (1966) Geometric models in ecology. I. The theory and application of some ordination methods. J. Ecol. 54: 193–215.
Orlóci, L. (1973a) Ordination by resemblance matrices. In: R. H. Whittaker (ed.) Ordination and Classification of Communities. Handbook of Vegetation Science 5: 249–286.
Orlóci, L. (1973b) Ranking characters by dispersion criterion. Nature 244: 371–373.
polarord (Polar Ordination) is a similar
educational tool to approximate Principal Coordinates Analysis
(PCoA).
data(spurn) m <- spvectord(spurn) sppscores(m) <- spurn plot(m) m <- posvectord(spurn) m plot(m) if (require(vegan, quietly = TRUE)) { ## similar to PCA plot(procrustes(pca(spurn), m, choices=1:2)) }data(spurn) m <- spvectord(spurn) sppscores(m) <- spurn plot(m) m <- posvectord(spurn) m plot(m) if (require(vegan, quietly = TRUE)) { ## similar to PCA plot(procrustes(pca(spurn), m, choices=1:2)) }
Rao's quadratic entropy (Rao 1982) is a generalization of Simpson diversity index to a situation where species are non-independent. The species dependences are expressed as pairwise dissimilarities, where 1 means independent species pair, and 0 a completely aliased species pair. Rao's distance (1982) is a similar generalization for distances between sampling units with non-independent species. Typically species distances are based on phylogenetics, taxonomy or species traits, and these measures are called phylogenetic or functional diversities and distances.
Function raostand modified data so that it is compatible to
other functions, and Rao's quadratic entropy and Rao distances can
be directly found from the standarized data.
qrao(x, d, na.rm = FALSE, dmax) distrao( x, d, method = c("jensen", "euclidean", "standardized"), propx = TRUE, dmax ) raostand(x, d, propx = TRUE, dmax)qrao(x, d, na.rm = FALSE, dmax) distrao( x, d, method = c("jensen", "euclidean", "standardized"), propx = TRUE, dmax ) raostand(x, d, propx = TRUE, dmax)
x |
Community data. |
d |
Phylogenetic distances or other dissimilarities among species. |
na.rm |
Should missing values be removed? |
dmax |
Scale dissimilarities by |
method |
Distance measure to be used (see Details). |
propx |
Standardize row of |
Rao's quadratic entropy is ,
where is the index of the sampling unit, and
are indices of species, is the proportion of
species in the sampling unit, and are distances among
species. The distances should be scaled to range , and
they are divided by the observed maximum if this exceeds
1. Alternatively, the distances are divided by argument dmax
instead of data maximum. Distances that are longer than
dmax are truncated to the maximum value. The square roots of
distances should be Euclidean, but this is not verified. They are
Euclidean if there are no negative eigenvalues in the principal
coordinates analysis, and ade4 package has function
is.euclid for a canned test. If all distances
are 1, species are independent and qrao will return Simpson
diversity.
Function distrao finds distances based on quadratic entropy
for sampling units. Rao (1982) suggested distance ,
where and are Rao entropies for
sites and and is the similar
entropy evaluated so that the species proportions are from
different sampling units. Rao (1982) called this as Jensen
distance, and it is half of the squared Euclidean distance. The
Euclidean distance can also be requested. In addition, Rao (1982)
suggested a standardized distance that is based on logarithms of
elements .
Function raostand standardizes data similarly as is
implicitely done in qrao and raodist. Function
qrao always standardizes rows to unit sum, but this is
optional in raodist and raostand. For standardized
data Z, quadratic entropy is found as 1 -
rowSums(Z^2), and Rao distances can be found via Euclidean
distances of Z. The standardized data allows calculating any
generic community dissimilarity, and using Z in
rda allows performing phylogenitically
constrained RDA. The standardization does not preserve absences,
but zero abundances will be boosted to positive values when the
sampling unit has related species. More details can be found in
vignette "raostand".
qrao returns a vector of Rao's quadratic entropy
values and distrao distances of class "dist".
raostand returns data standardized so that it is
equivalent to the internal form used in qrao or
distrao, but can be used in any analysis.
Jari Oksanen
Rao, C.R. (1982) Diversity and dissimilarity coefficients: a unified approach. Theoretical Population Biology 21, 24–43.
There are other implementations of this function in
R. Most notably functions divc and
disc in ade4. However, these may
square input dissimilarities and divide results by 2 depending
on options. Function taxondist provides Clarke's
taxonomic dissimilarity.
if (require(vegan)) { data(dune, dune.phylodis) qrao(dune, dune.phylodis) tabasco(dune, hclust(distrao(dune, dune.phylodis)), hclust(dune.phylodis)) ## regard lineages completely distinct beyond K/T (K/Pg) limit qrao(dune, dune.phylodis, dmax = 65.2) } ## Rao standardization ## Phylogenetic diversity data(dune, dune.phylodis, dune.env) Z <- raostand(dune, dune.phylodis) all.equal(1 - rowSums(Z^2), qrao(dune, dune.phylodis), check.attributes = FALSE) ## Phylogenetic distance all.equal(dist(Z), distrao(dune, dune.phylodis, method="euclidean"), check.attributes = FALSE) ## plot with standardized values tabasco(Z, hclust(dist(Z)), hclust(dune.phylodis)) ## phylogenetic polar ordination pol <- polarord(vegdist(Z)) sppscores(pol) <- Z plot(pol, optimize = TRUE) ## Phylogenetically constrainted RDA mod <- rda(raostand(dune, dune.phylodis, propx = FALSE, dmax = 65.2) ~ Management + Moisture, dune.env) anova(mod, by = "margin") plot(mod, scaling = "sites", spe.par = list(optimize = TRUE, xpd = TRUE))if (require(vegan)) { data(dune, dune.phylodis) qrao(dune, dune.phylodis) tabasco(dune, hclust(distrao(dune, dune.phylodis)), hclust(dune.phylodis)) ## regard lineages completely distinct beyond K/T (K/Pg) limit qrao(dune, dune.phylodis, dmax = 65.2) } ## Rao standardization ## Phylogenetic diversity data(dune, dune.phylodis, dune.env) Z <- raostand(dune, dune.phylodis) all.equal(1 - rowSums(Z^2), qrao(dune, dune.phylodis), check.attributes = FALSE) ## Phylogenetic distance all.equal(dist(Z), distrao(dune, dune.phylodis, method="euclidean"), check.attributes = FALSE) ## plot with standardized values tabasco(Z, hclust(dist(Z)), hclust(dune.phylodis)) ## phylogenetic polar ordination pol <- polarord(vegdist(Z)) sppscores(pol) <- Z plot(pol, optimize = TRUE) ## Phylogenetically constrainted RDA mod <- rda(raostand(dune, dune.phylodis, propx = FALSE, dmax = 65.2) ~ Management + Moisture, dune.env) anova(mod, by = "margin") plot(mod, scaling = "sites", spe.par = list(optimize = TRUE, xpd = TRUE))
Similar to Fortran implementation in vegan, but all in R.
rdecorana( x, iweigh = 0, iresc = 4, ira = 0, mk = 26, short = 0, before = NULL, after = NULL ) ## S3 method for class 'rdecorana' scores(x, display = "sites", choices = 1:4, origin = FALSE, ...)rdecorana( x, iweigh = 0, iresc = 4, ira = 0, mk = 26, short = 0, before = NULL, after = NULL ) ## S3 method for class 'rdecorana' scores(x, display = "sites", choices = 1:4, origin = FALSE, ...)
x |
|
iweigh |
Downweighting of rare species (0: no). |
iresc |
Number of rescaling cycles (0: no rescaling). |
ira |
Type of analysis with numeric values |
mk |
Number of segments in classic detrending. |
short |
Shortest gradient to be rescaled. |
before, after
|
Definition of Hill's piecewise transformation. |
display |
Scores for |
choices |
Axes to be returned. |
origin |
Return centred scores. |
... |
Other arguments passed to functions. |
This function mimics vegan function
decorana, but is written in R. It is
slower, and does not have all features and support of the
vegan function, and there usually is no need to use this
function for data analysis. The function serves two
purposes. First, as an R function it is easier to inspect the
algorithm than from C or Fortran code. Second, it is more hackable,
and it is easier to develop new features, change code or replace
functionality than in the compiled code. For instance, it would be
trivial to add Detrended Constrained Correspondence Analysis, but
this would be impossible without extensive changes in Fortran in
the vegan function.
The main function rdecorana performs janitorial tasks, and
for detrended CA it has basically only two loops. The first loop
finds the next correspondence analysis axis for species and sites,
and the second loop performs optional (if iresc > 0 and
ira != 1) nonlinear rescaling of that axis.
Two non-exported functions are used for finding the correspondence
analysis axis: transvu runs one step of reciprocal averaging
and the step is repeated so many times that the criterion value
converges. Function transvu calls detrend that
performs detrending. The default is to use Hill's non-linear
rescaling on mk segments, but this can be substituted with
orthogonalization for classic correspondence analysis,
a.k.a. reciprocal averaging. It is also easy to add new alternative
detrending methods that are not a part of
decorana (see function natto:::detrend
for examples). Detrending against previous axis is done one axis
by time starting from first, and then going again down to the
first. For axis 4 the order of detrendings is 1, 2, 3, 2, 1 (in
orthogonal correspondence analysis with ira=1 the going down
is skipped). The criterion value is the one that vegan
decorana calls ‘Decorana values’: for
orthogonal CA it is the eigenvalue, but for detrended CA it is a
combination of eigenvalues and strength of detrending.
Non-linear rescaling is performed by function stretch that
calls two functions: segment to estimate the dispersion of
species, and smooth121 to smooth those estimates on segments
(that number of segments is selected internally and mk has
no influence). The criterion values for community data with
species and site scores are based on
summarized over sites
. Site scores are weighted averages of species
scores , and therefore species scores should be dispersed
symmetrically around corresponding site scores and the statistic
describes their weighted dispersion. The purpose of rescaling is to
make this dispersion 1 all over the axis. The procedure is
complicated, and it is best inspected from the code (which is
commented).
Currently returns a list of elements evals of
Decorana values, with rproj and cproj of scaled
row and column scores.
Row and column scores and the convergence criterion which for orthogonal CA is the eigenvalue.
Jari Oksanen
For serious purposes, use decorana in
vegan. Function gradrescale rescales
observed gradients similarly as decorana rescales axes.
data(spurn) (mod <- rdecorana(spurn)) ## current detrending methods natto:::detrend if (require(vegan)) { ## compare to decorana decorana(spurn) ## Comparison of detrending choices op <- par(mfrow=c(2,2), mar=c(4,4,1,1)) ordiplot(rdecorana(spurn)) # default: DCA ordiplot(rdecorana(spurn, ira=2)) # quadratic DCA ordiplot(rdecorana(spurn, ira=3)) # smooth DCA ordiplot(rdecorana(spurn, ira=4)) # smoothing spline with CV par(op) }data(spurn) (mod <- rdecorana(spurn)) ## current detrending methods natto:::detrend if (require(vegan)) { ## compare to decorana decorana(spurn) ## Comparison of detrending choices op <- par(mfrow=c(2,2), mar=c(4,4,1,1)) ordiplot(rdecorana(spurn)) # default: DCA ordiplot(rdecorana(spurn, ira=2)) # quadratic DCA ordiplot(rdecorana(spurn, ira=3)) # smooth DCA ordiplot(rdecorana(spurn, ira=4)) # smoothing spline with CV par(op) }
Community is divided into given number of pieces, and randomly
rarefied community is derived for each of these pieces using
rrarefy. The final community is a sum of
rarefied pieces. The purpose is to mimick sampling variability of
the observed community. However, the process loses rare species,
and generated communities have lower species richness and diversity
than the original one.
recomm(x, pieces = 4)recomm(x, pieces = 4)
x |
communities to be rebuilt |
pieces |
number of pieces used to rebuild the community |
Randomized community data.
Jari Oksanen
Function finds residual species association after fitting the
constraints (and conditions) in constrained ordination
(cca, rda).
resassoc.cca(x, rank)resassoc.cca(x, rank)
x |
|
rank |
Number of unconstrained ordination axes that are used
to compute the species associations. If missing, the rank is
chosen by brokenstick distribution
( |
The residual unconstrained axes can be used to assess if there is
any important unexplained variation in constrained ordination. This
approach was proposed already in the first papers on constrained
ordination (ter Braak 1986). However, this has been rarely done,
mainly because there are no easily available tools to assess the
unconstrained axes. Bayesian analysis of species communities
(Tikhonov et al. 2020) is conceptually similar to constrained
ordination. There the random effects are analysed via
correlated species responses that are interpreted to be caused by
unknown environmental variables or residual inter-species
association. The random effects are found as Bayesian latent
factors which are conceptually similar to residual axes in
constrained ordination (although these are maximum likelihood
principal components). It is customary represent these residual
correlations as ordered correlation matrix (Tikhonov et al
2017). This function provides similar tools for constrained
correpsondence analysis (CCA) and redundancy analysis (RDA) as
perfomed in vegan functions cca and
rda.
In Bayesian framework, we only try to extract a low number of
latent factors needed to describe correlated random variation. In
constrained ordination, all residual variation is accounted for by
residual axes. For meaningful analysis, we should only look at some
first axes which may reflect systematic unexplained variation. With
all axes, the residual correlations are usually nearly zero and
systematic variation is masked by non-systematic noise. Therefore
we should only have a look at some first axes. Users can either
specify the number of axes used, and the default is to use
brokenstick distribution (vegan function
(bstick) to assess the number of axes that
have surprisingly high eigenvalues and may reflect unaccounted
systematic variation. The function returns inter-species
associations as correlation-like numbers, where values nearly zero
mean non-associated species. These values can be plotted in
correlation plot or analysed directly. The correlations are
directly found from the residual ordination axes.
It is an attractive idea to interpret inter-species association to describe ecological interactions among species (Tikhonov et el. 2017). This indeed is possible, but there are other alternative explanations, such as missing environmental variables or modelling errors (model formula, scaling of species, scaling of environmental variables, assumptions on the shape of species response etc.). You should be cautious in interpreting the results (and the same applies also to the analogous Bayesian model).
Correlation-like association matrix between species with
added argument "rank" of the unconstrained data used to
compute associations.
Jari Oksanen
ter Braak, C.J.F. (1986) Canonical correspondence analysis: a new eigenvector technique for multivariate direct gradient analysis. Ecology 67, 1167–1179.
Tikhonov, G., Abrego, N., Dunson, D. and Ovaskainen, O. (2017) Using joint species distribution models for evaluating how species-to-species associations depend on the environmental context. Methods in Ecology and Evolution 8, 443- 452.
Tikhonov et al. (2020) Joint species distribution modelling with the R-package Hmsc. Methods in Ecology and Evolution 11, 442–447.
Hmsc has analogous function
computeAssociations. However, in Hmsc the
associations are based on Bayesian latent factors which are an
essential natural component of the analysis whereas here we
apply post-analysis tricks to extract something similar.
library(vegan) data(dune, dune.env) mod <- cca(dune ~ A1 + Moisture, dune.env) resassoc.cca(mod)library(vegan) data(dune, dune.env) mod <- cca(dune ~ A1 + Moisture, dune.env) resassoc.cca(mod)
Species distribution models (SDM) often use methods that give continuous fitted responses for the probability of species occurrence. Some users want to find a threshold that splits these continuous responses to binary outcomes of presence and absence (Allouche, Tsoar & Cadmon 2006). This function finds a threshold cutoff that maximizes the explained deviance and gives as true approximation of the original continuous response as possible.
responsecut(object)responsecut(object)
object |
Fitted |
The function evaluates the deviance of the original continuous
model by setting a cutoff at each fitted value and evaluating
the deviance explained with this threshold. The threshold is
inclusive: values at or above the threshold are regarded as being
hits. The deviance profile is usually jagged, and the method is
sensitive to single data points, and can give disjunct regions of
nearly equally good response cutoffs. Function plot will
display the deviance profile, and summary lists all
thresholds that are nearly as good as the best cutoff using
Chi-square distribution with 1 degree of freedom at as
the criterion.
The threshold model has two values or average responses below and at or above the cutoff, but these fitted threshold responses are not usually 0 and 1. However, they are the values that maximize the explained deviance of a two-value model.
The summary also gives a Deviance table showing the original
Null deviance, the original model deviance, and between these the
residual deviance with the binary threshold, and the differences of
these deviances in the second model. The result object can also be
accessed with coef that returns the cutoff value of
fitted response, deviance that returns the residual
deviance, and fitted that returns the fitted two
values for each original observation.
The function returns an object of class "responsecut" with
following elements:
coefficients: optimal splitting threshold of fitted values.
expl.deviance: explained deviance
deviance: residual deviance at threshold.
cutoff, devprofile: sorted possible thresholds (i.e.,
estimated fitted values) and associated explained deviances.
formula, null.deviance, orig.deviance:
formula, null deviance and deviance of the input
model.
values: fitted values below and above threshold.
fitted: fitted values for each observation.
Jari Oksanen
Allouche, O., Tsoar, A. & Kadmon, R. (2006) Assessing the accurraccy of species distribution models: prevelance, kappa and the true skill statistic (TSS). Journal of Applied Ecology 43, 1223–1232.
## Gaussian response of Bauera rubioides on Mt Field data(mtfield5) mod <- glm(Bauerubi ~ poly(Altitude, 2), data=mtfield5, family=binomial) thr <- responsecut(mod) thr summary(thr) plot(Bauerubi ~ Altitude, mtfield5) lines(fitted(mod) ~ Altitude, mtfield5, lwd=2) # data ordered by Altitude lines(fitted(thr) ~ Altitude, mtfield5, type="s", col=2) abline(h = coef(thr), lty=2, col=2) legend("topright", c("Gaussian model", "Threshold model", "Cutoff level"), lty=c(1,1,2), col=c(1,2,2), lwd=c(2,1,1)) plot(thr)## Gaussian response of Bauera rubioides on Mt Field data(mtfield5) mod <- glm(Bauerubi ~ poly(Altitude, 2), data=mtfield5, family=binomial) thr <- responsecut(mod) thr summary(thr) plot(Bauerubi ~ Altitude, mtfield5) lines(fitted(mod) ~ Altitude, mtfield5, lwd=2) # data ordered by Altitude lines(fitted(thr) ~ Altitude, mtfield5, type="s", col=2) abline(h = coef(thr), lty=2, col=2) legend("topright", c("Gaussian model", "Threshold model", "Cutoff level"), lty=c(1,1,2), col=c(1,2,2), lwd=c(2,1,1)) plot(thr)
Function sdca is similar to decorana
and rdecorana, but instead of detrending by segments
it uses loess for smooth non-linear detrending.
sdca(Y, iweigh = FALSE, pairwise = FALSE, monitor = FALSE, ...)sdca(Y, iweigh = FALSE, pairwise = FALSE, monitor = FALSE, ...)
Y |
Input data. |
iweigh |
Downweight rare species. |
pairwise |
Detrend axis |
monitor |
Turn on graphical monitoring of detrending for each axis. |
... |
Other arguments (passed to |
Detrended correspondence analysis (DCA) as implemented in
decorana and rdecorana tries to
remove all systematic biases between ordination axes by detrending
later axes against previous ones (Hill & Gauch 1980). DCA uses an
ingenious method of detrending by axis segments to allow removing
non-linear dependencies. Detrending means taking residuals against
smoothed segment means as the new ordination scores for the current
axis. It has been suggested that abrupt changes at segment borders
can cause some problems in DCA. The current function replaces
segmented detrending with detrending against smooth
loess functions. However, in many cases this changes
the results little from the original detrending by segments.
The detrending for axes 3 and 4 is perfomed either using all
previous axes simultaneously in loess (default) or if
pairwise=TRUE performing sequential separate detrending
against each previous axis separately so that both the first and
last detrending are performed against axis 1 similarly as in the
original decorana. For axis 3 the sequence is against axes
1, 2, 1 and for axis 4 against axes 1, 2, 3, 2, 1.
The decorana software made several other
innovations than detrending (it may be easier to inspect the code
in rdecorana written in R). Rescaling of axes is
often more influential in application than the actual detrending,
like you can see by using decorana without
rescaling.
Function returns a subset of decorana
result object and can use many decorana methods (such as
plot).
Jari Oksanen
Hill, M.O. and Gauch, H.G. (1980). Detrended correspondence analysis: an improved ordination technique. Vegetatio 42, 47–58.
data(spurn) mod <- sdca(spurn) mod plot(mod, display = "sites") plot(mod, display = "species", optimize = TRUE) ## compare against rdecorana without rescaling mod0 <- rdecorana(spurn, iresc = 0) mod0 if (require(vegan)) { plot(procrustes(mod0, mod, choices=1:2)) }data(spurn) mod <- sdca(spurn) mod plot(mod, display = "sites") plot(mod, display = "species", optimize = TRUE) ## compare against rdecorana without rescaling mod0 <- rdecorana(spurn, iresc = 0) mod0 if (require(vegan)) { plot(procrustes(mod0, mod, choices=1:2)) }
Functions fit linear regression on distances within blocks and across blocks and compere their slopes using permutation tests.
slopediff(ydist, xdist, block, nperm = 999) slopediff2(ydist, xdist, block, nperm = 999) ## S3 method for class 'slopediff' plot(x, col = 1, obsline = TRUE, ...) ## S3 method for class 'slopediff' summary(object, ...)slopediff(ydist, xdist, block, nperm = 999) slopediff2(ydist, xdist, block, nperm = 999) ## S3 method for class 'slopediff' plot(x, col = 1, obsline = TRUE, ...) ## S3 method for class 'slopediff' summary(object, ...)
ydist, xdist
|
Dependent and indpendent dissimilarities. |
block |
Blocking of sites. |
nperm |
Number of simple permutations. |
x |
Result output from |
col |
Plotting colour. |
obsline |
Draw line on observed statistic in the
|
... |
Other parameters passed to |
object |
Result object from |
Functions are intended to analyse the differences of linear Mantel
regression within and across blocks. Typically the dependent
dissimilarities (ydist) are based on community data,
independent dissimilarities (xdist) are based on
environmental data, and block defines subsets of data, such
as geographical subareas.
Functions slopediff and slopediff2 differ in their
permutation models. Function slopediff permutes the
classification vector defining blocks, and keeps the
dependent (ydist) and independent (xdist) pairs of
dissimilarities fixed. Function slopediff2 permutes
dependent data, and keeps blocks and dependent distances
fixed.
Functions return the differences of regression coefficients (slopes).y
Jari Oksanen
Shimwell (1971) used the Spurn Point Dune Scrub data as an example well suited for Braun-Blanquet tabular rearrangement method.
data("spurn")data("spurn")
A data frame with 20 sample plots on the following 40 species.
ElaerhamElaeagnus rhamnoides (as Hippophae)
JacovulgJacobaea vulgaris (as Senecio jacobaea)
SoladulcSolanum dulcamara
RubufrutRubus fruticosus s.lat.
UrtidioiAUrtica dioica
RumecrisRumex crispus
ClayperfBClaytonia perfoliata (as Montia)
StelmediBStellaria media
FestrubrCFestuca rubra
ElymrepeCElymus repens (as Agropyron)
AmmoarenCAmmophila arenaria
SoncarveSonchus arvensis
OnonspinOnonis spinosa (as O. repens)
GaliveruGalium verum
CalysoldCalystegia soldanella
PoapratCPoa pratensis
AgrostolAgrostis stolonifera
RanubulbCRanunculus bulbosus
PlanlancCPlantago lanceolata
VerochamVeronica chamaedrys
EpilanguEpilobium angustifolium (as Chamaenerion)
CerafontBCerastium fontanum (as C. vulgatum)
SambnigrSambucus nigra
CirsvulgBCirsium vulgare
HerasphoHeracleum sphondylium
InulconyInula conyza
CardhirsBCardamine hirsuta
HyporadiHypochaeris radicata
ArrhelatArrhenatherum elatius
SoncaspeSonchus asper
EurhpraeAEurhynchium praelongum
HypncuprHypnum cupressiforme
BracrutaBrachythecium rutabulum
GeasfornBGeastrum fornicatum
BracalbiCBrachythecium albicans
BryuinclBryum inclinatum
SyntruraSyntrichia ruralis (as Tortula ruraliformis)
CladranfCladonia rangiformis
BovinigrBovista nigrescens
LophheteLophocolea heterophylla
The species abundances were visually assessed with Braun-Blanquet scale values +, 1 ... 5 which are presented in numeric scale 1 ... 6 in the data set.
Shimwell (1971) used the data set to demonstrate the Braun-Blanquet tabular rearrangement method. He divided the data into three classification units marked as A, B and C, and removed one data set marked as X. The sample plots are named with consecutive numbers after a letter corresponding to these classes. The diagnostic taxa have a corresponding upper case letter after their name.
Shimwell, D. W. (1971) Description and Classification of Vegetation (Sidgwick & Jackson), Table 44.
data(spurn)data(spurn)
Extract Subdiagonal from Dissimilarities or from a Square Matrix
subdiag(x)subdiag(x)
x |
Square matrix or a |
vector of subdiagonal values.
Jari Oksanen
Poisson random numbers lose species as they introduce new zeros for low expected values. They are also unable to compensate this by generating new "unseen" species. To compensate this, we first use Swan transformation with one pass to generate above-zero expected values for unseen species. After generating the Swan probabilities for missing species we standardize all species to their original totals; this is a similar idea as the species coverage in the Good-Turing model (Good 1953). Finally, a community is generated as a Poisson random variate of adjusted observed abundances and Swan probabilities. The process maintains average species richness and diversity, but Poisson distribution probably underestimates the real sampling variance.
swanrandom(x)swanrandom(x)
x |
community data of integer counts |
Randomized community matrix.
Jari Oksanen
Good, I.J. (1953) The population frequencies of species and the estimation of population parameters. Biometrika 40, 237–264.
Taxonomic dissimilarity takes into account related species when there is no exact species match in compared communities. This function returns generalizations of Bray-Curtis and Kulczynski indices (Clarke _et al._ 2006).
taxondist(x, d, method = c("gamma", "theta"), dmax)taxondist(x, d, method = c("gamma", "theta"), dmax)
x |
Community data; will be treated as binary presence/absence matrix. |
d |
Taxonomic, phylogenetic or other dissimilarities among
species (columns of |
method |
Type of returned dissimilarity index. Gamma is generalized Bray-Curtis, and Theta generalized Kulczynski index. |
dmax |
Scale dissimilarities by |
When a species occurs only in one of two compared communities, the species will increase dissimilarity of two communities with one species. However, if the other communities contains related species, the increase will be less depending on the relatedness of species (Clarke _et al._ 2006). The degree of relatedness is defined by taxonomic or other taxon dissimilarities.
The current function generalizes Bray-Curtis and Kulczynski
indices (see canneddist) for binary
(presence-absence) data by taking the dissimilarity to most
closely related species as the increase of dissimilarity,
instead of 1 of basic index. If species dissimilarities have
any values over 1, they will be scaled by observed
maximum. Alternatively, user can specify maximum species
dissimilarity (argument 'dmax') for scaling, and if it is
lower than data maximum, higher values will be truncated to
scaled value 1. This allows imposing stricter concept of
relatedness so that, say, taxa in different orders will be
regarded as completely unrelated.
Although the function was proposed and is named as taxonomic dissimilarity, the function can be used with phylogenetic, functional or trait dissimilarities.
Function distrao provides an alternative index
that can handle quantitative data and produce indices related
to Euclidean distance.
Clarke's taxonomic dissimilarity index as defined in
method.
Jari Oksanen
Clarke, K.R., Somerfield, P.J. & Chapman, M.G. (2006). On resemblance measures for ecological studies, including taxonomic dissimilarities and a zero-adjusted Bray-Curtis coefficient for denuded assemblages. _J. Exp. Marine Biol. & Ecol._ 330, 55-80.
distrao is an alternative taxonomic distance
measure. The index is closely related to Clarke's taxonomic
diversity indices which are available in vegan function
taxondive. Vegan function
taxa2dist can be used to find taxonomic
dissimilarities from classification table.
if (require(vegan)) { data(dune, dune.phylodis) ## phylogenetic data, but regard lineaages completely distict ## beyond K/T (K/Pg) age limit d <- taxondist(dune, dune.phylodis, dmax = 65.2) ord <- polarord(d) sppscores(ord) <- dune plot(ord) cl <- hclust(d) tabasco(dune, cl, hclust(dune.phylodis)) }if (require(vegan)) { data(dune, dune.phylodis) ## phylogenetic data, but regard lineaages completely distict ## beyond K/T (K/Pg) age limit d <- taxondist(dune, dune.phylodis, dmax = 65.2) ord <- polarord(d) sppscores(ord) <- dune plot(ord) cl <- hclust(d) tabasco(dune, cl, hclust(dune.phylodis)) }