Clustering in non-parametric multivariate analyses.

Clarke, KR; Somerfield, PJ; Gorley, RN. 2016. Clustering in non-parametric multivariate analyses. Journal of Experimental Marine Biology and Ecology, 483, 147-155. doi:10.1016/j.jembe.2016.07.010

JEMBE-S-16-00318.pdf - Submitted Version


Abstract/Summary

Non-parametric multivariate analyses of complex ecological datasets are widely used. Following appropriate pre-treatment of the data, inter-sample resemblances are calculated using appropriate measures. Ordination and clustering derived from these resemblances are used to visualise relationships among samples (or variables). Hierarchical agglomerative clustering with group-average (UPGMA) linkage is often the clustering method chosen. Using an example dataset of zooplankton densities from the Bristol Channel and Severn Estuary, UK, a range of existing and new clustering methods are applied and the results compared. Although the examples focus on analysis of samples, the methods may also be applied to species analysis. Dendrograms derived by hierarchical clustering are compared using cophenetic correlations, which are also used to determine the optimum β in flexible beta clustering. A plot of cophenetic distances against original dissimilarities reveals that a tree may be a poor representation of the full multivariate information. UNCTREE is an unconstrained binary divisive clustering algorithm in which values of the ANOSIM R statistic are used to determine (binary) splits in the data, to form a dendrogram. A form of flat clustering, k-R clustering, uses a combination of ANOSIM R and Similarity Profiles (SIMPROF) analyses to determine the optimum value of k, the number of groups into which samples should be clustered, and the sample membership of the groups. Robust outcomes from the application of such a range of differing techniques to the same resemblance matrix, as here, result in greater confidence in the validity of a clustering approach.
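
As an illustration of the cophenetic correlation mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes group-average (UPGMA) linkage and a Pearson correlation between the original dissimilarities and the cophenetic distances (the level at which each pair of samples is first joined in the dendrogram). The function names and the 4-sample dissimilarity matrix are hypothetical and chosen only for illustration; the matrix is not taken from the zooplankton dataset.

```python
from itertools import combinations
from math import sqrt


def upgma_cophenetic(d):
    """Group-average (UPGMA) clustering of a symmetric dissimilarity matrix.

    Returns the matrix of cophenetic distances: for each pair of samples,
    the dissimilarity level at which they first fall in the same cluster.
    """
    n = len(d)
    clusters = {i: [i] for i in range(n)}  # cluster id -> member samples
    # Between-cluster dissimilarities, keyed by (smaller id, larger id).
    dist = {(i, j): d[i][j] for i, j in combinations(range(n), 2)}
    coph = [[0.0] * n for _ in range(n)]
    next_id = n
    while len(clusters) > 1:
        # Merge the two closest clusters.
        (a, b), level = min(dist.items(), key=lambda kv: kv[1])
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = level
        na, nb = len(clusters[a]), len(clusters[b])
        merged = clusters.pop(a) + clusters.pop(b)
        del dist[(a, b)]
        for c in clusters:
            dac = dist.pop((min(a, c), max(a, c)))
            dbc = dist.pop((min(b, c), max(b, c)))
            # Group average: size-weighted mean of the two old distances.
            dist[(c, next_id)] = (na * dac + nb * dbc) / (na + nb)
        clusters[next_id] = merged
        next_id += 1
    return coph


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cophenetic_correlation(d):
    """Correlate original dissimilarities with UPGMA cophenetic distances."""
    coph = upgma_cophenetic(d)
    pairs = list(combinations(range(len(d)), 2))
    original = [d[i][j] for i, j in pairs]
    tree = [coph[i][j] for i, j in pairs]
    return pearson(original, tree)


# Hypothetical dissimilarities among four samples (illustration only).
d = [
    [0.0, 0.2, 0.6, 0.7],
    [0.2, 0.0, 0.5, 0.8],
    [0.6, 0.5, 0.0, 0.3],
    [0.7, 0.8, 0.3, 0.0],
]
coph = upgma_cophenetic(d)
r = cophenetic_correlation(d)
print(f"cophenetic correlation: {r:.3f}")
```

Listing the pairs `(d[i][j], coph[i][j])` side by side gives the points for a plot of cophenetic distances against original dissimilarities; wide scatter about a monotonic trend would suggest that the dendrogram represents the resemblance matrix poorly.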

Item Type: Publication - Article
Additional Keywords: Non-parametric multivariate; divisive clustering; flat clustering; SIMPROF; cophenetic correlation; cophenetic distance
Subjects: Data and Information; Ecology and Environment; Marine Sciences
Divisions: Plymouth Marine Laboratory > Science Areas > Marine Life Support Systems
Depositing User: Dr Paul J Somerfield
Date made live: 19 Sep 2016 12:53
Last Modified: 06 Jun 2017 16:16
URI: http://plymsea.ac.uk/id/eprint/7133
