Consistency of clustering analysis of complex 3D ocean datasets

Millington, R, Partridge, D, Powley, HR, Lessin, G, Moffat, D and Blackford, JC 2025 Consistency of clustering analysis of complex 3D ocean datasets. Ecological Informatics, 93. 103586. 10.1016/j.ecoinf.2025.103586

1-s2.0-S1574954125005953-main (1).pdf - Published Version
Available under License Creative Commons Attribution.

Official URL: https://doi.org/10.1016/j.ecoinf.2025.103586

Abstract/Summary

Rapid advancement of machine learning and artificial intelligence is enabling new analysis techniques to be applied across all fields of scientific research. To aid analysis of the physical or biogeochemical characteristics of the ocean, marine systems have been subdivided into spatial regions where properties exhibit similar distributions or behaviour, such as the Longhurst provinces. Machine learning techniques enable the identification of spatial regions in a robust and transferable way. In this paper we drive clustering algorithms with a variety of input datasets to assess the consistency of resulting clusters. We compare the results of clustering analyses applied separately to physical, biogeochemical and ecological variables at different depths, using model output from a 3D hydrodynamical-biogeochemical model (NEMO-ERSEM) on the Northwest European shelf. Clustering outcomes depended on both the variables and depths input into the algorithm, although some similarities still existed in spatial patterns across the clustering analyses, e.g. clusters were smaller near the coast and relatively extensive in the open ocean. Clusters based on physical properties showed latitudinal distribution, while biogeochemical and ecological inputs resulted in a higher concentration of clusters near the coast. Results from depth-averaged and near-bottom inputs were similar and followed the limits of the shelf-edge, unlike clusters based on surface inputs. Overall, clustering algorithms offer a useful method to define spatial regions with similar characteristics; however, our results emphasise that input data choices should be carefully considered. Our results provide a knowledge foundation which can help future researchers make informed decisions when applying clustering to complex datasets.
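
As an illustration of what comparing clusterings for consistency can involve, the sketch below scores the agreement between two cluster maps defined on the same grid cells. The abstract does not name the metric or the clustering algorithm used in the paper, so the choice of the adjusted Rand index here is an assumption, and the function name and the tiny label arrays are hypothetical examples rather than model output. The index is 1 when two maps define the same regions (regardless of how the clusters are numbered) and close to 0 when they agree no more than chance.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two labelings of the same cells.

    Illustrative metric only; not confirmed as the paper's method.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same grid cells")
    n = len(labels_a)
    if n < 2:
        return 1.0  # fewer than two cells: nothing to disagree about

    pairs_total = comb(n, 2)
    # Number of cell pairs placed in the same cluster by both maps,
    # by map A alone, and by map B alone.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / pairs_total  # agreement expected by chance
    max_index = 0.5 * (together_a + together_b)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both maps have a single cluster
    return (together_both - expected) / (max_index - expected)


# Hypothetical cluster labels for eight grid cells (not real model output).
depth_averaged = [0, 0, 0, 1, 1, 2, 2, 2]
near_bottom = [1, 1, 1, 0, 0, 2, 2, 2]   # same regions, different cluster numbering
physical = [0, 0, 0, 0, 1, 1, 1, 1]
ecological = [0, 0, 1, 1, 0, 0, 1, 1]    # regions cut across the physical ones

same_regions_score = adjusted_rand_index(depth_averaged, near_bottom)
crossing_regions_score = adjusted_rand_index(physical, ecological)
print(same_regions_score, crossing_regions_score)
```

A pair-counting score like this is insensitive to cluster numbering, which matters because separate clustering runs assign labels arbitrarily.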

Item Type: Publication - Article
Additional Keywords: Machine learning; Clustering; Biogeochemical model; Ocean model; Northwest European shelf; Marine ecosystem
Divisions: Plymouth Marine Laboratory > National Capability categories > National Capability Modelling
Plymouth Marine Laboratory > Science Areas > Environmental Intelligence
Depositing User: S Hawkins
Date made live: 12 Mar 2026 13:36
Last Modified: 12 Mar 2026 15:02
URI: https://plymsea.ac.uk/id/eprint/10586
