Parallel k-Means Clustering for Quantitative Ecoregion Delineation Using Large Data Sets

  • Authors: Kumar, Jitendra; Mills, Richard T.; Hoffman, Forrest M.; Hargrove, William W.
  • Publication Year: 2011
  • Publication Series: Scientific Journal (JRNL)
  • Source: Procedia Computer Science 4:1602-1611

Abstract

Identification of geographic ecoregions has long been of interest to environmental scientists and ecologists for identifying regions of similar ecological and environmental conditions. Such classifications are important for predicting suitable species ranges, for stratification of ecological samples, and to help prioritize habitat preservation and remediation efforts. Hargrove and Hoffman [1, 2] have developed geographical spatio-temporal clustering algorithms and codes and have successfully applied them to a variety of environmental science domains, including ecological regionalization; environmental monitoring network design; analysis of satellite-, airborne-, and ground-based remote sensing; and climate model-model and model-measurement intercomparison. With the advances in state-of-the-art satellite remote sensing and climate models, observations and model outputs are available at increasingly high spatial and temporal resolutions. Long time series of these high-resolution datasets are extremely large and growing. Analysis and knowledge extraction from these large datasets are not just algorithmic and ecological problems, but also pose a complex computational problem. This paper focuses on the development of a massively parallel multivariate geographical spatio-temporal clustering code for analysis of very large datasets using tens of thousands of processors on one of the fastest supercomputers in the world.
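
The abstract does not describe the implementation, so the following is only an illustrative sketch of why k-means lends itself to data parallelism, not the authors' code. It assumes the standard decomposition in which each partition of the data computes per-cluster coordinate sums and point counts, and those partial results are then summed to update the centroids. The function names `assign_and_accumulate` and `kmeans_chunked`, the chunk layout, and the stopping tolerance are all hypothetical; the chunks are processed in a plain loop here, where a distributed code would run them concurrently and combine the results with a global sum reduction.

```python
from math import dist


def assign_and_accumulate(chunk, centroids):
    """Per-chunk step: assign each point to its nearest centroid and
    return per-cluster coordinate sums and point counts."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for point in chunk:
        nearest = min(range(k), key=lambda j: dist(point, centroids[j]))
        counts[nearest] += 1
        for d in range(dim):
            sums[nearest][d] += point[d]
    return sums, counts


def kmeans_chunked(chunks, centroids, max_iter=100, tol=1e-9):
    """Run k-means over data split into chunks (hypothetical stand-ins
    for the per-process partitions of a distributed run)."""
    centroids = [list(c) for c in centroids]
    k, dim = len(centroids), len(centroids[0])
    for _ in range(max_iter):
        total_sums = [[0.0] * dim for _ in range(k)]
        total_counts = [0] * k
        # Assumption: in a parallel code each chunk is handled by its own
        # process and this accumulation is a global sum reduction.
        for chunk in chunks:
            sums, counts = assign_and_accumulate(chunk, centroids)
            for j in range(k):
                total_counts[j] += counts[j]
                for d in range(dim):
                    total_sums[j][d] += sums[j][d]
        # Empty clusters keep their previous centroid.
        new_centroids = [
            [s / total_counts[j] for s in total_sums[j]]
            if total_counts[j] else centroids[j]
            for j in range(k)
        ]
        shift = max(dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            break
    return centroids


# Toy usage: four 2-D points split across two chunks, two clusters.
chunks = [
    [(0.0, 0.0), (0.0, 1.0)],
    [(10.0, 10.0), (10.0, 11.0)],
]
centers = kmeans_chunked(chunks, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

Because only the small per-cluster sums and counts need to be combined, the amount of data exchanged per iteration depends on the number of clusters and variables rather than on the number of data points, which is what makes this formulation attractive for very large datasets.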

  • Citation: Kumar, Jitendra; Mills, Richard T.; Hoffman, Forrest M.; Hargrove, William W. 2011. Parallel k-Means Clustering for Quantitative Ecoregion Delineation Using Large Data Sets. Procedia Computer Science 4:1602-1611.
  • Keywords: ecoregionalization, k-means clustering, data mining, high performance computing
  • Posted Date: August 4, 2011
  • Modified Date: August 8, 2011

    Publication Notes

    • This article was written and prepared by U.S. Government employees on official time, and is therefore in the public domain.