Clustering

Topics

  • [AM-02-009] Classification and Clustering

    Classification and clustering are often confused or used interchangeably. They are distinguished by whether the number and type of classes are known beforehand (classification) or learned from the data (clustering). The shared goal of both is to place observations into groups whose members have similar characteristics while maximizing the separation between dissimilar groups. Clusters are found in environmental and social applications, and classification is a common way of organizing information. Both are used in many areas of GIS, including spatial cluster detection, remote sensing classification, cartography, and spatial analysis. Cartographic classification methods offer a simplified way to examine some classification and clustering methods, and these are explored in more depth with example applications.

  • [AM-03-058] Hot Spots and Getis-Ord Gi* Analysis

    A common goal in spatial analysis is the identification of regions containing unusually high or low values. These areas may be called hot spots if the values are high and cold spots if the values are low. Hot and cold spots indicate where the effects of spatial heterogeneity are greatest. Point density maps, heat maps, and choropleth maps all highlight these areas in one way or another. However, because symbolization is subjective, statistical methods of hot spot detection are common. Global methods, like Moran’s I, only characterize the pattern across the entire study area. Local methods show the location and magnitude of individual high and low clusters. Getis-Ord Gi* analysis is the local method most associated with the term hot spots, and it is the focus of the second half of the article. Getis-Ord Gi* combines the logic of a probability map with moving windows, kernels, and/or adjacency weights. The result is an output surface showing neighborhoods whose means are significantly above or below the global mean. A primary concern is correct parameterization, especially the correct conceptualization of spatial relationships. Spatiotemporal variants, limitations, and future directions of hot spot analysis are briefly discussed.
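
The Gi* logic described in the hot spot entry above (a neighborhood mean compared against the global mean, expressed as a z-score) can be sketched in a few lines. This is a minimal illustration using the commonly published form of the statistic, not the article's own implementation. The function names, the 1-D transect, the sample values, and the choice of binary adjacency weights are all assumptions made for the example.

```python
import math

def contiguity_weights_1d(n):
    """Hypothetical helper: binary weights for a 1-D transect.

    Each cell neighbors itself and the cell on either side. The diagonal
    is 1 because Gi* (unlike Gi) counts a location as part of its own
    neighborhood.
    """
    return [[1.0 if abs(i - j) <= 1 else 0.0 for j in range(n)]
            for i in range(n)]

def getis_ord_gi_star(values, weights):
    """Return one Gi* z-score per location.

    values  -- list of n observed values
    weights -- n x n list of lists, weights[i][j] > 0 when j lies in the
               neighborhood of i (including j == i)
    """
    n = len(values)
    mean = sum(values) / n
    # Standard deviation of the whole study area (divides by n).
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)

    scores = []
    for i in range(n):
        w = weights[i]
        w_sum = sum(w)
        w_sq_sum = sum(wij * wij for wij in w)
        # Weighted sum of values inside the neighborhood of i.
        local_sum = sum(wij * xj for wij, xj in zip(w, values))
        # Difference between the local sum and its expectation under
        # the global mean, scaled by its standard deviation.
        numerator = local_sum - mean * w_sum
        denominator = s * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores

# Assumed sample data: a run of high values in the middle of a transect.
values = [2, 3, 2, 9, 10, 9, 2, 3]
scores = getis_ord_gi_star(values, contiguity_weights_1d(len(values)))
for cell, z in enumerate(scores):
    print(cell, round(z, 2))
```

With these sample values, only cell 4 (the center of the high run) has a z-score above 1.96, at about 2.62. Widening or narrowing the neighborhood in `contiguity_weights_1d` changes which cells stand out, which is why the conceptualization of spatial relationships matters so much in practice.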
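
The Classification and Clustering entry above points to cartographic classification as a simplified way into the topic. As a rough sketch, the snippet below compares two widely used class-break schemes, equal interval and quantile, on the same attribute. The entry itself does not name these schemes; they, the function names, and the sample values are assumptions chosen for illustration.

```python
import math

def equal_interval_breaks(values, k):
    """Upper class limits that split the data range into k equal widths."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k + 1)]

def quantile_breaks(values, k):
    """Upper class limits that put roughly equal counts in each class."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[math.ceil(n * i / k) - 1] for i in range(1, k + 1)]

def assign_class(value, breaks):
    """Index of the first class whose upper limit is at or above value."""
    for index, upper in enumerate(breaks):
        if value <= upper:
            return index
    # Guard against floating-point rounding on the maximum value.
    return len(breaks) - 1

# Assumed sample attribute with one extreme value.
rates = [1, 2, 2, 3, 4, 5, 8, 13, 40]

ei_breaks = equal_interval_breaks(rates, 3)
q_breaks = quantile_breaks(rates, 3)
ei_classes = [assign_class(r, ei_breaks) for r in rates]
q_classes = [assign_class(r, q_breaks) for r in rates]

print("equal interval:", ei_breaks, ei_classes)
print("quantile:      ", q_breaks, q_classes)
```

On this skewed sample, equal interval places eight of the nine observations in the lowest class and leaves the middle class empty, while quantile puts three observations in each class. Both schemes fix the number of classes in advance and derive the class limits from the data.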