Local multivariate statistical models are increasingly used in geographical research to estimate spatially varying relationships between a response variable and its associated predictor variables. In geography and many other disciplines, such models have been largely embedded within the framework of regression and can reveal considerably more information about the determinants of the observed spatial distribution of the dependent variable than their global regression counterparts. This section introduces one type of local statistical modeling framework: Geographically Weighted Regression (GWR). Models within this framework produce location-specific parameter estimates for each covariate, local diagnostic statistics, and bandwidth parameters that indicate the spatial scales at which the modeled processes operate. These models provide an effective means to estimate how the same factors may evoke different responses across locations and, by so doing, bring to the fore the role of geographical context in human preferences and behavior.
Sachdeva, M., and Fotheringham, A. S. (2020). The Geographically Weighted Regression Framework. The Geographic Information Science & Technology Body of Knowledge (4th Quarter 2020 Edition), John P. Wilson (Ed.). DOI: 10.22224/gistbok/2020.4.7.
This Topic is also available in the following editions:
DiBiase, D., DeMers, M., Johnson, A., Kemp, K., Luck, A. T., Plewe, B., and Wentz, E. (2006). Spatial Expansion and Geographically Weighted Regression. The Geographic Information Science & Technology Body of Knowledge. Washington, DC: Association of American Geographers.
1. Local Statistical Modeling Frameworks
A fundamental aspect of geographical research is to understand processes operating between people, objects and events through an examination of their observed spatial patterns (Harvey, 1968). The fact that an attribute value varies across space is taken for granted within most studies: it is extremely rare to uncover some measurement of our physical and social environments that is constant across space. Various methods, such as fractal analysis, variograms to analyze spatial structure, local models of spatial autocorrelation, and wavelets, have been developed to investigate aspects of the spatial variation of an observed variable (Lloyd, 2006). While it is interesting to examine spatial variation in data, it is often of more value to ask why such variation exists. Exploring this question leads to a need to understand the underlying processes causing the observed spatial pattern. Traditional global statistical models assume that these processes are constant over space, so that variations in a response variable y are assumed to be caused solely by variations in one or more of the covariates which affect y. Local models relax this assumption and allow the possibility that these processes vary over space. Such models hence quantify not only how each covariate affects y but also whether this effect varies spatially. In this sense, as shown in Figure 1, traditional regression models such as Ordinary Least Squares (OLS) regression, which estimate a single, average coefficient for each relationship, are special cases of the more general local model formulation and only apply when relationships are stationary over space.
Figure 1. Local trends revealed using local statistical models (right) which are otherwise masked by ‘averaged’ global trends (left). Source: authors.
This section of the GIS&T Body of Knowledge focuses on the Geographically Weighted Regression (GWR) framework: the GWR model and its recent multiscale extension, Multiscale Geographically Weighted Regression (MGWR) (Brunsdon et al., 1996; Fotheringham et al., 1996; Fotheringham et al., 2003; Fotheringham et al., 2017; LeSage, 2004). Other local modeling frameworks, such as eigenvector spatial filter-based local regression (SFLR) and Bayesian spatially-varying coefficient (SVC) models (Banerjee et al., 2014; Gelfand et al., 2003; Murakami et al., 2017), are not discussed, but comparisons between GWR and these other frameworks can be found in Oshan and Fotheringham (2018) and Wolf et al. (2018).
2. Spatially Varying Processes and the Concept of Data-borrowing
In the context of the GWR framework, we define processes as the conditional relationships between the dependent variable (y) and the independent variables (x). Whereas global regression models, such as OLS and some spatial regression models, assume these relationships to be constant over space, the GWR framework relaxes this assumption. For example, a traditional global regression model linking variations in crime rates across an urban area to variations in income levels would produce a single parameter estimate for the effect of income on crime rates, and that effect would be assumed to be constant across the entire study area. However, many social processes, particularly those involving human decisions and behaviors, might vary over space, in which case the global model will be misspecified. It might be, for example, that in some parts of the urban area, income variations have a stronger effect on crime rates than in other parts. Models within the GWR framework hence estimate location-dependent relationships between the independent and dependent variables, in effect allowing the processes being modeled to vary across space. For this reason, GWR and some other techniques, such as Anselin’s local indicators of spatial association (Anselin, 1995), have been classified as place-based analytic techniques which respond to the reality of complex sciences (Goodchild, 2009). The insights gained from such techniques can highlight the important role place plays in affecting people’s beliefs, behaviors and decision-making. For example, many studies suggest that preference for type of housing, political affinity and choice of mode of transport depend significantly on location (Chandola et al., 2005; Enos, 2017; Escobar, 2001; Panter et al., 2016; Walker & Li, 2007). Models within the GWR framework account for and quantify such spatial variations in processes.
In order to calibrate a separate model for each location when only one observation of y and of each covariate is typically recorded there, GWR models borrow data from neighboring observations and weight these data according to a smooth decay function based on either physical distance or the number of nearest neighbors. Data from nearby locations receive higher weights (on a scale from 1 down to 0) than data from more distant locations, with the rate of decrease determined by a bandwidth parameter which controls the distance, or the nth nearest neighbor, at which weights fall to zero (or approximately zero). Small bandwidths denote more local processes; large bandwidths indicate regional or global processes. As long as an optimal bandwidth is determined in the calibration of the GWR model and some continuous smooth function of distance is used, the specific kernel function chosen is not critical. Several kernel functions have been employed in GWR, the most common being the bisquare, Gaussian and exponential functions.
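The three kernels named above can be sketched in a few lines of Python. This is a minimal illustration using commonly quoted functional forms, not the exact parameterization of any particular GWR package; the function names and the helper `kernel_weights` are hypothetical.

```python
import math

def bisquare(d, bw):
    """Bisquare kernel: weight is 1 at d = 0 and exactly 0 for d >= bw."""
    if d >= bw:
        return 0.0
    return (1.0 - (d / bw) ** 2) ** 2

def gaussian(d, bw):
    """Gaussian kernel: weight is 1 at d = 0 and approaches, but never reaches, 0."""
    return math.exp(-0.5 * (d / bw) ** 2)

def exponential(d, bw):
    """Exponential kernel: heavier tail than the Gaussian at large d."""
    return math.exp(-d / bw)

# Hypothetical lookup table so a kernel can be chosen by name.
KERNELS = {"bisquare": bisquare, "gaussian": gaussian, "exponential": exponential}

def kernel_weights(distances, bw, kernel="bisquare"):
    """Weights for a list of distances from one regression point."""
    f = KERNELS[kernel]
    return [f(d, bw) for d in distances]

# Distances from one regression point to five observations (illustrative values).
distances = [0.0, 2.0, 5.0, 10.0, 15.0]
for name in KERNELS:
    print(name, [round(w, 3) for w in kernel_weights(distances, 10.0, name)])
```

With the bisquare kernel, observations at or beyond the bandwidth drop out of the local regression entirely, whereas the Gaussian and exponential kernels retain every observation with a progressively smaller weight.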
Figure 2: Weighting scheme in GWR with different distance-decay kernel functions (top) and a generic example of the weights at each point (bottom). Source: authors.
Spatial weighting kernels can be defined as ‘fixed’ or ‘adaptive’ (Figure 3 left and right, respectively). Fixed kernels have the same rate of distance-decay for all locations whereas adaptive kernels have different rates of distance-decay depending on the density of data points in the vicinity of the regression point. Depending on the choice of the kernel, the bandwidth is typically defined as either the distance at which weights fall below a certain value (fixed) or the number of nearest neighbors from the regression point which receive a non-zero weight in the local regressions (adaptive). A large bandwidth allows data from locations further from the regression point to be included in the local regressions whereas a small bandwidth restricts data in the local model calibrations to those recorded at locations in close proximity to the regression point.
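The difference between the two schemes can be illustrated with a short sketch. It assumes an adaptive bandwidth defined as the distance to the k-th nearest observation, counting the regression point itself as its own first neighbor; conventions differ between implementations, and the function name `adaptive_bandwidth` and the coordinates are purely illustrative.

```python
import math

def adaptive_bandwidth(point, coords, k):
    """Distance from `point` to its k-th nearest observation in `coords`.

    Assumption: if `point` is itself an observation it counts as the first
    neighbor (distance 0).
    """
    d = sorted(math.dist(point, c) for c in coords)
    return d[k - 1]

# A dense cluster near the origin and a few scattered points further away.
coords = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (20, 10), (10, 20), (30, 30)]

k = 4
bw_dense = adaptive_bandwidth((0, 0), coords, k)     # regression point in the cluster
bw_sparse = adaptive_bandwidth((30, 30), coords, k)  # regression point in a sparse area

# A fixed kernel would use one distance everywhere; the adaptive kernel
# stretches where data are sparse and shrinks where data are dense.
print(round(bw_dense, 3), round(bw_sparse, 3))
```

Under a fixed kernel with a small distance bandwidth, the regression point at (30, 30) would borrow almost no data, which is one practical reason adaptive kernels are often preferred for unevenly distributed observations.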
Figure 3. Conceptual diagram explaining fixed (left) and adaptive weighting (right) schemes. Source: authors.
Conceptually, optimal bandwidth selection is a trade-off between bias and variance. If the relationships being modeled are spatially varying, greater bias will be introduced into the local parameter estimates as the bandwidth increases because the data borrowed from locations farther from the local regression point will have been produced by increasingly different processes. However, the smaller the bandwidth, the greater will be the uncertainty regarding the local parameter estimates because the local models will have been calibrated with fewer data points. The selection of the bandwidth parameter is based on a statistical optimization criterion that includes a trade-off between model fit and model complexity, such as the corrected Akaike Information Criterion (AICc) (Fotheringham et al., 2003; Yu et al., 2020(a)). Most GWR software packages include several options for the goodness-of-fit criterion used in bandwidth selection but, as Yu et al. (2020(b)) demonstrate, minimizing AICc is a very good approximation to finding the optimal trade-off between bias and variance.
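The search for a bandwidth can be sketched as follows. For brevity this example swaps in a leave-one-out cross-validation (CV) score in place of AICc, uses a fixed Gaussian kernel, a model with an intercept and a single covariate, and a plain search over a handful of candidate bandwidths. The synthetic transect data and all function names are assumptions made for illustration only.

```python
import math
import random

def gaussian(d, bw):
    return math.exp(-0.5 * (d / bw) ** 2)

def local_fit(coords, x, y, at, bw, skip=None):
    """Weighted least squares of y on (1, x), with weights centred on `at`.

    The observation with index `skip` receives zero weight (used for CV).
    Returns (intercept, slope).
    """
    w = [0.0 if k == skip else gaussian(math.dist(at, c), bw)
         for k, c in enumerate(coords)]
    sw = sum(w)
    xm = sum(wk * xk for wk, xk in zip(w, x)) / sw
    ym = sum(wk * yk for wk, yk in zip(w, y)) / sw
    sxx = sum(wk * (xk - xm) ** 2 for wk, xk in zip(w, x))
    sxy = sum(wk * (xk - xm) * (yk - ym) for wk, xk, yk in zip(w, x, y))
    b1 = sxy / sxx
    return ym - b1 * xm, b1

def cv_score(coords, x, y, bw):
    """Sum of squared leave-one-out prediction errors for bandwidth `bw`."""
    total = 0.0
    for i, at in enumerate(coords):
        b0, b1 = local_fit(coords, x, y, at, bw, skip=i)
        total += (y[i] - (b0 + b1 * x[i])) ** 2
    return total

def select_bandwidth(coords, x, y, candidates):
    """Candidate bandwidth with the smallest CV score."""
    return min(candidates, key=lambda bw: cv_score(coords, x, y, bw))

# Synthetic transect: the slope on x drifts from 1.0 to 3.9 along the u-axis.
rng = random.Random(0)
n = 30
coords = [(float(i), 0.0) for i in range(n)]
x = [((7 * i) % 10) / 2.0 for i in range(n)]
y = [2.0 + (1.0 + 0.1 * i) * x[i] + rng.gauss(0.0, 0.5) for i in range(n)]

candidates = [2.0, 4.0, 8.0, 16.0, 1000.0]  # 1000 is effectively a global model
best_bw = select_bandwidth(coords, x, y, candidates)
print("selected bandwidth:", best_bw)
```

Very large bandwidths score poorly here because they average over locations with different slopes (bias), while very small bandwidths would rely on too few observations (variance). Practical software typically replaces the candidate list with a golden-section or interval search.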
The bandwidth parameter is a useful output in GWR calibration because it is an indicator of the spatial scale over which processes operate. A small optimal bandwidth indicates local processes that vary significantly across the study region while a large optimal bandwidth suggests a stable relationship that does not vary much over space.
3. Geographically Weighted Regression Model Specification
A GWR model is calibrated as an ensemble of ordinary least squares regressions, individually estimated at each location where data are observed using the neighboring weighted data. By allowing coefficient estimates to be derived for each location, GWR explicitly incorporates spatial context. The model can be written as:
$$y_i = \sum_{j} \beta_j(u_i, v_i)\, X_{ji} + \varepsilon_i \tag{1}$$
where $y_i$ is the dependent variable at location $i$, $X_{ji}$ is the $j$th independent variable, $\beta_j(u_i, v_i)$ is the $j$th coefficient at location $(u_i, v_i)$, and $\varepsilon_i$ is the random error term. Unlike in OLS, the parameters are allowed to vary by location $(u_i, v_i)$.
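The calibration described above amounts to solving one weighted least squares problem per location. The sketch below does this for a model with an intercept and a single covariate, for which the weighted estimates have a closed form; it assumes a fixed Gaussian kernel with a user-supplied bandwidth, and the data and function names are illustrative rather than taken from any GWR package.

```python
import math

def gaussian(d, bw):
    return math.exp(-0.5 * (d / bw) ** 2)

def local_fit(coords, x, y, at, bw):
    """Weighted least squares of y on (1, x) with kernel weights centred on `at`."""
    w = [gaussian(math.dist(at, c), bw) for c in coords]
    sw = sum(w)
    xm = sum(wk * xk for wk, xk in zip(w, x)) / sw   # weighted mean of x
    ym = sum(wk * yk for wk, yk in zip(w, y)) / sw   # weighted mean of y
    sxx = sum(wk * (xk - xm) ** 2 for wk, xk in zip(w, x))
    sxy = sum(wk * (xk - xm) * (yk - ym) for wk, xk, yk in zip(w, x, y))
    slope = sxy / sxx
    return ym - slope * xm, slope

def gwr(coords, x, y, bw):
    """One (intercept, slope) pair per observed location."""
    return [local_fit(coords, x, y, at, bw) for at in coords]

# Noise-free synthetic transect: intercept 2, slope rising from 1.0 to 3.9.
n = 30
coords = [(float(i), 0.0) for i in range(n)]
x = [((7 * i) % 10) / 2.0 for i in range(n)]
y = [2.0 + (1.0 + 0.1 * i) * x[i] for i in range(n)]

estimates = gwr(coords, x, y, bw=3.0)
slopes = [b1 for _, b1 in estimates]
for i in (5, 15, 25):
    print(f"location {i}: true slope {1.0 + 0.1 * i:.2f}, local estimate {slopes[i]:.2f}")
```

A global OLS fit to the same data would return a single slope somewhere between the extremes, which is the averaging effect that local estimation is designed to avoid.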
Since the GWR technique estimates a surface of local parameters for each modeled relationship using overlapping subsets of data, a classic t-test to examine the significance of each local estimate would lead to an excess of false positives being recorded at a given significance level due to multiple hypothesis testing (Williams et al., 1999). The severity of the multiple hypothesis testing issue for GWR is dependent on the optimized bandwidth obtained in the model calibration, as shown by da Silva and Fotheringham (2016). To obtain an α-value that accounts for this issue, a correction is applied such that:
$$\alpha = \frac{\xi}{\mathrm{ENP} / p} \tag{2}$$
where $\xi$ is the expected type I error rate before correction, ENP is the effective number of parameters in GWR, which is a function of the optimal bandwidth parameter, and $p$ is the number of independent variables in the model (da Silva & Fotheringham, 2016). It is important to use this corrected $\alpha$ value to assess the significance of local parameter estimates to avoid the proportion of false positives exceeding the intended type I error rate $\xi$.
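As a small worked example of this correction, the snippet below computes the corrected α for illustrative values and uses it to filter a list of local p-values. The numbers are invented for demonstration, and the helper names are hypothetical.

```python
def corrected_alpha(xi, enp, p):
    """Corrected significance level: alpha = xi / (ENP / p)."""
    if enp <= 0 or p <= 0:
        raise ValueError("enp and p must be positive")
    return xi / (enp / p)

def significant(p_values, alpha):
    """Flag the local estimates whose p-value falls below the corrected alpha."""
    return [pv < alpha for pv in p_values]

# Illustrative values: a 5% nominal rate, 4 covariates, and ENP = 40.
alpha = corrected_alpha(xi=0.05, enp=40.0, p=4)
print(alpha)  # 0.05 / (40 / 4), i.e. about 0.005

local_p_values = [0.001, 0.004, 0.02, 0.2]
print(significant(local_p_values, alpha))
```

When ENP equals p, as it would for a global model, the correction leaves the nominal rate unchanged; the smaller the bandwidth, the larger ENP becomes and the stricter the corrected threshold.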
A Monte Carlo test can be employed for parameter estimates from MGWR (see below) and GWR to distinguish situations where the observed spatial variation in the parameter surfaces may only be due to noise. This test checks the variation in the estimated parameter surface against the variations in parameter surfaces constructed from randomly arranged data points to examine whether the spatial variation in the estimated surface is significantly higher than that in the randomly constructed surfaces. Such a test is implemented in the software described below.
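The logic of such a test can be sketched as a permutation procedure. The example below is a simplified illustration, not the implementation in the software discussed in this entry: it assumes a fixed Gaussian kernel, a model with an intercept and one covariate, the variance of the local slopes as the measure of spatial variability, and a pseudo p-value computed from 99 random rearrangements of the locations. All names and data are hypothetical.

```python
import math
import random
from statistics import pvariance

def gaussian(d, bw):
    return math.exp(-0.5 * (d / bw) ** 2)

def local_slope(coords, x, y, at, bw):
    """Slope of the weighted least squares fit of y on (1, x) centred on `at`."""
    w = [gaussian(math.dist(at, c), bw) for c in coords]
    sw = sum(w)
    xm = sum(wk * xk for wk, xk in zip(w, x)) / sw
    ym = sum(wk * yk for wk, yk in zip(w, y)) / sw
    sxx = sum(wk * (xk - xm) ** 2 for wk, xk in zip(w, x))
    sxy = sum(wk * (xk - xm) * (yk - ym) for wk, xk, yk in zip(w, x, y))
    return sxy / sxx

def slope_variability(coords, x, y, bw):
    """Variance of the local slope estimates across all locations."""
    return pvariance([local_slope(coords, x, y, at, bw) for at in coords])

def monte_carlo_p(coords, x, y, bw, n_perm=99, seed=0):
    """Pseudo p-value: share of rearrangements at least as variable as observed."""
    rng = random.Random(seed)
    observed = slope_variability(coords, x, y, bw)
    shuffled = list(coords)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign locations; (x, y) pairs stay together
        if slope_variability(shuffled, x, y, bw) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

# Synthetic transect with a slope that drifts steadily along the u-axis.
rng = random.Random(1)
n = 60
coords = [(float(i), 0.0) for i in range(n)]
x = [((7 * i) % 10) / 2.0 for i in range(n)]
y = [2.0 + (1.0 + 0.05 * i) * x[i] + rng.gauss(0.0, 0.5) for i in range(n)]

p_value = monte_carlo_p(coords, x, y, bw=6.0)
print("pseudo p-value for spatial variation in the slope:", p_value)
```

A small pseudo p-value suggests that the observed surface varies more than would be expected if the same observations were arranged at random, which is evidence that the variation is not simply noise.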
4. Multiscale Geographically Weighted Regression
Since different relationships might operate at different spatial scales, an emerging trend in local models is to allow the bandwidth parameter to be covariate-specific rather than having a single bandwidth represent the spatial scale at which every relationship being modeled varies. For instance, the recently developed multiscale extension to GWR, Multiscale GWR (MGWR), allows for a unique bandwidth parameter to be derived for each relationship within a model:
$$y_i = \sum_{j} \beta_{bw_j}(u_i, v_i)\, X_{ji} + \varepsilon_i \tag{3}$$
where the subscript $bw_j$ indicates the bandwidth used for the calibration of the $j$th conditional relationship (Fotheringham et al., 2017; Yang, 2014). MGWR calibration employs a more complex backfitting algorithm and therefore takes longer to converge, but recent advances allow the model to be calibrated on even very large data sets in a manageable time (Li et al., 2019; Li & Fotheringham, 2020). For example, in a model explaining house prices using multiple independent variables such as income, age of a structure, living area and access to public amenities, each affecting the response variable at a unique spatial scale, MGWR can estimate a unique scale for each covariate-specific association (Figure 4).
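The idea of backfitting with covariate-specific bandwidths can be sketched as follows. This is a deliberately simplified illustration: the bandwidths are supplied as fixed inputs rather than selected by an optimization criterion, a fixed Gaussian kernel is assumed, and each term is updated by a univariate local regression of the partial residuals on that covariate. The function names, data and convergence settings are all assumptions.

```python
import math

def gaussian(d, bw):
    return math.exp(-0.5 * (d / bw) ** 2)

def local_coefs(coords, x, r, bw):
    """Univariate local regression of r on x (no intercept) at every location."""
    betas = []
    for ci in coords:
        num = den = 0.0
        for ck, xk, rk in zip(coords, x, r):
            w = gaussian(math.dist(ci, ck), bw)
            num += w * xk * rk
            den += w * xk * xk
        betas.append(num / den)
    return betas

def mgwr_backfit(coords, X, y, bws, n_iter=50, tol=1e-8):
    """Backfitting with one bandwidth per column of X.

    X is a list of columns (the first may be a column of ones for the
    intercept); bws holds one bandwidth per column. Returns one list of
    local coefficients per column.
    """
    n, m = len(y), len(X)
    betas = [[0.0] * n for _ in range(m)]
    terms = [[0.0] * n for _ in range(m)]  # additive terms beta_j * x_j
    for _ in range(n_iter):
        change = 0.0
        for j in range(m):
            # Partial residual: what is left of y after removing the other terms.
            r = [y[i] - sum(terms[k][i] for k in range(m) if k != j)
                 for i in range(n)]
            betas[j] = local_coefs(coords, X[j], r, bws[j])
            new_term = [b * xv for b, xv in zip(betas[j], X[j])]
            change += sum((a - b) ** 2 for a, b in zip(new_term, terms[j]))
            terms[j] = new_term
        if change < tol:
            break
    return betas

# Synthetic transect: a constant intercept (global process) and a slope that
# drifts along the u-axis (local process).
n = 40
coords = [(float(i), 0.0) for i in range(n)]
ones = [1.0] * n
x = [((7 * i) % 10) / 2.0 for i in range(n)]
y = [2.0 + (1.0 + 0.1 * i) * x[i] for i in range(n)]

# Large bandwidth for the intercept, small bandwidth for the slope.
betas = mgwr_backfit(coords, [ones, x], y, bws=[1000.0, 3.0])
intercepts, slopes = betas
print("intercept range:", round(min(intercepts), 3), round(max(intercepts), 3))
print("slope at locations 5 and 35:", round(slopes[5], 2), round(slopes[35], 2))
```

Assigning a very large bandwidth to the intercept keeps it nearly constant over the study area while the slope remains free to vary locally, which a single-bandwidth GWR model cannot express.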
Figure 4. Example of multiple estimated bandwidths interpreted as the unique scale at which different processes operate. In this example, the data are for house prices in King County, Washington, USA (377 total Census tracts). Source: authors.
As with inference in GWR, the location-specific parameter estimates produced by the MGWR model also require a correction to account for multiple dependent hypothesis tests. In this case, the correction in equation (2) is still applied but it is covariate-specific because the value of ENP will vary across covariates as it is controlled by the optimized bandwidth. The software described below outputs a covariate-specific value of ENP as well as a corrected t-value for each covariate.
The selection of optimal covariate-specific bandwidths in MGWR is based on a model selection score such as the corrected Akaike Information Criterion (AICc), similar to the selection criterion of a single bandwidth in GWR. As discussed, the bandwidth parameters are given either as a number of nearest neighbors or as a distance and can be compared to assess the relative operational scale of processes across covariates. The MGWR calibration process also produces confidence intervals for covariate-specific bandwidths using Akaike weights (Li et al., 2020). MGWR marks a significant advancement over GWR and is described further in Fotheringham et al. (2017), Wolf et al. (2018), and Yu et al. (2020(a)).
The MGWR 2.2 software provides a user-friendly graphical interface for calibrating GWR and MGWR models and can be downloaded from https://sgsup.asu.edu/sparc/mgwr for both Windows and macOS (Figure 5). The models that can be fit using the software are Gaussian, Poisson or Logit GWR models and a Gaussian MGWR model. Data can be supplied in standard file formats such as comma-separated values (CSV), dBase IV and Excel. By default, the software reads the first row of a dataset as variable names and populates the variable list. The location variables, which can be either projected or spherical coordinates, are required inputs. The spatial kernel can be either adaptive bisquare or fixed Gaussian, and the available optimization criteria are AIC, AICc, the Bayesian Information Criterion (BIC) and Cross Validation (CV). After choosing the dependent and independent variables from the variable list, the user selects the folder in which the output files will be stored. Once a model has run successfully, MGWR 2.2 produces two output files: a summary text file and a .csv file containing georeferenced parameter estimates, their associated t-values, standard errors and local goodness-of-fit measures. The .csv file can subsequently be joined to the associated shapefile and the results mapped using GIS software such as ArcMap or QGIS. Some advanced options are also available, including Monte Carlo tests for spatial variation, tests for local collinearity, confidence intervals for estimated bandwidths and options to change the convergence threshold of the MGWR backfitting algorithm.
An open-source Python implementation, the mgwr package (https://github.com/pysal/mgwr), is also available; it has some added functionality and provides the option to directly map and analyze the results from the model in Python or R. A comprehensive route map for appropriately choosing (M)GWR in any research is provided by Comber et al. (2020), and an in-depth guide to using the open-source package is provided by Oshan et al. (2019).
Figure 5. MGWR 2.2 main interface with variables selected for model run. Source: authors.
Models within the GWR framework provide an effective method of capturing spatially varying processes and estimating the spatial scale at which those processes operate. By estimating location-specific parameters, such models are able to provide more information about the processes that lead to an observed pattern of data than traditional global regression models. For best practices and examples of recent applications, readers interested in applying models within the GWR framework are referred to Comber et al. (2020), Cupido et al. (2020), Fotheringham et al. (2019), Fotheringham et al. (2020), Gu et al. (2020), Li (2020), Oshan et al. (2020) and Wu et al. (2019).