Residual Maps and Plots

23.4 Residual Maps and Plots

When the predicted values and residuals are saved to the data table as additional variables, they become available to all the exploratory functionality of GeoDa. This is particularly useful for the construction of diagnostic maps and plots. In the following examples, we will use the predicted values and residuals of the quadratic trend surface regression, OLS_PQUAD and OLS_RQUAD. It is straightforward to replicate these examples for the predicted values (OLS_PLIN) and residuals (OLS_RLIN) of the linear trend model as well.

Figure 23.16: Residual map, quadratic trend surface.

23.4.1 Residual Maps

The most useful residual map is probably a standard deviational map, since it clearly illustrates patterns of over- or under-prediction, as well as the magnitude of the residuals, especially those greater than two standard deviational units.

Select Map > St.Dev and choose OLS_RQUAD as the variable. The resulting map should be as in Figure 23.16. Note the broad patterns in overprediction (negative residuals, or blue tones) and underprediction (positive residuals, or brown tones). This “visual inspection” would suggest the presence of spatial autocorrelation, but this requires a formal test before it can be stated more conclusively. Also note several very large residuals (the very dark brown and blue). This is not surprising, since the “model” only contains location as a variable and no other distinguishing characteristics of the houses were considered. The outliers suggest the existence of transactions where location alone was not sufficient to explain the price. Selecting these locations and linking with other graphs or maps (e.g., some of the multivariate EDA tools) might shed light on which variables should be included in an improved regression specification.
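The classification behind a standard deviational map can be sketched outside GeoDa as follows. This is a minimal illustration, not GeoDa's own code: the function name `stdev_classes`, the six class labels, and the sample residuals are assumptions made for the example, and GeoDa's exact break points may differ.

```python
import statistics

# Hypothetical class labels, in standard-deviation units from the mean.
LABELS = ["< -2", "-2 to -1", "-1 to 0", "0 to 1", "1 to 2", "> 2"]
BREAKS = [-2.0, -1.0, 0.0, 1.0, 2.0]


def stdev_classes(values):
    """Assign each value to a standard-deviational class."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    classes = []
    for v in values:
        z = (v - mean) / sd
        # Count how many break points the standardized value exceeds.
        idx = sum(z > b for b in BREAKS)
        classes.append(LABELS[idx])
    return classes


# Made-up residuals: nine small ones and a single large positive one.
residuals = [-1, 1, -1, 1, -1, 1, -1, 1, -1, 10]
classes = stdev_classes(residuals)
large = [i for i, c in enumerate(classes) if c in ("< -2", "> 2")]
print(large)  # indices of observations beyond two standard deviations
```

Observations that fall in the two outer classes correspond to the darkest tones on the map and are natural candidates for selection and linking.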

Figure 23.17: Quadratic trend surface residual plot.

23.4.2 Model Checking Plots

A simple plot of the model residuals is often revealing in that it should not suggest any type of patterning. While GeoDa currently does not have simple plot functions, it is possible to use a scatter plot to achieve the same goal. For example, plot the residuals of the quadratic trend surface model against a simple list of observation numbers, such as contained in the variable STATION.

Start with Explore > Scatter Plot and select OLS_RQUAD as the first variable (y-axis) and STATION as the second variable (x-axis). The resulting plot should be as in Figure 23.17.

The plot confirms the existence of several very large residuals. Selecting these on the graph and linking with a map or with other statistical graphs (describing other variables) may suggest systematic relationships with “ignored” variables and help to improve upon the model.

A different focus is taken with a plot of the residuals against the predicted values. Here, the interest lies in detecting patterns of heteroskedasticity, or a change in the variance of the residuals with another variable. As before, select Explore > Scatter Plot and choose OLS_RQUAD as the first variable (y-axis) and OLS_PQUAD as the second variable (x-axis). The resulting plot should be as in Figure 23.18.

Figure 23.18: Quadratic trend surface residual/fitted value plot.

In this graph, one tries to find evidence of funnel-like patterns, suggesting a relation between the spread of the residuals and the predicted value. There is some slight evidence of this in Figure 23.18, but insufficient to state a strong conclusion. Formal testing for heteroskedasticity will need to supplement the visual inspection.
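A rough numerical counterpart to looking for a funnel shape is to sort the observations by predicted value, split them into groups, and compare the spread of the residuals across the groups. The sketch below is only an informal check under stated assumptions, not a formal heteroskedasticity test and not a GeoDa feature; the function name `spread_by_fitted` and the sample data are hypothetical.

```python
import statistics


def spread_by_fitted(fitted, residuals, groups=3):
    """Return the sample standard deviation of the residuals within
    groups of observations ordered by fitted value (low to high)."""
    pairs = sorted(zip(fitted, residuals))
    n = len(pairs)
    size = n // groups
    spreads = []
    for g in range(groups):
        start = g * size
        # The last group absorbs any leftover observations.
        end = n if g == groups - 1 else start + size
        spreads.append(statistics.stdev([r for _, r in pairs[start:end]]))
    return spreads


# Made-up data in which the residual spread grows with the fitted value.
fitted = [1, 2, 3, 4, 5, 6]
residuals = [1, -1, 2, -2, 4, -4]
spreads = spread_by_fitted(fitted, residuals)
print(spreads)
```

A steadily increasing (or decreasing) sequence of spreads is the numerical analogue of a funnel in the scatter plot. Each group needs at least two observations for the standard deviation to be defined.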

Instead of the predicted values, other variables may be selected for the x-axis as well, especially when there is strong suspicion that they may “cause” heteroskedasticity. Often, such variables are related to size, such as area or total population.

23.4.3 Moran Scatter Plot for Residuals

Spatial patterns in the residuals can be analyzed more formally by means of a Moran scatter plot. In the usual fashion, select Space > Univariate Moran from the menu, choose OLS_RQUAD as the variable, and baltrook.GAL as the spatial weights file.

Figure 23.19: Moran scatter plot for quadratic trend surface residuals.

The resulting graph should be as in Figure 23.19, indicating a Moran’s I for the residuals of 0.2009. Note that this measure is purely descriptive, and while it allows for linking and brushing, it is not appropriate to use the permutation approach to assess significance, because OLS residuals are correlated by construction and the permutation approach ignores this. For the same reason, it is also not appropriate to construct LISA maps for the residuals.
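To make the descriptive statistic concrete, Moran's I with row-standardized weights can be computed by hand as sketched below. This is an illustration under assumptions, not GeoDa's implementation: the function name `morans_i`, the dictionary of neighbor lists (loosely mirroring what a GAL file stores), and the four-location example are all hypothetical.

```python
def morans_i(values, neighbors):
    """Moran's I for row-standardized weights.

    values    -- list of observations (e.g., regression residuals)
    neighbors -- dict mapping each index to a list of neighbor indices
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]  # deviations from the mean

    cross = 0.0  # sum over i of z_i times the spatial lag of z at i
    s0 = 0       # sum of all weights: one per location with neighbors
    for i, nbrs in neighbors.items():
        if nbrs:
            lag = sum(z[j] for j in nbrs) / len(nbrs)  # neighbor average
            cross += z[i] * lag
            s0 += 1

    return (n / s0) * cross / sum(d * d for d in z)


# Four locations on a line: 0 - 1 - 2 - 3 (hypothetical contiguity).
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
trend = morans_i([1, 2, 3, 4], line)         # similar values are neighbors
alternating = morans_i([1, -1, 1, -1], line)  # dissimilar values are neighbors
print(trend, alternating)
```

The value returned is the slope of the regression line in the Moran scatter plot. As with the plot itself, it is a description of the residual pattern only and says nothing about significance.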