GEOG 414/515:  Advanced Geographic Data Analysis
GLMs, GAMs, and CARTs

Several extensions of, and alternatives to, ordinary least-squares regression are available for situations in which its fairly restrictive assumptions are violated (e.g., binary response variables or relationships that cannot be linearized).  These approaches are not simply workarounds; they are useful in their own right for exploring or describing relationships between a response variable and several predictors.

Generalized Linear Models (GLMs)

Generalized linear models, as the name implies, extend the standard linear model to situations in which the response variable may be binary, a count, or a proportion (among others).  The extension involves fitting a linear predictor (which looks like an ordinary regression equation) that, through a "link function," is related to the expected value of the response variable, whose distribution may be any of several from the exponential family.  When the distribution of the response variable is Gaussian (i.e., normal) and the link function is the identity function, the model is equivalent to the standard regression model.
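
As a rough illustration of how a linear predictor and a link function work together, the following minimal Python sketch fits a logistic regression (a binomial GLM with a logit link) to a single predictor by Newton-Raphson iteration.  The function names `inv_logit` and `fit_logistic` and the small data set are invented for this example and do not come from the course materials; it is a sketch of the idea, not a substitute for a statistical package.

```python
import math

def inv_logit(eta):
    """Inverse of the logit link: maps the linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def fit_logistic(x, y, n_iter=25, tol=1e-10):
    """Fit P(y = 1) = inv_logit(b0 + b1 * x) by Newton-Raphson.

    Hypothetical helper written for this illustration only.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        p = [inv_logit(b0 + b1 * xi) for xi in x]
        # Score vector: gradient of the binomial log-likelihood
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix, built from the weights p * (1 - p)
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system directly
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Made-up binary response (not perfectly separable, so the estimates are finite)
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(x, y)
p_hat = [inv_logit(b0 + b1 * xi) for xi in x]
print("intercept:", round(b0, 3), "slope:", round(b1, 3))
print("fitted probabilities:", [round(p, 2) for p in p_hat])
```

The fitted values are probabilities between 0 and 1 even though the linear predictor `b0 + b1 * x` is unbounded, which is the job the link function does.  Replacing the logit link with the identity and the binomial likelihood with a Gaussian one would reduce the same scheme to ordinary least squares.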

Generalized Additive Models (GAMs)

Generalized additive models relax the assumption that the relationships between the response variable and the predictor variables are linear by allowing the form of each relationship to be determined by the data, in a manner analogous to using loess to describe a relationship.
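
One way to see how the data can determine the form of each relationship is the backfitting idea sketched below in Python: each predictor's contribution is estimated by smoothing the partial residuals left over from the other predictor, and the two steps are repeated until the estimates settle.  This is a simplified sketch under several assumptions: a Gaussian response with an identity link (a plain additive model, without the "generalized" part), a Gaussian-kernel smoother used as a simple stand-in for loess, and synthetic noise-free data.  The function names `kernel_smooth` and `backfit` are hypothetical.

```python
import math

def kernel_smooth(x, r, bandwidth):
    """Kernel-weighted local mean of r at each x (a simple stand-in for loess)."""
    fitted = []
    for x0 in x:
        w = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
        fitted.append(sum(wi * ri for wi, ri in zip(w, r)) / sum(w))
    return fitted

def backfit(x1, x2, y, bandwidth=0.4, n_iter=20):
    """Estimate y = alpha + f1(x1) + f2(x2) by backfitting (illustrative helper)."""
    n = len(y)
    alpha = sum(y) / n
    f1 = [0.0] * n
    f2 = [0.0] * n
    for _ in range(n_iter):
        # Smooth the partial residuals against x1, then center the result
        r1 = [yi - alpha - b for yi, b in zip(y, f2)]
        f1 = kernel_smooth(x1, r1, bandwidth)
        m1 = sum(f1) / n
        f1 = [v - m1 for v in f1]
        # Same step for x2, using the updated f1
        r2 = [yi - alpha - a for yi, a in zip(y, f1)]
        f2 = kernel_smooth(x2, r2, bandwidth)
        m2 = sum(f2) / n
        f2 = [v - m2 for v in f2]
    return alpha, f1, f2

# Synthetic data: x1 on a regular grid, x2 a shuffled copy of the same grid
n = 60
x1 = [-3 + 6 * i / (n - 1) for i in range(n)]
x2 = [-3 + 6 * ((17 * i) % n) / (n - 1) for i in range(n)]
y = [math.sin(a) + 0.5 * b ** 2 for a, b in zip(x1, x2)]  # nonlinear in both

alpha, f1, f2 = backfit(x1, x2, y)
fitted = [alpha + a + b for a, b in zip(f1, f2)]
y_bar = sum(y) / n
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
tss = sum((yi - y_bar) ** 2 for yi in y)
print("proportion of variance explained:", round(1 - rss / tss, 3))
```

Plotting `f1` against `x1` and `f2` against `x2` would show the estimated shapes of the two relationships, neither of which was specified in advance.  The bandwidth plays the role of the loess span: a smaller value follows the data more closely, and a larger one gives smoother curves.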

Classification and Regression Trees (CARTs)

Classification and regression trees (CARTs) take a very different approach to describing the relationship between a continuous or categorical response variable and several predictor variables, one that makes few assumptions about the nature of the relationship.  They do so by constructing a "tree" that describes the role of the predictor variables in creating groups of observations that, as in cluster analysis, are relatively homogeneous.  When the response variable is categorical, a "classification tree" results, and when it is continuous, a "regression tree" is produced.
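
The tree-building idea can be sketched in a few lines of Python.  The example below grows a small regression tree by recursive binary splitting, choosing at each step the predictor and threshold that make the two resulting groups most homogeneous, measured here by the within-group sum of squared deviations.  The function names, the depth limit, and the data are all invented for illustration; a classification tree would follow the same pattern with an impurity measure such as the Gini index in place of the sum of squares.

```python
def sse(values):
    """Sum of squared deviations from the group mean (lower = more homogeneous)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def best_split(rows, y):
    """Return (sse, predictor index, threshold) of the best binary split, or None."""
    best = None
    for j in range(len(rows[0])):
        levels = sorted(set(r[j] for r in rows))
        # Candidate thresholds lie midway between adjacent observed values
        for lo, hi in zip(levels, levels[1:]):
            t = (lo + hi) / 2
            left = [yi for r, yi in zip(rows, y) if r[j] <= t]
            right = [yi for r, yi in zip(rows, y) if r[j] > t]
            total = sse(left) + sse(right)
            if best is None or total < best[0]:
                best = (total, j, t)
    return best

def grow(rows, y, depth=0, max_depth=2):
    """Grow a tree as nested tuples (index, threshold, left, right); leaves are means."""
    split = None
    if depth < max_depth and len(set(y)) > 1:
        split = best_split(rows, y)
    if split is None:
        return sum(y) / len(y)
    _, j, t = split
    left = [(r, yi) for r, yi in zip(rows, y) if r[j] <= t]
    right = [(r, yi) for r, yi in zip(rows, y) if r[j] > t]
    return (j, t,
            grow([r for r, _ in left], [v for _, v in left], depth + 1, max_depth),
            grow([r for r, _ in right], [v for _, v in right], depth + 1, max_depth))

def predict(tree, row):
    """Follow the splits down to a leaf and return its mean."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

# Made-up data: two predictors per observation and a continuous response
rows = [(1, 1), (2, 4), (3, 2), (4, 3), (6, 1), (7, 4), (8, 2), (9, 3)]
y = [10, 11, 10, 11, 20, 30, 21, 31]

tree = grow(rows, y)
print(tree)
print(predict(tree, (7.5, 3.5)))
```

With these data the first split is on the first predictor and the second-level splits are on the second predictor, so the tree itself shows which variables separate the groups and in what order.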

Readings:

Kuhnert & Venables (An Introduction...):  pp. 141-168, 169-178, 259-300;  Maindonald (Using R...):  ch. 9
