Geog. 4/517 --  Geographic Data Analysis
Spring 2011 -- Due Friday, June 10th

Answer the following questions in clear, complete, and grammatically correct sentences. You may, however, illustrate any specific answer with a table or figure, with accompanying text. Be brief, but informative. Make sure you answer all parts of each question. Each question below can probably be answered within a single page; do not exceed two double-spaced pages (in a 10- or 12-point font, with normal margins) for any one question (figures may be attached as additional pages).

You will likely have the opportunity to discuss the questions with others, and you may do so, but work out and write down the answers yourself.

1.      Many data-analytical procedures share one version or another of the same underlying conceptual model:

data = predictable component + unpredictable component; or
data = signal + noise; or
data = common variation + unique variation
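
As a notational sketch only (the symbols are assumed here for illustration and are not prescribed by the course), all three statements can be read as "observed value = part accounted for by the analysis + remainder":

```latex
% y_i       : the i-th observed value (assumed notation)
% \hat{y}_i : the part of y_i that the procedure accounts for
% e_i       : the part left over
y_i = \hat{y}_i + e_i, \qquad i = 1, \dots, n
```

What plays the role of each term differs from one procedure to the next.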

For each of regression analysis, analysis of variance, and principal components analysis, describe the particular version of that common conceptual model that applies, and explain why that version makes sense given the goals of the analysis.

2.      Describe the general context in which regression analysis is applicable. (What is it used for? Are there any assumptions that underlie its use? How is it implemented in practice?)

3.      Describe how "nonparametric" regression works, as typified by a loess/lowess curve added to a bivariate scatter plot. How is the curve constructed? Are there particular quantities that are optimized (as in standard regression analysis, where the sum of squared residuals is minimized)? What controls the smoothness of the fitted curve? How does one tell whether a loess curve does a good job of representing the relationship between the variables?

4.      Suppose you have a set of observations that are described by multiple variables, and that can also be classified into two or more groups. What are some strategies for figuring out how those groups of observations differ from one another?