GEOG 414/515: Advanced Geographic Data Analysis

Significance evaluation

Descriptive statistics (like the mean or standard deviation) are useful for characterizing a set of observations, but the significance of a particular value of a statistic (i.e. is it large or small? commonplace or unusual?) will in general not be immediately apparent, and must be judged against some standard external to the data set. The significance of a particular value of a statistic (e.g. the 30-yr mean annual temperature of a place) can be evaluated by determining how rare or how commonplace that particular value is. In ordinary practice, a value of a statistic is regarded as significant if its absolute value is equaled or exceeded in no more than 5% of the realizations of that statistic.

The reference distribution

The significance of a particular value of a statistic can be judged by comparing that value to the distribution of "all" values of the statistic. That distribution is known as the reference distribution, and in practice it can be either an empirical distribution (as represented by a histogram or cumulative frequency curve for an actual data set) or a theoretical distribution (as represented by an equation or function).

Empirical reference distributions

Empirical reference distributions are constructed using observed (i.e. real) data sets which, if sufficiently large, allow statements to be made about the chances of observing particular values of a variable (possibly new observations of the variable), or of observing combinations of values of a variable, such as the mean for some subset of observations.

A theoretical reference distribution: the normal (or standard normal) distribution

Much of the time, a data set large enough to make probabilistic-type statements about the values of a statistic does not exist, or the statistic may be difficult to observe.
In such cases, theoretical reference distributions allow the significance of particular values of statistics to be evaluated. One distribution that is applicable to many cases in practice is the normal distribution.

Other reference distributions

Although the normal distribution is applicable in a number of situations (owing to the implications of the Central Limit Theorem), situations also arise in which other reference distributions apply.

Readings: Owen (The R Guide): Ch. 6; O'Sullivan and Unwin (Geographic Information Analysis): section A.4; Rogerson (Statistical Methods): sections 2.1-2.5.
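
Example: evaluating significance against a reference distribution

The 5% criterion and the two kinds of reference distribution described above can be illustrated with a short sketch. It is written in Python using only the standard library (the course readings use R, so this is an illustration rather than the course's own code). The function names `empirical_exceedance` and `normal_exceedance` are hypothetical, and the "observed" data are synthetic random draws standing in for a large real data set.

```python
import random
from statistics import NormalDist

SIGNIFICANCE_LEVEL = 0.05  # the conventional 5% threshold from the text


def empirical_exceedance(reference, value):
    """Fraction of reference values whose absolute value equals or exceeds |value|."""
    threshold = abs(value)
    return sum(abs(x) >= threshold for x in reference) / len(reference)


def normal_exceedance(value, mu=0.0, sigma=1.0):
    """Two-tailed probability, under a normal distribution with mean mu and
    standard deviation sigma, of a value at least as far from mu as `value`."""
    z = abs(value - mu) / sigma  # convert to a standard normal score
    return 2.0 * (1.0 - NormalDist().cdf(z))


# Assumption: synthetic standardized values stand in for a large observed data set.
rng = random.Random(42)
reference = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

value = 2.5  # the particular value of the statistic being evaluated
p_empirical = empirical_exceedance(reference, value)
p_theoretical = normal_exceedance(value)

print(f"empirical exceedance fraction:  {p_empirical:.4f}")
print(f"normal (two-tailed) probability: {p_theoretical:.4f}")
print("significant at 5%:", p_theoretical <= SIGNIFICANCE_LEVEL)
```

With a reference data set this large, the empirical fraction and the theoretical probability should be close. The theoretical version is the one that remains usable when no such data set exists.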