GEOG 414/514: Advanced Geographic Data Analysis
Visualizing high-resolution and high-dimension data sets

There has been a rapid increase in the availability of data sets that may contain thousands of observations (high-resolution) and tens or hundreds of variables or attributes (high-dimension).  Such data sets arise from

  • satellite remote sensing and other remote-sensing methods (e.g. hyperspectral methods)

  • comprehensive censuses and other polling activities

  • "reanalysis" sets of climatological variables that may consist of multiple observations per day over spans of multiple decades

  • grid-point and other forms of data exported from GIS-managed data sets.

Although computing resources now permit the easy and rapid numerical analysis of such data sets, there is still a need for visualizing and understanding the results.  Two approaches are evolving:

  • methods for visualizing many hundreds or thousands (or more) data points using "conventional" approaches, with about a million points now (2008) considered to represent a "large" data set

  • methods for examining "linked" displays (usually interactively) that allow individual observations to be tracked across multiple views of a data set.

In R, some of these approaches are implemented in the rggobi and iplots packages, and by appropriate modification of commonplace displays like scatterplots and parallel coordinate plots.
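
As a rough illustration of what "appropriate modification" of a commonplace display might look like, the sketch below draws a scatterplot of many points using small, semi-transparent symbols, then a smoothed-density version of the same plot, and finally a parallel coordinate plot of a random subsample. The data are simulated purely for illustration (they are not one of the course data sets), and the variable names, sample sizes, and transparency values are assumptions chosen for the example rather than recommended settings.

```r
# Simulated data: an assumption for illustration only, not a course data set
set.seed(1)
n <- 100000
x <- rnorm(n)
y <- 0.6 * x + rnorm(n, sd = 0.8)

# 1. Conventional scatterplot, modified to reduce overplotting:
#    small filled symbols drawn in a semi-transparent color
plot(x, y, pch = 16, cex = 0.3, col = rgb(0, 0, 0, alpha = 0.05))

# 2. Density-based alternative: color shows where the points concentrate
smoothScatter(x, y)

# 3. Parallel coordinate plot of a random subsample of the observations,
#    again using transparency so that dense bundles of lines stand out
library(MASS)
d <- data.frame(x = x, y = y, z = x * y)[sample(n, 2000), ]
parcoord(d, col = rgb(0, 0, 0, alpha = 0.1))
```

With transparency, regions where many symbols or lines overlap appear darker, so the display conveys the density of observations and not just their extent. The interactive, linked displays mentioned above are not covered by this sketch.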

Examples of high-resolution and high-dimension data sets
