GEOG 414/514:  Advanced Geographic Data Analysis
Multivariate displays (Part 1)

 

Multivariate descriptive displays or plots are designed to reveal the relationships among several variables simultaneously.  As was the case when examining relationships among pairs of variables, there are several basic characteristics of the relationships among sets of variables that are of interest.  These include:

 

  • the form of the relationships
  • the strength of the relationships, and 
  • the dependence of the relationships on external circumstances (usually external to the pairs of variables being examined).

 

1. Enhanced 2-D Scatter plots

 

The scatter diagram or scatter plot is the workhorse bivariate plot, and can be enhanced to illustrate relationships among three (or four) variables. 

 

  • The color plot

 

A basic “color plot” displays the values of three variables at a time using colored symbols: the value of the first variable determines the relative position of the symbol along the X-axis, the value of the second determines its relative position along the Y-axis, and the value of the third determines its color.

 

The Specmap data set illustrates the variations over time of oxygen-isotope data (which record global ice volume) that should theoretically depend on insolation.  However, a simple plot of O18 against Insol (and their correlation) suggests otherwise:

 

attach(specmap)
plot(O18 ~ Insol)



The correlation coefficient, r, is also rather low.

 

cor(O18, Insol)

[1] -0.2415094

 

Plotting O18 as a function of Age, and color coding the symbols by Insol levels, reveals the nature of the control of ice volume by insolation:

 

Now it’s possible to see that warm (and warming) intervals (points near the top of the plot) tend to have high (orange) solar radiation values, while cooling and cold intervals follow periods of declining solar radiation (blue).
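
These notes do not include the code for the color-coded plot described above.  A minimal sketch of one way to build it follows.  It assumes a data frame with columns Age, O18 and Insol; the synthetic specmap_demo data frame, the ten color classes and the blue-to-orange ramp are illustrative assumptions, not the original settings.

```r
# Synthetic stand-in for the specmap data (hypothetical values, illustration only)
set.seed(1)
specmap_demo <- data.frame(Age = seq(0, 780, by = 2))
specmap_demo$Insol <- 440 + 30 * sin(2 * pi * specmap_demo$Age / 23)
specmap_demo$O18 <- sin(2 * pi * specmap_demo$Age / 100) +
  rnorm(nrow(specmap_demo), sd = 0.2)

# Divide Insol into equal-width classes and give each class a color
nclr <- 10
plotclr <- colorRampPalette(c("blue", "gray90", "orange"))(nclr)
colornum <- cut(specmap_demo$Insol, nclr, labels = FALSE)
colcode <- plotclr[colornum]

# O18 against Age as a gray line, then points colored by Insol class
plot(specmap_demo$Age, specmap_demo$O18, type = "l", col = "gray60",
     xlab = "Age", ylab = "O18")
points(specmap_demo$Age, specmap_demo$O18, pch = 16, col = colcode)
```

With the real data, the same steps would apply with specmap in place of specmap_demo.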

 

Information from four variables at a time can also be displayed.  In this example for the Summit Cr. data, the plotting character is determined by Reach and its color by HU.  Although these are factors, numerical variables could also be plotted.

 

attach(sumcr)
plot(WidthWS ~ CumLen, pch=as.integer(Reach), col=as.integer(HU))

 

  • The bubble plot

 

The bubble plot displays the values of three variables at a time using graduated symbols (usually circles): the value of the first variable determines the relative position of the symbol along the X-axis, the value of the second determines its relative position along the Y-axis, and the value of the third determines its size.  Here’s a simple map of the elevations of the Oregon climate stations.

 

attach(orstationc)
plot(lon, lat, type="n")
symbols(lon, lat, circles=elev, inches=0.1, add=TRUE)
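
The circles argument of symbols() gives the radii of the circles, so a station twice as high is drawn with a circle of four times the area.  If area-proportional symbols are preferred, one option is to pass the square root of the variable instead.  The sketch below assumes non-negative values and uses a small made-up data frame (stations_demo) in place of orstationc.

```r
# Made-up station data (hypothetical; stands in for orstationc)
stations_demo <- data.frame(
  lon  = c(-123.9, -122.7, -121.3, -119.0, -117.8),
  lat  = c(46.1, 45.5, 44.1, 43.6, 44.8),
  elev = c(10, 60, 1100, 1260, 1050)
)

# Radii proportional to sqrt(elev), so that circle AREA is proportional to elev
radius <- sqrt(stations_demo$elev)

plot(stations_demo$lon, stations_demo$lat, type = "n",
     xlab = "lon", ylab = "lat")
symbols(stations_demo$lon, stations_demo$lat, circles = radius,
        inches = 0.2, add = TRUE)
```
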

2. 3-D Scatter plots

3-D scatter plots (as distinct from scatter plot matrices involving three variables) illustrate the relationship among three variables by plotting them in a three-dimensional "workbox".  There are a number of enhancements of the basic 3-D scatter plot, such as the addition of drop lines, lines connecting points, symbol modification and so on.

 

  • 3-D point cloud plot

 

The 3-D point cloud plot displays the values of three variables at a time by plotting them in a 3-D workbox: the value of the first variable determines the relative position of the symbol along the X-axis, the value of the second determines its relative position along the Y-axis, and the value of the third determines its relative position along the Z-axis.  This plot makes use of the lattice package.

library(lattice)
attach(orstationc)
cloud(elev ~ lon*lat)

Notice that you can still see the outline of the state, because elevation is a fairly well behaved variable.

  • 3-D Scatter plots (using the scatterplot3d package)

The scatterplot3d package (by Ligges and Mächler) provides a way of constructing a 3-D point-cloud display with some nice embellishments.  As in making maps, the first part of the code does some setup, such as determining the number of colors to plot and getting their definitions.  The second block produces the plot.

library(scatterplot3d)
library(RColorBrewer)
attach(orstationc)

# get colors for labeling the points
plotvar <- pann # pick a variable to plot
nclr <- 8 # number of colors
plotclr <- brewer.pal(nclr,"PuBu") # get the colors
colornum <- cut(rank(plotvar), nclr, labels=FALSE)
colcode <- plotclr[colornum] # assign color

# scatter plot
plot.angle <- 45
scatterplot3d(lon, lat, plotvar, type="h", angle=plot.angle, color=colcode, pch=20, cex.symbols=2, col.axis="gray", col.grid="gray")

The “z-variable,” in this case, annual precipitation, is plotted as a dot, and for interpretability a drop line is plotted below the dot.  This simple addition facilitates finding the location of each point (where it hits the x-y, or latitude-longitude plane), as well as the value of annual precipitation.
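
The color-assignment step in the listing above can be checked on a small vector.  Ranking the values first and then cutting the ranks into nclr equal-width intervals yields classes with roughly equal numbers of points, whereas cutting the raw values gives equal-width classes that a skewed variable fills unevenly.  The vector demo_var below is made up for illustration.

```r
demo_var <- c(1, 2, 3, 4, 5, 6, 7, 100)  # made-up, strongly skewed values
nclr <- 4

# Equal-count classes: cut the ranks (the approach used in the listing)
by_rank <- cut(rank(demo_var), nclr, labels = FALSE)

# Equal-width classes: cut the raw values
by_value <- cut(demo_var, nclr, labels = FALSE)

table(by_rank)   # two values fall in each of the four classes
table(by_value)  # seven values fall in class 1 and one in class 4
```
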

Maps can be added to the 3-D scatter plot to improve interpretability:

library(scatterplot3d)
library(maps)
library(RColorBrewer)
attach(orstationc)
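
The listing above stops after loading the packages.  A minimal sketch of how an outline might be added follows; it is not the original code.  It relies on scatterplot3d() returning a list whose points3d() element adds points or lines to the existing plot, and on map() from the maps package returning the outline coordinates when called with plot=FALSE.  The station values in stations_demo, the choice of county outlines, the z-axis limits, and drawing the outline at z = 0 are all assumptions.

```r
library(scatterplot3d)
library(maps)

# Made-up station data (hypothetical; stands in for orstationc)
stations_demo <- data.frame(
  lon  = c(-123.9, -122.7, -121.3, -119.0, -117.8),
  lat  = c(46.1, 45.5, 44.1, 43.6, 44.8),
  pann = c(1700, 950, 300, 280, 420)
)

# Outline coordinates only (no plot); separate line segments are split by NA
or.map <- map("county", "oregon", plot = FALSE)

# 3-D scatter plot with drop lines; keep the returned object for later additions
s3d <- scatterplot3d(stations_demo$lon, stations_demo$lat, stations_demo$pann,
                     type = "h", angle = 45, pch = 20, cex.symbols = 2,
                     xlim = or.map$range[1:2], ylim = or.map$range[3:4],
                     zlim = c(0, 2000),
                     col.axis = "gray", col.grid = "gray")

# Draw the outline on the floor of the plot (z = 0 is an assumption)
s3d$points3d(or.map$x, or.map$y, rep(0, length(or.map$x)), type = "l")
```
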

  • OpenGL surface and point plots

The rgl package (by D. Adler) can be used to plot points (and surfaces and lines) in a 3-D space.  The main feature that distinguishes this approach is the ability to rotate the cloud of points "on the fly."  The code is shown below; when the image appears, it can be rotated and spun by dragging the mouse within the window.  Holding down the left button while dragging rotates the balls, while holding down the right button changes the perspective.

library(rgl)
example(rgl.surface)

rgl.clear()
example(rgl.spheres)

Here’s a second example, for a gridded data set of Oregon climate data (elevation in this example) [orgrid.csv]:

library(rgl)
library(RColorBrewer)
attach(orgrid)

plotvar <- elev # pick a variable to plot
nclr <- 8 # number of colors
plotclr <- brewer.pal(nclr,"PuOr") # get the colors
colornum <- cut(rank(plotvar), nclr, labels=FALSE)
colcode <- plotclr[colornum] # assign color

x <- (lon-min(lon))/(max(lon)-min(lon))
y <- (lat-min(lat))/(max(lat)-min(lat))
z <- (plotvar-min(plotvar))/(max(plotvar)-min(plotvar))

rgl.clear()
rgl.points(x, y, z/2, color=colcode, size=4)
rgl.bbox(color="#333377", emission="#333377", specular="#3333FF",
     shininess=5, alpha=0.8 )

Plot the SPECMAP data:

attach(specmap)
x <- (Age-min(Age))/(max(Age)-min(Age))
y <- (O18-min(O18))/(max(O18)-min(O18))
z <- (Insol-min(Insol))/(max(Insol)-min(Insol))

rgl.clear()
rgl.bbox(color="#333377", emission="#333377", specular="#3333FF",
     shininess=5, alpha=0.8 )
rgl.points(x, y, z, size=4)
rgl.texts(.9, 0, 1, "-Age", col="red")
rgl.texts(.1, 0, 1, "Age+", col="red")
rgl.texts(0, 0, .9, "+Insol", col="red")
rgl.texts(0, 0, .1, "-Insol", col="red")
rgl.texts(0, .9, 0, "+O18", col="red")
rgl.texts(0, .1, 0, "-O18", col="red")

 

3. Contour, levels, and surface plots

Contour plots are probably the multivariate plot type most familiar to geographers.  In R, there are two kinds of contour plots: "2-D contour plots," in which contours are drawn on a standard set of scatter diagram axes, and levels plots, which convey similar information by coloring or shading a grid of points in the 2-D space so that the contouring is achieved visually.  All contour plots are constructed by selecting three variables.  There are additional ways of creating contour plots, which will be discussed later.

  • 2-D contour plot

The following script creates a contour plot for annual precipitation data at Oregon climate stations (the lines), overlays this on a levels or color plot, and also adds the station locations as dots.  The interp() function from the akima package does the work of interpolating the scattered values of precipitation at the stations onto a regular grid (interp.out) that can be inferred from the squares in the levels plot.  Note that areas within the plotting region that are “unsupported” by stations are treated as missing.  This is usually preferable, but sometimes for aesthetic reasons one might want to smear the data into empty regions.  The image() function plots the colors, while the contour() function plots the contour lines.

library(akima)
library(RColorBrewer)
attach(orstationc)
x.ctrl <- lon
y.ctrl <- lat
z.ctrl <- pann
interp.out <- interp(x.ctrl, y.ctrl, z.ctrl,
    xo= seq(-124.5000, -116.8333, .1667), yo= seq(42.0000, 46.1667, .0833))

image(interp.out, col=brewer.pal(8,"PuOr"))
contour(interp.out, xlab="", ylab="", add=TRUE)
points(x.ctrl, y.ctrl)
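
Both image() and contour() accept a list with components x, y and z, where z is a matrix with one row per x value and one column per y value; this is the form of the interpolated grid described above.  The sketch below builds such a list from R's built-in volcano elevation matrix, so it runs without akima or the station data.  The grid coordinates and the terrain.colors palette are arbitrary choices for illustration.

```r
# Built-in elevation matrix (87 rows by 61 columns) as a stand-in grid
grid_demo <- list(
  x = seq_len(nrow(volcano)),  # one x value per row of z
  y = seq_len(ncol(volcano)),  # one y value per column of z
  z = volcano
)

image(grid_demo, col = terrain.colors(8))  # colored grid cells
contour(grid_demo, add = TRUE)             # contour lines drawn on top
```
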