GEOG 414/514: Advanced Geographic Data Analysis

6. More Lattice Plots

Lattice plots are the R version of the "Trellis" plots that were originally implemented in the S language at Bell Labs. The aim of these plots is to extend the usual kinds of univariate and bivariate plots, like histograms or scatter plots, to situations where some external variables, possibly categorical or "factor" variables, may influence the distribution of the data or the form of a relationship. They do this by generating a trellis or lattice of plots: an array of simple plots, arranged according to the values of some "conditioning" variables.

A Trellis-type plot

A multipanel plot, in which the individual panels are "conditioned" by the value of a third variable, here longitude, can be illustrated for the Oregon climate station data using the following script:

library(lattice)

The idea here is to chop longitude into
eight bands from west to east using
the equal.count() function.
(The third argument here, 0.5, indicates that the bands should overlap
by 50 percent.) Then the lattice plot is made using the xyplot() function, which makes a separate scatter
plot for each longitude band, showing the relationship between annual
precipitation and elevation. A
"shingles" plot is added at the bottom to indicate the range of
longitudes that go into each plot. Here's the resulting plot:
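
A minimal sketch of a script of this kind follows. The station data are not included here, so the data frame `orstations` and its columns `lon`, `elev` and `pann` are synthetic, hypothetical stand-ins, and the panel function is only one plausible way to add the two summary lines.

```r
library(lattice)

# Synthetic stand-in for the Oregon station data (hypothetical values):
# wetter toward the coast (more negative longitude), weak elevation effect.
set.seed(1)
n <- 92
orstations <- data.frame(lon  = runif(n, -124.5, -117),
                         elev = runif(n, 0, 1500))
orstations$pann <- 300 + 200 * (-117 - orstations$lon) +
  0.2 * orstations$elev + rnorm(n, sd = 50)

# Chop longitude into eight bands (a "shingle") that overlap by 50 percent
Lon2 <- equal.count(orstations$lon, 8, 0.5)

# One scatter plot per longitude band, each with a straight regression
# line and a loess smooth added by the panel function
p <- xyplot(pann ~ elev | Lon2, data = orstations, layout = c(4, 2),
  panel = function(x, y, ...) {
    panel.xyplot(x, y, ...)
    panel.lmline(x, y, col = "red")
    panel.loess(x, y, span = 0.9, degree = 1, col = "blue")
  },
  xlab = "Elevation (m)", ylab = "Annual precipitation (mm)")

# Lattice plot on top, shingle plot of the longitude bands underneath
print(p, position = c(0, 0.3, 1, 1), more = TRUE)
print(plot(Lon2), position = c(0, 0, 1, 0.4))
```

With the default panel order, the westernmost band is drawn at the bottom left and the easternmost at the top right.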
Notice that in each panel, a straight regression line (more about regression later) and a smooth lowess curve have been added to help summarize the relationships. The panels are arranged in longitudinal order from low (west) to high (east; remember that in the western hemisphere, longitudes are negative). The plots are certainly interesting. The general idea is that precipitation should increase with increasing elevation, but at least for the western part of the state the reverse seems to be true! What is going on here is that proximity to the Pacific is a much more important control than elevation, and low-elevation coastal and inland stations are quite wet. In the eastern part of the state (top row of panels), the expected relationship holds, but it's kind of hard to see because the wet western part of the state stretches out the scale.

Some more plots

The following plots explore the seasonality of precipitation in the Yellowstone region. This first plot uses glyphs to show the values of twelve monthly precipitation variables as "spokes" of a wheel, where each variable is plotted relative to its overall range. The first block of code below sets things up, and the stars() function does the plotting.

trellis.device(color=TRUE, theme = "col.whitebg")
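
As a rough illustration of this kind of glyph map, the sketch below draws one star per station at the station's location. The coordinates and the monthly values in `precip` are synthetic and hypothetical (a winter peak in the west and a summer peak in the east are simply assumed), and the sizes and key position are arbitrary choices.

```r
# Synthetic stand-in for the Yellowstone-region stations (hypothetical values)
set.seed(2)
n <- 20
lon <- runif(n, -113, -108)
lat <- runif(n, 42, 47)

# Twelve monthly values per station: an assumed winter (January) peak
# for western stations and a summer (July) peak for eastern ones
peak <- ifelse(lon < -110.5, 1, 7)
precip <- t(sapply(peak, function(pk)
  50 + 30 * cos(2 * pi * (1:12 - pk) / 12) + rnorm(12, sd = 3)))
colnames(precip) <- month.abb

# One glyph per station, drawn at the station location; scale = TRUE
# rescales each month (column) to its own 0-1 range before plotting
loc <- stars(precip, scale = TRUE, locations = cbind(lon, lat), len = 0.3,
             labels = NULL, key.loc = c(-106.5, 42.5),
             xlim = c(-113.5, -105.5), ylim = c(41.5, 47.5),
             axes = TRUE, main = "Monthly precipitation (synthetic)")
```

Because stars() places the first variable (January) pointing right and proceeds counterclockwise, winter months fall on the right side of each glyph and summer months on the left.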
Here the stars wind up looking more like fans. The legend indicates that stations with fans that open out to the right are stations with winter precipitation maxima (like in the southwestern portion of the region), while those that open toward the left have summer precipitation maxima (like in the southeastern portion of the region).

The next examples show a couple of conditioning plots (coplots) that illustrate how the relationship between January and July precipitation varies with (is conditioned on) elevation. The first block of code does some setup.

# create some conditioning variables
# January vs July Precipitation Ratios by Elevation
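
A conditioning plot of this type might be set up roughly as follows. The data frame `yell` and its columns are synthetic and hypothetical, with a steeper January-versus-July relationship at high elevation built in by assumption; the choice of four elevation bands with 25 percent overlap is also an assumption.

```r
library(lattice)

# Synthetic stand-in for the Yellowstone-region data (hypothetical values)
set.seed(3)
n <- 120
yell <- data.frame(elev = runif(n, 1000, 3000),
                   pjul = runif(n, 20, 60))
# Assumed behavior: January values rise faster with July values
# at higher elevations
yell$pjan <- yell$pjul * (0.4 + 0.0005 * (yell$elev - 1000)) +
  rnorm(n, sd = 4)

# Conditioning variable: four overlapping elevation bands
Elevation <- equal.count(yell$elev, 4, 0.25)

# January vs. July, one panel per elevation band, with a loess smooth
p <- xyplot(pjan ~ pjul | Elevation, data = yell, layout = c(2, 2),
  panel = function(x, y, ...) {
    panel.xyplot(x, y, ...)
    panel.loess(x, y, span = 0.9, col = "red")
  },
  xlab = "July", ylab = "January")
print(p)
```

The two lowest elevation bands land in the bottom row of panels and the two highest in the top row.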
The plot shows that the relationship between January and July precipitation indeed varies with elevation. At low elevations, there is proportionally less January precipitation for the same July values (lower two panels of the lattice plot), but at higher elevations there is proportionally more (top two panels). This relationship points to some orographic (i.e. related to the elevation of the mountains) amplification of the winter precipitation.

The next plot shows how the relationship between January and July precipitation varies spatially.

# January vs July Precipitation Ratios by Latitude and Longitude
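
Conditioning on two variables at once can be sketched in the same way. Again the data are synthetic and hypothetical (a steeper January response in the southwest is assumed), and splitting each coordinate into two overlapping bands is an assumption chosen to give a 2-by-2 array of panels.

```r
library(lattice)

# Synthetic stand-in for the Yellowstone-region data (hypothetical values)
set.seed(4)
n <- 160
yell <- data.frame(lon  = runif(n, -113, -108),
                   lat  = runif(n, 42, 47),
                   pjul = runif(n, 20, 60))
# Assumed behavior: the January-vs-July slope is steepest in the southwest
slope <- 0.5 + 0.15 * (-108 - yell$lon) / 5 + 0.15 * (47 - yell$lat) / 5
yell$pjan <- slope * yell$pjul + rnorm(n, sd = 3)

# Two conditioning variables, each split into two overlapping bands
Longitude <- equal.count(yell$lon, 2, 0.5)
Latitude  <- equal.count(yell$lat, 2, 0.5)

# The first conditioning variable (Longitude) varies across columns,
# the second (Latitude) across rows, starting from the bottom
p <- xyplot(pjan ~ pjul | Longitude * Latitude, data = yell,
  panel = function(x, y, ...) {
    panel.xyplot(x, y, ...)
    panel.loess(x, y, span = 0.9, col = "red")
  },
  xlab = "July", ylab = "January")
print(p)
```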
Notice that the steepest curve lies in the panel representing the southwestern part of the region (low latitude and low longitude, i.e. the bottom left panel), which suggests that winter (January) precipitation is relatively more important there, which is also apparent on the stars plot above.

Next, the general idea that seems to be emerging, that there are variations within the region in the relative importance of summer and winter precipitation, can be explored with a parallel-coordinate plot, which allows different precipitation "regimes" to be detected by the appearance of distinct "bundles" of curves.

# Parallel plot of precipitation ratios
Notice that at low elevations, most of the stations behave similarly and show a distinct summer precipitation maximum (only one station seems to show a winter maximum). At high elevations, there is more variability, but a general tendency for winter precipitation to dominate.

Lattice plots can extend many of the basic univariate and bivariate plots. For example, a set of scatter plot matrices can be generated for the high/low latitude and longitude slices.

# Lattice plot of scatter plot matrices
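
One way to build such a set of scatter plot matrices is with lattice's splom(), as in the sketch below. The variables `pjan`, `pjul` and `pann` are synthetic and hypothetical, and splitting latitude and longitude at their medians is an assumption made to get the four high/low slices.

```r
library(lattice)

# Synthetic stand-in for the Yellowstone-region data (hypothetical values)
set.seed(6)
n <- 100
yell <- data.frame(lon = runif(n, -113, -108),
                   lat = runif(n, 42, 47))
yell$pjan <- 40 + 6 * (-108 - yell$lon) + rnorm(n, sd = 5)
yell$pjul <- 30 + 4 * (yell$lon + 113) + rnorm(n, sd = 5)
yell$pann <- 6 * (yell$pjan + yell$pjul) + rnorm(n, sd = 20)

# High/low slices: split each coordinate at its median
LonBand <- factor(yell$lon > median(yell$lon),
                  labels = c("low longitude", "high longitude"))
LatBand <- factor(yell$lat > median(yell$lat),
                  labels = c("low latitude", "high latitude"))

# One scatter plot matrix per latitude/longitude slice
p <- splom(~ yell[, c("pjan", "pjul", "pann")] | LonBand * LatBand,
           data = yell, pscales = 0, varname.cex = 0.8)
print(p)
```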
These plots provide a different perspective on the variations of precipitation across the region, but they're consistent with what the other plots show.

The spplot() function in the sp package is a Lattice-plot type method, and can be thought of as either extending the capabilities of Lattice plots to maps, or extending the ability of R to produce multi-panel maps. The following example uses a data set of the locations and elevations of Oregon cirque basins (upland basins eroded by glaciers), and whether or not they are currently (early 21st century) glaciated. Whether a cirque is occupied by a glacier or not is basically determined by the trade-off between snow accumulation (and hence winter precipitation) and summer ablation (or melting, and hence summer temperature). Cirque basins not currently occupied by glaciers were, of course, occupied in the past, while those occupied today indicate where a "glacier-safe" climate prevails (at least for now).

Here are the links to the shapefile components that contain the locations of the cirques, and information on whether they are currently glaciated (G) or unglaciated (U): [.dbf] [.shp] [.shx]

As usual, the first blocks of code load the necessary libraries and data sets:

library(gpclib)
cirques.shp <- readShapePoints(file.choose(),

In the code below, the two as.factor() functions are used to turn the single variable cirques.shp$Glacier, which has the values "G" and "U", into two "binary" (0 or 1) variables. The variable cirques.shp$Glaciated will contain 1's for glaciated cirques, and 0 otherwise (i.e. unglaciated cirques), while the variable cirques.shp$Unglaciated will contain 1's for unglaciated cirques, and 0 otherwise. The two variables are obviously redundant (the elements would sum to 1 for each observation), but this makes the illustration of the method more transparent.

# multi-panel lattice plot
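
The construction of the two indicator variables can be sketched as follows. The data frame `cirques` is a synthetic, hypothetical stand-in for the shapefile's attribute table, and the two-panel map is drawn here with plain lattice xyplot() as a stand-in for spplot(), so that the sketch runs without the sp package; the numeric aspect ratio is likewise an assumption, a simple cosine-of-latitude correction.

```r
library(lattice)

# Synthetic stand-in for the cirque attribute table (hypothetical values)
set.seed(7)
n <- 40
cirques <- data.frame(Lon = runif(n, -122.5, -117),
                      Lat = runif(n, 42, 46),
                      Glacier = rep(c("G", "U"), times = c(12, 28)))

# Turn the single G/U variable into two binary (0/1) factors
cirques$Glaciated   <- as.factor(as.integer(cirques$Glacier == "G"))
cirques$Unglaciated <- as.factor(as.integer(cirques$Glacier == "U"))

# Stack the two indicators so that each one gets its own panel
long <- rbind(
  data.frame(cirques[c("Lon", "Lat")], name = "Glaciated",
             value = cirques$Glaciated),
  data.frame(cirques[c("Lon", "Lat")], name = "Unglaciated",
             value = cirques$Unglaciated))
long$name <- factor(long$name)

# Panel height/width ratio that roughly mimics a projected map
asp <- diff(range(cirques$Lat)) /
  (diff(range(cirques$Lon)) * cos(mean(cirques$Lat) * pi / 180))

# Two stacked panels; 0 is drawn in pink and 1 in turquoise
p <- xyplot(Lat ~ Lon | name, data = long, groups = value,
  layout = c(1, 2), as.table = TRUE, aspect = asp,
  par.settings = list(superpose.symbol =
    list(pch = 16, col = c("pink", "turquoise"))),
  auto.key = list(title = "Indicator value", columns = 2),
  xlab = "Longitude", ylab = "Latitude")
print(p)
```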
The top panel shows unglaciated cirques in pink and glaciated ones in turquoise, while the bottom panel shows the reverse: glaciated cirques in pink, unglaciated in turquoise. Note the aspect argument: this scales the horizontal and vertical axes of the plot in a way that makes the map look projected. It's pretty easy to see where the glaciated cirques occur. This way of mapping the cirques could also have been done by plotting a simple shape file, and then putting points on top, e.g.

plot(orotl.shp)
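
The outline-plus-points approach can be sketched with base graphics alone. Everything here is a hypothetical stand-in: `outline` is a simple rectangle rather than a real state outline, and the cirque locations and G/U codes are synthetic.

```r
# Hypothetical rectangular "outline" standing in for the state boundary
outline <- cbind(x = c(-124.5, -116.5, -116.5, -124.5, -124.5),
                 y = c(42, 42, 46.3, 46.3, 42))

# Synthetic cirque locations and G/U codes (hypothetical values)
set.seed(8)
n <- 40
cirques <- data.frame(Lon = runif(n, -122.5, -117),
                      Lat = runif(n, 42.2, 46),
                      Glacier = rep(c("G", "U"), times = c(12, 28)))

# One color per cirque: turquoise if glaciated, pink if not
cols <- ifelse(cirques$Glacier == "G", "turquoise", "pink")

# Draw the outline first, then put the points on top
plot(outline, type = "l", asp = 1 / cos(44 * pi / 180),
     xlab = "Longitude", ylab = "Latitude")
points(cirques$Lon, cirques$Lat, pch = 16, col = cols)
legend("bottomleft", legend = c("Glaciated", "Unglaciated"),
       pch = 16, col = c("turquoise", "pink"))
```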
Finally, here are some multi- and single-panel plots of climate-station data, the interpretation of which is straightforward.

# lattice-type plots of climate-station data