Coplots (conditioning scatter plots)

Conditioning scatter plots are multipanel displays in which each panel contains a subset of the data.  This subset can be either a) those observations that fall in a particular group, or b) those observations for which a variable falls within a particular range of its values.  The idea is that the individual panels should illustrate the relationship between a pair of variables over part of the range of one or two "conditioning" variables (i.e. with two conditioning variables, the relationship conditional on one of them lying in one particular interval and the other lying in a particular interval of its own).
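The two-conditioning-variable case described above can be sketched as follows.  This is only an illustration: it uses the built-in quakes data set (an assumption, not one of the data sets used in these notes), and panels_drawn is a hypothetical counter added to show that the panel function is called once for each non-empty panel.

```r
# Illustrative sketch with the built-in quakes data (an assumption, not course data).
panels_drawn <- 0   # hypothetical counter of non-empty panels

# lat vs. long, conditioned on depth (3 intervals) and mag (2 intervals)
coplot(lat ~ long | depth * mag, data = quakes, number = c(3, 2),
       panel = function(x, y, ...) {
         # called once for each panel that contains at least one observation
         panels_drawn <<- panels_drawn + 1
         points(x, y, ...)
       })

print(panels_drawn)
```

Each panel shows lat against long only for those earthquakes whose depth and magnitude both fall in that panel's pair of intervals.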

Coplot, conditioning by one factor variable

Plot scatter diagrams for Yes as a function of the base-10 logarithm of population, log10(Pop), conditioned by country:

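attach(scanvote)
# one panel per level of Country, laid out in 3 columns
coplot(Yes ~ log10(Pop) | Country, columns=3,
     panel=function(x,y,...) {
          # x, y are the observations for the current panel only
          panel.smooth(x,y,span=.8,iter=5,...)   # points plus lowess curve
          abline(lm(y ~ x), col="blue")          # least-squares line
     } )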

Note the use of the "panel" function here.  The coplot() function determines which subset of observations should appear in each panel, while the two function calls within the panel function (panel.smooth() and abline()) perform their tasks on that subset of observations.  In other words, coplot() selects the observations of log10(Pop) and Yes for a particular panel (i.e. country) and passes them to the panel function as x and y; panel.smooth() then plots the points and draws a lowess curve, and abline() adds the least-squares line for those observations.
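To see what the panel function actually receives, here is a minimal sketch using the built-in iris data set in place of scanvote (an assumption made so the example is self-contained).  The panel_sizes vector is a hypothetical helper, added only to record how many observations coplot() hands to each panel.

```r
# Illustrative sketch using the built-in iris data (not the scanvote data).
panel_sizes <- integer(0)   # hypothetical helper: number of points per panel

coplot(Sepal.Length ~ Sepal.Width | Species, data = iris, columns = 3,
       panel = function(x, y, ...) {
         # x and y hold only the observations belonging to the current panel
         panel_sizes <<- c(panel_sizes, length(x))
         panel.smooth(x, y, span = 0.8, iter = 5, ...)  # points plus lowess curve
         abline(lm(y ~ x), col = "blue")                # least-squares line
       })

print(panel_sizes)
```

Because Species is a factor, each observation lands in exactly one panel, so the recorded counts add up to the number of rows in iris.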

Coplot, conditioning by one continuous numeric variable

Most of the time, the conditioning variables are continuous numeric variables.  Here's a coplot for WidthWS as a function of DepthWS, conditioned by CumLen:

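attach(sumcr)
# plotting symbol varies with Reach; number=3 gives three conditioning intervals
coplot(WidthWS ~ DepthWS | CumLen, pch=14+as.integer(Reach), cex=1.5,
     number=3, columns=3,
     panel=function(x,y,...) {
          panel.smooth(x,y,span=.8,iter=5,...)   # points plus lowess curve
          abline(lm(y ~ x), col="blue")          # least-squares line
     } )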

Because the arrangement of the reaches is known, the resulting plot should be no surprise.  The plotting characters are determined by Reach in order to reveal the extent of overlap in the conditioning intervals ("shingles"): observations that fall in two overlapping intervals appear in both of the corresponding panels.  The plot could be regenerated using Reach as the conditioning variable, which would result in no overlap between the individual panels.
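The overlap between shingles can be inspected directly with co.intervals(), which computes the kind of conditioning intervals that coplot() uses for a numeric variable.  The sketch below is only illustrative: it uses the depth variable of the built-in quakes data set (an assumption; sumcr is not used here), and the names shingles and membership are hypothetical.

```r
# Illustrative sketch with the built-in quakes data (an assumption; not sumcr).
depth <- quakes$depth

# Three conditioning intervals ("shingles"); overlap = 0.5 is the default in coplot()
shingles <- co.intervals(depth, number = 3, overlap = 0.5)
print(shingles)   # one row per interval: lower and upper limits

# For each observation, count how many of the intervals contain it
membership <- sapply(depth, function(d) {
  sum(d >= shingles[, 1] & d <= shingles[, 2])
})
print(table(membership))
```

Observations counted in more than one interval are the ones that would be plotted in more than one panel of the corresponding coplot.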