Fitting a bivariate regression model
Attach the data set and get a quick listing of the data for the first
example
> attach(regrex1)
> x
[1] 9.8492 11.1450 5.1209
8.1085 3.5599 1.9883 15.5322
[8] 7.5202 14.0202 7.5226 14.3272 3.8950
1.8707 0.8347
[15] 18.3413 1.6645 17.0579 9.2375 21.0000 3.5286 25.0000
[22] 9.9008 17.5990 13.8790 10.8690
12.6238 0.9719 14.4814
[29] 2.1792 18.4052
> y
[1] 6.8102
9.8437 4.9767 5.0006
3.5047 3.6419 7.8725
[8] 5.1442
8.3883 6.8204 9.0185
2.6148 3.7367 2.8651
[15] 10.7469 2.3710
9.6653 7.1926 11.7833 3.9537 15.0000
[22] 7.9152
9.3368 8.7587 6.7813
7.9924 2.2806 10.4050
[29] 3.7891 10.9067
Plot the data
plot(y ~ x)

Fit the regression model:
> # fit the model
> ex1.lm <- lm(y ~ x)
Type "attributes(ex1.lm)" to see the contents of ex1.lm
> attributes(ex1.lm)
$names
[1] "coefficients" "residuals" "effects"
[4] "rank" "fitted.values"
"assign"
[7] "qr" "df.residual" "xlevels"
[10] "call"
"terms"
"model"
$class
[1] "lm"
Type "ex1.lm" to see a brief summary of model:
> ex1.lm
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept) x
2.2481 0.4691
Type "summary(ex1.htm)" to see more details
> summary(ex1.lm)
Call:
lm(formula = y ~ x)
Residuals:
Min
1Q Median 3Q
Max
-1.66121 -0.53286 -0.02869 0.50436 2.36786
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.24814
0.29365 7.656 2.44e-08 ***
x 0.46906 0.02444
19.194 < 2e-16 ***
---
Signif. codes:
0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
Residual standard error:
0.8779 on 28 degrees of freedom
Multiple R-Squared:
0.9294, Adjusted R-squared: 0.9268
F-statistic: 368.4 on 1 and
28 DF, p-value:
< 2.2e-16
Plot the regression line, and the deviations that were minimized while fitting the regression equation
> plot(y
~ x)
> abline(ex1.lm,
col="red")
> segments(x,
fitted(ex1.lm), x, y, col="red")

A second example:
> attach(regrex2)
> regrex2
e x
y1 y2 y3
y4 y5 y6
1 -0.0578 9.8492
6.8102 6.8390 6.8535
6.8622 6.7523 6.6367
2 2.3679 11.1450 9.8437
8.6597 8.0677 7.7125 12.2115 16.9473
3 0.3266 5.1209
4.9767 4.8134 4.7317
4.6828 5.3033 5.9564
4 -1.0509 8.1085
5.0006 5.5260 5.7887
5.9464 3.9496 1.8478
5 -0.4132 3.5599
3.5047 3.7113 3.8146
3.8766 3.0915 2.2650
6 0.4611 1.9883
3.6419 3.4113 3.2960
3.2268 4.1030 5.0252
7 -1.6612
15.5322 7.8725 8.7030
9.1183 9.3675 6.2112
2.8888
8 -0.6313 7.5202
5.1442 5.4598 5.6177
5.7124 4.5128 3.2502
9 -0.4362
14.0202 8.3883 8.6063
8.7154 8.7808 7.9520
7.0795
10 1.0437 7.5226
6.8204 6.2985 6.0376
5.8810 7.8640 9.9513
11 0.0500 14.3272 9.0185
8.9934 8.9809 8.9734
9.0684 9.1684
12 -1.4603 3.8950
2.6148 3.3449 3.7100
3.9290 1.1544 -1.7662
13 0.6111 1.8707
3.7367 3.4311 3.2783
3.1867 4.3478 5.5700
14 0.2254 0.8347
2.8651 2.7523 2.6960
2.6622 3.0905 3.5413
15 -0.1045 18.3413 10.7469
10.7990 10.8252 10.8408 10.6423 10.4333
16 -0.6579 1.6645
2.3710 2.6999 2.8644
2.9630 1.7130 0.3971
17 -0.5841 17.0579 9.6653 9.9572 10.1033 10.1909 9.0811
7.9129
18 0.6114 9.2375
7.1926 6.8868 6.7339
6.6422 7.8039 9.0268
19 -0.3151 21.0000 11.7833
11.9408 12.0196 12.0669 11.4682 10.8380
20 0.0504 3.5286
3.9537 3.9284 3.9158
3.9083 4.0040 4.1048
21 1.0253 25.0000
15.0000 14.4873 14.2309 14.0771 16.0252 18.0758
22 1.0230 9.9008
7.9152 7.4036 7.1479
6.9945 8.9381 10.9840
23 -1.1664 17.5990 9.3368 9.9199 10.2115 10.3865 8.1703
5.8375
24 0.0005 13.8790 8.7587
8.7584 8.7583 8.7582
8.7591 8.7600
25 -0.5650 10.8690 6.7813 7.0638
7.2050 7.2898 6.2162
5.0861
26 -0.1771 12.6238 7.9924 8.0809
8.1251 8.1517 7.8153
7.4611
27 -0.4234 0.9719
2.2806 2.4923 2.5981
2.6616 1.8572 1.0104
28 1.3642 14.4814
10.4050 9.7228 9.3818
9.1772 11.7691 14.4974
29 0.5187 2.1792
3.7891 3.5297 3.4000
3.3222 4.3077 5.3452
30 0.0254 18.4052
10.9067 10.8939 10.8876 10.8838 10.9320 10.9827
> summary(regrex2)
e x y1
Min. :-1.66120
Min. : 0.8347 Min.
: 2.281
1st Qu.:-0.53280 1st Qu.: 3.6437 1st Qu.: 3.830
Median :-0.02865 Median : 9.8750 Median : 7.007
Mean :
0.00001 Mean :10.0678 Mean
: 6.971
3rd Qu.: 0.50430 3rd Qu.:14.4429 3rd Qu.: 9.257
Max. :
2.36790 Max. :25.0000 Max.
:15.000
y2 y3 y4
Min. : 2.492 Min.
: 2.598 Min. : 2.662
1st Qu.: 3.766 1st Qu.: 3.840 1st Qu.: 3.913
Median : 6.975 Median : 7.001 Median : 6.928
Mean :
6.970 Mean : 6.970
Mean : 6.971
3rd Qu.: 8.935 3rd Qu.: 9.084 3rd Qu.: 9.126
Max. :14.487 Max. :14.231 Max.
:14.077
y5 y6
Min. : 1.154 Min.
:-1.766
1st Qu.: 4.154 1st Qu.: 3.682
Median : 7.278 Median : 6.297
Mean :
6.971 Mean : 6.970
3rd Qu.: 9.036 3rd Qu.: 9.756
Max. :16.025 Max. :18.076
The variables x and y1 are the same as in the previous example, so as a second example
look at the relationship between y2 and x:
> plot(y2
~ x)

Note the stronger relationship. The second regression example results are:
> ex2.lm <- lm(y2 ~ x)
> summary(ex2.lm)
Call:
lm(formula = y2 ~ x)
Residuals:
Min
1Q Median 3Q
Max
-0.83062 -0.26641 -0.01437 0.25222 1.18393
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.24810
0.14682 15.31 3.9e-15 ***
x 0.46906 0.01222
38.39 < 2e-16 ***
---
Signif. codes:
0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
Residual standard error:
0.439 on 28 degrees of freedom
Multiple R-Squared:
0.9814, Adjusted R-squared: 0.9807
F-statistic: 1474 on 1 and 28 DF, p-value: < 2.2e-16
Compare these with the results of the first regression:
> summary(ex1.lm)
Call:
lm(formula = y ~ x)
Residuals:
Min
1Q Median 3Q
Max
-1.66121 -0.53286 -0.02869 0.50436 2.36786
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.24814
0.29365 7.656 2.44e-08 ***
x 0.46906 0.02444
19.194 < 2e-16 ***
---
Signif. codes:
0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
Residual standard error:
0.8779 on 28 degrees of freedom
Multiple R-Squared:
0.9294, Adjusted R-squared: 0.9268
F-statistic: 368.4 on 1 and
28 DF, p-value:
< 2.2e-16
Both regressions are significant, but all of the diagnostic statistics
for the second regression are higher.
Compare the two regressions and the size of the deviations that were minimized
> plot(y2
~ x)
> abline(ex2.lm,
col="blue")
> segments(x,
fitted(ex2.lm), x, y2, col="blue")


Back to the first example--plot the regression line on top of the scatter plot

Get prediction intervals and confidence intervals, and plot them
> pred.data <-
data.frame(x=1:25)
> pred.int <- predict(ex1.lm, int="p", newdata=pred.data)
> conf.int <- predict(ex1.lm, int="c", newdata=pred.data)
>
> plot(x,
y, ylim=range(y, pred.int, na.rm=T))
> pred.ex1 <-
pred.data$x
> matlines(pred.ex1,
pred.int, lty=c(1,2,2), col="black")
> matlines(pred.ex1,
conf.int, lty=c(1,2,2), col="red")

Examine some regression diagnostic plots
> # standard regression
diagnostics (4-up)
> oldpar
<- par(mfrow = c(2, 2))
> plot(ex1.lm)
> par(oldpar)

R-F spread plot
> # r-f spread plot
> library(lattice)
... [output
deleted]
> trellis.device(color=TRUE,
theme = "col.whitebg")
> rfs(ex1.lm)
