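From which it follows that the standard deviations of x and y, and the
covariance of x,y are given, respectively, by

$$s_x = \sqrt{\frac{S_{xx}}{n-1}}, \qquad s_y = \sqrt{\frac{S_{yy}}{n-1}}, \qquad \text{and} \qquad s_{xy} = \frac{S_{xy}}{n-1}.$$

Also, the sample correlation coefficient is

$$r_{xy} = \frac{S_{xy}}{\sqrt{S_{xx} \cdot S_{yy}}}.$$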
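As a numerical check of these definitions, the following is a minimal Python sketch. It assumes the usual definitions of the sums $S_{xx} = \sum (x_i - \bar{x})^2$, $S_{yy} = \sum (y_i - \bar{y})^2$, and $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$; the function name `sample_stats` and the data values are hypothetical, chosen only for illustration.

```python
import math

def sample_stats(xs, ys):
    """Return (s_x, s_y, s_xy, r_xy) for paired data xs, ys.

    Assumes S_xx, S_yy, S_xy are sums of squared (or crossed)
    deviations from the sample means.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    S_xx = sum((x - x_bar) ** 2 for x in xs)
    S_yy = sum((y - y_bar) ** 2 for y in ys)
    S_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_x = math.sqrt(S_xx / (n - 1))      # standard deviation of x
    s_y = math.sqrt(S_yy / (n - 1))      # standard deviation of y
    s_xy = S_xy / (n - 1)                # sample covariance
    r_xy = S_xy / math.sqrt(S_xx * S_yy) # sample correlation coefficient
    return s_x, s_y, s_xy, r_xy

# Illustrative (made-up) data
s_x, s_y, s_xy, r_xy = sample_stats([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(s_x, s_y, s_xy, r_xy)
```

Note that $r_{xy}$ can equivalently be written as $s_{xy} / (s_x \cdot s_y)$, since the factors of $(n-1)$ cancel.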
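In terms of $\bar{x}$, $\bar{y}$, $S_{xx}$, $S_{yy}$, and $S_{xy}$, the solution to the normal
equations is:

$$a = \bar{y} - b\,\bar{x}, \qquad b = \frac{S_{xy}}{S_{xx}} = \frac{s_{xy}}{s_x^2}.$$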
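The least-squares intercept and slope can be sketched in a few lines of Python. This is an illustration only: the helper name `fit_line` and the data are hypothetical, and $S_{xx}$ and $S_{xy}$ are assumed to be sums of deviations from the sample means.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    S_xx = sum((x - x_bar) ** 2 for x in xs)
    S_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = S_xy / S_xx          # slope: b = S_xy / S_xx
    a = y_bar - b * x_bar    # intercept: a = y_bar - b * x_bar
    return a, b

# Illustrative (made-up) data
a, b = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(a, b)
```

Because $a = \bar{y} - b\,\bar{x}$, the fitted line always passes through the point $(\bar{x}, \bar{y})$.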
Prediction error
The regression curve of Y on x is defined as $Y = A + B \cdot x + \varepsilon$. If we have a set
of n data points $(x_i, y_i)$, then we can write $Y_i = A + B \cdot x_i + \varepsilon_i$, $(i = 1, 2, \ldots, n)$,
where $Y_i$ = independent, normally distributed random variables with mean
$(A + B \cdot x_i)$ and the common variance $\sigma^2$; $\varepsilon_i$ = independent, normally
distributed random variables with mean zero and the common variance $\sigma^2$.
Let $y_i$ = actual data value, $\hat{y}_i = a + b \cdot x_i$ = least-squares prediction of the data.
Then, the prediction error is: $e_i = y_i - \hat{y}_i = y_i - (a + b \cdot x_i)$.
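An estimate of $\sigma^2$ is the so-called standard error of the estimate,

$$s_e^2 = \frac{1}{n-2} \sum_{i=1}^{n} \left[ y_i - (a + b\,x_i) \right]^2 = \frac{S_{yy} - (S_{xy})^2 / S_{xx}}{n-2} = \frac{n-1}{n-2} \cdot s_y^2 \cdot (1 - r_{xy}^2).$$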
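The estimate of $\sigma^2$ can be written as the residual sum of squares divided by $n-2$, as $(S_{yy} - S_{xy}^2/S_{xx})/(n-2)$, or as $\frac{n-1}{n-2}\, s_y^2\, (1 - r_{xy}^2)$. The sketch below evaluates all three forms on the same data to show that they agree. The function name `standard_error_sq` and the data are hypothetical, and $S_{xx}$, $S_{yy}$, $S_{xy}$ are assumed to be sums of deviations from the sample means.

```python
def standard_error_sq(xs, ys):
    """Return three equivalent evaluations of s_e^2 as a tuple."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    S_xx = sum((x - x_bar) ** 2 for x in xs)
    S_yy = sum((y - y_bar) ** 2 for y in ys)
    S_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    # Least-squares coefficients
    b = S_xy / S_xx
    a = y_bar - b * x_bar

    # Form 1: sum of squared prediction errors e_i = y_i - (a + b*x_i)
    errors = [y - (a + b * x) for x, y in zip(xs, ys)]
    form1 = sum(e ** 2 for e in errors) / (n - 2)

    # Form 2: in terms of S_xx, S_yy, S_xy
    form2 = (S_yy - S_xy ** 2 / S_xx) / (n - 2)

    # Form 3: in terms of s_y^2 and r_xy^2
    s_y_sq = S_yy / (n - 1)
    r_sq = S_xy ** 2 / (S_xx * S_yy)
    form3 = (n - 1) / (n - 2) * s_y_sq * (1 - r_sq)

    return form1, form2, form3

# Illustrative (made-up) data
print(standard_error_sq([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))
```

For this data the three values coincide up to floating-point rounding. The divisor $n-2$ reflects the two parameters, a and b, estimated from the data.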
Confidence intervals and hypothesis testing in linear regression
Here are some concepts and equations related to statistical inference for linear
regression: