8 Transformations
Y-transformations are denoted by Y-prime, written \(Y'\), and consist of raising \(Y\) to some power \(\lambda\). By convention, \(\lambda = 0\) denotes the log transformation, as shown in the table below.
\[ Y' = Y^\lambda \quad \text{(Y Transformation)} \]
Value of \(\lambda\) | Transformation to Use | R Code |
---|---|---|
-2 | \(Y' = Y^{-2} = 1/Y^2\) | lm(Y^-2 ~ X) |
-1 | \(Y' = Y^{-1} = 1/Y\) | lm(Y^-1 ~ X) |
0 | \(Y' = \log(Y)\) | lm(log(Y) ~ X) |
0.5 | \(Y' = Y^{0.5} = \sqrt{Y}\) | lm(sqrt(Y) ~ X) |
1 | \(Y' = Y\) | lm(Y ~ X) |
2 | \(Y' = Y^2\) | lm(Y^2 ~ X) |
Using maximum-likelihood estimation, the Box-Cox procedure can automatically detect the “optimal” value of \(\lambda\) to consider for a Y-transformation. Keep in mind, however, that simply accepting a suggested Y-transformation without first considering the scatterplot and diagnostic plots is unwise.
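To make the maximum-likelihood idea concrete, the following is a minimal base-R sketch of a Box-Cox profile log-likelihood search. It assumes the standard Box-Cox family, \((Y^\lambda - 1)/\lambda\) for \(\lambda \neq 0\) and \(\log(Y)\) for \(\lambda = 0\), and uses the built-in cars data. The helper name boxcox.loglik and the grid of \(\lambda\) values are illustrative choices, not part of any package.

```r
# Profile log-likelihood of lambda for a simple linear regression.
# boxcox.loglik is a hypothetical helper written for this sketch.
boxcox.loglik <- function(lambda, y, x) {
  n <- length(y)
  # Box-Cox family: log(y) when lambda is 0, (y^lambda - 1)/lambda otherwise
  y.t <- if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
  rss <- sum(resid(lm(y.t ~ x))^2)
  # Log-likelihood up to an additive constant, including the Jacobian term
  -n / 2 * log(rss / n) + (lambda - 1) * sum(log(y))
}

# Evaluate the log-likelihood over a grid of candidate lambda values
lambdas <- seq(-2, 2, by = 0.01)
loglik <- sapply(lambdas, function(l) boxcox.loglik(l, cars$dist, cars$speed))

# The grid value with the largest log-likelihood is the suggested lambda
lambda.hat <- lambdas[which.max(loglik)]
lambda.hat
```

In practice the suggested value is usually rounded to a nearby convenient power from the table above, since those transformations are easier to interpret.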
8.1 Y-Transformations
8.1.1 Scatterplot Recognition
The following panel of scatterplots can give you a good feel for when to try different values of \(\lambda\).
8.1.2 Box-Cox Suggestion
The boxCox(...) function in library(car) can also be helpful in finding values of \(\lambda\) to try.
8.1.3 An Example
Suppose we were running a simple linear regression on the cars dataset.
This would be done with the code
cars.lm <- lm(dist ~ speed, data=cars)
summary(cars.lm)
Notice the line doesn’t quite fit the data as well as we would hope. Instead, the data looks a little curved.
cars.lm <- lm(dist ~ speed, data=cars)
plot(dist ~ speed, data=cars, pch=20, col="firebrick", cex=1.2, las=1,
xlab="Speed of the Vehicle (mph) \n the Moment the Brakes were Applied", ylab="Distance (ft) it took the Vehicle to Stop",
main="Don't Step in front of a Moving 1920's Vehicle...")
mtext(side=3, text="...they take a few feet to stop.", cex=0.7, line=.5)
legend("topleft", legend="Stopping Distance Experiment", bty="n")
abline(cars.lm, col="gray")
Using the boxCox(...) function from library(car), we would compute the following to determine which Y-transformation would be most meaningful.
library(car)
boxCox(cars.lm)
The output from the boxCox(...) function looks as follows.
This plot tells us to use the \(\lambda = 0.5\) transformation, so that \(Y' = Y^{0.5} = \sqrt{Y}\). (To see this yourself, compare with the “Box-Cox Suggestion” and “Scatterplot Recognition” sections above.)
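The same reading can be done numerically. The sketch below uses boxcox(...) from library(MASS) as a stand-in for boxCox(...) from library(car), on the assumption that MASS is available (it ships with standard R installations). With plotit=FALSE the function returns the grid of \(\lambda\) values and their log-likelihoods, from which an approximate 95% interval can be read off using the usual chi-squared cutoff.

```r
library(MASS)  # assumption: MASS::boxcox stands in for car::boxCox here

cars.lm <- lm(dist ~ speed, data=cars)

# Log-likelihood over a fine grid of lambda values, without drawing the plot
bc <- boxcox(cars.lm, lambda=seq(-2, 2, by=0.01), plotit=FALSE)

# Lambda with the highest log-likelihood
lambda.hat <- bc$x[which.max(bc$y)]

# Approximate 95% interval: lambdas whose log-likelihood lies within
# qchisq(0.95, 1)/2 of the maximum
cutoff <- max(bc$y) - qchisq(0.95, df=1) / 2
lambda.ci <- range(bc$x[bc$y >= cutoff])

lambda.hat
lambda.ci
```

Any convenient value of \(\lambda\) inside the interval is a reasonable candidate, which is why a round value such as 0.5 is typically chosen over the exact maximizer.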
Now, a transformation regression is performed using sqrt(Y) in place of Y as follows:
cars.lm.t <- lm(sqrt(dist) ~ speed, data=cars)
summary(cars.lm.t)
Term | Estimate | Std. Error | t value | Pr(>\|t\|) |
---|---|---|---|---|
(Intercept) | 1.277 | 0.4844 | 2.636 | 0.01126 |
speed | 0.3224 | 0.02978 | 10.83 | 1.773e-14 |
Then, the fitted regression equation on the transformed scale is
\[ \widehat{Y}_i' = 1.277 + 0.3224 X_i \]
Replacing \(\widehat{Y}_i'\) with \(\sqrt{\widehat{Y}_i}\), we have
\[ \sqrt{\widehat{Y}_i} = 1.277 + 0.3224 X_i \]
Solving for \(\widehat{Y}_i\) gives
\[ \widehat{Y}_i = (1.277 + 0.3224 X_i)^2 \]
Drawing this curve with curve((1.277 + 0.3224*x)^2, add=TRUE) (see code for details) looks like this:
plot(dist ~ speed, data=cars, pch=20, col="firebrick", cex=1.2, las=1,
xlab="Speed of the Vehicle (mph) \n the Moment the Brakes were Applied", ylab="Distance (ft) it took the Vehicle to Stop",
main="Don't Step in front of a Moving 1920's Vehicle...")
mtext(side=3, text="...they take a few feet to stop.", cex=0.7, line=.5)
legend("topleft", legend="Stopping Distance Experiment", bty="n")
curve( (1.277 + 0.3224*x)^2, add=TRUE, col="firebrick")
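The back-transformation can also be applied to predictions rather than typed in by hand. The following is a small sketch, assuming a few hypothetical speeds of 10, 15, and 20 mph: it predicts on the square-root scale with predict(...) and then squares the result, which should agree with plugging the fitted coefficients into the equation above.

```r
# Refit the transformed model so this block runs on its own
cars.lm.t <- lm(sqrt(dist) ~ speed, data=cars)

# Hypothetical speeds (mph) chosen only for illustration
new.speeds <- data.frame(speed=c(10, 15, 20))

# predict() returns values on the sqrt(dist) scale...
pred.sqrt <- predict(cars.lm.t, newdata=new.speeds)

# ...so squaring them returns predictions in the original units (ft)
pred.dist <- pred.sqrt^2

# Same calculation by hand from the fitted coefficients
b <- coef(cars.lm.t)
by.hand <- (b[1] + b[2] * new.speeds$speed)^2

cbind(new.speeds, pred.dist, by.hand)
```

Using coef(...) instead of the rounded values 1.277 and 0.3224 avoids small rounding differences in the back-transformed predictions.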
8.2 X-Transformations
X-transformations are more difficult to recognize than Y-transformations. This is partly because there is no Box-Cox method to automatically search for them.
The best indicator that you should consider an X-transformation is when the variance of the residuals is constant across all fitted values, but linearity is clearly violated.
The following panel of scatterplots can give you a good feel for when to try different X-transformations.
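As an illustration of this indicator, here is a minimal sketch using simulated data (not data from this chapter). It assumes a true relationship that is linear in \(\log(X)\) with constant error variance, then compares the residuals-versus-fitted plots for the untransformed and transformed fits. The seed, sample size, and coefficients are arbitrary choices.

```r
set.seed(42)  # arbitrary seed so the simulation is reproducible

# Simulated data: Y is linear in log(X) with constant-variance noise
x <- runif(100, min=1, max=50)
y <- 2 + 3 * log(x) + rnorm(100, sd=0.3)

fit.raw <- lm(y ~ x)       # no transformation
fit.log <- lm(y ~ log(x))  # X-transformation

# The response is the same in both fits, so R-squared values are comparable
c(raw=summary(fit.raw)$r.squared, log=summary(fit.log)$r.squared)

# Residuals vs. fitted: a curved pattern with even spread for the raw fit,
# and no obvious pattern after transforming X
par(mfrow=c(1, 2))
plot(fit.raw, which=1, main="Y ~ X")
plot(fit.log, which=1, main="Y ~ log(X)")
```

Because only X changes, the fitted equation needs no back-transformation: predictions from the log(x) model are already in the original units of Y.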