8 Transformations

\(Y'\), \(X'\), and returning to the original space…

Y-transformations are denoted by “y-prime,” written \(Y'\), and consist of raising \(Y\) to some power \(\lambda\). By convention, \(\lambda = 0\) denotes the log transformation, \(Y' = \log(Y)\), rather than \(Y^0\).

\[ Y' = Y^\lambda \quad \text{(Y Transformation)} \]

| Value of \(\lambda\) | Transformation to Use | R Code |
|---|---|---|
| -2 | \(Y' = Y^{-2} = 1/Y^2\) | lm(Y^-2 ~ X) |
| -1 | \(Y' = Y^{-1} = 1/Y\) | lm(Y^-1 ~ X) |
| 0 | \(Y' = \log(Y)\) | lm(log(Y) ~ X) |
| 0.5 | \(Y' = \sqrt{Y}\) | lm(sqrt(Y) ~ X) |
| 1 | \(Y' = Y\) | lm(Y ~ X) |
| 2 | \(Y' = Y^2\) | lm(Y^2 ~ X) |
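The table above can be read as a single rule with one special case at \(\lambda = 0\). The following is a minimal sketch of that rule, using the built-in cars data; the helper name y_transform is hypothetical and is not part of base R or library(car).

```r
# Hypothetical helper (an assumption for illustration): apply the
# Y-transformation from the table for a given lambda.
y_transform <- function(y, lambda) {
  if (lambda == 0) log(y) else y^lambda
}

# Fit the same simple linear regression under each lambda in the table.
lambdas <- c(-2, -1, 0, 0.5, 1, 2)
fits <- lapply(lambdas, function(l) {
  lm(y_transform(dist, l) ~ speed, data = cars)
})
names(fits) <- lambdas

# The lambda = 0.5 fit has the same coefficients as lm(sqrt(dist) ~ speed, data = cars).
coef(fits[["0.5"]])
```

Each element of fits is an ordinary lm object, so summary(...) and the usual diagnostic plots can be applied to any of them before settling on a transformation.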

Using maximum-likelihood estimation, the Box-Cox procedure can automatically suggest an “optimal” value of \(\lambda\) to consider for a Y-transformation. Keep in mind, however, that it is unwise to accept a suggested Y-transformation without first considering the scatterplot and diagnostic plots.
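To make the maximum-likelihood idea concrete, here is a rough sketch of the kind of calculation involved, written in base R for the built-in cars data. It uses the standard Box-Cox profile log-likelihood; the helper name boxcox_loglik and the grid of \(\lambda\) values are assumptions made for illustration, and this is not a copy of how boxCox(...) is implemented.

```r
# Profile log-likelihood of lambda for a simple regression of dist on speed.
y <- cars$dist
x <- cars$speed
n <- length(y)

boxcox_loglik <- function(lambda) {  # hypothetical helper name
  # Box-Cox form of the power transformation; log(y) is the limit as lambda -> 0.
  z <- if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
  rss <- sum(resid(lm(z ~ x))^2)
  # Normal log-likelihood (up to a constant) plus the Jacobian term of the transformation.
  -n / 2 * log(rss / n) + (lambda - 1) * sum(log(y))
}

lambda_grid <- seq(-2, 2, by = 0.1)  # assumed search grid
loglik <- sapply(lambda_grid, boxcox_loglik)
best_lambda <- lambda_grid[which.max(loglik)]
best_lambda

# Plot the curve and mark the grid value with the highest log-likelihood.
plot(lambda_grid, loglik, type = "l",
     xlab = "lambda", ylab = "profile log-likelihood")
abline(v = best_lambda, lty = 2)
```

The maximizing value is usually rounded to a nearby convenient \(\lambda\) from the table, since those transformations are easier to interpret.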

8.1 Y-Transformations

8.1.1 Scatterplot Recognition

The following panel of scatterplots can give you a good feel for when to try different values of \(\lambda\).

8.1.2 Box-Cox Suggestion

The boxCox(...) function in library(car) can also be helpful in finding values of \(\lambda\) to try.

8.1.3 An Example

Suppose we were running a simple linear regression on the cars dataset.

This would be done with the code

cars.lm <- lm(dist ~ speed, data=cars)

summary(cars.lm)

Notice the line doesn’t quite fit the data as well as we would hope. Instead, the data looks a little curved.

cars.lm <- lm(dist ~ speed, data=cars)
plot(dist ~ speed, data=cars, pch=20, col="firebrick", cex=1.2, las=1,
     xlab="Speed of the Vehicle (mph) \n the Moment the Brakes were Applied", ylab="Distance (ft) it took the Vehicle to Stop",
     main="Don't Step in front of a Moving 1920's Vehicle...")
mtext(side=3, text="...they take a few feet to stop.", cex=0.7, line=.5)
legend("topleft", legend="Stopping Distance Experiment", bty="n")

abline(cars.lm, col="gray")

Using the boxCox(...) function from library(car) we would compute the following to determine which Y-transformation would be most meaningful.

library(car)

boxCox(cars.lm)

The output from the boxCox(...) function looks as follows.

cars.lm <- lm(dist ~ speed, data=cars)
boxCox(cars.lm)

This plot tells us to use the \(\lambda = 0.5\) transformation, so that \(Y' = Y^{0.5} = \sqrt{Y}\). (To see this yourself, refer back to the “Box-Cox Suggestion” and “Scatterplot Recognition” sections above.)

Now, a transformed regression is performed using sqrt(Y) in place of Y as follows:

cars.lm.t <- lm(sqrt(dist) ~ speed, data=cars)

summary(cars.lm.t)

|   | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | 1.277 | 0.4844 | 2.636 | 0.01126 |
| speed | 0.3224 | 0.02978 | 10.83 | 1.773e-14 |

Then,

\[ \widehat{Y}_i' = 1.277 + 0.3224 X_i \]

Replacing \(\widehat{Y}_i'\) with \(\sqrt{\widehat{Y}_i}\), we have

\[ \sqrt{\widehat{Y}_i} = 1.277 + 0.3224 X_i \]

Solving for \(\widehat{Y}_i\) gives

\[ \widehat{Y}_i = (1.277 + 0.3224 X_i)^2 \]

Drawing this equation on the original scatterplot with curve((1.277 + 0.3224*x)^2, add=TRUE) (see code for details) looks like this:

plot(dist ~ speed, data=cars, pch=20, col="firebrick", cex=1.2, las=1,
     xlab="Speed of the Vehicle (mph) \n the Moment the Brakes were Applied", ylab="Distance (ft) it took the Vehicle to Stop",
     main="Don't Step in front of a Moving 1920's Vehicle...")
mtext(side=3, text="...they take a few feet to stop.", cex=0.7, line=.5)
legend("topleft", legend="Stopping Distance Experiment", bty="n")

curve( (1.277 + 0.3224*x)^2, add=TRUE, col="firebrick")
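The curve above uses coefficients typed in by hand after rounding. As an alternative sketch, the same back-transformation can be driven directly from the fitted model, so the curve and any predictions stay in step with the fit. The object names b, new_speeds, pred_sqrt, and pred_dist are introduced here for illustration, and the speeds chosen are arbitrary.

```r
# Refit the transformed regression so this block is self-contained.
cars.lm.t <- lm(sqrt(dist) ~ speed, data = cars)
b <- coef(cars.lm.t)

# Predict on the transformed (square-root) scale, then square to return to feet.
new_speeds <- data.frame(speed = c(10, 15, 20))  # arbitrary illustrative speeds
pred_sqrt <- predict(cars.lm.t, newdata = new_speeds)
pred_dist <- pred_sqrt^2

# Draw the back-transformed curve from the stored coefficients.
plot(dist ~ speed, data = cars, pch = 20, col = "firebrick", las = 1)
curve((b[1] + b[2] * x)^2, add = TRUE, col = "firebrick")
points(new_speeds$speed, pred_dist, pch = 4)
```

Squaring the prediction returns a value in the original units of \(Y\), which is what the heading of this chapter calls returning to the original space.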


8.2 X-Transformations

X-transformations are more difficult to recognize than Y-transformations. This is partly because there is no Box-Cox method to automatically search for them.

The best indicator that you should consider an X-transformation is when the variance of the residuals is constant across all fitted values, but linearity is clearly violated.
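The pattern just described can be sketched with simulated data. The example below is an illustration only: the data are generated rather than real, and log(X) is just one candidate X-transformation chosen for the demonstration. The true relationship is linear in log(X) with constant error variance, so the untransformed fit shows a curved residual pattern with roughly even spread.

```r
# Simulated data (an assumption for illustration): Y is linear in log(X)
# with constant-variance noise.
set.seed(1)
x <- seq(1, 50, length.out = 100)
y <- 2 + 3 * log(x) + rnorm(100, sd = 0.3)

fit_raw <- lm(y ~ x)       # no transformation
fit_log <- lm(y ~ log(x))  # X-transformation: X' = log(X)

# Residuals vs. fitted values: curved on the left, patternless on the right.
par(mfrow = c(1, 2))
plot(fitted(fit_raw), resid(fit_raw), main = "Y ~ X",
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
plot(fitted(fit_log), resid(fit_log), main = "Y ~ log(X)",
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
par(mfrow = c(1, 1))

# Compare how much variation each fit explains.
summary(fit_raw)$r.squared
summary(fit_log)$r.squared
```

Because only X is transformed, the fitted values are already in the original units of Y, so no back-transformation of the response is needed.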

The following panel of scatterplots can give you a good feel for when to try different X-transformations.