I seriously doubt whether centering or standardizing the original data can really mitigate the multicollinearity problem when squared terms or other interaction terms are included in a regression, as some of you, gung in particular, have recommended above.
To illustrate my point, let's consider a simple example.
Suppose the true specification takes the following form:
$$y_i = b_0 + b_1 x_i + b_2 x_i^2 + u_i$$
Thus the corresponding OLS equation is given by
$$y_i = \hat{y}_i + \hat{u}_i = \hat{b}_0 + \hat{b}_1 x_i + \hat{b}_2 x_i^2 + \hat{u}_i$$
where $\hat{y}_i$ is the fitted value of $y_i$, $\hat{u}_i$ is the residual, and $\hat{b}_0$–$\hat{b}_2$ denote the OLS estimates of $b_0$–$b_2$, the parameters we are ultimately interested in. For simplicity, let $z_i = x_i^2$ hereafter.
Usually, $x$ and $x^2$ are highly correlated, which causes the multicollinearity problem. To mitigate this, a popular suggestion is to center the original data, i.e. subtract each variable's mean from that variable, before adding the squared term.
It is fairly easy to show that the mean of $y_i$ is given by
$$\bar{y} = \hat{b}_0 + \hat{b}_1 \bar{x} + \hat{b}_2 \bar{z}$$
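This mean identity (it follows because OLS residuals sum to zero whenever an intercept is included) can be checked numerically. A minimal sketch, assuming numpy and an illustrative data-generating process with made-up coefficients:

```python
import numpy as np

# Illustrative data: y = 1 + 2*x + 0.5*x^2 + noise (coefficients are arbitrary)
rng = np.random.default_rng(0)
x = rng.normal(size=200)
z = x**2
y = 1.0 + 2.0 * x + 0.5 * z + rng.normal(size=200)

# OLS fit of y on [1, x, z] via least squares
X = np.column_stack([np.ones_like(x), x, z])
b = np.linalg.lstsq(X, y, rcond=None)[0]

# Check the mean identity: ybar = b0_hat + b1_hat*xbar + b2_hat*zbar
lhs = y.mean()
rhs = b[0] + b[1] * x.mean() + b[2] * z.mean()
print(np.isclose(lhs, rhs))  # True
```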
where $\bar{y}$, $\bar{x}$, and $\bar{z}$ denote the means of $y_i$, $x_i$, and $z_i$, respectively.
Hence, subtracting the mean identity above from the fitted equation eliminates the intercept and gives
$$y_i - \bar{y} = \hat{b}_1 (x_i - \bar{x}) + \hat{b}_2 (z_i - \bar{z}) + \hat{u}_i$$
where $y_i - \bar{y}$, $x_i - \bar{x}$, and $z_i - \bar{z}$ are the centered variables. The slope estimates $\hat{b}_1$ and $\hat{b}_2$ remain the same as those in the original OLS regression.
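That the slope estimates are unchanged by centering is also easy to verify numerically. A minimal sketch, again assuming numpy and an arbitrary illustrative data-generating process:

```python
import numpy as np

# Illustrative data (same arbitrary quadratic DGP as before)
rng = np.random.default_rng(1)
x = rng.normal(size=200)
z = x**2
y = 1.0 + 2.0 * x + 0.5 * z + rng.normal(size=200)

# Original regression: y on [1, x, z]
X = np.column_stack([np.ones_like(x), x, z])
b = np.linalg.lstsq(X, y, rcond=None)[0]

# Centered regression: (y - ybar) on [(x - xbar), (z - zbar)], no intercept
Xc = np.column_stack([x - x.mean(), z - z.mean()])
bc = np.linalg.lstsq(Xc, y - y.mean(), rcond=None)[0]

# Slopes b1_hat, b2_hat match across the two fits
print(np.allclose(b[1:], bc))  # True
```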
However, it is clear that in my example the centered RHS variables $x - \bar{x}$ and $z - \bar{z}$ have exactly the same covariance/correlation as the uncentered $x$ and $x^2$, i.e. $\mathrm{corr}(x, z) = \mathrm{corr}(x - \bar{x}, z - \bar{z})$, since correlation is invariant to shifting a variable by a constant.
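This invariance can be confirmed directly; a minimal sketch, assuming numpy, where the squared term $z = x^2$ is constructed before any centering takes place:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=1000)
z = x**2  # squared term built BEFORE centering

# Correlation of the raw pair vs. the centered pair
corr_raw = np.corrcoef(x, z)[0, 1]
corr_centered = np.corrcoef(x - x.mean(), z - z.mean())[0, 1]
print(np.isclose(corr_raw, corr_centered))  # True
```

Note this is different from centering $x$ first and then squaring the centered variable, which is what changes the correlation structure; here only the already-squared term is centered.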
In summary, if my understanding of centering is correct, then I do not think centering the data helps mitigate the multicollinearity problem caused by including squared terms or other higher-order terms in a regression.
I'd be happy to hear your opinions!