
I would disagree and here is why:

> When I try to get this point across about techniques like the PCA, I like to show that the measurement units strongly affect the inference.

In such a case the problem is not with PCA but with how it is applied. PCA is just a rotation of the (mean-centered) coordinate system that projects the data onto new axes aligned with the directions of highest variance. It is not the job of PCA to work out where that variance comes from, whether from a difference in units or from a real difference in effects.
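
A minimal NumPy sketch of that point, on made-up data (the height/weight columns, their spreads, and the helper name `pca_axes` are assumptions for illustration, not anything from the article). It shows that the principal axes form an orthonormal rotation, that switching one column from inches to centimeters changes which variable dominates the first axis, and that dividing by the standard deviation first removes the dependence on units.

```python
import numpy as np

def pca_axes(X):
    """Eigenvalues (descending) and principal axes (as columns) of cov(X)."""
    cov = np.cov(X - X.mean(axis=0), rowvar=False)
    vals, vecs = np.linalg.eigh(cov)      # eigh returns ascending order
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

# Synthetic, independent columns (illustrative numbers only).
rng = np.random.default_rng(0)
height_in = rng.normal(68.0, 3.0, size=500)   # inches, sd about 3
weight_kg = rng.normal(70.0, 5.0, size=500)   # kilograms, sd about 5

X_in = np.column_stack([height_in, weight_kg])
X_cm = X_in * np.array([2.54, 1.0])           # same data, height in cm

vals_in, axes_in = pca_axes(X_in)
vals_cm, axes_cm = pca_axes(X_cm)

# The axes are orthonormal, i.e. PCA is a rotation (possibly with a flip).
is_rotation = np.allclose(axes_in.T @ axes_in, np.eye(2))

# Which original variable carries the most weight on the first axis?
dominant_in = int(np.argmax(np.abs(axes_in[:, 0])))   # index 1 = weight
dominant_cm = int(np.argmax(np.abs(axes_cm[:, 0])))   # index 0 = height

# Standardizing first (z-scores) makes the result independent of units.
def zscore(X):
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

vals_z_in, _ = pca_axes(zscore(X_in))
vals_z_cm, _ = pca_axes(zscore(X_cm))
units_no_longer_matter = np.allclose(vals_z_in, vals_z_cm)

print(is_rotation, dominant_in, dominant_cm, units_no_longer_matter)
```

The rotation itself is doing exactly what it is asked to do in both cases; only the input changed.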

> Really, if your conclusions change depending on whether you measure in inches or centimeters, there’s something wrong with the analysis!

To get a statistical distance one should: subtract the mean if the measurements differ in origin; divide by the standard deviation if the measurements differ in scale; rotate (or, equivalently, compute the Mahalanobis distance) if the measurements are dependent (co-vary). PCA itself is closely related to the Mahalanobis distance: Euclidean distance on PCA-transformed data, once each component is scaled to unit variance, is equivalent to Mahalanobis distance on the original data. (The rotation alone leaves Euclidean distances unchanged.) So, saying that something is wrong with PCA because it doesn't take units of measurement into account is close to saying that something is wrong with dividing by the standard deviation because it doesn't subtract the mean.
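
A quick numerical check of the PCA/Mahalanobis relationship, as a sketch on synthetic data (the covariance matrix and sample size are arbitrary choices). It assumes the PCA scores are rescaled by the square root of their eigenvalues, which is the step that turns the rotation into a Mahalanobis-style distance.

```python
import numpy as np

# Synthetic correlated 2-D data (arbitrary, illustrative parameters).
rng = np.random.default_rng(1)
cov_true = np.array([[4.0, 1.5],
                     [1.5, 1.0]])
X = rng.multivariate_normal([10.0, -2.0], cov_true, size=300)

mean = X.mean(axis=0)
cov = np.cov(X, rowvar=False)
vals, vecs = np.linalg.eigh(cov)          # cov = vecs @ diag(vals) @ vecs.T

# PCA scores: rotate the centered data onto the principal axes.
scores = (X - mean) @ vecs
# Scale each component to unit variance (whitening).
scores_unit = scores / np.sqrt(vals)

diff = X[0] - X[1]

# Mahalanobis distance between two points in the original coordinates.
d_mahalanobis = np.sqrt(diff @ np.linalg.inv(cov) @ diff)
# Euclidean distance between the same points in scaled PCA coordinates.
d_scaled_pca = np.linalg.norm(scores_unit[0] - scores_unit[1])

# For comparison: the rotation alone preserves plain Euclidean distance.
d_euclidean = np.linalg.norm(diff)
d_rotated = np.linalg.norm(scores[0] - scores[1])

print(np.isclose(d_mahalanobis, d_scaled_pca))
print(np.isclose(d_euclidean, d_rotated))
```

Both checks should agree up to floating-point error, since the inverse covariance factors as `vecs @ diag(1 / vals) @ vecs.T`.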



