He also didn't explain how he got the "married" data, and he didn't explain how he differentiated between people who were unable to log in and people who were unwilling to log in and write code in an unfamiliar environment for a marginal job.
There's nothing scientific about this data, despite the fact that it's been dressed up with words like "correlate". It seems borderline irresponsible for him to have even written this.
I'm amazed you think the word "correlate" counts as "dressing up". I use it regularly, without any intention of claiming scientific credentials.
I'm amazed you think that someone making true observations they think are interesting is being borderline irresponsible. Quite specifically he says the sample is too small, quite specifically he says that "not having these attributes was no guarantee of success." Quite specifically he says that they saw no positive predictors.
What should he have done? Observed these things, noted that he was surprised by some, then not told anyone? That seems to me to be borderline irresponsible.
I see this post as an invitation to make a larger experiment, one with controls, and where the sample is large and chosen appropriately.
"Among 30 random people who applied for my marginal job opening [in 2005], no married people, women, or university graduates were observed logging in to complete the unusual programming test we set up for candidates".
When you write it out the way it actually happened, it doesn't sound so interesting. That this post dates back to 2005 is not the author's fault. And yet here we are in 2009 talking about gender politics and married people and whatever on Hacker News, based on what? Nothing.
<shrug> As you wish. As someone who currently manages 8 programmers, is always being approached to see if we have an opening, and is always concerned about the hiring process, I am always interested in such things. I thought people here on HN might be too.
I'm interested to hear your views, much as I might disagree with them. I haven't down-modded you, and I wish others wouldn't. I think your thoughts add value. They make me think again, even if I conclude that in my opinion you're wrong.
It looks to me like you're complaining that the author got too close to something you can't say. He stated the sample size and some interesting observations, and specifically pointed out that he's not trying to claim causation.
Your first sentence makes no sense to me. Can you explain what led you to this conclusion? It looks to me like I could use similar reasoning to say, "it looks to me like you really think married people suck at coding, and that's why you're jumping on my critique of the old, irrelevant blog post's data collection methodology".
The fact that we're heading towards this rathole of a conversation is a good reason to flag this post. It isn't newsworthy, and I think I have a strong argument that it isn't sound either.
You're claiming it's "borderline irresponsible" to say what he said, but not that it's false. You'd certainly be justified in calling conclusions people might draw from it (e.g. that women should not be hired as Linux programmers) false. Simply publishing some observations from a single attempt to hire a programmer is fairly meaningless in a statistical sense, but still somewhat interesting to me.
He's saying it's borderline irresponsible because Neil is publishing correlations--which he actually calls predictors--that in no way have any statistical validity, and is further causing baseless damage by reinforcing negative stereotypes that a group of people struggles against daily.
In the one place he addresses sample size, all he says is that the limited sample size restricts the ability to find positive predictors. That's stupid. A limited sample size also severely restricts the ability to claim significant correlation (which is implied in claiming predictive power).
Nobody is claiming that statistically rigorous data, showing that the women self-selecting for applications in this job post can't pass the qualifications, should not be made public; the point is that claiming predictive power from a sample of two, in a way that hurts a negatively stereotyped group, is irresponsible. And it is.
If I were to go out on the street, test 30 random people, two of whom happened to be black, and who randomly had lower tested IQ than the rest of the sample, would it be responsible for me to claim that being black is a predictor of lower IQ? It's the same deal. Do a proper study or don't claim predictive power. Particularly when being statistically foolish could hurt people.
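To put a rough number on the "sample of two" point, here's a minimal sketch. It assumes passing the test has nothing to do with group membership, and asks how often you'd see zero passers in a group of 2 out of 30 applicants purely by chance. The post never says how many of the 30 passed, so the counts of 5, 10 and 15 below are made-up values for illustration, and the function name is my own.

```python
from math import comb


def prob_no_passers_in_group(total, group_size, passers):
    """Chance that none of `group_size` people are among the `passers`,
    if the passers are effectively a random draw from `total` applicants
    (a hypergeometric probability)."""
    return comb(total - group_size, passers) / comb(total, passers)


if __name__ == "__main__":
    TOTAL = 30       # applicants, per the thread
    GROUP_SIZE = 2   # the "sample of two"
    # Hypothetical pass counts; the original post does not give one.
    for passers in (5, 10, 15):
        p = prob_no_passers_in_group(TOTAL, GROUP_SIZE, passers)
        print(f"{passers} of {TOTAL} pass: P(nobody in the group passes) = {p:.2f}")
```

Even if a third of all applicants passed (10 of 30), a group of two would contain no passers about 44% of the time by luck alone. That's nowhere near the kind of evidence you'd want before calling something a predictor.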