This guy's blog is pretty interesting. Interesting enough that it sucked my attention away for half an hour, damn it. He invented an algorithm for keeping multiple copies of a document in sync and when Google found out about it they immediately hired him:
In Silicon Valley a Craigslist ad often gets several hundred replies. Unless I get some kind of personalized reply, I don't bother with programming challenges that involve a significant amount of effort. However, I will reply to interesting design questions.
If your screening process is tilted towards screening out the untalented, it may also screen out senior, qualified developers who don't want to put up with it.
This is definitely interesting. One thing to reemphasize (which the blog post says, but it cannot be overemphasized) is that the sample size is so small as to make any results highly suspect.
The other thing I found interesting was this one:
>MCSE or other Microsoft certification
This makes sense. Getting an MCSE is a major investment of time and money in Microsoft technologies. Those people rarely also invest a lot of time learning *nix (there are definitely exceptions). I would not be surprised if many of those candidates simply moved on to other opportunities after finding out how tied to *nix technologies this particular job was going to be. Even the ones who did have *nix skills in addition to their MCSE probably found they could make more money somewhere else that put more value on the certification itself.
It's a great first step to hiring. But I'm surprised that he didn't have any university graduates pass--the problem itself is a freshman-level bozo test that applies the concept of a stack. The rest (logging in over SSH and installing a CGI script) is a set of useful skills that any competent programmer could (hopefully) teach themselves within the timespan of the problem.
If you only have 3 out of 30 candidates pass the bozo test, that limits your choices when it comes to testing them on having any real talent.
As you and the original author both know, you don't need to use a stack data structure if you're just matching a single type of parens. (Your algorithm is the server-side solution employed by the author.) Though if you matched {([])} with rules against side-straddling like ([)], a stack would be necessary to track which closing character to expect.
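Just to illustrate the multi-bracket case, a rough sketch in Python (the function name and bracket set are mine, not anything from the post):

    def brackets_match(text):
        # Map each closing bracket to the opener it must pair with.
        pairs = {")": "(", "]": "[", "}": "{"}
        stack = []
        for ch in text:
            if ch in "([{":
                stack.append(ch)
            elif ch in pairs:
                # A closer must match the most recent unmatched opener,
                # which is what rules out side-straddling like ([)].
                if not stack or stack.pop() != pairs[ch]:
                    return False
        return not stack  # balanced only if every opener was closed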
But look at what you have: an integer (n) that counts the number of unmatched open parens, that is not allowed to drop below 0, and that can only be incremented and decremented. Your solution there is essentially a pushdown automaton, with n representing your stack of open parentheses (and the value of n representing the size of the stack). If you approached the problem thinking of it as a stack problem, you'd probably end up optimizing to your solution when you realized that actually stacking parentheses could be efficiently simulated with an integer. But the restrictions you've placed on that integer effectively simulate the restrictions of pushing and popping parens off a stack. You're even going through the string once without backtracking.
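For the single-paren case, the counter version is tiny; a minimal sketch (my naming, not the author's actual server-side code):

    def parens_match(text):
        depth = 0  # size of the implicit stack of open parens
        for ch in text:
            if ch == "(":
                depth += 1       # "push"
            elif ch == ")":
                depth -= 1       # "pop"
                if depth < 0:    # popping an empty stack: stray ")"
                    return False
        return depth == 0        # anything still open is unmatched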
I'm not sure if it's useful or just comp-sci wankery to think of it this way, but there's an equivalence between that solution and the freshman stack solution after all.
Which is interesting, as a proper regular expression is incapable of testing for a parens match. The set of strings with properly matched parens is a context-free language, not a regular language. But modern "regex" engines are capable of matching non-regular languages, and iterative regex-based substitutions (as his client-side code does--his server-side code is equivalent to the stack-based approach) can approximate the effect.
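I haven't seen his client-side code, but the iterative-substitution trick looks something like this (my own sketch in Python, not his code):

    import re

    def parens_match(text):
        # Keep only the parens, then repeatedly delete adjacent "()" pairs.
        # Each single substitution is a regular-language operation; the
        # iteration is what lets the procedure recognize a context-free set.
        s = re.sub(r"[^()]", "", text)
        while True:
            s, count = re.subn(r"\(\)", "", s)
            if count == 0:
                break
        return s == ""  # balanced iff nothing is left over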
And the content of the advertisement. If it was clear that the job was web programming in a Linux environment, then it's more surprising that a significant number of applicants failed to show basic proficiency with the tools of the trade (SSH and the command line). If it said "Programmer wanted, high pay and stock options, no degree required", then one could expect replies from just about anybody.
The challenge description says candidates were encouraged to install other programming environments if they so desired. This leads me to suspect they were given root access.
In regard to the five points Neil makes under the original post, I have a sad observation:
- numerous people would, and often do, rally against gag orders handed down by various offices
- yet society at large is pretty quick to gag or harass a blogger for publishing politically incorrect observations, even observations surrounded by disclaimers about the small sample and other factors clouding them. And while it's not necessarily the same people as above, it's still society at large.
That's funny; right now in 2009, as we are planning to start a company in Thailand, we are hiring only:
female programmers who are married (pref. with kids)
since they're the only ones here who aren't job hopping, are capable of working from home, and, unlike the majority of Thai students/fresh graduates, aren't on Facebook all day.
No need for that. Mothers with toddlers usually have little mobility and a heightened desire to socialize; being online is about the perfect way to do that.
I dunno, I think this would be a great experiment to try out now and see what you get. It's a great way to weed out the pretenders from the real thing (provided you want programmers who know something about Linux). Plus you're almost guaranteed a larger sample size due to the economic environment we're in. I'd love to see the statistics and demographics now versus then.
That doesn't surprise me at all. Out of 15 candidates for a position advertised as "C, C++ and Python Programmer", only 4 could actually write a "FizzBuzz" program in any of the given languages.
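For reference, FizzBuzz is about as simple as screening problems get; something along these lines in Python (one of the languages from that ad) is all that's being asked for:

    # Classic FizzBuzz: print 1..100, substituting Fizz/Buzz/FizzBuzz
    # for multiples of 3, 5, and both.
    for i in range(1, 101):
        if i % 15 == 0:
            print("FizzBuzz")
        elif i % 3 == 0:
            print("Fizz")
        elif i % 5 == 0:
            print("Buzz")
        else:
            print(i)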
It's interesting only because, when you look at the negative predictors, you see "hotmail address" and "MCSE" and nod your head in agreement, then you see "women" and start to think twice about your stereotypes.
He also didn't explain how he got the "married" data, and he didn't explain how he differentiated between people who were unable to log in and people who were unwilling to log in and write code in an unfamiliar environment for a marginal job.
There's nothing scientific about this data, despite the fact that it's been dressed up with words like "correlate". It seems borderline irresponsible for him to have even written this.
I'm amazed you think the word "correlate" counts as "dressing up". I use it regularly, without any intention of claiming scientific credentials.
I'm amazed you think that someone making true observations they think are interesting is being borderline irresponsible. Quite specifically he says the sample is too small, quite specifically he says that "not having these attributes was no guarantee of success." Quite specifically he says that they saw no positive predictors.
What should he have done? Observed these things, noted that he was surprised by some, then not told anyone? That seems to me to be borderline irresponsible.
I see this post as an invitation to make a larger experiment, one with controls, and where the sample is large and chosen appropriately.
"Among 30 random people who applied for my marginal job opening [in 2005], no married people, women, or university graduates were observed logging in to complete the unusual programming test we set up for candidates".
When you write it out the way it actually happened, it doesn't sound so interesting. That this post dates back to 2005 is not the author's fault. And yet here we are in 2009 talking about gender politics and married people and whatever on Hacker News, based on what? Nothing.
<shrug> As you wish. As someone who currently manages 8 programmers, is always being approached to see if we have an opening, and is always concerned about the hiring process, I am always interested in such things. I thought people here on HN might also be.
I'm interested to hear your views, much as I might disagree with them. I haven't down-modded you, and I wish others wouldn't. I think your thoughts add value. They make me think again, even if I conclude that in my opinion you're wrong.
It looks to me like you're complaining that the author got too close to something you can't say. He stated the sample size and some interesting observations, and specifically pointed out that he's not trying to claim causation.
Your first sentence makes no sense to me. Can you explain what led you to this conclusion? It looks to me like I could use similar reasoning to say, "it looks to me like you really think married people suck at coding, and that's why you're jumping on my critique of the old, irrelevant blog post's data collection methodology".
The fact that we're heading towards this rathole of a conversation is a good reason to flag this post. It isn't newsworthy, and I think I have a strong argument that it isn't sound either.
You're claiming it's "borderline irresponsible" to say what he said, but not that it's false. You'd certainly be justified in calling conclusions people might draw from it (e.g. that women should not be hired as Linux programmers) false. Simply publishing some observations from a single attempt to hire a programmer is fairly meaningless in a statistical sense, but still somewhat interesting to me.
He's saying it's borderline irresponsible because Neil is publishing correlations--which he actually calls predictors--that have no statistical validity, and is further causing baseless damage by reinforcing negative stereotypes that a group of people struggles against daily.
In the one place where he addresses sample size, all he says is that the limited sample size restricts the ability to find positive predictors. That's stupid. A limited sample size also severely restricts the ability to claim a significant correlation (which is implied in claiming predictive power).
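To put a rough number on it, assuming the figures mentioned in this thread (30 applicants, 3 passes, a sample of two women, none of them passing), pure chance already makes that outcome the most likely one. These counts are an illustration of the point, not data from the post:

    from math import comb

    # Hypergeometric probability that, by pure chance, neither of the
    # 2 women lands among the 3 passing candidates out of 30.
    p_zero = comb(3, 0) * comb(27, 2) / comb(30, 2)
    print(round(p_zero, 3))  # ~0.807: no evidence of any predictive power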
Nobody is claiming that statistically rigorous data showing that the women who self-selected by applying to this job posting couldn't pass the qualifications should be withheld from the public; the claim is that asserting predictive power from a sample of two, in a way that hurts a negatively stereotyped group, is irresponsible. And it is.
If I were to go out on the street, test 30 random people, two of whom happened to be black, and who randomly had lower tested IQ than the rest of the sample, would it be responsible for me to claim that being black is a predictor of lower IQ? It's the same deal. Do a proper study or don't claim predictive power. Particularly when being statistically foolish could hurt people.
http://neil.fraser.name/news/2006/11/01/
I printed out a couple of his papers on this stuff for leisure reading.