For immediate recursion, an explicit keyword makes perfect sense, and may even be preferable, given the gotchas around code that might otherwise be interpreted as making a non-tail call, such as not structuring things with an accumulator where one is needed.
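For example, in Clojure (a minimal sketch), the explicit keyword forces you into the accumulator shape, because the naive (* n (recur (dec n))) won't compile:

    ;; recur only compiles in tail position, so the accumulator
    ;; version is the one the language lets you write:
    (defn fact [n]
      (loop [n n, acc 1]
        (if (pos? n)
          (recur (dec n) (* acc n))
          acc)))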
Tail calls start to work magic, though, when you use them with mutual recursion. Tail calling is not just a kosher way to write a loop to fit in with a functional religion; it makes it possible to structure programs with performance characteristics that are hard to achieve any other way.
Consider a simple opcode interpreter for a simple virtual machine. A trivial interpreter could use a big switch statement, with one entry per opcode; but that wouldn't be particularly neatly structured, because you'd likely have to inline most opcode implementation bodies into the switch to get decent performance.
An optimization is a threaded-code interpreter: scan through the program once, transforming opcodes into function pointers, where each function implements a single opcode and proceeds to the next opcode by calling it directly. But this causes a problem in most imperative languages, absent low-level hackery or an unsafe feature like gcc's computed goto: the stack will grow with every opcode interpreted, because each call grows the stack. Tail call support permits this approach in a way that is both safe and efficient.
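To make the shape concrete, here's a toy sketch in Clojure (the opcode set is invented for illustration). The JVM doesn't eliminate tail calls, so each compiled instruction returns a thunk and trampoline stands in for the direct tail call it would otherwise make:

    ;; Each opcode becomes a closure that does its work and hands off
    ;; to the next instruction; with real TCO the closures would call
    ;; each other directly instead of returning thunks.
    (defn compile-program [ops]
      (reduce (fn [next-insn [op arg]]
                (case op
                  :add (fn [acc] #(next-insn (+ acc arg)))
                  :mul (fn [acc] #(next-insn (* acc arg)))))
              (fn [acc] acc)   ; end of program: return the accumulator
              (reverse ops)))

    (trampoline (compile-program [[:add 2] [:mul 10]]) 1) ;=> 30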
I do agree, though, that even in these cases, some kind of tail call directive would be preferable, to indicate to the compiler that a tail call is expected here for algorithmic correctness, and to help the programmer avoid accidentally inhibiting tail calling.
I agree, (goto fname arg arg) seems like a very clean way of "tail calling" explicitly. In particular, the fact that it's in tail position can be checked as a syntax issue, as it is with (recur).
Building up thunks carries some overhead. In languages without TCO, virtual machines are usually written as, e.g., one big function containing a big switch statement, because any dispatch overhead there will impact every single operation that runs on the system. Not every VM needs to be fast, of course, but it's a common requirement for production versions.
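That "everything in one big function" style looks roughly like this in Clojure (a sketch with an invented two-opcode-plus-halt machine): a single loop/recur with a case per opcode, so dispatch costs one jump rather than a call or a thunk allocation:

    ;; Single-function dispatch: one loop, one case branch per opcode.
    (defn run-switch [code]
      (loop [pc 0, acc 0]
        (let [[op arg] (nth code pc)]
          (case op
            :add  (recur (inc pc) (+ acc arg))
            :mul  (recur (inc pc) (* acc arg))
            :halt acc))))

    (run-switch [[:add 2] [:mul 10] [:halt]]) ;=> 20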
This is the original source of the much-quoted "Premature optimization is the root of all evil." Unfortunately, few people read the rest of the paper. Many of Knuth's "structured goto" examples have since been added into mainstream programming languages as exceptions or various types of "break" statements, but the most interesting ones still work only as tail calls or gotos, and therefore are simply not possible in certain languages. In fact, most programmers these days would never even think of the solutions that Knuth considers most elegant, because current languages have left them with a huge conceptual blind spot.
He's practically the poster boy for Sufficiently Smart Compiler cheerleading, but he despises the most useful real-world example? And he's in the middle of a blog series about how awesome auto-parallelization will be? How is observable nondeterminism not a much bigger lie than TCO?
If you like Clojure's recur but find its syntacticalness distasteful, and clueful compilers abhorrent, the Y Combinator is exactly what you want.
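For the record, a sketch of the applicative-order Y (strictly speaking, the Z combinator) in Clojure:

    ;; Anonymous recursion: no recur, no named self-reference.
    (defn Z [f]
      ((fn [x] (f (fn [& args] (apply (x x) args))))
       (fn [x] (f (fn [& args] (apply (x x) args))))))

    (def fact
      (Z (fn [self]
           (fn [n] (if (zero? n) 1 (* n (self (dec n))))))))

    (fact 5) ;=> 120

(Note that this version happily consumes stack, which is rather the point of the joke.)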
I don't know much about Tim Bray, and I don't usually read his articles. But this argument is OK.
Having a smart compiler is one thing ... but a tail-call is not a simple optimization that speeds up your software. And personally I haven't seen a "sufficiently smart compiler" that does a reasonable job ... Haskell's GHC does a pretty good job, but without internal knowledge of the compiler the results are often horrible.
When you modify your function and accidentally move the call so that it's no longer in a tail position (and this does happen), that function may no longer run.
Engineers relying on tail-calls end up looking at the generated bytecode/assembler to ensure that it's really a tail-call; otherwise the software crashes.
That's why a tail-call should be explicit. Because you're relying on it for the computation to complete. It's not for speed, it's for correctness, and with implicit tail-calls you are effectively trading one kind of accidental complexity for another.
Is it really that hard to tell whether or not a call is in a tail position? It's like type inference: it has a major impact on what kind of code is idiomatic, and if you want to know the language, you'll need to understand it. Still, TCO is a lot less complicated than pointers or all of the conceptual baggage necessary to design OO class hierarchies well.
"Is it really that hard to tell whether or not a call is in a tail position?"
It is easy to forget when you are focusing on the logic of the code, and easy to change accidentally. Clojure explicitly tells you when you are trying to recur from a non-tail position, which is rather nice, and not possible with implicit tail call optimization (how do you know if the programmer really wanted a tail call or not?).
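For example (a minimal sketch):

    ;; In tail position: compiles, and runs in constant stack.
    (defn countdown [n]
      (if (pos? n)
        (recur (dec n))
        n))

    ;; Not in tail position: the + still has work to do after the
    ;; call, and the compiler rejects it outright:
    ;;   (defn bad-sum [n]
    ;;     (if (pos? n) (+ n (recur (dec n))) 0))
    ;; => CompilerException: Can only recur from tail position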
Having the compiler catch your mistakes is always nice.
I am a big fan of being able to get stack backtraces when I have a problem to debug. But tail call elimination silently throws away stack frames. You get great performance benefits from doing so, but I'd like an option when debugging to get at the information that has been thrown away.
During debugging I'd rather change recur into a function call than make an edit that kills the optimization for less obvious reasons. So far so good.
But recur is less flexible than tail call optimization. Consider for the moment a problem that you have modeled as a finite state machine. Each state is a function, and a state transition is signaled by calling the next function. With tail call optimization this won't grow your stack. With recur you have no such option, because you can't jump to another function.
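Concretely (a sketch, with hypothetical state names):

    (declare odd-state)

    ;; Each state inspects the input and transitions by calling the
    ;; next state function. With TCO these calls run in constant stack.
    (defn even-state [xs]
      (if (empty? xs) :even (odd-state (rest xs))))

    (defn odd-state [xs]
      (if (empty? xs) :odd (even-state (rest xs))))

    ;; On the JVM every transition pushes a frame, so a long enough
    ;; input overflows the stack -- and recur can't jump between the
    ;; two functions.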
Good point. That is possible with a trampoline, albeit with a modest overhead. The overhead would be irrelevant except that state machines get used in very performance-critical ways, such as inside a regular expression engine. But possibly Clojure isn't the right language if you care about performance that much.
Incidentally, I'd be shocked if trampoline were a special form rather than a function. I'm not sure what type a function is (I don't use Clojure, nor is it installed here), but it should be possible to implement it something like this:
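    ;; A sketch: call f, and as long as the result is itself a
    ;; function, keep calling it with no arguments.
    (defn my-trampoline
      ([f]
       (let [ret (f)]
         (if (fn? ret)
           (recur ret)
           ret)))
      ([f & args]
       (my-trampoline #(apply f args))))

Each "bounce" returns a closure instead of making a call, so the stack stays flat at the cost of allocating a thunk per iteration.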
Yes, that does seem broken. All the state is in the arguments, so it shouldn't matter what function you "recur" to.
I know this is a hack, but in Perl you can "goto" a function to remove the current function from the stack and replace it with the thing you goto. This seems about the same as "recur", but more flexible.
There does seem to be some dislike of recursion versus iteration in his writing. But he isn't primarily bagging on recursion in languages; he just thinks TCO should possibly be explicit. Maybe there is something to that argument and maybe there isn't.
I'm surprised that someone of his Math & CS background would not have more substantial arguments than these.
Also, being a co-editor of the XML spec doesn't position someone well to criticize recursion/TCO.
How is his authorship of the XML spec in any way relevant to the specific line of argument in this post? For all we know he also kicks puppies every morning and prefers Sanka over your favorite coffee. If his argument is so insubstantial, you should just refute it without the ad-hominem BS.
I agree that recur is nice and explicit about what's happening, but what does "it just seems wrong when the semantics of two identical function invocations vary wildly as a function of whether there’s an executable statement between the invocation and the function exit" even mean?
If there were additional statements after a recursive call, you couldn't do tail-call elimination, period, because you'd lose all stack information the earlier calls would need to complete. It's not like 'recur' can allow you to magically both keep the stack and perform tail-call elimination.
And how could you have 'two identical function invocations vary wildly'? The function is either tail recursive or it isn't; it either has extra statements after the recursive call or it doesn't. You can't ever invoke it both ways.
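Side by side (a sketch):

    ;; Tail-recursive: the recursive call is the entire return value,
    ;; so the frame can be reused.
    (defn sum-acc [acc xs]
      (if (seq xs)
        (recur (+ acc (first xs)) (rest xs))
        acc))

    ;; Not tail-recursive: the + runs after the call returns, so every
    ;; frame must be kept, and a deep enough list blows the stack.
    (defn sum-naive [xs]
      (if (seq xs)
        (+ (first xs) (sum-naive (rest xs)))
        0))

You can't invoke one function "both ways"; you write one shape or the other.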
"The function is either tail recursive or it isn't; it either has extra statements after the recursive call or it doesn't. You can't ever invoke it both ways."
I think the thing here is that in, say, Scheme, the performance and actual execution semantics of your function vary wildly depending on how much you mangle your algorithm. If you twist it into a shape that ends with a recursive call, it does one thing; if you don't, it does another. This is why tail calling is a leaky abstraction and why I generally look down on it (setting aside the fact that the most natural expression of many algorithms often involves other methods of implementation -- truly "naturally" recursive algorithms aren't all that common in my experience).
Okay, but you're still doing the twisting with recur. The only difference is that the compiler checks that recur is actually in a tail position. That's probably nice to a newcomer, sure, but once you do it a few times, it's pretty simple to tell whether you're making a tail call or not.
You're changing the topic of conversation from 'recur' to 'tail-call elimination', but I'll bite.
I don't understand your argument. How is the decision to take advantage of tail-call elimination any less a part of an algorithm's implementation than the decision to, say, use a lazy iterator rather than read into an intermediate array? If it's not the best tool for what you're trying to do, certainly, don't use it, but there are plenty of legitimate places where it is needed, so why argue against it?
It's pretty simple to tell when you are looking at it. However, you may forget or overlook that you are affecting a tail call. If it were optional, I would still opt to use it all the time.
"I think the thing here is that in, say, Scheme, the performance and actual execution semantics of your function vary wildly depending on how much you mangle your algorithm."
Precisely the same thing is true with recur in Clojure. The only difference is that once you've mangled your algorithm, you get to proudly signpost it by wrapping it in a recur. It doesn't help the machine at all, it just helps the user. Sounds like a job for... comments!
(More specifically, recur only works in situations where TCO would have happened automatically in Scheme -- and not even all of those, as other commenters have pointed out. So what you get is a weakened form of TCO that only kind of works and makes your code more verbose for no reason: as I said above, a comment would suffice in these circumstances.)
Re naturally recursive: it depends on your application ___domain. If you work with trees all day long, such as in a compiler, templating engine, or scene graph, recursion is almost more common than looping.
Honestly, I feel like recursion is the natural way to express transformations on any nested data structure (like a list, tree, or graph). I think the major reason many people don't see these data structures often is that some languages make recursion harder than iteration, so they stick with iteration across arrays and `for` loops.
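For example (a small sketch):

    ;; Doubling every number in arbitrarily nested data: the shape of
    ;; the code mirrors the shape of the data.
    (defn double-all [x]
      (if (sequential? x)
        (map double-all x)
        (* 2 x)))

    (double-all [1 [2 [3 4]] 5]) ;=> (2 (4 (6 8)) 10)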
Recursion is really a natural way to express almost any repeated operation on a computer.
He says, "I think the semantics of computer programs should be made explicit in their syntax." However, when the semantic definition specifies tail calls, where tail calls are performed is trivially obvious from the lexical structure. Either he doesn't understand the paper he cited or he was not being careful with his wording.
I don't think anyone will be able to have a constructive conversation with him about this until he either gets the paper or can precisely word his objections. I think that may be why people keep pointing him at the paper.
As an alternative to reading the paper, he could also just implement a basic ML compiler that does CPS and performs tail calls. It'd probably take him less time than it took to write that post and read all the comments he's going to get...
loop-recur has its good points, but it has its limitations, too. To wit, recur is not allowed inside a catch or finally form. (At least in Clojure 1.0.0.) By extension, this precludes the use of for, dotimes, doseq, and any other macro which expands into loop-recur. Luckily, calling an external function does work, so exception recovery which happens to require a loop is possible.
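For instance (a sketch; the retry logic is invented for illustration):

    ;; recur is rejected inside catch, but an ordinary call works for
    ;; exception recovery that needs to loop.
    (defn with-retries [f times]
      (try
        (f)
        (catch Exception e
          (if (pos? times)
            (with-retries f (dec times))  ; plain call, not recur
            (throw e)))))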
It's an optimization. Most optimizations don't require that the programmer annotate "I want this optimization" around the code that can be optimized. Obviously, he always wants the optimization. Tail call removal does this; it's a case that is easy for the compiler to identify and optimize. Requiring an explicit "hey, optimize this!" instruction is just confusing...
I don't think Clojure's "recur" or Scheme's tail-call elimination should really be called optimizations.
The point of optimizations, AIUI, is that the code will still run with the optimizations off--it will just run more slowly, but might be useful in other ways (easier debugging, faster compilation, a chance to confirm that there are no bugs in the compiler/interpreter itself, whatever).
But if I'm using "recur" or tail-call elimination to write loops in a recursive form, and I know that my code has to still work even if those features are turned off, then all of a sudden I have to worry about things like "how many times can I recurse in this function without exhausting the stack?"
Presumably tail-recursion is the natural expression of your algorithm.
(I will admit that I wish the compiler could optimize something of the form "sum (x:xs) = x + sum xs"; instead you have to write "sum k (x:xs) = sum (k + x) xs".)