I was comparing what I frequently see vs what I'd like to see. Your example code is pretty rare in my experience. So I think the comparison is pretty fair.
I worked on a program where the management had a near-religious devotion to the one-return-statement paradigm. It made for a bunch of unnecessary nesting and locals just to comply. Even worse was strong resistance to using continue and break (outside of switch statements), which led to unnecessary flag variables on top of that.
Yes, the factored version is much clearer. Though you want to make sure the refactored version is equivalent, which it isn't in your case due to a change of order.
Maybe so, but what happens when you need to do something before returning the value, say, closing a connection? You'd have to make the change in 4 spots (and hope you don't miss one), as opposed to the first example where you'd only have to make the change in 1 spot. It's all about minimizing errors later on.
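To make that concrete, here's a sketch of the single-exit shape (connection, authenticated, and the rest are invented names, not anyone's real code):

int process(connection *conn)
{
    int result;

    if (!authenticated(conn)) {
        result = ERR_AUTH;
    } else if (!send_request(conn)) {
        result = ERR_SEND;
    } else {
        result = OK;
    }

    close_connection(conn);  /* the one spot every path passes through */
    return result;
}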
> Maybe so, but what happens when you need to do something before returning the value, say, closing a connection?
Then your choices include a try...finally block, a "using" statement if you're in C#, or... limiting yourself to only 1 return. Multiple returns don't fix the problem in all cases, just most of them.
I'm assuming that this is in a language that has automatic garbage collection. The one-return rule came about in C, which does not, and so has many more cases where manual clean-up is required.
Then do something like hxa's example. Maintaining a return variable isn't always bad, it's just weird to think of it as functionally better than multiple returns. I tend to find a lot of errors and bugs in heavily nested code because it involves tracking a lot of variable state.
I dislike this:

if (...) {
    if (...) {
        ...
    } else {
        if (...) {
            ...
        } else if (...) {
            ...
        } else {
            ...
        }
    }
}
and prefer this:
if (...)
{
    if (...)
    {
        ...
    }
    else
    {
        if (...)
        {
            ...
        }
        else if (...)
        {
            ...
        }
        else
        {
            ...
        }
    }
}
The first example is confusing and hard to read. The second is clean and easy to read.
As to the question of optimization, that's a different matter. Lots of tricks can be used to compress code into fewer statements on a case-by-case basis:
if(pushbutton == 0)
{
    led = 12;
}
else
{
    led = 24;
}
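One such trick, assuming pushbutton is strictly 0 or 1, is to fold the branch into arithmetic:

led = (pushbutton * 12) + 12;  /* 0 -> 12, 1 -> 24 */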
I find the first version easy to read, and the second confusing and hard to read. I believe this is a matter of what a person is more accustomed to; I learned on the first form and lately have been using it even more. For me the extra brace below the keyword makes it hard to find the phrase of interest, and the expanded vertical space also makes it more difficult to line up indentation levels. But these visual cues are flexible, and I could learn similar scanning tricks for the second form.
I have worked with people who were dogmatic that one form was objectively better than the other, rather than it being a matter of local optimization in the brain. I have difficulty working with these people because they tend to show similar rigidity, and the same confusion of the subjective with the objective, in other areas, and such small-mindedness limits the type of engineering solutions one is able to come up with.
An early history of writing code on 24 x 80 terminals trained me to prefer K&R style (where the brace is on the same line as the if) over the expanded style. The goal was to have as much of the code on the screen as possible, so vertical space was precious. Of course, when you've got a 1080p monitor on its side you get all the vertical space you need, and the open style does give you more room.
However, I strongly avoid converting logical semantics into arithmetic ones. The LED example where:
led = (pushbutton * 12) + 12;
Makes me cringe.
You know that the person who wrote that assumed 'non-zero' meant the button was pressed, not that '1' means it's pressed. At some point the pushbutton gets connected to bit 3 of the I/O port, the kid does a byte read 'cuz it was easy', and now the value of pushbutton is 8 (which would still work with their code, but not the refactored code). So led gets the value 108, and the code that takes the LED value turns on some random GPIO bit somewhere else, because it could "only" ever get 12 or 24 and now it can get a lot of different values. And a long escalation in manufacturing over how 'the new software screws up the gizmo' turns out to be a side effect of the refactoring. Been there, done that, wrote the after-action report. No thanks!
First, I understood that you were not seriously proposing this sort of I/O code, but as someone who has been around the block my share of times I know that such code both exists and is in production, almost exactly as written. The memories are like a war veteran's who can't forget the trenches; I wince now and then :-).
Second, my comment was that code which has a 'boolean' flow is probably boolean code, and not arithmetic code. So any refactoring that 'assumes' boolean and uses arithmetic becomes a landmine for some future coder to trip over. MUCH better to refactor such code into:
led = (pushbutton) ? 24 : 12;
Than "assuming" that true is '1' and false is '0'. When reading old code I've trained my self to read the statement
if (variable == 0) {
    // this code assumes the variable is zero
} else {
    // this code has to work for all non-zero values
}
Which may not have been the programmer's intent, but it is what they wrote.
I understand. This was about coding style and not about solving a particular problem correctly. Next time I might choose a better example. Maybe I'll include complete keypad matrix scanning, debouncing, and encoding code... just to be safe (just kidding).
It's been a long, long time since I've checked button activity that way. These days, any time I do embedded work I tend to use either my own RTOS or a third-party RTOS. In most of these cases UI events get turned into messages in a message loop (sort of like programming for Windows). So it is likely that there's a big switch statement somewhere picking messages off the FIFO and processing them accordingly. The low-level stuff happens somewhere else. The message would indicate the event that took place (UP, DOWN, CLICK, DOUBLE_CLICK), and parameters in the message packet would identify the button or buttons (in the case where chords are enabled) that generated the event.
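For illustration only, that kind of dispatch loop might look something like this (every name here, from msg_t to queue_get to the EVT_* constants and handlers, is made up):

typedef struct
{
    int event;    /* e.g. EVT_UP, EVT_DOWN, EVT_CLICK, EVT_DOUBLE_CLICK */
    int buttons;  /* bitmask of the button(s) that generated the event */
} msg_t;

void ui_task(void)
{
    for (;;)
    {
        msg_t msg;
        queue_get(&msg);  /* block until the RTOS posts the next UI event */
        switch (msg.event)
        {
            case EVT_UP:           on_button_up(msg.buttons);    break;
            case EVT_DOWN:         on_button_down(msg.buttons);  break;
            case EVT_CLICK:        on_click(msg.buttons);        break;
            case EVT_DOUBLE_CLICK: on_double_click(msg.buttons); break;
            default:               break;
        }
    }
}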
Fun stuff. Wrote this code years ago and keep reusing it. It simply works. Now doing a lot more iOS and web development. iOS can feel a lot like embedded development at times.
Good procedural code is factored into short functions, which would make some of the types of programs the article is alluding to easier to read.
Also, object oriented programming can be a good option for reducing cyclomatic complexity in a lot of different situations where the system you are modeling lends itself to a class structure. And CoffeeScript makes this easier also.
In C at least, pushbutton could be strictly greater than one and still be treated as true, but that would be a different result in the arithmetic versions of the statement. It could be made identical by using "!!pushbutton".
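i.e. something like:

led = (!!pushbutton * 12) + 12;  /* !! maps any non-zero value to exactly 1 */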
I agree, the second style is infinitely more readable. It clearly shows the hierarchy. Unfortunately it seems to be rare compared to the first one, as most people seem to want to hide the opening braces or something. Maybe because they come from ugly languages without braces, like Assembly, Python, Fortran, or Ruby...
I've programmed in everything from entering hex code into a hexadecimal keypad to assembly, Forth, APL and beyond. Sure, languages can influence style. I wouldn't go as far as calling any language "ugly". There's that initial phase of exposure when one might not be comfortable with the language constructs. And yes, at that moment in time it is natural to feel that a language might be "ugly". I certainly felt that way when learning Objective-C. Once you get on the other side and you start using the language for real work things can change. I don't hate any particular language but do enjoy some more than others. Forth, APL and Lisp were a lot of fun to work with.
You certainly can, but it doesn't mean that you should.
Case in point: the Linux networking code, and more specifically the socket-handling part. Mr. Kuznetsov did an excellent job abstracting a lot of things behind interfaces (= blocks of function pointers), but at the same time this made it much, much harder to follow the code. It's basically impossible to trace the code without a pen and paper.
(edit) To elaborate on the context - reading and tracing the code straight off the source files is a necessity of kernel development, routinely required for both debugging and writing new code.
if (!test()) {
    handle_failure();
} else {
    // do lots of work
}
is just another incarnation of
if (NULL == v)
i.e. it sacrifices the natural flow of the code, making it harder to follow, for something of superficial value.
Also this -
if (!test1()) {
    handle_failure_1();
    return false;
}
is fine, unless the function needs to do a cleanup, which is pretty much always the case, and which in turn means that the { } block will grow with every if and will include a lot of redundant code.
A far cleaner and easier to follow approach is to use if-goto, whereby all cleanup code sits at the bottom of the function (where it naturally belongs) and the flow jumps to the appropriate part of the cleanup on failure. The only exception is when there's no cleanup needed, in which case it can do if-return.
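A minimal sketch of that shape (the names and resources are invented), with the cleanup stacked at the bottom in reverse acquisition order:

#include <stdio.h>
#include <stdlib.h>

int do_work(void)
{
    int rc = -1;

    char *buf = malloc(4096);
    if (buf == NULL) goto out;

    FILE *fp = fopen("data.txt", "r");
    if (fp == NULL) goto out_free;

    /* ... the actual work ... */
    rc = 0;

    fclose(fp);
out_free:
    free(buf);
out:
    return rc;
}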
I use both if-return and if-goto. In practice you need both, depending on circumstance. In much of my recent code I've managed to use a lot of if-return checks at the top of functions. Another nice thing about a bunch of if-returns at the beginning is it makes a nice checklist of things that will cause the function to fail, all right up front.
Asserts are for enforcing invariants in the code. If you replace an assert with a conditional that allows the execution to continue, you are concealing the problem and making it that much harder to track.
Practically speaking, your function should not be failing because your other code passed it invalid parameters. That's a bug in the former. That's why parameter validity should be asserted.
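A trivial example of the distinction (set_duty_cycle is a made-up function):

#include <assert.h>

/* Caller contract: percent must be 0..100. A violation is a bug in the
   caller, so assert it rather than silently clamping or returning an error. */
void set_duty_cycle(int percent)
{
    assert(percent >= 0 && percent <= 100);
    /* ... program the hardware ... */
}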
The broad suggestion that indentation is somehow wrong is ill-advised.
Sure, several of the refactorings given are sensible. But the basic problem is not indentation. If something can be simplified, simplify it. If something is too long, break it into pieces. But if something is tree-structured, that is what it is, and it probably ought not be bent out of shape.
(Note: Fowler's refactoring 'Replace Nested Conditional with Guard Clauses' is very specific -- it does not declare that all nesting is bad.)
A tree-like-indentation structure is a perfectly good element of programming -- why try/want to change it into something else?
In fact, a good case could be made that software in its most fundamental essential conception is tree-structured -- even that engineering generally is tree-structured.
At this point in my coding life (15 years in), I'm pretty strongly against if/else clauses. The readability of such code just plummets in my view.
I try to limit my conditionals to inline statements:
$x = $y if $b;
Often joined with early returns:
return $b if $a;
If I find myself coding an if/else branch, that's often a clue that I should refactor this chunk into a function that can use early returns to avoid nested conditionals.
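Sketching that shape (in C rather than Perl, with made-up names), the nesting disappears into guard-style early returns:

/* Instead of an inline if/else tree, a small function with early returns */
int discount_percent(int is_member, int on_sale)
{
    if (!is_member) return 0;
    if (on_sale) return 20;
    return 10;
}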
I'm glad I'm not the only one who is particular about these things. Computers may not care which order your IF statement is in... but some people completely disregard that their code is going to be read by humans too.
Generally good advice, but w/r/t the first recommendation, I prefer to structure if/else clauses so that common case is handled first and the exceptional case is handled last.
As shown in the article, that first recommendation leads directly to the next step(s). In the ideal case (which actually happens sometimes in the real world!), you end up with something like this:
if (bad1) fail();
if (bad2) fail();
if (bad3) fail();
step1();
step2();
step3();
step4();
return SUCCESS;
To me, this is a very nice way to structure things when you can. You get a checklist of things that might be wrong, and then comes the wonderfully naive straightforward implementation.
Stating failures first can also be considered analogous to input validation, preconditions, or declaring the ___domain for which the function is undefined. Some related discussion on here: http://c2.com/cgi/wiki?GuardClause
I like to do this as well, both for reasons of clarity and because I like to imagine that it saves my code from executing an often-wasted conditional (though I really have no idea if the order of branch condition evaluation is guaranteed in the face of compiler transformations, or if branch prediction and/or speculative execution make this point completely moot).
meh. when i'm looking for a bug, i tend to start in the places with lots of branching. in general, high quality code doesn't have much branching, and if you find yourself needing complicated and nested branching, there is probably a better way to factor your code.
one easy trick is to structure your code such that 'null' is impossible, by initializing to an empty value, re-ordering the way things happen, or using exceptions or an error monad.
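For example (find_name and display_name are made-up names), returning a shared empty value instead of NULL:

/* find_name() is a hypothetical lookup that may return NULL */
extern const char *find_name(int id);

const char *display_name(int id)
{
    const char *name = find_name(id);
    return name ? name : "";  /* callers never see NULL */
}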
I have to run so this is really a substandard comment for hacker news, but... Really? He advocates adding multiple exit points to a function as a way of REDUCING COMPLEXITY? The solution when you have too deep of nesting is to pull out inner parts into useful simpler stateless functions, or so I was led to believe!
He's preaching to the choir in my case. I'm a big fan of busting out of functions/methods early when you already know "it's over". (sometimes this means putting the cleanup in something like a try/finally wrapper routine, though)
But, kudos: I can send this to my team instead of having to write it myself. Win!
Or just test the resource handles you opened, and only close them if they're valid. That way, you don't have to worry about doing a goto to the wrong label, or messing up the order of cleanup.
BOOL do_stuff_with_resources()
{
    BOOL success = FALSE;

    /* Initialize every handle to its "invalid" sentinel up front, so the
       cleanup checks below are well-defined no matter where we bail out. */
    char *memBuf = NULL;
    FILE *fp = NULL;
    int sock = -1;
    int shmId = -1;
    char *shmRegion = (char *) -1;
    int semInited = FALSE;  /* sem_t is opaque, so track init with a flag */
    sem_t st;

    memBuf = (char *) malloc(256 * 1024);
    if (NULL == memBuf) goto cleanup;
    if (randBomb()) goto cleanup;

    fp = fopen("dummy-file.txt", "w");
    if (NULL == fp) goto cleanup;
    if (randBomb()) goto cleanup;

    sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (-1 == sock) goto cleanup;
    if (randBomb()) goto cleanup;

    shmId = shmget(0x1234ABCD, 256 * 1024, IPC_CREAT);
    if (-1 == shmId) goto cleanup;

    shmRegion = (char *) shmat(shmId, NULL, 0);
    if ((char *) -1 == shmRegion) goto cleanup;
    if (randBomb()) goto cleanup;

    if (-1 == sem_init(&st, 0, 1)) goto cleanup;
    semInited = TRUE;
    if (randBomb()) goto cleanup;

    // TODO: Do stuff
    success = TRUE;

cleanup:
    if (semInited)
    {
        sem_destroy(&st);
    }
    if ((char *) -1 != shmRegion)
    {
        shmdt(shmRegion);
    }
    if (-1 != shmId)
    {
        shmctl(shmId, IPC_RMID, NULL);
    }
    if (-1 != sock)
    {
        close(sock);
    }
    if (NULL != fp)
    {
        fclose(fp);
    }
    if (NULL != memBuf)
    {
        free(memBuf);
    }
    return success;
}
If you do your indentation "wrong" and nothing breaks, then you will do your indentation "wrong". That's why I find whitespace-sensitive languages easier to read, YMMV.
He's basically a step away from having several objects (let's call them handlers) which can do your handling for you. Further, they can define the set of circumstances in which they are valid.
I'm going to work with a couple of assumptions:
1) The test cases which he refactors to test*() methods are non-trivial
2) The handlers _may_ be large sets of code
Here's a way to handle one of these refactored methods:
if (!test1()) {
    handle_failure_1();
    return false;
}

if (!test2()) {
    handle_failure_2();
    return false;
}

if (!test3()) {
    handle_failure_3();
    return false;
}

handle_success();
return true;
That's a lot of potential code we're dealing with in this class.
What if:
class HandlesFailure1
  def handles_this_case(params)
    # code from test1()
  end

  def handle(params)
    # code from handle_failure_1()
    false
  end
end

class HandlesFailure2
  def handles_this_case(params)
    # code from test2()
  end

  def handle(params)
    # code from handle_failure_2()
    false
  end
end

class HandlesFailure3
  def handles_this_case(params)
    # code from test3()
  end

  def handle(params)
    # code from handle_failure_3()
    false
  end
end

class HandlesSuccess
  def handles_this_case(params)
    true
  end

  def handle(params)
    # code from handle_success()
    true
  end
end

# And now the original class
class DoesSomething
  # There are ways to set this up fairly easily. One simple way is to just
  # new the handlers up here, but you could auto-wire them too.
  HANDLERS = [HandlesFailure1.new, HandlesFailure2.new,
              HandlesFailure3.new, HandlesSuccess.new]

  def do_something(params)
    HANDLERS.each do |handler|
      return handler.handle(params) if handler.handles_this_case(params)
    end
  end
end
There. Now there's a loop and an if. Check each case and either handle it or move on to the next handler. This can also work for the case where you might want multiple handlers to handle something; just don't return on the first one.
I'm not saying it's the right way, but it's another way.
Pros:
It really helps to group together the handling methods with the test methods.
Very simple dispatch method
Cons:
Separate classes, and harder to view all the handlers together.
I used to be a big goto-error-cleanup proponent, but ultimately gave it up. The resulting code is cleaner, but the bug it's trying to prevent (returning without correctly cleaning up) is still too easy to write. Complicated functions end up with 4-5 cleanup labels, and it's hard to look at them and understand the resource acquisition order.
Generally these days I prefer functional decomposition for that (actually honestly I'm preferring C++ RAII style where you get the cleanup for free in almost all cases). That is, do something which requires cleanup (e.g. allocate memory, take a mutex, etc...), call a function to "initialize" the thing you just did, check results and free it on failure. So the allocation and cleanup happen next to each other in the code. Nest as needed for complicated resource strategies, and it scales well. No one is ever confused about what the state is supposed to be.
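A sketch of that style in C (with_buffer and use_buffer are invented names): each level acquires one resource and frees it right there, so acquisition and release sit together.

#include <stdlib.h>

/* use_buffer() is a made-up worker function that does no cleanup of its own */
extern int use_buffer(char *buf, size_t n);

int with_buffer(size_t n)
{
    char *buf = malloc(n);
    if (buf == NULL)
        return -1;

    int rc = use_buffer(buf, n);

    free(buf);  /* release sits next to the acquisition above */
    return rc;
}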
The problem isn't with 1 resource. It is with 2 or more, right? You successfully allocate the first, but the second fails and so you have to go back and deallocate the first. The goto strategy avoids code repetition.
I read the comments just expecting yours. Something must obviously be bad if Dijkstra says so, right? We don't have to think for ourselves anymore if the god of CS himself has banned something, right?
Any of those mysterious jump-to-somewhere-else code features resemble goto. Not that I don't think they clean up some nesting issues, but they're pretty much the same thing as goto.
More specifically, the `defer` keyword lets you handle cleanup in the case of failure a bit more neatly.
The gotos can be replaced with stuff like this:
foo, err := bar()
if err != nil {
    return err
}
defer foo.cleanup() // or what have you
Now, right before the function returns, Go will call `foo.cleanup()` and your `cleanup()` call sits right next to the initialization code. It's a lot easier to tell whether or not you are (or someone else is) doing cleanup properly.