Hacker News new | past | comments | ask | show | jobs | submit login
Vim plugin to disapprove deeply indented code (github.com/dodie)
161 points by danjoc on March 7, 2017 | hide | past | favorite | 93 comments



... and then people just "solve" the problem by not indenting at all.

Seriously though, it's super hard to avoid deeply-indented code in some languages (cough Java, JavaScript).

    @Override
    protected void onDestroy() {
        super.onDestroy();
        if(someCrap != null) {
            PendingResult<MessageApi.SendMessageResult> pending = Wearable.MessageApi.sendMessage(mGoogleApiClient, mPhoneNode.getId(), path, data);
            pending.setResultCallback(new ResultCallback<MessageApi.SendMessageResult>() {
                @Override
                public void onResult(MessageApi.SendMessageResult result) {
                    runOnUiThread(new Runnable() {
                        ....


"Bulldozer code" - gives the appearance of refactoring by breaking out chunks into subroutines, but that are impossible to reuse in another context (very high cohesion)


Subroutine that is only used in one place still improves readability: instead of a bunch of statements, you have a name, explicit parameter list and a retugn value.


...and then that subroutine has another subroutine, each has its own *doc comments, and suddenly to understand the code you need to jump between three files and read 50 lines of code that could be five lines of code in one file.

Breaking up code doesn't always make it more clear and readable.

The same goes for OOP and desperately creating objects from things that don't represent real world entities.


I only agree with this in the case of pure functions. However, when your function has a side effect (quite usual in C++ methods for example) then all readability gains are lost.

I much prefer inlined code that reads from top to bottom, rather to have to jump all over the place into single purpose functions.


I like this. I never really thought about it this way, but I'm much more okay with long functional methods rather than long mutating method


it's not when applied once, it's when something that can be understood more easily "locally" is exploded into a billion such pieces - then you have to mentally find all the call sites and usages to mentally build the call graph back up — you still have to understand how the whole ensemble works and bulldozing can (but not always) make understanding the ensemble more difficult especially if done by a poor programmer has the effect of obfuscation rather than clarification.


It's good when it gives a name to what you're doing.

connection = establish_connection_given_these_conditions(cond1, cond2)

Vs the many lines of checking, creating the ConnectionFactory, passing in the the conditions to the ConnectionConditionFactory, and generally having a new statement every time you want to blow your nose. What were we doing again? I dunno.

I've used it. I'll defend it. It's better than comments when done properly as above.


What's the difference between cond1 and cond2 parameters? Can they be swapped? If cond1 and cond2 are booleans, they're almost indecipherable at the call site. Why not cond1 && cond2 (or symbolic equivalent)?

Abstracting away the gruntwork of your connection factory and your connection condition factory into a couple of separate routines - those are resuable bits of code, they'll let you write this code at a higher level of abstraction using reusable abstractions. But pushing the ugly wiring into a non-reusable black box and pretending it's better design - bullshit.


Good point on the conditions. I wanted to avoid pasting prod code I'd been working on and was trying to make an example: I generally use this for things like logging.

It's best when:

1) The logic is very specific to the situation and isn't reusable 2) Isn't important to the function's api contract.


Indeed!

More generally, functions are not only about abstraction, but also about decomposition.

The only thing that may hurt readability is when the decomposition itself is too much nested. Usually, it's best to split a large task purely linearly in subtasks (functions), and have them called all from a single function. That main task function gives you not only an overview of all sub tasks, but also about the dataflow between them.

In constrast, too much subtask nesting (subtasks of subtasks of subtasks) often hide the overall flow, which then makes it hard to get the overall view.


Code maintenance is hard, therefore a bugfix could easily turn the function to establish_connection_given_these_conditions_and_implicit_one(cond1, cond2); Now the identifier is misleading and becomes a landmine waiting to blow your feet off :(

Having done some debugging where identifiers/comments/structure are misleading I am a bit afraid of such code. I am really not sure what is the best way to structure highly imperative code though, because every option I have seen contains some dark corners.


Fair. On of the good aspects is that it makes this chunk testable, rather than having to test the behavior of the long line of statements. If there's a test for it it's a lot harder to sneak in an implicit assumption.


What I meant was that the implicit condition may be too context specific, dunno, maybe some NIC flags or something that are extremely unlikely to be caught by tests (especially written by anyone working on the code piece), code reviews or any such measure. Because the code works seemingly well until underlying conditions stop being satisfied.

Such conditions happen all the time, but are unproportionaly painful to debug in bulldozer code :(

Rant: Even though tests are nearly invaluable, though I personally believe that tests written by the same person writing implementation give a bit false sense of security. I see spec, tests and implementation different representations of author's understanding of the problem. If any two of those (e.g. tests and implementation) are produced from the same mind/understanding then they cannot be used to verify that understanding behind each other is the same.


Good point. I think I should make myself clear: I explicitly meant _procedures_ as in _functions_, most likely, pure ones, and not _methods_ of the same instance, for example. which have acccess to the same dozens of private members. If you can, declare these procedures to minimize their possible side effects - that makes reasoning about them much clearer.


The point is not to do that. When it is split into small pieces and I am validating the code, I care only about whether the content of function does what the name says. I don't need to have it all in head at the same time.


> The point is not to do that.

Then it's not bulldozer code.


With no guarantee that the name corresponds with what it does (particularly 5 or 10 years in the future) and your parameter list doesn't have names at point of use (what does the 'true' argument do?) and may be overloaded.

Factoring things out into single-use subroutines is a real mixed bag. It's not a clear win at all, not at all.


While I don't think refactoring single use function is a win in every case, I think it makes sense when what you're trying to do in the block of code could be conceptually separated from the rest of the function - and named appropriately.

Arguments sometimes make this practice messy, especially if you find out that your detached subroutine requires 5 different arguments from its parent. But IDEs or smart editors with tooltips or goto-definition make this an almost no-brainer in most cases. Besides that, many languages allow you (or even force you in the case of Smalltalk and Objective C) to solve the argument readability problem at the language level. When it's really naked 'true', you could just use keyword arguments.


I used to agree with you but now I resoundingly do not. The only compelling reason I have left is still quite compelling: testability. Without breaking your code I to subroutines it becomes orders of magnitude harder to test effectively or at all.


It comes at the expense of indirection, so it's certainly not free.


That reminds me of this blog post about inlining code based on a John Carmack email.

http://number-none.com/blow/blog/programming/2014/09/26/carm...


You mean...very high coupling?


perhaps, coupling does seem better; I am actually just quoting the definition as sourced here.


Small improvement, but this can help:

    if(someCrap == null) {
        return;
    }
I like that Swift makes this more explicit with the guard statement, but I've found it useful to reduce the "mental stack" in other languages, too.


Not everybody is allowed to do that. MISRA for example requires a single exit point.


That's 14.7, but if for some reason 14.7 doesn't apply 16.8 requires that all exit points return a value. So I'd argue MISRA lets you do that.


Reading up on MISRA, I found an interesting defense of a single return statement [1].

In the embedded world, I imagine it would be a good standard to force people to consider refactoring functions to their truth table.

Generally, I strongly favor coding standards to combat bad coding habits. Curious how MISRA works out in practice.

[1] https://spin.atomicobject.com/2011/07/26/in-defence-of-misra...


Equivalent to the "multiRet" function there and, I think, easier to follow:

  int multiRet() {
    if (!a) return 0;
    if (!b) return 1;
    return c ? 2 : 0;
  }
The singleRet function, as someone else already pointed out, actually has different behaviour (as well as being, in my view, just slightly harder to read and reason about):

  int singleRet() {
    if (!a) return 0;
    return (b&&c) ? 2 : 1;
  }
Both of these seem to me much much clearer than the multiply-nested versions there, and much clearer than introducing a new variable for the sake of a single return point.

(The other way to get a single return point here is to turn the code into a nest of ?: operators. I claim that will be less readable and more error-prone for most C programmers.)

I can't recall any instance I've ever seen where code became simpler or easier to follow as a result of turning multiple return points into a single return point.


> That’s a lot simpler isn’t it?

WTF?? No it isn't. You have this magic initial value and have to figure out which code paths have mutated it and which haven't. Mutable variables make code far far more complex to read.

I'm all for single return but through the use of expressions, not variables; I would probably express the given function as:

    expression() {
      return
        a ?
          (b && c) ?
            2
          :
            1
        :
          0
      ;
    }
(Ideally I'd use a language that allowed a sensible conditional expression rather than one that relies on bizarre symbols, but we work with what's available)


In OCaml :

    let expression a b c = match (a, b, c) with
        | (true, true, true) => 2,
        | (true, true, _)    => 1,
        | (false, _, _)      => 0
I like how pattern matching allows you to create something very close to a truth table. It allows you to visually see what's going on.

EDIT: Just realized that the two code examples in the parent's link don't even do the same... Look at the truth table, and look at the case of 1 1 0. The first code would indeed return 0, but not the second one. Derp.


> In the embedded world, I imagine it would be a good standard to force people to consider refactoring functions to their truth table.

It's not only a "good standard". It helps with verification: you can verify that the truth table is correct, and then that the function correctly implements that truth table. At least the former can sometimes be done formally.

> Curious how MISRA works out in practice.

Like every other standard, it depends on who's implementing it and with what tools, but it generally works out pretty well.


MISRA wasn't written with java in mind.


And in C(++) you never have to check preconditions that maybe should lead to an early return? I don't understand your point.


From the first paragraph on wikipedia: "Its aims are to facilitate code safety, security, portability and reliability in the context of embedded systems, specifically those systems programmed in ISO C / C90 / C99."

Java has better much better code safety than C (no pointers, gc, etc.) is portable via a VM, and is not used in embedded systems. It's also not specifically ISO C. MISRA was built do address certain issues with a certain language, in a certain environment. You shouldn't apply everything it recommends blindly to another language just because they both have conditions and return statements.

If you can explain to me why java benefits from having a single return statement, I would consider it. Otherwise, I refuse to blindly follow guidelines that were designed for different problems.


If you read this thread again, you'll see that someone recommended early returns to reduce nesting. I replied that some people are not allowed to do that (eg because they have to follow MISRA). There is no need to quote Wikipedia. None of the points raised in this thread pertains only to one programming language. I'm sure you'll find at least one style guide forbidding multiple returns for every language that has return statements.


This is getting ridiculous. The reason I quoted wikipedia was to explain why "MISRA wasn't written with java in mind". Will you argue otherwise?

> None of the points raised in this thread pertains only to one programming language.

Except you replied to a java problem with a C style guide.

> I'm sure you'll find at least one style guide forbidding multiple returns for every language that has return statements.

And if that guide can explain why java benefits from having a single return statement, I would consider it. Otherwise, I refuse to follow guidelines that were designed for different problems or cosmetic reasons.


retrolambda will get rid of some of those :)

But generally, if the code is too onerous to modify due to e.g. excessive use of closure-captured data, then the code usually deserves a long, hard ಠ_ಠ to make sure it doesn't deserve a (╯°□°)╯︵ ┻━┻. Hard to test, hard to fix, hard to control.


> a long, hard ಠ_ಠ to make sure it doesn't deserve a (╯°□°)╯︵ ┻━┻

Truly an idiom for our age.


Prehistorik humans used glyphs to record events. Seems that 'history repeating' proves itself again.


It's only "super hard" if you insist on anonymous classes. Reversing the `someCrap != null` condition and using an inner class reduces the identation count down to 2 (and that's in only one place for the return statement in the inverted null condition).

Also, consider using static imports for `Wearable.MessageApi.sendMessage` and `MessageApi.SendMessageResult`.


I really don't like static imports for stuff like that. You always have to do a double take to recognize where the method comes from.


Personally I've never had this problem. The context was always sufficient, and a simple IDE shortcut can offer me any additional information. I would argue a double take is worth reducing the number of dots I have to follow when reading the code. Also, in the case of `Wearable.MessageApi.sendMessage`, you can statically import `MessageApi.sendMessage` which gives a little more context.


I really like being able to read code in source control historical diffs, pull requests, etc. IDE shortcuts aren't a full solution; not everyone has time to check out a branch at that commit, create a project (if necessary), fire up an IDE (if there's a UI available) just to understand a line of code. Similarly, static imports at the top of the file are usually remote from the point of use, so you lose that context when viewing a diff.


Time for my apocryphal story:

I needed to a fix a particular kernel protocol subsystem so I could use Linux at work. I sat down with the spec and a non-Linux computer that implemented it. I spent 8 weeks (I thought it would take a couple days--HAH!), and I wrote it and wrote the very nice test suite which included the single test case I couldn't handle (handling would have required major changes to the kernel).

It's a protocol. The word protocol means "state machine". Especially when hardware dudes design something.

However, state machines mean "indentation". One level for current state, one level (at least) for output variables, and a couple of levels for input variables. My code was well documented, tested, and easily changeable if you needed to add a new state.

"But, that's more than two levels of indentation. Rewrite it.".

After I pulled my jaw up off the floor, swore a couple of times, I replied: "No. This code is properly written, well-tested, and very amenable to changes when the spec gets updated. This code does what I need and what a bunch of people need. I will happily distribute it to them and you will have a bunch of users of this subsystem happily telling everybody to not use mainline because they are a bunch of tossers. If you wish to rewrite it to your indentation spec, feel free, but I will not sign off until it passes all of the tests."

So, they did. They decided to rewrite it to remove the indentation. Lots of early returns (full of bugs). State cases that got "simplified" by sharing common code (full of uninitialized variable bugs). And, because of all the sharing, it became difficult to add cases when the spec changed (which it did--regularly).

But, eventually they got their two levels of 8 character indent. After SIXTEEN WEEKS.

My question, of course, was "Why not just pull each state into a function? It was the obvious way to get what you want. It would have been much easier."

The response: "Well, we tried that. But there are so many variables per state, that the function calls were all exceeding an 80-character line length and we had to wrap them too much."

I had a bruise on my forehead that day.

Yeah, formatting rules exist and have good reasons. But please remember that the goal is "code quality and readability"--not formatting rules.


But for example your indent

if(someCrap != null) {

followed by indentation on the following block, would be turned by the programmer into something like:

if(someCrap == null) return; //nothing to do

without an indent following.

thereby saving the indent and having to have the reader keep track of another level.

How many times have you seen:

       }
    }
  }
}

And is it really that great?

Breaking it into line by line is actually easier to follow than nested conditions. (You don't have to worry about proper indentation or matching braces properly, it's simply easier to write.)

I think it's easier to read as well.


I agree with you if that's all it were. But sometimes it's

    if(someCrap == null) {
      try {
        CrapSurface mCrapSurface = CrapGenerator.generateSomeCrap(CrapGenerator.CRAPPY, new CrapFactory.CrapOptions(new CrapOptionsCallBack() {
          @Override
          int getHeight() { return 5; }
          @Override
          int getWidth() { return 9; }
        }));
        someCrap = CrapSurface.getCrap();
      }
      catch (SomeException e) {
        try {
          someOtherFailSafeButInefficientWayToGenerateTheCrap();
        } catch (WeirdExceptionThatWillSeriouslyNeverHappenButStupidJavaRequiresMeToHaveATryCatchClauseAnyway f) {
          // do nothing because this exception will never happen and if it happens the user deserves it
        }
      }
    } else {
      ... the other code I had before ...
    }


  if(someCrap == null) {
    someCrap = makeNewCrap();
  }
That's more readable and less indented. It's also self documenting.

   } catch (WeirdExceptionThatWillSeriouslyNeverHappenButStupidJavaRequiresMeToHaveATryCatchClauseAnyway f) {
      // do nothing because this exception will never happen and if it happens the user deserves it
   }
Nice joke, don't do this. I don't believe it's the user's fault, and even so either print the stack trace or name the exception `ignored`. It's not that hard to pipe all ignored exceptions in a single catch.

Also, somebody made the conscious decision of making that exception checked, which means it's the implementor's fault for forcing you to catch it. If you want, you can use lombok's `@SneakyThrows`.


I just do this when I don't want to handle a checked exception.

    throw new RuntimeException(e);


> That's more readable and less indented. It's also self documenting.

It also pushes you ever closer to the dex method cap imposed by Android, unfortunately.


this is hilarious :) thanks.


My questions I ask myself every time to keep indentation to a minimum:

- Is there a way to use an early return? (In your case early return if == null) - How can I get around using an `else` ?

And I am happy this works so well. Of course there are sometimes if/elseif constructs but exclusive if/else is rare.


No, it's not hard to avoid, actually it's pretty easy to avoid if you understand what you are doing.


I like to think of plugins like this one playing a sort of "assistant" role rather than a "benevolent dictator" kind of role. So you don't have to end up making ugly hacks to just make it work.


async/await has greatly reduced the need for indentation in JavaScript


Even without async/await, if you properly organize code in bite size functional blocks and use other asynchronous constructs then the indentation should be manageable.


And Promises too!


Not if you auto format in CI.

Though that could be dangerous.

Although you could just have a build step that waits for a human if the formatted doesn't match submitted.


Promises?


Cyclomatic complexity[0] is a decent proxy for indentation level all though not all indentation increases cyclomatic complexity and not all increase in cyclomatic complexity increases indentation. However for the purpose of reducing complexity it is wonderful. The ruby community is fairly good at using Rubocop[1] to enforce number of lines in methods and cyclomatic complexity. I've found that limiting yourself to 10-20 lines per method and a cyclomatic complexity lower than 7-10 is very beneficial to code quality and perceived complexity. Those numbers obviously need to tweaked to suite the language, but work well for ruby.

Often this does involve pulling out blocks of code into a methods that's only called once, but as discussed elsewhere in the thread this is still an improvement in perceived complexity and readability.

0: https://en.wikipedia.org/wiki/Cyclomatic_complexity 1: https://github.com/bbatsov/rubocop


So looking at the example, you can avoid one level of indentation by doing e.g.:

  if (items === null) { return 0; }
at the start. Which is fine if that's the preferred style, but if it is, then you should do it regardless of how deeply nested the dependent stuff is.


Yep! That's happy path/golden path/line of sight programming, and it's awesome.

https://medium.com/@matryer/line-of-sight-in-code-186dd7cdea...


Nice, I didn't know there was actually a name for it. My 15 years of coding experience has just taught me it's the far better way to code.

It's one of my peeves to be following code paths and oh look, now we're in a situation where it's if (OK) { if (OK) { if (OK) { if (OK) { ... } } } } and I'm trying to remember what all the conditions were to get here since it's long since scrolled off the page. I always find myself thinking "why, why didn't you just return way up there!?"



Sorry to be cynical, but this sounds as about as useful as Microsoft Word syntax checkers that draw green lines under perfectly valid sentences.


I was helping someone edit a book, and that syntax checker came in very handy. There were indeed many false positives, but it provided a much smaller search space for my eyes than 'every word on every page'. (I did read every word, but the ones with the green lines under them got extra focus that was not going to be available for 200+ pages.)


I also love how the very people who can spend tens of hours on configuring vim, emacs, bash, zsh or their IDE can't be bothered to spend 5 seconds disabling the Office spellchecker if it bothers them so much :)


I'm glad that feature worked for somebody--that's definitely a different use case. For smaller documents it's more effective just to reread the doc carefully.

Spell checks on the other hand are invaluable, especially since you can extend the dictionary to cover new words like "deserialization."


Does anyone know a good plugin that will provide faint vertical lines so I can see where the indents line up (as described below)?

Someone I know used something called Cream, which they said was Vim with a simplified GUI-based interface (I know; what's the point?). But it had one nice feature: If you indented a line, then Cream would draw an unobtrusive vertical line, extending down from the indent's left-most character (I forget how far the lines extended or what that logic was, but they went far enough to be useful). You could easily see how many tabs 'in' you were and line up a current line with one (not directly) above.


I think you're looking for [IndentLine](https://github.com/Yggdroot/indentLine)

I also use the following options in my vimrc:

  " set indentline style
  let g:indentLine_char = '│'
  let g:indentLine_color_term = 66
  let g:indentLine_color_gui = '#4f5b66'

  " fix performance issue with long lines
  let g:indentLine_faster = 1


Excellent, thanks.



Gracias


Sublime Text puts in the lines natively. It's really useful.


This plugin made me laugh! Those disapproving eyes.

Might go crazy if using this while developing though.. (stop harassing me, EDITOR)


Nice idea, but I'm afraid those who are too lazy to structure their code are not likely to give Vim a chance either. A plugin for a popular IDE would be more beneficial.


I have opposite experience. People with good IDEs (idea, visual) were more likely to structure the code while people with weaker IDEs (vim, sublime) less likely. Mostly because weaker IDE does not navigate code that well and jumping from call to definition requires more manual scrolling.

In the extreme, with pure text editor, you want all details together where you effortlessly see them. Helper function means manual search and scroll.


Helper functions mean one key press to jump to their definition and another one to jump to their declaration, and one more to jump back.


Yes, when you have full blown IDE and use typed language. It is somewhat weaker when you have full blown IDE and use something like javascript (e.g. which find function from gazimillion find functions in project was it supposed to be)?

It is really weaker with sublime or vim, at least from what I have seen. They often did not found right function even when it was in the same file and there was no ambiguity - leading to harder to read code with deep nesting and loooong functions (that really should have been split into named steps).

It does not do that at all when vim user did not bothered to configure right plugins or did not bothered to learn them.


It's funny because I went from vim to intellij a year ago and recently tried out sublime text, and found my intellij code to be worse than my vim code. Mostly because intellij's auto complete and easy "go to implementation" makes up for (and can cause, if you aren't aware of the problem) a lot of quality issues - things you can't ignore in vim unless you want an codebase you won't be able to easily change.

I would gather that sublime text pushes you towards more focused, tight code, because their auto complete isn't context aware.

To your specific point, helper functions should be as close as possible to the function it is helping. If you have easy "go to function" functionality, some people will slack off on this.


I found the "tight" code collegues generated (quite possibly much different then yours) harder to read. It was pretty much impossible to read high level overview of what was supposed to happen without having to go through all the low level details. When things were split into smaller functions, I could come to unknown code and read it fast. Reading what it is supposed to do from vim written code took much more time. It was harder to mentally split it into self contained pieces that means something.

Refusal to use helper fuctions also led to a lot of code repetition in various (especially on click) handlers. When you have bugs from people forgetting to change fourth place where pretty much same thing is done, then the code is bad.

We had deadlines and I really think that everything would be way more effective if we took advantages of ide and generated code that express intent better and is composed of smaller units. Instead everything took longer and complexity was pretty much unmanaged (since they refused to use helper fuctions they never learned how to use then effectively nor how to effectively modularize code - the codebas was fine while small and increasingly unmanageable as it grew).


That is just the point watwut made. The jump to definition function in vim or sublime is not quite as good as a (correctly configured) full-blown IDE. They search for something like a definition, but do not understand the code on the same level as an IDE that has all the build information. On the other hand, IDEs force you to spend time configuring it, or there will be red squiggles under everything.


There are plugins that let have this level of understanding. Youcompleteme.

But then vim gets heavy like an ide... Pick your poison.


Never needed a plugin for this, just `:set tw=79`


Reminds me of this emacs plugin, deldo:

https://youtu.be/D1sXuHnf_lo?t=2m57s

Slightly NSWF.


Clever idea but I tend to disapprove of using things like this in practice. Formatting (or rejection of poor formatting) should be handled in CI.


If you wait for CI, you're adding time to the feedback loop. This is, imo, a bad thing since it interrupts The Flow to go back to previously written code to fix it. By the time I commit, I've already evicted that stack from my mental model; I'd rather not have to go back and rebuild it 5 minutes later when the build breaks.


So run the linter locally, either on-demand or using git hooks in addition to CI?


Which is pretty much exactly what this tool is doing...


I'd like a language that designed to have zero indentation and you wire things up graphically from those subroutines you've created.


Try LabView. It is graphical programming and you can make literal spaghetti code. It has problems, like diff and merge is very hard.


It's also good at making extremely simple things incredibly complex... but then provides some high level "functions" to do weirdly specific things.


The problem is not in the code you write, in the code you have to extend or maintain.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: