One - unit tests explain nothing. They show what the output should be for a given input, but not why, or how you get there. I'm surprised by the nonchalant claim that "unit tests explain code". Am I missing something about the meaning of the English word "explain"?
Two - so any input value outside of those in unit tests is undocumented / unspecified behavior? Documentation can contain an explanation in words, like what relation should hold between the inputs and outputs in all cases. Unit tests by their nature can only enumerate a finite number of cases.
This seems like such an obviously not great idea...
Not sure about this, but I like the way it is done in the Rust ecosystem.
In Rust, there are two types of comments. Regular ones (e.g. starting with //) and doc-comments (e.g. starting with ///). The latter will land in the generated documentation when you run cargo doc.
And now the cool thing: if you have example code in these doc comments, e.g. to explain how a feature of your library can be used, that snippet automatically becomes part of the tests by default. That means you are unlikely to forget to update these examples when your code changes, and you can use them as tests at the same time by asserting something at the end (which also communicates the outcome to the reader).
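For illustration, here's a minimal sketch of what such a doc-comment might look like (the crate name my_crate and the function itself are made up):

    /// Clamps a value into the inclusive range [min, max].
    ///
    /// ```
    /// // This example is extracted and run by `cargo test` as a doc-test.
    /// assert_eq!(my_crate::clamp_i32(15, 0, 10), 10);
    /// assert_eq!(my_crate::clamp_i32(5, 0, 10), 5);
    /// ```
    pub fn clamp_i32(value: i32, min: i32, max: i32) -> i32 {
        value.max(min).min(max)
    }

If the assertions stop holding after a refactor, cargo test fails, which is what keeps the example honest.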
IDK, they sound like they overflow the "maximum code" counter and land straight in literate-programming land. I wonder how far you could go writing your whole program as doctests spliced between commentary.
I discovered that completely by surprise when I was learning Rust. I wrote code and unit tests, wrote some documentation, and I was blown away when I saw that my documentation also ran as part of the suite. Such a magical moment.
Yeah, combining unit tests and written docs in various ways seems fine. My reading of the article was that the tests are the only documentation. Maybe that was not the intent but just a bad interpretation on my part.
Though some replies here seem to keep arguing for my interpretation, so it's not just me.
What you appear to have in mind here is the documentation of a test. Any documentation that correctly explains why it matters that the test should pass will likely tell you something about what the purpose of the unit is, how it is supposed to work, or what preconditions must be satisfied in order for it to work correctly, but the first bullet point in the article seems to be making a much stronger claim than that.
The observation that both tests and documentation may fail to explain their subject sheds no light on the question of whether (or to what extent) tests in themselves can explain the things they test.
One: Can we test the tests using a somewhat formal specification of the why?
Two: my intuition says that exhaustively specifying the intended input output pairs would only hold marginal utility compared to testing a few well selected input output pairs. It's more like attaching the corners of a sheet to the wall than gluing the whole sheet to the wall. And glue is potentially harder to remove. The sheet is n-dimensional though.
I really don't understand the "exhaustive specification" thing. How else is software supposed to work but with exhaustive specification? Is the operator + not specified exhaustively? Does your intuition tell you it is enough to give some pairs of numbers and their sums, and no need for some words that explain + computes the algebraic sum of its operands? There are an infinite number of functions of two arguments that pass through a finite number of specified points. Without the words saying what + does, it could literally do anything outside the test cases.
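To make that concrete, here's a small sketch (the function is deliberately silly): something that passes a handful of addition-style test cases yet is not addition at all.

    // Passes tests at (1, 1), (2, 2) and (2, 3), yet is not addition.
    // The test cases alone say nothing about the behaviour anywhere else.
    fn definitely_not_plus(a: i32, b: i32) -> i32 {
        match (a, b) {
            (1, 1) => 2,
            (2, 2) => 4,
            (2, 3) => 5,
            _ => 42, // unspecified by the "documentation"
        }
    }

    #[test]
    fn looks_like_addition() {
        assert_eq!(definitely_not_plus(1, 1), 2);
        assert_eq!(definitely_not_plus(2, 2), 4);
        assert_eq!(definitely_not_plus(2, 3), 5);
    }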
Of course, for + it's relatively easy to intuit what it is supposed to mean. But if I develop a "Joe's interpolation operator", do you think you'd understand it well enough from 5-10 unit tests that actually giving you the formula would add nothing? Again I find myself wondering if I'm missing some English knowledge...
Can you imagine understanding the Windows API from nothing but unit tests? I really cannot. No text to explain the concepts of process, memory protection, file system? There is absolutely no way I would get it.
The thing about Joe's interpolation operator is that Joe doesn't work here anymore but thousands of users are relying on his work and we need to change it such that as few of them scream as possible.
That's the natural habitat for code, not formally specified, but partially functioning in situ. Often the best you can do is contribute a few more test cases towards a decent spec for existing code because there just isn't time to re-architect the thing.
If you are working with code in an environment where spending time improving the specification can be made a prerequisite of whatever insane thing the stakeholders want today... Hang on to that job. For the rest of us, it's a game of which-hack-is-least-bad.
What's stopping someone from reading the code, studying it deeply, and then writing down what it does? That's what I do, but I see people struggle with it because they just want to get more tickets done.
But if you want other people to benefit from it, a good place to put it is right next to a test that will start failing as soon as the code changes in a way that no longer conforms to the spec.
Otherwise those people who just want to get more tickets done will change the code without changing the spec. Or you'll end up working on something else and they'll never even know about your document, because they're accustomed to everybody else's bad habits.
If you're going to be abnormally diligent, you might as well do so in a way that the less diligent can approach gradually: one test at a time.
I suspect we're thinking about quite different use cases for our testing code. If the input-output pairs are describing a highly technical relationship I would probably want a more rigorous testing procedure. Possibly proofs.
Most of the tests I write daily are about moving and transforming data in ways that are individually rather trivial, but when features pile up, keeping track of all the requirements is hard, so you want regression tests. But you also don't want a bunch of regression tests that are hard to change when you change requirements, which will happen. So you want a decent number of simple tests for individually simple requirements that make up a complex whole.
Unit tests are programmatic specification. I'm assuming it is in this manner that the article is referring to them as documentation, rather than as "explanations" per se.
Obviously unit tests cannot enumerate all inputs, but as a form of programmatic specification, neither do they have to.
For the case you mention where a broad relation should hold, this is a special kind of unit test strategy, which is property testing. Though admittedly other aspects of design-by-contract are also better suited here; nobody's claiming that tests are the best or only programmatic documentation strategy.
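As a rough sketch of what that looks like in practice (Rust with the proptest crate; the property and names here are only illustrative), a property test states the relation once and lets the framework generate the inputs:

    use proptest::prelude::*;

    proptest! {
        // The documented relation: reversing twice gives back the input,
        // for any generated vector, not just a few hand-picked ones.
        #[test]
        fn reverse_twice_is_identity(xs in prop::collection::vec(any::<i32>(), 0..100)) {
            let mut twice = xs.clone();
            twice.reverse();
            twice.reverse();
            prop_assert_eq!(twice, xs);
        }
    }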
Finally, there's another kind of unit testing, which is more appropriately called characterisation testing, as per M. Feathers' book on legacy code. The difference being, unit tests are for developing a feature and ensuring adherence to a spec, whereas characterisation tests are for exploring the actual behaviour of existing code (which may or may not be behaving according to the intended spec). These, then, are definitely tests as programmatic documentation.
This is one of those things that is "by philosophy", and I understand, I think, what you are saying.
I do think that tests should not explain the why; that would be leaking too much detail. But at the same time, the why is somewhat the result of the test. A test is documentation of a regression, not of how the code it tests is implemented or why.
The finite number of cases is interesting. You can definitely run single tests with a high number of inputs, which of course is still finite but perhaps closer to a possible way of ensuring validity.
…it’s pretty obvious what those float arguments are for but the “source” is just a Path. Is there an example “source” I can look at to see what sort of thing I am supposed to pass there?
Well you could document that abstractly in the function (“your source must be a directory available via NFS to all devs as well as the build infra”) but you could also use the function in a test and describe it there, and let that be the “living documentation” of which the original author speaks.
Obviously if this is a top level function in some open source library with a readthedocs page then it’s good to actually document the function and have a test. If it’s just some internal thing though then doc-rot can be more harmful than no docs at all, so the best docs are therefore verified, living docs: the tests.
(…or make your source an enumeration type so you don’t even need the docs!)
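A rough sketch of that last idea (the variant names are invented), where the type itself documents which sources are valid:

    use std::path::PathBuf;

    // Instead of `source: &Path` plus a prose warning about which paths
    // are acceptable, the enum enumerates the valid kinds of source.
    enum Source {
        /// A directory available via NFS to all devs and the build infra.
        SharedNfs(PathBuf),
        /// A local checkout; only meaningful for dev builds.
        LocalCheckout(PathBuf),
    }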
Oftentimes, matching input to output is already enough of a clue for the reader to understand the purpose of the functionality being tested. Often the code being tested is hard to use at all unless an example input is shown to the user. This is often my reason to read the unit tests: the function takes data of a shape that's very loosely defined, and no matter how I arrange those data, the code breaks with unhelpful error messages. That is, of course, also a problem with the function I'm trying to run, but usually fixing it is outside my area of responsibility; my task is to just "make it work", not fix it. So, in this sense, a unit test is a perfectly good tool to do the explanation.
Unit tests show expected input and output of a function.
In code that was written without tests, inputs and outputs end up far more spread out than you might expect. One of the function inputs might be a struct with 5 members, but the function itself only uses one of those members, 50 lines in. If it's OOP, one of the inputs might be a member variable that's set elsewhere, and the same goes for the outputs.
A unit test shows the reader what information is needed for the function and what it produces, without having to read the full implementation.
Also, when you write it, you end up discovering things like what I mentioned above, and then you end up refactoring it to make more sense.
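A hedged sketch of that situation (all names invented): the signature suggests the whole struct matters, but writing the test makes it obvious that only one field does.

    struct RequestConfig {
        timeout_ms: u64,
        retries: u32,
        base_url: String,
        verbose: bool,
        user_agent: String,
    }

    // Buried deep in a long body, only `timeout_ms` is actually read.
    fn effective_timeout(cfg: &RequestConfig) -> u64 {
        cfg.timeout_ms.max(100)
    }

    #[test]
    fn timeout_has_a_floor() {
        // Constructing the input for the test surfaces the real dependency:
        // everything except timeout_ms is noise.
        let cfg = RequestConfig {
            timeout_ms: 10,
            retries: 0,
            base_url: String::new(),
            verbose: false,
            user_agent: String::new(),
        };
        assert_eq!(effective_timeout(&cfg), 100);
    }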
Often, tests are parameterized over lists of cases such that you can document the general case near the code and document the specific cases near each parameter. I've even seen test frameworks that consume an excel spreadsheet provided by product so that the test results are literally a function of the requirements.
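For example (plain Rust, and the slugify function under test is hypothetical), a table of cases keeps each specific requirement next to its expected output:

    #[test]
    fn slug_cases() {
        // (input, expected): the general rule is documented near the code,
        // each row documents one specific requirement.
        let cases = [
            ("Hello World", "hello-world"),
            ("  trim me  ", "trim-me"),
        ];
        for (input, expected) in cases {
            assert_eq!(slugify(input), expected, "case: {input:?}");
        }
    }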
Would we prefer better docs than some comments sprinkled in strategic places in test files? Yes. Is having them with the tests maybe the best we can do for a certain level of effort? Maybe.
If the alternative is an entirely standalone repository of docs which will probably not be up to date, I'll take the comments near the tests. (Although I don't think this approach lends itself to unit tests.)
In some cases, unit tests can both test and specify the semantics of the system being tested. My favorite example is the ReadP parsing library for Haskell. The source code ends with a short and automatically testable specification of the semantics of the combinators that make up the library. So, in this example, the tests tell you almost everything you need to know about the library.
One - Why do you care how you got there? You need to read the code for that. But the tests do explain/document how you can expect the code to work. If the code is unreadable, well, that sucks. But you at least have a programmatic (and hopefully annotated) description of how the code is expected to work, so you have a stable base for rewriting it to be clearer.
Two - Ever heard of code coverage? Type systems/type checkers? Also, there's nothing precluding you from using assertions in the test that make any assumed relations explicit before you actually test anything.
I think the line of thought behind the article is making the tests be like a "living spec". Well written tests (especially those using things like QuickCheck, aka "property testing") will cover more than simply a few edge cases. I don't think many developers know how to write good test cases like this, though, so it becomes a perilous proposition.
Do note TFA doesn't suggest replacing all other forms of documentation with just tests.
At least sometimes, it really helps for a test to say WHY it is done that way. I had a case where I needed to change some existing code, and all the unit tests passed but one. The author was unavailable. It was very unclear whether I should change the test. I asked around. I was about to commit the changes to the code and test when someone came back from vacation and helpfully explained. I hope I added a useful comment.
Yes, actually. Sometimes the edge cases that aren't covered by unit tests are undefined behavior. I don't recommend doing this frequently but sometimes it's hard to know the best way to handle weird edge cases until you gather more use cases so deliberately not writing a test for such things is a legit strategy IMO. You should probably also add to the method doc comment that invoking with X is not well defined.