There is an interview with the inventor of JSON somewhere. In that interview he explained why he did not allow comments in JSON like in XML. He said - if I remember correctly - that leaving comments out of JSON was intentional. The reason was that comments could be misused to add additional information for a parser. For example, in XML you could use comments, and a special parser could use these comments to generate code while parsing. He did not want that. He wanted every JSON parser to be a JSON parser and nothing more. If you wanted comments in JSON, he said, you could simply make them inline and adopt a convention for which keys are comments: for example, every key ending with _comment could have a value that is treated as a comment by the application but not by the parser.
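For instance, a minimal sketch of that convention (the key names here are hypothetical):

    {
        "timeout_comment": "timeout is measured in seconds",
        "timeout": 30
    }

The parser sees two ordinary key/value pairs; only the application decides that timeout_comment is documentation.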
You are correct - confirmed in this video: Lessons of JSON
'A recent (and short) IEEE Computing Conversations interview with Douglas Crockford about the development of JavaScript Object Notation (JSON) offers some profound, and sometimes counter-intuitive, insights into standards development on the Web.'
He both invented and discovered it. Yes, the object literal syntax existed, but he also carefully (and IMHO correctly) specified a strict subset as well, for these interoperability reasons. For instance, Javascript is happy with {a: 1}, but that is not legal JSON. It's a very well done standard.
Indeed, and I apologize for my ambiguity, as you are correct. By "strict subset" what I meant was a subset that attempts to reduce options, so that legality and illegality are easier to discern. That is, where Javascript accepts both apostrophes and double quotes to delimit strings, JSON only accepts double quotes, and is thus "stricter" than real Javascript.
You are of course correct that JSON turns out not to quite be a strict subset in the set theory sense of "strict subset", though obviously that's a bug in the spec rather than a deliberate design decision.
"I removed comments from JSON because I saw people were using them to hold parsing directives, a practice which would have destroyed interoperability." -- Crockford
This is horrific design reasoning. It's an authoritarian, presumptuous, "punish everyone in the classroom because one child misbehaves" mentality.
Comments would be useful in JSON because comments are useful in code, and JSON is code. For example, I might have a config file that I'm typing in that I want to leave a documentation trail for.
Don't tell me I can do a silly thing like redefine a field, as if it's "neat". It's an abomination that I have to resort to such things. And guess what: by resorting to such things I can still do precisely what Crockford claims he was trying to prevent. So his rationale is not only insulting to one's intelligence, it's sheer stupidity.
It's one or more people saying "This is how things are if you call them X".
> presumptuous
Presumptuous? It was in response to the feature being abused!
> "punish everyone in the classroom because one child misbehaves" mentality
No more than creating laws is. A significant subset of the population was misusing it in a way that could cause widespread damage. It is a minor inconvenience to the 'law abiding people' (particularly given that any comments would be removed if read in and spat out by any program). There are workarounds ("field_comment":"some comment") or, if that's not enough, use another format. Use one that allows comments; there are many.
> Don't tell me I can do a silly thing like redefine a field, as if it's "neat". It's an abomination that I have to resort to such things
It's also completely unreliable, it's a terrible solution and nobody should use it. I think we're fully in agreement here.
> And guess what: by resorting to such things I can still do precisely what Crockford claims he was trying to prevent. So his rationale is not only insulting to one's intelligence, it's sheer stupidity.
No you can't. The point was to stop people adding pre-processing commands or other such things to json, which would be in random formats and invisible to some parsers (as comments should be), visible and important to others. You don't want to pass a valid piece of JSON through a parser and end up with two different outcomes dependent on something in a comment, do you? Or have to use parser X or Z because Y doesn't understand directive A, but it does understand directive B and C, and while Z understands C, and X knows B, Z doesn't, so I have to use the version from a pull request from DrPotato which I think supports...
What I'm saying is that there is a benefit in simple standards.
I'm curious how the notion of XML processing instructions informs your opinion. In general I think having a standard is somewhat more important than the precise details in the standard, but XML PIs enable precisely the kind of thing Crockford feared, yet it doesn't seem to have materialized. Is this because processing instructions are not inherently harmful or because segregating them from comments disarms them?
XML PIs have a spec, don't they? (actual question) From some googling, the W3C site has this:
> PIs are not part of the document's character data, but must be passed through to the application
If they're being passed through and not being used by the parser, it's no different really than a
"directive" : "blah"
in JSON, which is fine. The application at the end needs to deal with it, but the parser doesn't, and that's really important. If it's just a comment, passing the file into and out of a program could remove the comment.
If the parser does need to understand the directive, at least there's a difference between an error of "I don't understand directive X" and no error at all because your parser ignored the comments.
JSON is data. It appears to be JS code, but JSON is data. Data is not code ( http://www.c2.com/cgi-bin/wiki?DataAndCodeAreNotTheSameThing ). That's why the idea of data holding parsing directives is silly. If you want to do that, then embed that in the data (hold a MsgType key in the data records). There's no need for comments unless you are trying to use it for something other than raw data.
> There's no need for comments unless you are trying to use it for something other than raw data.
Is this a true statement? Even books have margins, and Word docs have comments. I think it's not infrequent that pure data calls for metadata to put it into context for future users of that data.
And in computing, most "pure data" formats have had either comments, or schemas and specifications which outline the contents. The latter sure look like comments stored externally to the documents, from my perspective.
In general I do not think data is self describing, and thus must be commented on in some form to describe it.
edit for clarity: You're assuming that the application code isn't doing something with each key that it reflectively sees in the object, e.g. creating database fields to match them, or launching missiles towards those destinations, etc. If you wouldn't automatically add dummy elements to a hashmap or dictionary in Java or Python, then you shouldn't add keys in a Javascript object, unless you control the source to the program that will process the data. Even then you shouldn't, because it will become a habit to add comments this way, and that will bite you when an extra key does matter.
Lisp programmers think "data is code and code is data; both are the same thing and they're interchangeable" - which happens to actually be the case. I point to GEB [0] for a more detailed discussion, but let me give you a few examples showing that the distinction between code and data is mostly meaningless.
- Ant build descriptions that look surprisingly like executable Lisp code if you replace "<tag> ... </tag>" with "(tag ...)".
- Musical notation which is obviously code for humans playing instruments (it even has loops, I think, AFAIR from my music lessons; don't know about conditionals; if it has them, maybe it's Turing-complete? (ETA it would seem it is[3])).
- Windows Metafile format for bitmap and vector graphics which is basically a serialized list of WinAPI calls [1].
- "fa;sldjfsaldf" - the "not code, just data" example from [2] that happens to be "a Teco program that creates a new buffer, copies the old buffer and the time of day into it, searches and then selectively deletes". Oh, and it's also "a brainfuck program that does nothing, and a vi program to jump to the second "a" forwards and replace it with the string "ldjfsaldf"".
> Musical notation which is obviously code for humans playing instruments (it even has loops, I think, AFAIR from my music lessons; don't know about conditionals; if it has them, maybe it's Turing-complete?).
There are conditionals in standard music notation, at least ones that involve "executing different code" based on the value of a loop counter.
The difference is one of interpretation, not of representation; i.e. it's determined by an application, above parser level. When looking just at the written down form, data and code are the same thing.
JSON is code because I use it as code. It's not your business to tell me it's not code -- you haven't seen how I'm using it. And don't go chirping that I should only do things your way, it's none of your god damned business what I'm using it for.
Further, if JSON was really only data, then it's an incredibly stupid way to store data, given that it has a human-readable syntax that the computer can only deal with after it's been parsed. As data, it's bloated and inefficient. To the extent that JSON is a good format, it's code. To the extent that it's data, it's not a good format.
A fork can be a spoon for you, if you choose to use it that way. Nobody is telling you what you are supposed to use it for, but JSON was still designed as a data format.
If you don't like the format or feel that JSON is too restrictive/bad feel free to extend it or create your own format from scratch.
While I don't think that comments belong in JSON, I don't agree that JSON is designed as a "data and not code" format. Trees of tokens are actually the natural format for writing code (also known as Abstract Syntax Trees, ASTs), and the data/code distinction gets really, really blurry when those two meet, so it's only to be expected that people will end up coding in JSON (what are the 'build definition' files for various build tools and package managers, if not very simple programs?).
You can use a screwdriver as a hammer all you want, it's not going to make it a good idea. This isn't a free speech issue.
> Further, if JSON was really only data, then it's an incredibly stupid way to store data, given that it has a human-readable syntax that the computer can only deal with after it's been parsed. As data, it's bloated and inefficient.
So use something else. Also, a computer can only read any file after it's been parsed in some way. I'm not really sure what you're suggesting as an alternative.
> To the extent that JSON is a good format, it's code
It represents groups of more or less arbitrary tokens as trees, therefore it's a natural format for code representation as it's equivalent to an AST; it's trivial to attach a basic execution context with if and lambda defined, and now it's executable and Turing-complete.
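A toy sketch of what I mean (an entirely hypothetical mini-language, nothing standard):

    // Treat JSON arrays as applications: ["op", arg1, arg2, ...]
    function evaluate(node) {
      if (!Array.isArray(node)) return node;  // literals evaluate to themselves
      const [op, ...args] = node;
      if (op === 'if') return evaluate(args[0]) ? evaluate(args[1]) : evaluate(args[2]);
      if (op === '<')  return evaluate(args[0]) < evaluate(args[1]);
      if (op === '+')  return evaluate(args[0]) + evaluate(args[1]);
      throw new Error('unknown op: ' + op);
    }

    evaluate(JSON.parse('["if", ["<", 1, 2], "yes", "no"]'));  // -> "yes"

Add a lambda form and an environment and you're most of the way to a Lisp.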
Maybe a more useful resolution to this would be to state that while all code is data, no data should be code?
You could, if you were crazy enough, write perfectly valid JSON that passes its values to eval() or a parser or what have you. And while there are encodings in JSON that don't work in Javascript (I've broken JS innumerable times trying to get that to work), JS does of course allow you to add closures to an object, or an array, whatever you like, and some forms of valid JSON (if not all) are also valid Javascript. So you could indeed use JSON to compute things if you wanted to.
Yes, I understand that "code is data". This does not mean that data, in general, is code; unless you are willing to make the words completely meaningless. "Code" requires some notion of an execution platform/environment, which does not exist for arbitrary data. Here is a string: "the quick brown fox jumps over the lazy dog". Or how about "\u0000\u0000". That is not code, as generally understood.
> "Code" requires some notion of an execution platform/environment, which does not exist for arbitrary data.
Arbitrary data don't exist without some notion of an execution (or interpretation) platform.
We tend to use "code" as a word for "commands telling some execution process what to do" and "data" as a word for "information that is meant to be transformed" but in reality this distinction is meaningless; both are fundamentally the same thing, and even our "code" vs. "data" words have blurry borders. It's very apparent when you start reading configuration files. For example, aren't Ant "configuration files" essentially programs[0]?
We all know what we usually mean in context by saying what is "code" vs. what is "data", but one has to remember that, in fact, they are the same. Minding it leads to insights like metaprogramming; forgetting it leads to dumb languages and nasty problems, and is generally not wise.
Also I recommend watching http://www.youtube.com/watch?v=3kEfedtQVOY to learn how what would be data, as defined by formal grammars of some real-world protocols, can - by means of sloppy grammars and bad parser implementation - cross the threshold of Turing-completeness and become code.
I understand all this. Like many people, I've written programs in C++ templates. But I think we're talking past each other because you want to make a pedantic point. I'm using the words as they are generally understood, not in a technical computer science way. I'm talking about first-level stuff, not metaprogramming. Let me give you some questions to ponder:
- Is the text of Hamlet code?
- Was it code as soon as Shakespeare wrote it?
- If not, did it become code once the electronic computer was invented? Or did that happen once a version was stored in a way accessible to an electronic computer?
- Did all the existing paper copies immediately become code at that point as well?
> But I think we're talking past each other because you want to make a pedantic point.
I guess that's true.
The flavour of the "code vs. data" discussion in this thread was one of representation formats. You could argue that when looking at works of art from past centuries one should immediately say "data!" [0]. But in the case of JSON, a format suspiciously close to Lisp in structure, one needs to be careful in saying "it's for data, not for code".
Actually, I'm not sure what kind of point I'm trying to make, as the more I think of it, the more examples of borderline code/data things come to my mind. Cooking recipes are the obvious candidate, but think about e.g. music notation - it clearly feels more like "code" than "data".
I feel that you could define a kind of difference between "code" and "data" other than intent, something that could put bitmaps into the "data" category and a typical function into the "code" category, but I can't really articulate it. Maybe there's some mathematical way to describe it, but it's definitely a blurry criterion. And when we're discussing technology, I think it's harmful to pretend that there's a real difference. Between configuration files looking like half-baked Lisp listings and "declarative style" C++ that looks like datasets with superfluous curly braces, I think it's wrong to even try to draw a line.
[0] - there's a caveat though. "How to Read a Book" by Mortimer Adler[1] discusses briefly how the task of a poet is to carefully choose words that evoke particular emotional reactions in readers. It very much sounds like scripting the emotional side of the human brain.
I do not presume to know who you are, or what you have accomplished, but there are few people with the professional and academic background that would qualify them to call Douglas Crockford "stupid".
>I do not presume to know who you are, or what you have accomplished, but there are few people with the professional and academic background that would qualify them to call Douglas Crockford "stupid".
Why, who do you think Douglas Crockford is, and what is his "academic background"? He doesn't even have a related degree. Most of his JS fame he owes to his book.
Don't put words in my mouth, I didn't claim he was "stupid." To say that one thing he said somewhere is stupidity is a far cry from claiming he is stupid.
I also never said that "opinionated design is stupid".
Perhaps you could rephrase your question in such a way that you aren't presuming to speak for me.
JSON isn't a configuration language, it's just another data encoding format with the added benefit of being readable by humans. That and its ubiquity make it an appealing choice for stuff like ad-hoc configuration at first glance, but it's not the best choice. If you want a config language for shared human and machine consumption, use one designed for that purpose. JSON is pretty much just an encoding that is easy for humans to inspect and debug.
This. I've worked with a number of systems that "use json as the configuration language"; and in every case it's led to issues.
Given a choice, it's better to have a .ini-style format like the one that Python's ConfigParser will digest.
That way you can have sections and comments, and you won't be tempted to have the application write things into the configuration on its own...
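For instance, a hypothetical config in that style (comment syntax per Python's ConfigParser, which accepts # and ; prefixes; the section and key names are made up):

    # Optional proxy settings - comment the section out to disable it.
    [proxy]
    host = proxy.example.com
    port = 8080

    [database]
    host = localhost
    port = 5432

Sections and comments come for free, and there's no temptation to round-trip the file through a serializer.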
I'm sure there are counter points to what I'm about to bring up, but three observations:
1. In my experience JSON is frequently output programmatically, and taken in programmatically. Comments are not useful in these cases.
2. The only time comments could be perceived as useful then would be when parsing JSON by eye or hand. However, it is not difficult to parse JSON and understand it unless the keys have used obfuscated names. If key naming is obfuscated, comments aren't really the correct solution.
3. "An object is an unordered set of name/value pairs", as mentioned by jasonlotito and others earlier. There is no guarantee that a JSON parser will give you the right value if there are two of the same keys in the same scope.
> 3. SHOULD This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.
The consequences are undefined, I feel, for a reason. You can't put them all down on paper, it depends on what all the parsers do. The parsers can accept or reject things with duplicate keys, or they can play a nice little ditty through the speakers.
All it means is a parser isn't required to reject JSON with multiple keys. It can, however, do whatever the fuck it wants with them.
If the wording was precise, then it should be a MUST. SHOULD indicates a terrible world of unknown consequences.
I know there is a lot of JSON handling that happens behind-the-scenes, but there is also a non-trivial amount of JSON that I have manually created and/or altered, and have to share with a team.
It's a blessing and a curse, these modern NodeJS projects -- it's awesome that I can simply create/modify a .json file with a few properties, run a command, and magic happens. However, if I want to try and communicate out the intent of the values to my team of 20+, it becomes really convoluted. The projects all magically work by looking for foo.json, but if I comment that file then it breaks.
So I have to create another foo.comments.json, add another script that will remove the comments and then call the original instructions. Then I need to create additional documentation instructing the team to ignore the developer's docs regarding native use, and to run the application with our own homebrew setup.
It also can make testing a pain in the ass, because now I can no longer comment out values, I have to remove them completely. Not a huge deal, annoying nonetheless.
Right, we're suffering from people using JSON for config files just like a few years back a lot of projects suffered from using YAML for config files (though YAML was at least designed to be human editable ... ingy and I regularly disagree over whether he succeeded :).
For the past few years, I've generally been using either apache-style via http://p3rl.org/Config::General or some sort of INI derivative (git is proof that ini is good enough for a lot more things than you might expect).
For the future, ingy and I have been working on http://p3rl.org/JSONY which is basically "JSON, but with almost all of the punctuation optional where that doesn't introduce ambiguity" - currently there are perl and ruby parsers for it, javascript will hopefully be next.
Admittedly, we -haven't- got round to defining a format for comments yet, but my point is more "JSON wasn't really designed for that, let's think about something better".
Why not add an object field with the identifier a_comment:"blabla..."?
The advantage I see in this way of commenting is that the comment becomes accessible inside the program instead of being stripped off by the parser. For the human reader it's also more obvious.
Unfortunately, it's not possible to attach a comment to anything other than objects. But the same limitation applies to the OP's proposal.
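To illustrate the accessibility point, a minimal sketch (the key names are hypothetical):

    const cfg = JSON.parse('{"a_comment": "timeout is in seconds", "timeout": 30}');
    console.log(cfg.a_comment);  // "timeout is in seconds" - survives parsing

A comment stripped by the parser could never be inspected like this.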
Why have comments in code at all, then? You could always just make a variable/constant, with the added benefit that the comment becomes accessible inside the program...
But that makes no sense at all to me. I agree that using comments as metadata/directives is typically an antipattern hack, but what about for non-metadata comments? Embedding comments into code is just as ass-backwards as embedding code into comments. Neither is right.
> For the human reader it's also more obvious.
Strongly disagree here -- if I open a file that I've never worked in before, I have faith that the comments were meant specifically for me. Likewise, I assume all code in the file is not for me (on account that I'm not a compiler/interpreter/etc.).
Given that the RFC says "The names within an object SHOULD be unique", there's nothing stopping me from writing a parser that takes the first name/value pair and throws all the others on the floor. Or, even better, picks a random name/value pair when the same name appears. Both of these behaviours are allowed by the RFC, and would break this hack.
Putting comments into JSON in this way is a hack and shouldn't be used by anybody who has any interest in writing maintainable software. Relying on ambiguities in an RFC and someone saying "JSON parsers work the same way" is a good way to end up with a really obscure bug in the future.
At least in ECMA-262 5, Ch. 15.12.2, there is a NOTE: "In the case where there are duplicate name Strings within an object, lexically preceding values for the same key shall be overwritten."
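Concretely, in an engine that follows that note, the last duplicate wins:

    JSON.parse('{"value": "this string describes value", "value": 42}');
    // -> { value: 42 }

But that only pins down JSON.parse in ECMAScript implementations; JSON parsers in other languages are free to do something else.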
Assuming you mean RFC 4627, you're quoting the restrictions on what character streams can be called "JSON". The "should" means that if your names are not unique you can still call it "JSON", but you should think twice about it.
The parsing behavior for JSON is not defined at all in RFC 4627, actually. Browsers (and Node, since it's using a browser js engine) use the parsing specification in ECMA-262 edition 5 section 15.12.2.
Note that ES5 section 15.12 in general is much stricter than RFC 4627, as it explicitly points out if you read it.
This is misguided. You don't need comments in a JSON config file. Why? Because you don't use JSON for config files that need comments.
JSON is like duc(k|t) tape. It's really easy to stick two things together with it. That doesn't mean you always should. It's the simple thing that gets the job done so you can focus on what matters.
You shouldn't pick JSON for your config files and then hold it up as good design. "Look at me, I'm daring and _not using XML_!" Using JSON is crap design, but good engineering means sometimes picking something crappy and not wasting effort on things that don't matter in the end.
If your configuration files become both complicated and important enough that you need comments, then you should stop using JSON. If your duck tape job starts needing additional reinforcement, then you should probably just get rid of the duct tape and do it right.
If one of your requirements is a sufficiently trendy yet commentable config language, look into YAML. Also, gaffer tape. The white kind is easier to write on.
If crap design like JSON is the right engineering choice sometimes (and I agree that it is), that seems like an argument that adding comments in this crappy way may sometimes be the right engineering choice.
Yeah maybe you don't use JSON for config files that need comments, but that's because there's no documented way of how to put comments in JSON. The article solved the problem.
Actually, I'm 100% playing the devil's advocate here. I'll even flip-flop to prove it. Regarding the article, I doubt that every JSON parser will let this slide. To me that's an even better reason to avoid this practice.
> Regarding the article, I doubt that every JSON parser will let this slide. To me that's an even better reason to avoid this practice.
If someone uses undefined behaviour in config files for the sake of storing a comment, I reserve the right to hunt them down if I have to maintain their code.
> 3. SHOULD This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.
Salient point is that you would need to ensure that you are only using JSON parsers that tolerate duplicate names (and use the last value)
> Salient point is that you would need to ensure that you are only using JSON parsers that tolerate duplicate names (and use the last value)
To drive this home a bit more forcefully, it requires knowing the behaviour of your parser where it is marked as "undefined" in the spec.
If that isn't enough to stop you, DON'T USE JSON. A patch level change in a library could break your code in a non-obvious way and it would be your fault. If you want comments, DON'T USE JSON, JSON DOESN'T HAVE THEM.
And the big point here is that the members of the RFC group were considering breaking from the ECMAScript standard and changing the SHOULD to a MUST, which would break existing programs and the "workaround" in the article.
This hack, while nice, is still just a workaround. I highly recommend using YAML instead of JSON in as many places as you can.
JSON works great for on-the-fly communication with frontends running Javascript, or for communication between Javascript processes like Node.js servers. But for configuration files and other things that need comments, YAML is many times better, both for its clean, Markdown-reminiscent structure and its native comment support.
Node.js has a great module called js-yaml (https://github.com/nodeca/js-yaml) which automatically registers handlers for .yml and .yaml files, allowing you to require them in your Node.js code just like you can with JSON files.
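A minimal sketch of the explicit API (yaml.load is from js-yaml's documented interface; config.yml is a hypothetical path):

    const fs = require('fs');
    const yaml = require('js-yaml');

    // Parse a YAML config file into a plain Javascript object.
    const config = yaml.load(fs.readFileSync('config.yml', 'utf8'));
    console.log(config);

The require-handler registration mentioned above makes even this unnecessary: once js-yaml is loaded, require('./config.yml') just works.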
It also comes with a YAML parser for the browser side of things, so if you want you could even communicate YAML directly from the server to the client side, although frankly I don't see much advantage to sending YAML over the wire instead of JSON. (And as others have mentioned below untrusted YAML sources could insert malicious objects in YAML, so I wouldn't recommend this technique.)
That's not really a problem with JSON though is it? Anything you run through eval() is a disaster in the making. Maybe the problem is that people are trying to make data formats too powerful, and too many things seem to be creeping towards Turing completeness that don't need to be.
I think parsers for JSON, YAML, INI, etc. should be designed in such a way as to make it impossible to assign anything like an object, class, or function. Numbers, strings, and collections of numbers and strings... that's all you should get (though obviously "string" is fraught with peril). Anything more is unnecessarily complex.
It is a problem with JSON in the sense that it's a JavaScript subset 'in practice' - modulo the Unicode support that goes beyond JavaScript. So it's to be expected that eval() will be used as a convenience by developers, ignoring the security implications that come with eval() hoisting full JavaScript.
The way to have avoided the issue would have been for JSON to have a grammar that broke eval(). But one could argue the ability to pass JSON into eval() to get JavaScript is one of the reasons JSON became popular to begin with.
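For reference, the old pre-JSON.parse idiom looked like this (a sketch; jsonText is a hypothetical variable, and this is exactly what you should not do with untrusted input):

    // The outer parentheses force the braces to parse as an object
    // literal rather than a block statement.
    var data = eval('(' + jsonText + ')');

Any executable payload in jsonText runs with full page privileges, which is the security implication being described.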
YAML is easy to type, even with the whitespace. So is INI. And as verbose as XML is, it's easier, in my experience, to type than JSON. Of those four, JSON is the hardest to write by hand; it's certainly the one I make the most mistakes with, to the extent that I have a particular technique for writing it out (prefixing the commas). As a result, JSON as a config file format is tedious, verbose, and error prone; its sweet spot is as a machine interchange format that a human can debug/read if needed.
I've actually never developed anything serious in Rails. I just don't like the framework, and the performance of Rails leaves a lot to be desired in my opinion. I'm a 100% Node.js convert these days.
But I do like the Rails convention of using YAML format and have adopted that in my own code as much as possible.
Yeah, I had read about that. One more reason not to send YAML over the wire. YAML makes great sense for your internal configuration files and internal data structures where you need comments and readability. YAML is perfectly safe here because chances are you aren't going to be exploiting yourself by putting malicious objects in your YAML.
But for over the wire communication, JSON makes more sense than YAML, not only because parsing unsafe YAML from an untrusted client could cause exploits like you mentioned, but also because YAML is dependent on indentation and line breaks, and therefore makes communication with the client side much more awkward than just sending JSON to the client or receiving JSON from it.
I believe the parent was referring the many recent YAML based vulnerabilities found in Rails (and elsewhere). He is basically saying, "You can use YAML -- if you don't care about injection vulnerabilities."
In my experience, YAML is better for configuration files and human edited files. JSON is better for data and communication between computers. The features that make YAML easier to write (comments, more flexible format, less quoting) make it more complex and slower to parse.
Also, many of the security holes in YAML come from its use as a serialization format which can represent native classes. I wish the YAML parsers had more explicit support for simple data schemas which would reduce the security risk and be sufficient for most configuration files.
Ironically, YAML has object serialization features out the wazoo, while JSON is relatively spartan for that purpose. I will never understand why it happened the way it did. YAML should have been left human readable, with none of the object serialization stuff thrown in.
While on the topic of encodings (I'm a huge encodings geek), let me plug a new one we recently discovered called Space (https://github.com/nudgepad/space). It is dead simple and has the nice feature that it is extraordinarily easy for both humans and machines to read and write.
It is definitely very minimalist. Personally I have issues parsing it visually though, because the indentation of only one space makes it hard to differentiate inner data structures particularly on a large screen with small fonts. Additionally the lack of a division character other than space between the key and the value makes reading each key value pair much harder because the key and value tend to run together visually.
YAML is excellent for resource files, i.e. human editing complex data.
For -configuration- you want a simpler format; INI is worth considering, as is http://p3rl.org/JSONY which is ingy's implementation of a vision we thrashed out for a more sysadmin-friendly config format.
I agree it is a cute hack, but it is also kind of horrifying. You are depending on an undocumented behavior that happens to be shared across the ecosystem. Now what happens if that file hits a parser which takes the first instance, or a functional one that errors out when it sees multiple assignments?
It is dependent on a specific indentation format which is one thing I dislike about it. But if you configure your vim or whatever editor you use to properly indent YAML files you should have few issues with fragility.
Even with indentation problems, the time saved in not typing curly brackets, extra quotation marks, and commas, and the time saved in not having to visually parse these when reading YAML more than makes up for the occasional data structure bug caused by bad indentation.
My favourite example of dealing with undefined behaviour is this:
In practice, many C implementations recognize, for example, #pragma once as a rough equivalent of #include guards — but GCC 1.17, upon finding a #pragma directive, would instead attempt to launch commonly distributed Unix games such as NetHack and Rogue, or start Emacs running a simulation of the Towers of Hanoi.[7]
It's overwriting existing keys, which is fine IMO. When I use a map in any language and put a new value under an existing key, the expected behavior is that the previous value is overwritten.
My first thought in seeing this was that objects aren't guaranteed to maintain order: "An object is an unordered set of name/value pairs" - http://www.json.org
> it's up to the parser to keep clobbering a value every time a new value comes in for a given k
Nope, parsers are perfectly in their rights to do whatever they want with multiple keys. They could read them backwards, sort them, whatever. The behaviour in the instance of multiple keys is undefined.
> This seems like a bad idea.
It is an astonishingly bad idea. I'm concerned by it being so high on the page.
> But hey, might work well for the original author.
Depends on their parser. It's undefined behaviour according to the spec. It might work now, but I'd argue it doesn't work well, as a patch level change could bork this.
I'm not so sure. I think JSON falls back to the ECMAScript standard for specific details. The object initializer semantics seem to force a left-to-right evaluation order, in the ECMAScript spec around page 65. I'll admit my claim was unfounded when I made it, and I only went to the spec to avoid being wrong :) If I were to implement a JSON parser, I would now feel obligated to evaluate in order, due to my reading of the spec.
However, I think we wholeheartedly agree, don't rely on this behavior. It is an outright strict mode error.
Streaming parsers can't follow this assumption without becoming useless. They're either going to emit only the first instance or emit two separate events.
This sounds great until some parser uses the comment definition instead of the value. Is it defined in the spec that parsers need to use the last defined value for a key?
Since the order of an object's keys is not guaranteed, it seems like even if a parser respected the last-defined rule, you could still potentially end up with the wrong field last.
Not really defined, but since an object is defined as an unordered collection of key/value pairs, a conforming parser could probably shuffle the pairs before parsing them.
I suppose it could, but the point of the object being defined as an unordered collection is because the most straight-forward way of implementing this is through a hash table, where the order of the keys cannot be guaranteed without additional work. I'm sure they didn't consider a parser randomly permuting the lexical order of the pairs as something a sane person would do.
Can we all just agree, as a community, to add comment support to our JSON parsers? Hell, I'd do a PR on V8 if I knew C++.
It's ridiculous that I can't document notes on dependencies in my NPM package.json, or add a little reminder to my Sublime Text configuration as to why I set some value, because we're using JSON parsers that can't handle the concept of ignoring a line with a couple slashes prefixing it.
IMO - either we add comments to JSON, or we stop using it for hand-edited configuration.
> It's ridiculous that I can't document notes on dependencies in my NPM package.json, or add a little reminder to my Sublime Text configuration as to why I set some value
Crockford's rationale for not supporting comments is that people used them to add metadata to the object (e.g. type annotations), which makes it hard to consume with different parsers.
An annotation like "// @type int" above a key is the kind of thing this forbids. Most people's parsers would ignore it as a comment (if we had comments), but maybe some would treat it specially. Either everyone ignores comments in the parser (which is unlikely to last; someone will want to extend them) or nobody is allowed comments. That way everyone parses the same text.
"Trusting the community to do the right thing is better than handicapping your users."
And the "community" in question had repeatedly and grossly demonstrated itself to be unworthy of such trust.
Crockford was not hypothesizing that this might happen, he'd seen it. Repeatedly. If you want to argue against it even so, fine, but bear in mind that is what you are arguing against, real pain that real people experienced, not mere possibilities.
JSON did support comments for a time and people were starting to use it for meta data.
The problem is that JSON is not meant to be used as a configuration file format, and just because it's really good for information exchange doesn't mean it's good for configuration (and vice versa). Configuration really requires comment support, and information exchange is better off avoiding it. Two standards are needed.
Right - but that is valid syntax! Any JSON parser can understand that, and that's what he recommends doing instead. But if you do this in comments, you end up writing your own mini language to describe your annotations, and nothing else knows how to parse it. That should clearly be avoided.
If JSON had comments, then of course any JSON parser could understand those comments just as well as they can currently understand "_type": "int". What am I missing?
That because they're comments, specific JSON parsers could (and likely would) interpret processing instructions embedded in those comments to toggle behaviors on the fly. Crockford's fear (well founded, I think) was that comments would be used to "extend" JSON.
A JSON parser will always know how to represent that object; how you process it is up to you.
However, when you write:
{
// @type int this is a comment
"foo": "123"
}
and you call JSON.parse(), what would you expect to get back? You can no longer represent it as a simple object; you need some way to access the comment. How do you do that? Moreover, whose responsibility is it to process the annotation in that comment? The parser's? Should you get back an integer rather than a string for obj.foo? How would you support different types of annotation? What happens if you're using parser A and your client uses parser B? Does parser B support all the annotations that parser A supports? If you need to modify a JSON structure, e.g. decoding it, adding a property, and re-encoding, should the comments be preserved? ...
You can see that having comments introduces a whole host of other questions, ambiguity and would only make it harder for different platforms to share data. Avoiding this kind of cruft is why JSON is winning vs XML for most things these days.
In the eyes of a compliant parser (assuming JSON supported comments), it is just a comment, like "_type": "int" is just a key/value pair.
However, when using an ad hoc parser, all bets are off as to the result in both cases, not just the comment case. So regardless of comment support in JSON, the same problem appears to exist.
Sublime Text actually already supports '//' comments in its "JSON" configuration files, though it's non-standard. The comments are properly ignored, and syntax-highlighted. However, the comments (along with all other manual formatting) are lost if the file is programmatically edited, for example by changing the font size using the keyboard shortcuts.
XML sucks in large part not because of XML itself, but because people used it for everything, everywhere, in places it was highly ill-suited. Don't fuck up JSON the same way.
Funny story. JSLint[1] does not approve of this technique.
I asked Crockford to implement the duplicate check in April 2009 via email. 20 minutes later, out of nowhere, he was done implementing that check and wrote back "Please try it now."
This guy is fast. Especially nice considering we do not know each other at all.
I sent him an email once asking for the same JSLint license that he gave to IBM (you know, the one without the "do not use this for evil" clause.)
He responded that he was getting annoyed by everybody asking for this, so it was going to cost me $100K to obtain such a license.
I responded that I only asked for that license in order to annoy him (and thanks for the confirmation that it worked), because his immature license clause is annoying everybody else.
"comment":"this is a comment";
"value": 45;
"comment":"this is also a comment";
"value2": 64;
"comment":"we like overloading the comment field";
"stringval":"but these stay the same";
}
I would agree it's very hackey and probably not a good idea since the spec is liable to change. But I wouldn't be sad if the spec were changed to allow for this, or to allow for comments.
I'm biased; back in the BeOS -> Haiku days, I wanted some sort of configuration textfile that could be parsed neatly into a BArchive object (and presumably transmitted in a BMessage). XML was all the rage at the time, so I wrote for myself a sort of XML-ish format, but I never contributed it to the tree. I learned the problems with XML (should it be an attribute or an innerText?). I wanted something with a bracket notation, but JSON had not been discovered by Crockford yet; if it had been, I would have gotten more involved and tried to have it adopted.
Terrible spec-violating hack aside, the idea of the author soliciting upvotes on StackOverflow doesn't sit well with me. I'd hate for SO solutions to become diluted by answers from users who are 'marketing' for upvotes.
It is a 'hack' as discussed in the article, and I will probably never use it. JSON should be either self-explanatory or documented; I don't see any reason why you would add this unnecessary clutter to these messages.
It is already hard to read as is, and this makes it worse and more confusing. If some big service started using this, you would have to know about the 'hack'; otherwise you would have to look up what the hell is going on.
Also, this is the same information for each call and thus redundant; it makes your messages larger, when an advantage of JSON is that its messages are generally small.
This, to me, looks like an example of relying on a nondeterministic implementation. To my knowledge, the standard doesn't prescribe that parsers take the second/last of a duplicate key. As a result, this is relying on implementation-specific choices which can lead to a terrible upgrade process.
Switch to a different JSON parser: does it still work? Probably, but I wouldn't bet much on it.
If I were implementing a JSON parser, might I throw an error on a duplicate key? Maybe. Or maybe I would just print a warning?
If I were ever going to give someone advice, it would be: never do this.
You should use standard JS comments and process them out. Douglas Crockford's official answer on comments: https://plus.google.com/118095276221607585885/posts/RK8qyGVa.... Essentially, just process them out beforehand with something like JSMin; it's pretty straightforward.
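A minimal sketch of that pipeline (the regexes here are naive and hypothetical - they will mangle comment-like text inside string values, which is why Crockford points at JSMin rather than a regex):

    // Strip /* ... */ and whole-line // comments, then parse as plain JSON.
    function parseCommentedJSON(text) {
      const stripped = text
        .replace(/\/\*[\s\S]*?\*\//g, '')   // block comments
        .replace(/^\s*\/\/.*$/gm, '');      // line comments
      return JSON.parse(stripped);
    }

    parseCommentedJSON('{\n  // port for the dev server\n  "port": 8080\n}');  // -> { port: 8080 }

The important property is that comments exist only in the file on disk; by the time anything parses the document, it is plain JSON again.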
This is a horrible hack. You should use JSON-LD [1] to describe the fields of your JSON. It's a W3C standard!
Also, it's not defined in the JSON standard in which order an implementation needs to parse the JSON fields/keys. So you could end up with potentially wrong results!
This is a nice trick, but it should probably only be used in systems where the people touching the code are a limited, rarely-changing set, and where anything consuming the JSON is strictly going to treat the last defined value as the value to use. Dragons lurk elsewhere!
If I ever saw this in a project, I would remove those comments in a heartbeat. The behavior here is specific to the JSON parser. Javascript is not the entirety of programming.
This is a celebration of programmers' ability to generate unmaintainable code by exploiting implementation dependencies. People get fired for pulling this horseshit every day!
A common practice in config files is to comment out whole sections, e.g. optional proxy server settings. That sort of multi-line comment is not addressed by this hack.