> If a character is printable and doesn't conflict with symbols used in the language, how would you decide if it should be allowed?
I don't know what should be allowed or not, hence this question. I've only ever programmed in English. I have written tons of localization/Unicode software but the code itself was English. I can imagine there being lots of non-web programming languages in any number of natural languages but my question pertained to the web. If the code that runs the client end of a website must be in English, why allow it to accept non-ASCII characters in variable names? I guess what you're saying is, why not?
Yes, I will even try to give an argument for "why yes".
I see no compelling reason why the code of a website, which may consist of HTML, JavaScript and CSS, and perhaps server-side technologies like PHP, should generally use only words from the English language. For example, it doesn't matter at all whether you use English words to name CSS classes. class="content" looks as nice to the browser as class="inhalt", which is the German word for "content". Some people who are not native speakers of English will prefer variable names that they can understand more easily. Allowing characters beyond ASCII could be a step toward making the world easier for these developers. After all, the web was made for people who want to share content, and that should be made as easy as possible.
If JavaScript source code is delivered to a browser, the only thing that matters is that the script works correctly. The details of the variable names used in the script are only of interest to people who read and maintain the source code. Of course, in an (international) open-source project, code written in a language that is unknown or exotic to most developers would exclude many of them from participating in the project or from using the code in their own projects. But for other websites or projects, a developer might like to pick whatever language makes his life easier.
Maybe a variable called "counter" will be understood by more developers around the world than a variable called "zaehler". But those who understand "zaehler" might even prefer to write it as "zähler"; perhaps they even have the character "ä" on their keyboard.
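To make that concrete, here is a minimal sketch of what such code could look like. The names zähleBis, grenze and zähler are made up for illustration (roughly "countUpTo", "limit" and "counter"); the only assumption is a JavaScript engine that accepts Unicode letters in identifiers, which current browsers and Node.js do.

```javascript
// Hypothetical example: German identifiers, including the non-ASCII letter "ä".
// The keywords (function, let, while, return) stay English; only the names change.
function zähleBis(grenze) {
  let zähler = 0; // "zähler" is German for "counter"
  while (zähler < grenze) {
    zähler += 1;
  }
  return zähler;
}

const ergebnis = zähleBis(3); // "ergebnis" is German for "result"
console.log(ergebnis);
```

To the browser this is no different from the same function written with "counter" and "limit"; the choice only affects the people reading the source.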
Large parts of programs appear to be written in English because the programming language itself uses many English words, and not only on the web. For example, we may have "while" and "for" loops, "if" conditions and things like "unless". This could give the impression that only one human language is used in programming (for the web or elsewhere).
If the support of Unicode characters later turns out to be exploitable for malicious purposes, ... well, this may happen with any technology we use for creating the web. The only way to absolute security, it seems to me, is to use and do nothing at all. Then there is no web - boring.