This regular expression has been replaced with a substring function. God I *wish...

meshko · on July 21, 2016

i wish people would stop using regular expressions in situations where they can be replaced with a substring function.

zamalek · on July 21, 2016

Nick explained on Reddit why the regex was used[1]:

> While I can't speak for the original motivation from many moons ago, .Trim() still doesn't trim \u200c. It's useful in most cases, but not the complete strip we need here.

This would have probably been my train of thought (assuming that I consider regex to be a valid solution):

Trim() would have been the correct solution, were it not for that behavior. Substring is therefore the correct solution. Problem is, IndexOfAny only accepts a char array (not a set of some form, i.e. HashSet). You'd need to write the <Last>IndexOfNonWhitespace methods yourself. Use a regex and make sure that it doesn't backtrack, because it's expressive and regex "is designed to solve this type of problem." The real problem/solution here isn't substring, it's finding where to substring.
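To make the "finding where to substring" step concrete, here is a minimal sketch in Python rather than .NET. The function names mirror the <Last>IndexOfNonWhitespace idea above, and the EXTRA_STRIPPABLE set is an assumption for illustration, not the character list used in the original code. Python's str.strip() also leaves \u200c alone, so the same gap shows up there.

```python
# Characters to strip in addition to what str.isspace() reports.
# This set is an assumption for illustration only.
EXTRA_STRIPPABLE = frozenset({"\u200c", "\u200b"})


def is_strippable(ch: str) -> bool:
    """True if ch is whitespace or one of the extra characters."""
    return ch.isspace() or ch in EXTRA_STRIPPABLE


def index_of_non_whitespace(s: str) -> int:
    """Index of the first character to keep, or -1 if there is none."""
    for i, ch in enumerate(s):
        if not is_strippable(ch):
            return i
    return -1


def last_index_of_non_whitespace(s: str) -> int:
    """Index of the last character to keep, or -1 if there is none."""
    for i in range(len(s) - 1, -1, -1):
        if not is_strippable(s[i]):
            return i
    return -1


def full_trim(s: str) -> str:
    """Trim both ends with a single slice; each scan is linear in len(s)."""
    start = index_of_non_whitespace(s)
    if start == -1:
        return ""
    end = last_index_of_non_whitespace(s)
    return s[start:end + 1]
```

Each scan visits a character at most once, so there is no backtracking behavior to reason about; the cost is two small helpers instead of a one-line pattern.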

I consider regex too dangerous to use in any circumstance, but I can certainly see why someone would find it attractive at first.

[1]: https://www.reddit.com/r/programming/comments/4tt6ce/stack_e...

meshko · on July 21, 2016

Oh totally. I assumed that unicode bs immediately. And anyone would make this mistake easily. That's the point -- gotta have it imprinted in the brains, that regexes are for finding things in files, not for your production code. I've used them myself, but I'd like to think that when I type that regex in I stop and think whether I will be feeding raw user inputs into it.

Scea91 · on July 21, 2016

Many times regexes are more clear and therefore less bug prone than any non regex alternative. They have their use. Even in production.

meshko · on July 22, 2016

example please

qw · on July 22, 2016

Compressing multiple forms of non-unicode whitespace to single space. Used for cleaning text from input fields that often contains unwanted characters from copy/paste.

The regexp for this is simply \s+
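As a rough sketch of that use in Python (the helper name collapse_whitespace is made up for illustration), the whole cleanup is a single substitution:

```python
import re

# Compiled once; \s+ matches one or more whitespace characters.
_WHITESPACE_RUN = re.compile(r"\s+")


def collapse_whitespace(text: str) -> str:
    """Replace every run of whitespace with a single space."""
    return _WHITESPACE_RUN.sub(" ", text)
```

For example, collapse_whitespace("a \t\n  b") gives "a b". Note that in Python 3 the \s class on str patterns also covers Unicode whitespace such as the non-breaking space, which may or may not be what you want for a given input field.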