Why doesn’t findstr use the standard regular expression library? (microsoft.com)
61 points by nikbackm on Dec 11, 2015 | 24 comments



> For example, Safari uses PCRE, but the PCRE copyright, licensing terms, and disclaimer do not appear in the Safari EULA or any other Safari documentation I can find.

Open Safari -> Help -> Acknowledgements -> Cmd + F -> PCRE

Bam, PCRE license found!

http://imgur.com/lFkvkFS


So basically for the same reason make still cares about the difference between 0x20 and 0x09.

Which honestly never made sense to me. I mean, we all know the story about how the original author realized having leading tabs, rather than leading whitespace, as a syntax element was a terrible idea. But I never understood why he couldn't modify make to accept the latter instead of the former -- more specifically, why "there's already an existing user base" was cited as a reason not to make this non-breaking change.


Wow, two reasons for this, one of which I didn't expect (the first).

1. Because the code is so old we don't want to have an intern figure out how to do it. (Adding a switch for a new PCRE, such as /P, is hard) Oh, and reading and understanding licenses is HARD?!?!? WTF?

2. It would break backwards-compatibility if we did it. (Well, duh).

I understand now it may not be worth it (powershell/.net is the future or whatever), but really, at no time in the 90's and 00's did you think it may be worth some time?


>Oh, and reading and understanding licenses is HARD?!?!? WTF?

Hey, man, you know what? Some people want absolutely nothing to do with legal/compliance folk in large bureaucratic companies. Not that they aren't lovely people, but that involving them in your work is more risk/work than it's personally worth.

>but really, at no time in the 90's and 00's did you think it may be worth some time?

Hearkening back to the 90's and 00's, I'd venture to guess that there may have been some amount of NIH syndrome at MS with respect to OSS, in addition to uncertainty as to how future inevitable licensing problems would be resolved in courts.


When I worked at Microsoft (disclaimer: I used to work at Microsoft) as long as you weren't breaking backward compatibility, it was medium-easy to find the PM of a project you wanted to tweak and volunteer some free evening and weekend time. I never saw a ton of people besides me do it, though; maybe that's just because you had to keep it under wraps, politically, back then.

> Oh, and reading and understanding licenses is HARD?!?!?

The official word on that, IIRC, is DO NOT ATTEMPT TO READ OR UNDERSTAND LICENSES. Send them to LCA, and LCA tends to err on the side of caution.


> Because the code is so old we don't want to have an intern figure out how to do it. (Adding a switch for a new PCRE, such as /P, is hard) Oh, and reading and understanding licenses is HARD?!?!? WTF?

The problem isn't reading and understanding the license: the problem is that you now have to provide security updates in a timely manner for code you don't know. (And relying on upstream alone here doesn't work: you might have a stability policy which means you only want to ship security fixes and not other changes, so you have to be able to separate them out and deal with the conflicts while not knowing the code.)


> did you think it may be worth some time?

So how much would you pay for it, roughly? I'm sure if we find enough people to chip in, it will happen.


Don't be so obtuse, getting a feature in a proprietary operating system when the vendor has a shitload of customers doesn't work that way.

For example, the windows command prompt has been shit forever, only after more than a decade does resize and copy/paste work properly.

Changes happen when they want and on their schedule. You can feel free to throw money in the millions at them, then you might at least have someone talk to you about their direction/plan.


Yes; screenfuls of TL;DR to explain the obvious.

The whole article is just a pretext for celebrating the monumental career achievement of the author of qgrep.


I wonder why the duplicate detector didn't go off on this. This URL is exactly the same as the one noted in the flagged comment (show hidden).

@Dang: Has HN changed the way in which dupes are detected? Also, should we be flagging comments that point to previous submissions btw? It seems a bit harsh to flag a comment in this way, although I agree that the previous post doesn't add anything to this particular discussion. Wouldn't a downvote be more appropriate?


I reposted this submission because I got a mail from [email protected] suggesting doing so, it even provided a link for doing the repost.

Reason for that was stated as: "This is part of an experiment in giving good HN submissions multiple chances at the front page"


Yes, the way duplicates are handled has changed; here[0] is dang's post on it from a few months back. And the mods are actively trying to give good posts another chance, as shown by nikbackm's comment. I like the change, but perhaps people flagging should be more aware that not everyone has seen the memo about it.

[0] - https://news.ycombinator.com/item?id=10223645


Here's a more recent comment that describes the evolution of this feature: https://news.ycombinator.com/item?id=10705926


The policy is even explained in the FAQ:

Are reposts ok?

If a story has had significant attention in the last year or so, we kill reposts as duplicates. If not, a small number of reposts is ok.

I personally think reposting after one day is acceptable, but I'd like it if there were concrete guidelines. (It gets annoying if links are reposted every 3 hours, which happens often enough.)


Given the frequency with which this discussion seems to be being rehashed, I wonder if there's a case to be made for having the UI indicate promoted reposts of the sort we're looking at here.


That would give the indication that non-promoted reposts aren't ok, but they are.


I think I noticed another duplicate submission yesterday. So yes - they may have changed this.


I honestly thought this would be talking about a startup pronounced "Findster".


"Besides, you can’t change the regular expression language accepted by a program after it has been released, because that would break all the scripts that used the old language."

Lie. In my world, this is commonly referred to as an "option".

Many command-line programs have options that disable or enable some standard to perform their operations.

And guess what program does that? Grep.

Grep option switches allow the user to choose between basic, extended, fixed-string and Perl expression engines (-G, -E, -F and -P options).

Existing scripts would break only if findstr's default regular expression syntax were changed, which is basically something that nobody would ask for.

The real reason findstr is not improved is because Bob left the company and probably took the source code with him.


> Lie. In my world, this is commonly referred to as an "option".

It seems a bit unfair to call Chen a liar on this point since the very same paragraph you quoted from ends like this: "A change in the regular expression syntax would require a new switch to opt into the new behavior."

(I do think it's pretty weird that he portrays this as a major problem. I suppose the point is that if you care about backwards compatibility your options aren't "crappy old RE syntax" and "nice new RE syntax", they're "crappy old RE syntax" and "crappy old RE syntax as default, and nice new RE syntax with an option" and that's not so appealing because (1) now maybe your program needs two regexp engines in it and (2) even a trivial inconvenience like remembering to add the option is something of a nuisance.)
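
To make the "old syntax as default, new syntax behind an option" idea concrete, here is a minimal Python sketch. It is not findstr's code or its real syntax: the names `legacy_match`, `modern_match`, `find_lines` and the `--new-regex` flag are all hypothetical, and the "legacy" engine is just a plain substring search standing in for whatever the old syntax is.

```python
import argparse
import re


def legacy_match(pattern: str, line: str) -> bool:
    # Hypothetical stand-in for the old syntax: plain substring search.
    return pattern in line


def modern_match(pattern: str, line: str) -> bool:
    # Opt-in engine: Python's built-in regular expressions.
    return re.search(pattern, line) is not None


def find_lines(pattern, lines, use_new_syntax=False):
    # The default stays on the legacy engine so existing scripts keep working.
    match = modern_match if use_new_syntax else legacy_match
    return [line for line in lines if match(pattern, line)]


def build_parser():
    # Hypothetical command line: the new engine is only used when asked for.
    parser = argparse.ArgumentParser(description="toy line search")
    parser.add_argument("pattern")
    parser.add_argument("--new-regex", action="store_true",
                        help="interpret pattern with the new regex engine")
    return parser
```

With this shape, `find_lines("a.c", lines)` behaves exactly as before the flag existed, and only callers that pass `--new-regex` (or `use_new_syntax=True`) see `.` treated as a wildcard. It also shows the cost mentioned above: the program now carries two matching engines.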


> The real reason findstr is not improved is because Bob left the company and probably took the source code with him.

No that's not true. That would mean they were doing a special binary drop into Windows just for that one tool. That would be a major no-no, because it wouldn't be serviceable if there was, say, a security bug in it, or you had to localize it to a new language.

The real reason it hasn't been updated is that if anyone ever suggested it, and it made it into milestone planning for that team, nobody felt passionate enough to argue it above the cut line. Besides, PowerShell was on the horizon, and it was going to fix everything.


> Lie. In my world, this is commonly referred to as an "option".

No lie. Read the text again without reading between lines which aren't there anyway. He says you cannot just change the language used; he doesn't say one cannot add another one and use a switch to get it, which is what you are after. And which indeed is a good option.


> The real reason findstr is not improved is because Bob left the company and probably took the source code with him.

Did you actually read the article, or just skim it to find things to complain about?

>When Bob gave the findstr project to the Resource Kit team, they got the source code, but there was no knowledge hand-off so that somebody on the Resource Kit team understood how the program worked, in case they needed to fix a bug or add a feature. Not that there was anybody on the Resource Kit team available to receive said knowledge. The Resource Kit was primarily a book, so the Resource Kit team consisted mostly of writers and editors, not programmers.


> The real reason findstr is not improved is because Bob left the company and probably took the source code with him.

I doubt that's the real cause. It's probably "good enough" and there's no real external push to change it. Likewise, modifying it would probably break existing programs, and nobody wants that on their shoulders.



