We are stuck with egrep and fgrep (unless you like beating people) (utcc.utoronto.ca)
310 points by GrumpySloth on Oct 13, 2022 | 342 comments



There seems to be a lot of confusion in this discussion about what the change is and how egrep and fgrep work. These are not symlinks like some have suggested but rather shell scripts. You can see the exact commit diff here:

https://git.savannah.gnu.org/gitweb/?p=grep.git;a=blobdiff;f...

I remember around ~10 years ago being told "you should never use `egrep` because it is slower than `grep -E`", precisely because the former requires an extra fork() compared to the latter. However I'd counter that advice by saying "if you're concerned about the performance of fork() then why are you running your code as a shell script to begin with?"... and I stand by that assessment now. In fact it would probably take me longer to type `-E` than it would for any modern system to run fork() (and I'm sure as hell not planning on using `grep` inside any hot loops!)

I think what will likely happen here is that distro maintainers will either remove that warning themselves or take ownership of the `egrep` and `fgrep` shell scripts. I'd be surprised if that warning made its way to the mainstream distros. I also wouldn't be surprised if the GNU grep maintainers did a u-turn on this change -- though it has already been committed to git for more than a year now and this is the first I've heard people complain about the change.


When it comes to avoiding forks, the question is why egrep and fgrep have to be shell scripts rather than having grep interpret the invocation name (argv[0]) to switch behavior. Also, I find the original POSIX decision to lump everything into a single grep binary questionable: egrep and fgrep developed as an extension and a restriction, respectively, of grep under a classic-regexp discourse, and it isn't clear at all that the automaton construction needed for full egrep support has to live in the same binary as fgrep, which doesn't use automaton construction at all.


The GNU Coding Standards have this to say: https://www.gnu.org/prep/standards/standards.html#index-beha...

Please don’t make the behavior of a utility depend on the name used to invoke it. It is useful sometimes to make a link to a utility with a different name, and that should not change what it does.

The next section provides some reasoning:

Providing valid information in argv[0] is a convention, not guaranteed. Well-behaved programs that launch other programs, such as shells, follow the convention; your code should follow it too, when launching other programs. But it is always possible to launch the program and give a nonsensical value in argv[0].


Thanks for the link, but I find that reasoning entirely unconvincing:

The ability to interpret argv[0] as a way to avoid an exec is traded away for hypothetical linking - but wherever linking is used (say, for a mail filter or some other configurable executable), that's exactly the place where a wrapper shell script could be used instead. And guarding against improper launching of egrep/fgrep from a program that doesn't bother to set argv[0] properly is also an (unlikely) non-use case that's better handled by fixing the offending program.


I find the reasoning convincing. It's the principle of least surprise, and a bunch of other principles. I don't want a program to care what its executable name is, or whether it's being invoked with a relative path, an absolute path, through a symlink or hardlink or anything else.

Of course, there are exceptions to the rule, like busybox. The point is to have uniform, predictable behavior across the ecosystem and not have every other program have its own weird idiosyncrasies.

That being said, the egrep/fgrep legacy is precisely such a case where it makes sense to make an exception. It's a decades-old legacy and grep isn't just any program, it's part of Using the Shell 101.


What makes the reasoning unconvincing for me is that it's simply wrong: argv[0] is not the name of the program being invoked. If it were, I would agree, but it's not. Rather, it's simply another argument whose default value is the name of the program being invoked, but you can pass something else in place of it if you want. Moreover, what I find surprising is seeing a knob that I can adjust, but which doesn't do anything. It's like having an argv[1] parameter that is always ignored. It's much better to make it do something that logically corresponds to its value, and the same goes when the index is 0.
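The "it's just another argument" point can be demonstrated from any POSIX shell: with `sh -c`, the operand after the command string becomes `$0`, entirely at the caller's discretion (bash additionally offers `exec -a` to set the argv[0] of an exec'd binary):

```shell
# $0 here is whatever the caller passes; it has nothing to do with
# which binary actually ran (the path below is deliberately made up).
sh -c 'echo "invoked as: $0"' /made/up/name
# prints: invoked as: /made/up/name
```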


> argv[0] is not the name of the program being invoked [...] it's simply another argument whose default value is the name of the program being invoked

That's your opinion. Another opinion would be that it's an interface contract with the calling program to pass the path to the program being invoked in argv[0]. This contract has been established by common practice and a corresponding expectation by most programs, even if it wasn't formally specified.

I'm not necessarily taking that position, but I can see convincing reasons to do so, and as such the reasoning in the article is also convincing.


That's not an opinion, that's a fact. The only thing that's an opinion in what I said is what the final decision should be, not the fact that it's a mutable argument. Whereas their stance is based on false premises to begin with. If their opinion was "we realize arg0 can be set explicitly but too many people wrongly assume it can't, so we will follow the crowd" that would make their argument more compelling.


The fact that it can physically be set to anything doesn't mean there isn't an implicit contract.


This isn't just "I can set it to something else if I go out of my way", this is "systems provide well-documented and standardized methods for setting this parameter to something else, and there are some widely used programs using precisely this feature." It's not common but that's exactly what default parameters are for: uncommon-yet-valid use cases. Yet you seem to treat your opinion that this is an implicit contract as somehow sufficient for establishing that it's a contract, despite what I just mentioned indicating otherwise. What contract can you point to that mandates argv[0] be set to the program name?


Established common practice.

The egrep use case, btw., wouldn't deviate from that. In fact, it would rely on argv[0] being truthfully set to the sym/hard link's name.


Not every established common practice implies a contract to follow that practice. It also matters what the reason for that practice is, and whom it's even relevant to. Like just because most people take the highway when driving from SF to LA that doesn't mean you're breaching some sort of contract by opting to take a side road.

There are two practices here. One is passing the program name as argv[0]. The other is never making any use of argv[0]. Both are established because they're the most convenient things to do by default, and because people rarely have a reason to deviate from them. That's it. Heck, if there were some contract to ignore argv[0], then people wouldn't feel any contractual obligation to pass the program name to begin with, since it should be getting ignored anyway - that argument is pretty self-defeating. Moreover, I think your position is effectively equivalent to saying "if you don't want a common practice to become a contract, then you must go out of your way and inconvenience yourself to deviate from that practice for absolutely no other reason than to make this very statement true", which is a rather bizarre (and inefficient) expectation of everyone around you.


Different behavior based on argv[0] was first brought to my attention when I discovered that /bin/sh was a symlink on some Linux systems. Bash has a Bourne shell compatibility mode.


Busybox is the forefront example of this in my mind. Busybox is a single binary that provides a ton of POSIX utilities (including sh, awk, ls, cp, nc, rm, gzip, and about 300 other common ones), and these can be invoked with `busybox $command`, or a symlink may be made from the command name to the `busybox` binary to automatically do the same. Many embedded Linux systems just use busybox and a ton of symlinks to act as the majority of the userland.


GNU does it their way, and Busybox does it their own way too. Both have valid reasons for how things are set up. For user friendliness, it's important that each is as consistent as possible about it.


Bash actually has a POSIX compatibility mode.

There is a great deal in POSIX that was not in Bourne, native arithmetic expressions being the first to come to mind, then new-style command substitution.

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V...

All of the POSIX utility standards, including the grep variants, can be found in the URL's parent directory:

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/

If this new GNU grep functionality becomes widely distasteful, I think that OpenBSD's grep tries to emulate GNU in a few ways, and could supplant it, becoming the system grep while "gnugrep" is relegated to the shadows.

It is also extremely expensive on Windows for a version of grep to be implemented as a shell script. Launched by xargs, the penalty will be severe.

The commercial platforms based upon GNU are in a marriage of convenience, and can easily pick and choose.


> Bash actually has a POSIX compatibility mode.

It's partial. A short example is `date&>FILE`.

[edited typo]

On a POSIX shell, `date` will run in the background and the date will be written to stdout (`date&`) and `FILE` will be created or truncated (`>FILE`). Using `bash --posix`, the date will be written to `FILE`, since the incompatible bashism `&>` still takes priority.


I agree that is ambiguous.

If I were writing such a script where I wanted to launch a background process, and then create/truncate a file, I myself would separate them:

  date &
  >FILE
Bash wouldn't mistake that, but a lot of shell scripts look like line noise and there are situations where bad form is required (quoted in an ssh, for example).

Obviously, people wanting the bash functionality in POSIX would:

  date > file 2>&1
The bash man page discourages the latter form, but anyone interested in portability knows how profoundly bad that advice is.
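For reference, the portable spelling behaves identically under any POSIX sh, bash included; a quick sanity check (the filename is arbitrary):

```shell
# Redirect both stdout and stderr portably: the 2>&1 must come after
# the stdout redirection, a common ordering mistake.
ls /nonexistent/path > out.txt 2>&1 || true
# out.txt now holds the error message; nothing reached the terminal.
```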


Yikes. This is why I do not use bash. Too complicated.

For Linux, I added tab autocomplete to dash and use that for both interactive and non-interactive shell instead of bash.

Saves keystrokes and bytes. No need to keep typing "#!/bin/sh" into every script.


You can compile dash with a line-editing library, but most distro packages are not built that way.


I use libedit. I am used to NetBSD sh; dash is similar, derived from the same source. I wanted dash to behave more like NetBSD sh, so I changed it. Anyway, I am not a huge fan of "packages". (The problems described in this thread are one reason why.) Of course I use packaged binaries in some instances, but normally I only install them when needed, use them and then uninstall them when I am done. I prefer to compile the kernel and userland myself. That includes the shell. Simply compiling dash with libedit does not provide tabcompletion. Need to make some small changes.


I just checked OpenBSD, and I find that there are 6 total links in /usr/bin: egrep, fgrep, grep, zegrep, zfgrep, and zgrep.

The OpenBSD package is superior.

The GNU gzip package includes a zgrep shell script that is adapted from work by "Charles Levert <[email protected]>" - this is similarly adapted for bzgrep and xzgrep.

The OpenBSD implementation will have superior performance for zgrep, because it is native C.

  rebel$ ls -li /usr/bin/*grep 
  466612 -r-xr-xr-x  6 root  bin  31520 Apr 11  2022 /usr/bin/egrep
  466612 -r-xr-xr-x  6 root  bin  31520 Apr 11  2022 /usr/bin/fgrep
  466612 -r-xr-xr-x  6 root  bin  31520 Apr 11  2022 /usr/bin/grep
  466711 -r-xr-xr-x  2 root  bin  15288 Apr 11  2022 /usr/bin/pgrep
  466612 -r-xr-xr-x  6 root  bin  31520 Apr 11  2022 /usr/bin/zegrep
  466612 -r-xr-xr-x  6 root  bin  31520 Apr 11  2022 /usr/bin/zfgrep
  466612 -r-xr-xr-x  6 root  bin  31520 Apr 11  2022 /usr/bin/zgrep


> But it is always possible to launch the program and give a nonsensical value in argv[0].

Well, if you're being a smart@ss and doing that, don't expect it not to break later.

Putting a weird value there and hoping it won't break is as stupid as calling the program with wrong options and hoping it will work.


Exactly.


the name of an executable is like a global variable the instant it is used as data; it can be changed at any time by someone else and then behavior changes.

names are not data. don't make them data.

names are names.


Using argv[0] for dispatch has worked for 4+ decades. There is zero non-contrived evidence that it ever fails to work.

Further, there's 4+ decades of legacy here, and it's very rude to break backwards compatibility with something like this where there's no need, no security vulnerability that must be fixed by breaking backwards compatibility.


I'm not saying it doesn't work, I'm saying that relying on it being a certain value, or one of a set of values, is like relying on the value of a global variable.

we avoid that in software development-land whenever possible, and for good reason.


Your point is moot given that this works and has for decades, and you concede that point.

> we avoid that in software development-land whenever possible, and for good reason.

We also avoid breaking backwards compatibility unless it's really necessary. In this case, granting your premise for the sake of argument, the harm caused by this change exceeds the actual non-harm of the thing you're objecting to regardless of the theoretical badness of that thing.


So if you symlink grep to something else then you have to use `-E` for that; for the 99.999999% use case of the rest of the world, egrep being a hardlink to grep would have worked just fine.


> When it comes to avoiding forks, the question is why do egrep, fgrep have to be shell scripts rather than interpret the invocation name (argv[0]) for switching behavior.

I don't know. I'd have taken the argv[0] approach personally. And it's the approach some other grep implementations have taken too (eg https://github.com/freebsd/freebsd-src/blob/main/usr.bin/gre...).

The argv[0] approach (to me) feels like a right kind of compromise.


Switching behavior according to argv[0] has been used in Unix for decades. mv(1)/ln(1)/cp(1) historically were the same program, hardlinked.


Wasn’t rm also the same?

I vaguely recall it being little more than ‘mv $1 /dev/null’ under the hood. But this might be something I’m misremembering.


Yes. I made a mistake, I think cp(1) was separate.


> I'd be surprised if that warning made its way to the mainstream distros.

This is the first time I'm hearing about the warning, but I thought I'd run `egrep` and `fgrep` on my Arch system. Sure enough, both result in warnings: `egrep: warning: egrep/fgrep is obsolescent; using grep -E/-F`

So it's made its way into Arch, at least. Though, like I said, this is the first time I'm hearing about this.


I use arch, and I had an alias for grep to expand to `egrep --color=always` in my interactive shell. I noticed real fast. Fortunately it's an easy fix to `grep -E --color=always`.

sidenote: I wish that grep would allow running `grep -E -F`, and have the `-F` flag override the `-E` flag, rather than giving an error about conflicting options, so that I could have an alias like this to make extended regex the default, but allow me to change it with a flag.


You could write a little function that does that
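For instance, a rough sketch of such a function (illustrative only; it scans its own arguments and doesn't handle bundled flags like `-Fi`):

```shell
# Default to extended regexps, but drop the -E when the caller passes
# -F explicitly, so the two options never conflict.
grep() {
    mode=-E
    for a in "$@"; do
        if [ "$a" = "-F" ]; then mode=""; fi
    done
    # $mode is left unquoted on purpose so it vanishes when empty.
    command grep $mode --color=always "$@"
}
```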


For me, that's the biggest takeaway:

Nothing is stopping users from maintaining the behavior they had previously.

It's being made explicit that the burden of maintaining that behavior is changing.


What exactly do you think the burden of _not_ changing a shell script is?


Yeah, this is a good point. This is basically just GNU saying that those old aliases are not something they want within the scope of grep itself.


I too was surprised to see these were shell scripts. I was expecting the grep/fgrep/egrep names to be hard links to the same executable that would check `argv[0]`, as the BSD implementation does.

More interesting than the commit diff is the brief discussion on the bug report from when this all happened a year ago, as referenced in the commit message: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=49996

Sigh. Time to retrain the old fingers again...


Thanks for sharing that. It was an interesting read. Particularly this comment:

> The irony... that one of our own tests used fgrep!

If ever there was an argument for fgrep becoming a pseudo-standard that should remain, that would have been it.


> though it has already been committed to git for more than a year now and this is the first I've heard people complain about the change.

This reasoning is flawed. I only noticed it a few months back on Arch. So between the time it takes the project to make a new release and it hitting Linux distros (Debian is slower and Ubuntu even more so), don't expect a flood of complaints to be time-correlated with the change.


> This reasoning is flawed.

I wasn't making any reasoning. I was just making the observation that this change has been staged for a while now and that I'm surprised there hasn't been more noise about it before now.

> don’t expect a flood of complaints to be time correlated with the change.

I expect complaints to be correlated with whenever a blog/tweet/whatever moaning about the change happens to trend. I don't think it has much to do with when distros pick up the change because, as I also said, I expect distros to backport the old behavior. So I think the timing of any backlash is entirely dependent on the mood of the internet hive mind.


> I was just making the observation that this change has been staged for a while now

“There’s no point in acting surprised about it. All the planning charts and demolition orders have been on display at your local planning department in Alpha Centauri for 50 of your Earth years, so you’ve had plenty of time to lodge any formal complaint and it’s far too late to start making a fuss about it now. … What do you mean you’ve never been to Alpha Centauri? Oh, for heaven’s sake, mankind, it’s only four light years away, you know. I’m sorry, but if you can’t be bothered to take an interest in local affairs, that’s your own lookout. Energize the demolition beams.”

Again, you're completely missing the point of the comment you're replying to: average users don't pay attention to what's being staged upstream. You're only going to get the real flood of complaints when it actually gets pushed out and people's houses start getting demolished.


> Again, you're completely missing the point of the comment you're replying to

Actually you and the GP are the ones missing my point by obsessing over a throwaway comment about this commit being over a year old. I was literally making zero conclusions from that observation. You guys are reading far far far too much into that comment. You seem to be projecting your annoyance about this change onto me as if I’m defending and supporting this change, yet literally nothing I’ve posted has supported that claim.

> average users don't pay attention to what's being staged upstream.

I’m going to assume that you skipped over my point about how any outrage will come from blog/Twitter/etc posts going viral.

This has already landed on some distros and most people, rightly or wrongly, went "meh". If the last 30 years of the internet have taught me anything, it's that people get outraged by posts, not by software. And the fact that you're arguing over a meta-point like when people will get annoyed, rather than discussing the technology itself, really just confirms my 3 decades of observations.

In fact the only reason we are discussing this now is because someone blogged about it and it hasn’t even hit the distro they’re using; they know about it because they read another news article who found out about it from the release notes posted on the mailing list and the authors then went back and checked the commits! Literally nothing in that chain of discovery was via experiencing the change itself in their chosen distros.

So to be clear:

I am NOT suggesting that the commit being > 1 year old means GNU have a free pass to make a breaking change. Any conclusion like that you derive from my posts is a misinterpretation and not worth arguing over.

Now, can we move on to more interesting things?


The maintainers' fears about using argv[0] to select behavior are unfounded.

Anyone exec()'ing grep with an alternate argv[0] can just use (or not use) the -E/-F options to get the behavior they want.

It's very simple. There is just no good reason for egrep or fgrep to be shell scripts or wrapper programs of any kind. Even if there were a good reason for having a wrapper program, the wrapper program could be a very tiny C (or even assembly) program that just exec()s grep with -E or -F added, thus avoiding any additional fork() calls (though there would still be one additional exec() call).

But, really, argv[0] dispatching is plenty good enough.

This warning, however, is much too annoying.

Plus, argv[0] dispatching has worked for every other Unix and Unix-like OS of note for 4+ decades! What's so special about GNU in this regard? The answer is clearly: nothing.


What's the problem? Just make egrep and fgrep shell aliases in the default /etc/profile (if you're a distro maintainer).

Also, where's the extra fork?

    #!/bin/sh
    exec grep -E "$@"


the overhead isn't even an extra fork, as it uses exec


In most contexts involving "using grep" (shell scripts, interactive shells, etc.), the overhead is nothing.

81 vs. 128 system calls on my laptop (just printing --version):

    $ strace -fc grep -E --version 2>&1 | tail -1
    100.00    0.000000           0        81         3 total
    $ strace -fc egrep --version 2>&1 | tail -1
    100.00    0.000000           0       128         6 total
Measured with `time`: 0m0.002s vs. 0m0.003s, consistently, over 4 runs of each option.

In most situations where it's used, there are probably much bigger optimizations to work on anyway.


true. using hyperfine on my machine shows that grep is about 2x faster than egrep (and yeah, still nothing, 1.5 vs. 3ms)


Good point. Though invoking $SHELL and parsing the script (as short as it is) can't be cheap either. Academically speaking of course; I'm not trying to justify the change on performance grounds.


> I remember around ~10 years ago being told "you should never use `egrep` because it is slower than `grep -E`."

I remember around 30 years ago reading that egrep was faster (the UNIX versions were entirely separate implementations, egrep of course being newer) and have used it ever since.


exactly the same here.

although i'm using ripgrep more now, proving that you can teach an old dog new tricks.


I am not sure I agree with the logic. On an absolute instruction-count scale, grep itself will often dwarf the instructions needed for the higher-level logic captured in the shell script. Insert appeal to Amdahl's law here. Writing in some more optimal language often requires considerably more work, and if the marginal benefit of eliminating a few forks is minuscule compared to grep's own runtime, why would you do it?


It depends on where your hot path is. If you're writing the kind of performance-critical code where the difference between `egrep` and `grep -E` matters, then you shouldn't be writing it in $SHELL to begin with. Not just because of the expense of fork()ing, but because Bash is interpreted, because each exec requires loading new code into your L1 cache, and so on. It's just never going to be as quick as something written in most other languages with a mature regex library.

I’m not saying this as a criticism against shell scripting (I’m an author of an alternative $SHELL and shell scripting language, so do have a love for shell scripting). I’m saying it as a criticism against people who try to optimise against the wrong things.

I guess in a way we are “violently agreeing” (as some say) because your point about the engineering effort mattering more than micro-optimisations is an example of the right kind of optimisations. Which is what I’m also, albeit badly, trying to describe.


> "if concerned with performance why are you running code as a shell script"

Rewrite grep in python before running - got it. :P


That's not what I suggested. I was saying if you're writing performance sensitive code with a hot loop calling `egrep` then a smarter approach might be to use a language better tuned for performance which supports regex libraries.

Shell scripts have their place and can outperform most languages if you're writing simple logic that can run across large datasets in parallel, such as

  cat very-large-file | grep "do not want" | sed -r 's/foo/bar/' > very-large-file-refactored
(please excuse the useless use of `cat`, it's there to illustrate the direction of data flow)

But in those types of scenarios the cost of $SHELL interpretation and fork() is massively outweighed by the savings of stream processing.

So my point was: if you're writing a function which $SHELL interpretation and/or fork() create enough of a performance impact where you're looking to optimize how you exec `grep -E`, then maybe it's time to investigate whether $SHELL is the right language to write your function in.


> please excuse the useless use of `cat`, it's there to illustrate the direction of data flow

Note that you can write

    < very-large-file
Instead of

    cat very-large-file |
To avoid a useless use of cat and keep the direction. Not that I care very much though. Using cat is less surprising to most people.


> Using cat is less surprising to most people.

Exactly :)

STDIN redirection is a neat trick though so definitely worth highlighting. But it wouldn’t have helped with readability in my particular example.


Perhaps I should have added a "/s", in addition to my ":p", but I appreciate the explanation here nonetheless


Why wouldn't they just convert them to aliases instead of adding the warnings?


aliases aren't global state. They're shell- and profile-specific (and there are plenty of instances where a profile isn't loaded).

With `egrep` defined as a shell script, any process (even callers who are not shells) will have the same behavior.

That doesn't mean the shell script is the best solution though. Personally I'd rather they be a symlink to `grep` and have grep check the name it's been invoked as, and if it's "egrep" (or "e" if you're just checking the first byte) then implicitly enable the -E flag. This is how FreeBSD works and it feels like a more elegant solution in my opinion.


Sure, makes sense- so then I guess I'd amend my question to that: I can't see why they'd add the warning instead of just redirecting the command


wow. Didn't realize they are shell scripts on linux.

on (an older) macos system, they are identical executables


I use fgrep all the time. Why should I spend time converting scripts, and retraining my fingers, and forever after taking longer to type grep -F? This is a breaking change that has no upside, only downside. It is idiocy.

Just to elaborate, there must be many thousands (hundreds of thousands? millions?) of shell scripts out there that use fgrep. For no reason at all (whatever maintenance burden fgrep represents must be far outweighed by the time wasted debating this issue), these scripts will no longer work. There is no guarantee that defining an alias somewhere is going to fix all your usages of fgrep. And even if it did, the cost for every single installation that is affected of someone realizing there is a problem, finding the source of the problem, figuring out that an alias will fix it, making sure that that alias gets preserved after updates, and testing that everything now works will exceed the benefit to the maintainers of grep (which is actually negative, as explained above). So looked at from a community-wide perspective, the cost is many thousands of times greater than any possible (but actually non-existent) benefit.

And as the post mentions, there are also all the books, and all the stackoverflow answers, that use fgrep, and which will now have non-working code.

As for the argument that this is open-source software maintained by volunteers, who can therefore do whatever they wish - if you are a maintainer of widely-used open-source software, and decide that you personally are not interested in working in a manner that benefits the community of users, then you should resign and let someone else take over.


There's a certain type of person who gets pleasure from following rules exactly, particularly when other people don't follow those rules. A kind of smug moral superiority.

If they get into a position of power, they can make life worse for other people, without much regard for the actual cost of breaking such rules.

Removing egrep and fgrep is desperately petty stuff.


If you look at the POSIX standard that removes fgrep & egrep [1], then you can see that most of the utilities being removed there really are quite obscure things.

In light of that, then another (perhaps more charitable) way to look at it is that the maintainers are keen to clean up stuff that they perceive as old cruft, and perhaps don't realise quite how ingrained these particular pieces of "cruft" are into the ecosystem.

(On the other hand: the same POSIX link specifies the removal of cpio and, uh, tar. I'm not sure I see that last one flying.)

[1] https://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xcu...


No `tar`? BURN. IT. ALL. DOWN.

Dear God, what could they possibly be thinking? People are capable of making changes, I think we've all switched from `more` to `less`, *csh to more modern shells, etc., and if `pax` is really all that I'm happy to switch. But removing tar is insane.


”removal of cpio and, uh, tar”

They’ve finally figured out how to make people use pax.


It’s a show of how free software partisans are often out of touch.

The cost of egrep and fgrep is just two symbolic links. Some people though would rather talk about ‘libre’ and ‘free as in speech’ and ‘free as in beer’ than talk at all about the user experience. And they wonder why most people just run windows or macOS, but they’ll never understand.


> The cost of egrep and fgrep is just two symbolic links.

A cost that needs to be paid every time you invoke them (a double lookup in the filesystem). Just use a regular link.

(I felt a pedantic response would be in the spirit of this change by the GNU project).
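The pedantic difference is easy to see on disk: a hard link is just a second directory entry for the same inode, while a symlink is a separate little file holding a path that must itself be resolved. A sketch in a scratch directory (the names below are placeholders, not real binaries):

```shell
# Hard link: both names refer to the same inode, no indirection.
ln grep-binary egrep
# Symlink: a new inode containing the text "grep-binary", resolved on
# every open/exec -- the "double lookup" being joked about above.
ln -s grep-binary fgrep
```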


> > The cost of egrep and fgrep is just two symbolic links.

> A cost that needs to be paid every time you invoke them (a double lookup in the filesystem). Just use a regular link.

I'm having a bit of trouble understanding the outrage. Can't all egrep/fgrep in all scripts be fixed once in one or two commands and remain fixed?

     $ sed -i 's/fgrep/grep\ -F/' *.sh
(or something)

I see it as getting accustomed to bells and whistles on some modern shell (like tab completion on zsh and others), then having to use a shell on another system without them (some ragged old korn) and getting tripped up... and the ultimate solution is to be conscious that these things don't work everywhere, so don't be too reliant on them.
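One wrinkle with the naive substitution: it also rewrites names that merely contain "fgrep", such as `zfgrep`. With GNU sed's `\b` word boundaries (a GNU extension, so this is a sketch rather than a portable one-liner; `script.sh` is a placeholder):

```shell
# Rewrite standalone egrep/fgrep invocations, leaving zgrep, zfgrep,
# zegrep and the like untouched.
sed -i 's/\bfgrep\b/grep -F/g; s/\begrep\b/grep -E/g' script.sh
```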


How would you find every script given that people download all sorts of scripts all the time? What about docker images?

And that doesn't even open the can of worms relating to quoting...


If we don't know where our scripts are, we have much deeper issues, but still not insurmountable.

      $ sed -i 's/fgrep/grep -F/' $(grep -rlw '/' -e 'fgrep')
(or something)

I don't manage docker images, but surely there must be a way to do something as simple as replace a file in one image,[1] and if that can be done, doing so in all images is just one clever step away.

[1] https://stackoverflow.com/questions/40895859/how-can-i-overw...


Aside:

What do you mean by "use a symbolic link"? (I'm familiar with `ln -s`, fwiw.) The alias carries the command switch; how do you do that with a link?

    # example
    alias egrep='grep -E'
??


Currently, egrep and fgrep are shell scripts (the too-simple version of this idea would be

  #!/bin/sh
  exec /usr/bin/grep -E "$@"
but that would mess up help messages and probably also needs disaster handling in case exec fails). Another traditional way, however, is to symlink everything to a single binary, which uses argv[0] or getprogname() to see which name the user called it under and acts accordingly. In the extreme case, you get the "multi-call binaries" Busybox or Toybox, which contain an entire set of Unix utilities in a single (not-too-slim) executable image. GNU grep also used that technique at some point in its history, but it was discarded for a reason I don't recall.


Actually it turns out that, at least in debian 11, you were right on the money:

    cat `which egrep`
    #!/bin/sh
    exec grep -E "$@"
But I agree that it would be cleaner and trivial to implement this behavior by depending on argv[0], GNU style guide be damned.


You use argv[0] to check the program name. The utility has to support it in code, which GNU apparently dislikes, as someone mentioned elsewhere in the thread.


A program like "grep" gets the path to the binary in argv[0], if it sees that argv[0] is something like

   /some/path/fgrep
it adds the -F flag. A hard link works OK too, as does copying the whole binary.


This is not the case for GNU grep. It appears only some BSDs do this.


The worst is when the rule they cite isn't a real rule at all, like Lennart Poettering's defense of systemd breaking on usernames starting with a digit, which is completely valid as far as the Linux kernel is concerned but he decided such names were ackshully against the rules and therefore systemd wasn't broken, bug report closed not-a-bug.


> There's a certain type of person who gets pleasure from following rules exactly, particularly when other people don't follow those rules. A kind of smug moral superiority.

HN commenters in a nutshell.

Not you in particular, but oh my god, the rules-lawyering and lecturing on this site is unbearable at times.


We've seen a lot of this in recent years.


> decide that you personally are not interested in working in a manner that benefits the community of users, then you should resign and let someone else take over.

As walled gardens begin to reassert themselves, I think this advice above will be the big argument in open source over the next ten years and needs to be written on every vertical surface. Either we will have social contracts in open source, or we will have commercial contracts. There will be diminishing examples of “both” and people will continue clutching their pearls about co-opting until something changes.

“Be grateful you get anything” is good advice for people struggling to find their own peace of mind, but when it comes up in an argument that’s just abuse.


This is exactly why Torvalds is so radical on not regressing: it is costly for him (I am sure there are many things he would have done differently now) but ensures the users’ security.


I mean they could just make it an alias that calls the appropriate grep function, and mention fgrep/egrep are just aliases in all the documentation. Tools like ShellCheck could warn on use of fgrep/egrep in scripts in favor of grep if it wants things to be pure.

But the cost of keeping an alias around for decades is negligible I think. It's cheaper than breaking every existing usage in any case.


If your reason for using fgrep is that grep -F is more characters to type, the logical thing would be to have your own function or alias called just f or fg.

I stopped using egrep and fgrep in scripts ~25 years ago when I found out they weren't in the standard.

grep -F is only useful when the pattern contains regex characters that should be treated literally, and since the default regex language of grep is BRE, those characters are few.

If only one special character occurs you break even on character count if you use a backslash escape:

  fgrep 'a*c'
  grep 'a\*c'
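
Both forms can be checked quickly (assuming GNU grep, though any POSIX grep should behave the same):

```shell
# each command matches only the literal string "a*c", not "abc"
printf 'a*c\nabc\n' | grep -F 'a*c'   # -> a*c
printf 'a*c\nabc\n' | grep 'a\*c'     # -> a*c
```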


I find it amusing that someone who uses fgrep all the time - and is presumably familiar with alias and sed - would be worried about the inconvenience of updating their shell scripts...

I think the larger issue is that people generally don't do dependency management for their scripts. Scripts are programs, and they are dependent on the shell environments they are written for.


I think you're not grasping the point that the cost of using 'alias' or updating scripts, though not huge, is paid by every single user, and often over and over again, every time they switch to using a different system. Against this cost for every user, one weighs the benefits of this change - which are zero. Indeed, the benefits are negative, since the maintainers must surely be wasting more time debating the change than they will ever save from it.

Of course, scripts are dependent on their environment. That's why changing the environment for no reason is such a bad idea. Nobody manages their shell scripts to account for the possibility that 'true' might cease to exist some day...


> the cost of using 'alias' or updating scripts, though not huge, is paid by every single user, and often over and over again

The cost of a change that affects N users is paid by N users.

    change_cost * N = total cost, but each user pays only change_cost
The amount of work each person must do is not increased by the amount of work everyone else must do.

> Against this cost for every user, one weighs the benefits of this change - which are zero.

That point specifically I am not arguing with. I agree this change doesn't seem to have merit. Remember: I am not the one imposing it on you.

Deprecating fgrep in a grep package distribution is just like deprecating a function in a library. The only reason it would be more difficult to handle that change is if you don't have a straightforward way of managing changes in your scripts and/or shell environments.

And that's the point I was making: it's very common for shell scripts to exist in unmanaged - and frankly brittle - environments. That's a problem that is not unique to this situation.

---

Here's an example:

If someone deprecated the printf() function in gcc version 127.4, replacing it with print(format=False,...), then anyone using gcc 127.4 would have to make a straightforward change to their code. Would it be annoying and pointless? Yes, that's the premise of this hypothetical situation. Would it be difficult to manage? No, not really. You would just update your gcc dependency version, find and replace printf, sanity check, commit, done.

The reason deprecating fgrep is scary is that people tend to manage shell scripts more liberally. In other words, they likely aren't managed at all. Are they in a git repo? Who is running it? On what machine? What user? Critically, there probably isn't a clear way to predict what version of grep is going to be installed in what environment, and when.

This isn't a new problem. Package maintainers diverge all the time on what binary names and aliases they provide. Tell me, if you opened a shell on any random GNU/Linux system you have access to, and typed "python", would it be version 2 or version 3? Would python3 be in the path?

It's valuable to be able to manage these things. That's why projects like Nix are so popular.


On my system, egrep and fgrep are _already_ shell scripts, just changed to add the stupid warning:

    $ cat =egrep =fgrep
    #!/bin/sh
    cmd=${0##*/}
    echo "$cmd: warning: $cmd is obsolescent; using grep -E" >&2
    exec grep -E "$@"
    #!/bin/sh
    cmd=${0##*/}
    echo "$cmd: warning: $cmd is obsolescent; using grep -F" >&2
    exec grep -F "$@"
Looking at src/egrep.sh in the git repo[1], I see that before this warning was added in 2021, the last change was back in 2015. In fact, since the git repo history began in 1998, there have been a total of 4 changes to this file. Clearly the maintenance burden of these scripts is not sufficient to require the maintainers to drop them. My guess... this is being done out of some weird sense of purity over practicality. Bad call by Paul Eggert IMO[2].

1: https://savannah.gnu.org/git/?group=grep

2: commit a9515624709865d480e3142fd959bccd1c9372d1 added the warnings


If you have interacted with Paul Eggert in real life (I have; he's a professor at UCLA), you would've known that he is a man that values purity over practicality. Not surprising at all.


Is it the same guy that maintains tzdb? https://lwn.net/Articles/870478/


Yes he is. He's a maintainer for a lot of GNU software including Emacs, sort, etc.

Running

    fgrep -U -l -r "Paul Eggert" /usr/bin
should turn up quite a few results. (That's of course an undercount because he doesn't always put his name into everything he touches.)


You surely mean

  grep -F -U -l -r "Paul Eggert" /usr/bin
Don't you?


> cat =egrep =fgrep

What is that? I can't seem to find it by searching for " =" in `man bash`


It's a zsh thing.

14.7.3 ‘=’ expansion

If a word begins with an unquoted ‘=’ and the EQUALS option is set, the remainder of the word is taken as the name of a command. If a command exists by that name, the word is replaced by the full pathname of the command.

From: https://zsh.sourceforge.io/Doc/Release/Expansion.html#g_t_00...


This is a zsh thing, it's shorthand for:

cat "$(which egrep)" "$(which fgrep)"


In ZSH `=executable` expands to the full path of `executable`.


What problem is this warning trying to solve? Are these two symlinks too much maintenance burden? Or is the check in the code hurting the code quality? Is the extra check at startup ruining performance?

I'm usually in favour of having one way to do things but in this case, with this much legacy it just doesn't seem worth it.


This is free software. It is legitimate to consider convenience for the maintainers. If they don't want to maintain two symlinks then they are empowered to make that call.

If anyone thinks it is a big enough problem that they want to fork the software they can, or the distros can maintain their own symlinks. But I think in this case the simple answer is if the maintainer doesn't want it in the source tarball then it isn't going to be there and that is more than sufficient a justification.

Complaining is probably more reasonable than asking for a justification here.


I don't know when society in general just said "Fuck It" to the idea of stewardship; but we now have it ingrained that people in positions of trust and power (whether volunteered, elected or appointed) are not morally or ethically beholden or responsible to the communities they have taken it upon themselves to represent.

At least we used to pay lip service to that ideal.


> At least we used to pay lip service to that ideal.

When was that? For as long as I can recall, a core mantra of free/libre software has been that it was provided "AS IS" without warranty of any kind. Decades ago the dominant response I recall was one of gratitude and a little amazement that ad-hoc communities of volunteers were making real software that wasn't just academic but on par with commercial offerings. Some of those communities chose to adopt a user-friendly posture because they wanted people to like them but plenty did not and just did their own thing. As long as they could attract contributors they kept going.

Personally what concerns me is this growing expectation that volunteering to maintain an open source project also means you are "morally or ethically beholden or responsible" to anyone who uses it. In practice that seems to mean maintainers must respond to user requests or end up on the receiving end of a great deal of vitriol. It's no wonder so many volunteer maintainers who have internalized this responsibility are burning out, and how many more potential maintainers are dissuaded by seeing what is happening to the current maintainers.


Well, concurrent with this decline in stewardship has been a decline in graciousness towards volunteers, so I see your point.

I remember when I was younger, every volunteering experience I had was a delight; people thanking me for my time, getting me free coffee "just because", etc. Some of my more recent volunteering experiences have been less pleasant.


They didn't necessarily "choose communities to represent". As the maintainer and author of various open source libraries and tools that are used by many thousands, in most cases it's just that ... I'm one of the few willing to spend the time on it, and it's usually useful for myself as well. I don't really "represent" any community or anyone.

That said, I certainly wouldn't have put in this change myself, because I wouldn't like to inconvenience anyone. But that's just basic good manners that you should have in every-day life to random strangers.


That's a very different thing than joining GNU. If your package gets adopted by thousands of peoples that's having this community thrust upon you.

I still feel you have an obligation to transfer to someone willing to shoulder the responsibility if this happens, an obligation incurred by publishing it in the first place.

Though, I don't feel too strongly about this.


The change was made by Jim Meyering, who has been maintaining these things since before half the people here were born.

Even when taking over maintainership for popular packages later on, people often aren't exactly breaking down the door for it. So my philosophy is simple: "if you do the work, you get to decide". I may like or dislike these decisions, and at times I may even rant about how stupid a certain decision is, but in the end ... the people doing the work get to decide. The alternative of being beholden to a vaguely defined "community" of armchair quarterbacks is much worse, IMO.


[flagged]


Well, if I write some software for my own use, put it on the internet "because why not?", and lots of people start using then that's nice. But ... I don't think putting anything on the internet automatically imparts any kind of responsibility towards "the community", which usually means "people who download and use your software, the overwhelming majority never give anything back in the form of code, bug reports, money, or anything else".

But like I already said in my previous comment, I wouldn't have made this change myself. I actually strongly disagree with it. But I can also accept that other people have a different attitude, and that's okay too, even if I personally don't really care much for the particular attitude.

Funny you mention OpenBSD, because OpenBSD is very much "by the developers, for the developers" and has a fairly decent "fuck off" attitude once people start making demands (which I don't think is a negative per se).


Are you suggesting that a 15 year deprecation path is a fuck it attitude towards stewardship?

Distros are free to've replaced the symlinks with wrapper scripts in the meantime. Granted, I expect Arch will go ahead & let the symlinks disappear. But I can't imagine what you think of Arch's stewardship

I expect this should have about as little impact as the usr merge many distros have gone through


I'm replying to the attitude in the comment I'm replying to, if I meant this as a comment on GNU's approach to fgrep/egrep, it would have been top level.

The original deprecation seems ... petty? Not sure why it's a priority, but whatever.

I know there are scripts still in use that I wrote more than a decade ago that might do weird things now and I wish the maintainers best of luck.


One major issue with humans is that sacrifice begins to be expected and often is not rewarded. When a job not only becomes thankless, or near enough, but also expected as the default, it becomes hurtful to continue doing it.

Are the stewards being provided fair compensation? How do we even talk about what is fair compensation when non-monetary compensation has become difficult to even discuss (often due to past instances being extreme disproportionate or of a form that is no longer tolerable).

The simplest way to put it is that if you can't find a steward you aren't paying enough and trying to use appeals to morals or ethics to get people to accept lower pay no longer holds as much weight when being moral or ethical no longer provides the same level of non-monetary benefits.


> One major issue with humans is that sacrifice begins to be expected and often is not rewarded. When a job not only becomes thankless, or near enough, but also expected as the default, it becomes hurtful to continue doing it.

Oh, most definitely. The anger and vitriol I see directed towards maintainers, or volunteers of any type, who are "stepping back for personal reasons" is horrifying.

My father was a very active volunteer in his community ... the number of people who were mad at him when he stepped away after his heart problems was startling. Conversely, the number of people who volunteered to help him and my mother with shopping etc. when COVID hit was heart warming.


This is "back in my day"-levels of lazy armchair criticism. Nothing has changed. People will always make decisions that you disagree with, and they are more likely than not doing it in a good faith attempt to benefit the community. These sorts of inflammatory and hyperbolic comments help nobody, and are just childishly over-the-top. egrep going away is not evidence of moral or ethical bankruptcy, just wow.


It can't be that much work to maintain a symlink. The idea that it could be less work to remove a standard feature that has been part of Unix for several decades has no connection with reality.

There is no reasonable explanation for this decision except that somebody thought that having both egrep and grep -E was "ugly" according to their own personal sensibilities.


And this is what I hate from the argument "you can't expect the maintainers to always support the feature". No, I don't. But I do expect them NOT to regularly remove/break the features I have contributed!

E.g. I can't count the number of times I have submitted _the_ _same_ _GUI_ _fixes_ to a certain popular browser over the decades. Because apparently "they" have to rewrite the chrome of the mobile version from scratch every handful of years. "They" is in quotes because it's, in fairness, never the same person or even the same group of people. It's a CADT.

And in this case, to remove what apparently are two small shell scripts which for sure cost more to remove than to preserve....


Cascade of Attention-Deficit Teenagers?


> It is legitimate to consider convenience for the maintainers.

Yes, but the convenience of... creating 2 symlinks and adding the flag based on argv[0]?

Sorry, I don't buy it. Wanting to die on such a tiny hill has GNU written all over it though


What 'maintenance' goes into symlinks that already exist? The only thing one needs to do is nothing at all. It is in fact more 'maintenance' to delete them.


What exactly do you imagine the maintenance being? egrep and fgrep are already shell scripts, and those scripts haven't been edited since 2015. If not needing to change something for 7 years counts as a maintenance burden, sign me up.


> It is legitimate to consider convenience for the maintainers

In which case, time to fork it.

I don't fucking care who you are, you do not break grep and keep a privileged position on my machine.


Nobody is stopping you from forking it.


That's the plan. I haven't built a new package in a while, time to remember how `fpm` works.

Conflicts: grep

Provides: grep


Unfortunately, this seems symptomatic of GNU's attitude: we've come up with something silly and it's going to screw some people over badly in the least agreeable moment, but we're too stubborn and proud to admit it. Take it, or buy a Windows 11 license.


At a guess, it isn't actually the symlinks, it's the argument parsing they're trying to simplify.

Old programs have stupidly complex argument parsing. If you can pass your arguments as "cmd foo bar path", "cmd -f bar path", "cmd -fbar path", "cmd path --foo=bar" or "fcmd bar path" and "cmdf path bar", it can be really convenient for users who can structure their commands in the way that makes the most sense for them.

But it can be really frustrating to maintainers who are maintaining -- and testing! -- a thousand lines of bespoke argument parsing for a ten line function.

It's really, really tempting to define a simple syntax for argument parsing, turn it into a library, and reduce your pile of shell commands to a few lines of argument configuration and a function call.


> At a guess, it isn't actually the symlinks, it's the argument parsing they're trying to simplify.

There’s no parsing. GNU’s egrep and fgrep are trivial shell scripts:

    #!/bin/sh
    exec grep -E "$@"


Then that's a real jerk-ass move.


The problem is people being wrong, and the upside is feeling good about punishing them. IMO.


I get the annoyance, for both the GNU maintainers and distro maintainers/sysadmins. I'm not too wedded to either outcome, but my take is that we should avoid global mutable state:

- If you're relying on random globals (like the path /usr/bin/egrep) to (a) exist and (b) behave in a certain way, then don't mutate them. Stick with known-good versions, and treat updates like any other code change (review, test, etc.). This is the usual case for legacy systems. In other words, don't blindly run `apt-get -y upgrade` on systems which are meant to be stable.

- Alternatively, don't use globals: install grep in its own directory (e.g. via the `--prefix` argument of `./configure; or using a sledgehammer like chroot), and treat that ___location as a capability (as per capability-security). Programs/scripts which need that grep must be given its ___location explicitly (either for direct calls, or by having it prepended to their $PATH). If you want to use an updated version elsewhere, just install that to a different ___location; no need to touch the existing setup.

(Shout-out to Nix for doing this via ./configure and $PATH. Runner-up prizes for chroots/jails/zones/containers/VMs/etc.)


Both of your suggestions are basically equivalent to "think before you upgrade" and/or "don't upgrade". This is the second elephant in the room with all these distros that "don't use globals" and/or statically link everything (or do an analogue to that, like nix). There's very little benefit for desktops. So the upgrade to dependency X breaks component Y, and you are forced not to update X, or at least, Y's copy of X. Great. What do you do now? Swim away from upstream? Stay on outdated components? The situation is as untenable long-term as it is on a regular distro...

The first elephant in the room is that generally you _do_ want most dependencies to be global state. Is a situation where every program is using its own version of the graphics toolkit, with different theming issues or even different themes altogether, really ideal for a desktop situation? What about libevent -- so that your mouse's wheel has one acceleration curve in some programs and some other speed in some other programs ? Or what about ibus, where generally having different client versions running simultaneously means your entire input system stops working ?

Even grep is likely something that you'd prefer to be global state, lest the grep exec()'d by a Python interpreter have different features than the one launched by your main shell.


Yes, but Nix just solves this nicely enough.


> The situation is as untenable long-term as it is on a regular distro.

I may be misremembering, but isn't this pretty much the default user experience on rolling distributions like Arch Linux? -- You consult the wiki to see what changes you need to make (or maybe see what breaks after an update), make the change, and get on with your desktop experience.


Idk, but Arch Linux is a bad model, in my experience. I had inherited a system that ran Arch Linux, but hadn't been updated in a while. Imagine my surprise when I needed to update something, and couldn't. A part of the upgrade path (or whatever you want to call it) had been removed. So now I have an Arch Linux system that hasn't been updated for way too long, and we need to consider abandoning the product that runs on that machine, or invest in porting it to some modern environment. It's hidden behind a firewall with access for only 2 IP addresses, so it's unlikely to get hacked, but it's most undesirable.

If you like that model, fine, but don't force it onto the whole world, unless you can commit more resources to it than to Arch Linux, and basically keep all upgrade paths alive forever.


That is exactly my point.


Or...

And I'm going out on a limb here...

Don't support user hostile maintainers who force breaking changes to a decades old API just because they felt like it.


> If you're relying on random globals (like the path /usr/bin/egrep) to (a) exist and (b) behave in a certain way, then don't mutate them.

It's a bit old now, but one of the principles of Twelve Factor Applications is to vendor-in all of your dependencies.[1]

> A twelve-factor app never relies on implicit existence of system-wide packages. It declares all dependencies, completely and exactly, via a dependency declaration manifest. Furthermore, it uses a dependency isolation tool during execution to ensure that no implicit dependencies “leak in” from the surrounding system. The full and explicit dependency specification is applied uniformly to both production and development.

This scenario, with `fgrep` and `egrep` releasing a potentially breaking change, is exactly why this principle exists. If your software depends on "whatever fgrep happens to be lying around at the moment", your application might break the next time you build and deploy a new image. If you're pinned to a specific version, however, you're protected.

[1] https://12factor.net/dependencies


You're protected, but it also means you can never upgrade anything. An application can never know what future versions of a dependency it can work with (because a new version might contain a breaking change), so it will always require versions that existed at the time of its own creation.

But those versions might contain vulnerabilities, and the new version might fix those while being fully compatible in every other way, in which case you really do want the upgrade.

Those dependencies should really declare when they change existing behaviour of the previous versions, and only then should your application refuse the upgrade.

No idea if there is a system that works that way; I can imagine it could get hideously complex.


> You're protected, but it also means you can never upgrade anything.

You're protected, and it means you have to test your system _before_ upgrading.

Also, I think I'm working with a much broader definition of what "a dependency" is than you are. I consider `Alpine Linux 3.16.2` to be an atomic thing; I don't try to track the state of every single binary that comes along for the ride.

But if I depend on something to be executable from my application, like `ffmpeg 5.1.2` or whatever, I _would_ version lock that, and install it myself as part of the build script.

Then, when `Alpine 3.17` or `ffmpeg 5.2` comes out, I can bump to those versions, run our integration tests, and verify nothing broke.


> You're protected, but it also means you can never upgrade anything.

No, it just means that:

- Upgrading a dependency is a change, which should be treated like a code change (review, testing, etc.). I stated as much above.

- If, for some reason, you're stuck using an old version, that only affects the relevant part of the system. For example, scripts which rely on some old behaviour can remain pinned to that version; whilst other parts of the system can use an upgraded dependency. Also, just as importantly, parts of the system which don't need something (like grep) have no access to any version of it.


Isn’t that essentially semantic versioning plus version ranges ala npm?


(I want to love Nix, but it’s so foreign. I do hope it becomes more of a standard. It’s just hard to learn and hard to use until you learn.)


I went over to Nix cold turkey for my new work laptop. Took about a week or two, to get it where I wanted it. The real issue I hit is that Nix is so alien to working with "normal" unix stuff... like stuff that assumes /usr/bin/grep exists etc, I had issues, with the existing code-base I was hired to work on.

So, I went back to Fedora. But I'd goto Nix again in a heartbeat. It is a great system, and if you start a company with it day 1... there will be a bit of heartburn as you start, but I can't see regretting it.


I'd prefer to qualify that a bit:

When you can get Nix to do what you want, it's wonderful.

When you hit a barrier (e.g. a program you want isn't packaged), that's when Nix is very hard to use, and requires learning.


I’ve been using grep -E for a long time; I think I remember it being prompted by a warning from egrep or something... can’t really remember for sure, though.

Either way, I don’t see what the big deal is, just add an

    alias egrep='grep -E'
If you’re worried about your non-interactive shell scripts,

    shopt -s expand_aliases; alias egrep='grep -E'
and you can move on with your life.

EDIT: I must’ve been warned by shellcheck

EDIT2: here’s another one liner with one caveat being updates (also needs to be run as root)

    # unalias grep && unalias egrep && cp "$(command -v grep)" "$(dirname "$(command -v grep)")/egrep"
EDIT3: just realized for most this will only ever come up in scripts because many distros already add alias egrep='grep -E' to your shell aliases (~/.bash_aliases, ~/.zshrc, etc). Thus you may only need the shopt -s expand_aliases.


The problem is that we don't just live on our own machines. We hop around to different boxes and just being able to type what you are used to matters. My dotfiles and other tools don't go or even work everywhere.

We shouldn't change the oldest parts of our OSes without a really good reason.


I find this reasoning interesting since it's why I _stopped_ using egrep/fgrep around the turn of the century: because those aren't standard, it wasn't uncommon to find that you depended on some behaviour which wasn't available in the version installed on some random server but it worked when you used grep.


I switched to just grep, but it's still part of the same line of thinking. I really, really, really dislike it when something that used to be reliable changes or is removed. I understand and accept that things change, but when it's as simple as a symlink or what have you I really do not see the cost-benefit tradeoff for sunsetting something.


I don't remember why I stopped using the shorter `egrep` and only use `grep -E`, but I suspect this was the reason. I used to work on a variety of BSD and Linux servers.

I'm fine with this clean-up and simplification. Eventually, there will be more future users of `grep` than past users.


Yeah, I worked with enough older systems (SunOS, Solaris, AIX, HPUX, FreeBSD/NetBSD/OpenBSD, etc.) that I don't remember which ones caused me to do that, either. I'm really glad it's no longer common to have things like hand-compiled installs of GNU utilities, especially since not every sysadmin was diligent about updating all of them.


It’s a matter of principle. If you wrote a script yesterday (or a year ago) and didn’t include that line, you need to re-release it today. That’s stupid.

Also, you’re being very sanctimonious about this but what if you didn’t see this HN story. Would you know to do that for the next script you distribute?


Frankly I can count on two hands the number of times I’ve used egrep in my life (same for grep -E). As long as I can remember I’ve relied on sed, awk, or find -regex for filtering output with regex. I only say this because I can’t relate to those that are upset about the change.

That said, to answer your questions directly: hopefully you aren’t auto updating anywhere in production or important, so this won’t affect anything until the box has grep 3.8. Ubuntu 20.04 uses grep 3.4, for example.

And the warning prints to stderr, so honestly, I'm having a difficult time seeing this being an actual problem for more than 0.1% of users.

I’m not unsympathetic to those that it adversely affects, but I genuinely haven’t seen anyone point out any severe consequences of the change. I’m not saying it’s impossible, if someone has a real example (sorry, saying you might get a page on the weekend for a warning printed to stderr is a stretch), I’m all ears.


> Frankly I can count on two hands

Emphasis on "I", which says strictly nothing of the rest of CLI users out there.

We're very glad to learn that it is not a problem for you, but it brings very little to the conversation.


But neither does this comment.. do you have an example to share in response to my questions above?


Everyone has their own story, and one person’s experience can be very different from another person’s experience. I used egrep a whole lot, dozens of times for the automated test setup I have for my open source project. I had to spend most of an hour this morning updating that code to no longer use egrep—a non-trivial task. Here’s the amount of hassle breaking egrep has given me:

https://github.com/samboy/MaraDNS/commit/afc9d1800f3a641bdf1...

This is just one open source project. I’ve seen fgrep in use for well over 25 years, back on the SunOS boxes we used at the time. egrep apparently has been around for a very long time too. Just because it didn’t get enshrined in a Posix document—OK, according to Paul Eggert it was made obsolete by Posix in 1992, but apparently no one got the telegram and it’s been a part of Linux since the beginning and is also a part of busybox—doesn’t mean it’s something which should be removed.

I’m just glad I caught this thread and was able to “update” my code.


> We're very glad to learn that it is not a problem for you

Worth reading the guidelines [0].

[0] https://news.ycombinator.com/newsguidelines.html


fgrep and egrep have been indicated as obsolescent by the Single Unix Specification and POSIX since the late 1980's.

The man page for GNU grep says:

  In addition, the variant programs egrep, fgrep and rgrep are  the  same
  as  grep -E,  grep -F,  and  grep -r, respectively.  These variants are
  deprecated, but are provided for backward compatibility.
"Deprecated" should be read as "maybe ain't gonna be around one day". "Use interactively, or in your personal setup, but don't put into shipping code".

So if you wrote a script yesterday with these things, you're just one of those people who don't look into specs or man pages.

The scripts of coders who don't read documentation are going to tend not to be portable or reliable anyway. That doesn't mean we should break things left and right to trip people up, but in this case the writing has been on the wall for a very long time.


I don't think aliases help in one common case listed in the article. That is, scripts run out of cron, that don't read initialization files. And that case will email on unexpected stderr writes.


GNU Grep manpage says:

7th Edition Unix had commands egrep and fgrep that were the counterparts of the modern ‘grep -E’ and ‘grep -F’. Although breaking up grep into three programs was perhaps useful on the small computers of the 1970s, egrep and fgrep were not standardized by POSIX and are no longer needed. In the current GNU implementation, egrep and fgrep issue a warning and then act like their modern counterparts; eventually, they are planned to be removed entirely.

Considering how drastically GNU departs from the bare POSIX interface in some of their tools, I find it strangely pedantic how the maintainers hold up the standard in this specific case. Considering the Linux mantra of "Never Break Userspace", I can't find a good reason to drop the binaries (or the script equivalents of calling grep -E/-F) besides uprooting a historical choice that never hurt anybody (with little benefit and a plethora of potential consequences) in the name of better-late-than-never correctness.


GNU: GNU is Not Unix. Or Linux for that matter.


somehow the opengroup removed egrep and fgrep from POSIX in 2018 https://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xcu...

chroot, useful for poor man's containment of 'stuff', also went out.

and they replaced tar with pax --- which uses a tar file format

one of the things I love in Unix is the slow pace of evolution in systems tools. relevant know how doesn't rot as fast as in other environments.


Beyond the inadvisability of making a breaking change for no reason, it's worth noting that deprecating fgrep is actually positively undesirable. Use of fgrep should be encouraged.

The reason is that many times people want to match a literal string. This is best done with, for example, "fgrep [a] <file". Note that this is not the same as "grep [a] <file", since '[' has special meaning in regular expressions. Of course, you can write "grep \\[a] <file", but not everyone has the set of special characters used for regular expressions at the top of their mind.

Of course, one could get in the habit of using grep -F when intending the pattern to be just a literal string. Or one could write one's own fgrep shell file. But both of these options require more effort than just using fgrep. One aim of good design should be to make it easy to do things in the reliable way. That way it's more likely to be done.
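The point about `[` being special can be seen directly (a small demo; `demo.txt` is a throwaway file):

```shell
printf '[a]\nbanana\n' > demo.txt

# As a regex, [a] is a bracket expression matching any line containing 'a':
grep '[a]' demo.txt      # prints both lines

# As a fixed string, it only matches the literal characters '[a]':
grep -F '[a]' demo.txt   # prints only '[a]'

rm -f demo.txt
```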


If we are talking about interactive use, you could always use an alias.

But if we are talking about scripts, you should at least use "fgrep -- [a] < file". And if you’re adding an option anyway, you might as well use "grep -F -- [a] < file". Personally, I prefer using options, specifically long options, in scripts; meaning "grep --fixed-strings --regexp=[a] < file".

If you don’t do this, the script will fail spectacularly the day when the string happens to start with a hyphen (-).
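A quick illustration of that failure mode (using a hypothetical `demo.txt`):

```shell
printf -- '-v is a flag\nplain line\n' > demo.txt

# Without '--', a leading-hyphen pattern is parsed as an option:
#   grep -F '-v' demo.txt     # grep would treat -v as its own flag
# With '--', option parsing stops and '-v' is the search string:
grep -F -- '-v' demo.txt      # prints '-v is a flag'

rm -f demo.txt
```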


I thought I was the only one who preferred long options! Are you me?

I've had coworkers call me out (not rude, just "hey, you know you can just use -l... instead of --longopt") on calls because I always use long options when available. I use the hyphen explanation all the time, as I've run into it a few times.

I also prefer CLI applications that are designed to use "=" for arguments with long options. Applications which don't use or respect "=" irk me, because it's ambiguous ("is that argument an argument or a subcommand?") when looking through history.


I don’t always use long options interactively, only sometimes, but I certainly try to always use it in scripts, for readability.


Some context: egrep and fgrep are implemented as wrapper scripts that look like this:

    #!/bin/sh
    exec grep -E "$@"
28 bytes.

Now consider the cost/benefit between two options:

1. Maintain these wrapper scripts forever.

2. Update all scripts in the world that use these to use the flags instead, eventually removing the wrappers.

What is the cost of option 1? It's essentially zero. No changes means no engineering work. Some hard drive space is wasted, but TBH you can probably buy a single hard drive with enough capacity to store all copies of these files that will ever exist. Perhaps some people's sense of cleanliness will be offended by the continued existence of non-standard commands but it's hard to demonstrate any actual cost from that.

What is the cost of option 2? Many thousands of hours of engineering labor, just to update scripts. But that's just the beginning: There will likely be production outages when the script are finally removed, since inevitably some uses will be missed. That will cost money. It will also lead to a general increase in fear of software updates, leading even more people to run outdated installations that are vulnerable to security flaws. Security breaches will cost more money.

It seems clear this change will cost millions of dollars, whereas keeping things as they were would have trivial cost. Therefore, this change should not have been made, and should be reverted.

Reasoning like this is why widely-used platforms simply never remove "deprecated" APIs:

* Java still contains APIs deprecated 25 years ago in version 1.1.

* Linux's ABI is famously stable, with Linus verbally abusing anyone who even suggests a backwards-incompatible change. (I don't approve of the abuse part.)

* The web platform is backwards-compatible even with objectively bad JavaScript design decisions made by Netscape in the 90's.

* C libraries still include `gets()` even though any use of it implies a security vulnerability. (Though we hope any remaining users are not running in security-sensitive use cases...)

* The Windows API is full of trash but you can still run executables built in the 90's.

"Deprecated" was never supposed to mean "will go away". It was always supposed to mean only "there's a better way to do this".


If the GNU folks want to remove the wrapper scripts, they can. It's their software and they write the best practices for it.

Distro maintainers can still ship these scripts. They can add aliases to the user profile to do the same thing without having to launch a sh instance. Maybe they'll add a second package, grep-utils, that just contains the wrapper scripts so you can choose between GNU's position and the "I don't like change" position.

Microsoft has deprecated and removed tons of stuff and so did browsers. Vista crashed so often because Microsoft did away with their entire driver model and vendors wrote quick wrappers around their old, crappy drivers and shipped those. Many remaining browser features are reasons why web development sucks so much ("quirks mode" for one) and Java not breaking compatibility has given it many problems that dotnet solved by breaking compat once. Java also has removed several deprecated packages and moved them to libraries instead (like Nashorn).

Deprecated means "if we see a reason to remove this, we may remove this in the future". It's no guarantee for removal but it's no guarantee for being kept around either.


> If the GNU folks want to remove the wrapper scripts, they can. It's their software and they write the best practices for it.

I'm not making any argument about the maintainer's rights. I am making an argument about whether it was a good technical decision.

> [people can work around it]

That doesn't make it a good idea, though.

> Microsoft has deprecated and removed tons of stuff and so did browsers.

Only when there was a real benefit deemed greater than the cost.

In this case there is essentially no benefit whatsoever to the change, except some abstract notion of cleanliness or pedantic spec compliance.


I got this obnoxious whine from egrep on FreeBSD after my latest update. This post made me look, and I had GNU grep installed as a dependency for something. Once I removed it, sanity is back and the built-in BSD egrep doesn't whine.

On FreeBSD, they are all the same:

    $ ls -li /usr/bin/egrep /usr/bin/fgrep /usr/bin/grep /usr/bin/rgrep
    281835 -r-xr-xr-x 4 root wheel 30736 Sep 26 14:37 /usr/bin/egrep
    281835 -r-xr-xr-x 4 root wheel 30736 Sep 26 14:37 /usr/bin/fgrep
    281835 -r-xr-xr-x 4 root wheel 30736 Sep 26 14:37 /usr/bin/grep
    281835 -r-xr-xr-x 4 root wheel 30736 Sep 26 14:37 /usr/bin/rgrep


FWIW, I have no interest in making a similar change to bsdgrep and I can't imagine anyone else would be compelled to bother, either. I just don't see the value in removing these historical names that makes the hassle worth it.


Thanks for sharing. I feel less silly for suggesting this now (as root):

    unalias grep && unalias egrep && cp "$(command -v grep)" "$(dirname "$(command -v grep)")/egrep"


why no hard link?


I didn't check where they came from, but on Ubuntu 18.04: egrep, fgrep, and rgrep are shell scripts that call grep with -E, -F, and -r


On FreeBSD, it checks the first character of the program name and modifies behavior based on that. See https://github.com/freebsd/freebsd-src/blob/main/usr.bin/gre...
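The same argv[0] trick is easy to sketch in shell: one script installed under several (hard-linked) names, choosing its mode from the name it was invoked as. This is a simplified illustration, not FreeBSD's actual code:

```shell
dir=$(mktemp -d)
cat > "$dir/xgrep" <<'EOF'
#!/bin/sh
# Dispatch on the basename we were invoked as ($0):
case "${0##*/}" in
  egrep) exec grep -E "$@" ;;
  fgrep) exec grep -F "$@" ;;
  *)     exec grep    "$@" ;;
esac
EOF
chmod +x "$dir/xgrep"
ln "$dir/xgrep" "$dir/egrep"   # same inode, different name

printf 'aaa\n' | "$dir/egrep" 'a+'   # ERE via the egrep name: prints aaa
rm -rf "$dir"
```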


It's interesting to think about the way these things evolved. Imagine if, over time, the `compress` utility got -gz, -bz2, -lzma, etc. flags, and `gzip`/`bzip2` were all converted to deprecated shell scripts. When is the right time to consolidate variants under one program (and deprecate the variants), and when is it better to let small utilities keep doing one thing well(TM)?

I see people talking about removing the compatibility scripts being driven by a sense of purity, but wasn't it that same sense of purity driving someone to collapse `fgrep` and `egrep` into `grep` in the first place? The sense that these are all just variants of the same goal, and thus should be flags to the same program? Why bother combining them if not to ultimately remove the "extra" programs one day?

I'm not sure what the right answer is. On one hand, I like the idea of a smaller namespace of programs with a larger set of well-documented switches. The alternate universe where `compress` covers the widely-used variants of compression and the variant utilities fell away over time sounds kind of nice. Or imagine if early UNIX had arrived at a "plugin" model, where top-level programs like grep should have pluggable regex engines which can be provided by independent projects? The culture we have, of tiny independent projects, will always make consolidation and deprecation messy events.


The original UNIX design philosophy was very much one-command for one-thing.

=> http://harmful.cat-v.org/cat-v/ UNIX Style, or cat -v Considered Harmful

But that argument is an argument against pretty much all command-line flags. Rob Pike argues `ls` shouldn’t have an option to split the output into columns, you should pipe ls into a column-splitting program.

So the pure version of that philosophy is long-gone. It makes this current effort seem rather arbitrary and meaningless.


As someone who learned about regular expressions before extensively using grep, I found grep to be quite unintuitive, since extended grep is what I am conceptually thinking about from a “theory” point of view. One difference between grep and compression is that grep is bounded by regular languages, but compression is more free form. It’s conceivable to therefore view grep as mature enough of a technology to be finished once and for all, but compression will continue branching into many disparate algorithms.


Consider the popularity of ffmpeg and imagemagick. Putting the mainstream algorithms together under one interface seems to be an appealing model to a lot of people, even though new algorithms are constantly being developed in those areas.

Personally, for audio encoding, I'm much happier just installing ffmpeg than I would be gathering and learning a bunch of flac/ogg/opus/mp3/etc encoders.


Yes, those are quite different. But you also can’t pipe lossy compression data willy-nilly without compounding data loss, so it’s hard to imagine an alternative with unix-style tools.

The difference may also stem from those being application-focused rather than command line, so they don’t have a burden of maintaining legacy applications forever.


I think it's a bit silly to get rid of such established aliases. Surely they aren't a real maintenance burden.

Though I can't say I entirely hate it, for the (admittedly absurd) reason that egrep = grep -E, fgrep = grep -F, but pgrep != grep -P! It's an awkward incongruity that's easy to get tripped up on occasionally.


> The egrep and fgrep commands have been deprecated since 2007.

Isn't 15 years more than enough time to handle the deprecation?


I betcha most people didn't even know they were considered "deprecated".


I learned about the deprecation from ShellCheck [1] that warns if using egrep instead of `grep -E`. That tool deprogrammed many of my bad habits. I had never seen any discussions about it otherwise.

[1] - https://www.shellcheck.net/


That's a cool tool, it's neat seeing Haskell in the wild.

Unfortunately (or not?) I tend to use actual programming languages to make my tools rather than doing shell scripting, so I don't have anything interesting to put in there, but I'll keep a bookmark of it around.


There is so much outdated info on the internet, in various forms, that it is hard even to realize what the "proper modern way" of doing things is unless you are really into Linux/config stuff.

I am a casual user, so I can get around the system, but modern ways always surprise me when I finally find out about them.


And for all you know "grep -E" isn't supported on some system people use. This is not really a concern if you're just writing script for yourself (which are really >90% of scripts; portability often isn't really a concern), but knowing it will work on all systems – new and old – is pretty hard, and sometimes it does matter. Does it work on NetBSD? HP-UX? Solaris? Last year autoconf changed the `..` command substitution syntax to the "new" $(..) syntax and someone complained it broke on their ancient Solaris system.

So ... people will stick with what works, like "egrep".


Yes. This is the first I've heard of it and I've got a ton of scripts that are probably going to break.


I for one didn't know they were properly deprecated.

Though I've not used either since uni over two decades ago¹ on a somewhat off-standard Unix a few of the machines ran, so until this thread I can't say I was remembering that they existed at all.

----

[1] using grep -E when needed since²

[2] and presumably grep -F too though I don't remember ever actually doing that


Most people shouldn’t have been using this in the first place. If you’ve been taught to use these in the last ~10 years then someone, somewhere, has failed horribly.


And now they do, so perhaps it's working as intended. :P


Not when the command in question has been a standard part of all Unix shell environments since 1977, no. Is that a serious question?

https://medium.com/@rualthanzauva/grep-was-a-private-command...

What actually happened here is that POSIX skipped it. It never entered a standard, even though it was (literally!) in every OS and available to everyone. But no one cared that it wasn't in some arbitrary standard, because it was always there. For half a century!

I dare say that "fgrep" and "egrep" have more active users (both interactive and scripted) than "awk" or "ed" or "od" or "bc", all of which are still around.


Sure, now that I’ve finished rebuilding all my dependencies to use 64-bit time_t and migrated all my services to IPv6, I can finally take some time to finish the great *grep deprecation :)

I’m just joking :)


To be fair, the pain of migrating time_t is nothing next to what I will feel trying not to type `fgrep` every time.


No. Make install a grep compatibility library if necessary, but don't change output syntax.


If you want modern regexp syntax then forget egrep/grep -E and use grep -P for Perl Compatible Regular Expressions. PCRE is the most commonly used regex syntax for modern programming languages, eg Python, Go (mostly) etc and grep -P will save lots of annoyance if you use one of these!


The -P is not for portability.

  $ grep -P foo
  grep: unknown option -- P


It's ironic that GNU is using POSIX as a justification for a decision.

From RMS:

Following a standard is important to the extent it serves users. We do not treat a standard as an authority, but rather as a guide that may be useful to follow. Thus, we talk about following standards rather than "complying" with them. See the section Non-GNU Standards in the GNU Coding Standards.

We strive to be compatible with standards on most issues because, on most issues, that serves users best. But there are occasional exceptions.

For instance, POSIX specifies that some utilities measure disk space in units of 512 bytes. I asked the committee to change this to 1K, but it refused, saying that a bureaucratic rule compelled the choice of 512. I don't recall much attempt to argue that users would be pleased with that decision.

Since GNU's second priority, after users' freedom, is users' convenience, we made GNU programs measure disk space in blocks of 1K by default.

https://opensource.com/article/19/7/what-posix-richard-stall...


Here is what I have for 'egrep' on Slackware 15:

    $ cat /bin/egrep
    #!/bin/sh
    exec grep -E "$@"

Is that really hard to maintain :) I was expecting to see a link, but instead it is a shell script GNU is asking us to create. I do not know why GNU says that is hard to maintain going forward. BTW, this is grep v3.7


While I hated the decision of adding warnings without much notice (which, in the case of such widely used CLI tools, is the equivalent of a breaking change), I also found an easy solution that would prevent my scripts from spitting out lots of unneeded warnings.

    alias egrep='grep -E'
    alias fgrep='grep -F'

Now GNU developers can keep doing whatever they're doing, and I can keep doing whatever I used to do.


> While I hated the decision of adding warnings without much notice (which, in the case of such widely used CLI tools, is the equivalent of a breaking change)

15 years isn't “much notice”? I had already stopped using those back then because, as noted in the article, they weren't standardized and so you had to work about portability across Unix installations.

It's also worth noting that this is only a breaking change if you are using the non-standard names in a context where you are trapping output. For the vast majority of people using a shell script which doesn't use the common name, they will at some point upgrade, see the warning, spend 30 seconds making the change, and never think about it again. If you're that sensitive to the extra work, presumably you also do some testing before installing new upstream releases.

EDIT: it was actually 17 years ago that the warning was added about egrep/fgrep:

https://git.savannah.gnu.org/cgit/grep.git/commit/?id=0b4859...


Your scripts expand aliases?


    shopt -s expand_aliases


You still have to define the aliases or source your aliases file in every script though.


Scripts will break. Strange outputs will appear. Devops will rend their gowns and douse their heads with ashes. Patches will arrive, and years later someone will make a joke about fgrep like the jokes about ed, the GNU ed line editor.


Hopefully Devops will be able to make use of nice tooling like linters (e.g. shellcheck for bash) which can hope to catch such problems.


Why aren't these split into separate packages? If a distro wants to drop them, they can still be installed. If some distro wants to include them with a warning that's also fair game.


>Why aren't these split into separate packages?

Because they're the same thing. What this is all about is the `xgrep` commands being symlinks to `grep`. Though I guess you can have packages that just add the symlink.

>If some distro wants to include them with a warning that's also fair game.

Some distros already do what is recommended in the release notes. Rather than being symlinks, they're wrapper scripts. E.g. in Nix the `fgrep` and `egrep` are just `exec ${nixpkgs.gnugrep}/bin/grep -F "$@"` and `-E` respectively.


GNU egrep and fgrep are wrapper scripts (and have been for 10 years or more). Wrapper scripts that now warn you not to use them.

https://git.savannah.gnu.org/cgit/grep.git/tree/src/egrep.sh


This should be the top comment lol.

Such an insanity...

They are wrapper scripts that exist precisely to let you follow the advice the warning is giving, yet the warning ends up advising against using the wrapper...


I cannot edit anymore, so I add this: This is the case in a distribution (Arch) that has effectively taken the maintainer's advice into account, not the reverse.

So I was wrong, and let's not be unfair to the maintainer.


True, the symlink part is wrong. Guess I should've checked the code. But in similar vein, a distro can just have packages adding the scripts.

edit: But it was not always wrong, just terribly outdated. Searching the log I found the commit [5cb71b0] with the message:

  Add patch from
  Paul Eggert <> to comply with ridiculous
  guidelines (don't act differently if invoked as egrep or fgrep)
which made the change from creating symlinks to creating scripts. The code continued to adjust behavior according to the filename (in contrast to what someone would expect based on the commit message). Then a few years afterwards, in [d25bebd], the scripts and the symlink behavior were dropped for actual binaries, with this in-source comment:

  /* We build specialized legacy "egrep" and "fgrep" programs.
     No program adjusts its behavior according to its argv[0].
     No scripts are provided as an alternative.  Distributors
     are free to do otherwise, but it is their burden to do so.  */
It also funnily added the following prints, quite similar to what they're doing now:

  Invocation as `egrep' is deprecated; use `grep -E' instead.

  Invocation as `fgrep' is deprecated; use `grep -F' instead.
The scripts returned about a decade later (that is, a few years ago) in [b639643]. The commit message mentioned the reasoning:

  Although egrep's and fgrep's switch from shell scripts to
  executables may have made sense in 2005, it complicated
  maintenance and recently has caused subtle performance bugs.
  Go back to the old way of doing things, as it's simpler and more
  easily separated from the mainstream implementation.  This should
  be good enough nowadays, as POSIX has withdrawn egrep/fgrep and
  portable applications should be using -E/-F anyway.
[5cb71b0]: https://git.savannah.gnu.org/cgit/grep.git/commit/?id=5cb71b...

[d25bebd]: https://git.savannah.gnu.org/cgit/grep.git/commit/?id=d25beb...

[b639643]: https://git.savannah.gnu.org/cgit/grep.git/commit/?id=b63964...


I'm on ripgrep (Rust, very fast). Thanks BurntSushi!


I completely agree that rg/ripgrep is a way better alternative to most grep implementations, including GNU grep, not only in terms of features

check https://beyondgrep.com/feature-comparison/ for a detailed comparison

but also in terms of speed, plus the fact that ripgrep can grep inside compressed files (with -z), handle many different encodings, and more!

This software is really a must-have for anyone who spends some time on the CLI


I also use ripgrep, and swear by it. It has many of the features previously only found in ag or git grep (parallelism, respect .ignore files, search hidden files, etc).

It's old, but here's a feature comparison of ag, git grep, ripgrep and others:

https://beyondgrep.com/feature-comparison/


The GNU tools and programs are, theoretically, created for the GNU system, which has evidently chosen to (eventually) not provide fgrep and egrep, as they are not part of the POSIX standard. This is the GNU project’s choice to do. Other operating system projects, like Debian GNU/Linux, who use the GNU tools to provide their operating system, might choose otherwise and (separately) provide the fgrep and egrep tools. This would then be their choice (and one that I personally expect them to make).

I.e. if you don’t run straight GNU as your OS, don’t complain about this; instead, object if the operating system you actually use or depend on chooses to break behavior and interfaces which your programs rely on.


It is depressing, even sickening how common such exercises in tunnel-vision vigilantism occur among software engineers. The same lack of understanding of downstream impacts is found in library maintainers that use auto-updating transitive dependencies, for example.


" there's a difference between avoiding fossilization and the kind of minimal, mathematical purity that we see GNU Grep trying to impose here. Unix has long since passed the point where it had that sort of minimalism in the standard commands. Modern Unix has all sorts of duplications and flourishes that aren't strictly necessary, and for good reasons."

Well, really, the type of purity many of the original Unix hands desired was that each tool did one thing well, not that one tool did everything well. Some folks would say that putting the -E and -F flags in GNU grep in the first place instead of using egrep and fgrep was the wrong direction. Now GNU is wanting to further consolidate the hydra.


BTW I ONLY use egrep and fgrep !!!

- egrep means [0-9]+ works like in Perl/Python/JS/PCRE/every language, not [0-9]\+ like GNU grep.

- egrep syntax is consistent with bash [[ $x =~ $pat ]], awk, and sed --regexp-extended (GNU extension)

- These are POSIX extended regular expressions. Awk uses them too.

- `fgrep` is useful when I want to search for source code containing operators, without worrying about escaping

This style makes regular expressions and grep a lot easier to remember! I want to remember 2 regex syntaxes (shell and every language), not 3 (grep, awk, every language) !

This change should be reverted; there is no point to needless breakage

Again, you wouldn’t remove grep --long-flags because it’s not POSIX
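The escaping difference described above, side by side (assuming GNU grep, where BRE supports the `\+` extension):

```shell
# BRE (plain grep): '+' is literal unless backslash-escaped
printf '123\n' | grep    '[0-9]\+'   # prints 123
# ERE (grep -E / egrep): '+' works as in most languages
printf '123\n' | grep -E '[0-9]+'    # prints 123
```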


You may also be interested in `grep -P`, where `P` is short for `Please just use PCRE like everything else in this century`. I don't think any of the others support that though.


Sounds like a non-story to me. fgrep and egrep have been deprecated for nearly fifteen years and anyone who wants the old behaviour has only to create a couple of simple script files.

Any distro that wants to can easily maintain such things.


I've started to wean myself off of `fgrep`, but only because I am using `zgrep -F` as I'm often searching through logs, which may or may not be gzipped.

But this is a stupid change.
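For reference, the `zgrep -F` pattern mentioned above works on gzipped files transparently (a sketch assuming the gzip package's `zgrep` is installed):

```shell
printf 'needle here\n' > log.txt
gzip -c log.txt > log.gz     # make a compressed copy

zgrep -F 'needle' log.gz     # searches inside the gzipped file
rm -f log.txt log.gz
```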


This only affects Linux. All the old AIX/SunOS/Solaris/HP-UX boxes will never get this update, so luckily it should have zero effect on older platforms.

That makes me feel better, that all the old crap that I've forgotten about that might be running in production somewhere will be fine.

That said, this is a ridiculous move. The only practical effect of this change is to potentially break millions of scripts, all for an ideal that, to be frank, nobody gives a shit about.


Tool devs should get to decide on functionality - and then it's up to the distro maintainers to make aliases and organize well into their distros. Here it seems that the former are encroaching on the latter, for historical reasons?

By the way:

    ~$ bgrep
    Command 'bgrep' not found, but there are 20 similar ones.
There's not many [a-z]grep's left in the alphabet, reserve yours now :-)


Old software remains stable and usable. Updates break software, remove features, and introduce new exploits. I'll stick with the old reliable.


What’s wrong with removing the commands and adding shell aliases instead? That sounds perfectly reasonable to me.

eg. In bash, they can be expressed as:

    alias fgrep='grep -F'
    alias egrep='grep -E'

This sounds like a push for purity - similar to what happened in Python 3 with the move from print “xyz” (special keyword) to print(“xyz”) (standard function).

The new function requires three additional keystrokes every time it is used.


It breaks my script that calls /usr/bin/fgrep


Unnecessary potentially breaking change. I'll rest my case.


The article and very few comments acknowledge that this is being done to facilitate removal of these binaries. Assuming the binaries are going away, is it better to add a warning, or to surprise people by just removing them?

In general, there’s a good, legitimate, hard question here about how to handle removing things from our software. It needs to be done, probably more often than we do it, and it’s hard enough for software writers and maintainers to bring themselves to remove things. Are there better strategies? What more can we do to make removing features less painful, beyond publishing the deprecation schedule, adding a warning in advance, and then removing them after the schedule and the warning have been out for a while?


I don't think most of the comments are worrying about the difference between warning or removal. Most of the comments are about what is wrong with egrep continuing to work perfectly well as it has for a decade without warning or being removed.


I do wish that there was a standard pipeline for warnings, so stuff like this didn't have to go into error (bad) or output (worse). Powershell has such a feature but of course since it's not at the OS/Posix level then support is spotty.


Why is it bad for it to go stderr? It's supposed to be used for all kinds of app meta-output, not just errors. If you want to actually check if the command failed, that's what exit codes are for.

I did run into some Node.js code that assumed that anything printed out to stderr is a fatal error - but that's just people making wrong assumptions, not using the interface as intended.


You can technically pipe to streams other than &1 and &2, so you could use

  printf "WARNING: %s" "$warn_msg" >&3
But of course, nonstandard.
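Expanding on that: the caller (or the script itself) has to open fd 3 before the write works. A minimal sketch, with the fd number and log file name being arbitrary choices:

```shell
# POSIX shells can write to descriptors beyond 1 and 2; here fd 3 is
# used as a dedicated warning channel, separate from stdout and stderr.
warn() { printf 'WARNING: %s\n' "$1" >&3; }

exec 3>warnings.log      # open fd 3; a caller could redirect it anywhere
warn "disk is 90% full"  # goes to warnings.log, not the pipeline
exec 3>&-                # close fd 3 again
```

A caller that wants the warnings on the terminal instead could invoke the script with `3>&2`.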


Hmmm, perhaps "a program's output is part of its interface" wasn't such a good idea after all...

If programs communicated with each other using structured interfaces and not "plain text" (already a misnomer), this would be a non-issue.


I first thought about different warning that GNU grep >= 3.8 emits: about "stray \" (it's also in the changelog - https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001...). I have personally seen this one several times already and I only have grep 3.8 on my desktop, as I use Debian stable everywhere else. But this warning may prevent hidden bugs in the future.


This reminds me of the last time daylight savings time was changed (in the US). A bunch of hassle and software breakage.. for what? Just to get back to the same place we already were (working software)? What a waste.


It's probably time for the Linux world to distance itself from the trash fire that is GNU, as they keep breaking backwards compatibility. This isn't 1970, it's 2022, people! You do not just break the world for no good reason.

GNU has served its purpose, it's time to stop relying on them. It would be a lot easier to convince the BSD people to support the few GNU specific flags in user land utilities and to add GNU libc compatibility, than it would be to convince the GNU maintainers to write good software.


I'm so confused by this change. fgrep and egrep are already just shell scripts that are almost just like aliasing grep -E and grep -F. Why is the GNU team suddenly so worried about these shell scripts?

If the maintenance of these scripts is so hard. They could just deprecate them entirely. It's pretty unlikely that they would stop working or become insecure anytime soon. And sysadmins would just source in the aliases into their shell environments anyway....


Funny, I've never ever used egrep or fgrep in the 20 odd years I've been operating systems, always preferred explicit flags to aliases or argv[0] evil.


I started to get really passionately angry about this because I am an fgrep user. Then I realized I can avoid my blood pressure spiking by just running a one liner and making a few edits. The muscle memory might take a while to fix but I'll be better for it and avoid my day starting off with anger about trivial things.

sudo grep -rF 'fgrep' / 2>fgrep.rip.stderr.log | tee fgrep.rip.log

Goodbye old friend...


They (the maintainer) should make the behavior of `grep` depending on the name of the executable.

Then one could simply do a `ln -s grep egrep` and be done with it.
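A sketch of the dispatch idea — simulated here with a function that takes the invocation name as an explicit first argument, where a real wrapper would read `${0##*/}` from a single file symlinked under both names:

```shell
# Simulated argv[0] dispatch: a real implementation would case on ${0##*/}
# in one script (or binary) symlinked as both egrep and fgrep.
dispatch() {
  name=$1; shift
  case "$name" in
    egrep) grep -E "$@" ;;   # extended regexps
    fgrep) grep -F "$@" ;;   # fixed strings
    *)     grep    "$@" ;;
  esac
}

printf 'a.b\naxb\n' | dispatch fgrep 'a.b'   # literal match: only "a.b"
printf 'a.b\naxb\n' | dispatch egrep 'a.b'   # regex match: both lines
```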


That's exactly what it does already. The maintainers are objecting to the existence of the links.


Yep. Almost (it uses a wrapper script instead of a symlink), but it ends up being such a silly situation all the same...

Thanks for pointing it out though, I was not aware!


I cannot edit anymore, so I add this: This is the case in a distribution (Arch) that has effectively taken the maintainer's advice into account, not the reverse.


Seriously, what's the deal? Just use aliases instead, if you're fond of using egrep or fgrep yourself. Refactoring and re-deploying scripts should not be a big issue in any environment either these days. Admin/Devs are lazy, yes. But THAT is the least of their problems maintaining shell scripts.


> (I'm definitely not in favor of fossilizing Unix, but there's a difference between avoiding fossilization and the kind of minimal, mathematical purity that we see GNU Grep trying to impose here.

I'm not sure this can be reconciled with "printing a deprecation warning is a breaking change".


>I'm definitely not in favor of fossilizing Unix, but there's a difference between avoiding fossilization and the kind of minimal, mathematical purity that we see GNU Grep trying to impose here.

I can't really see the difference. Adding a deprecation warning to stderr in a non-POSIX command 15 years after deprecation notice is among the smallest possible changes to [GNU's Not] Unix I can think of. Even then, the change is trivially silenced by deleting a single line in a shell script or two[1] and seems that some distros already do that for you[2].

Yes, this will break something for someone[3] and I might well be that someone. You truly have my sympathies. But if you want to run a system for a decade and a half without ever needing changes, stick to POSIX. You can't have your fossils and eat them too.

[1] https://git.savannah.gnu.org/cgit/grep.git/commit/src/egrep.... [2] https://github.com/void-linux/void-packages/pull/39340 [3] https://xkcd.com/1172/


> But if you want to run a system for a decade and a half without ever needing changes, stick to POSIX. You can't have your fossils and eat them too.

… and test before making major system upgrades. For most users this won't be arriving until the next major distribution release and it seems unlikely that this will be the only visible change in such a move. If it's that big a deal, you should have a test environment, change management, dedicated admins, etc.


Tangentially, I'm still searching for a grep for Windows you can run from file explorer.

I've resorted to right-clicking a command window and using findstr. =(

I'm sure there's also a powershell equiv but I haven't taken the time to walk the PS object tree to find it.


powershell has `sls` which expands to `Select-String`


> typing 'grep -F' is two more characters, one of them shifted

Adding in more special characters is also hostile to users who aren't based in a single English-speaking country and regularly have to deal with different keyboard layouts.


While we are discussing common command aliases, could we decide to make ll (= ls -l) a standard one? I have a hard time using a shell without it, and it's annoying to create the alias on all the machines I ssh to ^^


GNU Grep: "Am I so out of touch? No, it's the users who are wrong"


I use egrep and fgrep regularly. I guess I am just getting too old for GNU.


As a software developer, I have long-since ingrained the behavior of looking up the documentation before I assume something works in a particular way. Perhaps doubly so when shell scripting, where I have to verify if I can rely on a particular package being present, or the minimum version of something I can expect, or what the output is guaranteed to be.

When I was considering using fgrep, I looked it up; lo and behold, it's not part of the POSIX standard. So I used the option instead, and have since.

I'm a little puzzled why people have problems with these little things when it feels like it's pretty common for tools to change given enough time (for example, ifconfig and netstat to ip and ss). It's just essentially an API that gradually refines over time.


It's not that common for tools that have been around for decades, and don't have system dependencies (like your network configuration examples do) to change.

egrep and fgrep are used on ad hoc command lines daily, and option alternatives are less convenient to use. I personally use 'egrep -o' and 'fgrep -f <(...)' all the time. I already find it tedious to have to type 'sed -r' and am half-considering wrapping it in an 'esed' variant.

Eliminating egrep and fgrep is change for the sake of change. Removing a couple of hard links from a few bin directories would be the total positive achievement, at the cost of years of work removing the utilities from scripts - or more likely, adding the links back, or variants thereof, like shell scripts which add -E / -F as required.
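The hypothetical `esed` mentioned above could be as small as a one-line function or wrapper script. A sketch, using `-E` (the spelling both GNU and modern BSD sed accept; GNU also spells it `-r`):

```shell
# Hypothetical "esed": sed with extended regexps enabled by default,
# analogous to what egrep is to grep.
esed() { sed -E "$@"; }

printf 'foo123\n' | esed 's/[0-9]+/N/'   # prints: fooN
```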


Just, why? Change for the sake of being noticed? Not getting enough attention at home?

Countless tools have figured out how to dispatch on argv[0] or the like - letting the name the binary is invoked under change its behavior

debian appears to be just removing the warnings:

    grep (3.8-2) unstable; urgency=low
      This Debian grep release removes the deprecation warning about egrep and
      fgrep. These alternative programs will be still shipped by Debian. Although,
      for portability reasons, users are encouraged to use grep with the concerned
      options instead of those alternative programs.


I see the problem for scripts, but for people it's just a matter of setting an alias command, isn't it?


I guess they backed out the “which” change before this so you wouldn’t see something silly like:

which egrep

And it would print

You should use command -v

egrep is being deprecated

/usr/..

Like jeez..


Just be glad they haven't handed over maintenance to the gnome project.


Soon grep will be a systemd module.


Surely you mean systemd-grepd -z nuts


ripgrep (rg) is an option. I switched to it for all my grepping needs.


If they change this, when might we see it bubble into RHEL? 10?


I like beating people in concept but not in practice.


Please put it back.


Yet another reason why shellscripts are bad huh.


How does the article lead to this conclusion?


People are upset that changing literally anything on the system like adding a deprecation warning to stderr will break their fragile shellscript setup.


Does the warning go to stdout or somewhere else?


stderr


> stderr

For now.


Clearly that will never be done. Outputting this to stdout would be ridiculous and break everyone and everything using it. Might as well just outright remove egrep and fgrep. Ridiculous comment.


alias fgrep='grep -F'


the issue is that programs rely on the existence of fgrep in its PATH. your shell alias doesn't fix that.


It is a bit hacky but you could put "grep -F" in a script file in your path.



    #!/bin/sh
    exec grep -E "$@"


Of course it can be solved; no one claimed you can't. The choice here is:

- Thousands of users have to update their scripts, habits, shell configs; or

- The GNU Grep maintainers spend essentially zero minutes "maintaining" a few lines of code to automatically use -E or -F based on argv[0].

It seems to me the second is obviously the better option.


So. Thought experiment.

Which has lower cognitive load if everyone starts doing it?

Explicitly specifying switches, or argv magic? I'd argue the switches are. In the absence of the symlinking, that is how the tool's functionality would have to be driven anyway.

Argv magic now runs into a problem if another program sharing the name ever comes into existence on the path. It's also completely unergonomic in a sense, because short of looking at the source, you have no way of knowing which argv transforms an implementation supports, and in order to use them, you must explicitly pollute the symbol namespace with a denormalized util. Also, the argv magic does require one extra shell to do the transform from !grep to grep -!, which is technically more overhead. On the other hand, fgrep and egrep are ironically easier to grep/sed for than grep -E / grep -F.

One tool, one manual, one name, one argv0.

I am not the Grand Poobah of the Internet, however, even if their hat is in my possession, so I understand that it is likely that the fgrep/egrep convention is probably deeply entrenched, and likely to spawn a new holy war on par with Emacs/Vim. Tabs/spaces, etc...


I don't think grep does any argv-stuff; I took a quick look and I don't see it.

As far as I can tell this is the entire maintenance burden:

  $ cat =egrep =fgrep
  #!/bin/sh
  exec grep -E "$@"
  #!/bin/sh
  exec grep -F "$@"


what shell does this?


Apparently zsh replaces =cmd with cmd's absolute path. TIL.


Yes, very useful to bypass aliases if need be, quickly edit a script ("vi =my-script"), etc. I don't think bash has it; you need to use where/whence/which/command/whatever (I can never remember, why are there so many?!)


Hack: \command instead of command is very unlikely to be aliased, so should work fine.


This is a very useful hack, thanks. Unfortunately, my defensive scripting instincts do not allow me to use it on the "serious" situations.


    # unalias grep && unalias egrep && cp "$(command -v grep)" "$(dirname "$(command -v grep)")/egrep"
1 line. Fixed. Done. Btw needs to be run as root (as signified by the # prompt above)

EDIT: see other comment in this thread from FreeBSD user that on FreeBSD grep, egrep, and fgrep are all separate but identical (copies) of the same file. So this isn’t quite such a silly solution as some might think.


The current contents (not joking): (this is why the situation is crazy)

  #!/bin/sh
  cmd=${0##*/}
  echo "$cmd: warning: $cmd is obsolescent; using grep -E" >&2
  exec grep -E "$@"


I cannot edit anymore, so I add this:

This is the case in a distribution (Arch) that has effectively taken the maintainer's advice into account, not the reverse.


see also:

    which
being deprecated, replaced by:

   command -v


preach!


> What's special about GNU Grep 3.8 is that its version of egrep and fgrep now print an extra message when you run them. Specifically, these messages 'warn' (ie nag) you to stop using them and use 'grep -E' and 'grep -F' instead.

Can we please stop with the make-work nonsense??

Please stop making us change scripts just because you've decided that some command shouldn't have existed that does exist. Stop it. Just stop.

See `which` deprecation. Please, no more.


    grep -P is only true way


   grep --perl-regexp --only-matching pattern some/where/file/name
aka

   grep -Po pattern some/where/file/name


Happy to have left the mess that is grep, awk, bash, zsh, fish, etc. behind and just use Python for scripting.


Python has its own problems. Not the least of which is that it's hundreds of times slower than coreutils.

I'll take learning Shell and a handful of coreutils, over praying the sysadmin has successfully navigated/allowed me to navigate all the Python footguns.

Also, good luck doing anything embedded where space is at a premium. Not all base systems have/need Python. Everything needs a shell, however.


Hehe, wasn’t the Python 2 to 3 incompatibility debacle a thousand times worse than [fe]grep? Python 2 is still lingering in places even after it was officially retired.


I like hand saws and I like circular saws/Skil-saws. I use them for different things.


The blog makes two main points: 1) adding new error messages causes compatibility problems; 2) some people are used to typing "egrep".

On the first point the author only gives hypothetical examples. I feel the argument might have been more compelling if we could see some concrete examples of things that break with GNU Grep 3.8.

As for the second point, I find it less convincing than the first one. If it's just the muscle memory then an alias would be an acceptable workaround. And I doubt that "everyone in the world" would want such an alias, as the author suggests.


It seems pretty simple, piping bash commands into other bash commands and other text stream juggling is a pretty typical use of these commands and so changing what stream is output can change the behavior of consumers of the output of these functions.

I haven’t done anything with fgrep and egrep before but piping grep into another grep for more complex classes of text search is something i use a lot.


It's more than likely the warning will be printed to stderr, not out, so there will be no impact on the actual work done.


Automatically monitoring the stderr from cron jobs for unusual outputs is a prudent measure, and it's plausible that this change will increase the burden of false positives (it certainly will not reduce it).


But if you’re monitoring the output it usually means you are in a position to fix problems which means you can likely update the script in question to use the new warning-less invocation.


If I had been woken up in the middle of the night or had a vacation interrupted on account of this, I would not be entertaining warm and grateful thoughts toward whoever thought it was a good idea.


If you're being woken in the middle of the night over this then your testing infrastructure is crap.

This is something that should be caught before it gets to that point SPECIFICALLY so you aren't getting woken up in the middle of the night.


All the replies so far are missing the point: it is prudent to monitor for unusual events, including previously-unseen messages, over and above the explicit handling of specific errors. It is also prudent to not wait until morning to investigate.

Your infrastructure probably is crap, as very few people get to build it themselves from scratch. That does not mean one should cheerfully accept additional unnecessary or pedantic complications.

It would also be prudent to investigate each and every change to any of the software you use, in order to anticipate problems, but unnecessary and pedantic changes increase the burden there, as well.


Wouldn't you test before upgrading packages in production? And usually you'd want to schedule any upgrades so that the next day or two has coverage from someone who can deal with any issues that arise.


I have yet to work at a place that didn’t have systems running mission-critical shell scripts with little to no SDLC on boxes that got periodic “yum update -y”s. There seems to be a difference in oversight of software we write & “the operating system”.

Should we do better? Absolutely! Will this burn people if vendors don’t take care? Also absolutely!


What if a cron.monthly or cron.weekly script calls egrep? Congrats now you get a lot of noise from cron stderr emails in the distant future.


I don't know if I have sympathy for this argument.

Your script ostensibly handles (at least logs) errors and warnings right? Do you exhaustively handle every single error and warning in a unique and different way or do you have a catchall "If non-0 return code then fail"? How does introducing new output to stderr affect that?
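For the catch-all style, grep's documented exit codes (0 = match, 1 = no match, 2 = error) are the stable interface, independent of anything printed to stderr. A sketch:

```shell
# Branch on the exit code rather than on stderr being empty;
# stderr output alone does not make the invocation a failure.
if printf 'hello\n' | grep -E 'hel' >/dev/null 2>&1; then
  echo "found"
else
  echo "not found or error"
fi
```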


hence why I wrote "actual work done".

your scripts will continue to produce the expected output. the side effects, otoh, will change indeed.


The simple fact that you have to wonder about that question is the failure.

Everything about this, even this comment I'm writing right now, is a waste of time.


Can confirm:

    $ egrep '.' < <(grep --version) > /dev/null
    egrep: warning: egrep is obsolescent; using grep -E


It's not unusual in shell scripts to combine stderr with stdout by using "2>&1" or similar.
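A sketch of how that bites, simulating the nagging egrep with a stand-in function (the warning text mirrors grep 3.8's, but the function itself is just an illustration):

```shell
# Stand-in for GNU grep 3.8's egrep wrapper: nags on stderr, then matches.
egrep_sim() {
  echo 'egrep: warning: egrep is obsolescent; using grep -E' >&2
  grep -E "$@"
}

# Streams kept separate: downstream sees only the two matching lines.
printf 'alpha\nbeta\n' | egrep_sim 'a' 2>/dev/null | wc -l    # 2

# Streams merged with 2>&1: the warning becomes a third "data" line.
printf 'alpha\nbeta\n' | { egrep_sim 'a' 2>&1; } | wc -l      # 3
```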


It is, however, very unusual to do so and then try to parse the output. Aside from compilers, what other CLI tools make any guarantees wrt what they print to stderr?


With regard to the first point, the examples may be hypothetical, but they are also very plausible.

When a change has little or no objective benefit, I feel the burden of demonstrating that it is harmless falls on those making the change.

As has been pointed out elsewhere, this is free software and the maintainers are free to do whatever they like. That does not stop others having an opinion about it, especially when it is in the form of constructive criticism.


Sure, but it would still be nice to have at least one such example. Looking at the rest of the discussion thread here on HN as of now it's still only hypotheticals.


Here's one example. This is in code my team inherited a long time ago, and there are many more like it.

        databases=`find /var/lib/mysql -type d | sed 's/\/var\/lib\/mysql\///g' | egrep -v 'mysql|test|performance|schema'`


That doesn't do anything with stderr, so it doesn't break.


It does output to STDERR an extra warning.


But that doesn't actually break your script, because backticks only capture stdout.


But it changes the behavior of the script in the UI. It can cause things like cron to send mail. It can cause other things wrapped around the script that are capturing both STDOUT and STDERR from the script to capture extra content. Any tool that's monitoring STDERR and expecting it to be empty may consider that an erroneous run, which may impact other scripted decisions. It's a breaking change in multiple circumstances, even if you don't consider extraneous warnings shown to a user manually running a script a breaking change.

Does that code look like something you'd log into a system and manually run on a regular basis? Does it maybe instead look like one layer of a legacy automation stack absorbed into other tools?


It would be even nicer to see convincing evidence that it is not going to be a problem.


You can’t prove a negative.


There is no largest prime.


Using that model, we can prove conclusively that, since a behavior has changed (a warning printed), it might cause problems. Therefore, we cannot prove that it cannot cause problems. What we would really like, though, is an actual problem shown to exist. Just like in mathematics; it’s one thing to prove that it’s impossible to prove something could not exist, but another thing entirely to show it existing.


You are overlooking something here: I never said anything about proof. I explicitly wrote 'convincing evidence' because proof is too demanding!

It's rather amusing how you have flipped from saying "you can't prove a negative" to an argument for the certainty of observable effects and the probability of consequences! (you wrote might cause problems, but everyone can see that's an unrealistic understatement of the implications of the argument you are using.)

The NASA managers prior to the Challenger crash thought that what they really wanted was something showing them an actual problem existed. Erring on the side of caution is generally prudent, even in relatively small matters.


> everyone can see that's an unrealistic understatement of the implications of the argument you are using

If nobody can show an actual existing problem, or even an example of reasonable code someone could have written which would be impacted by the printed warning, then yes, I would think that I was charitable when I wrote “might cause problems”.


What does 'charitable' mean here? Generous towards what person or point of view? As you say you have conclusively proved that there is a non-zero probability of there being problems, the use of 'might' is already trying to persuade that this possibility is next to zero.


I thought I was being charitable when acknowledging that there might be a chance of a problem, when I in fact believe there not to be one.


If you were presented with an actual case, would you change your mind over whether introducing this warning is advisable?


Yes, of course. If the case is reasonably likely to occur in the real world and have real world impact, that is. And I would assume that the GNU grep developers would agree with me.


I think that is a very reasonable position to hold. It would only be the rejection of plausible cases, on the basis of no actual case having been uncovered, that I would take issue with. When assessing the downside of a proposal, there should not be much, if any, difference between how highly plausible and certain consequences are assessed.

If we could prove there would be no downside, then plausible problems could be ignored as merely hypothetical, and there would be no need to posit offsetting benefits. The question the GNU grep developers might want to consider is whether the supposed upside will have sufficient material consequences for their purposes.


> On the first point the author only gives hypothetical examples

I have scripts everywhere, some of them 20 years old or more, that use fgrep. For years and years it was a "best practice" thing if you were checking for a fixed string (so that you didn't accidentally match on a "." or whatever by forgetting it was a regex).


You have two options: The first option is to simply replace "fgrep" with "grep -F" everywhere in all your scripts, which is correct but is more work than your other option, which is to add your own "fgrep" script somewhere in your path.

Any of these options seem reasonable to me.
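The second option can be a couple of lines of one-time setup. A sketch, where the directory and PATH placement are illustrative choices:

```shell
# Install a warning-free fgrep ahead of the system one in PATH.
mkdir -p "$HOME/bin"
cat > "$HOME/bin/fgrep" <<'EOF'
#!/bin/sh
exec grep -F "$@"
EOF
chmod +x "$HOME/bin/fgrep"
# then make sure it wins the PATH lookup, e.g. in ~/.profile:
# PATH="$HOME/bin:$PATH"
```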


OK, are you going to call up all my former employers to tell them to audit the scripts I wrote for them in the late 90's?

I really don't think people understand the impact here. It's not it's just a bunch of angry geriatric graybeards yelling at the modern world. It's that there is decades of uncounted, unrecognized, untraceable software written using these old conventions that are suddenly changing.

It's just a terrible idea. Linux cares about conforming to syscall interfaces for binaries compiled 20 years ago, but somehow you think it's OK to break scripts that have worked fine for 50 (fifty!) years?


Either a system is frozen and static, in which case it will not receive this version of GNU grep, and there is no problem. On the other hand, if a system receives updates, the system needs both minor and major changes all the time, to keep up with its ever-changing environment. This is the jungle in which we live. Linux syscalls are important to keep, since it’s hard to change a compiled binary. But it’s easy to change a shell script.

And don’t exaggerate. This won’t, in all likelihood, “break” your scripts.


> This won’t, in all likelihood, “break” your scripts.

Previously:

> The first option is to simply replace "fgrep" with "grep -F" everywhere in all your scripts, which is correct but is more work than your other option, which is to add your own "fgrep" script somewhere in your path.

    script.sh: line 100: fgrep: command not found
seems like evidence of a broken script to me. The fact that it can be fixed doesn't make it not broken.


We were discussing the warning printed by fgrep and egrep, not their removal. I was suggesting that you put a version of fgrep/egrep in your path which does not show the warning.


That wasn't my take. I thought we were discussing the deprecation itself. It's true that nothing is broken yet[1]. But it's clear that "broken" is where we're going, and I don't think you or the GNU maintainers have thought through the implications correctly.

[1] At least, nothing that isn't sensitive to junk on stderr -- bourne scripts are not always as tolerant as you'd like.


We were explicitly discussing GNU grep 3.8, which does not remove anything, only adds warnings. And the remote possibility of breakage due to warnings is why I qualified my assertion with “in all likelihood”.


The shell happens to be a programming tool and not just an interactive command interpreter. There are whole books on how to write shell scripts portably across various Unix-y platforms. Many of the examples in those books will now throw warnings on systems using the GNU tools.


If a book on writing portable shell scripts across *nix platforms depends on commands not specified by POSIX, I don't think those books are doing their job very well.


POSIX.2 wasn't a standard until 1992. Perhaps it's the standard at fault for specifying the -E and -F flags rather than specifying the tools that existed in v7 in 1979. Authors and publishers didn't just pause for 13 years to wait and see.



