I was working in an organization that developed a big network switch, with a large C++ application running on it whose non-recursive Makefile took 30 seconds just to load and parse all of the include makefiles throughout the tree, before actually building anything.
Half a minute of waiting just for the second it then takes to recompile a single .cpp into a single .o and link everything.
I got tired and added a "make --dump" option which used the GNU Emacs undump code to dump an image of make with all the rules loaded from the Makefile. Then "make --restart" would instantly fire off the incremental rebuilds. (Of course, any changes to the makefiles or generated dependency makefiles required a new --dump to be taken to have an accurate rule tree.)
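Day-to-day use looked roughly like this (the --dump/--restart flags existed only in that local patch, so treat this purely as an illustration):

make --dump       # pay the 30-second parse once and write out the dumped image
make --restart    # skip the parse, go straight to the incremental rebuild
# after editing any makefile or regenerating dependency fragments, take a fresh dump:
make --dump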
Another idea would be just to add a darn REPL to make, so you can keep it running and just re-evaluate the rule tree.
Isn't the ninja build system just a faster replacement for make? I use CMake primarily; it lets me create makefiles, ninja files, or VS projects. I noticed that in the general case ninja is 5% to 10% faster than make on builds that take more than a few seconds.
Then I found that if I did serious file manipulation at build time, like copying trees of files dependent on other things in the build, I could have tens of thousands of targets, usually one per file. Ninja might hiccup for a fraction of a second on these shenanigans, but make often sits and spins for 20 minutes or more.
Unless you want to write the makefile yourself, why not use ninja?
Because then I have to ask users to install ninja before they can build my program. In the projects I'm working on, I don't have any of the issues that ninja solves.
You can generate both. That gives you ninja for development (which is where the problem you described shows up), while the official builds can still use the makefiles, which should result in the same output.
That means that a user who wants to patch the build rules has to have the generator, and learn the generating language instead of the Make language he or she already knows.
Autoconf has this disease. You can build from an official tarball, but touch anything (or use a git checkout) and you need auto-this to generate auto-that. Not just any auto-this, but a specific version, that is seven releases behind current, or else three releases ahead of what your distro provides.
> should result in the same output
It should; but someone has to ensure that it does. That's just another unnecessary concern that doesn't actually have anything to do with the functionality of whatever is being built. We would like to spend our QA cycles validating the program, not three ways of building it.
Best to have just one way to build, and don't require users to install extraneous tools.
You've confused autoconf with automake. The output from autoconf is just a list of variables that are sourced by your handwritten Makefile, which you can supply yourself if you don't feel like executing autoconf.
It's automake that writes your Makefile for you, but you can just skip using that. E.g. the Git project uses autoconf optionally but not automake.
I know the difference between autoconf and its generated ./configure target.
Your comment overall indicated to me that you were talking about automake, not autoconf. But if not, fair enough.
E.g. you talk about "learn the generating language instead of the Make language". I know you were using that as an example, but there's no general non-horrible replacement for autoconf that you can write by hand, as opposed to automake, where you can write a portable Makefile yourself.
You can of course write a bunch of ad-hoc shell scripts and C test programs to probe your system, but this is going to be a lot nastier and buggier than just using autoconf to achieve the same goal.
You also don't generally need autoconf to build projects you clone from source control in the same way that you need automake (because that actually makes the Makefile).
The output of autoconf is generally just a file full of variables the Makefile includes; if you don't have that file, you can just manually specify anything that differs from the Makefile defaults, e.g. NO_IPV6=YesPlease or whatever.
The Git project, whose autoconf recipe I've contributed to, is a good example of this. You can "git clone" it and just issue "make" and it works, but if the default config doesn't work then "make configure && make" generally solves it, but you can also just e.g. do "make NO_IPV6=YesPlease" if it was lack of IPv6 that was causing the compilation failure. It'll then get your NO_IPV6 variable from the command-line instead of from the ./configure generated config.mak.autogen.
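To make the escalation concrete (just mirroring the commands named in the comment above):

make                       # the Makefile defaults work for most clones
make configure && make     # fall back to the autoconf-generated ./configure if they don't
make NO_IPV6=YesPlease     # or override the single knob that's breaking the build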
I interpret this as saying that autoconf takes configure.ac as input, and produces a runnable 'configure' script as output. But you are saying that "the output of autoconf is generally just file full of variables the Makefile includes". How can these both be true?
I was using "autoconf" to mean both the software itself and all its output, including the generated configure script.
Confusingly, sorry about that, but for the purposes of discussing what software you need to generate the configure variables you ultimately need when cloning from source control, it makes no difference.
At first I thought I would have a hard time searching for it with such a simple name, but it clearly is a Haskell tool.
This looks like a reasonable alternative for people who control the whole lifetime of their code, like server software developers. But for someone shipping a library, or anything they expect someone else to build, what advantages does this have over Make in the *nix world or CMake in general?
I wonder if most of those include files were the files auto-generated by the C pre-processor to track header file dependencies. In a toy benchmark[1] GNU Make spent 98% of its time processing those include files (for a no-op build). Ninja has a special optimisation for these files, where it reads them the first time they're created, inserts the dependency information into a binary database, then deletes them so it doesn't have to parse them in future invocations. AFAICT this accounts for most of Ninja's speed improvement over Make.
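For reference, those per-object include files usually come from the compiler itself; here is a rough sketch of the mechanism and of how the two tools consume it (GCC/Clang flags assumed):

cc -MMD -MF foo.d -c foo.c -o foo.o   # emits foo.d next to foo.o
cat foo.d                             # "foo.o: foo.c foo.h bar.h ..." in Make syntax
# GNU Make typically re-reads every one of these .d files on every run (-include *.d),
# while Ninja (with deps = gcc) parses each one once, folds it into its binary
# .ninja_deps log, deletes the original, and afterwards only consults the log:
ninja -t deps foo.o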
Well, it's GPL licensed, which means they'd have to provide source code to anyone they give the software to. The only question is whether they're allowed to distribute it to anyone outside of their company.
Not contradicting you, but just posting to invite someone to correct me if I'm wrong. If I understand correctly, the GPL specifically grants a license to distribute to anyone you please. However, in the case where the software was originally distributed to a company, an individual working at the company does not have a license (because it wasn't distributed to them specifically) and can not redistribute the software. Only the company can redistribute the software (though, having done so, they can't stop a recipient from redistributing the software).
It's one of the really subtle points of the GPL and easy to get wrong (which is why I'm inviting people to correct my interpretation ;-) ). I often wonder whether the AGPL would work the same way, because its "distribute" clause is awfully vague, in my interpretation. Just giving me access to the software over a network seems to require giving a license. So if they give me access as an employee, I should also get a license... maybe...
OK, if you promise to stay off my lawn, I'll explain the history behind undump. Back in the 70's, the big CS departments typically had DEC 36-bit mainframes (PDP-10, PDP-20) running the Tops10/Tops20/Tenex/Waits/Sail family of operating systems. These are what Knuth used to do all of TeX, McCarthy LISP, and Stallman and Steele EMACS. Not Unix; and Linus hadn't touched a computer yet.
Executable program files were not much more than memory images; to run a program, the OS pretty much just mapped the executable image into your address space and jumped to the start. But when the program stopped, your entire state was still there, sitting in your address space. If the program had stopped due to a crash of some sort, or if it had been in an infinite loop and you had hit control-C to interrupt it, the program was still sitting there, even though you were staring at the command prompt. And the OS had a basic debugging capability built-in, so you could simply start snooping around at the memory state of the halted program. You could continue a suspended program, or you could even restart it without the OS having to reload it from disk. It was kind of a work-space model.
Translating into Linux-ish, it's as if you always used control-Z instead of control-C, and the exit() system call also behaved like control-Z; and gdb was a builtin function of the shell that you could invoke no matter how your program happened to have been paused, and it worked on the current paused process rather than a core file (which didn't exist).
The OS also had a built-in command to allow you to SAVE the current memory image back into a new executable file. There wasn't much to this command, either, since executables weren't much more than a memory image to begin with. So, the equivalent of dump/undump was really just built into the OS, and wasn't considered any big deal or super-special feature. Of course, all language runtimes knew all about this, so they were always written to understand as a matter of course that they had to be able to deal with it properly. It pretty much came naturally if you were used to that environment, and wasn't a burden.
Thus, when TeX (and, I presume, the various Lisps and Emacsen that were birthed on these machines) was designed, it was completely expected that it would work this way. Cycles were expensive, as was IO; so in TeX's case, for example, it took many seconds to read in the basic macro package and standard set of font metric files and to preprocess the hyphenation patterns into their data structure. By doing a SAVE of the resulting preloaded executable once during installation, everyone then saved these many seconds each time they ran TeX. But when TeX was ported over to Unix (and then Linux), it came as a bit of a surprise that the model was different, that there was no convenient, predefined way to get this functionality, and that the runtimes weren't typically set up to make it easy to do. The undump stuff was created to deal with it, but it was never pretty, since it was bolted on. And many of us from those days wonder why there's still no good solution in the *nix world, when there are still plenty of programs that take too damn long to start up.
Emacs is my primary editor; however, the Emacs dumper has always been a dumpster fire.
Basically they coded up so much ill-conceived and inefficient Emacs Lisp that the editor would never start up in an acceptable time. Instead of engineering around this (lazy loading services, fixing things, not doing things nobody needs) they had the great idea they could start the editor once and then core dump the in-memory state of a running editor. Then on later editor starts they would map-in the core dump and instantly be in a (somewhat) good state. Fails all kinds of smell tests and really speaks to bad taste having unbounded consequences. It is an idea that should not work and it is only happenstance that it ever worked (and it gets harder and harder as we have things like address space layout randomization, file handles, and so on).
[edited "file" -> "fire", sorry! And yes I know lispers always dumped, but they are dumping the C memory environment here- not just their precious Lisp state. They should have had some appreciation for how the C environment actually worked since they decided to use it.]
They didn't have this great idea; it came from the Lisp culture. Lisp implementations have the ability to save images. It's not described in ANSI CL, but all the "industrial strength" Lisp have this, and it is one of the means for application delivery.
If you're working on an interpreted Lisp which is getting slow to load due to its growing library, image dumping and restarting is the obvious "off the shelf" approach you already know about from being a Lisp programmer.
> Instead of engineering around this (lazy loading services
I'd be surprised if Emacs didn't have lazy loading too. Lazy loading only goes so far. Sometimes lazy loading brings in a lot of dependencies, so it is slow for those users who use whatever is being lazily loaded.
Lazy loading is a hack to work around the lack of a fast image restore.
The thing that is hacky about Emacs image saving is that it saves the C state. It's not exactly a Lisp image saving mechanism.
This is why I was able to port this dump/undump very easily to GNU Make, which doesn't share any data structures with Emacs, let alone any Lisp stuff. Dumping the image worked just fine for the C data structures on GNU Make's heap.
It's because it's such a low-level mechanism that they are having difficulties with its portability.
I don't hear about, for instance, the CLISP project having issues with the EXT:SAVEINITMEM function due to some GLibc support changing; it doesn't rely on that.
> I'd be surprised if Emacs didn't have lazy loading too.
Not sure if this is exactly what you're referring to, but Emacs has the autoload [1] feature, which lets you declare some functions and in which elisp file to find them. Emacs won't load that file until the function is actually called. This is usually for user-called functions. Most packages use it.
> It is an idea that should not work and it is only happenstance that it ever worked
In the 1980s it was a surprisingly common approach. For example most Unix variants had an undump command to turn a core dump into an executable. See https://www.ctan.org/tex-archive/obsolete/support/undump for example. It was also how Microsoft Office file formats were implemented until about a decade ago.
With the hindsight of experience, it is easy to say that it is a bad way to go and won't age well. But back when Emacs went this way, the problems were not as widely understood as they are now.
IIRC, "most UNIX variants" didn't have an undump command. It primarily came with a few huge applications that played these type of games. The two worst offenders that I remember were in fact GNU Emacs and TeX! A few other programs that I dealt with (porting to a new platform) which depended upon a similar hack were Sendmail and Franz Lisp.
The Sendmail "WIZ" SMTP bug was in fact caused by its use of reloading the frozen configuration file (the pointer to the password was saved in the data section, not the BSS region that was reloaded from the frozen config, so that if you used a frozen config you effectively had an empty password).
My impression comes from the comments in the undump link that I provided.
To your list of other programs you can add Perl. It comes with a dump command to take a core dump that can be turned into an executable. I never used it as anything other than a joke, but I do know several people who independently used that feature in production back in the 1990s.
Now that we do have the hindsight of experience, and it is time to make that decision again, why are the maintainers going with the clearly smellier approach?
They have an immediate problem (things are slow/broken on newer glibc) that needs fixing.
Someone has posted sane code to fix the issue in a similar fashion to what they were doing before but without needing to reach into the memory manager.
The maintainer who is mad wants it fixed right (replace the whole thing and make the elisp load fast) but so far neither they nor anyone else seems to have posted code to do that.
Bird in the hand. I would hope they'd just remove the need for the mechanism totally in the future, but for now.... bird in the hand.
Make elisp faster. Easier said than done. Any easy wins already happened.
Introduce various lazy loading mechanisms and switch to that by default. That doesn't help any third party plugins that people might use. In fact any such work could create ways of breaking the expectations of said plugins!
Identify/eliminate work spent on dead code paths in initial load. Again, easier said than done. And there are bound to be a lot of, "I don't see what that is useful for, but what third party plugins might need it?"
In the meantime, end users faced with the choice of a slow new release versus a fast old one, will resist upgrading. And if there is a patch that simply makes things fast, that is generally acceptable.
The current moment provides pressure to do things right. If the cheap fix is taken, there are no guarantees on when there will be another opportunity like this one. But pressures exist to take the easy out right now. And if taken, it won't be easier next time either.
The first time you start Firefox, it reads in a whole mess of xul, html, css, and javascript. It parses it all and then dumps it to a binary file that can just be mmap'd back in next time. This is exactly like emacs, except that emacs also dumps out all of its own internal state including the state of the allocator; that makes it more fragile.
This may seem strange to you, but this is actually the way many lisp systems (and other image-based development systems, like Smalltalk) work.
I build software using the commercial develop environment Allegro Common Lisp and an image dump is precisely the way I deliver software. The ultimate deliverable is an executable that loads the image and launches the main execution thread.
This works really well for us. To load up our web server takes a while due to having to load up caches etc.. so we tend to do all that once, dump the image and deploy it - cache already primed.
It had an even bigger advantage for me today. Due to some git shenanigans I somehow managed to wipe out a few days of work. No idea how... Almost resigning myself to having to recode the whole lot I just remembered that I had built an image to test just prior to attempting to check in. Clozure CL saves the source code of each function in the image. I was able to probe the image and pull out all the source code that I had just lost, saving myself a few days of tedious recoding!
Not a recommended use of images - but it sure saved my butt today!
I actually have no idea. I had a lot of unchecked work - about a week's worth. (I know, I should not have let it go that far... it was just one of those tasks that just seemed to get bigger and bigger.) It was time to check in, so I went through file by file, staging it all (using Magit). I then decided to do some more testing. Built a lisp image. Then I became distracted, started thinking about home time, etc., so my recollection of what I did is a little hazy. I think I saw I had some changes that I didn't want to commit and test. So I think I stashed them. Then probably surfed Hacker News or something... When I got back to things I suddenly noticed that none of my changes were there any more. Nothing was staged, nothing was stashed. It was all gone. I spent a few hours looking around for it, but all to no avail.
It is the second time I have lost work by misusing git. I think the other time I was messing with rebasing and branching and all my work disappeared.
I do really need to learn how to use git properly. But in the meantime I am not taking any more chances. I have set up a zone on our internal SmartOS server which uses ZFS. An hourly cron job on my machine rsyncs my source directory to the zone and then creates a ZFS snapshot. So, hopefully no matter how badly I use git now I will never be able to lose more than about an hour's worth of work.
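For anyone wanting to copy this, the whole thing fits in one crontab entry (hostname and dataset name here are made up):

0 * * * * rsync -a --delete ~/src/ backupzone:/backup/src/ && ssh backupzone zfs snapshot backup/src@$(date +\%Y\%m\%d-\%H\%M)
# note the \% escaping: cron treats a bare % as a newline
# files can be pulled back later from /backup/src/.zfs/snapshot/<name>/ on the zone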
When you stage a file in git, what it actually does is to create in its object database a blob with the contents of the staged file, and then it points the corresponding entry in the working tree index to the blob's hash. The blob object is only removed during the periodic garbage collection, and only after a configurable period of time since its creation has passed.
Therefore, once you stage a change, git will keep a copy of it for at least two weeks unless configured otherwise. Even if nothing points to it anymore (due to, for instance, an errant git reset --hard), there are ways to find its object hash (something like git fsck --unreachable, or even a ls -lR in the object database directory), and once you have the hash, you can use git cat-file to retrieve its contents.
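In practice that recovery looks something like this (the sha is a placeholder taken from the fsck output):

git fsck --unreachable                      # list blobs/commits nothing points to anymore
git cat-file -p <blob-sha>                  # inspect a candidate blob
git cat-file -p <blob-sha> > recovered.txt  # write it back out once you've found your file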
Recovering from a lisp image is pretty darned cool. :D
> I had a lot of unchecked work - about a week's worth
> ... It is the second time I have lost work by misusing git
I realize this is _off topic_, but one thing that helps me a lot is making regular small commits (that describe some small idea of the change), and then rebase them later once I get all the linting / tests fixed. (Obviously not on master ;)) It's easy to go too-granular here, but it's very handy to be able to reorder the commits, or squash "fixup" commits, to make things easier to review.
I spend more time rebasing than I probably need to, but on the other hand I've never lost work. ;)
For example (adapted from a PR I was working on this morning ;))
git commit -m "Add Clone Foo feature"
git commit -m "Reorganize tests"
git commit -m "Fix indentation"
git commit -m "Add generator for Baz items"
git commit -m "Add test of Clone Foo feature"
git commit -m "fixup linting in application"
git commit -m "fixup test linting"
git commit -m "fix Foo test to use correct selector"
git rebase --interactive master
# reorder fixup commits, squash them together with the things they fix
You can always rebase and squash all of these down into one commit later before you make your pull request (or leave them as-is if your team is OK with unsquashed PRs).
> # reorder fixup commits, squash them together with the things they fix
The --autosquash option of git rebase can save you some work:
--autosquash, --no-autosquash
When the commit log message begins with "squash! ..." (or "fixup! ..."), and there is a commit whose title begins with the same ..., automatically modify the todo list of rebase -i so that the commit marked for squashing comes right after the commit to be modified, and change the action of the moved commit from pick to squash (or fixup). Ignores subsequent "fixup! " or "squash! " after the first, in case you referred to an earlier fixup/squash with git commit --fixup/--squash.
This option is only valid when the --interactive option is used.
If the --autosquash option is enabled by default using the configuration variable rebase.autoSquash, this option can be used to override and disable this setting.
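So the fixup commits from the earlier example could be recorded with --fixup, and the rebase then needs no manual reordering (the sha is a placeholder):

git commit --fixup=<sha-of-"Add Clone Foo feature">   # instead of "fixup linting in application"
git rebase --interactive --autosquash master
# the fixup! commit arrives pre-ordered and pre-marked "fixup" in the todo list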
(Not off-topic for me. :-)) I will have a play with all this. Now that I have my source being backed up to zfs I can be a bit more comfortable with experimenting with new workflows.
To help learn to use Git properly, start by making many frequent commits from the command line throughout the course of the day. I would say that an appropriate pace is about once every 30 minutes.
When you're done with the feature, then depending on your preference, you can either merge your branch into the master branch, or you can rebase your changes onto the master branch (my recommendation). In both approaches you may wish to squash your commits, so that you ship perhaps just one commit or fewer than your full set of working commits. Then you prepare the changes for code review and for pushing upstream.
The best way to learn about and become comfortable with a tool is to use it regularly. As you approach mastery it will become a Swiss Army knife. Frequent commits give you a lot of utility for the same reason that an editor's undo/redo buffer does, except it's persistent.
If you are worried about screwing things up by running the wrong git commands, then check out git's reflog feature, and how you can use `git reset --hard` to roll back changes via the reflog. Virtually every change you make to your local repository is versioned, and the reflog shows you that history and allows you to roll back.
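A minimal sketch of that rollback (which HEAD@{n} entry you want will vary):

git reflog                     # every position HEAD has had, newest first
git reset --hard HEAD@{2}      # roll the current branch back to where HEAD was two moves ago
# (or run `git checkout HEAD@{2}` first to look around non-destructively)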
My first step with git is always creating a new branch, then I commit multiple times a day with "wip" or minor notes as the commit message. It is purely to capture the stream-of-consciousness of the code. When it comes time to create a sensible patch I immediately branch again. The old WIP branch is not touched until I've completely merged the feature. On the new branch I usually reset then commit hunks in some way that makes sense, or if the changes are small squash everything into a single commit.
You can also do the same thing with tags if you like.
There is no penalty to branches in git... use them. Frequently.
I got into trouble at one stage mixing branching with rebasing. This really put me off branching. Creating a new branch to perform the merge is a stroke of genius - thanks! I will be doing that right away.
Thanks for the explanation! I am not a heavy user of git stash, so not sure if git reflog could have helped you, but I am grateful that somebody steered me toward git reflog early in my learning git. With reflog and frequent commits, it's very hard to lose work, short of deleting the entire project or its .git/ directory. rsync+ZFS also sounds like nice combo for running around git in order to prevent future mistakes destroying work.
> An hourly cron job on my machine rsyncs my source directory to the zone and then creates a ZFS snapshot. So, hopefully no matter how badly I use git now I will never be able to lose more than about an hour's worth of work.
This is a thread about Emacs - no need for ZFS, Emacs can already do versioned file backups for you!
That is very interesting. I do have Emacs making some sort of backup, but have never managed to get it to work reliably. I'll follow those instructions.
What was confusing for me was understanding that by default Emacs does not back up on every save. It backs up the file as it was when you first visited it (the backup is made on the first save of the session), so only the version from before you started editing is what gets backed up.
I agree that dumping an image is not a great idea in the modern world, but the concept(1) dates right back to when emacs was still based on TECO on the PDP-10:
http://pdp-10.trailing-edge.com/mit_emacs_170_teco_1220/01/i...
(and I suspect the available CPU horsepower back then was such that you pretty much needed to do tricks like this to get quick startup even if you were careful about how much lisp code you ran at startup).
(1) as a part of emacs, that is -- I'd guess that "language runtime takes an image of its pre-initialized state and then just loads that" was used in non-emacs contexts before that.
Did it dump Lisp images, or the image of the entire process (including all the non-Lisp stuff)?
It appears that all the complexity and instability here comes from the latter part. Lisp itself was always designed around this image dumping concept. But C and its standard library were not.
This is Lisp on the IBM 704/7090 and PDP-1 in 1960-1964. Ritchie would not start work on C until the early 1970s, and there was no standard C library (which, btw, is completely irrelevant to the runtime state of a process) for decades. There were no processes or virtual memory (the 7090 didn't even have overlays) or non-Lisp stuff other than the punch card/paper tape loader.
And what were the stability guarantees for such things? Could you expect to take such a dump, bring it to a different system (potentially with different hardware, different minor OS revision etc), and expect it to work?
In those days, you could pretty much never take any software to different hardware, and even different minor OS revision was asking for trouble. What you are asking for isn't something anyone expected back then. Different hardware was _really_ different.
Right, that's kinda what I was expecting. But that also explains why it was a mistake to adopt this model for Emacs, or at least to stick with it for so long (to clarify, I specifically mean full process dumps, not Lisp world dumps) - it's a different world, and it has been different for a very long time, with software and OS/hardware significantly more decoupled.
Why? A binary executable does not rely on the very fine details of the implementation of the OS heap manager, for example, while a dump necessarily would.
> A binary executable does not rely on the very fine details of the implementation of the OS heap manager, for example, while a dump necessarily would.
What are you talking about? What is it about brk(2)/sbrk(2)/mmap(2) that would be a problem?
Correct me if I'm wrong, but if you make a process dump of the entire process, that dump would include internal structures used to maintain its heap, no? If those structures change between OS versions, for example, then the process dump from an older version would represent corrupted heap when loaded on a newer version.
None of the kernel data structures are saved or even visible to the process - the kernel memory (on Linux x64 everything mapped in the upper 128TiB) is not readable by the process.
You can have a problem if the kernel system call interface changes, but that is a problem for every executable. You would need to re-compile everything on the system.
> Instead of engineering around this (lazy loading services, fixing things, not doing things nobody needs) they had the great idea they could start the editor once and then core dump the in-memory state of a running editor. Then on later editor starts they would map-in the core dump and instantly be in a (somewhat) good state. Fails all kinds of smell tests and really speaks to bad taste having unbounded consequences.
How else would you compile a dynamic programming language program into a binary executable?
BTW, something that just came to mind: this "bad taste having unbounded consequences" of copying the entire process memory is exactly how fork(2) works.
Doing something that relies on OS internals is bad taste when done in userspace.
Interface stability aside, what fork does is simpler: a copied process can rely on its environment matching its parent's enough for a large subset of the contents of the duplicated address space to retain their meaning. Serializing a process's address space's contents and then loading it into an "unrelated" process later makes it more complicated to know what memory contents can still be relied on (e.g. ASLR, file handles, etc.).
> Interface stability aside, what fork does is simpler: a copied process can rely on its environment matching its parent's enough for a large subset of the contents of the duplicated address space to retain their meaning. Serializing a process's address space's contents and then loading it into an "unrelated" process later makes it more complicated to know what memory contents can still be relied on (e.g. ASLR, file handles, etc.).
Everything you mentioned except for ASLR is a problem for forked processes, and it boils down to kernel state (file descriptors and signals being two that I can think of right now).
Yes, I remember building Emacs in the late 1980's and watching the final dump. I guess 30 years later hardware hasn't improved enough to avoid this? People are building editors in JavaScript today. Perhaps the Emacs Lisp engine needs to be tuned a little more? That Guile Emacs has been decades in the making...
Guile Emacs isn't a bottom-up rewrite of Emacs [1], but a project to replace the elisp runtime with Guile. One important thing it does not do is get rid of/convert any ELisp, because of how much is invested in it: 71% of the ~2M LOC codebase is ELisp: https://www.openhub.net/p/emacs
Guile would run the ELisp code. What's holding the project back, IIRC, is that it doesn't yet do so perfectly.
[1] which is something I (and likely most users) frequently think it sorely needs, until we realize A) the monumental size of the task and B) how much refinement the current version already has, as per http://www.joelonsoftware.com/articles/fog0000000069.html
Interesting. Any idea on why it's not got more traction? Or is it actually on track to become the canonical emacs?
Reading that LWN article, it looks like Guile wasn't a great choice from a community perspective, but more than that, by working internally to the emacs project it's subjected itself to internal standards of interoperability that it'll never really be able to hit (whereas a pseudo-hostile fork like XEmacs might have had sufficient momentum to force some accommodation).
> Reading that LWN article, it looks like Guile wasn't a great choice from a community perspective
I think going forward it will be. Between Guile-Emacs, Guix, and GNU Shepherd you have the best-supported Lisp Machine operating system analogue available right now: https://www.gnu.org/software/guix/
I am excited about GuixSD and I think a lot of other people will be.
No it wont, at least as far as Emacs is concerned.
Nobody is stepping up to do the work, plus Guile has serious bugs on Windows and OSX, single-digit number of developers and few, if any, users. So Guile Emacs is really a pipe dream that people like to bring up from time to time.
> Nobody is stepping up to do the work, plus Guile has serious bugs on Windows and OSX, single-digit number of developers and few, if any, users. So Guile Emacs is really a pipe dream that people like to bring up from time to time.
You forgot to mention that BSD is dying.
This state of affairs is different from any other Lisp implementation how? And why would it stop progress from being made?
I've been happily running SBCL and CCL on Linux and OSX for years without issues. CCL on Windows too.
Guile has 1-2 people working on it part-time, and Windows/OSX are not a priority because:
+ GNU project, duh.
+ Nobody in Guile-land cares about Windows/OSX enough to step up and fix issues.
+ Even if they did, Stallman would tell them not to.
So the state of affairs is indeed very different from the CL Lisp implementations. Not to mention, Guile has pretty much no userbase to speak of. There are commercial entities releasing products with SBCL and CCL in addition to the very healthy opensource community.
> best-supported Lisp Machine operating system analogue
Hardly. GNU is far from anything Lisp Machine. The Lisp Machine was never a system hacked up by scripts running on top of some Unix copy with a fancy name.
As I note elsethread, Guile Scheme is not a 'faster, more modern Lisp,' because Scheme is not Lisp. Scheme is a Lisp-like language, which does have a place in the world, but it's not Lisp, and it's not (IMHO) well-suited for large software projects like emacs.
I personally oppose a Scheme-based emacs because, having moved from C to a more Lisp-like language, I don't think there would be the will to actually move to real Lisp. You might accuse me of letting the best be the enemy of the good (which is a fair criticism, both of my position re. porting emacs & of Zaretskii's re. the dumper), but part of why I use emacs and Lisp is that they are not good: they are the best.
Emacs-Lisp is dynamically scoped. That's precisely the kind of mistake (yes, mistake), that hinders large scale anything.
And who cares that Scheme is not Lisp? It's close, and it's reputedly cleaner. Besides, there are many Lisp dialects, some of them just as different from one another as Scheme is from them.
> Emacs-Lisp is dynamically scoped. That's precisely the kind of mistake (yes, mistake), that hinders large scale anything.
Yup, default-dynamic scope is a mistake, although in the particular case of emacs it has made certain code very clean (and led to plenty of bugs, as well). Emacs has added lexical scoping, which is a start.
Common Lisp's lexical default and optional dynamic scoping provides the best of both worlds.
> And who cares Scheme is not Lisp? It's close, and it's reputedly cleaner.
Among other things, its continuations are semantically broken, preventing correct implementation of UNWIND-PROTECT.
It's a good language for its intended purpose (teaching & research), but the language as standardised is not well-suited for building large projects (individual implementations, of course, can be quite good — but then one enters the world of implementation-dependence).
Common Lisp is superior for large projects, in part because it's much more fully-specified, in part because that specification includes more features, in part culturally (because Lisp systems have more often tended to be large and long-lasting).
In TXR Lisp, a Common-Lisp-like language which has (delimited) continuations, I came up with a pragmatic approach w.r.t. unwind-protect which works well.
I introduced "absconding" operators which allow a context to be abandoned (via a non-local exit) without invoking unwinding. Absconding just performs the jump to the exit point, without doing any clean up along the way. That's it!
The idea is basically: why, when temporarily leaving a continuation, should we clean up resources, when we intend to resume the continuation and want those resources intact? If it should happen that we don't resume, so what; garbage collection will just have to take care of it. We wouldn't do that with threads, or coroutines, right? When a thread suspends, we don't close its files for fear that it might never wake up to use them and close them.
With this absconding, I implemented a workable yield construct. I can have code which, say, recursively traverses some structure and yields items to a controlling routine. That recursion can have exception handling and unwinding which works normally, as if the yields were not there. When the recursion performs a normal return, or a throw, all the unwinding takes place normally. The yields use absconding, and so they don't disturb anything. When a continuation is resumed, all of the dynamic handlers are in place, including the unwind cleanups that were never touched. Nothing was torn down.
Also in place are dynamically scoped variables. TXR Lisp's continuations play nicely with those also.
If you compare this brilliantly simple solution to concoctions involving dynamic-wind, dynamic-wind looks hopelessly silly, like what were they thinking when they came up with it?
Just don't unwind when you temporarily leave a context. The sky does not fall. Your CS degree is still hanging on the wall. Everything is cool.
As for the higher-level stuff, spacemacs does a great job of that. It really makes configuration files, separation of concerns, vim bindings, and many other things sensible and user friendly. It doesn't touch the lower-level code though, as it is just an elisp layer on top of emacs itself.
Interestingly Emacs is not the only project affected by this glibc/dumper dispute.
I added the ancient patches for perl to my cperl fork, to be able to dump/compile perl scripts to native binaries the fast way.
Improved dumpers are here:
https://github.com/perl11/cperl/commits/feature/gh176-unexec
Mostly unified error handling and a few Darwin segment instability fixes. It is very fragile to use with a static library, but OK as a dynamic library. Emacs uses the dumper in the main exe, not in a library.
Solaris is the easiest to use.
So I know a little bit of the troubles they are talking about here. Dan's portable dumper would be nice to have, XEmacs had this decades ago, but it never made it over to Stallman emacs. Wonder why :)
So looking at the new pdump, it really is horrible. I don't think I want to do that. I'd rather add a proper static malloc to cperl, such as ptmalloc3, which is better than glibc malloc (i.e. ptmalloc2) anyway. They never switched to the better version because it had more memory overhead. And I really can make use of the arena support there. Emacs should try the same. Much easier and much faster.
Good bye glibc.
I am actually surprised the dumper paradigm doesn't get more love. Startup time is an issue for most large programs. The dumper route is a generally applicable way to drastically improve startup time. Think of it as splitting your code into two parts: setup phase which is run at compile time (i.e. pre-dump) and run phase which runs at run time. Undumping substitutes a simple load of a file for the setup phase. What is not to love?
At one point Borland C++ precompiled header files used this approach: dump the in-memory object graph for the header file whole. Makes more sense than for a whole executable.
The complexity of that solution. By nature it is very fragile and leads to nasty bugs. But I can certainly see the benefit of this approach - it would be interesting to see it applied to some big frameworks/VMs, such as Java or .NET.
I've read something about dumping JVM state to decrease Clojure startup time (don't recall the details and I'm on mobile right now to check), so if I'm not wrong it was tried before...
Zaretskii's stance is weird. If you are going to run out of people who can work on the core of the editor's source code, then the editor will die. So the lack of ability to work on the code is the real problem. This is probably because it has accreted way too much complexity at this point, and way too many hacks. Shedding some of those hacks is a very good idea.
If you wanted to keep it the old way, and depend on the nuances of how an allocator stores memory, then ship your own allocator. Video game people do this as a matter of course; it's not a big deal.
Emacs and Vim are great tools. Amazing pieces of software. But in some areas you really see the prospect of time and aging.
I used and learned both at some point in time, liked things about both, and stuck with Vim, but I kinda felt the best text editor would be a hybrid of the two. (Now don't run and tell me to install EVIL in Emacs, I tried that, but modal editing is not the only thing that gives Vim an edge over Emacs.)
What got my attention recently is Xi (developed by Raph Levien). It is written in Rust and looks fairly interesting; I can't wait for it to reach a more advanced state. I really wouldn't mind a nice, modern terminal text editor. (I use NeoVim at the moment, and I think it is closest to that, but VimL :cringe:)
Personally, as much as various editors talk about "modernity," there isn't really one that's surpassed Emacs yet. My requirements are fairly simple:
-Text interface: I don't want to be unproductive when I'm without a GUI
-Consistent abstractions and scriptability: Your editor needs to be customizable for whatever task I may need: that means scripting anything that can be done with the UI, customizing pretty much any aspect of editor behavior, the ability to embed curses-style UIs into the editor (when in nongraphical mode), and interfacing with external programs. And yes, these are all features I use. And not just for playing tetris, either.
In order to do this, you need consistent abstractions. For instance, if every window, from your options window, to your editing window, to your tetris window, isn't treated the same, isn't fundamentally the same object type, then you've failed as an editor designer.
-Ecosystem: this one isn't so much of a requirement, but it's nice to have a good one. That means lots of plugins and other tooling.
The only editors that have these are vim, emacs, and, to a certain degree, acme.
Interesting. For me the killer feature of Emacs is its customizability. That's not quite the right word, because Emacs's integration with ELisp lets you do things that almost nothing else does. The ability to modify existing functions live is central to its power, and the boundary between "The Editor" and "Extensions" is extremely fuzzy.
Xi seems to take a harder approach to customization, which means that customizers will always be subject to the limitations of the plug-in interface. There will always be a hard boundary between "The Editor" and "Extensions", and I believe that will ultimately limit its usefulness.
I agree with you. Having tasted the emacs way, I'm constantly frustrated by (actually, I simply avoid) interactions with software that isn't so inspectable and malleable. Still, never blocking user input is a respectable goal for an editor, and a design philosophy I wish emacs would seriously pursue. I would not mind a "feature" freeze on emacs for the remainder of the decade if it meant absolutely responsive editing with asynchronous operations.
Emacs customizability is something rare. I had the pleasure to run QBASIC and Turbo Pascal 7 not long ago, and was amazed at the capabilities and speed of these old IDEs. Yet they were locked down. TP7, which is an epic[1] thing, made me feel sad, because the editing features are so basic, almost crippling (no block selection); you physically feel how you miss emacs, where anything is a few LoC away.
[1] text-based multifile editing with overlapping windows (including ASCII window shadowing), imperceptible compilation times (on a Pentium 2), and an exhaustive help system; all in 800KB.
> you physically feel how you miss emacs, where anything is a few LoC away
That is my problem with IDEs, in a nutshell. There is the running joke that Emacs is a decent operating system in want of a decent editor. The same can be said - more strongly - about Eclipse or Visual Studio.
Some things these IDEs do spectacularly well, for sure, but when it comes to basic text editing, I keep thinking how easy this or that would be in emacs. ;-|
Same, and I started in the Eclipse fad, with Eclipse plugins being a thing, before I knew how to program emacs (beyond the default config). The day I realized how general Lisp was and how dynamic emacs was, I had to pause for a minute.
Last winter I had to use Eclipse (for Scala); one day of mild use triggered nasty wrist pain (I play music, I'm used to pushing the mechanics, and this was worse). And people say emacs causes RSI ;)
Also, the Eclipse crowd is completely detached from the user side. It's all about tech. Microsoft might be better; I haven't used VS in ages. IDEA is said to be really great at ergonomics. But rarely does someone bring a lot to the table (the only recent thing I noticed was parinfer, ambitious and useful). Also, people underestimate what elisp can do when used correctly. See yasnippet, or Fuco's litable.el.
I don't even remap. I think my hands ended up developing an emacs Stockholm syndrome. Or maybe music did it before that. Still, I was surprised that Eclipse would revive such painful sensations.
Modifying existing functions sounds like a recipe for plugin incompatibilities. Vim doesn't let you modify any built-in functions but it seems to be just as powerful.
Indeed, architecturally it's just asking for trouble. However, it also lets users extend the system in ways that weren't previously planned for. Pros and cons...
> The editor should never, ever block and prevent the user from getting their work done.
A thousand times this. Emacs and vim are both guilty of failing this most basic principle. It frustrates me to no end that a text editor will hang while font locking text, or printing repl results.
A lot of what used to cause trouble for Vim was fixed with Neovim and the new async stuff Vim 8 introduced. This mainly ensures plugins don't have to block (though a lot still do).
That said, there's still parts of Vim that don't benefit from this and especially on large files can get annoying. But since switching to Neovim I've had a significant drop in the amount of times I have to wait for vim to do something and sit there screaming "oh come on" at my monitor.
Maybe eval'ing to the REPL is almost a tic of mine. But as a contrived example, (range 100000) is instantaneous in a Clojure REPL in the terminal, and interminably slow in CIDER (the Emacs package for Clojure). There are also SO questions like this[1], and I've personally come across it a lot as well. Anytime lines get long, Emacs grinds to a halt.
Geiser might just be adding a lot more newlines by default. There is a setting in CIDER to pretty print everything, which alleviates the issue. The core problem is emacs is terrible at displaying/wrapping/navigating/editing long lines.
Well, I used GUI Emacs 99% of the time. Maybe I should give the terminal version a try... I had the problem that I couldn't stop tweaking Emacs. I would do work, and come to the point of "aha, I need to do this, no plugin on MELPA, let's try hacking it", and then I'd realize Emacs had devoured my productivity! I like Emacs and its Elisp ecosystem, but sometimes it goes too far. And I don't like that it has so many things included. I constantly feel like I use 0.1% of it...
What specifically do you find so great about Emacs? I've forced myself to try it a few times but I couldn't get past the slow performance of the editor itself and the unreliability of basic things like syntax highlighting.
I find it quite performant once it's actually running, and I've never had a problem with highlighting either. Usually there is a mode for even the most esoteric language.
I was using it to write C++. The default cc-mode didn't work properly. I was then pointed to various random plugins that people had written to address various modern C++ syntax issues but each had their own bugs.
Text editing is basically 90% of command-line interfaces, and Emacs has its own Unix-like shell with Unix-like utilities (that work the same on every platform, even Windows), terminal emulator, file manager, a bunch of email clients (I stopped using any for a while, although I plan to try using mu4e over the holidays), add-ons for any kind of programming language and version control and build system, stuff like calendars and spreadsheets, a file system emulator that lets you transparently work with remote files over a bunch of different protocols (no need to try to set up NFS or whatever on Windows), and all this works the same on any OS and over a terminal connection. It is a great text UI for most computing tasks.
In a related discussion, it's interesting to read this overview of work required to get Emacs to support double-buffering on X11. Interestingly, it's by the same guy who proposed the new dumper patch, Daniel Colascione (who lurks around here)
> Rather than try to capture the state of the C library's memory-allocation subsystem, it simply marshals and saves the set of Elisp objects known to the editor.
This is what Allegro CL (and I presume all the other Common Lisps) have done for 30+ years. I'm surprised they didn't move to the marshalling idea before, for fear of the Glibc hacks going away.
EDIT:
Actually, unexec() was used in the early days, so the 30+ years is wrong. It's been more like 20+ years.
To me, the dispute is coming from what appears to be Zaretskii wanting nothing less than step 3. I can sympathize; I don't think I would want to receive a big pile of C were I a maintainer. It seems to me that temporary hacks tend to become permanent.
However, I think he's mistaken. As you said, that list makes a decent roadmap. IMO, the best approach to hand is to take the currently offered patch, and continue to work towards specifying and implementing optimizations to the elisp loader.
This way you are independent of glibc changes and platform quirks. It is faster, has arena support, and should have been the default glibc malloc anyway.
Why is the core of Emacs still written in C? Is it just fear of trying to rewrite it from the ground up in Lisp, or is there something fundamental to the architecture that makes Lisp unsuitable?
History, it's been in continuous development for over three decades now. And, being open source, a port of the C portions to a new language would require developer effort that doesn't always appear on demand.
Also portability. GNU Emacs supports a fair number of platforms and, consequently, has a lot of things included in it for different platforms, different presentation layers (GUI, CLI). It's not a small task to support the variety of environments that it presently does with a novel implementation.
And then there's the bootstrap issue. Emacs Lisp isn't compiled to a standard binary executable format; it uses its own bytecode. So even if you did port more of the C portions to elisp, you'd still need an environment to run that elisp in, and the problem that the dump file solves actually gets worse: more elisp needs to be loaded and executed before you get a functional editor.
There is work to port elisp to run on top of Guile, but it's not done yet. Lacking developer resources and other things. Like the issue of losing support for various environments, it also doesn't 100% support the existing base of elisp packages out there, which is a deal breaker for many people.
Because that's the part that needs to interact with the OS at a low-level. Think of it as the "OS bindings" if you will. C is the lowest-common-denominator to do that.
Edit: I'm wrong, see avar's response below. It looks like it's for performance reasons.
Relatively speaking this really couldn't be further from the truth.
There's plenty of codebases that implement some programming language and only use C bindings for truly low-level primitives, such as external library calls, memory allocation etc, leaving any substantial logic that's built on top of those OS primitives to a higher level language.
Emacs is not such a codebase, most of the C code by volume is things that could perfectly well be implemented in Emacs Lisp itself, but aren't because Emacs Lisp is relatively slow.
So while it's fine for "scripting" Emacs itself, things like regexes, anything that has to do with low-level character handling, most of the GUI layout of Emacs itself (i.e. the buffer logic etc, not actually calling ncurses or X) etc. is written in C.
> most of the C code by volume is things that could perfectly well be implemented in Emacs Lisp itself,
A substantial part of the GNU Emacs core is written in C: the byte code virtual machine, the core Emacs Lisp language implementation, loading/saving images, memory management, ... That's not implementable in the current Lisp architecture (which additionally is single threaded).
If we want to move more of the language implementation in Lisp, that's possible - see implementations like SBCL or Clozure CL. Porting/maintaining these seems more difficult. But then you can implement regexes and other stuff in Lisp.
Yes. Basically the Emacs "C bit" is not just a Lisp implementation like CLISP or SBCL; it also has a load of text editor code. Having a "CLmacs", i.e. an emacs written in Common Lisp, is something a lot of people would really like. The problem, of course, is the vast amount of elisp that we currently use in our editors.
> it's also the sort of thing that could give vi a definitive advantage in the interminable editor wars
Is the editor war still a thing? I was under the impression that emacs, vim and more recent editors (Sublime and Atom, to name two) had each found their core audience and were quite distinct.
Or maybe it's just that I've been using vim for long enough. I can change my main language every few years, but I can't see myself changing my editor. Maybe the "editor war" is more about hesitating between editors at the beginning.
I don't get it. In 2016, on an i7 with 32GB of RAM and 2 striped SSDs, emacs (spacemacs, actually) still takes around 1-2s to start, which forces me to use emacsclient, and that is already the second stage of its initialization, started from the snapshot? I'm impressed.
I've loved and used Emacs for ~20 years, but if Emacs were to become slow, then if I were to have a replacement editor that could do the following (w/no X or window manager) without additional config in Linux, I'd use it instead:
arrow keys to move
add and delete text anywhere
paste from terminal buffer
ctrl-s -> search (and continue to find next match)
ctrl-v -> down
ctrl-esc -> up
ctrl-k -> kill line
ctrl-x ctrl-s -> save
ctrl-x ctrl-c -> quit
ctrl-a -> goto beginning of line
ctrl-e -> goto end of line
I don't even use selection anymore, because I can just use the terminal window copy/paste.
CTRL-S: Save
CTRL-Q: Quit
CTRL-F: Find string in file (ESC to exit search, arrows to navigate)
It's available in a lot of well-used distros: https://pkgs.org/search/kilo but doesn't look like it's in Arch, etc.
Kilua looks cool also as it has more similar keybindings to Emacs[2]:
Ctrl-x Ctrl-o Open a new file in the current buffer.
Ctrl-x Ctrl-s Save the current file.
Ctrl-x Ctrl-c Quit.
Ctrl-x c Create a new buffer
Ctrl-x n Move to the next buffer.
Ctrl-x p Move to the previous buffer.
Ctrl-x b Select buffer from a list
M-x Evaluate lua at the prompt.
Ctrl-r: Regular expression search.
but the goal would be to have that available in a package manager in a default install, so that after logging into any server where I'm a sudoer, I could:
Never even heard about the dumper in Emacs. It sounds like a crazy idea that is prone to break at the slightest change. Any shelving/unshelving implementation for processes should come from the kernel, which owns the virtual memory mappings: the kernel can basically already shelve a running process by just swapping it out to disk completely. And even that scheme is prone to break as soon as anything does I/O.
FYI, as for Emacs: I just fire up "emacs -nw" inside tmux and let it run for months. I call make-frame-on-display to add a window on my X session, but I can close that or restart X without having to kill my Emacs inside tmux, along with a few other long-running processes such as my mail reader and IRC clients.
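For reference, attaching that frame is a single call from the running session (":0" is just the usual default display name; yours may differ):
(make-frame-on-display ":0")   ; open a GUI frame on that X server; C-x 5 0 closes just the frame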
The dumper approach in Emacs predates GNU Emacs; it was how the original TECO Emacs worked.
The real fix would be to make dynamically linkable compiled elisp files (i.e. .so files) and let the system linker make Emacs start quickly just like any other program.
> The real fix would be to make dynamically linkable compiled elisp files (i.e. .so files) and let the system linker make Emacs start quickly just like any other program.
I'm not sure I understand, but this seems like a lot of work for little reward. Newer versions of Emacs work with a C library that's missing these dumper hooks, and if you have a newer C library, you can update to a newer version of Emacs.
This seems like a lot of work for people who want to stick with an older release.
Perhaps a related question: how essential is this functionality? I use Emacs on Mac OS X, which does not even have glibc. Yet the article suggests that this is only present on glibc. So should I be appalled at how long it takes my Emacs to load? Seems to me it's just as fast as it was on GNU/Linux.
I'm a little confused by this. I can understand why someone might want to preserve a repl env across invocations; however, the main reason given is startup time.
Given emacs daemon, which allows connecting with a thin client, how much startup time are we talking here? Can they not start once and connect like the rest of us?
My understanding is that this dumping of state is done during the Emacs build process. You build Emacs, initialize it and then dump out its state. Every time you launch Emacs, your instance starts from the dumped state.
Emacs has two init stages, you only see the latter.
The first is what gets it to the dump file.
The second is where your personal .emacs file gets loaded and executed.
The first is the one that takes too long and motivated them to create the dumper system to begin with; it needs some resolution (eliminating the dump by making that first init process fast enough, switching to a serialization of the Lisp objects rather than an image of the C program's memory, or something else).
So am I using this dumper system every time I launch emacs? I start emacs "normally" in that I simply launch it from the cmd line with the daemon flag.
In a sense, yes. You're using the product of the dumper. It's used to create an image of the state of the running system during the emacs build, and then delivered to end users like us as part of the emacs installation. That executable image state is loaded, and then your .emacs is called.
For a comparable model, check out the way Smalltalk images (particularly with Squeak and now Pharo) are distributed.
For kicks, try running "emacs-undumped". It's the base executable without the dumped state loaded, and part of what's used for creating the dump file during the emacs build process. At least for me it's pretty much unusable thanks to the terminal colors that it seems to insist on using for plain text.
That was on OS X. No idea if it shows up on other systems. The man page says it's not meant for end users like us, and that it's to be used with dumpemacs.
The concern about startup time seems a bit odd to me too. I use Emacs in the way it was originally designed to be used: I start an Emacs process and keep it running for weeks. I don't even use the client; when I want to edit a new file, I select the Emacs window and pull up the file from within that. The idea that I would type a shell command every time I want to edit a new file is foreign to me -- I mean, I know that's how dinky little single-buffer editors like nano and pico work, but I would never use a sophisticated multi-buffer editor like Emacs (or IntelliJ IDEA or Eclipse or Visual Studio or ...) in that mode.
A text editor will always be with you, no matter what you're working on. IDEs tend to be for a specific host platform, language, and sometimes target platform. For example, Xcode only runs on a Mac, understands a limited set of languages, and is best used for building Mac and iOS apps. MS Visual Studio only runs on Windows, understands more but different languages, and is best used for building for Microsoft platforms. Android Studio is best used for Android things. IntelliJ and Eclipse are best used for Java.
If anyone has heard of your programming language, there is probably an Emacs mode for it.
More to the point, Xcode, Visual Studio, Android Studio, IntelliJ, Eclipse, and so forth all have generally different UIs, which means that unless you spend all your development time in a single environment, you have to do a lot of really expensive mental context switching.
An editor like Emacs or Vim provides a unified user interface for all the different languages and development environments with which you use it, which eliminates almost all of that mental overhead - and can also invoke your compiler, (usually) integrate with your debugger, et cetera. IDEs still win on tight integration, but for any language popular enough that someone's invested the time to build a well-integrated IDE for it, Emacs or Vim will generally cover at least the 90% cases quite well, too.
If I ever try Emacs again, I'll have a look. In the past I've found clang-based plugins seem to have a half-life of about 6 months - and while Emacs might live forever, if your plugins don't, then you are relearning a bunch of things anyway.
PS: at first I thought your URL was some kind of joke I didn't get, until I clicked it and found it linked to a genuine project.
Emacs users don't have to worry about minor errors because we don't make them. Vi users don't, either (they already made a huge error, so all their little ones don't matter).
On a more serious note, a lot of the features of an IDE (e.g. code completion, method lookup, compiling, and version control integration) are trivially turned on with emacs. I use etags (ctags, etc) and have macros bound to various frequently used git commands, auto-compile when I change a translation unit, etc. My Emacs is an IDE. And much more. Vi has similar functionality available to some degree or other and even if it didn't, with evil mode in emacs vi users can have it, too.
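For what it's worth, that kind of binding only takes a few lines. A hypothetical sketch - the function name, key, and command are mine, not anything standard:
;; Bind a frequently used git command to a key.
(defun my-git-status ()
  "Show `git status' for the current buffer's directory."
  (interactive)
  (shell-command "git status"))
(global-set-key (kbd "C-c g") #'my-git-status)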
Most if not all IDEs separate compilation from the actual editing, so the choice of editor is really orthogonal to whatever is doing the compiling. Ideally you want to be able to do all of this from the command line anyway, so you can run automated tests when code is checked in. Vim and Emacs can be set up to emulate a pretty convincing IDE experience, but the separation between editor and compiler is much clearer.
To reverse your question: why require an IDE if the same tools work on the command line too?
Vim is pretty much an IDE. I type :make (not :!make) to build. Vim gathers the errors into a quick-fix list, and navigates through them. I have Vim's :grep bound to doing lid lookups on a mkid-generated database: lightning fast to find all occurrences of an identifier. And of course tags for chasing identifier definitions, thanks to a ctags-generated database. Vim has very good syntax highlighting and indentation; many times I spot a syntax error because it is flagged by Vim. I also have a Vim add-on that performs code completion. I type foo. and the list of members of structure foo comes up, etc.
I don't step-debug inside Vim, but it looks like there are ways.
At least for Emacs, one of the things I like about it is the ability to add functionality when I need it. For example, implementing a "go back to the last place I modified this file" command (super useful to me, at least) and then binding it to a key would likely be hard to "add" to an existing IDE. With Emacs, not so much.
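As a sketch of how little that takes - the function name, the key, and the idea of mining the undo list are all choices I'm making up here, not an existing package:
;; Jump to the most recently changed spot, using the buffer's undo history.
(defun my-goto-last-change ()
  (interactive)
  (let ((pos (and (listp buffer-undo-list)   ; the list is `t' when undo is disabled
                  (catch 'found
                    (dolist (entry buffer-undo-list)
                      (cond
                       ;; (BEG . END) records an insertion
                       ((and (consp entry) (integerp (car entry)) (integerp (cdr entry)))
                        (throw 'found (cdr entry)))
                       ;; (TEXT . POSITION) records a deletion; POSITION may be negative
                       ((and (consp entry) (stringp (car entry)) (integerp (cdr entry)))
                        (throw 'found (abs (cdr entry))))))))))
    (if pos (goto-char pos) (message "No recorded changes in this buffer"))))
(global-set-key (kbd "C-c .") #'my-goto-last-change)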
It's also very easy to add ad-hoc functionality directly from the editor. You can start with a keyboard macro (F3, do crap, F4, then hit F4 again to do "do crap" as many times as you want) and go all the way up to writing elisp packages. M-x find-function makes it easy to look at the internals of whatever library you're using. defadvice makes it easy to do horrible, but expedient, things and get on with your life. IMHO, if you're not writing at least a little bit of elisp, you're not taking advantage of Emacs's real potential.
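To make the defadvice point concrete, here is the sort of horrible-but-expedient thing it lets you do (entirely made up, purely illustrative):
;; Log every file visited via `find-file' to the *Messages* buffer.
(defadvice find-file (before my-log-find-file activate)
  "Record which files get opened."
  (message "find-file: %s" (ad-get-arg 0)))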
The thing is that an editor is one of the tools that a programmer uses. We also use build tools, debuggers, profilers, visualisation tools, etc, etc.
IDE stands for "integrated development environment", but to do the "integration" part, what it usually means is that it bundles tools for you. So you are stuck with the editor, debugger, build tool, etc of your IDE. Every time you choose not to use that "integrated" tool reduces the value of the IDE.
For example, if you decide to write your own build files and start the build process off differently than the IDE expects, it's no longer "integrated". If you decide to use a different debugger, it's no longer "integrated".
Choosing not to use an IDE is mostly about wanting to be flexible about the tools you use. Sometimes the tools in the IDE are really good for your job (for example, some of the refactoring tools can actually make certain IDEs worth it alone in some environments). Other tools, like the simplified build systems, almost always cause more trouble than they are worth in the long run.
When you stop to think about it, how difficult is it to "integrate" a custom set of tools without an IDE? We are programmers. Especially programmers who come from a Unix background are very much used to building their own environments. It's not nearly so hard as it seems -- especially because most of the heavy lifting has already been done by people before you.
In the case of Emacs, it practically is an IDE, except that you can plug damn near anything into it with a little elbow grease. I know of programmers who never leave Emacs. Ever.
So to sum up, there is almost nothing you do in your IDE that I don't also do in my custom built DE. My biggest gripe about IDEs is that I wish they would unintegrate their tools so I could use them separately :-)
You should be able to edit text in a familiar manner regardless of what kind of text it is. It might be source code, or it might be an email, a webpage, a document, your shell, a debugger, or anything else you can think of. No matter what you're editing you'll have the same features, keyboard shortcuts, etc available.
A programmer who wants a good universal text editor will find one that also has good integration with their compiler and debugger; emacs and vim are both pretty good choices there.
> If Emacs adopted one of the proposals to use a standard Lisp dialect
That should be 'if Emacs adopted the standard Lisp dialect.' There's only one standard Lisp, and that's Common Lisp[1][2].
I do indeed think that it would be wonderful if the elisp engine were reimplemented in Lisp, but it's a tremendous amount of work, with a lot of potential incompatibilities in the short term, with very little to show until the long term.
[1] That doesn't mean that other Lisp-like languages aren't awesome. Racket, in particular, leaps to mind as something which is massively cool. Clojure isn't to my taste, but I can understand why some folks like it. Scheme has its virtues too. But none of them is Lisp, unlike elisp, which really is a Lisp.
[2] There are also EuLisp & ISLISP, but they're effectively dead.
>There's only one standard Lisp, and that's Common Lisp
That isn't true. Scheme is a Lisp. Why would you think otherwise?
Is it because nil != false? Even McCarthy admitted that that was a random decision, and possibly a mistake.
Is it because it hasn't got "proper" macros? That's a common misconception, but it's not true: all but the most minimal implementations provide an imperative macro system.
Is it because it's lisp-1? Many lisps have been.
Is it because they have different histories? Because that's not true. Both came out of MIT. Many people were on both standardization committees then, and many people are active in both communities now. The first Scheme was written in MACLisp, for crying out loud.
> Is it because nil != false? Even McCarthy admitted that that was a random decision, and possibly a mistake.
I think that code written where NIL is the sole false value tends to read better than code with a distinguished false value, but I'm open to the idea that uglier code is more reliable.
> That's a common misconception, but it's not true: all but the most minimal implementations provide an imperative macro system.
But they're not standardised. The manner in which they affect the runtime environment is not standardised. It's just not a sound basis for a portable project: in practice any large software project will run not on Scheme but on Guile Scheme, or Chicken Scheme.
> Is it because it's lisp-1? Many lisps have been.
Only mistakenly. I used to think that Lisp-1 is better: after all, I'm used to other languages like C or JavaScript where functions and variables share the same namespace.
I've come to realise that 'because C & JavaScript do it' is hardly sufficient for something to be good *grin*
It turns out that the multiple namespaces of Lisp are extremely useful for industrial-strength software.
> Many people were on both standardization committees then
Sure, there's lots of cross-pollination between Lisp & Scheme, because they are both good languages. I don't dislike Scheme: in its place, it's a very interesting way to learn certain concepts. It's Lisp-like, which means it's certainly not all bad.
But it's not an actual descendant of a Lisp: it's its own independent language. It has flaws (continuations which make UNWIND-PROTECT impossible; nil ≠ #f; Lisp-1; worst of all: it's under-specified), but it's not bad. The world would probably be a worse place if there were no Scheme in it.
But I think that emacs would be worse if it were implemented in Scheme rather than Lisp.
>I think that code written where NIL is the sole false value tends to read better than code with a distinguished false value, but I'm open to the idea that uglier code is more reliable.
It's less reliable, and less clean. If you can't see the problem with overloading false, look at C, another language which overloads false and suffers for it.
>Only mistakenly. I used to think that Lisp-1 is better: after all, I'm used to other languages like C or JavaScript where functions and variables share the same namespace.
I have yet to see a disadvantage to being a lisp-1, and it comes with many advantages: we don't have to FUNCTION and FUNCALL everything, which makes our code a good deal cleaner.
>But it's not an actual descendant of a Lisp
That is definitely wrong. It is emphatically a descendant of Lisp. In fact, the very first paper on Scheme, AI Lab Memo 349, described it as "a full funarg lisp."
We can argue about whether Scheme is a lisp, but its lineage is well-documented: if you can't trust the language's creator on its origins, then who can you trust?
>it's under-specified
This is true. However, R7RS is working on fixing this. And as I understand it, it's even including a standardized procedural macro mechanism, so there you go.
> and many people are active in both communities now.
Hardly.
> Many people were on both standardization committees then
Guy L. Steele worked on the High Performance Fortran standard. Makes it a Lisp...
> The first Scheme was written in MACLisp, ...
One of the first Haskell implementations was written in Lisp. The first ML was written in Lisp.
Lots of different languages were implemented (but not initially developed) in Lisp at some point in time. Incl. Scheme, Logo, Prolog, Dylan, C, Ada, Fortran, Python, Pascal, Clojure, ...
You only addressed my points showing there was a link between the communities. Just because there are links between the lisp community and other communities doesn't mean Scheme isn't a lisp.
It just invalidates your arguments, nothing more. That parts of the first Scheme were once written in Maclisp 40 years ago does not make it a Lisp. An Emacs was once implemented in Maclisp, too. It doesn't make it a Lisp. It's an editor.
Scheme is a different programming language with its own set of standards like R*RS and IEEE Scheme, its own community, its own code, its own libraries, its own mailing lists, its own online forums, its own user groups, its own funding, ...
ISLisp, Elisp and Common Lisp share the same core language. ISLisp is basically a simplified Common Lisp (a Lisp-2 with a CLOS-variant and Conditions in the standard). Kent Pitman wrote a compatibility package which allows running ISLisp in CL. Elisp and Common Lisp both come from Maclisp, so they share the same core language. Elisp has a lot of further Common Lisp compatibility ( https://www.gnu.org/software/emacs/manual/html_node/cl/ ), which is widely used in Emacs Lisp code. Standard Lisp is another Lisp which shares the core language.
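As a trivial illustration of that compatibility layer in day-to-day Elisp (the forms are mine, not from the manual):
(require 'cl-lib)                              ; Emacs' Common Lisp compatibility library
(cl-loop for i from 1 to 3 collect (* i i))    ; => (1 4 9) - a CL-style LOOP in Emacs Lisp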
Eulisp is long gone. RIP.
> Lisp isn't a language: it's a family of languages.
True. A family of languages which share a common core.
> And the idea that the languages don't share community isn't true.
Sorry. I am primarily a schemer, so that's what I'm referring to.
Readtables and generic setters (as specified by SRFI-17: a bit limited compared to CL, as I understand it) are very common extensions, and there are several loop macros floating around, all inspired by CL.
We also technically took dynamic scope from CL, but most schemes only have R7RS dynamic scope, which isn't so much dynamic scope as it is a hack that looks a lot like dynamic scope if you squint.
Yes - just as lexical scope existed before Scheme. However, the Scheme implementation was directly inspired by CL. As are (and I forgot to add this) our object systems.
I reached the part about the then clause of if/then/else needing a (progn ...) if it is to be a multi-statement clause, while the else clause doesn't need one. But you can avoid the progn, if you don't have an else clause, by using when instead of if!
I facepalmed so hard I needed reconstructive surgery. I haven't fully recovered from that. (in reality it's because, though ELisp itself is bad enough, Emacs' API is daunting)
It's just a consequence of the Lisp syntax: (if COND THEN ELSE), so this list has four elements, and the second is the condition, the third is one leg, and the last is the other leg. How would you make the "THEN" part longer, except by wrapping parens around it?
To use a metaphor: In typical infix languages, I can only add two numbers, not three. Why do I need to do strange things like using two plus signs as in "3 + 4 + 5" to add three numbers? Lisp can do it with one: (+ 3 4 5)
The answer to this metaphor is that the very idea of infix results in this limit. In the same way, the very idea of expressing everything as s-expressions results in that limitation.
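Concretely, in Emacs Lisp (the condition and file name are invented just for illustration):
(if (file-exists-p "TODO.org")
    (progn                                ; THEN is a single slot, so wrap the extra forms
      (find-file "TODO.org")
      (message "opened the TODO file"))
  (message "nothing to do"))              ; a single-form ELSE needs no wrapper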
You might be interested in this ancient discussion on the Arc Forum about the benefits and drawbacks of Lisp conditionals: http://arclanguage.org/item?id=18218
cond actually is a predecessor of if/else you know from other languages. Yes, it was first. As for progn, it's pretty much the equivalent of curly braces from C-like languages (or of the comma operator, if you care about its value).
Why if seems to be commonly defined as (condition then-clause &rest else-clauses) instead of (condition then-clause else-clause), I don't know. History/common use patterns, probably?
It's not really separating related semantics; it's that an if-then-else option has to have both a then clause and an else clause. (IF then else) is a natural way to express that, but it does mean that if you want to do more than one thing in the then clause then you'll need to have a PROGN. You could have multiple statements in the else clause (which is what emacs does).
In Common Lisp, the syntax is:
if test-form then-form [else-form] => result*
In emacs, it's:
if test-form then-form [else-form]* => result*
I think that the emacs form is weird and annoying, but might make certain forms of code easier (e.g. check for something and short-circuit, else calculate something more deeply).
COND is a different beast entirely.
Note that WHEN & UNLESS both have an implicit PROGN.
OK, so the weirdness is really that Emacs' else-form accepts multiple statements, and/or that there is no implicit progn for both then-form and else-form.
I.e., I would've expected something like this:
(if (condition)
    (
      (then do something)
      (and more things)
    )
    (
      (else do something else)
      (and more other things)
    )
)
(and yeah, I'm definitely showing my C-semantic preferences here aren't I?)
Yup. And the reason not to default to a list of then-forms and a list of else-forms is partly æsthetic, but partly dealing with the common case, because:
(if (eq foo bar) (baz))
looks like a function call, not returning the value of baz (using your syntax).
And of course Lisp is often written in a semi-functional way, so PROGN is, while not rare or really even avoided, not the usual way of doing things, usually.
Not in Common Lisp: that else is just a variable reference there. Unbound, if you're lucky; bound to a true value if you're somewhat less lucky; and bound to nil if you're haplessly unfortunate. :)
There cannot be an implicit progn for both! Because we don't know where one ends and the other begins.
That is, unless we introduce a signal, like an else keyword/label which separates them, like this:
(if cond form form form
    else form form form)
Such macros have existed. A certain John Foderaro of Franz, Inc. was (is?) known for favoring and promoting his if* macro.
I've always resisted writing this macro because I felt it has a slight readability issue:
(my-if (some condition)
   ((consequent)
    (consequent2)
    (consequent3))
   ((alternative) ;; relatively quiet signal here
    (alternative)))
Maybe it's not that bad, after all. Still, it's just saving a few keystrokes to insert progn and doesn't buy much over cond, which is more general, allowing multiple conditions:
NOTE: cond existed in Lisp first, as a special form. The if macro came later as a syntactic sugar for simple situations involving just one condition!
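For instance, a cond with several clauses (an invented example, assuming x is bound; note that each clause body is an implicit progn):
(cond ((< x 0) (message "negative") (- x))
      ((= x 0) (message "zero")     0)
      (t       (message "positive") x))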
The thing is that if you keep most of your code functional, a lot of this is moot. We need multiple statements when we are doing something imperative. There is never any reason to have (progn S1 S2) unless S1 has a side effect. If S1 has no side effect, then this is equivalent to (progn S2), which is just S2. If you avoid side effects, then you don't need any progn most of the time (implicit or not).
This is why progn is called what it is; it's Lisp's feature for writing a "program" (in the sense of an imperative list of things to do). Well, "prog" is that feature; and "progn" is the variant which returns the value of the n-th form.
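For reference, the whole family in one place (each form evaluates all of its arguments; only the returned value differs):
(progn 1 2 3)   ; => 3, the value of the last form
(prog1 1 2 3)   ; => 1, the value of the first form
(prog2 1 2 3)   ; => 2, the value of the second form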
Another mitigation is that a lot of the time code binds variables anyway with a let:
(if condition
    (let (... vars ...)
      this
      that)
  (let (.... other vars ...)
    other))
That is true, but somewhat alleviated by Emacs' choice to indent the THEN form at a different level than [ELSE]*. A novice would notice that the code looks incorrect as they were typing it.
Firstly, because the way Lisp is structured makes it impractical to have implicit progn (or implicit begin in the schemes) in if.
Secondly, because lisp, while not a functional language, has functional leanings: you would be surprised how often it is that one form is all you need.
CL-USER 48 > (LOOP REPEAT 2
                   FOR number = (READ)
                   IF (> number 0) DO
                     (princ '(the number is larger than 0))
                     (print 'true)
                     (terpri)
                   ELSE DO
                     (princ '(the number is not larger than 0))
                     (print 'false)
                     (terpri))
2
(THE NUMBER IS LARGER THAN 0)
TRUE
-2
(THE NUMBER IS NOT LARGER THAN 0)
FALSE
NIL
It is also not that difficult to write a multi-expression IF. Here is a sketch:
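One possible version in Emacs Lisp - the macro name, the else marker, and the cl-lib dependency are my own choices, nothing standard:
(require 'cl-lib)

(defmacro my-if* (test &rest forms)
  "IF that allows several forms per branch, split on the symbol `else'."
  (let* ((pos  (cl-position 'else forms))
         (then (if pos (cl-subseq forms 0 pos) forms))
         (rest (and pos (cl-subseq forms (1+ pos)))))
    `(if ,test
         (progn ,@then)
       ,@rest)))

;; Usage (assuming n and sign are bound elsewhere):
(my-if* (> n 0)
    (message "positive")
    (setq sign 1)
  else
    (message "not positive")
    (setq sign -1))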