I was working in an organization that developed a big network switch, with a large C++ application running on it whose non-recursive Makefile took 30 seconds just to load and parse all of the include makefiles throughout the tree, before actually building anything.
Half a minute of waiting just for the second it then takes to recompile a single .cpp into a single .o and link everything.
I got tired and added a "make --dump" option which used the GNU Emacs undump code to dump an image of make with all the rules loaded from the Makefile. Then "make --restart" would instantly fire off the incremental rebuilds. (Of course, any changes to the makefiles or generated dependency makefiles required a new --dump to be taken to have an accurate rule tree.)
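Day-to-day use looked roughly like this (the --dump/--restart flags existed only in that local patch, so treat this purely as an illustration):

make --dump       # pay the 30-second parse once and write out the dumped image
make --restart    # skip the parse, go straight to the incremental rebuild
# after editing any makefile or regenerating dependency fragments, take a fresh dump:
make --dump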
Another idea would be just to add a darn REPL to make, so you can keep it running and just re-evaluate the rule tree.
Isn't the ninja build system just a faster replacement for make? I use CMake primarily; it lets me create makefiles, ninja files, or VS projects. I noticed that in the general case ninja is 5% to 10% faster than make on builds that take more than a few seconds.
Then I found that if I did serious file manipulation at build time, like copying trees of files dependent on other things in the build, I could have tens of thousands of targets, usually one per file. Ninja might hiccup for a fraction of a second on these shenanigans, but make often sits and spins for 20 minutes or more.
Unless you want to write the makefile yourself, why not use ninja?
Because then I have to ask users to install ninja before they can build my program. In the projects I'm working on, I don't have any of the issues that ninja solves.
You can generate both. That gives you ninja for development (which is where the problem you described shows up), while the official builds can still use the makefiles, which should result in the same output.
That means that a user who wants to patch the build rules has to have the generator, and learn the generating language instead of the Make language he or she already knows.
Autoconf has this disease. You can build from an official tarball, but touch anything (or use a git checkout) and you need auto-this to generate auto-that. Not just any auto-this, but a specific version, that is seven releases behind current, or else three releases ahead of what your distro provides.
> should result in the same output
It should; but someone has to ensure that it does. That's just another unnecessary concern that doesn't actually have anything to do with the functionality of whatever is being built. We would like to spend our QA cycles validating the program, not three ways of building it.
Best to have just one way to build, and don't require users to install extraneous tools.
You've confused autoconf with automake. The output from autoconf is just a list of variables that are sourced by your handwritten Makefile, which you can supply yourself if you don't feel like executing autoconf.
It's automake that writes your Makefile for you, but you can just skip using that. E.g. the Git project uses autoconf optionally but not automake.
I know the difference between autoconf and its generated ./configure target.
Your comment overall indicated to me that you were talking about automake, not autoconf. But if not, fair enough.
E.g. you talk about "learn the generating language instead of the Make language". I know you were using that as an example, but there's no general non-horrible replacement for autoconf that you can write by hand, as opposed to automake, where you can write a portable Makefile yourself.
You can of course write a bunch of ad-hoc shell scripts and C test programs to probe your system, but this is going to be a lot nastier and buggier than just using autoconf to achieve the same goal.
You also don't generally need autoconf to build projects you clone from source control in the same way that you need automake (because that actually makes the Makefile).
The output of autoconf is generally just a file full of variables the Makefile includes; if you don't have that file, you can just manually specify anything that differs from the Makefile defaults, e.g. NO_IPV6=YesPlease or whatever.
The Git project, whose autoconf recipe I've contributed to, is a good example of this. You can "git clone" it and just issue "make" and it works, but if the default config doesn't work then "make configure && make" generally solves it, but you can also just e.g. do "make NO_IPV6=YesPlease" if it was lack of IPv6 that was causing the compilation failure. It'll then get your NO_IPV6 variable from the command-line instead of from the ./configure generated config.mak.autogen.
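To make the escalation concrete (just mirroring the commands named in the comment above):

make                       # the Makefile defaults work for most clones
make configure && make     # fall back to the autoconf-generated ./configure if they don't
make NO_IPV6=YesPlease     # or override the single knob that's breaking the build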
I interpret this as saying that autoconf takes configure.ac as input, and produces a runnable 'configure' script as output. But you are saying that "the output of autoconf is generally just file full of variables the Makefile includes". How can these both be true?
I was using "autoconf" to mean both the software itself and all its output, including the generated configure script.
Confusingly, sorry about that, but for the purposes of discussing what software you need to generate the configure variables you ultimately need when cloning from source control, it makes no difference.
At first I thought I would have a hard time searching for it with such a simple name, but it clearly is a Haskell tool.
This looks like a reasonable alternative for people who control the whole lifetime of their code, like server software developers. But for someone shipping a library, or anything they expect someone else to build, what advantages does this have over Make in the *nix world or CMake in general?
I wonder if most of those include files were the files auto-generated by the C pre-processor to track header file dependencies. In a toy benchmark[1] GNU Make spent 98% of its time processing those include files (for a no-op build). Ninja has a special optimisation for these files, where it reads them the first time they're created, inserts the dependency information into a binary database, then deletes them so it doesn't have to parse them in future invocations. AFAICT this accounts for most of Ninja's speed improvement over Make.
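For reference, those per-object include files usually come from the compiler itself; here is a rough sketch of the mechanism and of how the two tools consume it (GCC/Clang flags assumed):

cc -MMD -MF foo.d -c foo.c -o foo.o   # emits foo.d next to foo.o
cat foo.d                             # "foo.o: foo.c foo.h bar.h ..." in Make syntax
# GNU Make typically re-reads every one of these .d files on every run (-include *.d),
# while Ninja (with deps = gcc) parses each one once, folds it into its binary
# .ninja_deps log, deletes the original, and afterwards only consults the log:
ninja -t deps foo.o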
Well, it's GPL licensed, which means they'd have to provide source code to anyone they give the software to. The only question is whether they're allowed to distribute it to anyone outside of their company.
Not contradicting you, but just posting to invite someone to correct me if I'm wrong. If I understand correctly, the GPL specifically grants a license to distribute to anyone you please. However, in the case where the software was originally distributed to a company, an individual working at the company does not have a license (because it wasn't distributed to them specifically) and can not redistribute the software. Only the company can redistribute the software (though, having done so, they can't stop a recipient from redistributing the software).
It's one of the really subtle points of the GPL and easy to get wrong (which is why I'm inviting people to correct my interpretation ;-) ). I often wonder whether the AGPL would work the same way, because its "distribute" clause is awfully vague, in my interpretation. Just giving me access to the software over a network seems to require giving a license. So if they give me access as an employee, I should also get a license... maybe...
OK, if you promise to stay off my lawn, I'll explain the history behind undump. Back in the 70's, the big CS departments typically had DEC 36-bit mainframes (PDP-10, PDP-20) running the Tops10/Tops20/Tenex/Waits/Sail family of operating systems. These are what Knuth used to do all of TeX, McCarthy LISP, and Stallman and Steele EMACS. Not Unix; and Linus hadn't touched a computer yet.
Executable program files were not much more than memory images; to run a program, the OS pretty much just mapped the executable image into your address space and jumped to the start. But when the program stopped, your entire state was still there, sitting in your address space. If the program had stopped due to a crash of some sort, or if it had been in an infinite loop and you had hit control-C to interrupt it, the program was still sitting there, even though you were staring at the command prompt. And the OS had a basic debugging capability built-in, so you could simply start snooping around at the memory state of the halted program. You could continue a suspended program, or you could even restart it without the OS having to reload it from disk. It was kind of a work-space model.
Translating into Linux-ish, it's as if you always used control-Z instead of control-C, and the exit() system call also behaved like control-Z; and gdb was a builtin function of the shell that you could invoke no matter how your program happened to have been paused, and it worked on the current paused process rather than a core file (which didn't exist).
The OS also had a built-in command to allow you to SAVE the current memory image back into a new executable file. There wasn't much to this command, either, since executables weren't much more than a memory image to begin with. So, the equivalent of dump/undump was really just built into the OS, and wasn't considered any big deal or super-special feature. Of course, all language runtimes knew all about this, so they were always written to understand as a matter of course that they had to be able to deal with it properly. It pretty much came naturally if you were used to that environment, and wasn't a burden.
Thus, when TeX (and, I presume, the various Lisps and Emacsen that were birthed on these machines) was designed, it was completely expected that it would work this way. Cycles were expensive, as was IO; so in TeX's case, for example, it took many seconds to read in the basic macro package and standard set of font metric files and to preprocess the hyphenation patterns into their data structure. By doing a SAVE of the resulting preloaded executable once during installation, everyone then saved these many seconds each time they ran TeX. But when TeX was ported over to Unix (and then Linux), it came as a bit of a surprise that the model was different, that there was no convenient, predefined way to get this functionality, and that the runtimes weren't typically set up to make it easy to do. The undump stuff was created to deal with it, but it was never pretty, since it was bolted on. And many of us from those days wonder why there's still no good solution in the *nix world, when there are still plenty of programs that take too damn long to start up.
Emacs is my primary editor; however, the Emacs dumper has always been a dumpster fire.
Basically they coded up so much ill-conceived and inefficient Emacs Lisp that the editor would never start up in an acceptable time. Instead of engineering around this (lazy loading services, fixing things, not doing things nobody needs) they had the great idea they could start the editor once and then core dump the in-memory state of a running editor. Then on later editor starts they would map-in the core dump and instantly be in a (somewhat) good state. Fails all kinds of smell tests and really speaks to bad taste having unbounded consequences. It is an idea that should not work and it is only happenstance that it ever worked (and it gets harder and harder as we have things like address space layout randomization, file handles, and so on).
[edited "file" -> "fire", sorry! And yes I know lispers always dumped, but they are dumping the C memory environment here- not just their precious Lisp state. They should have had some appreciation for how the C environment actually worked since they decided to use it.]
They didn't have this great idea; it came from the Lisp culture. Lisp implementations have the ability to save images. It's not described in ANSI CL, but all the "industrial strength" Lisp have this, and it is one of the means for application delivery.
If you're working on an interpreted Lisp which is getting slow to load due to its growing library, image dumping and restarting is the obvious "off the shelf" approach you already know about from being a Lisp programmer.
> Instead of engineering around this (lazy loading services
I'd be surprised if Emacs didn't have lazy loading too. Lazy loading only goes so far. Sometimes lazy loading brings in a lot of dependencies, so it is slow for those users who use whatever is being lazily loaded.
Lazy loading is a hack to work around the lack of a fast image restore.
The thing that is hacky about Emacs image saving is that it saves the C state. It's not exactly a Lisp image saving mechanism.
This is why I was able to port this dump/undump very easily to GNU Make, which doesn't share any data structures with Emacs, let alone any Lisp stuff. Dumping the image worked just fine for the C data structures on GNU Make's heap.
It's because it's such a low-level mechanism that they are having difficulties with its portability.
I don't hear about, for instance, the CLISP project having issues with the EXT:SAVEINITMEM function due to some GLibc support changing; it doesn't rely on that.
> I'd be surprised if Emacs didn't have lazy loading too.
Not sure if this is exactly what you're referring to, but Emacs has the autoload [1] feature, which lets you declare some functions and in which elisp file to find them. Emacs won't load that file until the function is actually called. This is usually for user-called functions. Most packages use it.
> It is an idea that should not work and it is only happenstance that it ever worked
In the 1980s it was a surprisingly common approach. For example most Unix variants had an undump command to turn a core dump into an executable. See https://www.ctan.org/tex-archive/obsolete/support/undump for example. It was also how Microsoft Office file formats were implemented until about a decade ago.
With the hindsight of experience, it is easy to say that it is a bad way to go and won't age well. But back when Emacs went this way, the problems were not as widely understood as they are now.
IIRC, "most UNIX variants" didn't have an undump command. It primarily came with a few huge applications that played these type of games. The two worst offenders that I remember were in fact GNU Emacs and TeX! A few other programs that I dealt with (porting to a new platform) which depended upon a similar hack were Sendmail and Franz Lisp.
The Sendmail "WIZ" SMTP bug was in fact caused by its use of reloading the frozen configuration file (the pointer to the password was saved in the data section, not the BSS region that was reloaded from the frozen config, so that if you used a frozen config you effectively had an empty password).
My impression comes from the comments in the undump link that I provided.
To your list of other programs you can add Perl. It comes with a dump command to take a core dump that can be turned into an executable. I never used it as anything other than a joke, but I do know several people who independently used that feature in production back in the 1990s.
Now that we do have the hindsight of experience, and it is time to make that decision again, why are the maintainers going with the clearly smellier approach?
They have an immediate problem (things are slow/broken on newer glibc) that needs fixing.
Someone has posted sane code to fix the issue in a similar fashion to what they were doing before but without needing to reach into the memory manager.
The maintainer who is mad wants it fixed right (replace the whole thing and make the elisp load fast) but so far neither they nor anyone else seems to have posted code to do that.
Bird in the hand. I would hope they'd just remove the need for the mechanism totally in the future, but for now.... bird in the hand.
Make elisp faster. Easier said than done. Any easy wins already happened.
Introduce various lazy loading mechanisms and switch to that by default. That doesn't help any third party plugins that people might use. In fact any such work could create ways of breaking the expectations of said plugins!
Identify/eliminate work spent on dead code paths in initial load. Again, easier said than done. And there are bound to be a lot of, "I don't see what that is useful for, but what third party plugins might need it?"
In the meantime, end users faced with the choice of a slow new release versus a fast old one, will resist upgrading. And if there is a patch that simply makes things fast, that is generally acceptable.
The current moment provides pressure to do things right. If the cheap fix is taken, there are no guarantees on when there will be another opportunity like this one. But pressures exist to take the easy out right now. And if taken, it won't be easier next time either.
The first time you start Firefox, it reads in a whole mess of xul, html, css, and javascript. It parses it all and then dumps it to a binary file that can just be mmap'd back in next time. This is exactly like emacs, except that emacs also dumps out all of its own internal state including the state of the allocator; that makes it more fragile.
This may seem strange to you, but this is actually the way many lisp systems (and other image-based development systems, like Smalltalk) work.
I build software using the commercial develop environment Allegro Common Lisp and an image dump is precisely the way I deliver software. The ultimate deliverable is an executable that loads the image and launches the main execution thread.
This works really well for us. To load up our web server takes a while due to having to load up caches etc.. so we tend to do all that once, dump the image and deploy it - cache already primed.
It had an even bigger advantage for me today. Due to some git shenanigans I somehow managed to wipe out a few days of work. No idea how... Almost resigning myself to having to recode the whole lot I just remembered that I had built an image to test just prior to attempting to check in. Clozure CL saves the source code of each function in the image. I was able to probe the image and pull out all the source code that I had just lost, saving myself a few days of tedious recoding!
Not a recommended use of images - but it sure saved my butt today!
I actually have no idea. I had a lot of unchecked work - about a week's worth. (I know, I should not have let it go that far... it was just one of those tasks that just seemed to get bigger and bigger.) It was time to check in, so I went through file by file, staging it all (using Magit). I then decided to do some more testing. Built a lisp image. Then I became distracted, started thinking about home time, etc., so my recollection of what I did is a little hazy. I think I saw I had some changes that I didn't want to commit and test. So I think I stashed them. Then probably surfed Hacker News or something... When I got back to things I suddenly noticed that none of my changes were there any more. Nothing was staged, nothing was stashed. It was all gone. I spent a few hours looking around for it, but all to no avail.
It is the second time I have lost work by misusing git. I think the other time I was messing with rebasing and branching and all my work disappeared.
I do really need to learn how to use git properly. But in the meantime I am not taking any more chances. I have set up a zone on our internal SmartOS server which uses ZFS. An hourly cron job on my machine rsyncs my source directory to the zone and then creates a ZFS snapshot. So, hopefully no matter how badly I use git now I will never be able to lose more than about an hour's worth of work.
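For anyone wanting to copy this, the whole thing fits in one crontab entry (hostname and dataset name here are made up):

0 * * * * rsync -a --delete ~/src/ backupzone:/backup/src/ && ssh backupzone zfs snapshot backup/src@$(date +\%Y\%m\%d-\%H\%M)
# note the \% escaping: cron treats a bare % as a newline
# files can be pulled back later from /backup/src/.zfs/snapshot/<name>/ on the zone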
When you stage a file in git, what it actually does is to create in its object database a blob with the contents of the staged file, and then it points the corresponding entry in the working tree index to the blob's hash. The blob object is only removed during the periodic garbage collection, and only after a configurable period of time since its creation has passed.
Therefore, once you stage a change, git will keep a copy of it for at least two weeks unless configured otherwise. Even if nothing points to it anymore (due to, for instance, an errant git reset --hard), there are ways to find its object hash (something like git fsck --unreachable, or even a ls -lR in the object database directory), and once you have the hash, you can use git cat-file to retrieve its contents.
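In practice that recovery looks something like this (the sha is a placeholder taken from the fsck output):

git fsck --unreachable                      # list blobs/commits nothing points to anymore
git cat-file -p <blob-sha>                  # inspect a candidate blob
git cat-file -p <blob-sha> > recovered.txt  # write it back out once you've found your file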
Recovering from a lisp image is pretty darned cool. :D
> I had a lot of unchecked work - about a week's worth
> ... It is the second time I have lost work by misusing git
I realize this is _off topic_, but one thing that helps me a lot is making regular small commits (that describe some small idea of the change), and then rebase them later once I get all the linting / tests fixed. (Obviously not on master ;)) It's easy to go too-granular here, but it's very handy to be able to reorder the commits, or squash "fixup" commits, to make things easier to review.
I spend more time rebasing than I probably need to, but on the other hand I've never lost work. ;)
For example (adapted from a PR I was working on this morning ;))
git commit -m "Add Clone Foo feature"
git commit -m "Reorganize tests"
git commit -m "Fix indentation"
git commit -m "Add generator for Baz items"
git commit -m "Add test of Clone Foo feature"
git commit -m "fixup linting in application"
git commit -m "fixup test linting"
git commit -m "fix Foo test to use correct selector"
git rebase --interactive master
# reorder fixup commits, squash them together with the things they fix
You can always rebase and squash all of these down into one commit later before you make your pull request (or leave them as-is if your team is OK with unsquashed PRs).
> # reorder fixup commits, squash them together with the things they fix
The --autosquash option of git rebase can save you some work:
--autosquash, --no-autosquash
When the commit log message begins with "squash! ..." (or "fixup! ..."), and there is a commit whose title begins with the same ..., automatically modify the todo list of rebase -i so that the commit marked for squashing comes right after the commit to be modified, and change the action of the moved commit from pick to squash (or fixup). Ignores subsequent "fixup! " or "squash! " after the first, in case you referred to an earlier fixup/squash with git commit --fixup/--squash.
This option is only valid when the --interactive option is used.
If the --autosquash option is enabled by default using the configuration variable rebase.autoSquash, this option can be used to override and disable this setting.
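So the fixup commits from the earlier example could be recorded with --fixup, and the rebase then needs no manual reordering (the sha is a placeholder):

git commit --fixup=<sha-of-"Add Clone Foo feature">   # instead of "fixup linting in application"
git rebase --interactive --autosquash master
# the fixup! commit arrives pre-ordered and pre-marked "fixup" in the todo list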
(Not off-topic for me. :-)) I will have a play with all this. Now that I have my source being backed up to zfs I can be a bit more comfortable with experimenting with new workflows.
To help learn to use Git properly, start by making many frequent commits from the command line throughout the course of the day. I would say that an appropriate pace is about once every 30 minutes.
When you're done with the feature, then depending on your preference, you can either merge your branch into the master branch, or you can rebase your changes onto the master branch (my recommendation). In both approaches you may wish to squash your commits, so that you ship perhaps just one commit or fewer than your full set of working commits. Then you prepare the changes for code review and for pushing upstream.
The best way to learn about and become comfortable with a tool is to use it regularly. As you approach mastery it will become a Swiss Army knife. Frequent commits give you a lot of utility for the same reason that an editor's undo/redo buffer does, except it's persistent.
If you are worried about screwing things up by running the wrong git commands, then check out git's reflog feature, and how you can use `git reset --hard` to roll back changes via the reflog. Virtually every change you make to your local repository is versioned, and the reflog shows you that history and allows you to roll back.
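A minimal sketch of that rollback (which HEAD@{n} entry you want will vary):

git reflog                     # every position HEAD has had, newest first
git reset --hard HEAD@{2}      # roll the current branch back to where HEAD was two moves ago
# (or run `git checkout HEAD@{2}` first to look around non-destructively)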
My first step with git is always creating a new branch, then I commit multiple times a day with "wip" or minor notes as the commit message. It is purely to capture the stream-of-consciousness of the code. When it comes time to create a sensible patch I immediately branch again. The old WIP branch is not touched until I've completely merged the feature. On the new branch I usually reset then commit hunks in some way that makes sense, or if the changes are small squash everything into a single commit.
You can also do the same thing with tags if you like.
There is no penalty to branches in git... use them. Frequently.
I got into trouble at one stage mixing branching with rebasing. This really put me off branching. Creating a new branch to perform the merge is a stroke of genius - thanks! I will be doing that right away.
Thanks for the explanation! I am not a heavy user of git stash, so not sure if git reflog could have helped you, but I am grateful that somebody steered me toward git reflog early in my learning git. With reflog and frequent commits, it's very hard to lose work, short of deleting the entire project or its .git/ directory. rsync+ZFS also sounds like nice combo for running around git in order to prevent future mistakes destroying work.
> An hourly cron job on my machine rsyncs my source directory to the zone and then creates a ZFS snapshot. So, hopefully no matter how badly I use git now I will never be able to lose more than about an hour's worth of work.
This is a thread about Emacs - no need for ZFS, Emacs can already do versioned file backups for you!
That is very interesting. I do have Emacs making some sort of backup, but have never managed to get it to work reliably. I'll follow those instructions.
What was confusing for me was understanding that by default Emacs does not back up on every save. It backs up the file as it was when you first visited it (the backup is made on the first save of the session), so only the version from before you started editing is what gets backed up.
I agree that dumping an image is not a great idea in the modern world, but the concept(1) dates right back to when emacs was still based on TECO on the PDP-10:
http://pdp-10.trailing-edge.com/mit_emacs_170_teco_1220/01/i...
(and I suspect the available CPU horsepower back then was such that you pretty much needed to do tricks like this to get quick startup even if you were careful about how much lisp code you ran at startup).
(1) as a part of emacs, that is -- I'd guess that "language runtime takes an image of its pre-initialized state and then just loads that" was used in non-emacs contexts before that.
Did it dump Lisp images, or the image of the entire process (including all the non-Lisp stuff)?
It appears that all the complexity and instability here comes from the latter part. Lisp itself was always designed around this image dumping concept. But C and its standard library were not.
This is Lisp on the IBM 704/7090 and PDP-1 in 1960-1964. Ritchie would not start work on C until the early 1970s, and there was no standard C library (which, btw, is completely irrelevant to the runtime state of a process) for decades. There were no processes or virtual memory (the 7090 didn't even have overlays) or non-Lisp stuff other than the punch card/paper tape loader.
And what were the stability guarantees for such things? Could you expect to take such a dump, bring it to a different system (potentially with different hardware, different minor OS revision etc), and expect it to work?
In those days, you could pretty much never take any software to different hardware, and even different minor OS revision was asking for trouble. What you are asking for isn't something anyone expected back then. Different hardware was _really_ different.
Right, that's kinda what I was expecting. But that also explains why it was a mistake to adopt this model for Emacs, or at least to stick with it for so long (to clarify, I specifically mean full process dumps, not Lisp world dumps) - it's a different world, and it has been different for a very long time, with software and OS/hardware significantly more decoupled.
Why? A binary executable does not rely on the very fine details of the implementation of the OS heap manager, for example, while a dump necessarily would.
> A binary executable does not rely on the very fine details of the implementation of the OS heap manager, for example, while a dump necessarily would.
What are you talking about? What is it about brk(2)/sbrk(2)/mmap(2) that would be a problem?
Correct me if I'm wrong, but if you make a process dump of the entire process, that dump would include internal structures used to maintain its heap, no? If those structures change between OS versions, for example, then the process dump from an older version would represent corrupted heap when loaded on a newer version.
None of the kernel data structures are saved or even visible to the process - the kernel memory (on Linux x64 everything mapped in the upper 128TiB) is not readable by the process.
You can have a problem if the kernel system call interface changes, but that is a problem for every executable. You would need to re-compile everything on the system.
> Instead of engineering around this (lazy loading services, fixing things, not doing things nobody needs) they had the great idea they could start the editor once and then core dump the in-memory state of a running editor. Then on later editor starts they would map-in the core dump and instantly be in a (somewhat) good state. Fails all kinds of smell tests and really speaks to bad taste having unbounded consequences.
How else would you compile a dynamic programming language program into a binary executable?
BTW, something that just came to mind: this "bad taste having unbounded consequences" of copying the entire process memory is exactly how fork(2) works.
Doing something that relies on OS internals is bad taste when done in userspace.
Interface stability aside, what fork does is simpler: a copied process can rely on its environment matching its parent's enough for a large subset of the contents of the duplicated address space to retain their meaning. Serializing a process's address space's contents and then loading it into an "unrelated" process later makes it more complicated to know what memory contents can still be relied on (e.g. ASLR, file handles, etc.).
> Interface stability aside, what fork does is simpler: a copied process can rely on its environment matching its parent's enough for a large subset of the contents of the duplicated address space to retain their meaning. Serializing a process's address space's contents and then loading it into an "unrelated" process later makes it more complicated to know what memory contents can still be relied on (e.g. ASLR, file handles, etc.).
Everything you mentioned except for ASLR is a problem for forked processes, and it boils down to kernel state (file descriptors and signals being two that I can think of right now).
Yes, I remember building Emacs in the late 1980's and watching the final dump. I guess 30 years later hardware hasn't improved enough to avoid this? People are building editors in JavaScript today. Perhaps the Emacs Lisp engine needs to be tuned a little more? That Guile Emacs has been decades in the making...
Guile Emacs isn't a bottom-up rewrite of Emacs [1], but a project to replace the elisp runtime with Guile. One important thing it does not do is get rid of/convert any ELisp, because of how much is invested in it: 71% of the ~2M LOC codebase is ELisp: https://www.openhub.net/p/emacs
Guile would run the ELisp code. What's holding the project back, IIRC, is that it doesn't yet do so perfectly.
[1] which is something I (and likely most users) frequently think it sorely needs, until we realize A) the monumental size of the task and B) how much refinement the current version already has, as per http://www.joelonsoftware.com/articles/fog0000000069.html
Interesting. Any idea on why it's not got more traction? Or is it actually on track to become the canonical emacs?
Reading that LWN article, it looks like Guile wasn't a great choice from a community perspective, but more than that, by working internally to the emacs project it's subjected itself to internal standards of interoperability that it'll never really be able to hit (whereas a pseudo-hostile fork like XEmacs might have had sufficient momentum to force some accommodation).
> Reading that LWN article, it looks like Guile wasn't a great choice from a community perspective
I think going forward it will be. Between Guile-Emacs, Guix, and GNU Shepherd you have the best-supported Lisp Machine operating system analogue available right now: https://www.gnu.org/software/guix/
I am excited about GuixSD and I think a lot of other people will be.
No it wont, at least as far as Emacs is concerned.
Nobody is stepping up to do the work, plus Guile has serious bugs on Windows and OSX, single-digit number of developers and few, if any, users. So Guile Emacs is really a pipe dream that people like to bring up from time to time.
> Nobody is stepping up to do the work, plus Guile has serious bugs on Windows and OSX, single-digit number of developers and few, if any, users. So Guile Emacs is really a pipe dream that people like to bring up from time to time.
You forgot to mention that BSD is dying.
This state of affairs is different from any other Lisp implementation how? And why would it stop progress from being made?
I've been happily running SBCL and CCL on Linux and OSX for years without issues. CCL on Windows too.
Guile has 1-2 people working on it part-time, and Windows/OSX are not a priority because:
+ GNU project, duh.
+ Nobody in Guile-land cares about Windows/OSX enough to step up and fix issues.
+ Even if they did, Stallman would tell them not to.
So the state of affairs is indeed very different from the CL Lisp implementations. Not to mention, Guile has pretty much no userbase to speak of. There are commercial entities releasing products with SBCL and CCL in addition to the very healthy opensource community.
> best-supported Lisp Machine operating system analogue
Hardly. GNU is far from anything Lisp Machine. The Lisp Machine was never a system hacked up by scripts running on top of some Unix copy with a fancy name.
As I note elsethread, Guile Scheme is not a 'faster, more modern Lisp,' because Scheme is not Lisp. Scheme is a Lisp-like language, which does have a place in the world, but it's not Lisp, and it's not (IMHO) well-suited for large software projects like emacs.
I personally oppose a Scheme-based emacs because, having moved from C to a more Lisp-like language, I don't think there would be the will to actually move to real Lisp. You might accuse me of letting the best be the enemy of the good (which is a fair criticism, both of my position re. porting emacs & of Zaretskii's re. the dumper), but part of why I use emacs and Lisp is that they are not good: they are the best.
Emacs-Lisp is dynamically scoped. That's precisely the kind of mistake (yes, mistake), that hinders large scale anything.
And who cares that Scheme is not Lisp? It's close, and it's reputedly cleaner. Besides, there are many Lisp dialects, some of them just as different from one another as Scheme is from them.
> Emacs-Lisp is dynamically scoped. That's precisely the kind of mistake (yes, mistake), that hinders large scale anything.
Yup, default-dynamic scope is a mistake, although in the particular case of emacs it has made certain code very clean (and led to plenty of bugs, as well). Emacs has added lexical scoping, which is a start.
Common Lisp's lexical default and optional dynamic scoping provides the best of both worlds.
> And who cares Scheme is not Lisp? It's close, and it's reputedly cleaner.
Among other things, its continuations are semantically broken, preventing correct implementation of UNWIND-PROTECT.
It's a good language for its intended purpose (teaching & research), but the language as standardised is not well-suited for building large projects (individual implementations, of course, can be quite good — but then one enters the world of implementation-dependence).
Common Lisp is superior for large projects, in part because it's much more fully-specified, in part because that specification includes more features, in part culturally (because Lisp systems have more often tended to be large and long-lasting).
In TXR Lisp, a Common-Lisp-like language which has (delimited) continuations, I came up with a pragmatic approach w.r.t. unwind-protect which works well.
I introduced "absconding" operators which allow a context to be abandoned (via a non-local exit) without invoking unwinding. Absconding just performs the jump to the exit point, without doing any clean up along the way. That's it!
The idea is basically: why, when temporarily leaving a continuation, should we clean up resources, when we intend to resume the continuation and want those resources intact? If it should happen that we don't resume, so what; garbage collection will just have to take care of it. We wouldn't do that with threads, or coroutines, right? When a thread suspends, we don't close its files for fear that it might never wake up to use them and close them.
With this absconding, I implemented a workable yield construct. I can have code which, say, recursively traverses some structure and yields items to a controlling routine. That recursion can have exception handling and unwinding which works normally, as if the yields were not there. When the recursion performs a normal return, or a throw, all the unwinding takes place normally. The yields use absconding, and so they don't disturb anything. When a continuation is resumed, all of the dynamic handlers are in place, including the unwind cleanups that were never touched. Nothing was torn down.
Also in place are dynamically scoped variables. TXR Lisp's continuations play nicely with those also.
If you compare this brilliantly simple solution to concoctions involving dynamic-wind, dynamic-wind looks hopelessly silly, like what were they thinking when they came up with it?
Just don't unwind when you temporarily leave a context. The sky does not fall. Your CS degree is still hanging on the wall. Everything is cool.
As for the higher-level stuff, spacemacs does a great job of that. It really makes configuration files, separation of concerns, vim bindings, and many other things sensible and user friendly. It doesn't touch the lower-level code though, as it is just an elisp layer on top of emacs itself.
Interestingly Emacs is not the only project affected by this glibc/dumper dispute.
I added the ancient patches for perl to my cperl fork, to be able to dump/compile perl scripts to native binaries the fast way.
Improved dumpers are here:
https://github.com/perl11/cperl/commits/feature/gh176-unexec
Mostly unified error handling and a few Darwin segment instability fixes. It is very fragile to use with a static library, but OK as a dynamic library. Emacs uses the dumper in the main exe, not in a library.
Solaris is the easiest to use.
So I know a little bit of the troubles they are talking about here. Dan's portable dumper would be nice to have, XEmacs had this decades ago, but it never made it over to Stallman emacs. Wonder why :)
So looking at the new pdump, it really is horrible. I don't think I want to do that. I'd rather add a proper static malloc to cperl, such as ptmalloc3, which is better than glibc malloc (i.e. ptmalloc2) anyway. They never switched to the better version because it had more memory overhead. And I really can make use of the arena support there. Emacs should try the same. Much easier and much faster.
Good bye glibc.
I am actually surprised the dumper paradigm doesn't get more love. Startup time is an issue for most large programs. The dumper route is a generally applicable way to drastically improve startup time. Think of it as splitting your code into two parts: setup phase which is run at compile time (i.e. pre-dump) and run phase which runs at run time. Undumping substitutes a simple load of a file for the setup phase. What is not to love?
At one point Borland C++ precompiled header files used this approach: dump the in-memory object graph for the header file whole. Makes more sense than for a whole executable.
The complexity of that solution. By nature it is very fragile and leads to nasty bugs. But I can certainly see the benefit of this approach - it would be interesting to see it applied to some big frameworks/VMs, such as Java or .NET.
I've read something about dumping JVM state to decrease Clojure startup time (don't recall the details and I'm on mobile right now to check), so if I'm not wrong it was tried before...
Zaretskii's stance is weird. If you are going to run out of people who can work on the core of the editor's source code, then the editor will die. So the lack of ability to work on the code is the real problem. This is probably because it has accreted way too much complexity at this point, and way too many hacks. Shedding some of those hacks is a very good idea.
If you wanted to keep it the old way, and depend on the nuances of how an allocator stores memory, then ship your own allocator. Video game people do this as a matter of course; it's not a big deal.
Emacs and Vim are great tools. Amazing pieces of software. But in some areas you really see the prospect of time and aging.
I used and learned both at some point in time, liked things about both, and stuck with Vim, but I kinda felt the best text editor would be a hybrid of the two. (Now don't run and tell me to install EVIL in Emacs, I tried that, but modal editing is not the only thing that gives Vim an edge over Emacs.)
What got my attention recently is Xi (developed by Raph Levien). It is written in Rust and looks fairly interesting; I can't wait for it to reach a more advanced state. I really wouldn't mind a nice, modern terminal text editor. (I use NeoVim at the moment, and I think it is closest to that, but VimL :cringe:)
Personally, as much as various editors talk about "modernity," there isn't really one that's surpassed Emacs yet. My requirements are fairly simple:
-Text interface: I don't want to be unproductive when I'm without a GUI
-Consistent abstractions and scriptability: Your editor needs to be customizable for whatever task I may need: that means scripting anything that can be done with the UI, customizing pretty much any aspect of editor behavior, the ability to embed curses-style UIs into the editor (when in nongraphical mode), and interfacing with external programs. And yes, these are all features I use. And not just for playing tetris, either.
In order to do this, you need consistent abstractions. For instance, if every window, from your options window, to your editing window, to your tetris window, isn't treated the same, isn't fundamentally the same object type, then you've failed as an editor designer.
-Ecosystem: this one isn't so much of a requirement, but it's nice to have a good one. That means lots of plugins and other tooling.
The only editors that have these are vim, emacs, and, to a certain degree, acme.
Interesting. For me the killer feature of Emacs is its customizability. That's not quite the right word, because Emacs's integration with ELisp lets you do things that almost nothing else does. The ability to modify existing functions live is central to its power, and the boundary between "The Editor" and "Extensions" is extremely fuzzy.
Xi seems to take a harder approach to customization, which means that customizers will always be subject to the limitations of the plug-in interface. There will always be a hard boundary between "The Editor" and "Extensions", and I believe that will ultimately limit its usefulness.
I agree with you. Having tasted the emacs way, I'm constantly frustrated by (actually, I simply avoid) interactions with software that isn't so inspectable and malleable. Still, never blocking user input is a respectable goal for an editor, and a design philosophy I wish emacs would seriously pursue. I would not mind a "feature" freeze on emacs for the remainder of the decade if it meant absolutely responsive editing with asynchronous operations.
Emacs customizability is something rare. I had the pleasure to run QBASIC and Turbo Pascal 7 not long ago, and was amazed at the capabilities and speed of these old IDEs. Yet they were locked down. TP7, which is an epic[1] thing, made me feel sad, because the editing features are so basic, almost crippling (no block selection); you physically feel how you miss emacs, where anything is a few LoC away.
[1] text-based multifile editing with overlapping windows (including ASCII window shadowing), imperceptible compilation times (on a Pentium 2), and an exhaustive help system; all in 800KB.
> you physically feel how you miss emacs, where anything is a few LoC away
That is my problem with IDEs, in a nutshell. There is the running joke that Emacs is a decent operating system in want of a decent editor. The same can be said - more strongly - about Eclipse or Visual Studio.
Some things these IDEs do spectacularly well, for sure, but when it comes to basic text editing, I keep thinking how easy this or that would be in emacs. ;-|
Same, and I started in the Eclipse fad, with Eclipse plugins being a thing, before I knew how to program emacs (beyond the default config). The day I realized how general Lisp was and how dynamic emacs was, I had to pause for a minute.
Last winter I had to use Eclipse (for Scala); one day of mild use triggered nasty wrist pain (I play music, I'm used to pushing the mechanics, and this was worse). And people say emacs causes RSI ;)
Also, the Eclipse crowd is completely detached from the user side. It's all about tech. Microsoft might be better; I haven't used VS in ages. IDEA is said to be really great at ergonomics. But rarely does someone bring a lot to the table (the only recent thing I noticed was parinfer, ambitious and useful). Also, people underestimate what elisp can do when used correctly. See yasnippet, or Fuco's litable.el.
I don't even remap. I think my hands ended up developing an emacs Stockholm syndrome. Or maybe music did it before that. Still, I was surprised that Eclipse would revive such painful sensations.
Modifying existing functions sounds like a recipe for plugin incompatibilities. Vim doesn't let you modify any built-in functions but it seems to be just as powerful.
Indeed, architecturally it's just asking for trouble. However, it also lets users extend the system in ways that weren't previously planned for. Pros and cons...
> The editor should never, ever block and prevent the user from getting their work done.
A thousand times this. Emacs and vim are both guilty of failing this most basic principle. It frustrates me to no end that a text editor will hang while font locking text, or printing repl results.
A lot of what used to cause trouble for Vim was fixed with Neovim and the new async stuff Vim 8 introduced. This mainly ensures plugins don't have to block (though a lot still do).
That said, there's still parts of Vim that don't benefit from this and especially on large files can get annoying. But since switching to Neovim I've had a significant drop in the amount of times I have to wait for vim to do something and sit there screaming "oh come on" at my monitor.
Maybe eval'ing to the REPL is almost a tic of mine. But as a contrived example, (range 100000) is instantaneous in a Clojure REPL in the terminal, and interminably slow in CIDER (the Emacs package for Clojure). There are also SO questions like this[1], and I've personally come across it a lot as well. Anytime lines get long, Emacs grinds to a halt.
Geiser might just be adding a lot more newlines by default. There is a setting in CIDER to pretty print everything, which alleviates the issue. The core problem is emacs is terrible at displaying/wrapping/navigating/editing long lines.
Well, I used GUI Emacs 99% of the time. Maybe I should give the terminal version a try... I had the problem that I couldn't stop tweaking Emacs. I would do work, and come to the point of "aha, I need to do this, no plugin on MELPA, let's try hacking it", and then I'd realize Emacs had devoured my productivity! I like Emacs and its Elisp ecosystem, but sometimes it goes too far. And I don't like that it has so many things included. I constantly feel like I use 0.1% of it...
What specifically do you find so great about Emacs? I've forced myself to try it a few times but I couldn't get past the slow performance of the editor itself and the unreliability of basic things like syntax highlighting.
I find it quite performant once it's actually running, and I've never had a problem with highlighting either. Usually there is a mode for even the most esoteric language.
I was using it to write C++. The default cc-mode didn't work properly. I was then pointed to various random plugins that people had written to address various modern C++ syntax issues but each had their own bugs.
Text editing is basically 90% of command-line interfaces, and Emacs has its own Unix-like shell with Unix-like utilities (that work the same on every platform, even Windows), terminal emulator, file manager, a bunch of email clients (I stopped using any for a while, although I plan to try using mu4e over the holidays), add-ons for any kind of programming language and version control and build system, stuff like calendars and spreadsheets, a file system emulator that lets you transparently work with remote files over a bunch of different protocols (no need to try to set up NFS or whatever on Windows), and all this works the same on any OS and over a terminal connection. It is a great text UI for most computing tasks.
In a related discussion, it's interesting to read this overview of work required to get Emacs to support double-buffering on X11. Interestingly, it's by the same guy who proposed the new dumper patch, Daniel Colascione (who lurks around here)
> Rather than try to capture the state of the C library's memory-allocation subsystem, it simply marshals and saves the set of Elisp objects known to the editor.
This is what Allegro CL (and I presume all the other Common Lisps) have done for 30+ years. I'm surprised they didn't move to the marshalling idea before, for fear of the Glibc hacks going away.
EDIT:
Actually, unexec() was used in the early days, so the 30+ years is wrong. It's been more like 20+ years.
To me, the dispute is coming from what appears to be Zaretskii wanting nothing less than step 3. I can sympathize; I don't think I would want to receive a big pile of C were I a maintainer. It seems to me that temporary hacks tend to become permanent.
However, I think he's mistaken. As you said, that list makes a decent roadmap. IMO, the best approach to hand is to take the currently offered patch, and continue to work towards specifying and implementing optimizations to the elisp loader.
This way you are independent of glibc changes and platform quirks. It is faster, has arena support, and should have been the default glibc malloc anyway.
Why is the core of Emacs still written in C? Is it just fear of trying to rewrite it from the ground up in Lisp, or is there something fundamental to the architecture that makes Lisp unsuitable?
History, it's been in continuous development for over three decades now. And, being open source, a port of the C portions to a new language would require developer effort that doesn't always appear on demand.
Also portability. GNU Emacs supports a fair number of platforms and, consequently, has a lot of things included in it for different platforms, different presentation layers (GUI, CLI). It's not a small task to support the variety of environments that it presently does with a novel implementation.
And then there's the bootstrap issue. Emacs Lisp isn't compiled to a standard binary executable format; it uses its own bytecode. So even if you did port more of the C portions to elisp, you'd still need an environment to run that elisp in, and the problem that the dump file solves actually gets worse: more elisp needs to be loaded and executed before you get a functional editor.
There is work to port elisp to run on top of Guile, but it's not done yet. Lacking developer resources and other things. Like the issue of losing support for various environments, it also doesn't 100% support the existing base of elisp packages out there, which is a deal breaker for many people.
Because that's the part that needs to interact with the OS at a low-level. Think of it as the "OS bindings" if you will. C is the lowest-common-denominator to do that.
Edit: I'm wrong, see avar's response below. It looks like it's for performance reasons.
Relatively speaking this really couldn't be further from the truth.
There's plenty of codebases that implement some programming language and only use C bindings for truly low-level primitives, such as external library calls, memory allocation etc, leaving any substantial logic that's built on top of those OS primitives to a higher level language.
Emacs is not such a codebase, most of the C code by volume is things that could perfectly well be implemented in Emacs Lisp itself, but aren't because Emacs Lisp is relatively slow.
So while it's fine for "scripting" Emacs itself, things like regexes, anything that has to do with low-level character handling, most of the GUI layout of Emacs itself (i.e. the buffer logic etc, not actually calling ncurses or X) etc. is written in C.
> most of the C code by volume is things that could perfectly well be implemented in Emacs Lisp itself,
A substantial part of the GNU Emacs core is written in C: the byte code virtual machine, the core Emacs Lisp language implementation, loading/saving images, memory management, ... That's not implementable in the current Lisp architecture (which additionally is single threaded).
If we want to move more of the language implementation in Lisp, that's possible - see implementations like SBCL or Clozure CL. Porting/maintaining these seems more difficult. But then you can implement regexes and other stuff in Lisp.
Yes. Basically the Emacs "C bit" is not just a Lisp implementation like CLISP or SBCL; it also has a load of text editor code. Having a "CLmacs", i.e. an emacs written in Common Lisp, is something a lot of people would really like. The problem, of course, is the vast amount of elisp that we currently use in our editors.
> it's also the sort of thing that could give vi a definitive advantage in the interminable editor wars
Is the editor war still a thing? I was under the impression that emacs, vim and more recent editors (Sublime and Atom, to name two) had each found their core audience and were quite distinct.
Or maybe it's just that I've been using vim for long enough. I can change my main language every few years, but I can't see myself changing my editor. Maybe the "editor war" is more about hesitating between editors at the beginning.
I don't get it. In 2016, on an i7 with 32GB of RAM and 2 striped SSDs, emacs (spacemacs, actually) still takes around 1-2s to start, which forces me to use emacsclient, and that is already the second stage of its initialization, started from the snapshot? I'm impressed.
I've loved and used Emacs for ~20 years, but if Emacs were to become slow, then if I were to have a replacement editor that could do the following (w/no X or window manager) without additional config in Linux, I'd use it instead:
arrow keys to move
add and delete text anywhere
paste from terminal buffer
ctrl-s -> search (and continue to find next match)
ctrl-v -> down
ctrl-esc -> up
ctrl-k -> kill line
ctrl-x ctrl-s -> save
ctrl-x ctrl-c -> quit
ctrl-a -> goto beginning of line
ctrl-e -> goto end of line
I don't even use selection anymore, because I can just use the terminal window copy/paste.
CTRL-S: Save
CTRL-Q: Quit
CTRL-F: Find string in file (ESC to exit search, arrows to navigate)
It's available in a lot of well-used distros: https://pkgs.org/search/kilo but doesn't look like it's in Arch, etc.
Kilua looks cool also as it has more similar keybindings to Emacs[2]:
Ctrl-x Ctrl-o Open a new file in the current buffer.
Ctrl-x Ctrl-s Save the current file.
Ctrl-x Ctrl-c Quit.
Ctrl-x c Create a new buffer
Ctrl-x n Move to the next buffer.
Ctrl-x p Move to the previous buffer.
Ctrl-x b Select buffer from a list
M-x Evaluate lua at the prompt.
Ctrl-r: Regular expression search.
but the goal would be to have that available in a package manager in a default install, so that after logging into any server where I'm a sudoer, I could:
Never even heard about the dumper in Emacs. It sounds like a crazy idea that is prone to break at the slightest change. Any shelving/unshelving implementation for processes should come from the kernel, which owns the virtual memory mappings: the kernel can basically already shelve a running process by just swapping it out to disk completely. And even that scheme is prone to break as soon as anything does I/O.
FYI, as for Emacs: I just fire up "emacs -nw" inside tmux and let it run for months. I call make-frame-on-display to add a window on my X session, but I can close that or restart X without having to kill my Emacs inside tmux, along with a few other long-running processes such as my mail reader and IRC clients.
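For reference, attaching that frame is a single call from the running session (":0" is just the usual default display name; yours may differ):
(make-frame-on-display ":0")   ; open a GUI frame on that X server; C-x 5 0 closes just the frame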
The dumper approach in Emacs predates GNU Emacs; it was how the original TECO Emacs worked.
The real fix would be to make dynamically linkable compiled elisp files (i.e. .so files) and let the system linker make Emacs start quickly just like any other program.
> The real fix would be to make dynamically linkable compiled elisp files (i.e. .so files) and let the system linker make Emacs start quickly just like any other program.
I'm not sure I understand, but this seems like a lot of work for little reward. Newer versions of Emacs work with a C library that's missing these dumper hooks, and if you have a newer C library, you can update to a newer version of Emacs.
This seems like a lot of work for people who want to stick with an older release.
Perhaps a related question: how essential is this functionality? I use Emacs on Mac OS X, which does not even have glibc. Yet the article suggests that this is only present on glibc. So should I be appalled at how long it takes my Emacs to load? Seems to me it's just as fast as it was on GNU/Linux.
I'm a little confused by this. I can understand why someone might want to preserve a repl env across invocations; however, the main reason given is startup time.
Given emacs daemon, which allows connecting with a thin client, how much startup time are we talking here? Can they not start once and connect like the rest of us?
My understanding is that this dumping of state is done during the Emacs build process. You build Emacs, initialize it and then dump out its state. Every time you launch Emacs, your instance starts from the dumped state.
Emacs has two init stages, you only see the latter.
The first is what gets it to the dump file.
The second is where your personal .emacs file gets loaded and executed.
The first is the one that takes too long and motivated them to create the dumper system to begin with; it needs some resolution (eliminating the dump by making that first init process fast enough, switching to a serialization of the Lisp objects rather than an image of the C program's memory, or something else).
So am I using this dumper system every time I launch emacs? I start emacs "normally" in that I simply launch it from the cmd line with the daemon flag.
In a sense, yes. You're using the product of the dumper. It's used to create an image of the state of the running system during the emacs build, and then delivered to end users like us as part of the emacs installation. That executable image state is loaded, and then your .emacs is called.
For a comparable model, check out the way Smalltalk images (particularly with Squeak and now Pharo) are distributed.
For kicks, try running "emacs-undumped". It's the base executable without the dumped state loaded, and part of what's used for creating the dump file during the emacs build process. At least for me it's pretty much unusable thanks to the terminal colors that it seems to insist on using for plain text.
That was on OS X. No idea if it shows up on other systems. The man page says it's not meant for end users like us, and that it's to be used with dumpemacs.
The concern about startup time seems a bit odd to me too. I use Emacs in the way it was originally designed to be used: I start an Emacs process and keep it running for weeks. I don't even use the client; when I want to edit a new file, I select the Emacs window and pull up the file from within that. The idea that I would type a shell command every time I want to edit a new file is foreign to me -- I mean, I know that's how dinky little single-buffer editors like nano and pico work, but I would never use a sophisticated multi-buffer editor like Emacs (or IntelliJ IDEA or Eclipse or Visual Studio or ...) in that mode.
A text editor will always be with you, no matter what you're working on. IDEs tend to be for a specific host platform, language, and sometimes target platform. For example, Xcode only runs on a Mac, understands a limited set of languages, and is best used for building Mac and iOS apps. MS Visual Studio only runs on Windows, understands more but different languages, and is best used for building for Microsoft platforms. Android Studio is best used for Android things. IntelliJ and Eclipse are best used for Java.
If anyone has heard of your programming language, there is probably an Emacs mode for it.
More to the point, Xcode, Visual Studio, Android Studio, IntelliJ, Eclipse, and so forth all have generally different UIs, which means that unless you spend all your development time in a single environment, you have to do a lot of really expensive mental context switching.
An editor like Emacs or Vim provides a unified user interface for all the different languages and development environments with which you use it, which eliminates almost all of that mental overhead - and can also invoke your compiler, (usually) integrate with your debugger, et cetera. IDEs still win on tight integration, but for any language popular enough that someone's invested the time to build a well-integrated IDE for it, Emacs or Vim will generally cover at least the 90% cases quite well, too.
If I ever try Emacs again, I'll have a look. In the past I've found clang-based plugins seem to have a half-life of about 6 months - and while Emacs might live forever, if your plugins don't, then you are relearning a bunch of things anyway.
PS: at first I thought your URL was some kind of joke I didn't get, until I clicked it and found it linked to a genuine project.
Emacs users don't have to worry about minor errors because we don't make them. Vi users don't, either (they already made a huge error, so all their little ones don't matter).
On a more serious note, a lot of the features of an IDE (e.g. code completion, method lookup, compiling, and version control integration) are trivially turned on with emacs. I use etags (ctags, etc) and have macros bound to various frequently used git commands, auto-compile when I change a translation unit, etc. My Emacs is an IDE. And much more. Vi has similar functionality available to some degree or other and even if it didn't, with evil mode in emacs vi users can have it, too.
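For what it's worth, that kind of binding only takes a few lines. A hypothetical sketch - the function name, key, and command are mine, not anything standard:
;; Bind a frequently used git command to a key.
(defun my-git-status ()
  "Show `git status' for the current buffer's directory."
  (interactive)
  (shell-command "git status"))
(global-set-key (kbd "C-c g") #'my-git-status)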
Most if not all IDEs separate compilation from the actual editing, so the choice of editor is really orthogonal to whatever is doing the compiling. Ideally you want to be able to do all of this from the command line anyway, so you can run automated tests when code is checked in. Vim and Emacs can be set up to emulate a pretty convincing IDE experience, but the separation between editor and compiler is much clearer.
To reverse your question: why require an IDE if the same tools work on the command line too?
Vim is pretty much an IDE. I type :make (not :!make) to build. Vim gathers the errors into a quick-fix list, and navigates through them. I have Vim's :grep bound to doing lid lookups on a mkid-generated database: lightning fast to find all occurrences of an identifier. And of course tags for chasing identifier definitions, thanks to a ctags-generated database. Vim has very good syntax highlighting and indentation; many times I spot a syntax error because it is flagged by Vim. I also have a Vim add-on that performs code completion. I type foo. and the list of members of structure foo comes up, etc.
I don't step-debug inside Vim, but it looks like there are ways.
At least for Emacs, one of the things I like about it is the ability to add functionality when I need it. For example, implementing a "go back to the last place I modified this file" command (super useful to me, at least) and then binding it to a key would likely be hard to "add" to an existing IDE. With Emacs, not so much.
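As a sketch of how little that takes - the function name, the key, and the idea of mining the undo list are all choices I'm making up here, not an existing package:
;; Jump to the most recently changed spot, using the buffer's undo history.
(defun my-goto-last-change ()
  (interactive)
  (let ((pos (and (listp buffer-undo-list)   ; the list is `t' when undo is disabled
                  (catch 'found
                    (dolist (entry buffer-undo-list)
                      (cond
                       ;; (BEG . END) records an insertion
                       ((and (consp entry) (integerp (car entry)) (integerp (cdr entry)))
                        (throw 'found (cdr entry)))
                       ;; (TEXT . POSITION) records a deletion; POSITION may be negative
                       ((and (consp entry) (stringp (car entry)) (integerp (cdr entry)))
                        (throw 'found (abs (cdr entry))))))))))
    (if pos (goto-char pos) (message "No recorded changes in this buffer"))))
(global-set-key (kbd "C-c .") #'my-goto-last-change)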
It's also very easy to add ad-hoc functionality directly from the editor. You can start with a keyboard macro (F3, do crap, F4, then hit F4 again to do "do crap" as many times as you want) and go all the way up to writing elisp packages. M-x find-function makes it easy to look at the internals of whatever library you're using. defadvice makes it easy to do horrible, but expedient, things and get on with your life. IMHO, if you're not writing at least a little bit of elisp, you're not taking advantage of Emacs's real potential.
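To make the defadvice point concrete, here is the sort of horrible-but-expedient thing it lets you do (entirely made up, purely illustrative):
;; Log every file visited via `find-file' to the *Messages* buffer.
(defadvice find-file (before my-log-find-file activate)
  "Record which files get opened."
  (message "find-file: %s" (ad-get-arg 0)))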
The thing is that an editor is one of the tools that a programmer uses. We also use build tools, debuggers, profilers, visualisation tools, etc, etc.
IDE stands for "integrated development environment", but to do the "integration" part, what it usually means is that it bundles tools for you. So you are stuck with the editor, debugger, build tool, etc of your IDE. Every time you choose not to use that "integrated" tool reduces the value of the IDE.
For example, if you decide to write your own build files and start the build process off differently than the IDE expects, it's no longer "integrated". If you decide to use a different debugger, it's no longer "integrated".
Choosing not to use an IDE is mostly about wanting to be flexible about the tools you use. Sometimes the tools in the IDE are really good for your job (for example, some of the refactoring tools can actually make certain IDEs worth it alone in some environments). Other tools, like the simplified build systems, almost always cause more trouble than they are worth in the long run.
When you stop to think about it, how difficult is it to "integrate" a custom set of tools without an IDE? We are programmers. Especially programmers who come from a Unix background are very much used to building their own environments. It's not nearly so hard as it seems -- especially because most of the heavy lifting has already been done by people before you.
In the case of Emacs, it practically is an IDE, except that you can plug damn near anything into it with a little elbow grease. I know of programmers who never leave Emacs. Ever.
So to sum up, there is almost nothing you do in your IDE that I don't also do in my custom built DE. My biggest gripe about IDEs is that I wish they would unintegrate their tools so I could use them separately :-)
You should be able to edit text in a familiar manner regardless of what kind of text it is. It might be source code, or it might be an email, a webpage, a document, your shell, a debugger, or anything else you can think of. No matter what you're editing you'll have the same features, keyboard shortcuts, etc available.
A programmer who wants a good universal text editor will find one that also has good integration with their compiler and debugger; emacs and vim are both pretty good choices there.
> If Emacs adopted one of the proposals to use a standard Lisp dialect
That should be 'if Emacs adopted the standard Lisp dialect.' There's only one standard Lisp, and that's Common Lisp[1][2].
I do indeed think that it would be wonderful if the elisp engine were reimplemented in Lisp, but it's a tremendous amount of work, with a lot of potential incompatibilities in the short term, with very little to show until the long term.
[1] That doesn't mean that other Lisp-like languages aren't awesome. Racket, in particular, leaps to mind as something which is massively cool. Clojure isn't to my taste, but I can understand why some folks like it. Scheme has its virtues too. But none of them is Lisp, unlike elisp, which really is a Lisp.
[2] There are also EuLisp & ISLISP, but they're effectively dead.
>There's only one standard Lisp, and that's Common Lisp
That isn't true. Scheme is a Lisp. Why would you think otherwise?
Is it because nil != false? Even McCarthy admitted that that was a random decision, and possibly a mistake.
Is it because it hasn't got "proper" macros? That's a common misconception, but it's not true: all but the most minimal implementations provide an imperative macro system.
Is it because it's lisp-1? Many lisps have been.
Is it because they have different histories? Because that's not true. Both came out of MIT. Many people were on both standardization committees then, and many people are active in both communities now. The first Scheme was written in MACLisp, for crying out loud.
> Is it because nil != false? Even McCarthy admitted that that was a random decision, and possibly a mistake.
I think that code written where NIL is the sole false value tends to read better than code with a distinguished false value, but I'm open to the idea that uglier code is more reliable.
> That's a common misconception, but it's not true: all but the most minimal implementations provide an imperative macro system.
But they're not standardised. The manner in which they affect the runtime environment is not standardised. It's just not a sound basis for a portable project: in practice any large software project will run not on Scheme but on Guile Scheme, or Chicken Scheme.
> Is it because it's lisp-1? Many lisps have been.
Only mistakenly. I used to think that Lisp-1 is better: after all, I'm used to other languages like C or JavaScript where functions and variables share the same namespace.
I've come to realise that 'because C & JavaScript do it' is hardly sufficient for something to be good *grin*
It turns out that the multiple namespaces of Lisp are extremely useful for industrial-strength software.
> Many people were on both standardization committees then
Sure, there's lots of cross-pollination between Lisp & Scheme, because they are both good languages. I don't dislike Scheme: in its place, it's a very interesting way to learn certain concepts. It's Lisp-like, which means it's certainly not all bad.
But it's not an actual descendant of a Lisp: it's its own independent language. It has flaws (continuations which make UNWIND-PROTECT impossible; nil ≠ #f; Lisp-1; worst of all: it's under-specified), but it's not bad. The world would probably be a worse place if there were no Scheme in it.
But I think that emacs would be worse if it were implemented in Scheme rather than Lisp.
>I think that code written where NIL is the sole false value tends to read better than code with a distinguished false value, but I'm open to the idea that uglier code is more reliable.
It's less reliable, and less clean. If you can't see the problem with overloading false, look at C, another language which overloads false and suffers for it.
>Only mistakenly. I used to think that Lisp-1 is better: after all, I'm used to other languages like C or JavaScript where functions and variables share the same namespace.
I have yet to see a disadvantage to being a lisp-1, and it comes with many advantages: we don't have to FUNCTION and FUNCALL everything, which makes our code a good deal cleaner.
>But it's not an actual descendant of a Lisp
That is definitely wrong. It is emphatically a descendant of Lisp. In fact, the very first paper on Scheme, AI Lab Memo 349, described it as "a full funarg lisp."
We can argue about whether Scheme is a lisp, but its lineage is well-documented: if you can't trust the language's creator on its origins, then who can you trust?
>it's under-specified
This is true. However, R7RS is working on fixing this. And as I understand it, it's even including a standardized procedural macro mechanism, so there you go.
> and many people are active in both communities now.
Hardly.
> Many people were on both standardization committees then
Guy L. Steele worked on the High Performance Fortran standard. Makes it a Lisp...
> The first Scheme was written in MACLisp, ...
One of the first Haskell implementations was written in Lisp. The first ML was written in Lisp.
Lots of different languages were implemented (but not initially developed) in Lisp at some point in time. Incl. Scheme, Logo, Prolog, Dylan, C, Ada, Fortran, Python, Pascal, Clojure, ...
You only addressed my points showing there was a link between the communities. Just because there are links between the lisp community and other communities doesn't mean Scheme isn't a lisp.
It just invalidates your arguments, nothing more. That parts of the first Scheme were once written in Maclisp 40 years ago does not make it a Lisp. An Emacs was once implemented in Maclisp, too. It doesn't make it a Lisp. It's an editor.
Scheme is a different programming language with its own set of standards like R*RS and IEEE Scheme, its own community, its own code, its own libraries, its own mailing lists, its own online forums, its own user groups, its own funding, ...
ISLisp, Elisp and Common Lisp share the same core language. ISLisp is basically a simplified Common Lisp (a Lisp-2 with a CLOS-variant and Conditions in the standard). Kent Pitman wrote a compatibility package which allows running ISLisp in CL. Elisp and Common Lisp both come from Maclisp, so they share the same core language. Elisp has a lot of further Common Lisp compatibility ( https://www.gnu.org/software/emacs/manual/html_node/cl/ ), which is widely used in Emacs Lisp code. Standard Lisp is another Lisp which shares the core language.
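As a trivial illustration of that compatibility layer in day-to-day Elisp (the forms are mine, not from the manual):
(require 'cl-lib)                              ; Emacs' Common Lisp compatibility library
(cl-loop for i from 1 to 3 collect (* i i))    ; => (1 4 9) - a CL-style LOOP in Emacs Lisp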
Eulisp is long gone. RIP.
> Lisp isn't a language: it's a family of languages.
True. A family of languages which share a common core.
> And the idea that the languages don't share community isn't true.
Sorry. I am primarily a schemer, so that's what I'm referring to.
Readtables and generic setters (as specified by SRFI-17: a bit limited compared to CL, as I understand it) are very common extensions, and there are several loop macros floating around, all inspired by CL.
We also technically took dynamic scope from CL, but most schemes only have R7RS dynamic scope, which isn't so much dynamic scope as it is a hack that looks a lot like dynamic scope if you squint.
Yes - just as lexical scope existed before Scheme. However, the Scheme implementation was directly inspired by CL. As are (and I forgot to add this) our object systems.
I reached the part about the then clause of if/then/else needing a (progn ...) if it is to be a multi-statement clause, while the else clause doesn't need one. But you can avoid the progn, if you don't have an else clause, by using when instead of if!
I facepalmed so hard I needed reconstructive surgery. I haven't fully recovered from that. (in reality it's because, though ELisp itself is bad enough, Emacs' API is daunting)
It's just a consequence of the Lisp syntax: (if COND THEN ELSE), so this list has four elements, and the second is the condition, the third is one leg, and the last is the other leg. How would you make the "THEN" part longer, except by wrapping parens around it?
To use a metaphor: In typical infix languages, I can only add two numbers, not three. Why do I need to do strange things like using two plus signs as in "3 + 4 + 5" to add three numbers? Lisp can do it with one: (+ 3 4 5)
The answer to this metaphor is that the very idea of infix results in this limit. In the same way, the very idea of expressing everything as s-expressions results in that limitation.
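Concretely, in Emacs Lisp (the condition and file name are invented just for illustration):
(if (file-exists-p "TODO.org")
    (progn                                ; THEN is a single slot, so wrap the extra forms
      (find-file "TODO.org")
      (message "opened the TODO file"))
  (message "nothing to do"))              ; a single-form ELSE needs no wrapper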
You might be interested in this ancient discussion on the Arc Forum about the benefits and drawbacks of Lisp conditionals: http://arclanguage.org/item?id=18218
cond actually is a predecessor of if/else you know from other languages. Yes, it was first. As for progn, it's pretty much the equivalent of curly braces from C-like languages (or of the comma operator, if you care about its value).
Why if seems to be commonly defined as (condition then-clause &rest else-clauses) instead of (condition then-clause else-clause), I don't know. History/common use patterns, probably?
It's not really separating related semantics; it's that an if-then-else option has to have both a then clause and an else clause. (IF then else) is a natural way to express that, but it does mean that if you want to do more than one thing in the then clause then you'll need to have a PROGN. You could have multiple statements in the else clause (which is what emacs does).
In Common Lisp, the syntax is:
if test-form then-form [else-form] => result*
In emacs, it's:
if test-form then-form [else-form]* => result*
I think that the emacs form is weird and annoying, but might make certain forms of code easier (e.g. check for something and short-circuit, else calculate something more deeply).
COND is a different beast entirely.
Note that WHEN & UNLESS both have an implicit PROGN.
OK, so the weirdness is really that Emacs' else-form accepts multiple statements, and/or that there is no implicit progn for both then-form and else-form.
I.e., I would've expected something like this:
(if (condition)
    (
      (then do something)
      (and more things)
    )
    (
      (else do something else)
      (and more other things)
    )
)
(and yeah, I'm definitely showing my C-semantic preferences here aren't I?)
Yup. And the reason not to default to a list of then-forms and a list of else-forms is partly æsthetic, but partly dealing with the common case, because:
(if (eq foo bar) (baz))
looks like a function call, not returning the value of baz (using your syntax).
And of course Lisp is often written in a semi-functional way, so PROGN is, while not rare or really even avoided, not the usual way of doing things, usually.
Not in Common Lisp: that else is just a variable reference there. Unbound, if you're lucky; bound to a true value if you're somewhat less lucky; and bound to nil if you're haplessly unfortunate. :)
There cannot be an implicit progn for both! Because we don't know where one ends and the other begins.
That is, unless we introduce a signal, like an else keyword/label which separates them, like this:
(if cond form form form
    else form form form)
Such macros have existed. A certain John Foderaro of Franz, Inc. was (is?) known for favoring and promoting his if* macro.
I've always resisted writing this macro because I felt it has a slight readability issue:
(my-if (some condition)
   ((consequent)
    (consequent2)
    (consequent3))
   ((alternative) ;; relatively quiet signal here
    (alternative)))
Maybe it's not that bad, after all. Still, it's just saving a few keystrokes to insert progn and doesn't buy much over cond, which is more general, allowing multiple conditions:
NOTE: cond existed in Lisp first, as a special form. The if macro came later as a syntactic sugar for simple situations involving just one condition!
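For instance, a cond with several clauses (an invented example, assuming x is bound; note that each clause body is an implicit progn):
(cond ((< x 0) (message "negative") (- x))
      ((= x 0) (message "zero")     0)
      (t       (message "positive") x))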
The thing is that if you keep most of your code functional, a lot of this is moot. We need multiple statements when we are doing something imperative. There is never any reason to have (progn S1 S2) unless S1 has a side effect. If S1 has no side effect, then this is equivalent to (progn S2), which is just S2. If you avoid side effects, then you don't need any progn most of the time (implicit or not).
This is why progn is called what it is; it's Lisp's feature for writing a "program" (in the sense of an imperative list of things to do). Well, "prog" is that feature; and "progn" is the variant which returns the value of the n-th form.
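For reference, the whole family in one place (each form evaluates all of its arguments; only the returned value differs):
(progn 1 2 3)   ; => 3, the value of the last form
(prog1 1 2 3)   ; => 1, the value of the first form
(prog2 1 2 3)   ; => 2, the value of the second form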
Another mitigation is that a lot of the time code binds variables anyway with a let:
(if condition
    (let (... vars ...)
      this
      that)
  (let (.... other vars ...)
    other))
That is true, but somewhat alleviated by Emacs' choice to indent the THEN form at a different level than [ELSE]*. A novice would notice that the code looks incorrect as they were typing it.
Firstly, because the way Lisp is structured makes it impractical to have implicit progn (or implicit begin in the schemes) in if.
Secondly, because lisp, while not a functional language, has functional leanings: you would be surprised how often it is that one form is all you need.
CL-USER 48 > (LOOP REPEAT 2
                   FOR number = (READ)
                   IF (> number 0) DO
                     (princ '(the number is larger than 0))
                     (print 'true)
                     (terpri)
                   ELSE DO
                     (princ '(the number is not larger than 0))
                     (print 'false)
                     (terpri))
2
(THE NUMBER IS LARGER THAN 0)
TRUE
-2
(THE NUMBER IS NOT LARGER THAN 0)
FALSE
NIL
It is also not that difficult to write a multi-expression IF. Here is a sketch:
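One possible version in Emacs Lisp - the macro name, the else marker, and the cl-lib dependency are my own choices, nothing standard:
(require 'cl-lib)

(defmacro my-if* (test &rest forms)
  "IF that allows several forms per branch, split on the symbol `else'."
  (let* ((pos  (cl-position 'else forms))
         (then (if pos (cl-subseq forms 0 pos) forms))
         (rest (and pos (cl-subseq forms (1+ pos)))))
    `(if ,test
         (progn ,@then)
       ,@rest)))

;; Usage (assuming n and sign are bound elsewhere):
(my-if* (> n 0)
    (message "positive")
    (setq sign 1)
  else
    (message "not positive")
    (setq sign -1))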