Nearly passed based on title alone, was very glad I clicked in. Just the first paragraph alone sets up this excellent & useful neologism, Alien Artefacts:
> The purpose of this blog post is to introduce the concept of alien artefacts, a subcategory of legacy code. I use the term to describe particularly complicated and important pieces of software written by very smart engineers that are no longer working for the company—and thus not available to support it. The software works really well for what it was designed to do, but it is highly resistant to change.
We have some very wild search code that I think qualifies. I also recently added tests & TypeScript to, then slowly expanded, some really neat code that takes Immutable.js data, maps the entities, then caches the result in a WeakMap so it can be re-resolved later. Sounds simple, but there was enough extra stuff going on that it took some putting heads together to decipher the true inner workings. Both are very distinct features which hadn't had the typical accrual of cruft: mostly untouched since inception, off in their own area. Alien Artefacts feels like a great term.
(BTW what spelling is this? Same word as artifact or no?)
Probably the most alien artefact I ever created was the core abstraction of a C++ game engine that used the curiously recurring template pattern in combination with multiple inheritance.
The idea was to be able to switch between DirectX and OpenGL renderers, or restart the renderer from scratch on failure (i.e.: recoverable GPU driver crash, GPU unplug, restore from sleep, etc...). The problem with using a naive OO v-table based abstraction for this is that it incurs indirect call overheads on every call, including calls that are internal to each implementation. So with some clever template programming and multiple inheritance, I got to have my cake and eat it too: calls within a hierarchy of classes specialised for one back-end were always direct inline-able calls, but you could have shared code that was abstracted across different back-ends.
I opened that source file about a decade later and thought: "What crazy person came up with this madness!?" and then saw my name in the commit history. I was the mad one. The call was coming from inside the house.
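For anyone who hasn't run into the pattern, here is a minimal sketch of the general idea, not the engine's actual code. It assumes two CRTP base templates that each back-end inherits from (that's the multiple inheritance part), so shared code calls into the back-end through a static downcast instead of a v-table. All names (`RendererBase`, `DeviceRecovery`, `GlRenderer`, `DxRenderer`) are hypothetical, and the strings stand in for real API calls.

```cpp
#include <cassert>
#include <string>

// All names here are hypothetical; this illustrates CRTP, not the original engine.

// Shared frame logic, written once for every back-end.
template <typename Backend>
class RendererBase {
public:
    std::string drawFrame() {
        // Static downcast: the calls below are direct and inlinable, no v-table.
        Backend& self = static_cast<Backend&>(*this);
        return self.beginFrame() + "|" + self.submit() + "|end";
    }
};

// Second CRTP base (hence multiple inheritance): restart a back-end from scratch.
template <typename Backend>
class DeviceRecovery {
public:
    void recover() {
        Backend& self = static_cast<Backend&>(*this);
        self.shutdown();
        self.init();
        assert(self.alive());  // a restarted back-end must be usable again
        ++restarts_;
    }
    int restarts() const { return restarts_; }

private:
    int restarts_ = 0;
};

class GlRenderer : public RendererBase<GlRenderer>,
                   public DeviceRecovery<GlRenderer> {
public:
    void init() { alive_ = true; }
    void shutdown() { alive_ = false; }
    bool alive() const { return alive_; }
    std::string beginFrame() { return "gl:begin"; }
    std::string submit() { return "gl:submit"; }

private:
    bool alive_ = true;
};

class DxRenderer : public RendererBase<DxRenderer>,
                   public DeviceRecovery<DxRenderer> {
public:
    void init() { alive_ = true; }
    void shutdown() { alive_ = false; }
    bool alive() const { return alive_; }
    std::string beginFrame() { return "dx:begin"; }
    std::string submit() { return "dx:submit"; }

private:
    bool alive_ = true;
};
```

The trade-off is that `GlRenderer` and `DxRenderer` share no common runtime base type here, so picking a back-end at runtime would still need one thin layer of indirection at the very top.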
I remember writing this kind of stuff. Couple years after I left they'd re-written it, just so they would have somebody who understood it. My friend reported their engineer visiting his office often to exclaim over something he now understood!
It helps to rewrite something with example code in hand. You end up understanding everything.
(I did not mean to be opaque; we were a startup and it was hurry-up-and-be-done-yesterday code)
The main problem is that no one gets rewarded for maintenance. You might as well be invisible like facilities staff. The exception may be if it’s something that’s on the verge of breaking the company. Of course, in that case people will note that it was due to the lack of maintenance, but everyone will eventually ignore it again because most companies do not reward maintenance vs creating something new.
The months spent building these are often spent becoming a domain expert in one or several subdomains that are built into the artifact. That is not visible to the outside world. It's similar to hiring one or several consultants, except the consultants were that one person, digging into fluid dynamics and whatnot to write that one system. The problem is often not in the software, but in the "domain" self-schooling one has to go through to understand the process that is happening. Like sending a person to university again, but on the clock with a deadline.
Best example of an alien artifact I could come up with on the fly: The parties using it do not understand it, but can reuse it. They will have a hard time modifying it. To understand its properties, you need a deep understanding of the math, and of how the inverse square root can be mapped onto the binary properties of a float.
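The example itself seems to have been lost from the comment above, so this is an assumption: the description matches the widely circulated "fast inverse square root" trick. A minimal sketch of that technique follows, assuming a 32-bit IEEE 754 `float`; the function name `fastInvSqrt` is made up for illustration, and the constant `0x5f3759df` is the commonly quoted one.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: approximates 1/sqrt(x) for positive, normal floats.
inline float fastInvSqrt(float x) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");
    assert(x > 0.0f);

    // Reinterpret the float's bits as an integer (memcpy avoids aliasing UB).
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);

    // The bit pattern behaves roughly like a scaled log2(x), so halving and
    // negating it approximates log2(x^(-1/2)); the constant fixes the offset.
    bits = 0x5f3759dfu - (bits >> 1);

    float y;
    std::memcpy(&y, &bits, sizeof y);

    // One Newton-Raphson step on f(y) = 1/y^2 - x to tighten the estimate.
    y = y * (1.5f - 0.5f * x * y * y);
    return y;
}
```

You can call `fastInvSqrt(4.0f)` and get something close to 0.5 without understanding a single line of it, which is rather the point.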
I agree with many comments here, but especially with this one: the domains these alien artefacts touch are the things that make them so tricky. Nonlinear optimization is one such area I work with regularly.
It seems there is an analogy to Python and web programming here: a lot of people can work with things in Python/JS that are like alien artefacts to them.
Are some areas more prone to alien artefactness than other areas? What about deep learning - how many folks doing it actually understand things vs just use neatly packaged alien artefacts?
One of the systems I've had the honor of inheriting and building upon, I've described as originally gifted to humanity by Ancient Astronauts.
If you told a new-grad about how it was built, they might have a kneejerk reaction, since it went against some of the conventional wisdom they'd heard, in school or in blog posts, such as avoiding Not-Invented-Here Syndrome.
You'd have to find a way to convey to the new-grad that, in some situations, a few programmers can consistently accomplish big things that 10 or 100 programmers operating with the usual parameters could not.
But maybe, if we're a new grad, it's a disservice to us to hear that, just yet. Once we're out of school and in industry, there's a phase of really learning and appreciating the conventional wisdom in practice. Much of the wisdom is also a great default for most situations.
But later on, with experience, there's some point at which we can become a curmudgeonly enforcer who insists that a particular subset of wisdom always applies. Or we might be open to recognizing a situation in which the wisdom actually doesn't apply, or doesn't have to apply. Occasionally, there are huge wins from that.
SolveSpace is an alien artifact. I can safely say that because I've spent months working on it, porting it to Haiku OS. It is validating to know that this phenomenon is real enough to justify a name for it! https://github.com/realtaraharris/solvespace/tree/realtaraha...
We used to call that kind of code hemoglobin, or hemo for short. Hemoglobin is a molecule that's barely changed over hundreds of millions of years, is extremely complicated and does one thing really well -- bind to oxygen and carry it around. Change one or two amino acids and it stops working.
Another colloquialism for this kind of code is "happy fun ball". Do not feed or annoy happy fun ball...
> Thanks to ChatGPT for valuable feedback and editorial input on early drafts of this post. It was a joy collaborating with it. It provided consistent and helpful feedback, and was always available to assist with any changes or questions I had during the collaboration process.
That kind of thing will for sure go onto my domain blacklist.
It'll be interesting to see how this attitude holds up over time. I think using (and acknowledging using) AI tools will become normalized, much like mobile phone use has been. I remember a time (less than twenty years ago) when I was going to dinner with friends and one of them made (at the time) the huge social faux-pas of accepting a call at the table. People seriously complained about it. Now that's normal behavior.
It's still rude. I'll accept the quick spouse/so update, the actual emergency or the "hey I might have to talk to x" but that's about it. If you're there to hang out and talk then do that. If you're just going to fuck with your phone the whole time you might as well not be there.
I'll also accept phone use as a furtherance of the discussion, eg pulling up details, addresses, info, memes to share, etc, but that still all focuses back on us being there together, not you being there in name only.
If it might be important I don't mind if people answer and make it brief, like figure out quickly if it's urgent and do the "hey i'm at a restaurant, I'll call you back later" thing. I'm fine with waiting a minute if that saves someone a ton of hassle afterwards.
I would have preferred disclosure at the top. However, for me it was a good read, the whiff of AI's hand in it was barely noticeable, and I enjoyed the content.
Both are tools, but they are substantially different; they serve different purposes.
"I did some research (..), according to the source (..), we know (..), X (..) found out that (..)"
is different from laying out the knowledge and pretending it was a collaboration between the author and the tool.
"The tool": notice the tone. The personification of the tool by the author devalues the people who contributed to the source of the knowledge, and everything in between.
I'm not a fan of these "AI generative" tools. Not everything is dark though: I can appreciate how Bing chat always annotates the source of each piece of data it spits out. It's a step in the right direction, but I'm afraid it won't be enough to avoid our descent into biased, assisted collective intelligence. Perhaps it's an overreaction, perhaps it's the way to go, perhaps it's similar to pre/post electricity; we'll see, but we should be careful. Notice I only speak for myself. I have no problem with people using these tools as long as they disclose it, which the author did, so I should at least respect that!
Not very different. This actually gives me similar vibes to the author thanking Grammarly at the end of the post. I'd immediately think it's paid for by Grammarly, or at least question the author's intent. It's an ad for a closed commercial product.
I remember such a piece of code, perhaps the shortest ever created! It was 4 instructions in the machine language of ancient British ICL1900 computers. I - together with my colleague - spent some hours to understand it - firstly, what it does and then how and why it works!
The problem was that the ICL1900 had a rather obscure method of storing 6-bit characters in its 24-bit words (yes, bytes were unknown in ICL computers). Indexing was used to copy character strings, but the character position was stored in the first two bits of the index word, while the word address offset was in the right-hand part of the index. Since memory was extremely valuable, the remaining part of the index word was used for the counter. The fetch and store instructions thus worked such that, after a character was accessed, the counter was decremented and 1 was added to the 2-bit character position subregister, with overflow used to increment the address offset every 4th character. Very clever solution, because it made the string copying loop very tight!
The 4-instruction sequence we tried to interpret turned out to be a macro that modified the character index register backward, allowing the use of an almost identical loop for copying a string starting from its end... It was using arithmetic overflow in a way we had trouble understanding, and we had no access to the hardware :-)
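To make the forward-stepping index word described above a bit more concrete, here is a rough simulation. It is only a sketch under loud assumptions: the exact field widths (2-bit character position at the top, a 7-bit counter, a 15-bit word offset) are guessed for illustration and not taken from ICL documentation, every function name is hypothetical, and the carry is spelled out explicitly rather than falling out of the hardware arithmetic. The backward-stepping macro is not modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// ASSUMED layout of a 24-bit index word (illustrative, not from ICL manuals):
//   bits 23-22: character position within the word (0..3)
//   bits 21-15: remaining character count
//   bits 14-0 : word address offset
constexpr std::uint32_t kAddrMask = (1u << 15) - 1;
constexpr std::uint32_t kCountMask = (1u << 7) - 1;

inline std::uint32_t packIndex(std::uint32_t pos, std::uint32_t count,
                               std::uint32_t addr) {
    assert(pos < 4 && count <= kCountMask && addr <= kAddrMask);
    return (pos << 22) | (count << 15) | addr;
}
inline std::uint32_t indexPos(std::uint32_t w) { return (w >> 22) & 3u; }
inline std::uint32_t indexCount(std::uint32_t w) { return (w >> 15) & kCountMask; }
inline std::uint32_t indexAddr(std::uint32_t w) { return w & kAddrMask; }

// After one character access: decrement the counter, add 1 to the 2-bit
// position, and let its overflow bump the word offset every 4th character.
inline std::uint32_t stepIndex(std::uint32_t w) {
    assert(indexCount(w) > 0);
    const std::uint32_t pos = (indexPos(w) + 1) & 3u;
    const std::uint32_t carry = (pos == 0) ? 1u : 0u;
    return packIndex(pos, indexCount(w) - 1, (indexAddr(w) + carry) & kAddrMask);
}

// Four 6-bit characters per 24-bit word; position 0 is the most significant.
inline std::uint32_t fetchChar(const std::vector<std::uint32_t>& mem,
                               std::uint32_t idx) {
    return (mem[indexAddr(idx)] >> (18 - 6 * indexPos(idx))) & 0x3Fu;
}
inline void storeChar(std::vector<std::uint32_t>& mem, std::uint32_t idx,
                      std::uint32_t ch) {
    const std::uint32_t shift = 18 - 6 * indexPos(idx);
    std::uint32_t& word = mem[indexAddr(idx)];
    word = (word & ~(0x3Fu << shift)) | ((ch & 0x3Fu) << shift);
}

// The tight copy loop: both index words carry the same count.
inline void copyChars(std::vector<std::uint32_t>& mem, std::uint32_t src,
                      std::uint32_t dst) {
    while (indexCount(src) > 0) {
        storeChar(mem, dst, fetchChar(mem, src));
        src = stepIndex(src);
        dst = stepIndex(dst);
    }
}
```

Source and destination may start at different character positions inside their words; each index word carries its own position, so the loop body never has to care about word boundaries.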
No, it was at my university computer center. I wanted to improve the Executive program, the simplest operating system for the computer they had. So we took a binary dump of it... :)
This is a great metaphor. I'm familiar with it, both stuff I've inherited and (unfortunately) stuff I've written myself.
I do try to write a wall of comments for each hard-to-understand abstraction I'm introducing, including the rationale for the design, and why I haven't chosen the alternatives. This has been useful mostly for myself, as I'm still maintaining the same code after 12 years (or more, in some cases).
It's definitely a plus if you still have the alien who created the alien artefact around to answer questions - sort of like Master Yoda in Star Wars...
So to be clear, having such an artifact is better than having classical legacy code, because it shares all the same downsides but has a unique upside: being “well tested, well documented, and elegant”.
As someone who has worked with an artifact or a few over the years, I definitely agree. The most colorful example was when I tried to rewrite one from scratch just to thoroughly understand all the business logic in it. By the time I ironed out all the bugs my code was practically identical to the original.