> The few core developers they have do not work with modern tools like git and collaboration tools like Github, and don’t accept contributions, although they may or may not accept your suggestion for a new feature request.
The funny thing about this comment is that SQLite is as close to the gold standard of software quality as we have in the open source world. SQLite is the only program that I've ever used that reliably gets better with every release and never regresses. Could it be that this is precisely because they don't use "modern tools" and accept outside contributions?
I feel like a lot of fantastic software is made by a small number of people whose explicit culture is a mix of abnormally strong opinionatedness plus the dedication to execute on that by developing the tools and flow that feel just right.
Much like a lot of other "eccentric" artists in other realms, that eccentricity is, at least in part, a bravery of knowing what one wants and making that a reality, usually with compromises that others might not be comfortable making (efficiency, time, social interaction from a larger group, etc).
SQLite's quality is due to the DO-178B compliance that has been achieved with "test harness 3" (TH3).
Dr. Hipp's efforts to perfect TH3 likely did lower his happiness, but all the Android users stopped reporting bugs.
"The 100% MCD tests, that’s called TH3. That’s proprietary. I had the idea that we would sell those tests to avionics manufacturers and make money that way. We’ve sold exactly zero copies of that so that didn’t really work out... We crashed Oracle, including commercial versions of Oracle. We crashed DB2. Anything we could get our hands on, we tried it and we managed to crash it... I was just getting so tired of this because with this sort of thing, it’s the old joke of, you get 95% of the functionality with the first 95% of your budget, and the last 5% on the second 95% of your budget. It’s kind of the same thing. It’s pretty easy to get up to 90 or 95% test coverage. Getting that last 5% is really, really hard and it took about a year for me to get there, but once we got to that point, we stopped getting bug reports from Android."
> he managed to segfault every single database engine he tried, including SQLite, except for Postgres. Postgres always ran and gave the correct answer. We were never able to find a fault in that. The Postgres people tell me that we just weren’t trying hard enough.
I've always felt like Postgres is like one of those big old Detroit Diesel V12s that power generators and mining trucks and things. It's slow and loud and hopelessly thirsty compared to the modern stuff you get nowadays, and it'll continue to be just as slow and loud and hopelessly thirsty for another 40 or 50 years without stopping even once if you don't fiddle with it.
(I should say that it is not at all difficult to crash an Oracle dedicated server process. I've seen quite a few. This doesn't crash the database (usually).
I've never run an instance in MTS mode, so I've never seen a shared server crash, although I think it would be far from difficult.
I'd be curious which flavor of Db2 crashed: UDB, mainframe, or OS/400, as they are very different.)
it's not that "best practices" or any of those things are what causes trouble; it's failing to recognize that they're just tools, and people will still be the ones doing the work. And people should never be treated as merely tools.
You can use all of those things to enable people to do things better and with less friction, but you also need to keep in mind that if a tool becomes more of a hindrance than a help, you should go looking for a new one.
> it's not that "best practices" or any of those things are what causes trouble; it's failing to recognize that they're just tools, and people will still be the ones doing the work. And people should never be treated as merely tools.
For me, the concept of best practices is pernicious because it is a delegation of authority to external consensus which inevitably will lead to people being treated as tools as they are forced to contort to said best practices. The moment something becomes best practice, it becomes dogma.
This comment perfectly encapsulates the point that I am making about best practices: the concept is used as a cudgel to silence debate and to confer a sense of superiority on the practitioner of "best practice." It is almost always an appeal to authority.
No one wants cowboy pilots ignoring ground control. Doctors, though, do not exactly have the best historical track record.
Knowledge communities should indeed work towards consensus and constantly be trying to improve themselves. Consensus though is not always desirable. Often consensus goes in very, very dark directions.
Even if there is some universal best practice for some particular problem, my belief is that codifying certain things as "best practice" and policing the use of alternative strategies is more likely to get in the way of actually getting closer to that platonic ideal.
Perhaps a better example might be "covering indexes," or what Oracle would call an "index full scan."
It is an idea so efficient that to disregard it is inefficiency.
"I had never heard of, for example, a covering index. I was invited to fly to a conference, it was a PHP conference in Germany somewhere, because PHP had integrated SQLite into the project. They wanted me to talk there, so I went over and I was at the conference, but David Axmark was at that conference, as well. He’s one of the original MySQL people.
"David was giving a talk and he explained about how MySQL did covering indexes. I thought, “Wow, that’s a really clever idea.” A covering index is when you have an index and it has multiple columns in the index and you’re doing a query on just the first couple of columns in the index and the answer you want is in the remaining columns in the index. When that happens, the database engine can use just the index. It never has to refer to the original table, and that makes things go faster if it only has to look up one thing.
"Adam: It becomes like a key value store, but just on the index.
"Richard: Right, right, so, on the fly home, on a Delta Airlines flight, it was not a crowded flight. I had the whole row. I spread out. I opened my laptop and I implemented covering indexes for SQLite mid-Atlantic."
This is also related to Oracle's "skip scan" of indexes.
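The mechanics Hipp describes show up directly in SQLite's query planner output. Here is a minimal sketch using Python's built-in sqlite3 module (the table and column names are made up for the example):

```python
import sqlite3

# Toy illustration of a covering index: when the query filters on the
# index's leading columns and selects only columns that are also in the
# index, the planner answers from the index alone and never reads the table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(a INTEGER, b INTEGER, c INTEGER)")
con.execute("CREATE INDEX idx_ab_c ON t(a, b, c)")  # c is "covered"
con.executemany("INSERT INTO t VALUES (?, ?, ?)",
                [(i, i * 2, i * 3) for i in range(100)])

# Filter on (a, b), select only c -- all three live in the index.
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT c FROM t WHERE a = 5 AND b = 10"
).fetchall()
print(plan[0][3])  # detail string mentions "COVERING INDEX"
```

If `c` is dropped from the index, the same query plan falls back to an ordinary index search plus a lookup into the table itself.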
> And people should never be treated as merely tools.
Maybe on a tight-knit team people don't mind being treated like tools, because they understand what needs to get done next and see that it makes the most sense for them to do it; it's nothing personal.
At my freshman year "1st day" our university president gave us an inspirational speech in which he said "people say our program just trains machines... I want you to know we don't train machines. We educate them."
I'd say that if you have a tight-knit team, you are already doing the very opposite of treating people as tools. There's nothing wrong with having a shared understanding of a goal and then assuming a specific role in the effort to accomplish that goal; people are very good at that.
The problem is when you think of people the same way you think of a hammer when you use it to hit nails: The hammer doesn't matter, only that the nail goes in.
Best practices are subjective. What is best practice for C is not the same as for Python.
SQL DBs provide consistency guarantees around mutating linked lists. It’s not hard to do that in code and use any data storage format.
Imo software engineers have made software “too literal” and generated a bunch of “products” to pitch in meetings. This is all constraints on electron state given an application. A whole lot of books are sold about unit tests but I know from experience a whole lot of critical software systems have zero modern test coverage. A lot of myths about the necessity of this and that to software have been hyped to sell stock in software companies the last couple decades.
"Best practices" are just a summary of what someone (or a group of someones) thinks is something that is broadly applicable, allowing you to skip much of the research required to figure out what options there are even available.
Of course, dogmatic adherence to any principle is a problem (including this one).
Tools can be misused, but that doesn't really affect how useful they can be; though I think better tools are generally the kind that people will naturally use correctly, that's not a requirement.
I don't think you need "abnormally strong opinionatedness" or anything else special: all you need is a certain (long-term) dedication to the project and willingness to just put in the work.
Almost every project is an exercise in trade-offs; including every possible feature is almost always impossible, and never mind that it's the (usually small) group of core devs who need to actually maintain all those features.
I interpreted "opinionatedness" as meaning they have a clear definition of what sqlite is and isn't, including the vision of where it's headed. That would result in a team with very strong opinions about which changes and implementations are a good or bad fit for sqlite.
Can a project consistently make the right trade-offs without having strong opinions like that?
These devs provide a platform, and any change to a platform has a huge impact on its users. They have a plan they follow, and every project has layers. Constraints can be good when applied correctly, as in this case.
Fossil is not less modern than Git, just less popular.
Under the hood it seems a lot like Git. The UI is more Hg-like. I disagree with D. Richard Hipp's dislike of rebasing, but he's entitled to that opinion, and a lot of people agree with him.
Calling Fossil "not modern" is a real turn-off. TFA seems to be arguing for buzzwords over substance.
Why? Fossil is a great name: the past of a software project is... fossilized in the VCS, and looking through it is like doing archeology. I just wish it adopted rebase as an optional workflow.
When the retelling of the history of virtualisation ignored everything before Xen, I questioned the value of the essay.
When it got to asserting that Fossil isn't modern, I discarded it. Fossil's a DVCS, but unlike git it chooses to integrate project tooling for things like bug management with the code repo. You can argue about whether you like the approach. But 'not modern' is an absurd statement.
Agree, after reading up about fossil. Except for one thing: I don't want the "closed team" culture that was intentionally baked into the tool.
When git replaced SVN, it was so empowering that I, as an individual, was able to use the full maintainer workflow without the blessing of a maintainer, privately by default.
Before git, we saved the output of "svn diff" into a .patch file for tracking personal changes. When submitting a patch, the maintainer had to write a commit message for you. With some luck, you even got credited. For sharing a sensible feature-branch, you had to become a maintainer with full access. This higher bar has advantages (it tends to create more full-access maintainers, for one). However, it sends this message: "Yes open source, but you are not one of us."
Yes, fossil has this great feature of showing "what was everyone up to after release X". I miss that in git. (Closest workaround: git fetch --all && gitk --all.) But if "everyone" means just the core devs, then I'm out.
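For what it's worth, the git workaround can be pushed a bit further than `gitk --all`. Here's a self-contained sketch in a throwaway repo (the tag name `v1.0` and the commit messages are made up) that approximates Fossil's "since release X" view:

```shell
# Build a throwaway repo with a tagged release and some post-release work,
# then list every branch's commits that are not reachable from the tag.
set -e
tmp=$(mktemp -d) && cd "$tmp" && git init -q .
git -c user.email=a@b -c user.name=a commit -q --allow-empty -m "release"
git tag v1.0
git checkout -q -b feature
git -c user.email=a@b -c user.name=a commit -q --allow-empty -m "post-release work"

# "What was everyone up to after release v1.0," on any branch:
git log --all --oneline --graph --not v1.0
```

The `--not v1.0` revision range is what does the work: it excludes everything reachable from the release tag, leaving only the post-release activity.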
> I don't want the "closed team" culture that was intentionally baked into the tool.
I've been using Fossil for years and TBH I don't see that "closed team" culture you speak of (though I also have almost all of my projects as "open source, not open contribution").
Fossil is a distributed version control system, and in fact it is "more" distributed than git if you consider that people tend to tie git to centralized services like GitHub to get more than just the VCS part. A Fossil repository contains not only the versioned files, but also a wiki, tickets/bugtracker, forum, chat room, blog/technotes - even the theme is part of it. And since it is a decentralized system, all of it is cloned when you clone the repository.
AFAIK the only limitation (I haven't really tried it myself since I only use it solo) is that for security cloned users aren't "fully" cloned, so you'd need to make new users in the cloned repository - you can use the same username, though (but in commits, history, etc. it'll appear as different users). It'd be useful if "user" and "identity" could be distinguished so that the tool could know that two usernames really refer to the same person.
Also Fossil works pretty much everywhere - a local server, as a FastCGI server, as a plain old CGI "script", you can even have it as a "CGI interpreter" for ".fossil" files (the repository files - the entire repository is stored in a single SQLite database file) which makes it usable with many shared web hosting services without needing root or even shell access. In a way that is the most decentralized you can go :-P.
> Fossil is a distributed version control and in fact it is "more" distributed than git if you consider that people tend to tie it with centralized services like GitHub to get more than just the VCS part.
The GitHub blurb makes no sense. Even if N developers standardize their workflow on using a couple of remote repos to exchange work, that does not make the underlying system less distributed/more centralized.
> A Fossil repository contains not only the versioned files, but also a wiki, tickets/bugtracker, forum, chat room, blog/technotes - even the theme is part of it.
That sounds like a major design faux pas. It makes zero sense to tie a chat room/blog/e-mail client/alarm clock to a source code repository.
> The GitHub blurb makes no sense. [..] It makes zero sense to tie a chat room/blog/e-mail client/alarm clock to a source code repository
I think you need to reconsider your senses if you can't see the tie between the two :-P. The "more distributed" part was exactly because of Fossil providing that additional functionality GitHub provides in a decentralized form.
Extremely well written and maintained and high quality as of now and having a plan to make sure that can continue in the future are sometimes entirely different things with needs that oppose each other.
A single person can develop and release extremely high quality software, and as long as it meets the needs of the users (it's not missing a lot of features that are taking a long time to deliver), a single person in absolute control and writing all the code is probably a benefit in keeping it high quality and with less bugs.
It may not follow that the same can be said a few years from now, or even a few months from now, since the bus factor of that project is one, and if "bus events" includes "I don't want to work on that anymore but nobody else knows it well at all" then for some users that's a problem (and for others not so much).
One situation isn't necessarily better or worse than the other (and it's probably something in-between anyway); it really just depends on the project and the audience it's intended for. That audience might be somewhat self-selected by the style of development, though.
> Extremely well written and maintained and high quality as of now and having a plan to make sure that can continue in the future are sometimes entirely different things with needs that oppose each other.
I think in this area SQLite has most other open source software beat. SQLite is used in the Airbus A350 and has support contracts for the life of the airframe.
Fair points, although the bus factor is more like 3 or 4 for SQLite as far as I know. The question, though, is what the impact would be if the entire team vanished from the face of the earth. My guess is that either SQLite would be good enough as-is for 99% of use cases and wouldn't need much development apart from maybe some minor platform-specific work, or, if new functionality truly is needed, it would be better for a new team to rewrite a similar program from scratch using SQLite as more of a POC than as a blueprint.
SQLite is supported until 2050, and will likely outlast many other platforms if this goal is attained.
I hope the bus factor is high enough to reach the goal.
"Every machine-code branch instruction is tested in both directions. Multiple times. On multiple platforms and with multiple compilers. This helps make the code robust for future migrations. The intense testing also means that new developers can make experimental enhancements to SQLite and, assuming legacy tests all pass, be reasonably sure that the enhancement does not break legacy."
I was speaking less to the SQLite situation specifically and more to the general idea of "Could it be that this is precisely because they don't use "modern tools" and accept outside contributions?" and how I think teams that are very small and not very accepting of outside help/influence might affect that.
To that end I purposefully compared extremes, and tried to allude to the fact that most situations fall between those extremes in some way. SQLite is more towards one end than the other, but it's obviously not a single developer releasing binaries to the world, which is about as far to that extreme as you can go. The other end would probably be something like Debian.
That's not to say either of those situations have to be horrible at what the other excels at. That singular person could have set things in place such that all their code and the history of it gets released on their death, and Debian obviously has a working process for releasing a high quality distribution.
AFAIUI the company behind SQLite (Hipp & co.) have basically endless funding. Not unlimited, just a good enough budget and not likely to end soon. That's also a big factor.
> It may not follow that the same can be said a few years from now, or even a few months from now, since the bus factor of that project is one, and if "bus events" includes "I don't want to work on that anymore but nobody else knows it well at all" then for some users that's a problem (and for others not so much).
You may have been speaking generally, and you'd be right, but specifically the bus factor of the SQLite team and the SQLite Consortium is larger than 1, and they could hire more team members if need be.
If and when the SQLite team is no longer able to keep the project moving forwards, then I think we'd see one or more forks or rewrites or competitors take over SQLite's market share.
Yes, I was speaking generally. Specific development models have advantages and disadvantages, but those can often be countered by non-development-model actions taken to limit those disadvantages. For example, an extremely open development model is likely prone to more bugs and quality problems, as well as a harder-to-read and harder-to-work-in code base. There are steps to combat that, such as style guides and automatic style converters, numerous reviewers that can go through code to find bugs and make suggestions for better quality, etc.
It's not so much that one model over the other will have those problems I've mentioned for each, as much as I think those are common things those projects should be cognizant of and take steps to combat.
As you noted elsewhere, it sounds like SQLite has done a lot that mitigates what I see as the inherent disadvantages of their development model, which is laudable. At the same time, I doubt the average SQLite developer is as easily and quickly replaced as the average Linux kernel contributor, even if there are specific kernel developers whose loss would be hard to absorb. Sometimes all you can do is mitigate the harm of a problem, not remove it entirely.
I dunno, I've seen new team members jump onto projects and thrive where most other members have a decade or two of experience. Once you've been around the block dealing with large and complex codebases, picking up a new one gets easier, so I'm not at all worried about the SQLite bus factor. I agree that much larger projects like Linux can have much larger communities to draw leadership from, but I think SQLite is fine.
Once upon a time I wanted to make a contribution to SQLite, and I tried to negotiate making it, but it was quite an uphill battle. On the other hand, I found the codebase itself quite approachable.
> Extremely well written and maintained and high quality as of now and having a plan to make sure that can continue in the future are sometimes entirely different things with needs that oppose each other.
And so does the Linux kernel. There are numerous cases of successes and failures at both ends of that spectrum.
My point wasn't to imply that you can only pick one, but that in some cases choices to maximize one aspect can negatively affect the other if care is not taken, and depending on audience high quality released software is not the only thing under consideration in open source projects. Keeping the developer group small and being extremely selective of what outside code or ideas are allowed might bring benefits in quality, but if not carefully considered could yield long term problems that ultimately harm a project.
> > > Extremely well written and maintained and high quality as of now and having a plan to make sure that can continue in the future are sometimes entirely different things with needs that oppose each other.
> [...], but if not carefully considered could yield long term problems that ultimately harm a project.
SQLite has a successful consortium and a successful business model, namely: leveraging a proprietary test suite to keep a monopoly on SQLite development that then drives consortium membership and, therefore, consortium success, which then funds the SQLite team.
This has worked for a long time now. It will not work forever. It should work for at least the foreseeable future. If it fails it will be either because a fork somehow overcomes the lack of access to the SQLite proprietary test suite, or because a competitor written in a better programming language arises and quickly gains momentum and usage, and/or because the SQLite Consortium members abandon the project.
Very good points. The proprietary test suite is clearly the (open) secret to SQLite's success. It seems to me that it isn't even entirely accurate to describe SQLite as written in C when the vast majority of its code is probably written in TCL that none of us have seen. It's more like C is just how they represent the virtual machine which is described and specified by its tests. The virtual machine exists outside of any particular programming language but C is the most convenient implementation language to meet their cross platform distribution goals.
If someone did want to carve into SQLite's embedded db monopoly, it would take years to develop a comparable test suite. This seems possible, particularly if they develop a more expressive language for expressing the solutions to the types of problems that we use SQLite for. Who would fund this work though when SQLite works as well as it does?
Ultimately, the long term harm I was thinking of (for the most part) was lots of proprietary knowledge being lost as a developer is lost for one reason or another, and a resulting loss in quality and/or momentum in the project as that developer may represent a large percentage of project development capacity.
That a large chunk of this knowledge appears to have been offloaded into a test suite is good, and does a lot to combat this, but obviously nothing is quite as good as experience and skill and knowledge about the specifics of the code in question.
As a theoretical situation, how much more likely is a fork to eventually succeed if one or more of the core SQLite developers is no longer available to contribute to SQLite? There are a lot of variables that go into that, but I would feel comfortable saying "more likely than if those developers were still present". That idea encapsulates some of the harm I was thinking of.
Institutional knowledge, and leadership, is indeed critical. The knowledge of a codebase can be re-bootstrapped, and its future can be re-conceived, but actually providing leadership is another story. I think there's one person on the SQLite team besides D. R. Hipp who can provide that leadership, but I'm not sure about business leadership, though who am I to speculate, when I don't really know any of them. All I can say is that from outside looking in, SQLite looks pretty solid, and libSQL seems unfunded.
Contrary to what the article implies, they don't use outdated project tooling(). Their VCS isn't outdated; in some ways it's more modern than git. It's just focused on dev flows similar to theirs, to the point where typical git dev flows won't work well with it. Similarly, not using GitHub is the right decision for how the project is managed; GitHub is too focused on open-contribution projects.
(): You could argue they use "not modern" tools like C while doing modern things like fuzz testing. But the article's author clearly puts the focus on project tooling, highlighting git/GitHub, so reading the article as implying that their VCS is "not modern", i.e. outdated, i.e. bad, seems very reasonable IMHO.
> Extremely well written and maintained and high quality as of now and having a plan to make sure that can continue in the future are sometimes entirely different things with needs that oppose each other.
Software that's truly "extremely well written and maintained and high quality as of now" has the option of a plan like:
> "At the time of my death, it is my intention that the then-current versions of TEX and METAFONT be forever left unchanged, except that the final version numbers to be reported in the “banner” lines of the programs should become [pi and e] respectively. From that moment on, all “bugs” will be permanent “features.” (http://www.ntg.nl/maps/05/34.pdf)
If your software needs perpetual maintenance, that's a good sign that it's probably not that high quality.
> If your software needs perpetual maintenance, that's a good sign that it's probably not that high quality
The problem in a lot of cases is not the software per se, but changing environments. Windows upholds backwards compatibility to a ridiculous degree (you can still run a lot of Win95 era games or business software on Win10), macOS tends to do major overhauls of subsystems every five-ish years that require sometimes substantial changes (e.g. they completely killed off 32-bit app support), but the Linux space is hell.
Anything that needs special kernel modules has no guarantees at all unless the driver is mainlined into the official kernel tree (which can be ridiculously hard to achieve). Userspace is bleh (if you're willing to stick to CLI and statically linking everything sans libc) to horrible (for anything involving GUI or heaven forbid games, or when linking to other libraries dynamically).
The worst of all offenders however is the entire NodeJS environment. The words "backwards compatibility" simply do not exist in that world, so if you want even a chance at keeping up with security updates you have an awful lot of churn work simply because stuff breaks left and right at each "npm update".
You say nothing seriously false, but perhaps depending on things like NodeJS just inherently means your software is going to be poor quality. If that is true, then both you and the PP are probably right. I tend to think software quality will be higher if you depend on a third-party collection (such as a so-called Linux distribution) than on a second-party aggregation (such as NPM).
Even the distributions have a hard time with the NodeJS environment and its relentless pace - and the more software gets written in JS, the worse. When e.g. software A depends on library x@1.0 and software B on library x@1.1, and x has made breaking changes, what should a distribution do?
Hard forks in the package name (e.g. libnodejs-x-1.0 and libnodejs-x-1.1) are one option, but blow up the repository package count and introduce maintenance liability for the 1.0 version. Manually patching A to adapt to the changes in X works, but is a hell of a lot of work and not always possible (e.g. with too radical changes), not to mention if the work should be upstreamed, then licensing issues or code quality crop up easily which means yet more work. Dropping either A or B also works, but users will complain. And finally, vendoring in dependencies works also, but wastes a lot of disk space and risks security issues going unpatched.
And that's just for final software packages. Dependency trees of six or ten levels deep and final counts in the five digits are no rarity in an average NodeJS application.
Importing even the bare minimum introduces an awful lot of work and responsibility to distributions.
When NetBSD imported sqlite and sqlite3 into their base system that was a signal to me that SQLite is no-nonsense and reliable. That was many years ago, around 2011 I think. Not sure why SQLite is getting all the attention on HN lately. Usually more attention means more pressure to adopt so-called "modern" practices and other BS.
SQLite is interesting to me because, like djb's software, its author is not interested in copyrights.^1
The author disclaims copyright to this source code. In place of a legal notice, here is a blessing:
May you do good and not evil.
May you find forgiveness for yourself and forgive others.
May you share freely, never taking more than you give.
Apparently this is not enough to convince some folks that they can use the code (maybe they really are doing evil), and so there is also a strange set of "reassurances" on the website:
2. I seem to recall an open source OS project or two making a fuss about djb's software being public ___domain but perhaps I am remembering incorrectly.
This part also rang some alarm bells for me. It makes me think the author is unable to see outside his bubble, and that feeling is only reinforced by the comments about Rust and the CoC in the Readme.
I'm all for minimizing friction for contributors, but when I read things like "The few core developers they have do not work with modern tools like git and collaboration tools like Github", I wonder if the collaboration of someone who refuses to send a patch to a mailing list (because it is not what they are used to and they don't care to learn how) is really worth considering. I mean: someone who is not willing to move a few millimeters out of their comfort zone to make a contribution is, very likely, someone who has very little commitment or will try to force their opinions and methods onto others.
The irony is that SQLite uses Fossil, which is more modern than git.
But really, I agree. The elephant in the room is that any time someone uses the term "old" or "not modern enough" or "legacy", it means they have a system they don't understand that they want to get rid of. Software does not "wear out".
It does but you have to go through the maintainers and they have to be in line with the core principles of SQLite and have the necessary code quality etc.
I.e. it's hard to the point that you can just say it's impossible for most people.
But what the author of the article fails to mention is that many of the things libsql wants to add to sqlite are in direct conflict with the core principles of sqlite.
E.g. SQLite: max portability by depending on an _extremely_ small set of C standard library functions. libSQL: let's add io_uring, Linux-specific functionality more complex than all the C standard functions SQLite depends on combined.
E.g. SQLite: strongly focused on simplicity and avoidance of race conditions, achieved with serialized & snapshot isolation and no fork-the-world semantics (i.e. a globally exclusive write lock). libSQL: let's make it distributed (which is fundamentally in conflict with that transaction model, if you want it to work well).
E.g. SQLite: small, compact code base. libSQL: let's include a WASM runtime (which is also in conflict with max portability and with simplicity/project focus).
> It does but you have to go through the maintainers and they have to be in line with the core principles of SQLite and have the necessary code quality etc.
Even so, I think they’ll prefer to rewrite the contribution. They need to be absolutely sure not to incorporate any copyright encumbered code by mistake.
> The funny thing about this comment is that SQLite is as close to the gold standard of software quality that we have in the open source world. SQLite is the only program that I've ever used that reliably gets better with every release and never regresses. Could it be that this is precisely because they don't use "modern tools" and accept outside contributions?
Reminds me of OpenBSD, who still primarily uses CVS for source control.
But SQLite IS using modern tools; you could say their VCS is more modern than git. It is just not compatible with a lot of git workflows, because it is focused on a workflow that doesn't map well onto git.
It is also following modern best practices like:
- use the best tool for the job (i.e. not git or GitHub)
- consider upfront how the project can be maintained long-term (i.e. realize that you don't have the resources to manage/moderate a public issue tracker/PRs, and that you don't want to delegate this work to 3rd parties you barely know)
- keep things simple, e.g. a globally exclusive write lock and serialized isolation level (instead of subtle race conditions and/or fork-the-world handling etc.)
- test a lot, use fuzzing etc.
- limit features to your targeted use-cases to keep complexity in check (maintainability, bug avoidance)
- opinionated code style, formatting
- clear cut well defined dev/contribution flow (for the few which can contribute directly)
I.e. if we ignore superficial "modern best practices" like "use exactly this tool", I don't know which modern best practice it does not fulfill(*). Though some are not fulfilled in the way people are used to.
(*): OK, maybe they don't keep to "prefer languages with more guardrails as far as possible". Though due to their targeted compatibility/portability, C is kinda the only option.
This makes it sound like SQLite isn't using source control at all. What they are actually doing is using a more obscure source control program than git. Honestly, who cares? Source control is source control.
You're conflating two different arguments: not using modern tools, and not accepting outside contributions. It's certainly possible that limiting contributions to a set of trusted contributors helps things move smoothly.
However it's not clear at all that using old tools has the same effect.
Also, Fossil is essentially implemented _in_ SQLite. Fossil is used to develop SQLite which is used to implement Fossil. It's a virtuous cycle. For the SQLite project, using Fossil is obviously superior to git. This doesn't mean that arbitrary projects should use Fossil over git.
The opposite can be said too, though: git won somewhat arbitrarily over a bunch of other DVCS systems (e.g. hg), mainly because of bandwagon effects and marketing.
I have some of my great-grandpa's carpentry tools, and I use them often. I guess I should go out and replace perfectly good tools with new stuff from a big store like Home Depot or Lowe's?
People forget that Stanley #4 plane hasn't really changed in over 100 years. It's still one of the best tools out there.
We dumped CVS because it was a poor tool, for the time. Subversion was better. Then completely distributed systems became better, because connectivity and computational power came about.
If you want to say something, say it. "I don't like that my contributions aren't accepted, so I'm forking the codebase". That's going to generate some discussion, but it's fine. Public ___domain and all.
However,
"look what happened to qemu, same thing will happen to sqlite (and I'm contributing to the problem and forking it)". One can't say "no contributions led to fragmentation" while _at the same time_ contributing to fragmentation by making a hard fork!
Also,
> "However, edge computing also means that your code will be running in many geographical locations, as close as possible to the user for the better possible latency. Which means that the data that hits a single SQLite instance now needs to be replicated to all the others."
Then replicate away. Leave that stuff out of SQLite. If that's a really important use-case, go use couchdb or something similar.
They are annoyed by people misrepresenting facts in a subtly manipulative way to make their fork, and the reasons around it, look like something it isn't.
What the article intentionally or unintentionally conveys in its tone and wording is: "SQLite is badly outdated beyond saving and needs to be replaced".
What is actually the case: "we want something like SQLite but with some core changes incompatible with SQLite's core principles, which should be API-compatible enough to allow reusing a lot of DB tooling".
There is something distasteful about this announcement but I can’t quite pinpoint it. Maybe it feels like a bait and switch announcing their own fork after the whole qemu commentary, or the wording about the code of conduct. I don’t know.
One of the greatest things about SQLite is how easy it is to embed in random targets/build systems/languages: a .c and a .h and you are all set. Moving away from that model will turn many people away, so I hope they retain it.
Trust your gut. It's a public shaming campaign disguised as a history lesson. Glauber Costa is trying to seem welcoming in one breath and, in another breath, is sniping at the same people he claims to want to join.
> We [Glauber Costa] take our code of conduct seriously, and unlike SQLite, we do not substitute it with an unclear alternative. We strive to foster a community that values diversity, equity, and inclusion. We encourage others to speak up if they feel uncomfortable.
With zero new code to justify this fork, this article is little more than a silly power trip by a flailing startup.
They do have clear goals of what to add, and their goals make some sense for some use-cases. But pretty much all of them directly conflict with at least one, and sometimes multiple, SQLite core principles...
What the author wants, as far as I can tell, is an embedded distributed (edge) database with some C-API and SQL compatibility with SQLite, so that you can use it with existing tooling (e.g. by linking against it instead of sqlite), but not necessarily as a drop-in replacement (different transaction semantics), and which doesn't need the same degree of portability as SQLite.
Though that is not quite what the author ends up communicating in the article. I wonder to what degree the formulations are intentionally manipulative versus accidentally badly worded.
Regardless of intention, the inability of the author to adequately explain why a fork was necessary, apart from some hand wavy arguments that some other solutions were inadequate, does not inspire confidence. I find it very hard to be generous with the author given the way that he paints himself as a visionary while casually dismissing the work and perspective of the people who actually did the real visionary work of seeding the technologies that he has hitched his wagon to.
The README has "Use Rust for new features" in it, so I doubt it will retain the same simplicity.
As much as I like Rust, and despite mixing Rust and C++ being the clear path forward for Mozilla, I'm not so sure it's the winning approach here. Part of the beauty of SQLite is the single .c/.h thing.
That said, I can see that maybe they're trying to expand the use cases of libsql compared to SQLite. That seems to be the whole idea with adding support for e.g. distributed databases, which is something SQLite just doesn't bother with at all, and would introduce a ton of external interfacing regarding networking. SQLite uses only the standard C library, and even then barely scratches its surface.
Also, SQLite's VFS API can do a lot there already. For example, I remember seeing SQLite compiled to WASM, using a VFS that downloads a remote database on S3 using HTTP Range requests. (I don't think it supported writing to the database, but it was still a really cool way of allowing complex client-side querying of a dataset that is static but too big to transfer).
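The core of that range-request trick is that a read-only VFS only ever needs "give me N bytes at offset O", and SQLite's file format has a fixed layout that makes such reads meaningful. A minimal sketch (local file reads standing in for hypothetical HTTP Range requests; not any specific project's code) that fetches and parses the 100-byte database header this way:

```python
import os
import sqlite3
import struct
import tempfile

# Create a small on-disk database to stand in for the remote file on S3.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
con = sqlite3.connect(path)
con.execute("CREATE TABLE t(x)")
con.commit()
con.close()

def read_range(path, start, length):
    # Stand-in for an HTTP Range request, i.e.
    # "Range: bytes=start-(start+length-1)" against a remote object.
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(length)

# The database header is the first 100 bytes; the page size is a
# big-endian u16 at offset 16 (the special value 1 means 65536).
header = read_range(path, 0, 100)
assert header[:16] == b"SQLite format 3\x00"
page_size = struct.unpack(">H", header[16:18])[0]
print(page_size)
```

Once the page size is known, a VFS can translate any page access into one such ranged read, which is why the static-database-over-HTTP approach works at all.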
> Part of the beauty of SQLite is the single .c/.h thing.
Noting that it's not _developed_ that way. Its many files are combined by the build process to produce the amalgamation build. In a mixed-language project (Rust for some parts, C for others) a single-file distribution literally won't be possible, and it will require as many toolchains to build as there are languages involved.
I wonder if mrustc [0] would be sufficient to retain the amalgamated build even if Rust were adopted. The regular Rust tool chain would be needed for development still, but if simply depending on the library the Rust components could be transpiled to C…
> Also, SQLite's VFS API can do a lot there already. For example, I remember seeing SQLite compiled to WASM, using a VFS that downloads a remote database on S3 using HTTP Range requests. (I don't think it supported writing to the database, but it was still a really cool way of allowing complex client-side querying of a dataset that is static but too big to transfer).
It also feels gross to me, but the only thing I can put my finger on is citing webshit as the reason to completely change the direction of a well-loved project.
Yup, but more likely their changes will make it unusable for a bunch of SQLite use-cases. So if it catches on, it will probably exist in parallel to SQLite, and once it's realized that it won't replace SQLite, it should increasingly diverge by focusing on its core target audience.
> There is something distasteful about this announcement but I can’t quite pinpoint it.
I think it's a general lack of gratefulness. The author considers sqlite as something to be taken without even saying thank you to drh. All he has to say is, that's a nice project you have there, it would be shame if something happened to it. Like to qemu.
> The author considers sqlite as something to be taken without even saying thank you to drh.
Well, it is in the public ___domain. Respecting and thanking people is of course a very nice thing to do but the fact is there are zero restrictions on what can be done with this software.
> that's a nice project you have there, it would be shame if something happened to it
That's always possible in all free and open source software development. Anyone can show up and just start working harder than whoever's currently in charge. Corporations can show up with paid full time developers and completely displace a project's leadership. Eventually the fork will accumulate so many improvements it will become the de facto upstream.
It remains to be seen if this is what will happen to SQLite and this new libSQL. Who knows, right? I don't expect SQLite to go anywhere though.
Just a note that LiteFS isn't a distributed filesystem; despite the name, it's not a filesystem at all, but rather just a filesystem proxy. It does essentially the same thing that the VFS layer in SQLite itself does, but it does it with a FUSE filesystem, so you don't have to change SQLite's configuration in your application. As for the "distributed" part of it: LiteFS has a single-writer multi-reader architecture; it's the same strategy you'd use to scale out Postgres.
It's a little ironic to see LiteFS brought up in relation to a SQLite fork, since the premise of LiteFS is not changing the SQLite code you link into your application. Much of the work in LiteFS is about cooperating with standard SQLite.
At any rate: it seems somewhat unlikely that a hard fork of SQLite is going to succeed, in that part of the reason so many teams are building SQLite tooling is the trust they have in the current SQLite team.
>LiteFS has a single-writer multi-reader architecture; it's the same strategy you'd use to scale out Postgres
I'm curious how such a strategy deals with applications that update a value and then read the updated value back to the user, since there might be a replication delay between the write (which goes to the primary) and the read (which comes from the closest replica). Do you make optimistic updates on the client, or do you bundle (write-then-read) operations into a transaction?
LiteFS author here. It depends on the application. You can check the replication position on the replicas and simply wait until the replica catches up. That requires maintaining that position on the client though.
An easier approach that works for a lot of apps is to simply have the client read from the primary for a period of time (e.g. 5 seconds) after they issue a write. That's easy to implement with a cookie and it's typically "good enough" consistency for many read-heavy applications.
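That cookie scheme reduces to a tiny routing decision. A minimal sketch (the names and the 5-second window are illustrative assumptions, not LiteFS API; the timestamp would come from a cookie set when the client last wrote):

```python
PRIMARY, REPLICA = "primary", "replica"
STICKY_SECONDS = 5.0  # hypothetical stickiness window after a write

def choose_backend(now, last_write_ts):
    """Route reads to the primary for a short window after a write.

    last_write_ts is the client's last write time (from a cookie);
    None means the client has not written recently.
    """
    if last_write_ts is not None and now - last_write_ts < STICKY_SECONDS:
        return PRIMARY  # replica may not have caught up yet
    return REPLICA      # stale-read risk has passed; use the nearby replica

t0 = 1000.0
assert choose_backend(t0 + 1.0, t0) == PRIMARY   # just wrote: read your own write
assert choose_backend(t0 + 10.0, t0) == REPLICA  # window expired
assert choose_backend(t0, None) == REPLICA       # never wrote: replica is fine
```

The design tradeoff is that this gives probabilistic rather than guaranteed read-your-writes consistency: it assumes replication lag stays under the window, which is "good enough" for many read-heavy applications.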
This is an application (and frequently data type) dependent decision. Some data is safe to return on acceptance others need to wait for acknowledged writes.
The most naive solution is to just make all writes slow but acknowledged.
Has sqlite been forked before? This is the first true fork and re-license attempt that's caught my eye. The others I've seen are "ports" and "modifications", but always pointing people back upstream.
It's possible that the appetite for a SQLite fork is there, but nobody has provided it.
I do remember the author of LMDB [0] porting some parts of sqlite to use lmdb instead, and then talking about the results. A quick googling doesn't seem to give me a result though.
rqlite[1] author here. I wouldn't consider rqlite a fork in any sense, just so we're clear. That rqlite uses plain vanilla SQLite is one of its key advantages IMHO. Users have no concerns they're not running real SQLite source.
That said, there are some things that would be much easier to do with some changes to the SQLite source. But I think the message that rqlite sits on top of pure SQLite makes is still the right choice.
> LiteFS has a single-writer multi-reader architecture;
Note that SQLite transactions are also fundamentally single-writer, multiple-reader (serialized isolation level with snapshot isolation and no fork-the-world handling).
So if you want to make SQLite distributed, you will end up with not just a single writable replica but a global write log at the transaction level, which is very, very bad for performance in many use cases. (E.g. in PostgreSQL, if you use serializable & snapshot transaction isolation, it's "forking the world" for parallel write transactions, and if multiple transactions have no overlap, they can complete in parallel without a problem (in parallel, but on the same single write-enabled replica). This is good enough for quite a few applications and can still have decent throughput.)
As far as I can tell, many use-cases which could profit from a distributed SQLite would not really like a per-transaction global lock, but would be okay with something like what Postgres does. Though there are always some exceptions.
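The single-writer behavior is easy to observe from Python's sqlite3 module. A sketch: while one connection holds a write transaction, a second would-be writer is refused (timeout=0 makes it fail immediately with SQLITE_BUSY instead of waiting):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

w1 = sqlite3.connect(path, timeout=0)  # timeout=0: fail instead of retrying
w2 = sqlite3.connect(path, timeout=0)
w1.execute("CREATE TABLE t(x)")
w1.commit()

# First writer takes the database-wide write lock.
w1.execute("BEGIN IMMEDIATE")
w1.execute("INSERT INTO t VALUES (1)")

# A second concurrent writer is refused: one writer at a time, globally.
try:
    w2.execute("BEGIN IMMEDIATE")
    blocked = False
except sqlite3.OperationalError:  # "database is locked" (SQLITE_BUSY)
    blocked = True
print(blocked)  # True

w1.commit()
w2.execute("BEGIN IMMEDIATE")  # after the commit, the lock is free again
w2.rollback()
```

Readers are unaffected while the write transaction is open (especially in WAL mode); it's only the second writer that has to wait, which is exactly the property a distributed version would have to relax or globalize.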
SQLite only works as a concept because it is not networked. Nobody truly understands the vast and unsolvable problem that is random shit going wrong within the communication of an application over vast distances. SQLite works great because it rejects the dogma that having one piece of software deal with all of that shit is in any way a good idea.
Back your dinky microservice with SQLite, run multiple copies, have them talk to each other and fumble about trying to get consensus over the data they contain in a very loose way. That will be much, much less difficult than managing a distributed decentralized database (I speak from experience). It's good enough for 90% of cases.
Remember P2P applications? That was basically the same thing. A single process running on thousands of computers with their own independent storage, shuffling around information about other nodes and advertising searches until two nodes "found each other" and shared their data (aw, love at first byte!). It's not great, but it works, and is a lot less trouble than a real distributed database.
Amen. The first rule of distributed objects is "don't distribute your objects". It's much easier to reason about a bunch of different actors, each with their own copies of objects, trying to converge on consensus, than it is to actually have a distributed database that obeys ACID and does all the databasey things.
A lot of words: a manifesto decrying SQLite as somehow stifling innovation, and a battle-cry for all to join... but backed with very little code.
SQLite is public ___domain. Anyone is free to claim it as their work and do anything they wish.
No need to disparage. You never were prevented from doing what you wanted to do. Take the code, improve it and if it’s any good it may get recognition.
The fact is sqlite is good enough for 99.9% case. Everything else is fighting for the niche and the poster seems to be simply pissed that sqlite has the mindshare even without having their favorite features.
> We are strong believers in open source that is also open to community contributions. If and when SQLite changes its policy to accept contributions, we will gladly merge our work back into the core product and continue in that space.
If libSQL is going Apache-2.0¹ rather than public ___domain, that seems extremely unlikely. The public ___domain nature of SQLite is rather important for its deployments. And in fact licensing is a rather important part of why SQLite is closed to contributions (though with some administrative overhead it could be opened to contributions from people from some countries). The fact that this announcement and project documentation seems to make absolutely no mention of the licensing situation perplexes me.
—⁂—
¹ As they state in the text; but https://github.com/libsql/libsql/commit/f7c54b8f792aa502f025... is entirely insufficient, not constituting application of the license. I find this a bad sign, in stark contrast to the meticulous care SQLite has taken to copyright matters.
That is VERY misleading, and intentionally so; it doesn't put the author in a good light.
Sure, SQLite isn't open for contribution, but their tooling (VCS, issue tracking) is in no way "less modern" than git+GitHub; it is just focused on a different approach.
Putting a widely used open-source but not open-contribution project on GitHub is a nightmare. Similarly, if you have no open contribution and only a small trusted team, you can use change flows not viable otherwise, and their VCS is designed for such flows, for which git has quite a bit of unnecessary overhead and footguns.
Lastly, for many projects, choosing no open contribution is a very sane approach. Properly maintaining contributions for a widely used project is a lot of work, basically forcing you to delegate a lot of it to people you hardly know. If you don't have time for this, or don't feel good trusting people you hardly know, then open contribution can easily become a nightmare. The fact that, AFAIK, you have to be very disciplined to write C doesn't make this better.
Putting this aside, another thing I found off-putting in the article is that the author fails to mention that basically all of the things he wants to have in SQLite are in direct conflict with core design principles of SQLite. So they need a fork anyway, independent of whether or not SQLite accepts contributions; their changes wouldn't be accepted either way...
Though without question, for their use cases a new database is needed which is similar to SQLite but also, in subtle yet very fundamental ways, very different. Starting it with a fork doesn't seem a bad idea. But they really should not say or imply it's SQLite. It's not SQLite; it's just very similar and was once forked off SQLite.
By all means, fork the code base for your specific use case, but I'll trust drh and his team over some rando any day.
sqlite isn't the most widely used database by accident - it's installed on practically every mobile phone on earth. It's the result of careful and deliberate design with millions of tests and fuzzing. The sqlite team is very responsive to its users' needs, but some features just don't make the cut. That's a good thing - to keep the library small, fast and reliable.
The headline is needlessly deceiving. The author is making the case for, and subsequently announcing, a hard fork of SQLite for the purposes of meeting the needs of edge computing.
That's the way it looks to me. All of the SQLite code is already given away for free. What more is there to give? The trademark (owned by Hwaci according to the SQLite Consortium Agreement.) He wants to use politically charged shame tactics to get his hands on not just the SQLite code but the SQLite brand as well.
Why does he want the trademark? Because he knows that his complaints are fringe and few people will give a damn about his fork, unless he has the trademark and can call his fork the official SQLite.
Your manifesto and accompanying blog post lay out your desire for your fork to be the official SQLite. That is contrary to the inclinations of the SQLite developers, who don't accept outside contributions, so you are trying to coerce a change to their policy using public shaming and call-out tactics.
So you claim you don't want the trademark signed over to you, but you do want access to the trademark opened up to your fork.
Eh, u/glommer is calling it libSQL, not anything with "SQLite" in the name, so I wouldn't assume that u/glommer wants the trademark. But u/glommer does want some of SQLite's reputation to attach to libSQL -- that's going to be an uphill battle.
> u/glommer is calling it libSQL, not anything with "SQLite" in the name
The blog post makes clear that being a fork is "a losing proposition in the long term." Because SQLite developers own the trademark to SQLite, the only way for this fork to not be a fork is if SQLite gives in to the demands of the manifesto.
I don't think it's because of the trademark that forking it is a losing proposition. It's because of:
- the proprietary test suite
- the SQLite team's reputation
- SQLite's reputation (which relates to the test suite)
- the SQLite Consortium, which funds the SQLite team
- funding
The trademark is the least of these. I don't find TFA at all compelling, and I find it distasteful, but I wouldn't impute on the author(s) that they want the SQLite trademark, especially given that they didn't even bring it up.
If it were really about the proprietary test suite and fuzzer, then the manifesto should simply demand that SQLite developers release those proprietary tools. Instead they're demanding that SQLite accept a change of contributor policy and correspondingly change their code of conduct. They're wrestling for control of the brand, not merely some tools.
And yeah maybe they're after the money the SQLite Consortium gets too. But I think that's downstream from SQLite's brand recognition.
That is a fair take, and now I agree. I suspect that u/glommer et al. hadn't understood the importance of that test suite, else they would have mentioned it. To me, libSQL seems like pie in the sky. They're making demands and threatening a fork, but without any evidence of sufficient resources, the threats are empty and the demands will go unmet.
Things like rqlite are great and all, but I'm not sure if that's something that should be included in SQLite no matter which development model it uses.
Looking at the libSQL page, it seems they want SQLite to go in quite a different direction in general. That's all fine, but at that point you're working on other problems than what SQLite is intended to solve. People want software to solve every possible problem under the sun, but SQLite is a project with a fairly specific and narrow scope, which is intentional.
The "trick" is to enable things to be extended and patched easily, if need be, so people can build extensions and derivate projects if they want. I don't know to what degree SQLite allows this – I'm not familiar with the SQLite source code.
At any rate, for this to succeed you need at least one person actually writing code for it. Thus far, all I see in the repo are some non-code changes (added license, CoC, Makefile tweaks, etc.). I don't know what the exact plans are, but you need more than a GitHub repo saying "we accept contributions" while waiting for people to submit them, because most people/companies won't. Almost all projects live or die by their core contributors, not the community.
There are a bunch of others too. And yeah, I agree with you that if things could be done via extensions rather than forking SQLite... that would be better. :)
SQLite's maintainers have a choice. They can spend their time coding, or they can spend their time explaining, over and over again, why they don't want to do a distributed database.
They have changed their minds on features before. E.g., FULL OUTER JOIN. They should and almost certainly do feel free to change their minds on features relating to distributed DBs.
This smells like "I don't like how SQLite is run because I can't get into the party, so I'm making my own party", which is fine, but don't denigrate the product or the contributors who have been doing an amazing job for a very long time.
Just looking through the libsql repository on GitHub, turns out they've changed literally zero actual code. All they've done is added a README, some long-winded 'code of conduct' document, and a configuration file for CI builds. That's it.
What is even the point of announcing this? There's nothing new, nothing different. Just empty promises and hot air.
yes, we haven't changed any code yet. There are essentially two approaches here: one of them is to present a finished thing, and the other is to announce your intentions as soon as possible and build every single thing in public.
We chose the latter in this case, as we think community is the most important aspect of this.
but you are right to be suspicious! recommend checking back later
No code changes and inflammatory/defamatory blog posts aren’t a good way to build a community. If you had shown some code that solves a real problem SQLite has, then people would take you seriously. But if all your “contributions” consist of ideas that 99% of SQLite users won’t need, and a few text files, your project doesn’t look too serious to me.
SQLite is probably not compatible with community contributions. They seem to be very optimized to the stability and reliability of SQLite rather than features. Which is why people use it over other solutions.
> The few core developers they have do not work with modern tools like git and collaboration tools like Github, and don’t accept contributions, although they may or may not accept your suggestion for a new feature request.
Git is not modern. It is many executables, and Git repos are many files.
GitHub is not modern. It is a huge, closed-source RoR application.
Fossil SCM has the features of Git and GitHub, much improved, in a small, portable, fast executable, storing repos in much less space in a single SQLite file.
The idea that "virtualization" began with Xen in 2004 is rather difficult to read as an early VMware employee. Before QEMU independently discovered it, VMware was JIT'ing unrestricted x86 to a safe x86 subset from 1999 on[1]. Hardware support for trap-and-emulate virtualization came to the market in the early 'aughts, after VMware had proven the market demand for it.
Whether or not this succeeds, I think it's a great effort. SQLite not taking any outside contributions is of course their prerogative. But it would also be cool to see what could happen with a more open development model. And their (libsql) plans around io_uring and Rust for future code both sound like a good start.
The way they're going about this fork (described in the repo [0] readme) seems healthy enough for both projects as well.
Maybe the biggest challenge though is recreating SQLite's private/proprietary test suite [1].
SQLite's proprietary test suite is its secret sauce that has kept it closed to contributions and kept forks from happening. It is the thing that made the SQLite Consortium a going business proposition. It is the thing that makes it possible to fund an open source infrastructure project with a small, cohesive team.
The public believes that that test suite exists and has 100% branch coverage, and that it is much larger and more complete than the public test suite. Of course, there's no proof that the private test suite exists, but we the public believe it does, and we have plenty of reason to believe that it does.
Any fork will instantly lose the benefits of that private test suite. This is what keeps the SQLite team able to reject external contributions, and what keeps forks from taking hold.
Any hard fork will have a struggle with this.
I believe a Rust re-write would have much less trouble w.r.t. the private test suite, owing to Rust's memory safety. But any fork that remains C-coded will have trouble getting public acceptance, and will have to have amazing features -or a new public test suite- to get acceptance.
Meanwhile the SQLite team could respond by making SQLite3 a bit more modular and able to be used in distributed database constructions w/o alteration to any of the SQLite3 code. That would take the wind out of the sails of any fork. This possibility means that any forks need to be properly funded to be able to compete, but the SQLite Consortium is almost certainly well-funded, so it will take a serious commitment -maybe even by some of the consortium's members- to see a fork succeed.
The private suite is about (among other things) MC/DC, i.e. extremely stringent test-coverage constraints at the binary level. What does this have to do with Rust vs C?
Yeah, the test suite seems pretty key. It also probably contains information about the proprietary extensions and contract work from clients and can likely never be released.
A proprietary test suite is a pretty interesting strategy, actually. It's very difficult for others to just steal your code and support effort and run with it if they have to start from zero to build a massive test suite.
I wonder how difficult it would be for some sort of tool to assume a particular original sqlite.c has full coverage and then suggest where additional tests are needed for patches?
> But it would also be cool to see what could happen with a more open development model.
It seems a little disingenuous to act like we don't know what would happen, at least in broad strokes. Just compare and contrast the nature of SQLite with "more open" projects and you can get the gist.
More features, more bugs, abandonment of excellent testing standards, poor handling of what are dismissed by the developers (but not the current, extremely wide base of users) as niche concerns.
It would be quite an uphill battle for this project to succeed in the longterm. But I don't understand what's disingenuous about hoping for innovation by a change in process.
> We encourage others to speak up if they feel uncomfortable.
OK - your weird swipe at Sqlite tentatively tells me you're founding with a grudge, which (a) doesn't tend to last very long as a motivation and (b) has nothing to do with me, so I'm not interested at all.
So I don't have a dog in this fight, I'll check back in a few years and see how it went. Happy forking!
This whole thing reads like a slighted, uppity, smarter-than-thou* child taking their ball and going home. It's off-putting.
*what's the point of pointing out other "geniuses": to encourage the reader to think "oh well then our author must be in the same group too!"? Transparently manipulative.
I wonder if they'll attempt to replicate the portion of SQLite's test harnesses [0] that are proprietary but nevertheless beneficial to all SQLite users, tests comprising over a million lines of C code.
Those tests are possibly among what the blog author does not consider to be "modern tools," which I don't even...
SQLite and QEMU have been successful in large part because of their limited scope. Further expanding the scope of a project puts it at risk of becoming unfocused and unmaintainable. It makes a lot more sense to start a new project to support this use case than to try to reshape SQLite.
I suspect there's a strong correlation in this case between "closed to outside contribution" and SQLite being one of, if not the, most robust pieces of software out there.
> Which means that the data that hits a single SQLite instance now needs to be replicated to all the others.
My initial reaction:
Yeah because ... SQLite is not a database server - it's an awesome API on top of a flat file? Do you want a replicated database? Then use a replicated database.
Then looking at the list of solutions (rqlite, BedrockdB, dqlite, ChiselStore, LiteFS) ... I guess people really do want to make it replicated.
Why not use a SQL server?
Also: I was using QEMU on an old 32-bit Dell server that predated Intel's hardware virtualization efforts. It worked decently. :)
On one hand, SQLite would probably be a more expansive project if it allowed contributions in the same way PostgreSQL does. On the other hand, it would likely make it both a larger and more complex project.
I think a better tradeoff would be making the case for why SQLite should have the hooks needed to implement the edge computing bits desired without forking the project itself or attempting to replace what has made it so great for so many years.
This approach would allow the extra behaviors to be managed out of tree and allow more free experimentation as a result without prematurely landing on "blessed" approaches to these very complex and frankly not yet well solved problems.
If someone cares about their project, and chooses to share it with the community... no one should feel entitled to pressure them into another development model...
I find some of SQLite's library could be cleaner, but appreciate the philosophy of keeping it consistent, understandable, and simple... It is nice to have something reliable completed in an afternoon. =)
Why does it even need to be forked? The whole point of using SQLite as a foundation for distributed, edge, or other stuff is that it's a solid foundation that can be trusted to not change too much.
Why the need to open SQLite to contributions? Build your fancy distributed, or edge database on top of SQLite and you won't need to modify SQLite at all.
> The whole point of using SQLite as a foundation for distributed, edge, or other stuff is that it's a solid foundation that can be trusted to not change too much.
Just to clarify: sqlite3 changes almost literally every day[^1]. However, the project has always placed a premium on backwards compatibility and the developers go way out of their way not to break in-the-wild applications. Given how many databases there are ("billions and billions"), even the slightest backwards incompatibility is likely to affect _someone_, and even 1% of "billions and billions" is a significant number of databases.
Just a nitpick: Virtualization was pioneered by IBM S/370 mainframes, CP/CMS and the like. Author is talking about virtualization on desktop/x86 specifically.
Unfortunately, for historical reasons (microcomputer revolution), these are separate worlds.
From a spectator's perspective it's going to be interesting to see how this all works out. SQLite's dedication to doing one thing, doing it well and staying incredibly small means it's basically my go-to example for phenomenal software in 2022. But at the same time maybe a more open model means we'll end up with an embeddable DB that's capable of even more wonderful things. I look forward to finding out.
All of the projects listed are clearly outside the scope of what would potentially be merged into sqlite even if it accepted contributions.
That said there is probably room for someone to maintain a sqlite extension distribution with standard build settings for use across different languages or something.
Can any of the contributors to ChiselStore [1], mentioned in the article, elaborate on the comment made about it? What's the big deal about not fitting well with SQLx? This isn't a requirement for such a library to be successful. What other problems is it allegedly suffering from? It could be more of a problem of perspective.
I don't want to distract too much from the intended point of this, but something to keep in mind per QEMU and virtual machines - SQLite has its own VM too.
In fact, by binding application-defined functions to your SQLite connection, you can realize your own virtualized application execution environment.
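As a concrete sketch of that idea: SQLite's query engine compiles SQL to bytecode for its internal virtual machine, and application-defined functions let that VM call back into your host language mid-query. A minimal example using Python's stdlib `sqlite3` bindings (the function name `reverse` is just an illustration):

```python
import sqlite3

# Application-defined functions are registered per connection.
conn = sqlite3.connect(":memory:")

def reverse_text(s):
    # Plain host-language code, callable from SQL by name.
    return s[::-1]

# Arguments: SQL function name, number of arguments, Python callable.
conn.create_function("reverse", 1, reverse_text)

# SQLite's bytecode VM dispatches into reverse_text while evaluating the query.
result = conn.execute("SELECT reverse('sqlite')").fetchone()[0]
print(result)  # etilqs
```

The C API equivalent is `sqlite3_create_function()`; either way, the query planner treats your code as just another scalar function.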
> This article's history of "virtualization" begins with paravirt in Xen in 2004. Then, after hardware support emerged, v12n started working with unmodified OSes. No part of this is true. VMware was running binary OSes without hardware support in 1999.
> We did this by JIT'ing unrestricted x86 to a safe x86 subset. User-level code, with few exceptions, was directly executed, but kernel-level code ran through a little JIT. We described it in https://www.vmware.com/pdf/asplos235_adams.pdf
> It performs pretty well, as described above. The only problem is that tiny JITs that run in kernel mode and use x86 segmentation to hide themselves in the top 4MB of linear memory are ... hard to write. And VMware was making a lot of money.
> Xen invented paravirt; AMD invented SVM; and Intel invented VT in parallel all around 2003-2004, not to make virtualization on x86 practical, but to make it easy enough to disrupt VMware's emerging monopoly.
> As a VC focused on companies exploiting a tech advantage, VMware remains my favorite case study. For a time, they could do something that was undeniably valuable, that nobody else could do. And, moats like that are always time-limited; their value invites their destruction.
What they are missing here is that QEMU only emulating, and not virtualizing, was a far bigger limitation, and no counterpart of that limitation exists in SQLite.
SQLite delivers on the in-process embedded database promise, and delivers it well. Sure, you can have a column-oriented flavour like DuckDB, but those projects don't call themselves SQLite or a fork of it.
Lastly, as many have pointed out, SQLite is not even licensed (Apache, MIT, etc.); it is just in the public ___domain, so anything is possible.
I'm working on mvsqlite [1], a distributed SQLite based on FoundationDB. When doing the VFS integration I have always wanted to patch SQLite itself, but didn't because of uncertainty around correctness of the patched version...
A few features on my wishlist:
1. Asynchronous I/O. mvsqlite is currently doing its own prefetch prediction that is not very accurate. I assume higher layers in SQLite have more information that can help with better prediction.
2. Custom page allocator. SQLite internally uses a linked list to manage database pages - this causes contention on any two transactions that both allocate or free pages.
3. Random ROWID, without the `max(int64)` row trick. Sequentially increasing ROWIDs are a primary source of contention, and cause significant INSERT slowdown in my benchmark [2].
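For readers unfamiliar with the trick being referenced: once a table contains a row with the maximum 64-bit ROWID, SQLite abandons its usual max+1 assignment and starts picking unused ROWIDs at random. A minimal sketch using Python's stdlib `sqlite3` module (the table and column names are just for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")

# Plant a sentinel row at the maximum 64-bit ROWID (2^63 - 1). With the max
# ROWID taken, SQLite can no longer assign max+1 and instead probes random
# unused positive ROWIDs for subsequent inserts.
conn.execute("INSERT INTO t (id, v) VALUES (9223372036854775807, 'sentinel')")

# This insert now receives an effectively random ROWID rather than a
# sequentially increasing one.
cur = conn.execute("INSERT INTO t (v) VALUES ('hello')")
print(cur.lastrowid)
```

Note this only applies to tables without `AUTOINCREMENT`; with `AUTOINCREMENT`, hitting the maximum ROWID makes further inserts fail with `SQLITE_FULL` instead.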
I believe the reason this post rubs so many the wrong way is that the author keeps praising his "opponents", trying to sound positive while being inherently negative. It's an ugly way of arguing, as it forces the other side to be the negative one, when in fact the author is the one who's trying to bring something down. In the end the author just sounds dishonest, like a politician with a smile plastered over their face all the time.
I believe there are two ways this post could have been written which would have given a better impression of the author and his intentions. Either openly criticize SQLite, with well-formulated arguments, or don’t criticize SQLite and just announce that you’re doing something new (building on top of SQLite is not at all uncommon after all).
Frankly, as someone more interested in emulation than virtualization, I have been occasionally unhappy that most of the energy behind QEMU goes to virtualization. Some of the things QEMU has done to better adapt to virtualization do not translate well to the emulation use case.
This is not true, for example a lot of the work to enable concurrent emulation of multiple CPUs started on the virtualization side. It is also mentioned in the article that both emulation and virtualization benefited from the innovation.
They were initially developed in parallel. To the very best of my recollection, Richard had no knowledge of git when he started fossil. Fossil's initial development dates back to 2005, IIRC, (edit: maybe 2006) initially as a TCL prototype. It wasn't developed in earnest in C until 2007 and was self-hosting by the end of 2007 (i discovered it around Christmas that year).
Listening to the SQLite creator's podcast (https://corecursive.com/066-sqlite-with-richard-hipp/# ), it does feel that Glauber is right about weird collaboration standards. The guy is against Gmail, git, etc. Fossil may be better for SQLite today, but it leaves the project without many contributors, and that's the problem Glauber is trying to solve.
Choosing to license the fork under Apache-2.0 ensures that this will not happen (via this fork, at least), as the license is considered incompatible with the kernel's GPL-2.0-only license.
> LiteFS: [...] I personally think that if distributed filesystems were easy, we’d have a good one by now.
The part that makes that "easy" for LiteFS is that it's a single-writer system. That's a pretty rare trade-off in general purpose distributed filesystems.
I don't like how the 'manifesto' presents SQLite's code of ethics as nefarious:
> We take our code of conduct seriously, and unlike SQLite, we do not substitute it with an unclear alternative. We strive to foster a community that values diversity, equity, and inclusion. We encourage others to speak up if they feel uncomfortable.
It seems in bad faith to imply that SQLite is some sort of toxic, anti-diversity space. It's not. it's just a closed-to-contributions open source project. Richard is pretty friendly guy and I'm not certain if efforts have been made to reach out to him here.
One of the things I think is really innovative [1] about SQLite's code of conduct is that it's a one-way covenant. The developers are agreeing to act and engage with the community in a certain way without expecting anything in return or enforcing any of their beliefs on others. The rules are also very clearly defined, even the ones I personally don't agree with.
I think the new "Contributor Covenant" is a step backwards. It's pages and pages of text to say "don't be a jerk", and the rules are so poorly defined that it isn't much more precise than a one liner saying "don't be a jerk". There's lots of catch-alls like "... And any other unprofessional behavior is against the rules". You could stretch this to cover almost anything. Hell, I've worked in professional settings where wearing shorts was verboten, but I don't think open source projects should be policing how participants dress except in the most extreme circumstances (like if someone shows up to a convention naked when it isn't pre-planned as being a naturist event)
I find it confounding that the new covenant is described as being less vague when it relies on poorly defined terms like "community spaces" and "community leaders". What's a community space? Who are the community leaders? Idk, but I've seen a handful of cases where these covenants have been weaponized against developers expressing their personal opinions in a context that has nothing to do with GitHub.
[1]: I know calling it innovative is a bit ironic when most of it was written 1500 years ago, but it's innovative compared to other software projects.
(Edited for grammar and to flesh out why I think the new covenant is vaguer despite the author's claim, unlike the old one with well-defined rules)
These days, I find myself feeling far more comfortable and trusting when a software community doesn't have a "code of conduct", and when it doesn't have a "moderation team" or "community leader enforcement".
Censorship, or even just the threat of censorship, causes far greater harm to a software community than a small number of thin-skinned people getting "offended" every now and then, usually over trivial and irrelevant matters.
> Censorship, or even just the threat of censorship, causes far greater harm to a software community than a small number of thin-skinned people getting "offended" every now and then, usually over trivial and irrelevant matters.
Projects that do not promise to at least enforce a basic level of civility however lose a potential coder base as well, and that is people who are not "mainstream" and have to fight for survival (all too often literally, even in Western countries) in the meatspace already all day: women, gender-queer, people of color, people of non-Christian religions... last thing they want is toxic gamer bro culture seeping into spaces where they are active.
Anecdata, but anyway: I've been on HN for a few years now and very often the lead developers for extremely complex and interesting stuff, particularly reverse engineering, fall under that umbrella somehow. The "free market" shows just exactly where the highest talent classes are, and it's not the whiners whining about CoCs preventing them from spamming "I identify as an Apache helicopter" memes.
> and unlike SQLite, we do not substitute it with an unclear alternative
This portion of the statement is harsh, and I think the project could do better by stating what they are doing and why, rather than focusing on what SQLite is doing and why it is bad.
I would say that it seems feasible (to me) that SQLite's "Code Of Ethics" works for a small team, open-source, closed-contribution model but wouldn't work for the open-contribution model that libSQL is trying to achieve.
This is probably referring to the SQLite Code of Ethics debacle.
SQLite has a Code of Ethics, which is used for ticking a box when working with some of their clients (meaning: they probably don't take it very seriously). For this code of ethics, they chose to use the Christian Rule of St. Benedict[0].
When it was implemented, it ruffled some feathers in a fairly major way. While it's true that SQLite is closed to code contributions, they do still accept discussions on their forums, among other things. Those interactions would, supposedly, be bound to the CoC. The sqlite devs later cleared up that it was there for client compliance reasons, but the damage was done.
I don't think this line is in bad faith. It's basically spelled out on their code of ethics page that they basically don't care too much about it and is mostly there for ticking boxes.
The article said libSQL would have a clear code of conduct. README.md stated their goals and said SQLite's code of ethics was unclear. Was there another announcement?
> This document continues to be used for its original purpose - providing a reference to fill in the "code of conduct" box on supplier registration forms.
It seems somewhat deceitful to use this list to check those boxes, which makes this entire page self-contradictory (specific points highlighted):
Working around bureaucracy is still deceit. The spirit of the ethics they state here would suggest a good faith effort to put together a true code of conduct, however basic.
The ethics that Christianity teaches don't have a backdoor just because big corporations are involved.
Working around bureaucracy is not deceit. At least according to my understanding of the word, "deceit" requires that someone is lead to believe falsehoods. That's not going on here. SQLite has a code of ethics, and if that's all that's required by some bureaucratic process, then that's fair game. If the process requires enforcement of the code of ethics, SQLite should probably not tick the box, and the document puts that information front and centre.
I mean that's still deceitful. I get that they don't like codes of conduct, but I can still find it amusing that they violate their own code of ethics on the same page they wrote them on.
I don't think I agree with the other commenter that your argument is a straw man. I think we use different definitions of deceit though. I don't think anyone is being misled, which in my mind is enough to make it not deceitful; if you operate with a different definition, that's okay.
As far as I can see, you're wrong: my position is that the CoE is not deceitful, /u/mmastrac's position is that the CoE is deceitful (and that it's ironic or whatever because the CoE says not to be deceitful). As far as I can see, this is just a good old disagreement where both parties understand and respond to each other's positions.
> I pointed out in detail how it matches the fallacy.
I'm sorry, you did nothing of the kind. The only thing that appears to have happened here, as far as I can tell, is your definition of a strawman, and a cherry-picking, uncharitable read of one of my comments.
> Next up: The first step to get out of a hole you have dug yourself into is to stop digging and all that.
This hostility doesn't feel like the start of a productive conversation, so I'll bid you a good day and stop here.
> Give to the Caesar what belongs to the Caesar and all that.
Everyone always leaves out the "render unto God that which is God's". That second part (not to mention the preceding context) changes the whole meaning of the verse.
In your view, christian ethics don't allow jokes, which is a bit extreme given that many very serious Christian theologians are also quite entertaining and have been so historically as well.
This code of conduct is there to prime the foundation for a takeover by outsiders. With for-profit companies, they can just buy the company. For non-profits, outsiders take over by instituting DEI and then finding something they can make a big deal out of, or by outright lying to get the founders kicked out and take over the organization.
it's not nefarious, it's just a different approach to things.
I think Hipp is justified in all he is doing, and I don't think he is necessarily wrong. We just see things differently
Intolerance towards religion is toxic and anti-diversity. In America, the Civil Rights Act of 1964 makes discriminating against people because of their religion illegal in many contexts. If you can't work alongside somebody who is religious, then you become a liability to any corporation that respects this law.
It's a free country, you're free to be a racist or an anti-theist or a sexist or just about any other -ist. But when your intolerance makes you a legal liability to employers, you'll have yourself to blame.
And insofar as you find SQLite's developers intolerable, this is definitely a "you" problem. The vast majority of us don't have a problem with SQLite's developers, they make software that is used by a huge and diverse group of people. This makes your intolerance all the more stark.
It is precisely American Christianity which created the system of religious tolerance and pluralism that we operate under today.
And Richard Hipp, while religious, is clearly the farthest thing from a nutcase as you can get. From everything I have seen his behavior indicates that he is calm, rational, thoughtful, and kind.
No, I wouldn't. I'd say 'eccentric' maybe, but not a nutcase. Lots of very smart, sane people believe things I think are false. I mean, two of the greatest thinkers in the realm of mathematical probability (Pascal and Bayes) were staunch theists.
"This document was originally called a "Code of Conduct" and was created for the purpose of filling in a box on "supplier registration" forms submitted to the SQLite developers by some clients."
A code of conduct that covers all the bases of the gold standard for such things, the Contributor Covenant -- including explicitly promoting DEI, protecting everyone but especially members of marginalized communities from harassment, and delineating explicit enforcement procedures and committees -- is table stakes for an open source project today. Not having one presents compliance issues for companies who otherwise might adopt your project.
It does not seem to be preventing adoption of SQLite anywhere.
People should have codes of conduct if they believe in them (I do) and find they're necessary in the circumstances they're working in (I have not yet).
People should stop going around proclaiming that pro-forma codes of conduct are a basic necessity for every project; all that does is exacerbate the drama around codes of conduct --- which are perfectly useful and often necessary for huge projects, and which don't at all need to be the topic of 500-post message board threads about freedom.
> A code of conduct that covers all the bases of the gold standard for such things, the Contributor Covenant
It's amazing to me the hubris of people who believe that a set of community guidelines that are less than a decade old are superior to a set of community guidelines that have successfully guided a set of communities for more than 1000 years.
Is this really true though? In the projects I've seen this Covenant pushed heavily, it just seems to be a way to distract from building the product, to instead spend valuable contributor time debating identity politics.
There's a time and place for this sort of thing (e.g. Twitter); open source projects are typically not an appropriate venue.
Specially when demanded by entitled companies that want the work of FLOSS developers for free. And, curiously, these are usually the same companies that cry about (L|A)?GPL.
You do realize that SQLite adopted its current CoC for exactly this reason, right? That's what's meant by compliance: there are some situations where adopting OSS code requires specific licenses and compatible-ish CoCs to avoid issues with required harassment training and DEI. Some auditor will ask for "proof", and the CoC is what you can send them. And it's done virally like this because otherwise it would be trivial to bypass by simply having a subcontractor.
You think workplaces make people watch those cringy videos for fun?
I'm glad you're in favor of abolishing affirmative action then, since it's literally systemic racism and even falls under the disingenuous "prejudice plus power" definition of racism!
No-one is insisting on anything, really. "If you'd like us to adopt your technology, do X" gives both parties a choice in the matter and total freedom to walk away.
1. It's creepy that corporations are having expectations about how volunteer effort to create code that they use for free runs itself (or claims to run itself). It's generally not even being made for the corporations that are using it - the software is made for everyone to use (and fork) as they see fit.
2. These corporations are not engaging in this behavior spontaneously. They are responding to a legal regime which is increasingly becoming totalitarian.
That is fair. It's still creepy - especially given that the project is not open to contributions, and has not (to my knowledge) had any issues with the broader "community" related to the things covered in a code of conduct.