> The few core developers they have do not work with modern tools like git and collaboration tools like Github, and don’t accept contributions, although they may or may not accept your suggestion for a new feature request.
The funny thing about this comment is that SQLite is as close to the gold standard of software quality as we have in the open source world. SQLite is the only program that I've ever used that reliably gets better with every release and never regresses. Could it be that this is precisely because they don't use "modern tools" and accept outside contributions?
I feel like a lot of fantastic software is made by a small number of people whose explicit culture is a mix of abnormally strong opinionatedness plus the dedication to execute on that by developing the tools and flow that feel just right.
Much like a lot of other "eccentric" artists in other realms, that eccentricity is, at least in part, a bravery of knowing what one wants and making that a reality, usually with compromises that others might not be comfortable making (efficiency, time, social interaction from a larger group, etc).
SQLite's quality is due to the DO-178B compliance that has been achieved with "test harness 3" (TH3).
Dr. Hipp's efforts to perfect TH3 likely did lower his happiness, but all the Android users stopped reporting bugs.
"The 100% MCD tests, that’s called TH3. That’s proprietary. I had the idea that we would sell those tests to avionics manufacturers and make money that way. We’ve sold exactly zero copies of that so that didn’t really work out... We crashed Oracle, including commercial versions of Oracle. We crashed DB2. Anything we could get our hands on, we tried it and we managed to crash it... I was just getting so tired of this because with this sort of thing, it’s the old joke of, you get 95% of the functionality with the first 95% of your budget, and the last 5% on the second 95% of your budget. It’s kind of the same thing. It’s pretty easy to get up to 90 or 95% test coverage. Getting that last 5% is really, really hard and it took about a year for me to get there, but once we got to that point, we stopped getting bug reports from Android."
> he managed to segfault every single database engine he tried, including SQLite, except for Postgres. Postgres always ran and gave the correct answer. We were never able to find a fault in that. The Postgres people tell me that we just weren’t trying hard enough.
I've always felt like Postgres is like one of those big old Detroit Diesel V12s that power generators and mining trucks and things. It's slow and loud and hopelessly thirsty compared to the modern stuff you get nowadays, and it'll continue to be just as slow and loud and hopelessly thirsty for another 40 or 50 years without stopping even once if you don't fiddle with it.
(I should say that it is not at all difficult to crash an Oracle dedicated server process. I've seen quite a few. This doesn't crash the database (usually).
I've never run an instance in MTS mode, so I've never seen a shared server crash, although I think it would be far from difficult.
I'd be curious which flavor of Db2 crashed: UDB, mainframe, or OS/400, as they are very different.)
it's not that "best practices" or any of those things are what causes trouble; it's failing to recognize that they're just tools, and people will still be the ones doing the work. And people should never be treated as merely tools.
You can use all of those things to enable people to do things better and with less friction, but you also need to keep in mind that if a tool becomes more of a hindrance than a help, you should go looking for a new one.
> it's not that "best practices" or any of those things are what causes trouble; it's failing to recognize that they're just tools, and people will still be the ones doing the work. And people should never be treated as merely tools.
For me, the concept of best practices is pernicious because it is a delegation of authority to external consensus which inevitably will lead to people being treated as tools as they are forced to contort to said best practices. The moment something becomes best practice, it becomes dogma.
This comment perfectly encapsulates the point that I am making about best practices: the concept is used as a cudgel to silence debate and to confer a sense of superiority on the practitioner of "best practice." It is almost always an appeal to authority.
No one wants cowboy pilots ignoring ground control. Doctors, though, do not exactly have the best historical track record.
Knowledge communities should indeed work towards consensus and constantly be trying to improve themselves. Consensus though is not always desirable. Often consensus goes in very, very dark directions.
Even if there is some universal best practice for some particular problem, my belief is that codifying certain things as "best practice" and policing the use of alternative strategies is more likely to get in the way of actually getting closer to that platonic ideal.
Perhaps a better example might be "covering indexes," or what Oracle would call an "index full scan."
It is an idea so efficient that to disregard it is inefficiency.
"I had never heard of, for example, a covering index. I was invited to fly to a conference, it was a PHP conference in Germany somewhere, because PHP had integrated SQLite into the project. They wanted me to talk there, so I went over and I was at the conference, but David Axmark was at that conference, as well. He’s one of the original MySQL people.
"David was giving a talk and he explained about how MySQL did covering indexes. I thought, “Wow, that’s a really clever idea.” A covering index is when you have an index and it has multiple columns in the index and you’re doing a query on just the first couple of columns in the index and the answer you want is in the remaining columns in the index. When that happens, the database engine can use just the index. It never has to refer to the original table, and that makes things go faster if it only has to look up one thing.
"Adam: It becomes like a key value store, but just on the index.
"Richard: Right, right, so, on the fly home, on a Delta Airlines flight, it was not a crowded flight. I had the whole row. I spread out. I opened my laptop and I implemented covering indexes for SQLite mid-Atlantic."
This is also related to Oracle's "skip scan" of indexes.
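The mechanics Hipp describes show up directly in SQLite's query planner output. Here is a minimal sketch using Python's built-in sqlite3 module (the table and column names are made up for the example):

```python
import sqlite3

# Toy illustration of a covering index: when the query filters on the
# index's leading columns and selects only columns that are also in the
# index, the planner answers from the index alone and never reads the table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(a INTEGER, b INTEGER, c INTEGER)")
con.execute("CREATE INDEX idx_ab_c ON t(a, b, c)")  # c is "covered"
con.executemany("INSERT INTO t VALUES (?, ?, ?)",
                [(i, i * 2, i * 3) for i in range(100)])

# Filter on (a, b), select only c -- all three live in the index.
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT c FROM t WHERE a = 5 AND b = 10"
).fetchall()
print(plan[0][3])  # detail string mentions "COVERING INDEX"
```

If `c` is dropped from the index, the same query plan falls back to an ordinary index search plus a lookup into the table itself.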
> And people should never be treated as merely tools.
Maybe on a tight-knit team people don't mind being treated like tools, because they understand what needs to get done next and see that it makes the most sense for them to do it; it's nothing personal.
At my freshman year "1st day" our university president gave us an inspirational speech in which he said "people say our program just trains machines... I want you to know we don't train machines. We educate them."
I'd say that if you have a tight-knit team, you are already doing the very opposite of treating people as tools. There's nothing wrong with having a shared understanding of a goal and then assuming a specific role in the effort to accomplish that goal; people are very good at that.
The problem is when you think of people the same way you think of a hammer when you use it to hit nails: The hammer doesn't matter, only that the nail goes in.
Best practices are subjective. What is best practice for C is not the same as for Python.
SQL DBs provide consistency guarantees around mutating linked lists. It’s not hard to do that in code and use any data storage format.
Imo software engineers have made software “too literal” and generated a bunch of “products” to pitch in meetings. This is all constraints on electron state given an application. A whole lot of books are sold about unit tests but I know from experience a whole lot of critical software systems have zero modern test coverage. A lot of myths about the necessity of this and that to software have been hyped to sell stock in software companies the last couple decades.
"Best practices" are just a summary of what someone (or a group of someones) thinks is something that is broadly applicable, allowing you to skip much of the research required to figure out what options there are even available.
Of course, dogmatic adherence to any principle is a problem (including this one).
Tools can be misused, but that doesn't really affect how useful they can be; though I think better tools are generally the kind that people will naturally use correctly, that's not a requirement.
I don't think you need "abnormally strong opinionatedness" or anything else special: all you need is a certain (long-term) dedication to the project and willingness to just put in the work.
Almost every project is an exercise in trade-offs; including every possible feature is almost always impossible, and never mind that it's the (usually small) group of core devs who need to actually maintain all those features.
I interpreted "opinionatedness" as meaning they have a clear definition of what sqlite is and isn't, including the vision of where it's headed. That would result in a team with very strong opinions about which changes and implementations are a good or bad fit for sqlite.
Can a project consistently make the right trade-offs without having strong opinions like that?
These devs provide a platform, and any change to a platform has a huge impact on its users. They have a plan they follow, and every project has layers. Constraints can be good when applied correctly, as in this case.
Fossil is not less modern than Git, just less popular.
Under the hood it seems a lot like Git. The UI is more Hg-like. I disagree with D. Richard Hipp's dislike of rebasing, but he's entitled to that opinion, and a lot of people agree with him.
Calling Fossil "not modern" is a real turn-off. TFA seems to be arguing for buzzwords over substance.
Why? Fossil is a great name: the past of a software project is... fossilized in the VCS, and looking through it is like doing archeology. I just wish it adopted rebase as an optional workflow.
When the retelling of the history of virtualisation ignored everything before Xen, I questioned the value of the essay.
When it got to asserting that Fossil isn't modern, I discarded it. Fossil's a DVCS, but unlike git it chooses to integrate project tooling for things like bug management with the code repo. You can argue about whether you like the approach. But 'not modern' is an absurd statement.
Agree, after reading up about fossil. Except for one thing: I don't want the "closed team" culture that was intentionally baked into the tool.
When git replaced SVN, it was so empowering that I, as an individual, was able to use the full maintainer workflow without the blessing of a maintainer, privately by default.
Before git, we saved the output of "svn diff" into a .patch file for tracking personal changes. When submitting a patch, the maintainer had to write a commit message for you. With some luck, you even got credited. For sharing a sensible feature-branch, you had to become a maintainer with full access. This higher bar has advantages (it tends to create more full-access maintainers, for one). However, it sends this message: "Yes open source, but you are not one of us."
Yes, fossil has this great feature of showing "what was everyone up to after release X". I miss that in git. (Closest workaround: git fetch --all && gitk --all.) But if "everyone" means just the core devs, then I'm out.
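For what it's worth, the git workaround can be pushed a bit further than `gitk --all`. Here's a self-contained sketch in a throwaway repo (the tag name `v1.0` and the commit messages are made up) that approximates Fossil's "since release X" view:

```shell
# Build a throwaway repo with a tagged release and some post-release work,
# then list every branch's commits that are not reachable from the tag.
set -e
tmp=$(mktemp -d) && cd "$tmp" && git init -q .
git -c user.email=a@b -c user.name=a commit -q --allow-empty -m "release"
git tag v1.0
git checkout -q -b feature
git -c user.email=a@b -c user.name=a commit -q --allow-empty -m "post-release work"

# "What was everyone up to after release v1.0," on any branch:
git log --all --oneline --graph --not v1.0
```

The `--not v1.0` revision range is what does the work: it excludes everything reachable from the release tag, leaving only the post-release activity.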
> I don't want the "closed team" culture that was intentionally baked into the tool.
I've been using Fossil for years and TBH I don't see that "closed team" culture you speak of (though I also have almost all of my projects as "open source, not open contribution").
Fossil is a distributed version control system, and in fact it is "more" distributed than git if you consider that people tend to tie git to centralized services like GitHub to get more than just the VCS part. A Fossil repository contains not only the versioned files, but also a wiki, tickets/bugtracker, forum, chat room, blog/technotes - even the theme is part of it. And since it is a decentralized system, all of it is cloned when you clone the repository.
AFAIK the only limitation (I haven't really tried it myself since I only use it solo) is that for security cloned users aren't "fully" cloned, so you'd need to make new users in the cloned repository - you can use the same username, though (but in commits, history, etc. it'll appear as different users). It'd be useful if "user" and "identity" could be distinguished so that the tool could know that two usernames really refer to the same person.
Also Fossil works pretty much everywhere - a local server, as a FastCGI server, as a plain old CGI "script", you can even have it as a "CGI interpreter" for ".fossil" files (the repository files - the entire repository is stored in a single SQLite database file) which makes it usable with many shared web hosting services without needing root or even shell access. In a way that is the most decentralized you can go :-P.
> Fossil is a distributed version control and in fact it is "more" distributed than git if you consider that people tend to tie it with centralized services like GitHub to get more than just the VCS part.
The GitHub blurb makes no sense. Even if N developers standardize their workflow on using a couple of remote repos to exchange work, that does not make the underlying system less distributed/more centralized.
> A Fossil repository contains not only the versioned files, but also a wiki, tickets/bugtracker, forum, chat room, blog/technotes - even the theme is part of it.
That sounds like a major design faux pas. It makes zero sense to tie a chat room/blog/e-mail client/alarm clock to a source code repository.
> The GitHub blurb makes no sense. [..] It makes zero sense to tie a chat room/blog/e-mail client/alarm clock to a source code repository
I think you need to reconsider your senses if you can't see the tie between the two :-P. The "more distributed" part was exactly because of Fossil providing that additional functionality GitHub provides in a decentralized form.
Extremely well written and maintained and high quality as of now and having a plan to make sure that can continue in the future are sometimes entirely different things with needs that oppose each other.
A single person can develop and release extremely high quality software, and as long as it meets the needs of the users (it's not missing a lot of features that are taking a long time to deliver), a single person in absolute control and writing all the code is probably a benefit in keeping it high quality and with less bugs.
It may not follow that the same can be said a few years from now, or even a few months from now, since the bus factor of that project is one, and if "bus events" includes "I don't want to work on that anymore but nobody else knows it well at all" then for some users that's a problem (and for others not so much).
One situation isn't necessarily better or worse than the other (and it's probably something in-between anyway); it really just depends on the project and the audience it's intended for. That audience might be somewhat self-selected by the style of development, though.
> Extremely well written and maintained and high quality as of now and having a plan to make sure that can continue in the future are sometimes entirely different things with needs that oppose each other.
I think in this area SQLite has most other open source software beat. SQLite is used in the Airbus A350 and has support contracts for the life of the airframe.
Fair points, although the bus factor is more like 3 or 4 for SQLite as far as I know. The question, though, is what the impact would be if the entire team vanished from the face of the earth. My guess is that either SQLite would be good enough as-is for 99% of use cases and wouldn't need much development apart from maybe some minor platform-specific work, or, if new functionality truly is needed, it would be better for a new team to rewrite a similar program from scratch using SQLite as more of a POC than as a blueprint.
SQLite is supported until 2050, and will likely outlast many other platforms if this goal is attained.
I hope the bus factor is high enough to reach the goal.
"Every machine-code branch instruction is tested in both directions. Multiple times. On multiple platforms and with multiple compilers. This helps make the code robust for future migrations. The intense testing also means that new developers can make experimental enhancements to SQLite and, assuming legacy tests all pass, be reasonably sure that the enhancement does not break legacy."
I was speaking less to the SQLite situation specifically and more to the general idea of "Could it be that this is precisely because they don't use "modern tools" and accept outside contributions?" and how I think teams that are very small and not very accepting of outside help/influence might affect that.
To that end I purposefully compared extremes, and tried to allude to the fact that most situations fall between those extremes in some way. SQLite is more towards one end than the other, but it's obviously not a single developer releasing binaries to the world, which is about as far to that extreme as you can go. The other end would probably be something like Debian.
That's not to say either of those situations have to be horrible at what the other excels at. That singular person could have set things in place such that all their code and the history of it gets released on their death, and Debian obviously has a working process for releasing a high quality distribution.
AFAIUI the company behind SQLite (Hipp & co.) have basically endless funding. Not unlimited, just a good enough budget and not likely to end soon. That's also a big factor.
> It may not follow that the same can be said a few years from now, or even a few months from now, since the bus factor of that project is one, and if "bus events" includes "I don't want to work on that anymore but nobody else knows it well at all" then for some users that's a problem (and for others not so much).
You may have been speaking generally, and you'd be right, but specifically the bus factor of the SQLite team and the SQLite Consortium is larger than 1, and they could hire more team members if need be.
If and when the SQLite team is no longer able to keep the project moving forwards, then I think we'd see one or more forks or rewrites or competitors take over SQLite's market share.
Yes, I was speaking generally. Specific development models have advantages and disadvantages, but those can often be countered by non-development-model actions taken to limit those disadvantages. For example, an extremely open development model is likely prone to more bugs and quality problems, as well as a harder-to-read and harder-to-work-in code base. There are steps to combat that, such as style guides and automatic style converters, numerous reviewers that can go through code to find bugs and make suggestions for better quality, etc.
It's not so much that one model over the other will have those problems I've mentioned for each, as much as I think those are common things those projects should be cognizant of and take steps to combat.
As you noted elsewhere, it sounds like SQLite has done a lot that mitigates what I see as the inherent disadvantages of their development model, which is laudable. At the same time, I doubt the average SQLite developer is as easily and quickly replaced as the average Linux kernel contributor, even if there are specific kernel developers whose loss would be hard to absorb. Sometimes all you can do is mitigate the harm of a problem, not remove it entirely.
I dunno, I've seen new team members jump onto projects and thrive where most other members have a decade or two of experience. Once you've been around the block dealing with large and complex codebases, picking up a new one gets easier, so I'm not at all worried about the SQLite bus factor. I agree that much larger projects like Linux can have much larger communities to draw leadership from, but I think SQLite is fine.
Once upon a time I wanted to make a contribution to SQLite, and I tried to negotiate making it, but it was quite an uphill battle. On the other hand, I found the codebase itself quite approachable.
> Extremely well written and maintained and high quality as of now and having a plan to make sure that can continue in the future are sometimes entirely different things with needs that oppose each other.
And so does the Linux kernel. There are numerous cases of successes and failures at both ends of that spectrum.
My point wasn't to imply that you can only pick one, but that in some cases choices to maximize one aspect can negatively affect the other if care is not taken, and depending on audience high quality released software is not the only thing under consideration in open source projects. Keeping the developer group small and being extremely selective of what outside code or ideas are allowed might bring benefits in quality, but if not carefully considered could yield long term problems that ultimately harm a project.
> > > Extremely well written and maintained and high quality as of now and having a plan to make sure that can continue in the future are sometimes entirely different things with needs that oppose each other.
> [...], but if not carefully considered could yield long term problems that ultimately harm a project.
SQLite has a successful consortium and a successful business model, namely: leveraging a proprietary test suite to keep a monopoly on SQLite development that then drives consortium membership and, therefore, consortium success, which then funds the SQLite team.
This has worked for a long time now. It will not work forever. It should work for at least the foreseeable future. If it fails it will be either because a fork somehow overcomes the lack of access to the SQLite proprietary test suite, or because a competitor written in a better programming language arises and quickly gains momentum and usage, and/or because the SQLite Consortium members abandon the project.
Very good points. The proprietary test suite is clearly the (open) secret to SQLite's success. It seems to me that it isn't even entirely accurate to describe SQLite as written in C when the vast majority of its code is probably written in TCL that none of us have seen. It's more like C is just how they represent the virtual machine which is described and specified by its tests. The virtual machine exists outside of any particular programming language but C is the most convenient implementation language to meet their cross platform distribution goals.
If someone did want to carve into SQLite's embedded db monopoly, it would take years to develop a comparable test suite. This seems possible, particularly if they develop a more expressive language for expressing the solutions to the types of problems that we use SQLite for. Who would fund this work though when SQLite works as well as it does?
Ultimately, the long term harm I was thinking of (for the most part) was lots of proprietary knowledge being lost as a developer is lost for one reason or another, and a resulting loss in quality and/or momentum in the project as that developer may represent a large percentage of project development capacity.
That a large chunk of this knowledge appears to have been offloaded into a test suite is good, and does a lot to combat this, but obviously nothing is quite as good as experience and skill and knowledge about the specifics of the code in question.
As a theoretical situation, how much more likely is a fork to eventually succeed if one or more of the core SQLite developers is no longer available to contribute to SQLite? There are a lot of variables that go into that, but I would feel comfortable saying "more likely than if those developers were still present". That idea encapsulates some of the harm I was thinking of.
Institutional knowledge, and leadership, is indeed critical. The knowledge of a codebase can be re-bootstrapped, and its future can be re-conceived, but actually providing leadership is another story. I think there's one person on the SQLite team besides D. R. Hipp who can provide that leadership, but I'm not sure about business leadership, though who am I to speculate, when I don't really know any of them. All I can say is that from outside looking in, SQLite looks pretty solid, and libSQL seems unfunded.
Contrary to what the article implies, they don't use outdated project tooling(). Their VCS isn't outdated; in some ways it's more modern than git. It's just focused on dev flows similar to theirs, to the point where typical git dev flows won't work well with it. Similarly, not using GitHub is the right decision for how the project is managed; GitHub is too focused on open-contribution projects.
(): You could argue they use "not modern" tools like C while doing modern things like fuzz testing. But the article's author clearly puts the focus on project tooling, highlighting git/GitHub, so reading the article as implying that their VCS is "not modern", i.e. outdated, i.e. bad, seems very reasonable IMHO.
> Extremely well written and maintained and high quality as of now and having a plan to make sure that can continue in the future are sometimes entirely different things with needs that oppose each other.
Software that's truly "extremely well written and maintained and high quality as of now" has the option of a plan like:
> "At the time of my death, it is my intention that the then-current versions of TEX and METAFONT be forever left unchanged, except that the final version numbers to be reported in the “banner” lines of the programs should become [pi and e] respectively. From that moment on, all “bugs” will be permanent “features.” (http://www.ntg.nl/maps/05/34.pdf)
If your software needs perpetual maintenance, that's a good sign that it's probably not that high quality.
> If your software needs perpetual maintenance, that's a good sign that it's probably not that high quality
The problem in a lot of cases is not the software per se, but changing environments. Windows upholds backwards compatibility to a ridiculous degree (you can still run a lot of Win95 era games or business software on Win10), macOS tends to do major overhauls of subsystems every five-ish years that require sometimes substantial changes (e.g. they completely killed off 32-bit app support), but the Linux space is hell.
Anything that needs special kernel modules has no guarantees at all unless the driver is mainlined into the official kernel tree (which can be ridiculously hard to achieve). Userspace is bleh (if you're willing to stick to CLI and statically linking everything sans libc) to horrible (for anything involving GUI or heaven forbid games, or when linking to other libraries dynamically).
The worst of all offenders however is the entire NodeJS environment. The words "backwards compatibility" simply do not exist in that world, so if you want even a chance at keeping up with security updates you have an awful lot of churn work simply because stuff breaks left and right at each "npm update".
You say nothing seriously false, but perhaps depending on things like NodeJS just inherently means your software is going to be poor quality. If that is true, then both you and the PP are probably right. I tend to think software quality will be higher if you depend on a third-party collection (such as a so-called Linux distribution) than on a second-party aggregation (such as NPM).
Even the distributions have a hard time with the NodeJS environment and its relentless pace - and the more software gets written in JS, the worse. When e.g. software A depends on library x@1.0 and software B on library x@1.1, and x has made breaking changes, what should a distribution do?
Hard forks in the package name (e.g. libnodejs-x-1.0 and libnodejs-x-1.1) are one option, but blow up the repository package count and introduce maintenance liability for the 1.0 version. Manually patching A to adapt to the changes in X works, but is a hell of a lot of work and not always possible (e.g. with too radical changes), not to mention if the work should be upstreamed, then licensing issues or code quality crop up easily which means yet more work. Dropping either A or B also works, but users will complain. And finally, vendoring in dependencies works also, but wastes a lot of disk space and risks security issues going unpatched.
And that's just for final software packages. Dependency trees of six or ten levels deep and final counts in the five digits are no rarity in an average NodeJS application.
Importing even the bare minimum introduces an awful lot of work and responsibility to distributions.
When NetBSD imported sqlite and sqlite3 into their base system that was a signal to me that SQLite is no-nonsense and reliable. That was many years ago, around 2011 I think. Not sure why SQLite is getting all the attention on HN lately. Usually more attention means more pressure to adopt so-called "modern" practices and other BS.
SQLite is interesting to me because, like djb's software, its author is not interested in copyrights.^1
The author disclaims copyright to this source code. In place of a legal notice, here is a blessing:
May you do good and not evil.
May you find forgiveness for yourself and forgive others.
May you share freely, never taking more than you give.
Apparently this is not enough to convince some folks that they can use the code (maybe they really are doing evil), and so there is also a strange set of "reassurances" on the website:
2. I seem to recall an open source OS project or two making a fuss about djb's software being public ___domain but perhaps I am remembering incorrectly.
This part also rang some alarm bells for me. It makes me think the author is unable to see outside his bubble, and that feeling is only reinforced by the comments about Rust and the CoC in the Readme.
I'm all for minimizing friction for contributors, but when I read things like "The few core developers they have do not work with modern tools like git and collaboration tools like Github", I wonder if the collaboration of someone who refuses to send a patch to a mailing list (because it is not what they are used to and they don't care to learn how) is really worth considering. I mean: someone who is not willing to move a few millimeters out of their comfort zone to make a contribution is, very likely, someone who has very little commitment or will try to force their opinions and methods onto others.
The irony is that SQLite uses Fossil, which is more modern than git.
But really, I agree. The elephant in the room is that any time someone uses the term "old" or "not modern enough" or "legacy", it means they have a system they don't understand that they want to get rid of. Software does not "wear out".
It does but you have to go through the maintainers and they have to be in line with the core principles of SQLite and have the necessary code quality etc.
I.e. it's hard to the point that you can just say it's impossible for most people.
But what the author of the article fails to mention is that many of the things libsql wants to add to sqlite are in direct conflict with the core principles of sqlite.
E.g. SQLite: max portability by depending on an _extremely_ small set of C standard library functions. libSQL: let's add io_uring, Linux-specific functionality more complex than all the C standard functions SQLite depends on combined.
E.g. SQLite: strongly focused on simplicity and avoidance of race conditions, achieved with serialized & snapshot isolation and no fork-the-world semantics (i.e. a globally exclusive write lock). libSQL: let's make it distributed (which is fundamentally in conflict with that transaction model, if you want it to work well).
E.g. SQLite: small, compact code base. libSQL: let's include a WASM runtime (which is also in conflict with max portability and with simplicity/project focus).
> It does but you have to go through the maintainers and they have to be in line with the core principles of SQLite and have the necessary code quality etc.
Even so, I think they’ll prefer to rewrite the contribution. They need to be absolutely sure not to incorporate any copyright encumbered code by mistake.
> The funny thing about this comment is that SQLite is as close to the gold standard of software quality that we have in the open source world. SQLite is the only program that I've ever used that reliably gets better with every release and never regresses. Could it be that this is precisely because they don't use "modern tools" and accept outside contributions?
Reminds me of OpenBSD, who still primarily uses CVS for source control.
But SQLite IS using modern tools; you could say their VCS is more modern than git. It is just not compatible with a lot of git workflows, because it is focused on a workflow that doesn't map well onto git.
It is also following modern best practices like:
- use the best tool for the job (i.e. not git or GitHub)
- consider upfront how the project can be maintained long-term (i.e. realize that you don't have the resources to manage/moderate a public issue tracker/PRs, and that you don't want to delegate this work to 3rd parties you barely know)
- keep things simple, e.g. a globally exclusive write lock and serialized isolation level (instead of subtle race conditions and/or fork-the-world handling etc.)
- test a lot, use fuzzing etc.
- limit features to your targeted use-cases to keep complexity in check (maintainability, bug avoidance)
- opinionated code style, formatting
- clear cut well defined dev/contribution flow (for the few which can contribute directly)
I.e. if we ignore superficial "modern best practices" like "use exactly this tool", I don't know which modern best practice it does not fulfill(*). Though some are not fulfilled in the way people are used to.
(*): OK, maybe they don't keep to "prefer languages with more guardrails as far as possible". Though due to their targeted compatibility/portability, C is kinda the only option.
This makes it sound like SQLite isn't using source control at all. What they are actually doing is using a more obscure source control program than git. Honestly, who cares? Source control is source control.
You're conflating two different arguments: not using modern tools, and not accepting outside contributions. It's certainly possible that limiting contributions to a set of trusted contributors helps things move smoothly.
However it's not clear at all that using old tools has the same effect.
Also, Fossil is essentially implemented _in_ SQLite. Fossil is used to develop SQLite which is used to implement Fossil. It's a virtuous cycle. For the SQLite project, using Fossil is obviously superior to git. This doesn't mean that arbitrary projects should use Fossil over git.
The opposite can be said too, though: git won somewhat arbitrarily over a bunch of other DVCS systems (e.g. hg), mainly because of bandwagon effects and marketing.
I have some of my great-grandpa's carpentry tools, and I use them often. I guess I should go out and replace perfectly good tools with new stuff from a big store like Home Depot or Lowe's?
People forget that Stanley #4 plane hasn't really changed in over 100 years. It's still one of the best tools out there.
We dumped CVS because it was a poor tool, for the time. Subversion was better. Then completely distributed systems became better, because connectivity and computational power came about.
If you want to say something, say it. "I don't like that my contributions aren't accepted, so I'm forking the codebase". That's going to generate some discussion, but it's fine. Public ___domain and all.
However,
"look what happened to qemu, same thing will happen to sqlite (and I'm contributing to the problem and forking it)". One can't say "no contributions led to fragmentation" while _at the same time_ contributing to fragmentation by making a hard fork!
Also,
> "However, edge computing also means that your code will be running in many geographical locations, as close as possible to the user for the better possible latency. Which means that the data that hits a single SQLite instance now needs to be replicated to all the others."
Then replicate away. Leave that stuff out of SQLite. If that's a really important use-case, go use couchdb or something similar.
They are annoyed by people misrepresenting facts in a subtly manipulative way to make their fork, and the reasons around it, look like something it isn't.
What the article intentionally or unintentionally conveys in its tone and wording is: "SQLite is badly outdated beyond saving and needs to be replaced".
What is actually the case: "we want something like SQLite but with some core changes incompatible with SQLite's core principles, which should be API-compatible enough to allow reusing a lot of DB tooling".
There is something distasteful about this announcement but I can’t quite pinpoint it. Maybe it feels like a bait and switch announcing their own fork after the whole qemu commentary, or the wording about the code of conduct. I don’t know.
One of the greatest things about SQLite is how easy it is to embed in random targets/build systems/languages: a .c and a .h and you are all set. Moving away from that model will turn many people away, so I hope they retain it.
Trust your gut. It's a public shaming campaign disguised as a history lesson. Glauber Costa is trying to seem welcoming in one breath and, in another breath, is sniping at the same people he claims to want to join.
> We [Glauber Costa] take our code of conduct seriously, and unlike SQLite, we do not substitute it with an unclear alternative. We strive to foster a community that values diversity, equity, and inclusion. We encourage others to speak up if they feel uncomfortable.
With zero new code to justify this fork, this article is little more than a silly power trip by a flailing startup.
They do have clear goals of what to add, and their goals make some sense for some use-cases. But pretty much all of them directly conflict with at least one, and sometimes multiple, SQLite core principles...
What the author wants, as far as I can tell, is an embedded distributed (edge) database with some C-API and SQL compatibility with SQLite, so that you can use it with existing tooling (e.g. by linking against it instead of sqlite), but not necessarily as a drop-in replacement (different transaction semantics), and which doesn't need the same degree of portability as SQLite.
Though that is not quite what the author ends up communicating in the article. I wonder to what degree the formulations are intentionally manipulative versus accidentally badly worded.
Regardless of intention, the inability of the author to adequately explain why a fork was necessary, apart from some hand wavy arguments that some other solutions were inadequate, does not inspire confidence. I find it very hard to be generous with the author given the way that he paints himself as a visionary while casually dismissing the work and perspective of the people who actually did the real visionary work of seeding the technologies that he has hitched his wagon to.
The README has "Use Rust for new features" in it, so I doubt it will retain the same simplicity.
As much as I like Rust, and despite mixing Rust and C++ being the clear path forward for Mozilla, I'm not so sure it's the winning approach here. Part of the beauty of SQLite is the single .c/.h thing.
That said, I can see that maybe they're trying to expand the use cases of libsql compared to SQLite. That seems to be the whole idea with adding support for e.g. distributed databases, which is something SQLite just doesn't bother with at all, and would introduce a ton of external interfacing regarding networking. SQLite uses only the standard C library, and even then barely scratches its surface.
Also, SQLite's VFS API can do a lot there already. For example, I remember seeing SQLite compiled to WASM, using a VFS that downloads a remote database on S3 using HTTP Range requests. (I don't think it supported writing to the database, but it was still a really cool way of allowing complex client-side querying of a dataset that is static but too big to transfer).
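The core of that range-request trick is that a read-only VFS only ever needs "give me N bytes at offset O", and SQLite's file format has a fixed layout that makes such reads meaningful. A minimal sketch (local file reads standing in for hypothetical HTTP Range requests; not any specific project's code) that fetches and parses the 100-byte database header this way:

```python
import os
import sqlite3
import struct
import tempfile

# Create a small on-disk database to stand in for the remote file on S3.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
con = sqlite3.connect(path)
con.execute("CREATE TABLE t(x)")
con.commit()
con.close()

def read_range(path, start, length):
    # Stand-in for an HTTP Range request, i.e.
    # "Range: bytes=start-(start+length-1)" against a remote object.
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(length)

# The database header is the first 100 bytes; the page size is a
# big-endian u16 at offset 16 (the special value 1 means 65536).
header = read_range(path, 0, 100)
assert header[:16] == b"SQLite format 3\x00"
page_size = struct.unpack(">H", header[16:18])[0]
print(page_size)
```

Once the page size is known, a VFS can translate any page access into one such ranged read, which is why the static-database-over-HTTP approach works at all.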
> Part of the beauty of SQLite is the single .c/.h thing.
Noting that it's not _developed_ that way. Its many files are combined by the build process to produce the amalgamation build. In a mixed-language project (Rust for some parts, C for others) a single-file distribution literally won't be possible, and it will require as many toolchains to build as there are languages involved.
I wonder if mrustc [0] would be sufficient to retain the amalgamated build even if Rust were adopted. The regular Rust tool chain would be needed for development still, but if simply depending on the library the Rust components could be transpiled to C…
> Also, SQLite's VFS API can do a lot there already. For example, I remember seeing SQLite compiled to WASM, using a VFS that downloads a remote database on S3 using HTTP Range requests. (I don't think it supported writing to the database, but it was still a really cool way of allowing complex client-side querying of a dataset that is static but too big to transfer).
It also feels gross to me, but the only thing I can put my finger on is citing webshit as the reason to completely change the direction of a well-loved project.
Yup, but more likely their changes will make it unusable for a bunch of SQLite use-cases. So if it catches on, it will probably exist in parallel to SQLite, and once it's realized that it won't replace SQLite, it should increasingly diverge by focusing on its core target audience.
> There is something distasteful about this announcement but I can’t quite pinpoint it.
I think it's a general lack of gratefulness. The author considers sqlite as something to be taken without even saying thank you to drh. All he has to say is, that's a nice project you have there, it would be shame if something happened to it. Like to qemu.
> The author considers sqlite as something to be taken without even saying thank you to drh.
Well, it is in the public ___domain. Respecting and thanking people is of course a very nice thing to do but the fact is there are zero restrictions on what can be done with this software.
> that's a nice project you have there, it would be shame if something happened to it
That's always possible in all free and open source software development. Anyone can show up and just start working harder than whoever's currently in charge. Corporations can show up with paid full time developers and completely displace a project's leadership. Eventually the fork will accumulate so many improvements it will become the de facto upstream.
It remains to be seen if this is what will happen to SQLite and this new libSQL. Who knows, right? I don't expect SQLite to go anywhere though.
Just a note that LiteFS isn't a distributed filesystem; despite the name, it's not a filesystem at all, but rather just a filesystem proxy. It does essentially the same thing that the VFS layer in SQLite itself does, but it does it with a FUSE filesystem, so you don't have to change SQLite's configuration in your application. As for the "distributed" part of it: LiteFS has a single-writer multi-reader architecture; it's the same strategy you'd use to scale out Postgres.
It's a little ironic to see LiteFS brought up in relation to a SQLite fork, since the premise of LiteFS is not changing the SQLite code you link into your application. Much of the work in LiteFS is about cooperating with standard SQLite.
At any rate: it seems somewhat unlikely that a hard fork of SQLite is going to succeed, in that part of the reason so many teams are building SQLite tooling is the trust they have in the current SQLite team.
>LiteFS has a single-writer multi-reader architecture; it's the same strategy you'd use to scale out Postgres
I'm curious how such a strategy deals with applications that update a value and then read the updated value back to the user, since there might be a replication delay between the write (which goes to the primary) and the read (which comes from the closest replica). Do you make optimistic updates on the client, or do you bundle (write-then-read) operations into a transaction?
LiteFS author here. It depends on the application. You can check the replication position on the replicas and simply wait until the replica catches up. That requires maintaining that position on the client though.
An easier approach that works for a lot of apps is to simply have the client read from the primary for a period of time (e.g. 5 seconds) after they issue a write. That's easy to implement with a cookie and it's typically "good enough" consistency for many read-heavy applications.
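That cookie scheme reduces to a tiny routing decision. A minimal sketch (the names and the 5-second window are illustrative assumptions, not LiteFS API; the timestamp would come from a cookie set when the client last wrote):

```python
PRIMARY, REPLICA = "primary", "replica"
STICKY_SECONDS = 5.0  # hypothetical stickiness window after a write

def choose_backend(now, last_write_ts):
    """Route reads to the primary for a short window after a write.

    last_write_ts is the client's last write time (from a cookie);
    None means the client has not written recently.
    """
    if last_write_ts is not None and now - last_write_ts < STICKY_SECONDS:
        return PRIMARY  # replica may not have caught up yet
    return REPLICA      # stale-read risk has passed; use the nearby replica

t0 = 1000.0
assert choose_backend(t0 + 1.0, t0) == PRIMARY   # just wrote: read your own write
assert choose_backend(t0 + 10.0, t0) == REPLICA  # window expired
assert choose_backend(t0, None) == REPLICA       # never wrote: replica is fine
```

The design tradeoff is that this gives probabilistic rather than guaranteed read-your-writes consistency: it assumes replication lag stays under the window, which is "good enough" for many read-heavy applications.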
This is an application (and frequently data type) dependent decision. Some data is safe to return on acceptance others need to wait for acknowledged writes.
The most naive solution is to just make all writes slow but acknowledged.
Has sqlite been forked before? This is the first true fork and re-license attempt that's caught my eye. The others I've seen are "ports" and "modifications", but always pointing people back upstream.
It's possible that the appetite for a SQLite fork is there, but nobody has provided it.
I do remember the author of LMDB [0] porting some parts of sqlite to use lmdb instead, and then talking about the results. A quick googling doesn't seem to give me a result though.
rqlite[1] author here. I wouldn't consider rqlite a fork in any sense, just so we're clear. That rqlite uses plain vanilla SQLite is one of its key advantages IMHO. Users have no concerns they're not running real SQLite source.
That said, there are some things that would be much easier to do with some changes to the SQLite source. But I think the message that rqlite sits on top of pure SQLite makes is still the right choice.
> LiteFS has a single-writer multi-reader architecture;
Note that SQLite transactions are also fundamentally single-writer, multiple-reader (serialized isolation level with snapshot isolation and no fork-the-world handling).
So if you want to make SQLite distributed, you will end up with not just a single writable replica but a global write log at the transaction level, which is very, very bad for performance in many use cases. (E.g. in PostgreSQL, if you use serializable & snapshot transaction isolation, it's "forking the world" for parallel write transactions, and if multiple transactions have no overlap, they can complete in parallel without a problem (in parallel, but on the same single write-enabled replica). This is good enough for quite a few applications and can still have decent throughput.)
As far as I can tell, many use-cases which could profit from a distributed SQLite would not really like a per-transaction global lock, but would be okay with something like what Postgres does. Though there are always some exceptions.
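The single-writer behavior is easy to observe from Python's sqlite3 module. A sketch: while one connection holds a write transaction, a second would-be writer is refused (timeout=0 makes it fail immediately with SQLITE_BUSY instead of waiting):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

w1 = sqlite3.connect(path, timeout=0)  # timeout=0: fail instead of retrying
w2 = sqlite3.connect(path, timeout=0)
w1.execute("CREATE TABLE t(x)")
w1.commit()

# First writer takes the database-wide write lock.
w1.execute("BEGIN IMMEDIATE")
w1.execute("INSERT INTO t VALUES (1)")

# A second concurrent writer is refused: one writer at a time, globally.
try:
    w2.execute("BEGIN IMMEDIATE")
    blocked = False
except sqlite3.OperationalError:  # "database is locked" (SQLITE_BUSY)
    blocked = True
print(blocked)  # True

w1.commit()
w2.execute("BEGIN IMMEDIATE")  # after the commit, the lock is free again
w2.rollback()
```

Readers are unaffected while the write transaction is open (especially in WAL mode); it's only the second writer that has to wait, which is exactly the property a distributed version would have to relax or globalize.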
SQLite only works as a concept because it is not networked. Nobody truly understands the vast and unsolvable problem that is random shit going wrong within the communication of an application over vast distances. SQLite works great because it rejects the dogma that having one piece of software deal with all of that shit is in any way a good idea.
Back your dinky microservice with SQLite, run multiple copies, have them talk to each other and fumble about trying to get consensus over the data they contain in a very loose way. That will be much, much less difficult than managing a distributed decentralized database (I speak from experience). It's good enough for 90% of cases.
Remember P2P applications? That was basically the same thing. A single process running on thousands of computers with their own independent storage, shuffling around information about other nodes and advertising searches until two nodes "found each other" and shared their data (aw, love at first byte!). It's not great, but it works, and is a lot less trouble than a real distributed database.
Amen. The first rule of distributed objects is "don't distribute your objects". It's much easier to reason about a bunch of different actors, each with their own copies of objects, trying to converge on consensus, than it is to actually have a distributed database that obeys ACID and does all the databasey things.
A lot of words: a manifesto decrying SQLite as somehow stifling innovation, and a battle-cry for all to join... but backed with very little code.
SQLite is public ___domain. Anyone is free to claim it as their work and do anything they wish.
No need to disparage. You never were prevented from doing what you wanted to do. Take the code, improve it and if it’s any good it may get recognition.
The fact is sqlite is good enough for 99.9% case. Everything else is fighting for the niche and the poster seems to be simply pissed that sqlite has the mindshare even without having their favorite features.
> We are strong believers in open source that is also open to community contributions. If and when SQLite changes its policy to accept contributions, we will gladly merge our work back into the core product and continue in that space.
If libSQL is going Apache-2.0¹ rather than public ___domain, that seems extremely unlikely. The public ___domain nature of SQLite is rather important for its deployments. And in fact licensing is a rather important part of why SQLite is closed to contributions (though with some administrative overhead it could be opened to contributions from people from some countries). The fact that this announcement and project documentation seems to make absolutely no mention of the licensing situation perplexes me.
—⁂—
¹ As they state in the text; but https://github.com/libsql/libsql/commit/f7c54b8f792aa502f025... is entirely insufficient, not constituting application of the license. I find this a bad sign, in stark contrast to the meticulous care SQLite has taken to copyright matters.
That is VERY misleading, and intentionally so; it doesn't put the author in a good light.
Sure, SQLite isn't open for contribution, but their tooling (VCS, issue tracking) is in no way "less modern" than git+GitHub; it is just focused on a different approach.
Putting a widely used open-source but not open-contribution project on GitHub is a nightmare. Similarly, if you have no open contribution and only a small trusted team, you can use change flows not viable otherwise, and their VCS is designed for such flows, for which git has quite a bit of unnecessary overhead and footguns.
Lastly, for many projects, choosing no open contribution is a very sane approach. Properly maintaining contributions for a widely used project is a lot of work, basically forcing you to delegate a lot of it to people you hardly know. If you don't have time for this, or don't feel good trusting people you hardly know, then open contribution can easily become a nightmare. The fact that, AFAIK, you have to be very disciplined to write C doesn't make this better.
Putting this aside, another thing I found off-putting in the article is that the author fails to mention that basically all of the things he wants to have in SQLite are in direct conflict with core design principles of SQLite. So they need a fork anyway, independent of whether or not SQLite accepts contributions; their changes wouldn't be accepted either way...
Though without question, for their use cases a new database is needed which is similar to SQLite but also, in subtle yet very fundamental ways, very different. Starting it with a fork doesn't seem a bad idea. But they really should not say or imply it's SQLite. It's not SQLite; it's just very similar and was once forked off SQLite.
By all means, fork the code base for your specific use case, but I'll trust drh and his team over some rando any day.
sqlite isn't the most widely used database by accident - it's installed on practically every mobile phone on earth. It's the result of careful and deliberate design with millions of tests and fuzzing. The sqlite team is very responsive to its users' needs, but some features just don't make the cut. That's a good thing - to keep the library small, fast and reliable.
The headline is needlessly deceiving. The author is making the case for, and subsequently announcing, a hard fork of SQLite for the purposes of meeting the needs of edge computing.
That's the way it looks to me. All of the SQLite code is already given away for free. What more is there to give? The trademark (owned by Hwaci according to the SQLite Consortium Agreement.) He wants to use politically charged shame tactics to get his hands on not just the SQLite code but the SQLite brand as well.
Why does he want the trademark? Because he knows that his complaints are fringe and few people will give a damn about his fork, unless he has the trademark and can call his fork the official SQLite.
Your manifesto and accompanying blog post lay out your desire for your fork to be the official SQLite. That is contrary to the inclinations of the SQLite developers, who don't accept outside contributions, so you are trying to coerce a change to their policy using public shaming and call-out tactics.
So you claim you don't want the trademark signed over to you, but you do want access to the trademark opened up to your fork.
Eh, u/glommer is calling it libSQL, not anything with "SQLite" in the name, so I wouldn't assume that u/glommer wants the trademark. But u/glommer does want some of SQLite's reputation to attach to libSQL -- that's going to be an uphill battle.
> u/glommer is calling it libSQL, not anything with "SQLite" in the name
The blog post makes clear that being a fork is "a losing proposition in the long term." Because SQLite developers own the trademark to SQLite, the only way for this fork to not be a fork is if SQLite gives in to the demands of the manifesto.
I don't think it's because of the trademark that forking it is a losing proposition. It's because of:
- the proprietary test suite
- the SQLite team's reputation
- SQLite's reputation (which relates to the test suite)
- the SQLite Consortium, which funds the SQLite team
- funding
The trademark is the least of these. I don't find TFA at all compelling, and I find it distasteful, but I wouldn't impute on the author(s) that they want the SQLite trademark, especially given that they didn't even bring it up.
If it were really about the proprietary test suite and fuzzer, then the manifesto should simply demand that SQLite developers release those proprietary tools. Instead they're demanding that SQLite accept a change of contributor policy and correspondingly change their code of conduct. They're wrestling for control of the brand, not merely some tools.
And yeah maybe they're after the money the SQLite Consortium gets too. But I think that's downstream from SQLite's brand recognition.
That is a fair take, and now I agree. I suspect that u/glommer et al. hadn't understood the importance of that test suite, else they would have mentioned it. To me, libSQL seems like pie in the sky. They're making demands and threatening a fork, but without any evidence of sufficient resources, the threats are empty and the demands will go unmet.
Things like rqlite are great and all, but I'm not sure if that's something that should be included in SQLite no matter which development model it uses.
Looking at the libSQL page, it seems they want SQLite to go in quite a different direction in general. That's all fine, but at that point you're working on other problems than what SQLite is intended to solve. People want software to solve every possible problem under the sun, but SQLite is a project with a fairly specific and narrow scope, which is intentional.
The "trick" is to enable things to be extended and patched easily, if need be, so people can build extensions and derivate projects if they want. I don't know to what degree SQLite allows this – I'm not familiar with the SQLite source code.
At any rate, for this to succeed you need at least one person actually writing code for it. Thus far, all I see in the repo are some non-code changes (added license, CoC, Makefile tweaks, etc.). I don't know what the exact plans are, but you need more than a GitHub repo saying "we accept contributions" while waiting for people to submit them, because most people/companies won't. Almost all projects live or die by their core contributors, not the community.
There are a bunch of others too. And yeah, I agree with you that if things could be done via extensions rather than forking SQLite... that would be better. :)
SQLite's maintainers have a choice. They can spend their time coding, or they can spend their time explaining, over and over again, why they don't want to do a distributed database.
They have changed their minds on features before. E.g., FULL OUTER JOIN. They should and almost certainly do feel free to change their minds on features relating to distributed DBs.
This smells like "I don't like how SQLite is run because I can't get into the party, so I'm making my own party", which is fine, but don't denigrate the product or the contributors who have been doing an amazing job for a very long time.
Just looking through the libsql repository on GitHub, turns out they've changed literally zero actual code. All they've done is added a README, some long-winded 'code of conduct' document, and a configuration file for CI builds. That's it.
What is even the point of announcing this? There's nothing new, nothing different. Just empty promises and hot air.
yes, we haven't changed any code yet. There are essentially two approaches here: one of them is to present a finished thing, and the other is to announce your intentions as soon as possible and build every single thing in public.
We chose the latter in this case, as we think community is the most important aspect of this.
but you are right to be suspicious! recommend checking back later
No code changes and inflammatory/defamatory blog posts aren’t a good way to build a community. If you had shown some code that solves a real problem SQLite has, then people would take you seriously. But if all your “contributions” consist of ideas that 99% of SQLite users won’t need, and a few text files, your project doesn’t look too serious to me.
SQLite is probably not compatible with community contributions. They seem to be very optimized to the stability and reliability of SQLite rather than features. Which is why people use it over other solutions.
> The few core developers they have do not work with modern tools like git and collaboration tools like Github, and don’t accept contributions, although they may or may not accept your suggestion for a new feature request.
Git is not modern. It is many executables, and Git repos are many files.
GitHub is not modern. It is a huge, closed-source RoR application.
Fossil SCM has the features of Git and GitHub, much improved, in a small, portable, fast executable, storing repos in much less space in a single SQLite file.
The idea that "virtualization" began with Xen in 2004 is rather difficult to read as an early VMware employee. Before QEMU independently discovered it, VMware was JIT'ing unrestricted x86 to a safe x86 subset from 1999 on[1]. Hardware support for trap-and-emulate virtualization came to the market in the early 'aughts, after VMware had proven the market demand for it.
Whether or not this succeeds, I think it's a great effort. SQLite not taking any outside contributions is of course their prerogative. But it would also be cool to see what could happen with a more open development model. And their (libsql) plans around io_uring and Rust for future code both sound like a good start.
The way they're going about this fork (described in the repo [0] readme) seems healthy enough for both projects as well.
Maybe the biggest challenge though is recreating SQLite's private/proprietary test suite [1].
SQLite's proprietary test suite is its secret sauce that has kept it closed to contributions and kept forks from happening. It is the thing that made the SQLite Consortium a going business proposition. It is the thing that makes it possible to fund an open source infrastructure project with a small, cohesive team.
The public believes that that test suite exists and has 100% branch coverage, and that it is much larger and more complete than the public test suite. Of course, there's no proof that the private test suite exists, but we the public believe it does, and we have plenty of reason to believe that it does.
Any fork will instantly lose the benefits of that private test suite. This is what keeps the SQLite team able to reject external contributions, and what keeps forks from taking hold.
Any hard fork will have a struggle with this.
I believe a Rust re-write would have much less trouble w.r.t. the private test suite, owing to Rust's memory safety. But any fork that remains C-coded will have trouble getting public acceptance, and will have to have amazing features -or a new public test suite- to get acceptance.
Meanwhile the SQLite team could respond by making SQLite3 a bit more modular and able to be used in distributed database constructions w/o alteration to any of the SQLite3 code. That would take the wind out of the sails of any fork. This possibility means that any forks need to be properly funded to be able to compete, but the SQLite Consortium is almost certainly well-funded, so it will take a serious commitment -maybe even by some of the consortium's members- to see a fork succeed.
The private suite is about (among other things) MC/DC, i.e. extremely stringent test-coverage constraints at the binary level. What does this have to do with Rust vs C?
Yeah, the test suite seems pretty key. It also probably contains information about the proprietary extensions and contract work from clients and can likely never be released.
A proprietary test suite is a pretty interesting strategy, actually. It's very difficult for others to just steal your code and support effort and run with it if they have to start from zero to build a massive test suite.
I wonder how difficult it would be for some sort of tool to assume a particular original sqlite.c has full coverage and then suggest where additional tests are needed for patches?
> But it would also be cool to see what could happen with a more open development model.
It seems a little disingenuous to act like we don't know what would happen, at least in broad strokes. Just compare and contrast the nature of SQLite with "more open" projects and you can get the gist.
More features, more bugs, abandonment of excellent testing standards, poor handling of what are dismissed by the developers (but not the current, extremely wide base of users) as niche concerns.
It would be quite an uphill battle for this project to succeed in the longterm. But I don't understand what's disingenuous about hoping for innovation by a change in process.
> We encourage others to speak up if they feel uncomfortable.
OK - your weird swipe at Sqlite tentatively tells me you're founding with a grudge, which (a) doesn't tend to last very long as a motivation and (b) has nothing to do with me, so I'm not interested at all.
So I don't have a dog in this fight, I'll check back in a few years and see how it went. Happy forking!
This whole thing reads like a slighted, uppity, smarter-than-thou* child taking their ball and going home. It's off-putting.
*what's the point of pointing out other "geniuses": to encourage the reader to think "oh well then our author must be in the same group too!"? Transparently manipulative.
I wonder if they'll attempt to replicate the portion of SQLite's test harnesses [0] that are proprietary but nevertheless beneficial to all SQLite users, tests comprising over a million lines of C code.
Those tests are possibly among what the blog author does not consider to be "modern tools," which I don't even...
SQLite and QEMU have been successful in large part because of their limited scope. Further expanding the scope of a project puts it at risk of becoming unfocused and unmaintainable. It makes a lot more sense to start a new project to support this use case than to try to reshape SQLite.
I suspect there's a strong correlation in this case between "closed to outside contribution" and SQLite being one of, if not the, most robust pieces of software out there.
> Which means that the data that hits a single SQLite instance now needs to be replicated to all the others.
My initial reaction:
Yeah because ... SQLite is not a database server - it's an awesome API on top of a flat file? Do you want a replicated database? Then use a replicated database.
Then looking at the list of solutions (rqlite, BedrockdB, dqlite, ChiselStore, LiteFS) ... I guess people really do want to make it replicated.
Why not use a SQL server?
Also: I was using QEMU on an old 32-bit Dell server that predated Intel's hardware virtualization efforts. It worked decently. :)
On one hand, SQLite would probably be a more expansive project if it allowed contributions in the same way PostgreSQL does. On the other hand, it would likely make it both a larger and more complex project.
I think a better tradeoff would be making the case for why SQLite should have the hooks needed to implement the edge computing bits desired without forking the project itself or attempting to replace what has made it so great for so many years.
This approach would allow the extra behaviors to be managed out of tree and allow more free experimentation as a result without prematurely landing on "blessed" approaches to these very complex and frankly not yet well solved problems.
If someone cares about their project, and chooses to share it with the community... no one should feel entitled to pressure them into another development model...
I find some of SQLite's library could be cleaner, but appreciate the philosophy of keeping it consistent, understandable, and simple... It is nice to have something reliable completed in an afternoon. =)
Why does it even need to be forked? The whole point of using SQLite as a foundation for distributed, edge, or other stuff is that it's a solid foundation that can be trusted to not change too much.
Why the need to open SQLite to contributions? Build your fancy distributed, or edge database on top of SQLite and you won't need to modify SQLite at all.
> The whole point of using SQLite as a foundation for distributed, edge, or other stuff is that it's a solid foundation that can be trusted to not change too much.
Just to clarify: sqlite3 changes almost literally every day[^1]. However, the project has always placed a premium on backwards compatibility and the developers go way out of their way not to break in-the-wild applications. Given how many databases there are ("billions and billions"), even the slightest backwards incompatibility is likely to affect _someone_, and even 1% of "billions and billions" is a significant number of databases.
Just a nitpick: Virtualization was pioneered by IBM S/370 mainframes, CP/CMS and the like. Author is talking about virtualization on desktop/x86 specifically.
Unfortunately, for historical reasons (microcomputer revolution), these are separate worlds.
From a spectator's perspective it's going to be interesting to see how this all works out. SQLite's dedication to doing one thing, doing it well and staying incredibly small means it's basically my go-to example for phenomenal software in 2022. But at the same time maybe a more open model means we'll end up with an embeddable DB that's capable of even more wonderful things. I look forward to finding out.
All of the projects listed are clearly outside the scope of what would potentially be merged into sqlite even if it accepted contributions.
That said there is probably room for someone to maintain a sqlite extension distribution with standard build settings for use across different languages or something.
Can any of the contributors to ChiselStore [1], mentioned in the article, elaborate on the comment made about it? What's the big deal about not fitting well with SQLx? This isn't a requirement for such a library to be successful. What other problems is it allegedly suffering from? It could be more of a problem of perspective.
I don't want to distract too much from the intended point of this, but something to keep in mind per QEMU and virtual machines - SQLite has its own VM too.
In fact, by binding application-defined functions to your SQLite connection, you can realize your own virtualized application execution environment.
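As a concrete sketch of that idea: SQLite's query engine compiles SQL to bytecode for its internal virtual machine, and application-defined functions let that VM call back into your host language mid-query. A minimal example using Python's stdlib `sqlite3` bindings (the function name `reverse` is just an illustration):

```python
import sqlite3

# Application-defined functions are registered per connection.
conn = sqlite3.connect(":memory:")

def reverse_text(s):
    # Plain host-language code, callable from SQL by name.
    return s[::-1]

# Arguments: SQL function name, number of arguments, Python callable.
conn.create_function("reverse", 1, reverse_text)

# SQLite's bytecode VM dispatches into reverse_text while evaluating the query.
result = conn.execute("SELECT reverse('sqlite')").fetchone()[0]
print(result)  # etilqs
```

The C API equivalent is `sqlite3_create_function()`; either way, the query planner treats your code as just another scalar function.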
> This article's history of "virtualization" begins with paravirt in Xen in 2004. Then, after hardware support emerged, v12n started working with unmodified OSes. No part of this is true. VMware was running binary OSes without hardware support in 1999.
> We did this by JIT'ing unrestricted x86 to a safe x86 subset. User-level code, with few exceptions, was directly executed, but kernel-level code ran through a little JIT. We described it in https://www.vmware.com/pdf/asplos235_adams.pdf
> It performs pretty well, as described above. The only problem is that tiny JITs that run in kernel mode and use x86 segmentation to hide themselves in the top 4MB of linear memory are ... hard to write. And VMware was making a lot of money.
> Xen invented paravirt; AMD invented SVM; and Intel invented VT in parallel all around 2003-2004, not to make virtualization on x86 practical, but to make it easy enough to disrupt VMware's emerging monopoly.
> As a VC focused on companies exploiting a tech advantage, VMware remains my favorite case study. For a time, they could do something that was undeniably valuable, that nobody else could do. And, moats like that are always time-limited; their value invites their destruction.
What they are missing here is that QEMU only emulating, and not virtualizing, was a far bigger limitation, and no counterpart of that limitation exists in SQLite.
SQLite delivers on the in-process embedded database promise, and delivers it well. Sure, you can have a column-oriented flavour like DuckDB, but those projects don't call themselves SQLite or a fork of it.
Lastly, as many have pointed out, SQLite is not even licensed (Apache, MIT, etc.); it is just in the public ___domain, so anything is possible.
I'm working on mvsqlite [1], a distributed SQLite based on FoundationDB. When doing the VFS integration I have always wanted to patch SQLite itself, but didn't because of uncertainty around correctness of the patched version...
A few features on my wishlist:
1. Asynchronous I/O. mvsqlite is currently doing its own prefetch prediction that is not very accurate. I assume higher layers in SQLite have more information that can help with better prediction.
2. Custom page allocator. SQLite internally uses a linked list to manage database pages - this causes contention on any two transactions that both allocate or free pages.
3. Random ROWID, without the `max(int64)` row trick. Sequentially increasing ROWIDs are a primary source of contention, and cause significant INSERT slowdown in my benchmark [2].
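For readers unfamiliar with the trick being referenced: once a table contains a row with the maximum 64-bit ROWID, SQLite abandons its usual max+1 assignment and starts picking unused ROWIDs at random. A minimal sketch using Python's stdlib `sqlite3` module (the table and column names are just for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")

# Plant a sentinel row at the maximum 64-bit ROWID (2^63 - 1). With the max
# ROWID taken, SQLite can no longer assign max+1 and instead probes random
# unused positive ROWIDs for subsequent inserts.
conn.execute("INSERT INTO t (id, v) VALUES (9223372036854775807, 'sentinel')")

# This insert now receives an effectively random ROWID rather than a
# sequentially increasing one.
cur = conn.execute("INSERT INTO t (v) VALUES ('hello')")
print(cur.lastrowid)
```

Note this only applies to tables without `AUTOINCREMENT`; with `AUTOINCREMENT`, hitting the maximum ROWID makes further inserts fail with `SQLITE_FULL` instead.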
I believe the reason this post rubs so many the wrong way is that the author keeps praising his "opponents", trying to sound positive while being inherently negative. It's an ugly way of arguing, as it forces the other side to be the negative one, when in fact the author is the one who's trying to bring something down. In the end the author just sounds dishonest, like a politician with a smile plastered over their face all the time.
I believe there are two ways this post could have been written which would have given a better impression of the author and his intentions. Either openly criticize SQLite, with well-formulated arguments, or don’t criticize SQLite and just announce that you’re doing something new (building on top of SQLite is not at all uncommon after all).
Frankly, as someone more interested in emulation than virtualization, I have been occasionally unhappy that most of the energy behind QEMU goes to virtualization. Some of the things QEMU has done to better adapt to virtualization do not translate well to the emulation use case.
This is not true, for example a lot of the work to enable concurrent emulation of multiple CPUs started on the virtualization side. It is also mentioned in the article that both emulation and virtualization benefited from the innovation.
They were initially developed in parallel. To the very best of my recollection, Richard had no knowledge of git when he started fossil. Fossil's initial development dates back to 2005, IIRC, (edit: maybe 2006) initially as a TCL prototype. It wasn't developed in earnest in C until 2007 and was self-hosting by the end of 2007 (i discovered it around Christmas that year).
Listening to the SQLite creator's podcast (https://corecursive.com/066-sqlite-with-richard-hipp/# ), it does feel that Glauber is right about weird collaboration standards. The guy is against Gmail, git, etc. Fossil may be better for SQLite today, but it leaves the project without many contributors, and that's the problem Glauber is trying to solve.
Choosing to license the fork under Apache-2.0 ensures that this will not happen (via this fork, at least), as the license is considered incompatible with the kernel's GPL-2.0-only license.
> LiteFS: [...] I personally think that if distributed filesystems were easy, we’d have a good one by now.
The part that makes that "easy" for LiteFS is that it's a single-writer system. That's a pretty rare trade-off in general purpose distributed filesystems.
I don't like how the 'manifesto' presents SQLite's code of ethics as nefarious:
> We take our code of conduct seriously, and unlike SQLite, we do not substitute it with an unclear alternative. We strive to foster a community that values diversity, equity, and inclusion. We encourage others to speak up if they feel uncomfortable.
It seems in bad faith to imply that SQLite is some sort of toxic, anti-diversity space. It's not. it's just a closed-to-contributions open source project. Richard is pretty friendly guy and I'm not certain if efforts have been made to reach out to him here.
One of the things I think is really innovative [1] about SQLite's code of conduct is that it's a one-way covenant. The developers are agreeing to act and engage with the community in a certain way without expecting anything in return or enforcing any of their beliefs on others. The rules are also very clearly defined, even the ones I personally don't agree with.
I think the new "Contributor Covenant" is a step backwards. It's pages and pages of text to say "don't be a jerk", and the rules are so poorly defined that it isn't much more precise than a one liner saying "don't be a jerk". There's lots of catch-alls like "... And any other unprofessional behavior is against the rules". You could stretch this to cover almost anything. Hell, I've worked in professional settings where wearing shorts was verboten, but I don't think open source projects should be policing how participants dress except in the most extreme circumstances (like if someone shows up to a convention naked when it isn't pre-planned as being a naturist event)
I find it confounding that the new covenant is described as being less vague when it relies on poorly defined terms like "community spaces" and "community leaders". What's a community space? Who are the community leaders? Idk, but I've seen a handful of cases where these covenants have been weaponized against developers expressing their personal opinions in a context that has nothing to do with GitHub.
[1]: I know calling it innovative is a bit ironic when most of it was written 1500 years ago, but it's innovative compared to other software projects.
(Edited for grammar and to flesh out why I think the new covenant is vaguer despite the author's claim, unlike the old one with well-defined rules)
These days, I find myself feeling far more comfortable and trusting when a software community doesn't have a "code of conduct", and when it doesn't have a "moderation team" or "community leader enforcement".
Censorship, or even just the threat of censorship, causes far greater harm to a software community than a small number of thin-skinned people getting "offended" every now and then, usually over trivial and irrelevant matters.
> Censorship, or even just the threat of censorship, causes far greater harm to a software community than a small number of thin-skinned people getting "offended" every now and then, usually over trivial and irrelevant matters.
Projects that do not promise to at least enforce a basic level of civility however lose a potential coder base as well, and that is people who are not "mainstream" and have to fight for survival (all too often literally, even in Western countries) in the meatspace already all day: women, gender-queer, people of color, people of non-Christian religions... last thing they want is toxic gamer bro culture seeping into spaces where they are active.
Anecdata, but anyway: I've been on HN for a few years now and very often the lead developers for extremely complex and interesting stuff, particularly reverse engineering, fall under that umbrella somehow. The "free market" shows just exactly where the highest talent classes are, and it's not the whiners whining about CoCs preventing them from spamming "I identify as an Apache helicopter" memes.
> and unlike SQLite, we do not substitute it with an unclear alternative
This portion of the statement is harsh, and I think the project could do better by stating what they are doing and why, rather than focusing on what SQLite is doing and why it is bad.
I would say that it seems feasible (to me) that SQLite's "Code Of Ethics" works for a small team, open-source, closed-contribution model but wouldn't work for the open-contribution model that libSQL is trying to achieve.
This is probably referring to the SQLite Code of Ethics debacle.
SQLite has a Code of Ethics, which is used for ticking a box when working with some of their clients (meaning: they probably don't take it very seriously). For this code of ethics, they chose to use the Christian Rule of St. Benedict[0].
When it was implemented, it ruffled some feathers in a fairly major way. While it's true that SQLite is closed to code contributions, they do still accept discussions on their forums, among other things. Those interactions would, supposedly, be bound to the CoC. The sqlite devs later cleared up that it was there for client compliance reasons, but the damage was done.
I don't think this line is in bad faith. It's basically spelled out on their code of ethics page that they basically don't care too much about it and is mostly there for ticking boxes.
The article said libSQL would have a clear code of conduct. README.md stated their goals and said SQLite's code of ethics was unclear. Was there another announcement?
> This document continues to be used for its original purpose - providing a reference to fill in the "code of conduct" box on supplier registration forms.
It seems somewhat deceitful to use this list to check those boxes, which makes this entire page self-contradictory (specific points highlighted):
Working around bureaucracy is still deceit. The spirit of the ethics they state here would suggest a good faith effort to put together a true code of conduct, however basic.
The ethics that Christianity teaches don't have a backdoor just because big corporations are involved.
Working around bureaucracy is not deceit. At least according to my understanding of the word, "deceit" requires that someone is lead to believe falsehoods. That's not going on here. SQLite has a code of ethics, and if that's all that's required by some bureaucratic process, then that's fair game. If the process requires enforcement of the code of ethics, SQLite should probably not tick the box, and the document puts that information front and centre.
I mean that's still deceitful. I get that they don't like codes of conduct, but I can still find it amusing that they violate their own code of ethics on the same page they wrote them on.
I don't think I agree with the other commenter that your argument is a straw man. I think we use different definitions of deceit though. I don't think anyone is being misled, which in my mind is enough to make it not deceitful; if you operate with a different definition, that's okay.
As far as I can see, you're wrong: my position is that the CoE is not deceitful, /u/mmastrac's position is that the CoE is deceitful (and that it's ironic or whatever because the CoE says not to be deceitful). As far as I can see, this is just a good old disagreement where both parties understand and respond to each other's positions.
> I pointed out in detail how it matches the fallacy.
I'm sorry, you did nothing of the kind. The only thing that appears to have happened here, as far as I can tell, is your definition of a strawman, and a cherry-picking, uncharitable read of one of my comments.
> Next up: The first step to get out of a hole you have dug yourself into is to stop digging and all that.
This hostility doesn't feel like the start of a productive conversation, so I'll bid you a good day and stop here.
> Give to the Caesar what belongs to the Caesar and all that.
Everyone always leaves out the "render unto God that which is God's". That second part (not to mention the preceding context) changes the whole meaning of the verse.
In your view, christian ethics don't allow jokes, which is a bit extreme given that many very serious Christian theologians are also quite entertaining and have been so historically as well.
This code of conduct is there to prime the foundation for a takeover by outsiders. With for-profit companies, they can just buy the company. For non-profits, outsiders take over by instituting DEI and then finding something they can make a big deal out of, or by outright lying to get the founders kicked out and take over the organization.
it's not nefarious, it's just a different approach to things.
I think Hipp is justified in all he is doing, and I don't think he is necessarily wrong. We just see things differently
Intolerance towards religion is toxic and anti-diversity. In America, the Civil Rights Act of 1964 makes discriminating against people because of their religion illegal in many contexts. If you can't work alongside somebody who is religious, then you become a liability to any corporation that respects this law.
It's a free country, you're free to be a racist or an anti-theist or a sexist or just about any other -ist. But when your intolerance makes you a legal liability to employers, you'll have yourself to blame.
And insofar as you find SQLite's developers intolerable, this is definitely a "you" problem. The vast majority of us don't have a problem with SQLite's developers, they make software that is used by a huge and diverse group of people. This makes your intolerance all the more stark.
It is precisely American Christianity which created the system of religious tolerance and pluralism that we operate under today.
And Richard Hipp, while religious, is clearly the farthest thing from a nutcase as you can get. From everything I have seen his behavior indicates that he is calm, rational, thoughtful, and kind.
No, I wouldn't. I'd say 'eccentric' maybe, but not a nutcase. Lots of very smart, sane people believe things I think are false. I mean, two of the greatest thinkers in the realm of mathematical probability (Pascal and Bayes) were staunch theists.
"This document was originally called a "Code of Conduct" and was created for the purpose of filling in a box on "supplier registration" forms submitted to the SQLite developers by some clients."
A code of conduct that covers all the bases of the gold standard for such things, the Contributor Covenant -- including explicitly promoting DEI, protecting everyone but especially members of marginalized communities from harassment, and delineating explicit enforcement procedures and committees -- is table stakes for an open source project today. Not having one presents compliance issues for companies who otherwise might adopt your project.
It does not seem to be preventing adoption of SQLite anywhere.
People should have codes of conduct if they believe in them (I do) and find they're necessary in the circumstances they're working in (I have not yet).
People should stop going around proclaiming that pro-forma codes of conduct are a basic necessity for every project; all that does is exacerbate the drama around codes of conduct --- which are perfectly useful and often necessary for huge projects, and which don't at all need to be the topic of 500-post message board threads about freedom.
> A code of conduct that covers all the bases of the gold standard for such things, the Contributor Covenant
It's amazing to me the hubris of people who believe that a set of community guidelines that are less than a decade old are superior to a set of community guidelines that have successfully guided a set of communities for more than 1000 years.
Is this really true though? In the projects I've seen this Covenant pushed heavily, it just seems to be a way to distract from building the product, to instead spend valuable contributor time debating identity politics.
There's a time and place for this sort of thing (e.g. Twitter); open source projects are typically not an appropriate venue.
Specially when demanded by entitled companies that want the work of FLOSS developers for free. And, curiously, these are usually the same companies that cry about (L|A)?GPL.
You do realize that SQLite adopted its current CoC for exactly this reason, right? That's what's meant by compliance: there are some situations where adopting OSS code requires specific licenses and compatible-ish CoCs to avoid issues with required harassment training and DEI. Some auditor will ask for "proof", and the CoC is what you can send them. And it's done virally like this because otherwise it would be trivial to bypass by simply having a subcontractor.
You think workplaces make people watch those cringy videos for fun?
I'm glad you're in favor of abolishing affirmative action then, since it's literally systemic racism and even falls under the disingenuous "prejudice plus power" definition of racism!
No-one is insisting on anything, really. "If you'd like us to adopt your technology, do X" gives both parties a choice in the matter and total freedom to walk away.
1. It's creepy that corporations are having expectations about how volunteer effort to create code that they use for free runs itself (or claims to run itself). It's generally not even being made for the corporations that are using it - the software is made for everyone to use (and fork) as they see fit.
2. These corporations are not engaging in this behavior spontaneously. They are responding to a legal regime which is increasingly becoming totalitarian.
That is fair. It's still creepy - especially given that the project is not open to contributions, and has not (to my knowledge) had any issues with the broader "community" related to the things covered in a code of conduct.