`-g kB` — Remove old log lines when the in-memory database crosses x kB
It seems garbage collection is only implemented for the in-memory database (by reading `SQLITE_DBSTATUS_CACHE_USED`). Maybe logrotate could be set up to do it instead, but nothing in the documentation indicates so.
You can amortize the cost of writes significantly by not committing often, either in the sense of a SQL `COMMIT` or in the sense of doing a _synchronous_ `COMMIT`. You could commit every N seconds, say, for some sufficiently large N, or you could commit after N seconds of idle time and no more than M seconds since the last commit. You can also disable `fsync()`, commit often, and re-enable `fsync()` once every N seconds. There are many tactics you can use for data where some loss due to power failure is tolerable.
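A minimal sketch of the periodic-commit tactic in Python, assuming a plain `(ts, msg)` schema (the function name, table, and interval are invented for illustration, not taken from the project under discussion):

```python
import sqlite3
import time

COMMIT_INTERVAL = 5.0  # seconds; the "N" from the comment above

def log_writer(db_path, lines):
    """Insert log lines, paying for a synchronous COMMIT at most
    once per COMMIT_INTERVAL instead of once per line."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS log (ts REAL, msg TEXT)")
    last_commit = time.monotonic()
    for msg in lines:
        conn.execute("INSERT INTO log VALUES (?, ?)",
                     (time.monotonic(), msg))
        # One fsync-backed commit amortized over every line inserted
        # since the previous commit.
        if time.monotonic() - last_commit >= COMMIT_INTERVAL:
            conn.commit()
            last_commit = time.monotonic()
    conn.commit()  # flush whatever remains in the open transaction
    conn.close()
```

The trade-off is exactly the one described above: on power failure you lose at most the lines inserted since the last commit.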
I.e., you can probably get pretty close to the storage device's maximum sustained write throughput, though with some losses to write amplification, e.g., from the B-tree itself and from any indices you might want to maintain.
Write amplification from the B-tree can be amortized by committing infrequently (which is why I listed that tactic _first_ above). Though there should be little need, because SQLite3 already amortizes B-tree write amplification by using a write-ahead log (WAL), so be sure to enable WAL mode for this sort of application.
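For reference, enabling WAL mode is one pragma; a sketch, assuming a file-backed database (WAL does not apply to `:memory:` databases), with an invented helper name:

```python
import sqlite3

def open_log_db(path):
    """Open a log database with the write-ahead log enabled."""
    conn = sqlite3.connect(path)
    # Each commit becomes an append to the -wal file; the B-tree
    # pages are rewritten in bulk only at checkpoint time.
    conn.execute("PRAGMA journal_mode=WAL")
    # In WAL mode, synchronous=NORMAL fsyncs only at checkpoints,
    # not on every commit.
    conn.execute("PRAGMA synchronous=NORMAL")
    return conn
```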
Write amplification from indices can be amortized by partitioning your tables into time ranges, using a VIEW to unify the partitions, and creating an index on a partition only once it is closed to new log entries. This approach makes searches of newer log entries slow, but those will probably all fit in memory, so it's not a problem if you have a large enough page cache.
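A hedged sketch of that partition-plus-VIEW layout; the table and column names are invented, and a real implementation would create partitions on a rolling schedule:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE log_2024_w01 (ts INTEGER, msg TEXT);  -- closed partition
    CREATE TABLE log_2024_w02 (ts INTEGER, msg TEXT);  -- current partition
    CREATE VIEW log AS
        SELECT * FROM log_2024_w01
        UNION ALL
        SELECT * FROM log_2024_w02;
""")

def close_partition(conn, table):
    """Pay the index-maintenance cost once, in bulk, when the
    partition stops receiving new log entries."""
    conn.execute(f"CREATE INDEX idx_{table}_ts ON {table}(ts)")
```

Inserts go to the current partition only, so no index is updated on the hot write path; queries go through the `log` view.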
Now I've not built anything like this so I can't say for sure, but I suspect that one could get very aggressive with these techniques and reach a sustained write rate of around 75% of the storage device(s)' sustained write rate.
Turning off fsync is pretty dangerous since a crash could corrupt the database. You might think you would just lose a couple seconds of data, but that's only true if writes are applied in order.
E.g., if some data is moved from page A to page B, you normally write B with the new data, fsync, and then write A without the data. Without fsync, you might only write page A, and you would lose that data. This might happen to an internal data structure and corrupt the whole database.
This is a core design challenge for all logging systems. It's why there are mechanisms for intentionally dropping messages to relieve queue pressure, and optimizations around the use of io_uring. Conversely, the fact that logging systems can drop messages is one of the primary reasons for "MARK"-type mechanisms (https://lists.debian.org/debian-user/1998/09/msg00915.html).
That's not GP's question. GP wants to know how high the write rate can be regardless of the systemd log rate limiter, likely so as to be able to increase that rate limit!
I'm actually kinda surprised they went with SQLite here. Log messages are the most trivial data format, and there's no way you couldn't beat SQLite's speed by just not having database logic in the middle at all. Just being able to bring your own allocator for the logs themselves, with such predictable linear memory usage, would make this thing scream.
You're sort of right, in that a B-tree is not a good data structure for logs, given that append-only files are perfect for logs. But the point of using an RDBMS for logs is to be able to a) index the logs and b) provide a great search facility for them. Perhaps a better design would be a virtual table plugin for SQLite3 that allows one to use log files as tables, then index and search them with SQLite3, but if one lacks the time to investigate that approach, then one can't be faulted for using SQLite3 directly.
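To illustrate the search payoff of keeping logs in SQLite: full-text search comes almost for free via the FTS5 extension, assuming your SQLite build includes it (the table name and sample messages here are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# FTS5 virtual table: tokenizes messages and maintains an inverted index.
conn.execute("CREATE VIRTUAL TABLE log_fts USING fts5(msg)")
conn.executemany("INSERT INTO log_fts VALUES (?)", [
    ("sshd accepted publickey for root",),
    ("kernel: out of memory",),
    ("sshd connection closed",),
])
# Indexed keyword search instead of a linear grep over flat files.
hits = conn.execute(
    "SELECT msg FROM log_fts WHERE log_fts MATCH 'sshd'"
).fetchall()
```

This is roughly the kind of facility that an append-only flat file can't give you without an external indexer.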
Agreed, /var/log/messages has been around for a long time; writing to a log is never the problem. Digesting the log is the niche market, and it is profitable enough that we have a lot of tools in this space (rotation, transmission, parsing, etc.).
How does this work exactly? Is every log line a separate transaction in autocommit mode? Because I don't see any begin/commit statements in this codebase so far...
This looks right up my alley. I am experimenting to see how much I can strip systemd from my everyday laptop, as an exercise in futility and to understand how embedded in a distribution like Debian it has become.
It’s really a shame that OpenBSD doesn’t have a good file system. Otherwise I’d use it for my production systems (I could put it inside Proxmox, with ZFS outside for stability, but I like to run bare metal, no VMs).
I tried to install OpenBSD on three VM hosts: one an Intel MacBook, one FreeBSD, and one KVM. OpenBSD failed to install in all three environments; the latter two crashed during the file system creation stage.
I run Void Linux on my laptop, which seems to exhibit many BSD-esque approaches (and lacks systemd), while also being a rolling release distro with fresh stuff usually appearing in a few days after an upstream release.
That’s my concern, the Debian experience is usually pretty lovely and I’d hate to leave it behind, but maybe there’s no point in fiddling with something stuck in trends.
Yes, Alpine is a good choice for small servers. I run both FreeBSD and Alpine for my homelab. Alpine feels very close to FreeBSD style. I still prefer pf over ufw/awall.
Otherwise looks like a great project.