Say the computer writing the log was destroyed, you were able to recover the storage, but the only computer available did not have journalctl.

Or if you're writing to network storage and would like to analyze the logs from your Haiku box.

Text is not perfect, but it's the one thing that is always available.

Also I'm not sure about the journalctl format, but in general binary formats don't handle partial corruption well, which is kind of important for logs.




The journalctl binary format seems to handle corruption pretty well. That was a design criterion.

Everyone forgets or tries to ignore that text files ARE A BINARY FORMAT. They're encoded in 7-bit ASCII with records delimited by 0x0a bytes.

Corruption tends to mean missing data, so the reader has to jump ahead to find the next synchronization byte, a.k.a. 0x0a. This is also how log parsers end up producing complete trash: they try to parse a line that has a new timestamp right in the middle of it.

Or there's a 4K block containing some text, padded to the end with 0x00 bytes, and then the log continues after reboot. Again, that's fixed by ignoring data until the next non-zero byte and/or 0x0a byte. This failure mode makes it really obvious that text logs are binary files.
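
For illustration, assuming a hypothetical messages.corrupt that got NUL-padded the way described above, the same resync-on-0x0a recovery can be done by hand:

# squeeze runs of 0x00 into a single line break, then keep only non-empty text lines
tr -s '\000' '\n' < messages.corrupt | grep -a . > messages.recovered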

See the format definition at https://www.freedesktop.org/wiki/Software/systemd/journal-fi...

And here, this isn't perfect, but if you had to hack the text out with no journalctl available, you could try this:

grep -a -z 'SYSLOG_TIMESTAMP=\|MESSAGE=' /var/log/journal/69d27b356a94476da859461d3a3bc6fd/system@4fd7dfdde574402786d1a1ab2575f8fb-0000000001fc01f1-0005c59a802abcff.journal | sed -e 's/SYSLOG_TIMESTAMP=\|MESSAGE=/\n&/g'
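
(For anyone reading along: -a makes grep treat the binary file as text, -z makes it split records on NUL bytes instead of newlines so whole chunks come through, and the sed inserts a line break before each SYSLOG_TIMESTAMP= or MESSAGE= field so they land one per line. Fields large enough to be stored compressed won't show up this way.)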


> The journalctl binary format seems to handle corruption pretty well. That was a design criterion.

Thanks for pointing that out. I guess that's why they came up with their own format instead of just using SQLite or something else that's already a standard.

> Everyone forgets or tries to ignore that text files ARE A BINARY FORMAT

That's a bit pedantic, even by HN standards :-)

But yes, I know all about fragile log parsers and race conditions of multiple processes writing to the same file. I was just thinking about a scenario where you end up having to read raw logs when things go haywire.


> the only computer available did not have journalctl

This is exactly what I don't understand. This is a world where no other computer exists? I have bigger things to worry about (even if you scope the problem down to "no other computer with journalctl installed exists on my network").

> Or if you're writing to network storage and would like to analyze the logs from your Haiku box.

I don't have any Haiku boxes, but I do have Windows boxes. At my day job, where I'm a Linux sysadmin for a large finance company, my workstation is Windows, and I don't have admin on it. So even with conventional UNIX logs, a much more common case is - as I mentioned - that I'd want to read them from a computer that doesn't have gzip installed.

But we're fine with gzip logs, because the way I'd actually do this is to get a Linux computer running.

(Also, keep in mind that the filesystem itself is a binary format. If you're really worried about reading logs from Haiku, you wouldn't put /var/log on NFS, because that sounds like a terrible idea; you'd log to a FAT filesystem. But nobody actually does that. Everyone's logs are on ext4 or XFS or btrfs or whatever, and nobody says those formats are a bad idea.)


Note that not every journald/journalctl is created equal. It's easy to end up with logs written on one computer that journalctl on another computer can't read, even if the latter is a newer version, depending on which settings each one was compiled with (which is mostly a distro question).
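
If you want to check whether a particular build can handle a particular file (hypothetical path below), both ends are inspectable:

# feature flags this journalctl was compiled with, e.g. +LZ4 +ZSTD
journalctl --version
# ask this build to read and verify that specific file
journalctl --file=/path/to/system.journal --verify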


All right, then that seems like a reasonable objection to the format, not simply "It's a binary format." gzip files created anywhere can be read by any version of gzip.

A standalone (and cross-platform) journalctl file reader seems like a useful thing to have around and not a terribly difficult thing for someone to build.
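
As a starting point, the format at least announces itself; per the spec linked upthread, a journal file begins with an 8-byte ASCII signature (hypothetical filename here):

# a well-formed journal file prints LPKSHHRH
head -c 8 system.journal; echo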


> This is exactly what I don't understand. This is a world where no other computer exists?

What about a Windows user who has a smart device of some sort that's acting up? They're able to dump the files, but now they have to read them. It's a more realistic scenario than a Haiku box, but it's the same idea.

And for the record, I am against gzip logs too. I think if you need to save that many logs you should be exporting them to another system instead of archiving them locally.
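
For what it's worth, journald can already ship logs off-box; a minimal sketch, assuming a collector at logs.example.com running systemd-journal-remote on its default port:

# forward this machine's journal to a central collector instead of archiving locally
systemd-journal-upload --url=http://logs.example.com:19532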



