Below I'm discussing compressed size rather than how "fast" it is to copy databases.
Yeah, there are indexes. And even without indexes there is an entire b-tree sitting above the data. So we're weighing the benefits of having a domain-dependent compression (the binary format) vs dropping all of the derived data. I'm not sure how that will go, but let's try one.
Here is a SQLite file containing metadata for Apple's Photos application:
About 6% smaller for dump vs the original binary (but there are a bunch of indexes in this one). For me, I don't think it'd be worth the small space savings to spend the extra time doing the dump.
With indexes dropped and vacuumed, the compressed binary is 8% smaller than compressed text (despite btree overhead):
566177792 May 1 09:09 photos_noindex.sqlite
262067325 May 1 09:09 photos_noindex.sqlite.gz
About 13.5% smaller than compressed binary with indices. And one could re-add the indices on the other side.
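A rough way to try this comparison on your own data, sketched in Python with only the standard library. Everything named here (the `photo` table, the file names, the `compare` helper) is made up for illustration, gzip stands in for whatever compressor you prefer, and `VACUUM INTO` needs SQLite 3.27 or newer. It isn't the procedure used for the numbers above, just the same idea on a synthetic table, so the ratios it prints will differ from a real database.

```python
import gzip
import os
import sqlite3
import tempfile


def compare(workdir):
    """Build a toy database and return the size in bytes of each variant."""
    src = os.path.join(workdir, "demo.sqlite")            # hypothetical file names
    slim = os.path.join(workdir, "demo_noindex.sqlite")

    # isolation_level=None means autocommit, so VACUUM never runs inside
    # an implicit transaction.
    con = sqlite3.connect(src, isolation_level=None)
    con.execute("CREATE TABLE photo (id INTEGER PRIMARY KEY, name TEXT, taken REAL)")
    con.execute("BEGIN")
    con.executemany(
        "INSERT INTO photo (name, taken) VALUES (?, ?)",
        ((f"IMG_{i:05d}.jpg", 1.5e9 + i * 37.25) for i in range(20000)),
    )
    con.execute("COMMIT")
    con.execute("CREATE INDEX photo_name ON photo (name)")
    con.execute("CREATE INDEX photo_taken ON photo (taken)")

    # Variant 1: the binary file as-is, indexes included.
    with open(src, "rb") as f:
        binary = f.read()

    # Variant 2: the text dump (CREATE/INSERT statements).
    dump = "\n".join(con.iterdump()).encode("utf-8")

    # Variant 3: drop the derived data, then write a compacted copy.
    con.execute("DROP INDEX photo_name")
    con.execute("DROP INDEX photo_taken")
    con.execute("VACUUM INTO ?", (slim,))
    con.close()
    with open(slim, "rb") as f:
        noindex = f.read()

    return {
        "binary": len(binary),
        "binary_gz": len(gzip.compress(binary)),
        "dump_gz": len(gzip.compress(dump)),
        "noindex": len(noindex),
        "noindex_gz": len(gzip.compress(noindex)),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for label, size in compare(d).items():
            print(f"{label:>12}: {size}")
```

The indexes can be re-created on the receiving side with the same `CREATE INDEX` statements.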
Yup, these results are pretty consistent with what I'd expect (& why I noted the impact of indices), because even string data carries a lot of superfluous text when expressed as SQL statements in the dump ("INSERT INTO foo ..."). I would expect all of that to exceed any bookkeeping within the btree. And non-string values like blobs or numbers are stored more efficiently in the binary than in the dump, which is a text encoding (hex, even, for blobs) and blows things up further.
Some more anecdata - from this it looks like you could `VACUUM INTO` + `zstd --long -12` using 19.1s and get 109% of the size you'd get from `dump` + `zstd --long -5` using 32.8s. Saves 13.7s at the cost of 76M. YMMV, obvs.
I learned about this when trying to decode data from Firefox IndexedDB. (I was extracting Tana data.) Their structured clone data format uses nan-boxing for serialization.
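For anyone who hasn't met NaN-boxing: the idea is that a 64-bit double has a huge number of bit patterns that all mean "NaN", so non-float values can be smuggled into those patterns alongside a small type tag. Here is a minimal Python sketch of that idea. The layout (3 tag bits above a 48-bit payload) and the tag numbers are made up for illustration and are not Firefox's actual structured clone layout.

```python
import struct

QNAN = 0x7FF8_0000_0000_0000      # exponent all ones plus the quiet-NaN bit
TAG_SHIFT = 48                    # hypothetical: tag lives in bits 48..50
TAG_MASK = 0x7
PAYLOAD_MASK = (1 << 48) - 1      # hypothetical: low 48 bits carry the value

TAG_INT = 1                       # made-up tag numbers
TAG_BOOL = 2


def box_double(x):
    """Reinterpret a float as its raw 64-bit pattern."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def box(tag, payload):
    """Pack a tagged non-float value into the NaN space."""
    if not 1 <= tag <= TAG_MASK:
        raise ValueError("tag must be 1..7 (0 is reserved for real NaNs)")
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload must fit in 48 bits")
    return QNAN | (tag << TAG_SHIFT) | payload


def unbox(word):
    """Return (kind, value): ('double', float) or (tag, payload)."""
    tag = (word >> TAG_SHIFT) & TAG_MASK
    if (word & QNAN) == QNAN and tag != 0:
        return tag, word & PAYLOAD_MASK
    # Anything else, including an ordinary NaN with tag 0, is a plain double.
    return "double", struct.unpack("<d", struct.pack("<Q", word))[0]


if __name__ == "__main__":
    print(unbox(box_double(1.5)))
    print(unbox(box(TAG_INT, 42)))
    print(unbox(box(TAG_BOOL, 1)))
```

A real encoder also has to canonicalize genuine NaNs so they can't collide with a tagged value; this sketch just assumes they arrive with the tag bits clear.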
Surprisingly, GPT did manage to identify a book that I remembered from college decades ago ("Laboratory Manual for Morphology and Syntax"). It seems to be out of print, and I assumed it was obscure.
Can agree that it’s good at finding books. I was trying to find a book (Titanic 2020) I vaguely remembered from a couple of plot points and the fact that a ship called Titanic was involved. ChatGPT figured it out pretty much instantly, after I had floundered through book sites and Google for a while.
Wonder if books are inherently easier because their content is purely written language? Whereas movies and art tend to have fewer point-by-point descriptions of what they are.
> Wonder if books are inherently easier because their content is purely written language? Whereas movies and art tend to have fewer point-by-point descriptions of what they are.
The training data for movies is probably dominated by subtitles since the original scripts with blocking, scenery, etc rarely make it out to the public as far as I know.
I must be tired. The thing you remembered was the name of a boat in the book and any web search engine and Wikipedia would probably give you the correct answer?
> > Standard mail forwarding lasts 12 months. You can pay to extend mail forwarding for 6, 12, or 18 more months (18 months is the maximum).
That's kind of awkward when you consider people will find that address in source code where the license file just won't be updated for decades to come, if at all.
With 20/20 hindsight, if the FSF had used a P.O. Box number in the license, the license addresses would always be correct even if the FSF office changed address or (as now) was no longer maintained.
Of course, the cost of a P.O. box over 40 years would have added up to thousands of dollars and that is less money for FSF advocacy. And time spent going to the post office to check the box would also have taken away from advocacy time.
Another physical mail DNS-like idea is mail forwarding -- but it typically has time limits at the post office although not for private mail forwarders:
https://en.wikipedia.org/wiki/Mail_forwarding
"Private mail forwarding services are also offered by private forwarding companies, who often offer features like the ability to see your mail online via a virtual mailbox. Virtual mailboxes usually have options to get your mail scanned, discard junk mail and forward mail to your current address."
Although strictly speaking, these forwarding services are not quite like DNS (even if they do get at the idea of indirection). A true mail DNS would be more like a service you mail a post card to with a person's or organization's name and which mails a post card back to you which tells you what address to currently write to in order to reach that person or organization. (At least, if you write to that received address during some time-to-live window of validity of the address.) And I guess Encrypted DNS would be like you and the service using more expensive security envelopes instead of post cards? :-)
> Of course, the cost of a P.O. box over 40 years would have added up to thousands of dollars and that is less money for FSF advocacy. And time spent going to the post office to check the box would also have taken away from advocacy time.
To be fair, renting office space in downtown Boston also adds up to tens (if not hundreds) of thousands of dollars, every year. By comparison, $500 a year [0] for a medium PO Box (in the lobby of the building for their new office, no less!) is a steal.
CGP Grey, a YouTube channel, has a video from earlier this year on some of the problems with postal codes and addresses, from which I learned about alternatives to my familiar US-based system. https://www.youtube.com/watch?v=1K5oDtVAYzk
One thing I've been meaning to try, but never got round to, is to stick a URL on an envelope, pointing at a page with an address, and see if the mail (Royal Mail, in my case) actually delivers it. I suspect they would, but that it would take a few extra days. It's no worse than some of the addresses that they do deliver to.
Hope is not a strategy. As much as I hate crypto, something on the blockchain might be more durable. You want something that isn't reliant on any one person or company continuing to exist (though maybe the Long Now Foundation will), and even if Bitcoin goes to zero, I think there will be some die-hard true believers who keep running miners even past the built-in 2140 expiration date.
since this is hacker news... i once had some trouble changing my mailing address with one supplier (they would send the materials to the new address, but insisted on sending billing/tax info to the old one), so i did the mail forward process some three times + their extensions (i recall it was 6 + 3mo or so)... it got me close to 3 yrs of reliable mail forwarding from the great folks at usps until i could get it thru the supplier personnel's thick skulls.
the only issue "redoing" the request is that people at the old address can block it, so be sure to talk to them first.
> the only issue "redoing" the request is that people at the old address can block it, so be sure to talk to them first.
That's so strange, especially when you consider that for legal purposes, if you receive mail at someone's home, you are now a "resident" and it is harder for police to kick you out. Why would anyone willingly let someone else's mail come to their address?
Simply receiving mail does not make you a resident. You must establish residency, which involves things like being allowed access to the home, an understanding that you are leaving belongings behind with the ability to access them later, how long you have stayed, and maintaining things like utility bills. A lease is a contract that clearly establishes the guidelines between two willing parties. Absent that, the definition of residency is typically delineated in your state's landlord-tenant laws.
Interesting, I can't reproduce it. I've got Chrome 134.0.6998.166 on macOS, and with profiling turned on it's about 55ms for me (3ms of that is spent in scripting).
I learned the top / middle / bottom from a book in elementary school in the early 80's. I did it for a talent show and kids accused me of watching the guy mix it up and memorizing the moves (that would have been more impressive than simply solving it).
Later in college, having forgotten everything, I worked out the solution myself after a hint from a prof (that it's essentially conjugations of group elements).
Years later, I again developed a solution, but this time I do edges first, with permutations that mess up the corners and then the corners. I mainly mixed it up to do something unique.
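To make the "conjugation" hint concrete: if you know a move sequence that does something useful to one set of pieces, you can do a setup move first, run the sequence, then undo the setup, and the same effect lands on a different set of pieces. A toy Python sketch on six abstract positions (not an actual cube model; the permutations are made up for illustration):

```python
def compose(p, q):
    """Permutation 'apply q first, then p'; tuples map position -> image."""
    return tuple(p[q[i]] for i in range(len(p)))


def inverse(p):
    """Permutation that undoes p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def conjugate(setup, move):
    """Do setup, then move, then undo setup."""
    return compose(inverse(setup), compose(move, setup))


def cycle_type(p):
    """Sorted lengths of the non-trivial cycles of p."""
    seen, lengths = set(), []
    for start in range(len(p)):
        n, i = 0, start
        while i not in seen:
            seen.add(i)
            i = p[i]
            n += 1
        if n > 1:
            lengths.append(n)
    return sorted(lengths)


if __name__ == "__main__":
    move = (1, 2, 0, 3, 4, 5)    # a 3-cycle on positions 0, 1, 2
    setup = (3, 1, 2, 0, 4, 5)   # swaps positions 0 and 3
    result = conjugate(setup, move)
    print(result, cycle_type(result))
```

The conjugated move is still a single 3-cycle, but it now acts on positions 1, 2, 3 instead of 0, 1, 2, which is why one known sequence can be reused all over the puzzle.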
It depends on what subset of Notion you use. Nothing (including Notion) is perfect for me. I'd like to build my own eventually, but I'm currently using Obsidian which doesn't hit your "works in the browser" requirement.
One option, which is open source and self-hosted, is Trilium[sic], found at https://github.com/zadam/trilium. If it's close to what you want, you might be able to adjust it to meet your needs.
Commercial options include Realm, Tana, and Craft, with varying degrees of "AI".
I really like the UX of Tana for building out graphs of pages with properties, but it's slow to start up, doesn't support math, etc. So it's mainly a UX example for me.
For reference, the steps for the photos.sqlite comparison: a `VACUUM INTO`, then `gzip -k photos.sqlite` (this took 20 seconds), `sqlite3 -readonly photos.sqlite .dump > photos.dump` (10 seconds), and `gzip -k photos.dump` (21 seconds).