Not sure if this is what the above comment means by "atomic", but a shortcoming of Postgres' JSON support is that it will have to rewrite an entire JSON object every time a part of it gets updated, no matter how many keys the update really affected. E.g. if I update an integer in a 100MB JSON object, Postgres will write ~100MB (plus WAL, TOAST overhead, etc.), not just a few bytes. I imagine this can be a no-go for certain use cases.
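A minimal sketch of what that looks like, assuming a hypothetical table `docs(id int, payload jsonb)`:

```sql
-- jsonb_set changes one key, but Postgres still materializes and
-- writes out the entire new jsonb value (plus TOAST/WAL traffic):
UPDATE docs
SET payload = jsonb_set(payload, '{counter}', '42')
WHERE id = 1;
```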
It drives me batty to see people store 100MB JSON objects with a predictable internal structure as single values in an RDB rather than destructuring them and storing the fields as proper columns of a record. Like, yes, you can design it the worst possible way like that, but why? But I see it all the time.
Actually, that's the whole point of RDBs: you can alter your data model (in most cases) with just a simple DDL+DML query. It is with NoSQL that you have to manually download all the affected data from the DB, run the transformation with consistency checks, and upload it back. Or, alternatively, you have to write your business logic so that it can work with/transform on demand all the different versions of data objects, which to my taste is an even more nightmarish scenario.
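As a sketch of what "a simple DDL+DML query" means in practice (table and column names here are made up for illustration):

```sql
-- Reshape the model in one migration: merge two fields into one.
ALTER TABLE users ADD COLUMN full_name text;
UPDATE users SET full_name = first_name || ' ' || last_name;
ALTER TABLE users DROP COLUMN first_name, DROP COLUMN last_name;
```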
The benefits of going schemaless in the early stages of development are highly suspect in my experience. The time that one might save in data modeling and migrations comes out from the other end with shittier code that’s harder to reason about.
My perspective is that using NoSQL does not save time in data modeling and migrations. Moreover, one has to pay in increased time for these activities, because
(a) in most cases, data has to follow some model in order to be processable anyway; the question is whether we formally document and enforce it in a relational store, or leave it to external means (which we then have to implement ourselves) in exchange for some specifically-optimized non-relational storage,
(b) NoSQL DBs return data (almost) as stored; one cannot rearrange results anywhere near as freely as with SQL queries, so much more careful up-front design is required (effectively, one has to design not only the schema but also its appropriate denormalization),
(c) migrations are manual and painful, so one had better arrive at the right design at once rather than iterate on it.
That is, of course, if one doesn't want to deal with piles of shitty code and even more shitty data.
It's not an issue with size. It's an issue with race conditions. With Mongo I can update a.b and a.c concurrently from different nodes and both writes will set the right values.
You can't do that with PG JSONB unless you lock the row for reading...
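To be precise, the lost update shows up with an application-side read-modify-write cycle; sketched below against a hypothetical `docs(id, data jsonb)` table:

```sql
-- Session 1: SELECT data FROM docs WHERE id = 1;  -- sees {"b":0,"c":0}
-- Session 2: SELECT data FROM docs WHERE id = 1;  -- sees {"b":0,"c":0}
-- Session 1: UPDATE docs SET data = '{"b":1,"c":0}' WHERE id = 1;
-- Session 2: UPDATE docs SET data = '{"b":0,"c":2}' WHERE id = 1;
-- Session 2's whole-document write silently discards Session 1's b=1.

-- A single-statement update sidesteps this particular race: under
-- READ COMMITTED, a concurrent UPDATE blocks and jsonb_set is then
-- re-evaluated against the newly committed row version.
UPDATE docs SET data = jsonb_set(data, '{b}', '1') WHERE id = 1;
```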
What?? That's an insane argument. That's like saying if one client sets column X to 1 and another client concurrently sets column Y to 2, one client's writes will be LOST. It shouldn't happen, and it doesn't. If it did, nobody would use Postgres. This issue only exists with PG's JSON impl.
What?? That’s an insane way to describe what I’m talking about. Data/transaction isolation is very complex and extremely specific to every use case, which is why database engines worth anything let you describe your needs to them. Hence, when one client writes to Y, it specifies what it thinks X should be (if relevant) and gets notified to try again if that assumption is wrong. An advantage of specifying your data and transaction model up front is that it surfaces these subtle issues to you before they destroy important information in an unrecoverable manner.
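One common way to state that assumption in the write itself is an optimistic compare-and-set (hypothetical table/column names):

```sql
-- Write y only if x still has the value we based our decision on:
UPDATE settings
SET y = 2
WHERE id = 1 AND x = 1;
-- If the statement reports 0 rows updated, the assumption about x
-- no longer holds: re-read, reconcile, and retry.
```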