
Amen. Whether or not the article's example is a good one, in a world without consistency you need to worry about state between _any_ two database operations in the system, so there's nearly unlimited opportunity for this class of error in almost any application found in the real world.

The truly nefarious aspect of NoSQL stores is that the problems that arise from giving up ACID often aren't obvious until your new product is actually in production and failures that you didn't plan for start to appear.

Once you're running a NoSQL system of considerable size, you're going to have a sizable number of engineers who are spending significant amounts of their time thinking about and repairing data integrity problems that arise from even minor failures that are happening every single day. There is really no general fix for this; it's going to be a persistent operational tax that stays with your company as long as the NoSQL store does.

The same isn't true for an ACID database. You may eventually run into scaling bottlenecks (although not nearly as soon as most people think), but transactions are darn close to magic in how much default resilience they give to your system. If an unexpected failure occurs, you can roll back the transaction that you're running in, and in almost 100% of cases this turns out to be a "good enough" solution, leaving your application state sane and data integrity sound.
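To make the rollback point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `accounts` table, the `transfer` helper, and the `fail_midway` switch are all made up for illustration; the failure is simulated with an exception raised between the two writes.

```python
import sqlite3

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst inside a single transaction."""
    try:
        # `with conn` commits if the block succeeds and rolls back
        # if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if fail_midway:
                # Hypothetical failure between the two writes.
                raise RuntimeError("simulated crash")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False

def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)]
)
conn.commit()

failed = transfer(conn, "alice", "bob", 30, fail_midway=True)
after_failure = balances(conn)   # the debit was rolled back
ok = transfer(conn, "alice", "bob", 30)
after_success = balances(conn)   # both writes landed together
```

Without the transaction, the failed call would have left alice debited and bob never credited, which is exactly the "state between any two operations" problem.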

In the long run, ACID databases pay dividends in allowing an engineering team to stay focused on building new features instead of getting lost in the weeds of never-ending daily operational work. NoSQL stores, on the other hand, are more akin to an unpaid credit card bill, with unpaid interest continuing to compound month after month.




A database is an abstract concept, and ACID transactions on databases have nothing to do with the interface semantics used to interact with the shared state being managed.

You can have non-ACIDic SQL databases and ACID NoSQL databases.

While it is true that many KV stores do not present a fully consistent view over distributed data, there are some that do.

It is very important that people shed this idea that only SQL databases can have strong consistency, that's for sure.


> It is very important that people shed this idea that only SQL databases can have strong consistency, that's for sure.

Why? In practice, people are going to be building things using the same common paths: Postgres, Mongo, MySQL, Redis, etc., and from a purely pragmatic standpoint, the SQL databases that you're likely to use come with strong ACID guarantees while the NoSQL databases don't.

If you can point me to a NoSQL data store that provides high availability, offers atomicity, consistency, and isolation properties similar to those of a strong isolation level in Postgres/MySQL/Oracle/SQL Server, and has a good track record of production use, I'm all ears.


Berkeley DB.

I use it in production for a game that supports over 100,000 concurrent users and it's been around for a very long time.

It's fully ACID by default, but you can drop elements of ACID in exchange for greater performance at a per-table or per-transaction level. It also supports two different isolation approaches at a per-table level: locking or MVCC (like Postgres).

It's a really great NoSQL database from the days before the term NoSQL existed.


BerkeleyDB is great. I'm honestly surprised it doesn't see more use. It's right in the sweet spot for software that doesn't quite need a relational model, and that seems to be a lot of software.


Check out lmdb for a better alternative to it.


How did I not know about this awesome-looking tool? Thanks for the mention! Wikipedia link for others:

https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Databa...


Yup, and we used to have FoundationDB, sadly not any more, but don't get me started on that.

Still there are plenty of highly distributed stores to go around, and it's good to know that people are actively using them with great success.

It's very nice to have fine control over consistency vs performance even if you do have to think about it as a developer, but then that's our job :-)


Berkeley DB was great, but Oracle killed it by changing the license to AGPL. That change meant many projects that use it couldn't update to the latest version without violating the license.

LMDB has taken its place in a lot of projects that used to use Berkeley.


https://github.com/cockroachdb/cockroach

which was inspired by this paper:

http://research.google.com/archive/spanner.html

which Google made available to everyone in the form of its Cloud Datastore:

https://cloud.google.com/datastore/

"We believe it is better to have application programmers deal with performance problems due to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions." [1]

[1] http://highscalability.com/blog/2012/9/24/google-spanners-mo...


A bit out of scope, since it is an embedded database designed to run on mobile devices, but Realm[1] is a NoSQL database (in this case an object database) with full ACID transactions, and it is running on millions of devices.

[1] http://realm.io


https://cloud.google.com/datastore/

I haven't used it in a few years, but from what I remember you could have ACID-like operations on grouped entries, because grouping them puts them in the same physical storage, so an ACID transaction becomes a localized operation.



> While it is true that many KV stores do not present a fully consistent view over distributed data, there are some that do.

Consistency is always relative to what the DBMS can enforce, which is in turn always relative to what the DBMS's schema language can express. A KV store can be trivially consistent in that, between writes, it always presents the same view of the data. But it won't be consistent in the sense of presenting a view of the data that's guaranteed not to contradict your business rules. The latter is what users of relational databases expect.


Totally agree. I've wet my toes with NoSQL systems but things like MongoDB just look like trouble waiting to happen.

On the other hand, SQL makes it hard to maintain the state definition; I had to develop a mechanism to upgrade the DB from any possible state to the latest state.

Still, this allows me to define very accurately what is a valid row or entry in the database, which means my applications need to make far fewer assumptions.

I can just take data from the DB and assume with certainty that this data has a certain form, format, and values.

MongoDB makes none of these guarantees.


It's absolutely true that migrations in SQL databases can be a pain. I don't actually think that's a bad thing - what's really going on there is that the DBMS is throwing data integrity issues into your face, and encouraging you to think about them.

A lot of developers don't want to do that, because they want to think of data integrity as this tangential concern that's largely a distraction from their real job. A lot of developers also have a habit of cursing their forebears on a project for creating all sorts of technical debt by cutting corners on data integrity issues, while at the same time habitually cutting corners themselves in the name of ticking items off their to-do list at an incrementally faster pace.

This isn't to say that NoSQL systems make your system inherently less maintainable. But I do think NoSQL projects gained a lot of their initial market traction by appealing to developers' worst instincts with sales pitches that said, "Hey, you don't even have to worry about this!" and deceptively branding schema-on-read as "schemaless". So a reputation for sloppiness might not be deserved (or at least, that's not an issue I want to take a position on), but, in any case, it was very much earned.


I am very fond of pushing as much validation into the database as I can, so I use `NOT NULL`, foreign keys, `CHECK` constraints, and whatever I can to let the database do the job of ensuring valid data. All of us make mistakes, and making sure there are no exceptions is just what computers are good at. I think you're right about "a lot of developers", but I find data modeling, including finding the invariants, pretty satisfying actually.
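As a rough illustration of pushing invariants into the schema, here is a small sketch with Python's built-in sqlite3 module. The `customers` and `orders` tables and their columns are invented for the example, and SQLite needs `PRAGMA foreign_keys = ON` before it enforces foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
    CREATE TABLE customers (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        quantity    INTEGER NOT NULL CHECK (quantity > 0)
    );
    INSERT INTO customers (id, email) VALUES (1, 'a@example.com');
""")

# Each row below breaks exactly one declared invariant.
bad_rows = {
    "missing customer": (None, 1),   # violates NOT NULL
    "unknown customer": (99, 1),     # violates the foreign key
    "zero quantity": (1, 0),         # violates the CHECK constraint
}

rejected = {}
for label, row in bad_rows.items():
    try:
        with conn:
            conn.execute(
                "INSERT INTO orders (customer_id, quantity) VALUES (?, ?)", row
            )
    except sqlite3.IntegrityError as exc:
        rejected[label] = str(exc)

# A row that satisfies every constraint goes through.
with conn:
    conn.execute("INSERT INTO orders (customer_id, quantity) VALUES (1, 2)")

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Only the valid order ends up in the table; the other three never get stored, no matter which code path tried to write them.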

Reading your comment reminds me of ESR's point in The Art of Unix Programming that getting the data structures right is more important than getting the code right, and if you have good data structures, writing the code is a lot easier. The database is just a big data structure, the foundation everything else is built on, and the better-designed it is, the easier your code will be.


That's not ESR's point. It's shoplifted from Brooks.


IMO validating in the DB also makes code faster.

When I get data from a well-validated DB, I don't have to check that it is valid or has a certain format; that is ensured by the DB.

I can remove checks from the entire CRUD stack: if the data is invalid, it does not get created or won't make it into the update.

The easiest way to check data is then to try the INSERT or UPDATE, and if it works, it's valid.

Otherwise I'd need to validate all fields, see if any contain errors, if any are missing, etc.
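A minimal sketch of the "just try the write" approach, using Python's built-in sqlite3 module. The `try_write` helper and the `users` table are hypothetical; the point is only that the application forwards the write and lets the constraints decide.

```python
import sqlite3

def try_write(conn, sql, params):
    """Attempt a write; return None if accepted, else the database's reason."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(sql, params)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL CHECK (length(name) > 0),
        age  INTEGER NOT NULL CHECK (age BETWEEN 0 AND 150)
    )
""")

insert_sql = "INSERT INTO users (name, age) VALUES (?, ?)"
update_sql = "UPDATE users SET age = ? WHERE name = ?"

created = try_write(conn, insert_sql, ("ada", 36))    # accepted
no_name = try_write(conn, insert_sql, (None, 36))     # rejected: NOT NULL
bad_age = try_write(conn, update_sql, (-5, "ada"))    # rejected: CHECK

stored_age = conn.execute(
    "SELECT age FROM users WHERE name = 'ada'"
).fetchone()[0]
```

The rejected UPDATE leaves the stored row untouched, so the caller only has to report the error string rather than re-implement each rule.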


>DBMS is throwing data integrity issues into your face, and encouraging you to think about them.

And I love it to death for it.

It also allows me to do some nifty little tricks.

I've developed my own migration tool that works by exploiting this data integrity. It helps me identify the database's current state and upgrade it dynamically, without worrying about which updates have been applied and which have not.
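The tool itself isn't shown here, so the following is not it. It is one possible sketch of the general idea, assuming SQLite and Python's built-in sqlite3 module: each step inspects the live schema and only runs if its change is missing, so the same function works from any starting state. The `columns` and `migrate` helpers and the `users` table are invented for illustration.

```python
import sqlite3

def columns(conn, table):
    """Return the set of column names currently present on `table`."""
    # PRAGMA table_info yields one row per column; index 1 is the name.
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}

def migrate(conn):
    """Bring the schema to the latest shape from whatever state it is in."""
    applied = []
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users "
        "(id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    if "email" not in columns(conn, "users"):
        conn.execute("ALTER TABLE users ADD COLUMN email TEXT")
        applied.append("add users.email")
    if "created_at" not in columns(conn, "users"):
        conn.execute(
            "ALTER TABLE users ADD COLUMN created_at TEXT NOT NULL DEFAULT ''"
        )
        applied.append("add users.created_at")
    return applied

# An older database that already has `email` but not `created_at`.
old_db = sqlite3.connect(":memory:")
old_db.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT)"
)
first_run = migrate(old_db)    # only the missing step runs
second_run = migrate(old_db)   # nothing left to do

# A brand-new database gets every step.
fresh_db = sqlite3.connect(":memory:")
fresh_run = migrate(fresh_db)
```

Both databases end up with the same columns, and no bookkeeping table of applied versions is needed; the schema itself is the record.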


"MongoDB: Because /dev/null doesn't support sharding"


Minor point, but MongoDB added document validation in 3.2 so you can add guarantees about what documents are being stored.

https://docs.mongodb.com/manual/core/document-validation/


Pardon my ignorance, but wouldn't you have the same kind of issues once you distribute any system, NoSQL or not? CAP applies to all distributed systems.

I get the feeling NoSQL just exposed the issue and became synonymous with it.



