*> nothing Mongo does which postgresql doesn't do better* a) It has a built-in a...

ozkatz · 2024-05-27T12:29:52 1716812992

> Almost all big data storage solutions are NoSQL.

I think it's important to distinguish between OLAP and OLTP.

For OLAP use cases (which is what this post is mostly about) it's almost 100% SQL. The biggest players being Databricks, Snowflake and BigQuery. Other tools may include AWS's tools (Glue, Athena), Trino, ClickHouse, etc.

I bet there's a <1% market for "NoSQL" tools such as MongoDB's "Atlas Data Lake" and probably a bunch of MapReduce jobs still being used in production, but these are the exception, not the rule.

For OLTP "big data", I'm assuming we're talking about "scale-out" distributed databases, which are either SQL (e.g. CockroachDB, Vitess, etc.), SQL-like (Cassandra's CQL, Elasticsearch's non-ANSI SQL, InfluxDB's InfluxQL) or built around a purpose-built language/API (Redis, MongoDB).

I wouldn't say OLTP is "almost all" NoSQL, but definitely a larger proportion compared to OLAP.

blagie · 2024-05-27T13:40:13 1716817213

> Almost all big data storage solutions are NoSQL.

Most I've seen aren't. NoSQL means a non-relational database. Most big data solutions I've seen don't use a database at all. An example is Hadoop.

Once you have a database, SQL makes a lot of sense. There are big data SQL solutions, mostly in the form of columnar read-optimized databases.

On the above, a little bit of relational structure can make a huge performance difference: for example, one big table of compact rows whose columns are keys into small lookup tables. That can be algorithmically a lot more performant than the same thing without relations.
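
To make that concrete, here is a minimal sketch using Python's built-in `sqlite3`. The table and column names (`events`, `country`, `country_id`) and the data are made up for illustration, and SQLite is only a stand-in for a real columnar store. The point is just the shape: the big table holds small integer keys, and the strings live once in a small lookup table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Small lookup table: each country name is stored exactly once.
    CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    -- Big table: compact rows that reference the lookup table by integer key.
    CREATE TABLE events (
        ts INTEGER,
        country_id INTEGER REFERENCES country(id),
        amount INTEGER
    );
""")

# Hypothetical sample data; ids are assigned 1, 2, 3 in insertion order.
conn.executemany(
    "INSERT INTO country (name) VALUES (?)",
    [("Germany",), ("Japan",), ("Brazil",)],
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(i, i % 3 + 1, i) for i in range(9)],
)

# Aggregate over the big table, resolving names through the small one.
totals = dict(conn.execute("""
    SELECT c.name, SUM(e.amount)
    FROM events e
    JOIN country c ON c.id = e.country_id
    GROUP BY c.name
"""))
print(totals)
```

The non-relational equivalent would repeat the country string in every event row, so scans touch more bytes and a rename means rewriting the whole big table instead of one lookup row.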

zxxh · 2024-05-27T13:31:54 1716816714

Anyone reading the above comment, ask yourself: do you actually think that when interest rates dropped to zero they suddenly invented a system that was better than SQL? “Horizontal scaling”? I’m sorry, I don’t speak marketing language. What is that? I’ve only been doing this for decades.