> Almost all big data storage solutions are NoSQL.
I think it's important to distinguish between OLAP and OLTP.
For OLAP use cases (which is what this post is mostly about) it's almost 100% SQL.
The biggest players are Databricks, Snowflake and BigQuery. Others include AWS's tools (Glue, Athena), Trino, ClickHouse, etc.
I bet there's a <1% market for "NoSQL" tools such as MongoDB's "Atlas Data Lake" and probably a bunch of MapReduce jobs still being used in production, but these are the exception, not the rule.
For OLTP "big data", I'm assuming we're talking about "scale-out" distributed databases, which are either SQL (e.g. CockroachDB, Vitess, etc.), SQL-like (Cassandra's CQL, Elasticsearch's non-ANSI SQL, InfluxDB's InfluxQL) or a purpose-built language/API (Redis, MongoDB).
I wouldn't say OLTP is "almost all" NoSQL, but definitely a larger proportion compared to OLAP.