Opinion: moreSQL is real

tom_b · on Jan 24, 2012

I love SQL (stay with me).

I cut my SQL teeth on SQL scripts written by programmers who were mainly Java jockeys. Lots of imperative code built around getting a result set and iterating over it.

I was learning relational programming by playing in SQL cmd lines. And getting massive speed-ups by pushing code back to the SQL engines and out of imperative loops.

I love this style of coding - when you squint just right, it looks (a bunch) like a REPL and exploratory development.

And for that last fact, I will always have a bit of a soft spot for SQL. ANSI, please. Host language interfaces on the side.

jcromartie · on Jan 24, 2012

I came here to say pretty much just this. SQL and declarative relational programming are extremely powerful. The new breed of DBs are throwing away something with a solid mathematical and theoretical foundation.

In fact, I wonder if a lot of the reaction against relational databases stems from a lack of understanding of the declarative style and the underlying relational algebra.

Most of the programmers I have worked with can write Java or C# code just fine, but they do just like you described when writing SQL, which makes SQL seem clunky and perform horribly. On the other hand, when you start thinking in sets, and learn to embrace the query optimizer (it does work) you can stop writing silly SQL code and write beautiful and elegant code that does exactly what you need 99% of the time.

I like Ruby, Clojure, Haskell, JSON, shell scripts, and dang it I like SQL.

zaphar · on Jan 24, 2012

Relational databases are awesome. SQL (the ansi standard) is pretty good too. But Rows suck. My data almost never exists as a row in the app. For me the siren call of NoSQL was storage systems that stored and returned my data the way it actually existed in my app. The vast majority of my CRUD apps don't need joins. They need to update an Entity that has a tree like structure. The biggest pain was deserializing that structure into a set of rows and then back again. It's a pain that ORM's tried to fix but caused additional problems. What I really want is a relational database with a sql like query language that isn't stuck in a world of rows.

cameronh90 · on Jan 24, 2012

Relational databases have their place, and lots of problems fit onto a relational database very well. But that doesn't mean SQL is a good language. SQL is poor language by many measures, and to make matters worse, nearly all implementations of SQL have all sorts of procedural extensions that just doesn't fit well with declarative relational programming. Also mathematical purity doesn't imply maintainable code. I'm also not really sure that real life SQL databases stick too closely to their mathematical foundations.

And yeah, the optimiser does work, if you know how to play it's games. Yesterday, I was working on a really slow query (13 seconds), and I restructured it in such a way that produced exactly the same output and shouldn't really have had any impact on the performance. However, suddenly the optimiser kicked in and started doing what it should have done in the first place, and the runtime went down to about 0.1s. Note that this isn't just a problem with SQL, you get the same kind of issues in anything declarative or functional, where you're relying on the compiler to select the appropriate algorithm. Awesome when it works... nightmare when it doesn't.

Though a more fundamental problem with current SQL relational databases is locking. If you want an entity from a document/object database, you only need to lock that object, but a given entity stored normally in a SQL DB may be scattered all over the database in lots of tables, requiring loads of various kinds of lock. Plus, SQL acquires locks implicitly in whatever order it feels like (which may be optimiser dependent), which causes deadlocks. Avoiding this requires careful planning and denormalisation, which kills of a lot of the advantages of using SQL in the first place.

jeltz · on Jan 24, 2012

Your gripe about locks seems quite exaggerated. At least compared to my own experience. Locks are a problem, yes, but not more so than most systems.

Most modern SQL databases support MVCC (multi-version concurrency control) which means they do not have to take any read locks, so your arguments really only applies to UPDATE where I do not think any database supports explicit order for the locking.

Deadlocks in databases are generally detected which causes one transaction to abort. So they are not as harmful as in some other environments.

Databases always in MVCC: PostgreSQL, MySQL+InnoDB, Oracle (if I remember correctly)

Databases with optional MVCC: MS SQL, SQLite

EDIT: Fixed some typos.

cameronh90 · on Jan 24, 2012

It's true that MVCC does mitigate some of the problems, but in a write-heavy OLTP database (such as the one I'm working on now) it's still a significant issue. I also am not aware of any database that supports explicit ordering for the locks, but it's only ever really an issue when using a relational database since you have to lock multiple tables for a single entity.

I agree that deadlocks aren't as harmful, but they are a pain and do hurt performance, so it's another thing you have to worry about.

leif · on Jan 24, 2012

MySQL with TokuDB handles write-heavy OLTP workloads well, even with multiple complicated indexes. Since it uses Fractal Tree indexing, writes are so fast that locks aren't held long enough for this to become a problem.

jeltz · on Jan 24, 2012

I agree with you about that writing write-heavy applications in SQL is much more painful than writing read-heavy. I have been hit by various types of deadlocks including the annoying foreign key deadlocks. In PostgreSQL 9.2 they are working on removing false (as in they are not necessary to preserve consistency) foreign key deadlocks so they should be less common if that patch lands.

weavejester · on Jan 24, 2012

I really don't like SQL.

It doesn't follow the relational model and the syntax is extremely hard to parse.

It was written in an era that was more worried about conserving computing resources than thinking about abstractions.

I've nothing against relational databases, but I wish SQL would hurry up and die :(

ericmoritz · on Jan 24, 2012

What would be your ideal implementation of such an abstraction?

weavejester · on Jan 24, 2012

Something that's easy to parse, and ideally adheres to the relational model. I'm not certain if the latter can be done in a performant manner, so I'd probably settle for a few compromises if they were well argued.

For example, a natural join between two tables could be represented as an S-expression:

    (join a b)

The actual format is probably irrelevant. You could represent it as JSON or XML or some binary protocol. What would matter is that you were transferring a standard structure (i.e. a list consisting of 'join', 'a' and 'b').

Compare this to SQL where you'd actually send a raw string:

    SELECT * FROM a NATURAL JOIN b

Although because SQL doesn't follow the relational model particularly closely, usually you'd write something more like:

    SELECT * FROM a JOIN b ON a.id = b.a_id

Both of these statements are much harder to parse than a S-expression represented by some known format.

As well as being harder to parse, it's also harder to combine SQL. For example, perhaps I want to add a selection to limit the join:

    (select (join a b) (= x 1))

Combining expressions in a S-expression is trivial, but in SQL it's more complex. You can't just wrap one expression in another - you'd want to extend an existing statement with a "WHERE":

    SELECT * FROM a JOIN b ON a.id = b.a_id WHERE x = 1

More complex expressions become a mess of subqueries, and very hard to parse. Man ORMs actually represent queries as ASTs internally and only convert them into SQL at the last moment. If we could sent relational queries directly to the database we'd save a lot of work.

jcromartie · on Jan 24, 2012

Sounds like ClojureQL ;) one of my favorite projects. Looks like Korma is winning with a "worse is better" approach.

batista · on Jan 24, 2012

"In fact, I wonder if a lot of the reaction against relational databases stems from a lack of understanding of the declarative style and the underlying relational algebra."

That, a thousand times.

dhruvbird · on Jan 24, 2012

You can find out more about tokutek here: http://duckduckgo.com/?q=tokutek.com http://www.quora.com/Andrew-Byde/answers/Tokutek

Tokutek · on Jan 24, 2012

To expand the MVCC list noted above, MySQL + TokuDB is always in MVCC as well (since version 5.0 last year). Also note that the Quora link describes Fractional Cascading. TokuDB uses Fractal Tree indexes. To learn more about Fractal Tree indexes as opposed to basic LSM, see here - http://www.tokutek.com/2011/09/write-optimization-myths-comp...