I used ORMs for a long time and wrote my own many years ago, so I understand them well, and I have learned to dislike them for most tasks. But I should have been more specific about what I mean by set-oriented.
The point of working with sets of tuples is that you can apply any set operation to a set of tuples and what you get is a set of tuples. You can use selection, projection or union and you still get a set of tuples. The recursive nature of this principle makes it so powerful and simple. Kind of like Lisp.
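To make the closure property concrete, here is a minimal Python sketch. It is not how any database implements relations; the helper names (`row`, `select`, `project`, `union`) and the sample data are made up for illustration. A relation is modelled as a frozenset of rows, and every operation takes relations and returns a relation, so the operations nest freely.

```python
# A relation is a frozenset of rows; a row is a tuple of (attribute, value) pairs.
# All names here are hypothetical helpers for illustration only.

def row(**attrs):
    # Sort by attribute name so equal rows have one canonical form.
    return tuple(sorted(attrs.items()))

def select(rel, pred):
    # Selection: keep the rows for which the predicate holds.
    return frozenset(r for r in rel if pred(dict(r)))

def project(rel, *names):
    # Projection: keep only the named attributes; duplicates collapse.
    return frozenset(tuple((k, v) for k, v in r if k in names) for r in rel)

def union(a, b):
    # Union: plain set union of two relations.
    return a | b

staff = frozenset({
    row(name="Ann", dept="eng", salary=120),
    row(name="Bob", dept="ops", salary=90),
})
contractors = frozenset({
    row(name="Cy", dept="eng", salary=100),
})

# The output of each operation is a valid input to the next one.
eng_names = project(
    select(union(staff, contractors), lambda r: r["dept"] == "eng"),
    "name",
)
```

The result `eng_names` is again a frozenset of rows, so it could be fed straight back into `select`, `project` or `union`.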
ORMs introduce a conceptual barrier between objects defined by domain classes created at design time and generic containers used to hold the result of projections and aggregate functions. If you join over two classes and project over some of their attributes you lose all the properties of the original classes, including any attribute accessor methods. Same thing with aggregate functions and grouping.
Even if the result of a query involving projections or aggregate functions happens to have the exact same attributes as another existing class, you don't get objects of that class and hence none of its methods. The tuples don't compare equal to objects of that class either because they don't share their type.
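The type mismatch can be mimicked in plain Python. This is only an analogy, not the API of any particular ORM: `Employee` is a hypothetical domain class, and `projected` stands in for the generic row that a projection typically hands back.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Employee:
    # Hypothetical domain class, defined at design time.
    name: str
    salary: int

    def monthly_salary(self) -> float:
        return self.salary / 12

stored = Employee("Ann", 120000)

# Stand-in for a projection result: same attributes, but a generic tuple.
projected = ("Ann", 120000)

# The values match...
same_values = (stored.name, stored.salary) == projected

# ...but the objects do not compare equal, because the types differ,
# and the generic row carries none of the class's methods.
same_object = stored == projected
has_method = hasattr(projected, "monthly_salary")
```

Here `same_values` is true while `same_object` and `has_method` are both false, which is the barrier described above.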
Different ORMs have various features to partially paper over these issues. I don't want to get caught up in a debate about what ORMs can or cannot do, because powerful ORMs like Hibernate can do almost anything and everything is optional.
The point is that they discourage me from working in a set oriented fashion because they make it hugely more complex than it needs to be. They create a conceptual mess and then they want me to learn a lot of tuning techniques to fix what is broken as a result of that mess. It makes no sense to me.
The current Ruby ActiveRecord is based on Arel, which is a relational algebra. Everything in Arel is necessarily set-oriented and the results of expressions in the ActiveRecord ORM language are the exact objects that went into the set, as long as they are truly set operations.
Same thing with aggregate functions and grouping.
Why do you expect the resulting object of an aggregate over source objects to have the same kinds of methods? That makes no sense at all. What should the 'name' of a record return when you've aggregated over the salary of many employees? That makes no sense in the original relational algebra either, whether expressed in SQL or in an ORM. Aggregating is not a set operation, and that is where part of the false dichotomy comes in: SQL isn't quite as pure and easy to understand as ORM opponents would have you believe.
Exactly, encapsulation as a combination of data and methods makes no sense if data is transformed, joined in different ways, grouped, etc. Some methods have to become invalid. Others, like accessors, may still be logically valid but are lost because they are linked inextricably to a class type even though they may depend only on the attribute value itself.
The goal of a query is not to process sets of objects. The goal is to answer a question. The ORM gives you two choices: either limit yourself to answers that can be expressed as objects of existing classes, or get something that is fundamentally different from all other domain objects, namely a generic tuple of values.
You lose the homogeneous, recursive transformation capability that the relational model provides.
In terms of simplicity, I suggest you take a few queries involving joins, group by, having and aggregates and translate them to procedural code. I've done that a lot. It's very eye-opening. Functional languages using homogeneous data structures like lists or generalized sequences can be similarly productive for data that fits into memory. That is a more and more viable alternative to SQL in my view, but OO systems and querying/transformation just clash badly.
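As a small instance of that exercise, the sketch below runs one join / group by / having query through Python's built-in `sqlite3` module and then redoes the same work procedurally. The `dept` and `emp` tables and their contents are invented for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical schema and data, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE emp  (id INTEGER PRIMARY KEY, dept_id INTEGER, salary INTEGER);
INSERT INTO dept VALUES (1, 'eng'), (2, 'ops'), (3, 'hr');
INSERT INTO emp  VALUES (1, 1, 120), (2, 1, 100), (3, 2, 90), (4, 2, 70), (5, 3, 60);
""")

# Set-oriented: average salary per department with at least two employees.
sql_result = conn.execute("""
    SELECT d.name, AVG(e.salary)
    FROM emp e JOIN dept d ON d.id = e.dept_id
    GROUP BY d.name
    HAVING COUNT(*) >= 2
    ORDER BY d.name
""").fetchall()

# Procedural: load the rows, then join, group, filter and sort by hand.
dept_names = dict(conn.execute("SELECT id, name FROM dept").fetchall())
salaries_by_dept = defaultdict(list)
for dept_id, salary in conn.execute("SELECT dept_id, salary FROM emp"):
    if dept_id in dept_names:                       # the inner join
        salaries_by_dept[dept_names[dept_id]].append(salary)   # the grouping

proc_result = sorted(
    (name, sum(vals) / len(vals))                   # the aggregate
    for name, vals in salaries_by_dept.items()
    if len(vals) >= 2                               # the HAVING clause
)
```

Both `sql_result` and `proc_result` hold the same two rows, for `eng` and `ops`. Even at this size the procedural version has to spell out the join, the grouping and the filter separately, and the gap tends to widen as queries grow.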
There's more to database programming than just concocting queries. Once you've built a query you want, you have to run it and handle the results somehow. Most object-oriented programs are going to use an object for this. So it makes sense to just write your whole program in terms of set operations on top of predefined objects which map pretty nicely to the database's idea of objects.
You are confirming what I said originally. What you call "handle the results" is quite frequently something that can be done in a set-oriented fashion using orders of magnitude less code.
But what ORMs encourage you to do is to load a bunch of objects and then use procedural code to do the real thing.
I fully understand that not all types of algorithms and data structures lend themselves to set-oriented thinking. I work mostly with those nowadays. But for those cases it makes no sense to use an RDBMS at all.
But what ORMs encourage you to do is to load a bunch of objects and then use procedural code to do the real thing.
The exact same thing happens when you express the query directly in SQL. You load a bunch of objects and post-process them to do what SQL can't do. Unless you count the various languages you can use in stored procedures in RDBMSs, but those aren't set-oriented SQL either.
If I want to do something that SQL can't do I don't use SQL. Using SQL is not an end in itself.
My rule of thumb is pretty simple. I use whatever takes fewer and/or simpler lines of code unless it's a lot slower.
The raging scalability debate is a different matter. I'm sure if you're Google there are good reasons to write more lines of code in order to scale better. I'm not Google so I can prioritize productivity over scalability.