It's not undefined behavior; it's that the uncorrelated subquery within the CTE doesn't specify an ordering, and therefore cannot be assumed to be implicitly materialized / evaluated once. The Postgres documentation is clear here [1]:
> If sorting is not chosen, the rows will be returned in an unspecified order.
The original query from TFA could've instead just had the uncorrelated subquery moved into a materialized CTE.
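Something like this, as a minimal sketch (job_queue and status are assumed names, not from TFA; AS MATERIALIZED requires PostgreSQL 12+):

    -- The materialized CTE is evaluated exactly once, so at most
    -- one id can ever reach the DELETE.
    WITH next_job AS MATERIALIZED (
        SELECT id
        FROM job_queue
        WHERE status = 'pending'
        LIMIT 1
        FOR UPDATE SKIP LOCKED
    )
    DELETE FROM job_queue
    WHERE id IN (SELECT id FROM next_job);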
I see what you're saying, but it is very subtle, wouldn't you agree?
Under a different plan, the inner query would only be evaluated once; it is hard to mentally parse how it will first find the rows with the group id and then pass them into the subquery.
And I am still not sure how using a CTE or not, in the manner in the post, is supposed to avoid this issue, so now I'm a bit skeptical it does. I do see how a sort would.
I'd hope that if the subquery were its own CTE, the limit would be applied correctly, but I am no longer sure… before this post I wouldn't have questioned it.
Edit to say - now I see you need to explicitly use AS MATERIALIZED if you bump the subquery to a CTE. Someone should really write a better blog post on this… it raises an interesting case but fails to explain it… they probably have not even solved it for themselves.
That is correct, and our curiosities are very much aligned here. I wrote this post in passing on a bus, so it def lacks some critical detail :D. I can very much solve this with an AS MATERIALIZED as you pointed out, and as also mentioned in https://news.ycombinator.com/item?id=43886522
I will look into a more thought out post perhaps. Thanks for sharing all the comments and feedback. Really enjoying the discussions.
Also, I believe using an `ORDER BY` only amounts to giving the planner a hint on the route to choose; it doesn't necessarily solve the issue the way `AS MATERIALIZED` does. The issue being - it should only delete, and return, the number of rows given by the value of `LIMIT`. So using `ORDER BY` and `LIMIT` will ensure the right set of IDs is returned, but it's not stopping the subquery from getting executed more than once.
It is still a good rule of thumb to pair your LIMITs with ORDER BY, though, yeah.
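For the record, the paired form would look roughly like this (job_queue and status are assumed names again):

    -- ORDER BY pins down which rows the LIMIT selects (modulo rows
    -- skipped as locked), but it does not stop the planner from
    -- re-executing the subquery.
    SELECT id
    FROM job_queue
    WHERE status = 'pending'
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED;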
It doesn't matter. The output of a query cannot depend on the plan; all plans should necessarily generate semantically equivalent output. Notice I say semantically equivalent because obviously SELECT random() can return a different number each time - but it should still be, semantically, one single random number.
In this case, the number of rows changing given the same input dataset is a bug.
It is allowed to be a planner choice though! Very surprising, but if you understand the docs, it follows that this is not actually a bug. It could change in the future, and apparently has changed in the past - but that change was to give the planner more opportunity to optimize queries, by not materializing parts of the query and instead inlining them into the parent.
Regarding SELECT [0]:

> A key property of WITH queries is that they are normally evaluated only once per execution of the primary query… However, a WITH query can be marked NOT MATERIALIZED to remove this guarantee. … By default, a side-effect-free WITH query is folded into the primary query if it is used exactly once in the primary query's FROM clause.

Regarding CTEs [1]:

> A useful property of WITH queries is that they are normally evaluated only once per execution of the parent query… However, the other side of this coin is that the optimizer is not able to push restrictions from the parent query down into a multiply-referenced WITH query.

[0] https://www.postgresql.org/docs/current/sql-select.html
[1] https://www.postgresql.org/docs/current/queries-with.html
Now, in either case - if you don't want the planner to inline the query - you may have to be explicit about it (since PostgreSQL 12, when CTE inlining was introduced); otherwise - yes, the output of the query will depend on the plan, and this is allowed based on the docs.
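To spell out the explicit forms, here's the docs' own illustrative big_table / key example pinned down both ways:

    -- PostgreSQL 12+ syntax for explicit control over CTE materialization.
    -- NOT MATERIALIZED lets the planner inline the CTE, so the
    -- "key = 123" restriction can be pushed down into the scan:
    WITH w AS NOT MATERIALIZED (
        SELECT * FROM big_table
    )
    SELECT * FROM w WHERE key = 123;

    -- AS MATERIALIZED keeps the CTE as an optimization fence,
    -- evaluated once on its own:
    WITH w AS MATERIALIZED (
        SELECT * FROM big_table
    )
    SELECT * FROM w WHERE key = 123;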
The output of a query absolutely will depend on the plan. In ANSI SQL you're right it cannot (up to ordering), which is why ANSI SQL doesn't have anything like random() or nondeterministic user functions. But nearly all databases support something that is outside of that standard because it's genuinely useful, and then you get into poorly-defined situations that you need to deal with ad-hoc as an implementer.
E.g., for a very simple case, in SELECT * FROM a,b WHERE random() < 0.5, would you push random() down through the join or not? To one table? Both tables? Evaluate it once at the start of a query because it depends on neither a nor b? What if it's random() < a.x? Different databases have different and often poorly-defined semantics here.
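To make the ambiguity concrete (a and b are just stand-in tables):

    -- If random() runs once per joined row, ~half the cross product survives.
    -- If it's pushed down to a (or b), each row of that table either keeps
    -- all of its join partners or none of them.
    -- If it's evaluated once for the whole query, you get all rows or none.
    SELECT * FROM a, b WHERE random() < 0.5;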
I wrote this to a deleted comment, but even if the CTE were materialized, the subquery of the CTE would still not be...
For instance, with the stand alone query:
DELETE … WHERE id IN (
SELECT id … LIMIT 1 FOR UPDATE SKIP LOCKED
)
the planner is free to turn that IN ( subquery ) into a nested-loop semi-join, re-executing the subquery as many times as it deems optimal. Therefore it can delete more than one row.
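You can at least inspect which shape the planner chose for a given instance (job_queue and status are assumed names; the plan you see depends on statistics and version):

    EXPLAIN
    DELETE FROM job_queue
    WHERE id IN (
        SELECT id
        FROM job_queue
        WHERE status = 'pending'
        LIMIT 1
        FOR UPDATE SKIP LOCKED
    );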
More to the point: the SQL standard doesn't talk about semijoins at all; conceptually, it evaluates the WHERE clause for each and every row. So the question is whether the subquery is guaranteed to give the same result each and every time. That would presumably depend on the transaction isolation level in use, except that SKIP LOCKED (which is nonstandard, as far as I know) presumably calls off all bets anyway.
[1] https://www.postgresql.org/docs/current/queries-order.html