> In Postgres, by contrast, any schema change, even adding an index with the CONCURRENTLY option, needs to at least momentarily hold a totally exclusive lock that blocks all writes and reads on the table.
It's true that every DDL statement requires a relation-level lock. Though the same is true for even a simple SELECT statement. The important detail is the lock strength for each variety of DDL, and how that affects and interacts with other queries (including other DDL):
CREATE INDEX (even without CONCURRENTLY) will not block reads, even for a moment (it just blocks write DML). CREATE INDEX (without CONCURRENTLY) won't even block other CREATE INDEX statements that run against the same table.
My guess is that your application appeared to exhibit this behavior, but the true problem was actually how several conflicting locks accumulated, which had the effect of blocking SELECTs for an unreasonably long time. A combination of CREATE INDEX and some other conflicting DDL that really does block reads (e.g., certain kinds of ALTER TABLE) can create the false impression that CREATE INDEX blocks reads in general. But that's not really the case at all. At worst, CREATE INDEX is only one part of the "traffic jam" that caused SELECTs to block in this scenario.
Huh. I was sure I was right about this but I tested and realised you are correct. CREATE INDEX (even with CONCURRENTLY) gets blocked by reads while waiting to acquire the lock it needs, but it doesn't block reads during that time.
I was getting confused because I'd seen my attempts to CONCURRENTLY add indexes time out waiting for locks, and had also experienced some of the other kinds of DDL that block reads while waiting for their lock (see https://dba.stackexchange.com/q/293992/18607), and I guess I just fallaciously assumed that index creation therefore also blocked reads without ever having tested it.
~Will edit my prior post to add a note about the error!~ Nope, doesn't seem I'm allowed to edit it any more.
CONCURRENTLY needs to wait for old transactions to go away, including those that haven't (and won't ever) touch the table that you're building an index on. So it's not waiting for a lock as such - it's waiting for older transactions to go away without conflicting in a way that can cause these "traffic jams". This can be a problem for the obvious reason, though generally only for the index build itself.
Tricky problems with relation-level locks tend to come from a combination of one lock request that is "generally non-disruptive but long-lived", and another lock request that is "generally disruptive/blocking but short-lived". I have heard of quite a few problem scenarios where these locks conflict with each other (usually by chance), leading to "generally disruptive/blocking and long-lived" -- which can be very dangerous. But that's fundamental to how lock managers work in general.
The Postgres implementation tries to make it as unlikely as reasonably possible. For example, autovacuum usually notices when something like this happens, and cancels itself.
It's true that every DDL statement requires a relation-level lock. Though the same is true for even a simple SELECT statement. The important detail is the lock strength for each variety of DDL, and how that affects and interacts with other queries (including other DDL):
https://www.postgresql.org/docs/devel/explicit-locking.html#...
CREATE INDEX (even without CONCURRENTLY) will not block reads, even for a moment (it just blocks write DML). CREATE INDEX (without CONCURRENTLY) won't even block other CREATE INDEX statements that run against the same table.
My guess is that your application appeared to exhibit this behavior, but the true problem was actually how several conflicting locks accumulated, which had the effect of blocking SELECTs for an unreasonably long time. A combination of CREATE INDEX and some other conflicting DDL that really does block reads (e.g., certain kinds of ALTER TABLE) can create the false impression that CREATE INDEX blocks reads in general. But that's not really the case at all. At worst, CREATE INDEX is only one part of the "traffic jam" that caused SELECTs to block in this scenario.