This isn't clear from the blog post, but we only use autocommit for read-only queries. Using autocommit is only really a major boost if you have a very high skew of reads to writes like we do (50:1).
Thank you for clarifying. Do you segregate autocommit based on whole chunks of application code (for example, front-end web servers are totally read-only at the connection level and therefore autocommit is fine) or is it more ad-hoc?
They specifically mentioned using it for read queries, where transactions are irrelevant and waste bandwidth and processing power on both the server and the client.
They're not irrelevant if you have parallel users modifying the database. In particular, you're only guaranteed a coherent snapshot for the duration of a transaction, so if you run two queries and someone sneaks in a modification between them, you'll see an inconsistent view of the data. Whether or not this matters will depend on what sort of aggregate data you're creating.
The cost of a short-lived read transaction is incredibly low, especially with Postgres. All the data for the last so many transactions will be in the database anyway until a vacuum happens. There's no CPU cost to read transactions that I'm aware of.
I thought that autocommit simply wraps each SQL statement in its own transaction? Wouldn't good programming practice be to ensure that everything that is supposed to be atomic occurs in a single, possibly large, SQL statement anyway?
Autocommit omits the begin/commit pair. This is 'implicit' transaction mode. There is no way to turn off transactions in postgres.
The performance savings come from avoiding the round-trip latency of the BEGIN TRANSACTION / COMMIT packets.
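Back-of-envelope, with made-up numbers: if BEGIN and COMMIT each take a separate round trip, a single small SELECT pays for three round trips instead of one.

```python
rtt_ms = 0.5    # hypothetical client<->server round-trip latency
query_ms = 0.2  # hypothetical server-side time for a small SELECT

explicit = 3 * rtt_ms + query_ms    # BEGIN, SELECT, COMMIT: three round trips
autocommit = 1 * rtt_ms + query_ms  # just the SELECT
print(explicit, autocommit)  # the explicit version takes more than twice as long
```

In practice some drivers batch BEGIN with the first statement, so the real savings depend on the client library, but for latency-bound single-select workloads the shape of the math holds.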
"Wouldn't good programming practice be to ensure that everything that is supposed to be atomic occurs in a single, possibly large, SQL statement anyway?"
The simplistic answer to that question is a resounding yes.
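For example (sqlite3 standing in for Postgres here; the table and values are invented), a transfer written as one statement is atomic even in autocommit mode, where the two-statement version would need an explicit transaction:

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 100)])

# One statement, so it commits (or fails) as a unit -- no BEGIN/COMMIT needed.
conn.execute("""
    UPDATE accounts
    SET balance = balance + CASE id WHEN 1 THEN -50 ELSE 50 END
    WHERE id IN (1, 2)
""")
print(conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall())
# [(1, 50), (2, 150)]
```

The limit of this approach is that not every multi-step update can be folded into one statement, which is where explicit transactions come back in.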
Do you have a source for how it can "lead to broken data?" It's my understanding that autocommit simply makes postgresql commit a statement automatically after it's submitted, unless it's wrapped in a transaction.
It's OK to just use transactions all the time when you don't need the performance, but those transactions come with a cost, and normally most queries in a system won't need them.
1) Code that depends on autocommit is hard to unit test, since you have to mock out whatever internal method autocommit calls
2) Code that depends on autocommit is hard to reason about, since you don't have a consistent view of your data, especially in multi-step update methods.
3) Because of 2, your updates will (not "may", "will definitely") be corrupted at some point, leaving broken data in the database. Comprehensive constraints help, but if you're relying on autocommit, chances are you're not using constraints very well either.
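On the constraints point, a quick sketch of the kind of guard that catches broken data at the statement level (sqlite3 here, but a Postgres CHECK constraint behaves the same way; the schema is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
# The CHECK constraint rejects invalid rows no matter how commits happen.
conn.execute("""
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.execute("INSERT INTO accounts VALUES (1, 100)")
try:
    # Would drive the balance negative; the constraint rejects the statement.
    conn.execute("UPDATE accounts SET balance = balance - 200 WHERE id = 1")
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)
print(conn.execute("SELECT balance FROM accounts").fetchone()[0])  # still 100
```

A constraint like this can't give you a multi-statement snapshot, but it does stop the worst "broken data" outcomes regardless of commit mode.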
2. autocommit is explicitly mentioned in terms of single select reads. Besides - if you use transactions in your update, it doesn't affect anything - postgres will do the right thing. Similarly, postgres will wrap single statement updates in transactions for you. Using it in multi-step procedures is a no-no - fortunately autocommit is per-connection, so if you are being careful about what connection you use, you can have both.
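A sketch of that per-connection split (Python's sqlite3 again standing in for a Postgres driver; with psycopg2 the equivalent knob is the connection's `autocommit` attribute):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "app.db")

# Read connection: autocommit, every statement stands alone.
reads = sqlite3.connect(path, isolation_level=None)
reads.execute("CREATE TABLE t (x INTEGER)")
reads.execute("INSERT INTO t VALUES (1)")  # committed immediately

# Write connection: implicit transaction, nothing visible until commit().
writes = sqlite3.connect(path)
writes.execute("INSERT INTO t VALUES (2)")  # still pending

print(reads.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 1 -- pending write invisible
writes.commit()
print(reads.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 2
```

Route single-select reads through the autocommit connection and multi-step updates through the transactional one, and you get the latency win without giving up integrity where it matters.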
3. Another strawman - using autocommit for single selects doesn't preclude using explicit transactions in places where data integrity is a concern.