This isn't clear from the blog post, but we only use autocommit for read-only queries. Using autocommit is only really a major boost if you have a very high skew of reads to writes like we do (50:1).
Thank you for clarifying. Do you segregate autocommit based on whole chunks of application code (for example, front-end web servers are totally read-only at the connection level and therefore autocommit is fine) or is it more ad-hoc?
They specifically mentioned using it for read queries, where transactions are irrelevant and waste bandwidth and processing power on both the server and the client.
They're not irrelevant if you have parallel users modifying the database. In particular, you're only guaranteed a coherent snapshot for the duration of a transaction, so if you run two queries and someone sneaks in a modification between them, you'll see an inconsistent view of the data. Whether or not this matters will depend on what sort of aggregate data you're creating.
The cost of a short-lived read transaction is incredibly low, especially with Postgres. All the data for the last so many transactions will be in the database anyway until a vacuum happens. There's no CPU cost to read transactions that I'm aware of.
I thought that autocommit simply wraps each SQL statement in its own transaction? Wouldn't good programming practice be to ensure that everything that is supposed to be atomic occurs in a single, possibly large, SQL statement anyway?
Autocommit omits the begin/commit pair. This is 'implicit' transaction mode. There is no way to turn off transactions in postgres.
The performance savings come from avoiding the round-trip latency of the BEGIN TRANSACTION / COMMIT packets.
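Back-of-envelope, with made-up numbers: if BEGIN and COMMIT each take a separate round trip, a single small SELECT pays for three round trips instead of one.

```python
rtt_ms = 0.5    # hypothetical client<->server round-trip latency
query_ms = 0.2  # hypothetical server-side time for a small SELECT

explicit = 3 * rtt_ms + query_ms    # BEGIN, SELECT, COMMIT: three round trips
autocommit = 1 * rtt_ms + query_ms  # just the SELECT
print(explicit, autocommit)  # the explicit version takes more than twice as long
```

In practice some drivers batch BEGIN with the first statement, so the real savings depend on the client library, but for latency-bound single-select workloads the shape of the math holds.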
"Wouldn't good programming practice be to ensure that everything that is supposed to be atomic occurs in a single, possibly large, SQL statement anyway?"
The simplistic answer to that question is a resounding yes.
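For example (sqlite3 standing in for Postgres here; the table and values are invented), a transfer written as one statement is atomic even in autocommit mode, where the two-statement version would need an explicit transaction:

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 100)])

# One statement, so it commits (or fails) as a unit -- no BEGIN/COMMIT needed.
conn.execute("""
    UPDATE accounts
    SET balance = balance + CASE id WHEN 1 THEN -50 ELSE 50 END
    WHERE id IN (1, 2)
""")
print(conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall())
# [(1, 50), (2, 150)]
```

The limit of this approach is that not every multi-step update can be folded into one statement, which is where explicit transactions come back in.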
Do you have a source for how it can "lead to broken data?" It's my understanding that autocommit simply makes postgresql commit a statement automatically after it's submitted, unless it's wrapped in a transaction.
It's OK to just use transactions all the time when you don't need the performance, but those transactions come with a cost, and normally most queries in a system won't need them.
1) Code that depends on autocommit is hard to unit test, since you have to mock out whatever internal method autocommit calls
2) Code that depends on autocommit is hard to reason about, since you don't have a consistent view of your data, especially in multi-step update methods.
3) Because of 2, your updates will (not "may", "will definitely") be corrupted at some point, leaving broken data in the database. Comprehensive constraints help, but if you're relying on autocommit, chances are you're not using constraints very well either.
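On the constraints point, a quick sketch of the kind of guard that catches broken data at the statement level (sqlite3 here, but a Postgres CHECK constraint behaves the same way; the schema is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
# The CHECK constraint rejects invalid rows no matter how commits happen.
conn.execute("""
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.execute("INSERT INTO accounts VALUES (1, 100)")
try:
    # Would drive the balance negative; the constraint rejects the statement.
    conn.execute("UPDATE accounts SET balance = balance - 200 WHERE id = 1")
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)
print(conn.execute("SELECT balance FROM accounts").fetchone()[0])  # still 100
```

A constraint like this can't give you a multi-statement snapshot, but it does stop the worst "broken data" outcomes regardless of commit mode.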
2. autocommit is explicitly mentioned in terms of single select reads. Besides - if you use transactions in your update, it doesn't affect anything - postgres will do the right thing. Similarly, postgres will wrap single statement updates in transactions for you. Using it in multi-step procedures is a no-no - fortunately autocommit is per-connection, so if you are being careful about what connection you use, you can have both.
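A sketch of that per-connection split (Python's sqlite3 again standing in for a Postgres driver; with psycopg2 the equivalent knob is the connection's `autocommit` attribute):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "app.db")

# Read connection: autocommit, every statement stands alone.
reads = sqlite3.connect(path, isolation_level=None)
reads.execute("CREATE TABLE t (x INTEGER)")
reads.execute("INSERT INTO t VALUES (1)")  # committed immediately

# Write connection: implicit transaction, nothing visible until commit().
writes = sqlite3.connect(path)
writes.execute("INSERT INTO t VALUES (2)")  # still pending

print(reads.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 1 -- pending write invisible
writes.commit()
print(reads.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 2
```

Route single-select reads through the autocommit connection and multi-step updates through the transactional one, and you get the latency win without giving up integrity where it matters.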
3. Another strawman - using autocommit for single selects doesn't preclude using explicit transactions in places where data integrity is a concern.