Looks like the AutoScaling groups are applications or stores where they do not n...

Looks like the AutoScaling groups are applications or stores where they do not need to coordinate their actions. The 3-az deployments seems to be APIs, which I am guessing scales with each regions (to reduce data cost?) and probably brought up/down automatically with Puppet to handle post-launch configurations. (you are reading it correctly).

I would guess testing is a skeleton version of the entire deployment so the cost is minimal and just need to test new deploys and verification for tests.

Staging probably wasn't a full mirror, at best I would venture to guess they had hot swaps coming up in staging and then being switched against production via ELBs.

They mention costs a few times in articles, so I would venture to guess they did optimize around many of those corners.