One of the first things I did at my current place was to add a Nagios check "Is the most recent backup size within x bytes of the previous one". It wasn't comprehensive, but was fast to implement and was later replaced with full restores and checks, but nevertheless a few weeks later it caught a perfectly formed tar.gz of a completely empty database due to a bug that would have continued creating empty archives until it was fixed.
This is a fantastic illustration of testing 101: often it's the dumbest possible checks catch huge errors. Get those in place before overthinking.
E.g. a team adjacent to mine years ago had a dev who made a one-character typo in a commit that went to production. Which caused many $MM to incorrectly flow out the door post-haste. The bad transactions were fortunately reversible with some work, but I was floored that there were no automated tests gating these changes. It wasn't a subtle problem. The most basic, boring integration test of "run a set of X transactions, check the expected sum" would have prevented that failure.
All these threads are complete gold mines of years of hard won experience and system administration tips! Fascinating, especially the war stories I'm reading - both hilarious and horrifying in equal measure.