The problem with these systems is that they don't know which state is significant, so they have to copy everything to the slave.
The way Remus gets around this is by bulk copying (up to 40 times a second) rather than copying on every change. So the master runs slightly ahead of the slave.
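To make the bulk-copy idea concrete, here is a rough toy sketch in Python. It is not Remus code: the `Primary`, `Backup` and `run` names are made up for illustration, and the checkpoint fires every `epoch_size` writes instead of on a timer, just to keep the example deterministic. The point is only that changes are batched per epoch, so the backup always trails the primary by at most one epoch.

```python
class Primary:
    """Toy 'master': tracks which pages changed since the last checkpoint."""

    def __init__(self):
        self.pages = {}     # page number -> contents
        self.dirty = set()  # pages written since the last checkpoint

    def write(self, page_no, data):
        self.pages[page_no] = data
        self.dirty.add(page_no)

    def checkpoint(self):
        """Return only the pages changed in this epoch, then reset tracking."""
        delta = {n: self.pages[n] for n in self.dirty}
        self.dirty.clear()
        return delta


class Backup:
    """Toy 'slave': applies whole checkpoints, never individual writes."""

    def __init__(self):
        self.pages = {}
        self.transfers = 0  # number of bulk copies received

    def apply(self, delta):
        self.pages.update(delta)
        self.transfers += 1


def run(writes, epoch_size):
    """Replay writes on the primary, checkpointing every epoch_size writes.

    epoch_size is a hypothetical stand-in for a timer-driven interval.
    """
    primary, backup = Primary(), Backup()
    for i, (page_no, data) in enumerate(writes, start=1):
        primary.write(page_no, data)
        if i % epoch_size == 0:
            backup.apply(primary.checkpoint())
    return primary, backup


if __name__ == "__main__":
    writes = [(i % 4, f"v{i}") for i in range(12)]
    primary, backup = run(writes, epoch_size=5)
    print("bulk copies:", backup.transfers)
    print("primary:", primary.pages)
    print("backup: ", backup.pages)
```

With 12 writes and an epoch of 5, the backup receives two bulk copies instead of 12 individual updates, and the last two writes exist only on the primary until the next checkpoint.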
Terracotta does something similar for the JVM. I think they get around it by exploiting the fact that the JVM knows what's going on, so for example you could say you want only one particular field of a class to be replicated. (But I've never used Terracotta, so someone might have to correct me on that.)
I imagine the performance hit is pretty substantial, but for things like VoIP or messaging servers this will make real HA possible on commodity hardware. Pretty cool.
This was also interesting: "we believe that the high-frequency checkpointing mechanism we have engineered in support of Remus will have many other interesting applications, ranging from forensics and error recovery tools based on replayable history to software engineering applications such as concurrency-aware time-travelling debuggers."