In SRE, a core message is that failure is inevitable. No matter how much you prepare, there will always be incidents you can't foresee. This doesn't mean preparation is useless, though. This talk will focus on one extremely valuable type of preparedness: having backups and restoration processes for the worst disasters. When your system experiences a total outage, an effective option is often to switch to a backup system before trying to solve the underlying issue, since this restores service as fast as possible. However, just making backup systems isn't enough.
This talk reveals the complacency and blind spots that surround backup systems. Many organizations feel comforted by having created backups, but aren't actually prepared to use them. You'll get practical advice on improving backup systems for organizations of all sizes. The talk looks at backup systems from the perspectives of making them more reliable, more robust, and more resilient, based on the definitions given by Dr. David D. Woods. To keep the advice inclusive, there won't be much technical detail. Instead, the focus will be on mindsets and strategies.
Black swan events are highly impactful incidents that are so unlikely or unimaginable that no effort is made to prepare for them. You'll learn how to conduct thought experiments of "meteor strikes" and other worst-case scenarios, such as ransomware, to feel ready for other problems you can't yet imagine. You'll also see how backup systems can still be useful for such disasters. This is how a resilient backup system is created: one that can still handle what falls outside your expectations.