Hardening Backups Against Ransomware
Human-operated ransomware represents a unique challenge to backup infrastructures. Unlike a hardware failure or a natural disaster, a ransomware attacker specifically targets and attempts to destroy backup systems to increase the likelihood that the victim organization will pay the ransom. This threat requires a different approach to securing backup infrastructure.
The Old Ways Are Not Enough
Traditionally, enterprise backup infrastructures were designed to address one or more of the following scenarios:
- Small-scale restorations of a single endpoint, server, database, or application in response to an isolated incident (e.g., database corruption, hardware failure, limited security compromise of endpoint, user error, etc.)
- Recovery from physical disaster affecting a single datacenter or cluster of proximate datacenters (e.g., due to fire, natural disaster, etc.)
- Long-term data retention based on legal or regulatory requirements
Protection approaches for these scenarios can be summarized by the ubiquitous "3-2-1 Rule": have at least three copies of data (production + two backups), stored on two different types of media, with at least one copy at a different location.
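To make the rule concrete, here is a minimal sketch of a 3-2-1 check in Python. The `DataCopy` type, the `satisfies_3_2_1` function, and the sample media and location names are all made up for illustration; they do not come from any backup product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    """One copy of a dataset. Production counts as a copy."""
    name: str
    media: str      # e.g. "disk", "tape", "cloud-object" (illustrative labels)
    location: str   # e.g. "seattle-dc1", "offsite-vault" (illustrative labels)


def satisfies_3_2_1(copies: list[DataCopy], primary_location: str) -> bool:
    """Check the three conditions of the 3-2-1 Rule as stated above."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.location != primary_location for c in copies)
    return enough_copies and enough_media and has_offsite


copies = [
    DataCopy("production", "disk", "seattle-dc1"),
    DataCopy("nightly-backup", "disk", "seattle-dc1"),
    DataCopy("weekly-tape", "tape", "offsite-vault"),
]
print(satisfies_3_2_1(copies, primary_location="seattle-dc1"))  # True
```

Notice what the check does not ask: whether the copies share a network or an identity system. A layout can pass 3-2-1 and still be reachable, all at once, by a single compromised administrator account.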
Traditional backup approaches fundamentally assume that any loss event will be geographically limited. With ransomware, however, this assumption no longer holds.
Why Ransomware Is Different
Ransomware is not limited by geography. Servers that physically reside in Seattle, São Paulo, and Singapore can all be corrupted together if they share a network.
Furthermore, ransomware attackers intentionally target backup systems in ways that tornados never will. Attackers attempt to find and destroy the backups before triggering their encryption malware, hoping to leave their victims helpless and force ransom payouts.
In many cases, the backups can be destroyed with the same ransomware encryption as everything else. In an Active Directory environment, for example, attackers will elevate privileges in the domain and then deploy the ransomware using common domain administration tools like WMI or Group Policy changes. They also perform discovery and lateral movement using protocols like RDP and SMB. So, if the backup systems are domain-joined Windows servers or if backup data is saved to network storage, it is likely that these backups would not survive the ransomware attack.
The attackers are also investing "hands on keyboard" time to gain access to the management consoles for backup systems, which may require little more than gaining Server Administrator privileges in the domain. Once inside the management console, attackers can often delete backup data through the built-in administration tools.
In an unprepared organization, it is not difficult for attackers to delete backups before triggering a ransom. Considering that the potential payoffs range from hundreds of thousands to tens of millions of dollars, attackers consider the additional effort to be worth it.
What Was That Password Again?
The other critical challenge we see is that organizations are not prepared to perform large-scale recovery even when backups do survive the initial attack. Again, this goes back to the flawed assumption that an organization would not lose everything at once. So, although critical line-of-business systems are successfully backed up, the systems they depend on may be overlooked.
Many of the organizations I've worked with realize that even if their backups did survive an attack, the passwords and encryption keys needed to access those backups would be destroyed by the ransomware. In other cases, backup system logins depended on having Active Directory available. And Active Directory could only be restored from (you guessed it!) the aforementioned backup systems. Fortunately, we discovered these problems during proactive hardening exercises with no damage done, but plenty of actual ransomware victims have suffered through these heartbreaking scenarios for real.
The Heart of the Issue
The ability to recover critical data and IT capabilities from backups is the last line of defense against catastrophic business losses due to ransomware and other enterprise-scale destructive cyberattacks.
Put more simply: The ability to recover from backups is the last line of defense against catastrophic losses from ransomware.
I hope this is something all readers will hear and echo beyond IT to your executive leadership. We are not just talking about whether the organization is forced to pay the ransom. The ransom is only a fraction of the total financial impact. This is also about the financial difference between rebuilding the organization in days versus in weeks or months. It is about reliable recovery versus unpredictable data losses and corrupted, untrusted system states. It is about salvaging the confidence of customers and the public, with clear ties to future revenues.
Critical Preparation Objectives
Organizations should assume that a successful ransomware attack will corrupt and take offline all core IT capabilities. Active Directory & other identity systems. File servers. Line of business applications. Databases. Internal networking. Password managers and Privileged Access Management (PAM) systems. Et cetera.
To successfully recover from this "everything is down" situation, an organization must achieve four key objectives before any attack:
- Perform Backups of Critical Systems: Regular backups of critical systems and data must be performed, including core infrastructure such as the organization's identity provider (e.g., Active Directory, etc.), DNS, DHCP, and related foundational capabilities.
- Harden Backups Against Destruction: Backup data must survive the ransomware or destructive cyberattack.
- Access Backups in an Emergency: The organization must be able to access the backups when production networks and critical IT infrastructure are down, encrypted, or otherwise inaccessible.
- Recover at Scale After a Disaster: The organization must be able to restore critical IT capabilities and datasets from backups at enterprise scale following a major incident.
Let's unpack these objectives a little bit.
1. Perform Backups of Critical Systems
The most fundamental objective is that the critical systems must be backed up in the first place. The systems need to be identified—including identity and administrative systems as well as line of business applications—and prioritized to guide recovery efforts. Recovery Time Objectives (RTOs) and Recovery Point Objectives (RPOs) should be established for this kind of scenario. This is often IT's opportunity to warn the business that RTOs and RPOs in an enterprise disaster like ransomware are typically at least an order of magnitude longer than the RTOs/RPOs established for smaller restorations under normal conditions.
2. Harden Backups Against Destruction
If the backup datasets do not survive the attack, all other preparations are pointless. Therefore, these datasets should be protected against deletion or destruction within the retention lifecycle. Any solution needs to consider both the potential for abuse of administrator deletion capabilities in the backup management interfaces and deletion or corruption at the file system or data storage level.
The surest protection against data destruction is to use immutable backup storage, by which I mean storage that cannot be deleted or destroyed during established retention timelines without direct physical access or multi-party, out-of-band intervention. For example, some organizations are finding renewed value in physical media like tape or disks that are rotated offline and stored securely. Newer developments include dedicated storage appliances that feature logical immutability mechanisms to prevent the premature deletion of data by any user. The other common solution is to use cloud storage that has retention locks or WORM (write once, read many) functionality enabled.
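The behavior that matters here can be shown with a toy in-memory model. This is only a sketch of retention-lock semantics, not any vendor's API: `WormStore`, `RetentionLockError`, and the 30-day retention period are invented for the example.

```python
from datetime import datetime, timedelta, timezone


class RetentionLockError(Exception):
    """Raised when a write or delete would violate the retention lock."""


class WormStore:
    """Toy write-once store: no overwrites, and no deletes before expiry."""

    def __init__(self, retention: timedelta):
        self._retention = retention
        self._objects: dict[str, tuple[bytes, datetime]] = {}

    def put(self, key: str, data: bytes, now: datetime) -> None:
        if key in self._objects:
            raise RetentionLockError(f"{key} already exists and cannot be overwritten")
        self._objects[key] = (data, now + self._retention)

    def delete(self, key: str, now: datetime) -> None:
        _, retain_until = self._objects[key]
        if now < retain_until:
            # Deliberately no role check: the lock applies to every user, admins included.
            raise RetentionLockError(f"{key} is locked until {retain_until.isoformat()}")
        del self._objects[key]


store = WormStore(retention=timedelta(days=30))
t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
store.put("ad-backup-2024-01-01", b"...", now=t0)

try:
    store.delete("ad-backup-2024-01-01", now=t0 + timedelta(days=5))
except RetentionLockError as exc:
    print("blocked:", exc)

store.delete("ad-backup-2024-01-01", now=t0 + timedelta(days=31))  # allowed after expiry
```

The design point is the missing role check in `delete`. If an administrator can switch the lock off from the same console an attacker would take over, the storage is not immutable in the sense used above.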
Immutable storage is the best answer to backup data protection. It also tends to be the easiest to implement and maintain.
However, if that is not an option, the next-best solution is to implement independent network and identity management for the backup infrastructure, isolating it so that a compromised production environment cannot contaminate the backups. Think along the lines of a red forest environment.
If neither immutability nor independence is available, then at a minimum, an organization should isolate backup infrastructure to protected network segments, apply stringent configuration hardening to the systems, and make sure that strong PAM controls are in place.
3. Access Backups in an Emergency
To access backups when production networks and critical IT infrastructure are down, system administrators will need some sort of offline or isolated "break glass" credentials to the backup storage systems. Remember, Active Directory is still down. Admins also need to consider any encryption keys that might be required for decrypting the storage or the backup files. Finally, the process for accessing backup systems under these conditions should be well-documented. How do administrators access the backup console directly? What are the IP addresses of the storage servers? Does a management server need to be rebuilt before data can be recovered from data stores? The middle of a crisis is not the time to be answering these questions.
All this information should be stored in a battle box. Not familiar with the term? A battle box is an offline collection of all the info required to perform emergency access and restoration of your IT systems. We generally suggest using a combination of portable digital storage (USB drives) plus hard copy documentation, all securely held by a small number of key administrators. Those encrypted USB flash drives with the physical PIN pad are a good option here. We've seen a few organizations use designated 'DR-only' laptops that are kept offline, while others have relied on isolated cloud-based document storage sites and cloud-hosted password managers as their not-connected-to-the-network battle box solution. Your mileage may vary, so choose an option in line with your organization's practices and risk tolerance.
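One way to keep a battle box honest is to audit it against a manifest. The sketch below assumes a hypothetical manifest that maps each item to the date it was last verified. The item names are drawn from the questions above, and the 120-day staleness threshold is my own assumption, not a standard.

```python
from datetime import date, timedelta

# Hypothetical required items, based on the access questions raised above.
REQUIRED_ITEMS = {
    "break_glass_credentials",
    "backup_encryption_keys",
    "backup_console_access_steps",
    "storage_server_ip_addresses",
    "management_server_rebuild_steps",
}


def audit_battle_box(
    manifest: dict[str, date],
    today: date,
    max_age: timedelta = timedelta(days=120),  # assumed threshold
) -> dict[str, list[str]]:
    """Report required items that are missing or were not verified recently."""
    missing = sorted(REQUIRED_ITEMS - set(manifest))
    stale = sorted(
        item for item, verified in manifest.items()
        if item in REQUIRED_ITEMS and today - verified > max_age
    )
    return {"missing": missing, "stale": stale}


manifest = {
    "break_glass_credentials": date(2024, 5, 1),
    "backup_encryption_keys": date(2023, 11, 15),
    "backup_console_access_steps": date(2024, 5, 1),
    "storage_server_ip_addresses": date(2024, 5, 1),
}
print(audit_battle_box(manifest, today=date(2024, 6, 1)))
# {'missing': ['management_server_rebuild_steps'], 'stale': ['backup_encryption_keys']}
```

The manifest should list only what is in the box and when it was checked, never the secrets themselves. The credentials and keys stay offline.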
4. Recover at Scale After a Disaster
The answer here is just like that old joke about how to get to Carnegie Hall: Practice, practice, practice.
Large-scale, from-the-ground-up infrastructure restorations are a completely different beast than the small-scale or single-system restorations that are more common. Even recovering an entire datacenter (assuming the organization has more than one) doesn't entirely compare to the complexities of recovering without any functional preexisting network to rely on.
Many organizations need to work up to a comfortable full-scale restoration. One suggestion is to run a recovery test every 3 to 4 months. Start basic with the first test, ensuring that Active Directory and basic networking and domain services can be restored. Use the test to update credentials and documentation in the battle box.
In the next test, recover domain and basic networking plus a couple of the next-most-critical systems, possibly a PAM system or a significant file server or database server. (Also, validate and update the battle box.) For the third test, restore all the above plus a few more systems, like some of the highest-priority line of business applications. (And... battle box.) Continue expanding the systems in scope for each test until all key systems can be recovered reliably. As dependency issues arise ("Apparently we need to restore Server Y before Server X..."), refactor the process and fix the documentation for next time.
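The "Server Y before Server X" problem is a dependency-ordering problem, and it can be worked out on paper before the next test. Here is a minimal sketch using `graphlib` from Python's standard library (3.9+). The system names and the dependency map are hypothetical; yours will come out of the tests themselves.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical dependency map: each system lists what must be restored before it.
restore_dependencies = {
    "core-network": set(),
    "active-directory": {"core-network"},
    "dns-dhcp": {"core-network"},
    "pam": {"active-directory"},
    "file-server": {"active-directory", "dns-dhcp"},
    "erp-database": {"active-directory", "dns-dhcp"},
    "erp-app": {"erp-database", "file-server"},
}


def restore_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group systems into waves; systems in the same wave can be restored in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if the dependencies are circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


for number, wave in enumerate(restore_waves(restore_dependencies), start=1):
    print(f"Wave {number}: {', '.join(wave)}")

# A circular dependency is caught before anyone starts restoring.
try:
    restore_waves({
        "backup-console": {"active-directory"},
        "active-directory": {"backup-console"},
    })
except CycleError:
    print("Circular dependency: fix this before the next test")
```

The second call models the trap described earlier, where backup logins depend on Active Directory and Active Directory can only be restored from the backups. A cycle like that has to be broken with break-glass credentials, not discovered during a real recovery.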
By following this approach, most organizations should be able to develop a solid and validated enterprise-scale recovery capacity within 18 to 24 months or so. The processes and key information will be available and proven. The required primary and secondary administrators will be identified, trained, and well-practiced. Finally, IT can justify to the business leadership how long a recovery of this magnitude will truly take, which will impact the leadership's risk outlooks and funding priorities.
In Conclusion: Do This
Take a hard look at your organization's backups. Will they be there for you in a disaster when you need them?
To summarize, the key takeaways are:
- Perform backups of critical systems.
- Harden backups against destruction.
- Access backups in an emergency.
- Recover at scale after a disaster.
Organizations that achieve these objectives with their backups will almost certainly be able to recover from those backups in a crisis, saving the company from potentially catastrophic financial and reputational impacts.
On the other hand, organizations that don't harden along these lines will have a hard time recovering from a ransomware attack. Ransom payments may be unavoidable. Direct and indirect financial losses will be extensive. Recovery will take weeks or months instead of days. Some permanent data loss is likely. None of us wants to be in that situation.
Ransomware stinks, so harden your backups. This guide should help take the mystery out of what to do and where to focus. Get after it.