Ensuring Effective Data Recovery: Tips for MSPs
If you’re like most responsible MSPs, you have a data backup and disaster recovery plan in place. But having a plan doesn’t necessarily mean that you’ll be able to meet your data recovery objectives when a disaster occurs. Several easily overlooked variables can prevent your disaster recovery plan from meeting your goals.
To ensure that your data recovery proceeds as you intend, you need to identify and test for the unexpected factors that can hinder data recovery. This article discusses some of the common challenges that can prevent effective data recovery and explains how MSPs can address each one.
Recovery Time Objective (RTO) Delays
Recovery Time Objective, or RTO, refers to the amount of time that a business can afford to wait for data to be restored before business operations are seriously disrupted by the missing data.
Identifying an RTO and designing processes to meet it is an important step in building an effective data backup and recovery plan. However, you don’t truly know whether your data recovery process is capable of meeting your RTO until you actually perform data recovery.
You shouldn’t wait until you’re recovering from a real disaster to find out whether your RTO can be met. Instead, test your RTO capabilities by periodically executing your data recovery plan under test conditions to see how long it takes to recover data.
Testing RTO is especially important because unforeseen issues such as network bandwidth problems or slow disk throughput could delay recovery time and impact your ability to meet RTO.
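A timed test restore can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the measure_restore function and its restore argument are hypothetical names, and restore stands in for whatever command or API call actually triggers a recovery in your backup tool.

```python
import time
from typing import Callable


def measure_restore(restore: Callable[[], None], rto_seconds: float) -> dict:
    """Run a restore callable and compare its duration with the RTO.

    `restore` is a placeholder for whatever starts your test recovery.
    """
    start = time.monotonic()  # monotonic clock is unaffected by clock changes
    restore()
    elapsed = time.monotonic() - start
    return {
        "elapsed_seconds": elapsed,
        "rto_seconds": rto_seconds,
        "met_rto": elapsed <= rto_seconds,
    }


if __name__ == "__main__":
    # Stand-in restore that just sleeps; the 4-hour RTO is an example value.
    result = measure_restore(lambda: time.sleep(0.1), rto_seconds=4 * 3600)
    print(result)
```

Recording the elapsed time from each periodic test makes it easier to notice when bandwidth or disk throughput starts pushing recovery closer to the RTO.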
File Permissions Inconsistencies
One common problem that can occur during data recovery is the loss of file permissions on data after it is restored. Such a loss can hinder your ability to restore normal business operations because file permissions are an important resource for enforcing access control and strengthening data security.
File permissions could be lost if, for example, you use different types of file systems or operating systems for your production data and your data backups. In other cases, backup and recovery tools may fail to copy all file permissions accurately when data is recovered from a backup location to production systems.
To an extent, you can control this risk by planning a data backup strategy that minimizes the chance of file permission inconsistencies. For example, backing up files using snapshots or disk images keeps file settings intact more reliably than copying files individually.
However, even that strategy is no guarantee that file permissions will remain intact during data recovery. Unexpected issues can still occur; for instance, if you use different versions of the same type of file system, you may experience issues with file permissions because one version of the file system lacks features available in the other.
For these reasons, MSPs should determine whether their backup tools and processes are preserving file permissions. If they aren’t, then a manual process for addressing the issue must be put in place.
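One way to check whether permissions survived a test restore is to compare the permission bits of each original file with its restored copy. The sketch below assumes a POSIX-style system and that both trees are mounted locally; the function name permission_mismatches is hypothetical. It compares mode bits only, so ownership and Windows ACLs would need a separate check.

```python
import os
import stat


def permission_mismatches(source_root, restored_root):
    """Return (relative_path, source_mode, restored_mode) for entries whose
    permission bits differ between the two directory trees."""
    mismatches = []
    for dirpath, dirnames, filenames in os.walk(source_root):
        for name in dirnames + filenames:
            src_path = os.path.join(dirpath, name)
            rel = os.path.relpath(src_path, source_root)
            dst_path = os.path.join(restored_root, rel)
            if not os.path.lexists(dst_path):
                continue  # missing entries are a separate kind of failure
            # S_IMODE keeps only the permission bits (e.g. 0o644)
            src_mode = stat.S_IMODE(os.lstat(src_path).st_mode)
            dst_mode = stat.S_IMODE(os.lstat(dst_path).st_mode)
            if src_mode != dst_mode:
                mismatches.append((rel, oct(src_mode), oct(dst_mode)))
    return mismatches
```

An empty list from a test restore suggests the mode bits were preserved; any entries returned point to the files that need manual correction.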
Missing Hidden Files
Failure to include hidden files within restored data is another common problem that results from the nuances of file systems and data recovery tools. What counts as a hidden file on one operating system may not be recognized as a hidden file on another. For example, on Windows, a file is hidden by setting a file attribute, whereas on Linux a hidden file or directory has a ‘.’ character prepended to its name.
Since hidden files may contain important data that you don’t want to lose following a disaster, including hidden files within your backup and recovery process is important.
As with file system permissions, you can mitigate the risk of overlooking hidden files by ensuring that your recovery tools are configured to recognize them and by not mixing different types of operating systems and file systems. It is also worth taking the time to perform periodic checks that identify hidden files missing from recovered data.
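A periodic check for missing hidden files might look like the following sketch. It assumes the Linux convention described above (names beginning with a dot) and that the source and restored trees are both accessible as local paths; missing_hidden_entries is a hypothetical name. Files hidden through the Windows attribute are not detected by this approach.

```python
import os


def missing_hidden_entries(source_root, restored_root):
    """List dot-prefixed files and directories (and anything inside a
    dot-prefixed directory) present in source_root but absent from
    restored_root. Paths are returned relative to source_root."""
    missing = []
    for dirpath, dirnames, filenames in os.walk(source_root):
        for name in dirnames + filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), source_root)
            # Hidden if any path component starts with a dot
            if not any(part.startswith(".") for part in rel.split(os.sep)):
                continue
            if not os.path.lexists(os.path.join(restored_root, rel)):
                missing.append(rel)
    return sorted(missing)
```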
Missing Temporary Files
In some cases, temporary files are an important type of data that you want to recover. Although many temporary files can be lost without causing problems, in some instances temporary files contain information such as the state of a database or application log data.
Most operating systems store temporary files in a given directory, such as /tmp/ on Linux and %USERPROFILE%\AppData\Local\Temp on Windows. This makes it easy to include temporary files within backups in most cases. However, the default storage location for temporary files could be changed, and individual applications may store temporary data elsewhere. For these reasons, running periodic tests to ensure that important temporary files are being backed up is important if you need to preserve this data.
Data Corruption
Just because data exists in a backup location does not mean all of it will be readable when you recover it. Disk problems, file system errors, network failures and other issues could cause some of your recovered data to become unreadable or inaccessible.
Backing up to a highly available storage location is one way to mitigate this problem, and you can run automated tests to check the integrity of backed-up data. However, the only way to be sure that recovered data is free of errors is to perform an actual recovery, then test the data.
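One common way to test recovered data is to record a checksum for every file at backup time and recompute it after a test restore. The following sketch assumes SHA-256 checksums and locally accessible directory trees; the function names and the in-memory manifest are illustrative choices, not features of any particular backup product.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


def verify_restore(manifest, restored_root):
    """Return {relative_path: problem} for files that fail verification."""
    problems = {}
    for rel, expected in manifest.items():
        try:
            actual = sha256_of(os.path.join(restored_root, rel))
        except OSError:
            problems[rel] = "unreadable or missing"
            continue
        if actual != expected:
            problems[rel] = "checksum mismatch"
    return problems
```

In practice the manifest would be built from production data when the backup is taken and stored alongside it, then passed to verify_restore after each test recovery.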
Malware
One type of data you don’t want to recover is malware.
Malware can be introduced to your data recovery and restore process in a number of ways. It might exist within the production data that you back up, or it may find its way onto your backup data.
You can run malware checks against data backups to help ensure that malware does not exist on backup storage. However, because some malware can lie dormant within backups and become active only when the data is restored to production systems, scanning the recovered data itself is the most reliable way to confirm that restored data is clean.
Conclusion
When recovering data, a lot can fail to go as expected. No matter how carefully you plan your data backup, it is well worth investing a little extra time to run test recoveries and disaster recovery tests and to check the results for unexpected issues. Data recovery testing helps you catch unforeseen problems before they really count: during recovery from a real disaster.