Three Preventive Backup and Restore Measures for MSP Success

Author: David Gugick, VP of product management at CloudBerry Lab

Scheduling your image and file backups across your environment is only the first step to a successful backup and disaster recovery strategy. You'll not only need to make sure all your backup plans are completing properly, but you'll also need to make sure you can restore needed data and image new servers within the recovery time objectives (RTOs) for your business. Remember, nobody cares about successful backups if you can't restore.

Let’s review three preventive measures you can implement today to help ensure an efficient backup and restore process that leaves you with happy customers.

1. Are your backup jobs running correctly?

Backup is often described as a set-it-and-forget-it type of IT activity. But that doesn’t mean it’s a hands-off operation after initial setup. What it does mean is that the level of administration required to ensure a successful implementation may be much lower than for other IT activities. With that in mind, it’s a good idea to check your backups regularly to ensure they are running without incident and in the time required.

Depending on your backup software, this may be easily done with built-in notifications that let you know when something has not completed according to specifications, or you may need to visit a backup monitoring screen periodically to see if there were issues. More likely, you'll use a combination of the two. Once you know about an issue, you're in a position to address it. The last thing you want is a restore request that cannot be fulfilled because backups were not running on the computer in question.

The important things to check are:

  • Are your backup jobs running? An offline computer may not generate any notifications, so you need to know when a computer is offline
  • Are your backup jobs completing without error? If not, you need to investigate why and remediate as quickly as possible
  • Are your backup jobs completing within the scheduled time? If not, you may be impacting network and broadband activity during times you’d prefer were free of that activity
  • Is your backup storage utilization within estimates? If you pay for cloud storage by the GB, you want to make sure you're using what you expected to avoid higher storage costs (for you and your customers)
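
The four checks above can be sketched in a few lines of Python. This is only an illustration, not the API of any particular backup product: the BackupJob record, its field names, and the 24-hour offline threshold are all assumptions, and you would fill the record from whatever reporting your backup software provides.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class BackupJob:
    # Hypothetical record; field names are illustrative, not a vendor API.
    computer: str
    last_seen: datetime      # last time the backup agent checked in
    last_status: str         # "success" or "error"
    duration: timedelta      # how long the last run took
    window: timedelta        # time allotted by the schedule
    storage_gb: float        # backup storage currently used
    estimated_gb: float      # backup storage you planned for


def check_job(job: BackupJob, now: datetime,
              offline_after: timedelta = timedelta(hours=24)) -> List[str]:
    """Return the list of problems found for one backup job."""
    issues = []
    # An offline computer sends no notifications, so detect silence explicitly.
    if now - job.last_seen > offline_after:
        issues.append("offline")
    if job.last_status != "success":
        issues.append("error")
    if job.duration > job.window:
        issues.append("overran window")
    if job.storage_gb > job.estimated_gb:
        issues.append("storage over estimate")
    return issues
```

An empty list means the job passed all four checks; anything else is a prompt to investigate before a restore request arrives.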

2. Can you restore a server image within your SLAs?

If a physical or virtual server crashes, you may not have time to manually set up a new server, install needed software, and restore needed files. If your SLAs (service level agreements) with customers require a faster recovery process, then you're likely running image-level backups to protect entire server volumes (operating system, installed software, and user files/databases). And if you do not have a duplicate physical server on standby, you probably need to restore to a virtual environment (VMware, Hyper-V, or the cloud). With SLAs in place, it is imperative to understand how long this process will take, and the best way to find out is to perform test restores of entire images.

If you're adhering to the 3-2-1 backup rule (three copies of your data, on two different types of media, with one copy offsite), then you have options to restore from local disk or from the cloud. Consider testing all the restore scenarios applicable to your business, which may include:

  • Local restore to a local physical or virtual server
  • Cloud restore to a cloud VM
  • Cloud restore to a local physical or virtual server

The restore times will vary by restore scenario, but they provide invaluable information. Once you know the restore times, this information can be clearly communicated to customers. We recommend you test the applicable restore scenarios periodically. Even quarterly tests will provide enough information to set everyone’s expectations. There's also another benefit: If restores are taking longer than needed, you can examine alternate backup locations and restore types to see if they better meet your SLAs.
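
As a rough sketch of how test results can be put to use, the Python below compares measured restore times against an SLA and picks the fastest scenario. The function names and scenario labels are made up for illustration; the measured times would come from your own test restores.

```python
from datetime import timedelta
from typing import Dict


def evaluate_restores(measured: Dict[str, timedelta],
                      sla: timedelta) -> Dict[str, bool]:
    """Map each restore scenario to True if its measured time meets the SLA."""
    return {scenario: elapsed <= sla for scenario, elapsed in measured.items()}


def fastest_scenario(measured: Dict[str, timedelta]) -> str:
    """Return the name of the scenario with the shortest measured restore time."""
    return min(measured, key=measured.get)
```

Running this after each periodic test makes it easy to see which scenarios no longer fit the SLA and which alternative is worth examining.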

3. Can you meet your recovery point objectives and retention policies?

This is something you likely discussed during initial backup setup, but it’s worth revisiting periodically. Your RPOs (recovery point objectives) define how much data, measured in time, a customer can tolerate losing, and your retention policies determine how far back that data can be restored. The lower the RPO, the more aggressive the backup schedule. The longer the retention, the more backup data is kept in storage. There are costs to both.
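
The trade-off can be made concrete with some simple arithmetic, sketched here in Python. The function names are hypothetical, and the storage estimate rests on a loud simplifying assumption: one full backup followed by incrementals of a fixed size, which real backup chains only approximate.

```python
def worst_case_data_loss_hours(backup_interval_hours: float) -> float:
    # A failure just before the next scheduled backup loses
    # everything written since the previous one.
    return backup_interval_hours


def meets_rpo(backup_interval_hours: float, rpo_hours: float) -> bool:
    # The schedule meets the RPO only if backups run at least that often.
    return backup_interval_hours <= rpo_hours


def restore_points_kept(retention_days: float,
                        backup_interval_hours: float) -> int:
    # Number of restore points that fit inside the retention period.
    return int(retention_days * 24 // backup_interval_hours)


def estimated_storage_gb(full_gb: float, incremental_gb: float,
                         restore_points: int) -> float:
    # Assumption: one full backup plus fixed-size incrementals.
    return full_gb + incremental_gb * max(restore_points - 1, 0)
```

For example, a backup every 4 hours with 30 days of retention keeps 180 restore points. Tightening the schedule to hourly lowers the worst-case loss to one hour but quadruples the number of restore points held in storage.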

Most of the time, this is a negotiation with the customer. But there may be other drivers: compliance and government regulations that change required retention policies, business changes that require lower RPOs, or requirements that deleted files remain in backup storage for a specified amount of time. To keep your customers happy, review the backup jobs and retention policies a couple of times a year so that everyone agrees on, and understands, how much data could be lost in the worst case and how far back data can be restored from backup storage.


Simple monitoring and testing of your backups and restores is key to setting expectations for both the business and customers. If you’ve done your testing and you know what customers need, you'll be in a strong position to deliver services that meet those needs.
