
Archiving to Amazon Glacier: How to Avoid Hidden Costs


There are numerous data storage solutions available today, and data is now one of the cornerstones of every company. But where should you store all your data and archives? Many consider Amazon Glacier an ideal choice: it is highly affordable, reliable and available. However, there are a few caveats to keep in mind while leveraging this cold storage solution.

Early Deletion

Amazon Glacier has some of the lowest prices on the market for storage. Though this is often viewed as a major incentive, there are additional costs that many people don’t take into account when signing on. The popular misconception is that Glacier is a good choice for day-to-day storage.

It is not.

Amazon Glacier was designed to be a cold storage solution, a way to archive data for long periods of time: years, rather than months. Amazon sets a minimum storage period for cold data of 90 days. If that minimum is not reached, or data is changed before that date, Amazon charges what is called an “early deletion fee.” To give a sense of the scope of this fee, deleting 1GB early can incur a charge equal to as much as half of what you pay to store that 1GB.
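
To make the arithmetic concrete, here is a minimal back-of-the-envelope sketch in Python. The per-GB price is an assumption chosen for illustration rather than current AWS pricing; the point is simply that the fee is prorated over the unused portion of the 90-day minimum.

# Back-of-the-envelope sketch of Glacier's early deletion fee.
# The storage price is an assumed figure for illustration, not current AWS pricing.
PRICE_PER_GB_MONTH = 0.004   # assumed Glacier storage price, USD per GB-month
MINIMUM_DAYS = 90            # Glacier's minimum storage period

def early_deletion_fee(gigabytes, days_stored):
    """Prorated charge for the unused remainder of the 90-day minimum."""
    remaining_days = max(MINIMUM_DAYS - days_stored, 0)
    return gigabytes * PRICE_PER_GB_MONTH * remaining_days / 30

# Deleting 1GB after 45 days still charges you for the remaining 45 days,
# which is roughly half of what the full 90 days of storage would have cost.
print(round(early_deletion_fee(1, 45), 4))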

Retrieving the Data

Amazon’s cold storage was designed with archiving in mind. Retrieval of this class of data is generally tied to compliance audits or other planned events. With Amazon Glacier, there are three tiers for retrieval (a request sketch follows the list):

  • Expedited. If you need your data quickly, it will naturally cost more. This tier is the most expensive, but retrieval takes only a few minutes.
  • Standard. From the very first days of the service, Glacier retrieval has typically taken 3-5 hours, and that timing now lives on as the Standard tier. Pricing falls in the middle of the scale.
  • Bulk. If you expect to retrieve large amounts of data and can afford to wait, this option is for you. It’s dirt cheap, though it takes up to 12 hours to pull data from storage.
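
As a rough sketch of how the tier choice is expressed in practice, the Python snippet below uses boto3 to request a restore of an object that was moved to Glacier through S3. The bucket and key names are placeholders; only the tier strings come from Amazon's restore API.

import boto3

s3 = boto3.client("s3")

# Ask S3 to restore a Glacier-class object and keep the restored copy for 7 days.
s3.restore_object(
    Bucket="example-archive-bucket",        # placeholder bucket name
    Key="backups/2018-05-weekly.tar.gz",    # placeholder object key
    RestoreRequest={
        "Days": 7,
        "GlacierJobParameters": {
            # "Expedited" (minutes), "Standard" (3-5 hours) or "Bulk" (up to 12 hours)
            "Tier": "Standard",
        },
    },
)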

Inventory Operations

There are two ways to store objects in Glacier:

  • Direct upload to a Glacier vault. For this you’ll create a vault within Amazon Glacier and upload objects directly (sketched after this list).
  • S3-Glacier lifecycle policy. You upload data to Amazon S3 and let a lifecycle policy convert it to the Glacier class. This way, your data remains in your original buckets, with the rules that govern Glacier objects applied to it.
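
For the direct route, a minimal boto3 sketch looks like the following. The vault name and file path are placeholders, and error handling is omitted.

import boto3

glacier = boto3.client("glacier")

# Create a vault (a one-time operation) and upload an archive straight into it.
glacier.create_vault(vaultName="example-vault")                # placeholder vault name

with open("backups/2018-05-weekly.tar.gz", "rb") as archive:   # placeholder file path
    response = glacier.upload_archive(
        vaultName="example-vault",
        archiveDescription="Weekly backup, May 2018",
        body=archive,
    )

# Keep the archive ID somewhere safe: a vault has no browsable file listing,
# so this ID is how you will refer to the archive later.
print(response["archiveId"])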

We recommend using S3-Glacier lifecycle policies. The setup might seem a bit more complicated, but it can help you avoid extra costs. When you change an object’s class from S3 to Glacier, all your files remain in place, and you can still see them and perform actions on them. If you upload your files directly to a Glacier vault, however, you are forced to perform an inventory operation just to list its contents, which can cost you a lot if you have millions of files to inventory.
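
The difference shows up as soon as you try to see what you have stored. The sketch below reuses the placeholder names from above: objects transitioned through S3 can be listed immediately, while a vault only yields its contents through an asynchronous (and potentially costly) inventory job.

import boto3

s3 = boto3.client("s3")
glacier = boto3.client("glacier")

# Objects moved to Glacier by a lifecycle policy stay visible in their bucket;
# their storage class simply reports as GLACIER.
listing = s3.list_objects_v2(Bucket="example-archive-bucket", Prefix="backups/")
for obj in listing.get("Contents", []):
    print(obj["Key"], obj["StorageClass"])

# A vault, by contrast, requires an inventory-retrieval job, which runs
# asynchronously and can take hours to complete.
job = glacier.initiate_job(
    vaultName="example-vault",
    jobParameters={"Type": "inventory-retrieval"},
)
print(job["jobId"])  # poll describe_job / get_job_output with this ID later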

Create a Lifecycle for Your Data

Using Amazon Glacier effectively involves creating a lifecycle plan for the data you store. You need to determine which data you will need access to and when. Amazon S3 offers four storage classes:

  • S3 is for files that you need right away. Though storage is more expensive than in the other classes, retrieving data is less expensive.
  • S3 Infrequent Access is for data that you will need in 30 days. It is more affordable to store data with this option, but there is an early deletion fee if you want to change or delete data within 30 days of storing it. It is also more expensive to retrieve data.
  • S3 One-Zone Infrequent Access has the same 30-day minimum storage time as S3 Infrequent Access, but it is designed for less important data.
  • Glacier is for data designated for long-term archiving.

With a bit of fine-tuning, your data’s lifespan will be seamless and won’t interfere with day-to-day operations. For example, your weekly backups might move from S3 to S3 IA after a month, and 30 days later they can automatically transition to Glacier, where they stay for another 180 days.
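
A lifecycle rule implementing roughly that schedule could look like the boto3 sketch below. The bucket name and prefix are placeholders, and the day counts simply mirror the example above.

import boto3

s3 = boto3.client("s3")

# Weekly backups: Standard for 30 days, then Infrequent Access, then Glacier,
# and finally automatic deletion once the archive period is over.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-archive-bucket",            # placeholder bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "weekly-backup-lifecycle",
                "Filter": {"Prefix": "backups/weekly/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 60, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 240},     # 60 days + 180 days in Glacier
            }
        ]
    },
)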

Don’t Forget About Your Data

Clients can easily cross the 10TB mark in a month or two. If you have lots of clients with that much data, your storage bill will climb. Create a storage structure that can be easily followed, set up a retention policy to automatically delete unneeded versions of files, and review your storage on a regular basis. If for some reason you lose a client, empty the vaults as soon as your contract allows.
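
One way to automate the clean-up of unneeded versions is a lifecycle rule on a versioned bucket. The sketch below assumes versioning is already enabled; the bucket name and the 30-day retention window are placeholders, and this call replaces the bucket's entire lifecycle configuration, so in practice it would be combined with rules like the one above.

import boto3

s3 = boto3.client("s3")

# Expire old object versions automatically so forgotten data does not pile up.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-archive-bucket",               # placeholder bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "prune-old-versions",
                "Filter": {"Prefix": ""},          # apply to the whole bucket
                "Status": "Enabled",
                # Delete noncurrent versions 30 days after they are superseded
                # (an assumed retention window; adjust to your contract terms).
                "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
            }
        ]
    },
)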

Over the years, Amazon Web Services has created one of the most impressive cloud compute and management solutions in the world. By leveraging it effectively, you can add value to your proposition. However, it’s important to understand the solution and how it relates to your needs in order to minimize the financial risks of misuse.


Alexander Negrash is director of marketing at CloudBerry Lab. Read more CloudBerry Lab blogs here.