Dark Data: What It Is and How to Manage It
Organizations today are amassing and storing more data than ever before. We truly are in the age of big data, which often presents as many challenges for growing companies as it does benefits.
Most of the data being gathered by your organization is going to be used to improve something about the way you do business. Whether it’s information about how your users are utilizing your product, results gathered from your marketing efforts, or internal statistics about your development processes, your company’s constantly growing data is a major asset that, with the correct analysis, can increase your bottom line.
But along with that valuable data, your company is almost certainly also storing an increasing amount of data that has no real tactical value at all. Gartner has dubbed this unmanaged information “Dark Data.” Sure, it sounds a bit dramatic, but realistically, storing an ever-growing amount of this unstructured information is costly and potentially risky, and some believe it could become a major speed bump along the big data highway.
Let’s take a look at what dark data actually is, how it could impact your organization, and what steps you can take to manage it.
What is Dark Data?
As with many buzz terms that float around the Web, the exact definition of “Dark Data” can be hard to nail down. Gartner, which originally coined the term, defines dark data as “the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes.”
By this definition, much (if not most) of the information your organization stores could be referred to as dark data. This is because, as useful as data can be, the majority of the information we tend to hold on to is simply collateral: we keep it in case we need to prove that something occurred in the past, but it is almost entirely obsolete for any other use.
The specific makeup of dark data varies widely from company to company, but any of the following could fall under this fairly broad term if they are outdated or unstructured:
- Customer Information
- Log Files
- Account Information
- Previous Employee Data
- Financial Statements
- Raw Survey Data
- Email Correspondences
- Notes or Presentations
- Old Versions of Relevant Documents
What’s the Problem with Dark Data?
Dark data brings several problems that grow more serious as time goes by. If you think of dark data as the clutter amassed inside the house of a hoarder, the first problem becomes obvious: Space. As that unorganized data continues to grow, it takes up storage that could otherwise be used for your valuable assets. More storage means more overhead costs, which, particularly in the era of big data, are already a significant concern in most organizations.
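To put a rough number on the storage concern described above, one option is to measure how much space is taken up by files that have not been modified in a long time. The following is a minimal sketch, not a definitive audit tool: the function name `stale_file_report` and the one-year threshold are illustrative assumptions, and last-modified time is only a crude proxy for whether data is still in use.

```python
import os
import time


def stale_file_report(root, max_age_days=365, now=None):
    """Return (stale_count, stale_bytes, total_bytes) for files under root.

    A file counts as stale when its last-modified time is older than
    max_age_days. The threshold is an illustrative assumption.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    stale_count = stale_bytes = total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            total_bytes += info.st_size
            if info.st_mtime < cutoff:
                stale_count += 1
                stale_bytes += info.st_size
    return stale_count, stale_bytes, total_bytes
```

Calling `stale_file_report("/path/to/share")` on a hypothetical shared folder returns the number of stale files, the bytes they occupy, and the total bytes scanned, which gives a starting point for estimating how much storage cost is tied up in data nobody has touched.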
Aside from increased storage costs, having large amounts of unstructured or unorganized data can lead to serious security risks. Along with outdated and seemingly useless documents, dark data will likely also contain sensitive, proprietary information. If you haven’t seen the news, data breaches like the one that just rocked Sony Pictures are becoming more and more common. Just because employees at your organization don’t want to take the time to go through piles of old information doesn’t mean that hackers aren’t willing to mine that data for years-old embarrassments that your company has been hiding in the basement.
On the other end of the spectrum, your organization may also be missing out on some great opportunities by allowing dark data to steadily build up in your database. Along with sensitive information that could be harmful in the case of a breach, there is likely a lot of untapped potential inside that mass of information. As with the hoarder and their overabundance of useless stuff, it’s difficult for your company to find the information that could be truly valuable amid a giant mass of unstructured legacy data.