Don't Be Afraid of the Dark... Data

It's the dawn of the living data and information growth is out of control; companies are ending up with terabytes or petabytes of information. Successful organisations run on, and depend on, information. However, information is only valuable to an organisation if it knows where the information is, what's in it, and what is shareable. If you don't know what you have and where it is stored, there could be evil lurking within, waiting to rear its ugly head. If you watch scary movies, you will know that getting rid of the antagonist can be a notoriously difficult task, often ending badly for many. If companies don't want to meet a sticky end, then they must attempt to exorcise the demons in their data.

The Evil Within

With the growth of information, many companies do not know what is lurking within. Estimates suggest that 80% to 90% of all information is 'dark data' - data that is collated, processed and stored but not used because organisations have no idea what's in it. Organisations employ people to create, digest and manage information, relying heavily on them to decide what should be kept, where and for how long. Yet, according to IDC, 75% of data today is generated and controlled by individuals with no responsibility for the overall data management strategy. Inevitably, employees do not proactively manage information and end up saving it wherever is convenient. This means data is spread across a variety of disparate repositories and platforms. Because of this, many companies find it difficult to ascertain exactly what information they have and where it is stored.

The problem with not knowing what you have is that you could have some really valuable information sitting idly on your systems. However, it is also a great place for unwanted data gremlins to hide, with unknown and unmanaged information carrying legal risk and associated costs. Imagine your organisation is faced with an eDiscovery request. What would happen if you didn't find all the relevant data, and it was later discovered that you didn't turn over some information that could have helped the other side's case? The judge can overturn an already decided case, issue an adverse inference and assign penalties, for example.

Another example of dark data causing havoc is the recent case in which NHS Surrey was fined £200,000 by the Information Commissioner's Office. It was fined because patient records were found on a second-hand computer bought through an online auction site. An employee had clearly saved files on their hard drive when they were not supposed to. This kind of practice leads to what many refer to as "underground archiving": individuals keeping everything in their own unmanaged local archives. These underground archives effectively hide the information from everyone else in the organisation, leaving it open to potentially monstrous fines.

Creating Your Own Demons

Exorcising 'evil' data isn't as easy as just deleting it. Deletion creates its own risks, and companies can bring their own demons to life. In fact, companies fear deleting data because they could inadvertently delete evidence and then be unable to furnish data requested as part of a legal or regulatory matter. Other reasons include not having defined policies for managing and disposing of electronic information and, conversely, having defined retention policies that actually keep all data indefinitely (usually because of the fear of spoliation).

However, on average only 1% of organisational data is subject to litigation hold, 5% is subject to regulatory retention and 25% has some business value, according to a survey conducted by the Compliance, Governance and Oversight Council in 2012. Together those categories account for 31%, which means that approximately 69% of an organisation's data store has no business value and could be disposed of without legal, regulatory or business consequences. So why fear deleting data?

The failure to dispose of information systematically boils down to poor information governance: there is no process in place to determine what information to keep and what to dispose of. This leaves organisations open to creating their own demons through retention policies that could come back to haunt them.

A well-known example of a company that created its own demons through its retention policy is Arthur Andersen. Its sudden retention policy activity caused unintended consequences during the Enron case. The Arthur Andersen partner famously sent an email message to employees working on the Enron account, reminding them to "comply with the firm's documentation and retention policy". The Andersen partner never ordered the destruction or shredding of evidence, but because future litigation was anticipated as highly likely, the implication of her email was to "get rid of suspect stuff". The timing of the email message was also suspect: just 21 minutes separated Ms. Temple's e-mail message to Andersen employees on the Enron account, about the importance of complying with the firm's document retention policy, from an entry in a record of her current projects in which she wrote that she was working on a case involving potential violations of federal securities laws.

This email was the beginning of the end for Arthur Andersen, which ceased to exist shortly after the case concluded. The firm was charged with and found guilty of obstruction of justice for shredding thousands of documents and deleting emails and company files that tied the firm to its audit of Enron. Less than a year after that email was sent, Arthur Andersen surrendered its CPA license on August 31, 2002, and 85,000 employees lost their jobs.

The Shining Light

Businesses must shed light on their dark data; just because its contents aren't immediately apparent doesn't mean it's not dangerous and - just like in the movies - the villain of the piece might not be the obvious candidate. Most companies try to manage data but are not doing a good enough job, leaving themselves open to all kinds of ghastly problems that they want to avoid. Companies do not want to be haunted by their data or their actions, and must implement a true information governance process, including a truly defensible disposal capability.

In these instances, an information governance process would have been capturing and indexing content, applying retention policies, protecting content on litigation hold, and disposing of content that is beyond its retention schedule and not on legal hold. A documented and approved process which is consistently followed and has proper safeguards goes a long way towards preventing the evil within from rising up and threatening the business.
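
For readers who like to see the rule written down, the disposal step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Record` fields, the `can_dispose` and `disposal_candidates` helpers and the idea of measuring retention in days are all hypothetical, not taken from any particular governance product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Record:
    """A hypothetical indexed item; the field names are illustrative only."""
    name: str
    created: date
    retention_days: int          # retention period assigned by policy (assumed)
    on_legal_hold: bool = False  # True if the item is under litigation hold


def can_dispose(record: Record, today: date) -> bool:
    """Return True only if the record is past retention and not on hold."""
    if record.on_legal_hold:
        # Content on legal hold is always protected, whatever its age.
        return False
    expiry = record.created + timedelta(days=record.retention_days)
    return today > expiry


def disposal_candidates(records: list[Record], today: date) -> list[str]:
    """List the names of records that could be defensibly disposed of."""
    return [r.name for r in records if can_dispose(r, today)]
```

The point of the sketch is the ordering of the checks: legal hold is tested first, so an expired retention period can never override it. That is what makes the disposal defensible rather than merely tidy.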

Businesses must move away from manual, employee-based information governance to automated information retention and management, with truly accurate and consistent meaning-based predictive information governance. To successfully automate this process, auto-categorisation applications must have the ability to conceptually understand the meaning in unstructured content so that only content meeting your information governance policies, regardless of language, is kept and stored correctly.

Through the use of technology that understands information in context, businesses can harness the expertise of all members of staff to uncover relationships otherwise hidden in big data, such as those between people, locations and companies, without an in-depth understanding of how the technology works. This avoids the need to build a team of specialists to go through the information, map it out and find the relationships. Instead, experts within the company can teach the technology to focus on what matters to the business.

Don't be afraid of the dark, but embrace the value dark data holds. Information governance can help companies keep their data demons away.