The year is 2004, and Google quietly releases a new offering called Gmail. It’s invitation only, and there is little fanfare. Techies pass invitations amongst themselves, and a cult following grows, but generally the world fails to take notice. The year is 2006, and Google launches Google Apps. Again, it’s seen as an interesting experiment but generally not taken seriously by the industry. Then small businesses and startups start noticing the service is free, and for small businesses, free is a critical differentiator. They can access their documents from the office, home, the road, an airplane or anywhere they find themselves. It’s collaborative; multiple users can edit the same document at the same time. This is how they work, and it gives them a competitive edge. The year is 2010, and Google announces that Google Apps is no longer in beta, and that almost 2 million companies are now using it. The year is 2015, and Gmail has just passed 900 million users and Google Apps has over 60 percent of the Fortune 500 as users.
The Cloud is not coming; it’s here. Despite the drawbacks and potential downsides, the value promised by cloud services is so compelling that companies big and small are all transitioning to it at an alarming rate. Possibly most alarming is not the number of companies using the cloud but the number that have fully adopted it, i.e., the entirety of their IT is cloud based. In 2015, this was 12 percent of all publicly held companies. This number is expected to rise to over 25 percent by 2020 and approach 70 percent by 2025. But that tells only half the story. By 2020, nearly 50 percent of all small businesses will be completely cloud based. To add some perspective to that time frame, the iPhone is just about to celebrate its 10-year anniversary in January 2017. In less time than the iPhone has been around, we are seeing a massive shift in corporate IT from traditional back office operations to an entirely cloud-based ecosystem.
This explosion in adoption is driven by the compelling arguments of the cloud: enhanced functionality, ubiquitous access, rapid deployment, increased capacity, reduced cost and ease of adoption, to name a few. Along with these benefits, there are important questions to be considered when adopting cloud services and determining the impact when responding to regulatory, investigative or legal discovery requirements. Evaluating a cloud provider needs to include an understanding of each provider’s ability to apply a legal hold, to identify and collect data for legal discovery, and to investigate such issues as employee misconduct or security incidents. Unfortunately, many of these questions aren’t considered until after the decision has been made and the migration is under way, or in many cases, after migration is complete. When the questions are asked, it’s a fait accompli, and there is very little that can be done, especially in terms of contractual language and service level expectations.
The following 10 key points need to be considered when evaluating and selecting a new cloud solution:
1. Access to Your Information
Fast and complete access to data stored in the cloud is key in responding to discovery, regulatory or investigative requests. Many cloud services limit the speed of downloading data, alter the metadata of stored content and limit your access to data in many ways. This can result in significant delays, increased cost and legal risk. When selecting a provider, it’s critical to evaluate both contractually and practically your ability to access your data. In addition to a thorough evaluation of the contractual terms, pre-purchase testing should assess how the solution will meet your company’s needs. Testing the speed of exports and confirming that required exported metadata remains intact is critical to avoiding risk.
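The pre-purchase test described above can be partly automated. The following is a minimal sketch in Python, assuming you have a local copy of the test documents you uploaded and a local folder holding the provider's export of them. The function names (`build_manifest`, `compare_manifests`, `export_rate_mb_per_s`) are hypothetical, and the only metadata checked here is the file system's last-modified time; a real evaluation would also need to compare whatever application-level metadata matters to your obligations.

```python
import hashlib
from pathlib import Path


def file_fingerprint(path: Path) -> dict:
    """Return the content hash, size and last-modified time of one file."""
    stat = path.stat()
    return {
        "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
        "size": stat.st_size,
        "mtime": int(stat.st_mtime),
    }


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its fingerprint."""
    return {
        str(p.relative_to(root)): file_fingerprint(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(source: dict, exported: dict) -> dict:
    """Report files that are missing, altered, or re-dated in the export."""
    report = {"missing": [], "content_changed": [], "metadata_changed": []}
    for name in sorted(source):
        if name not in exported:
            report["missing"].append(name)
        elif source[name]["sha256"] != exported[name]["sha256"]:
            report["content_changed"].append(name)
        elif source[name]["mtime"] != exported[name]["mtime"]:
            # Same bytes, different date: the export rewrote the timestamp.
            report["metadata_changed"].append(name)
    return report


def export_rate_mb_per_s(total_bytes: int, seconds: float) -> float:
    """Average export throughput in MiB per second."""
    return total_bytes / (1024 * 1024) / seconds
```

Running `compare_manifests(build_manifest(Path("uploaded")), build_manifest(Path("exported")))` on a representative sample (both folder names are placeholders) gives a concrete list of discrepancies to raise with the provider before the contract is signed, and timing the export of a known volume of data gives a rough basis for estimating how long a full collection would take.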
2. Lost in Translation
Data loss is of particular concern when there are regulatory or legal obligations. Migrating data into a cloud solution has the potential of altering that data or even losing it altogether. Metadata, such as the created and last modified dates and date or time stamps, may be altered; simple email address translations may change; or entire email messages may fail to migrate altogether. While some data loss may be acceptable, it’s important to know what will occur to ensure it will not negatively impact the company’s obligations. Additionally, the migration may cause new and misleading metadata to be introduced.
3. Ability to Apply a Legal Hold
Placing data on legal hold in a timely and complete fashion is an essential component of responding to regulatory, legal or investigative demands. Releasing unneeded data from an overbroad hold is much easier than trying to recover or replace data that was missed. Cloud services, such as Microsoft’s Office 365, offer solutions to place a custodian’s entire mailbox on legal hold or selectively place specific content on an “in-place” hold. This latter option is defined by a query using keywords, dates, senders and recipients, or other criteria. If the specific service doesn’t have the ability to place data on hold, it may be necessary to collect it to ensure preservation. This “collect to preserve” solution can be time-consuming and costly, and may introduce complexity in the process of managing and releasing data from hold.
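To make the idea of a query-defined hold concrete, here is a minimal sketch in Python of how such criteria might be combined. It is an illustration only and does not reflect Office 365's actual query syntax or API; the `Message` and `HoldQuery` types and the `select_for_hold` function are hypothetical names. The sketch assumes a message must satisfy every criterion that is specified, and that matching any one listed keyword or participant is enough for that criterion.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass(frozen=True)
class Message:
    sender: str
    recipients: Tuple[str, ...]
    sent: date
    text: str


@dataclass(frozen=True)
class HoldQuery:
    keywords: Tuple[str, ...] = ()      # any keyword may match
    participants: Tuple[str, ...] = ()  # any sender or recipient may match
    start: Optional[date] = None        # inclusive date range
    end: Optional[date] = None


def matches(msg: Message, query: HoldQuery) -> bool:
    """True if the message satisfies every criterion the query specifies."""
    if query.start and msg.sent < query.start:
        return False
    if query.end and msg.sent > query.end:
        return False
    if query.participants:
        people = {msg.sender.lower(), *(r.lower() for r in msg.recipients)}
        if not people & {p.lower() for p in query.participants}:
            return False
    if query.keywords:
        text = msg.text.lower()
        if not any(k.lower() in text for k in query.keywords):
            return False
    return True


def select_for_hold(messages, query: HoldQuery):
    """Return the messages that fall within the scope of the hold."""
    return [m for m in messages if matches(m, query)]
```

Because releasing data from an overbroad hold is easier than recovering data that was missed, a query like this is usually better drafted broadly (fewer criteria, wider date ranges) and narrowed later.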
4. Cloud Service Cancelled, No Longer Supported or Acquired
One of the biggest risks of using a third-party cloud provider is the potential for the provider to cancel the service or no longer support it or for it to be acquired. Cloud providers are always innovating, and as a result, many services are left to languish or to be cancelled altogether. This can introduce some significant challenges that may make your data difficult to migrate, export or even access altogether.
5. Know Where Your Data Is
One of the key components of responding to a discovery request or targeting an investigation is knowing where your data is located. How can you be certain all relevant data was placed on a legal hold or key systems in an investigation have been identified if you don’t know where your data is stored? Companies have traditionally used a process of data mapping to identify what types of data are stored on which systems. While this process still works for mapping data stored in the cloud, there is a layer of obfuscation introduced when working with cloud providers. You may not be fully aware of where your data is located, how it is stored and backed up, or even what country or jurisdiction your data resides in – and that introduces the possibility of cross-border regulations and laws.
6. Locating Custodial Data
Responding to discovery requests has traditionally been focused on identifying custodians and then collecting their data. This typically involved locating a custodian’s mailbox, computer, network share and other storage locations. The advent of cloud-based solutions has added a wrinkle to this process. These solutions introduce new ways to collaborate on documents, and such concepts as document owner, author, contributor and reader may mean different things. While many of these concepts were previously in use, cloud-based solutions have taken them to a new extreme. It’s important to understand how these new concepts may impact how data is identified and collected for individual custodians.
7. Does Delete Really Mean Delete?
A great benefit of using cloud services is the robust data protection features. Many cloud storage solutions retain multiple versions of a file and may even allow you to recover a file after it has been deleted. For example, Dropbox allows users to recover deleted files for up to 30 days. Paid Dropbox Pro accounts include extended version history (EVH), which allows recovery for up to one year. While this functionality is great for the user, it introduces some interesting challenges for data preservation and collection. Can these previous or deleted versions be put on a legal hold? Can they be collected? If all previous versions can be preserved and collected, does a corporation have an obligation to do so? These questions must be considered before selection, not after the decision has been made.
8. Knowing What Information Is Available
When responding to a discovery request, it’s important to preserve relevant data. When extracting from a cloud-based application for production, do you know what metadata is available and how it is being preserved during extraction? For example, Google Docs maintains a complete edit history for each document. This means you can replay all editing events, tagged with the author name, date and time for each individual character entered into the document, including when characters were deleted.
9. Producing Native Versions
Traditional productions use TIFF images of documents when responding to discovery requests. More recently, productions commonly include native files for documents. This raises the question, how does one produce a native file from Google Docs? Files can be exported out in Word’s .docx format, but is that a native Google Doc file? Does it retain all of the metadata available? Could it present inaccurate or misleading metadata? Do we need to add language to our productions stating that these documents are natively cloud based and have been converted to offline native documents for production? And if so, does that negate the purpose for producing natively in the first place?
10. What’s the Exit Strategy?
Thinking about how to dissolve the relationship up front is critical. Who owns the data, what responsibilities each party has to notify the other, and how the data will be returned to the owning party are all critical questions that need to be answered before a selection is made. Once you have a terabyte of data relying upon another company’s infrastructure, you don’t want to be put in a position to remove it quickly before they shut down the service.
This is not a theoretical warning. We are currently working with clients who find themselves in this exact position. One client has been notified by their cloud provider that they have until the end of 2016 before service is terminated; the provider offered no assistance to facilitate this process and actually pulled staff away from supporting the platform. Our client is now in the unenviable position of having data that is relevant to litigation, and stored on another company’s infrastructure, scheduled to be purged on January 1, 2017. They have very little control and a large amount of responsibility.
And that is really where we end up. In return for the lowered costs and enhanced features of cloud services, we are giving up control but retaining responsibility. As such, it can be difficult to assess the true risk associated with this change. Some services, such as Microsoft’s Office 365, may appear to be more expensive up front, but when accounting for risk and downstream expense associated with legal and regulatory requirements, they may be the wiser choice. This is especially true when accounting for unknowns, such as how to natively produce a Google Doc or how to explain metadata anomalies or retention of data when a provider has decided to exit the business.
If your company or client uses Google Docs or a similar system, you should seek expert assistance on how to produce data from your cloud repository to meet discovery needs. Conversely, if you are receiving native documents in a discovery production, forensic experts can help you evaluate and interpret the metadata associated with documents that may have originated from the cloud.
Let’s close with a few old-world adages updated for the modern cloud era: not all cloud providers are created equal; plan for the worst, hope for the best; and if it seems too good to be true, it probably is.
Published November 7, 2016.