The discovery process does not require a producing party to produce irrelevant documents in response to a request for production under Fed. R. Civ. P. 34. To the contrary, the scope of discovery is expressly limited to discovery that is relevant to any party’s claim or defense, or if good cause is shown, to the subject matter of the case. Fed. R. Civ. P. 26(b)(1). The purpose of discovery is to provide a mechanism for making relevant information available to the litigants. See 1983 Advisory Committee Notes to Fed. R. Civ. P. 26(b)(1). That purpose does not extend to irrelevant information.
Pursuant to Rule 26(b)(1), corporations that are producing parties have routinely objected and refused to produce irrelevant documents in response to discovery requests. Moreover, they have spent large sums of money on document review, in part, because they do not want to share irrelevant documents with the opponent out of fear that irrelevant documents could spawn unrelated litigation or that highly confidential, sensitive irrelevant documents could get into the hands of competitors. This is the case even if a receiving party were to agree to the entry of a confidentiality agreement or protective order preventing the use of the information in other litigation. Some producing parties have even attempted with mixed results to redact irrelevant information, such as information about products not at issue in the case, from relevant documents prior to production. ArcelorMittal Cleveland Inc. v. Jewell Coke Co., Case No. 1:10–cv–00362, 2010 WL 5230862 (N.D. Ohio Dec. 16, 2010).
Yet, as described below, in each of the three cases involving predictive coding that have achieved notoriety over the last year, the protocols proposed or agreed to by the parties, or mandated by the court, required that producing parties share the documents and responsiveness coding used to train and stabilize the system, including documents coded irrelevant. The Sedona Conference Cooperation Proclamation and various courts take the view that this type of cooperation and transparency is the best solution for e-discovery. See Da Silva Moore v. Publicis Groupe & MSL Group, Case No. 11-cv-01279, 2012 WL 607412, at *11 (S.D.N.Y. Feb. 24, 2012). The theory is that plaintiff’s counsel should have access to defense counsel’s initial relevance coding or training of the computer so that plaintiff’s counsel can provide feedback and otherwise ensure that the coding that will be projected across the population of the documents is accurate and consistent with plaintiff’s discovery requests.[1]
The big question is whether the production of irrelevant documents and transparency involved during the predictive coding training process is a trend corporations will embrace to achieve the cost savings available with predictive coding. Corporations can save significant sums of money with predictive coding because the manual review of documents for privilege and responsiveness is limited to the documents that the computer predicts are likely to be relevant after an iterative training process involving both human and computer elements. The number of documents reviewed as part of this process is typically a fraction of the documents collected because the documents that the computer deems likely to be irrelevant are not manually reviewed except for sampling to confirm that the documents are in fact irrelevant.
The potential for cost savings can be seen in the Da Silva case, where defendant had collected approximately 3,000,000 documents in response to discovery requests. 2012 WL 607412, at *3. Reviewing 10 percent to 50 percent of the collected documents, or 300,000 to 1.5 million documents, would cost defendant between $1.5 million and $7.5 million based on defendant’s affidavit attesting that it would cost $5 per document for document review. Id. But the protocol Magistrate Judge Andrew Peck approved required that, after the predictive coding training process was over, defendant manually review for privilege and responsiveness only the documents that the computer predicts are relevant. Assuming that the computer predicts 40,000 documents are likely to be relevant, then defendant would pay only $200,000 for manual document review.[2]
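The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers stated above ($5 per document, approximately 3,000,000 collected documents, a 10 to 50 percent manual review range, and an assumed 40,000 predicted-relevant documents); the function and variable names are illustrative and do not come from the opinion or the protocol.

```python
# Illustrative cost comparison using the figures quoted from Da Silva.
# All names here are hypothetical; only the numbers come from the text.

COST_PER_DOCUMENT = 5          # dollars, per defendant's affidavit
COLLECTED_DOCUMENTS = 3_000_000  # approximate size of the collection


def review_cost(documents: int, cost_per_document: int = COST_PER_DOCUMENT) -> int:
    """Return the manual review cost in dollars for a number of documents."""
    return documents * cost_per_document


# Manual review of 10 percent and 50 percent of the collection.
low_manual = review_cost(COLLECTED_DOCUMENTS * 10 // 100)
high_manual = review_cost(COLLECTED_DOCUMENTS * 50 // 100)

# Manual review limited to the documents the computer predicts are relevant
# (40,000 is the assumption used in the text, not a reported result).
predictive = review_cost(40_000)

print(f"Manual review, 10% of collection: ${low_manual:,}")
print(f"Manual review, 50% of collection: ${high_manual:,}")
print(f"Predictive coding, 40,000 documents: ${predictive:,}")
print(f"Difference: ${low_manual - predictive:,} to ${high_manual - predictive:,}")
```

The comparison covers manual review costs only; it does not account for the cost of training the system or of the sampling described in the protocol.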
Defendant in Da Silva was willing to embrace transparency and produce irrelevant documents during the predictive coding training process for cost-savings purposes. 2012 WL 607412, at *3. The protocol that defendant proposed, which Magistrate Judge Peck approved, requires that defendant initially review a sample set of 2,399 non-privileged documents for relevance and 8 issue codes and then share with plaintiffs the documents and coding for the entire random sample, including irrelevant documents, so that defendant could incorporate plaintiffs’ feedback into the coding. 2012 WL 607412, at *19. Thereafter, the protocol calls for defendant to review and code 7 iterations of groups of 500 computer-suggested documents and then provide the relevant and not-relevant documents and coding to plaintiffs for each iteration to obtain plaintiffs’ feedback and incorporate it. 2012 WL 607412, at *20. Finally, at the end of the process but before the manual review process begins, and as a quality control measure, the protocol requires defendant to review a random sample of 2,399 documents that the computer predicts are not likely to be relevant and then provide these documents to plaintiffs for feedback. 2012 WL 607412, at *21.
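To put the volume of shared documents in perspective, the three stages of the protocol can be tallied in a rough sketch. The tally assumes that the sample sets do not overlap and that every reviewed document is shared, so it is an upper bound rather than a count taken from the record; the variable names are illustrative.

```python
# Rough upper bound on documents shared with plaintiffs under the
# Da Silva protocol as described above. Assumes no overlap between sets.

INITIAL_SAMPLE = 2_399        # initial random sample coded for relevance
ITERATIONS = 7                # training iterations
DOCS_PER_ITERATION = 500      # computer-suggested documents per iteration
FINAL_SAMPLE = 2_399          # quality-control sample of predicted-irrelevant documents
COLLECTED_DOCUMENTS = 3_000_000  # approximate size of the collection

training_docs = ITERATIONS * DOCS_PER_ITERATION
total_shared = INITIAL_SAMPLE + training_docs + FINAL_SAMPLE
share_pct = total_shared / COLLECTED_DOCUMENTS * 100

print(f"Training iterations: {training_docs:,} documents")
print(f"Total shared (upper bound): {total_shared:,} documents")
print(f"Share of collection: {share_pct:.2f}%")
```

Under these assumptions the shared sets come to well under one percent of the collection, though how many of those documents are irrelevant depends on the coding results.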
Defendant’s transparency in the Da Silva protocol was necessary because Magistrate Judge Peck had made it clear to the parties in an early hearing that “if you do predictive coding, you are going to have to give your seed set, including the documents marked as nonresponsive, to the plaintiff’s counsel so they can say, well, of course you are not getting any [relevant] documents, you’re not appropriately training the computer.” See 2012 WL 607412, at *3. Magistrate Judge Peck had also expressly stated in his opinion that “MSL’s transparency in its proposed ESI search protocol made it easier for the Court to approve the use of predictive coding.” See 2012 WL 607412, at *11. Similarly, U.S. District Judge Andrew Carter’s affirmance of Magistrate Judge Peck’s decision was motivated, in part, by the fact that the protocol contained “built in participation by plaintiffs.” See 2012 WL 1446534, at *2 (S.D.N.Y. April 26, 2012).
Defendants in In Re Actos (Pioglitazone) Products Liability Litigation in the Western District of Louisiana appeared similarly willing to be transparent and share irrelevant documents during the predictive coding training process to achieve cost savings. In fact, the level of transparency in Actos, as reflected in the Case Management Order Protocol Relating to Electronically Stored Information (CMO) entered by the court, is unprecedented and goes beyond the sharing of irrelevant documents. See In Re Actos (Pioglitazone) Products Liability Litigation, MDL No. 6:11-md-2299 (W.D. La. July 27, 2012) (CMO), at *7-14. It requires each side to nominate three people as experts to work collaboratively to review the non-privileged documents together to train the system and make one relevance decision for the documents. Thus, defendants are not reviewing their own documents for relevance and seeking feedback from the other side. They are giving up their right to review their own documents during the predictive coding training process and agreeing to substitute a collaborative review, done jointly with plaintiffs’ counsel, of random sample sets of 500 documents during an assessment phase and then random samples of 40 documents during a training phase.[3] The transparency and collaboration continue even after the manual review process, as a quality control measure. The protocol allows for the collaborative review of a random sample of documents that defendants may withhold from production on relevance grounds during the manual review but which the computer predicts are relevant.
Defendants in Global Aerospace Inc. v. Landow Aviation, L.P. also appeared willing to be transparent and share some irrelevant documents during the predictive coding training process to achieve cost savings although defendants took a novel approach and added some precautions to avoid having to produce “sensitive” irrelevant documents. Case No. CL61040, 2012 WL 1419842, at *7-8 (Va. Cir. Ct. April 9, 2012) (Memorandum in Support of Motion for Protective Order Approving the Use of Predictive Coding).[4] Defendants’ proposed protocol requires that they code a sample set of documents for relevance and then provide the full set of coded training non-privileged documents, except for sensitive, irrelevant documents. Under the protocol, defendant is required to log any sensitive, irrelevant documents (in addition to privileged documents) that are withheld to enable opposing counsel to evaluate and possibly object to the coding decision. Thereafter, as part of a quality control process to test recall and prior to the manual review process, the proposed protocol also requires defendants to review and code statistically valid, random sample sets of both relevant and irrelevant documents and make all of these coded sample documents available to plaintiff except for sensitive, irrelevant documents.
It remains to be seen whether corporations will embrace predictive coding with the levels of transparency involved in the Da Silva, Actos and Landow matters. Some corporations will clearly be motivated by the potential cost savings. They may limit the matters in which they are willing to be transparent to those that they know are unlikely to involve the production of sensitive documents. Others may embrace the transparency because they figure that the volume of irrelevant documents to be produced during the predictive coding training process will be relatively small, and thus the risk low, or because they figure that the problem of producing irrelevant documents can be controlled with a protective order or confidentiality agreement.
Some corporations will not embrace this openness or collaboration with plaintiffs’ counsel as quickly or easily. At best, they may attempt to chip away at the production of irrelevant documents during the predictive coding training process by including provisions in protocols, like the Landow defendant did, that allow for the withholding of irrelevant documents that defendant independently deems sensitive. At worst, they may refuse to use predictive coding in this manner altogether, or they may attempt to use it behind the scenes without coordination with plaintiffs’ counsel, in much the same way that some corporations for the past several years have on their own, without plaintiffs’ involvement, selected and applied search terms to reduce the volume of documents to be reviewed. Under these circumstances, plaintiffs’ counsel do not demand to be involved in the search process at the Rule 26(f) conference or even inquire about the type of search protocol used. Corporations take the risk that, if questioned later, a court will find their search term selection process was reasonable even without plaintiff participation. Victor Stanley, Inc. v. Creative Pipe, Inc., Case No. MJG-06-2662, 250 F.R.D. 251, 259-63 (D. Md. May 29, 2008). The same principle would arguably apply to the use of predictive coding without plaintiff participation.
[1] Plaintiffs have never been involved in defendants’ initial training of human reviewers that has always taken place as part of the traditional document review process. This training has always been solely in the province and discretion of defendant and defense counsel as part of their responsibility to comply with their discovery obligations.
[2] Under the protocol, the number of documents that the computer can predict as relevant is unlimited but defendant is entitled to seek cost shifting from the court if the predictive coding technology predicts that more than 40,000 documents are likely to be relevant. 2012 WL 607412, at *21.
[3] There were some precautions taken as part of this collaborative review. Plaintiffs’ experts in Actos agreed not to disclose to their co-counsel, client or anyone else without written consent any information they were exposed to that would be subject to withholding or redaction. CMO, In Re Actos, at *7. Additionally, the collaboration process was to take place at defense counsel’s office and “plaintiffs’ experts and counsel shall not remove any of the control or training documents from defense counsel’s office, nor shall they be allowed to copy such documents.” CMO, In Re Actos, at *7-9.
[4] The Global Aerospace court approved the use of predictive coding in a short, one-paragraph order, but it is not clear from the order whether defendant’s proposed protocol was wholly adopted. Case No. CL61040, 2012 WL 1431215 (Va. Cir. Ct. April 23, 2012) (Order Approving the Use of Predictive Coding).
Published October 22, 2012.