The value of filtering & culling in eDiscovery

Gathering electronically stored information (ESI) for litigation is a common task for an IT department. Often this takes the form of a ‘collect everything’ approach, and with it huge volumes of data are sent out for processing and review. Without a pre-search step that filters this information, companies miss an excellent chance to streamline their practices and significantly reduce downstream discovery costs.

IT staff who find themselves on the business end of an eDiscovery data collection often have little guidance. Given the countless tasks on their plates, they are eager to get this ‘chore’ completed with minimal issues. The collection phase often becomes a matter of bulk exporting: off-loading entire mailboxes, downloading file shares wholesale, and copying entire hard drives to be sent out for further processing. While this method is quite straightforward, it also costs a litigant significant time, money and resources as the legal matter lumbers forward.

Collection Phase

The collection phase of eDiscovery is actually quite economical when compared with other steps in the process. The true costs reveal themselves a bit further down the road. According to one in-depth study, over 94 percent of eDiscovery costs come from processing and review. As noted by the ABA, even if the cost of processing data is manageable, the cost of reviewing it is not. Despite the proliferation of advanced software tools, rooting out relevant and privileged information in large sets of ESI remains expensive. The cost of processing and review, it turns out, hinges on the volume of data involved. All trends indicate that ESI volumes will only continue to increase as digital data creation expands exponentially.

Pre-searching, culling, filtering

Pre-searching, culling, filtering, and early case assessment (ECA) are different terms for the same idea: analyzing and reducing the volume of data collected for eDiscovery using cost-effective tools and methods before sending data for further processing and review. Doing this well requires legal oversight, communication and some familiarity with the matter, all of which a litigation preparedness team can help provide. There are also a variety of approaches and helpful software, such as Sherpa Software’s Altitude eDiscovery or Discovery Attender, that can be employed to efficiently shrink the data set, and therefore the cost burden.

The first of these methods is simple filtering based on targeting custodians who are relevant to the matter. Limiting data to a pre-determined, applicable date range is always recommended. Redundant, obsolete or trivial information (ROT data) such as daily news alerts, group emails celebrating cats or bacon, spam, newsletters and solicitations tends to clutter up data stores and bloat collections, yet it often has no bearing on a legal matter and rarely needs to be collected. Additional culling by targeting specific email domains, addresses, topics and keywords can often be performed for significant benefit.
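The filters described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Message` fields, the custodian names, the date range, the excluded domain and the keywords are all hypothetical placeholders, and a real matter would take its criteria from counsel and run them through a dedicated tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Message:
    """Hypothetical, simplified view of one collected email."""
    custodian: str
    sender: str
    sent: date
    subject: str


# Hypothetical criteria, assumed to have been agreed with counsel.
CUSTODIANS = {"asmith", "bjones"}
DATE_RANGE = (date(2020, 1, 1), date(2020, 12, 31))
EXCLUDED_DOMAINS = {"newsletter.example.com"}  # ROT senders
KEYWORDS = {"contract", "invoice"}


def is_in_scope(msg: Message) -> bool:
    """Apply custodian, date, sender-domain and keyword filters in turn."""
    if msg.custodian not in CUSTODIANS:
        return False
    if not (DATE_RANGE[0] <= msg.sent <= DATE_RANGE[1]):
        return False
    domain = msg.sender.rsplit("@", 1)[-1].lower()
    if domain in EXCLUDED_DOMAINS:
        return False
    subject = msg.subject.lower()
    return any(keyword in subject for keyword in KEYWORDS)


def cull(messages):
    """Return only the messages that pass every filter."""
    return [m for m in messages if is_in_scope(m)]
```

Ordering the cheap checks (custodian, date) before the keyword match keeps the sketch easy to read and means most out-of-scope items are rejected early.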

For files procured from hard drives, a comparison to the NIST list (the hash set of known software files published by NIST's National Software Reference Library) is one sanctioned method for removing unneeded program files that have no evidentiary importance. Additionally, many collections can safely exclude other executables, media files, databases, log files and more, as defined by the needs of the legal matter. Deduplication is another effective technique for quickly reducing a volume of ESI. Identical emails sent to all the custodians, or indistinguishable copies of files stored in dispersed locations, should be tracked, but reviewing each copy separately would waste time.
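Both ideas in this paragraph, excluding known program files and deduplicating the rest, come down to comparing content hashes. The sketch below is a minimal illustration rather than any product's method: it assumes SHA-256 as the hash, takes file contents as in-memory bytes for simplicity, and uses a hypothetical `known_hashes` set as a stand-in for a reference list of known software files.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a file's content."""
    return hashlib.sha256(data).hexdigest()


def cull_files(files, known_hashes):
    """Drop known program files and collapse exact duplicates.

    files: dict mapping a path to that file's bytes.
    known_hashes: set of digests to exclude (hypothetical stand-in
        for a reference list of known software files).

    Returns (unique, duplicates):
      unique     maps digest -> first path seen with that content
      duplicates maps digest -> list of the other paths, so every
                 copy stays tracked even though it is reviewed once
    """
    unique = {}
    duplicates = {}
    for path, content in files.items():
        digest = sha256_of(content)
        if digest in known_hashes:
            continue  # known program file, no evidentiary value
        if digest in unique:
            duplicates.setdefault(digest, []).append(path)
        else:
            unique[digest] = path
    return unique, duplicates
```

Keeping the `duplicates` map, rather than discarding the extra copies outright, reflects the point above that duplicates should still be tracked.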

As mentioned, the major benefit of filtering is a noticeable reduction of costs.  Pre-searching is helpful in other ways, starting with statistics and analytics. A litigation team needs solid information to prepare to handle the demands of meet-and-confer sessions.   Early case assessment becomes easier as more details are revealed about the data set during the collection phase. ECA also helps build a useful scope of the case.  This ranges from outlining the estimated size and cost of the responsive set to identifying further custodians, keywords and locations relevant to the matter.
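As one small illustration of the statistics a pre-search can produce, the sketch below tallies item counts and total size per custodian. The input shape, a list of `(custodian, size_in_bytes)` pairs, is an assumption made for the example; real tools report far richer analytics.

```python
from collections import defaultdict


def summarize(items):
    """Tally item count and total bytes for each custodian.

    items: iterable of (custodian, size_in_bytes) pairs
           (a hypothetical, simplified input format).
    """
    stats = defaultdict(lambda: {"count": 0, "bytes": 0})
    for custodian, size in items:
        stats[custodian]["count"] += 1
        stats[custodian]["bytes"] += size
    return dict(stats)
```

Figures like these give a litigation team a concrete basis for estimating the size of the responsive set before a meet-and-confer session.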

A further advantage of pre-search culling is that it allows for a rolling production. Instead of inundating a review team all at once with a large collection, filtering the data allows for a more measured transfer of information. Furthermore, a rolling production allows for adjustments to criteria before resources are deployed. Additional conditions can be applied if the initial result sets are either too broad or too constrained.

For defensibility purposes, it is important to track the criteria used in the pre-search as well as the reasoning behind deploying each filter. As noted in previous posts, “Good eDiscovery policy should include defensible deletion, litigation hold notices and good communication between stakeholders.” It goes without saying that a good information governance framework backed up by sound policy makes sorting through the mountains of information much easier, more effective and more defensible in a court of law.
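Tracking criteria and reasoning can be as simple as writing a structured record each time a filter is applied. The following is a minimal sketch of that idea, not a prescribed format: the `FilterRecord` fields and the helper names are hypothetical, and a real matter would follow whatever documentation standard counsel requires.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class FilterRecord:
    """One entry in a hypothetical pre-search audit log."""
    name: str
    criteria: dict
    rationale: str
    applied_by: str
    applied_at: str  # ISO 8601 timestamp, UTC


def record_filter(log, name, criteria, rationale, applied_by):
    """Append a timestamped record of a filter and why it was used."""
    entry = FilterRecord(
        name=name,
        criteria=criteria,
        rationale=rationale,
        applied_by=applied_by,
        applied_at=datetime.now(timezone.utc).isoformat(),
    )
    log.append(entry)
    return entry


def export_log(log):
    """Serialize the log to JSON so it can be kept with the matter file."""
    return json.dumps([asdict(entry) for entry in log], indent=2)
```

Recording the rationale alongside the criteria is the important part: it preserves the reasoning behind each filter, not just the filter itself.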

Even absent good policy, any practical organization should pre-search its ESI before sending a collection off to counsel for processing and review. Remember, the more gigabytes collected, the higher the cost. The solution? Reduce the gigabytes.

For more information on eDiscovery solutions, email us at information@sherpasoftware.com.
