Platforms that allow workers to collaborate on shared documents have become increasingly popular over the last decade. With the COVID pandemic forcing a sizable portion of the workforce to work remotely, the appeal of cloud collaboration has only grown.

A number of companies, including IBM, Procter & Gamble, Morgan Stanley and Allstate, rely on one such cloud collaboration provider, Box. Box is a file hosting service that allows for collaboration and file sharing, and provides document security and access control, as well as other tools for managing files that are stored on its servers.

While workers may be in disparate locations, their data is stored centrally in the cloud and accessed remotely. Users determine how their content can be accessed by their colleagues or even by users outside their organization. They may invite others to view or edit an account’s shared files, or upload documents and photos to a shared files folder (and thus share those documents outside Box), giving other users permission to view or alter them.

Box accommodates a wide variety of platforms, such as Windows and Mac OS, as well as a variety of mobile devices. A comparison of Box with other file-sharing services can be found here.

Despite the advantages of Box’s business model, the tool lacks some key capabilities. Its eDiscovery feature set in particular feels somewhat lacking. While Box offers data retention and legal hold, its preferred approach is to export data into a partner application.

Box offers two methods for searching content. The Metadata Query API enables you to programmatically find content within Box metadata. Using the Metadata Query API, you can:

  • Use a similar structure to SQL to pass a set of parameters and conditions
  • Retrieve matching files and folders along with the corresponding metadata
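
As a rough illustration of that SQL-like structure, the sketch below builds (but does not send) a Metadata Query API request in Python. The endpoint and the body keys (from, query, query_params, ancestor_folder_id, fields, limit) follow Box's public documentation as we understand it, so verify them against the current API reference before relying on them. The enterprise ID, the template key "contract" and the field name "contractValue" are hypothetical placeholders.

```python
import json
import urllib.request

# Endpoint as we understand it from Box's documentation; verify before use.
METADATA_QUERY_URL = "https://api.box.com/2.0/metadata_queries/execute_read"


def build_metadata_query(enterprise_id, template_key, min_value, folder_id="0"):
    """Build the JSON body for a metadata query (nothing is sent)."""
    scope = f"enterprise_{enterprise_id}"
    return {
        # Scope plus template key identify the metadata template to search.
        "from": f"{scope}.{template_key}",
        # SQL-like condition; :min_value is bound from query_params.
        # "contractValue" is a hypothetical field name.
        "query": "contractValue > :min_value",
        "query_params": {"min_value": min_value},
        # Only items under this folder are considered ("0" is the root).
        "ancestor_folder_id": folder_id,
        # Return the item name plus one metadata field with each match.
        "fields": ["name", f"metadata.{scope}.{template_key}.contractValue"],
        "limit": 50,
    }


def build_request(body, access_token):
    """Wrap the body in a POST request object without sending it."""
    return urllib.request.Request(
        METADATA_QUERY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the resulting request to urllib.request.urlopen with a valid access token would execute the query. The sketch stops short of that so it can be read and run without credentials.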

The built-in Search API allows you to sort through file and folder names as well as document text. However, according to the Box support site:

  • The indexing process is not instantaneous and therefore content is not immediately searchable. Indexing typically occurs within ten minutes but can take longer depending on system load.
  • The supported operations for metadata filters are limited. The Search API doesn’t perform comparisons on multiple metadata fields.[1]

For example, suppose you want to find all documents with the metadata template ‘Contract’ that:

  • have a value greater than $100,000
  • have a renewal date in 2025
  • are associated with the North American sales region
  • contain the term ‘Acme’ in the file name or document text

This is not supported.

You cannot combine fuzzy search (the document contains the term ‘Acme’) with Boolean expressions that match metadata fields in a single query.
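
To make the combined condition concrete, here is a minimal Python sketch of the same four criteria applied to records that have already been exported or retrieved. It does not call Box and is not a description of any product's implementation. The ContractDoc type, its field names and the sample records are hypothetical, and a plain case-insensitive substring test stands in for fuzzy search.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ContractDoc:
    """Hypothetical record: file name, extracted text and contract metadata."""
    name: str
    text: str
    value: float
    renewal_date: date
    region: str


def matches(doc, term="Acme"):
    """Return True when the metadata conditions AND the text condition hold."""
    needle = term.lower()
    metadata_ok = (
        doc.value > 100_000
        and doc.renewal_date.year == 2025
        and doc.region == "North America"
    )
    # Substring match on name or text; a simplification of fuzzy search.
    text_ok = needle in doc.name.lower() or needle in doc.text.lower()
    return metadata_ok and text_ok


def find_matches(docs, term="Acme"):
    """Filter a collection of records down to those meeting every criterion."""
    return [doc for doc in docs if matches(doc, term)]


# Hypothetical sample records, for illustration only.
SAMPLE_DOCS = [
    ContractDoc("Acme renewal.docx", "Master services agreement",
                250_000, date(2025, 6, 30), "North America"),
    ContractDoc("Supplier terms.pdf", "Agreement with Acme Corp",
                90_000, date(2025, 1, 15), "North America"),
    ContractDoc("Globex contract.pdf", "No relevant mention",
                500_000, date(2025, 3, 1), "North America"),
]
```

Calling find_matches(SAMPLE_DOCS) keeps only the first record: the second fails the value threshold and the third never mentions the search term.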

When responding to requests for litigation and investigation data, Box’s restrictions can hinder or complicate the eDiscovery process.

To meet these challenges, Sherpa Software has extended the capabilities of our premier information governance tool, Altitude IG, to include the Box platform. Altitude is a comprehensive yet simple-to-use integrated data governance platform that allows you to search and access data spread throughout your organization, including network file shares and user desktops. Altitude helps you comply with regulatory requirements by identifying and classifying documents that may contain sensitive data covered by the General Data Protection Regulation (GDPR) or the California Consumer Privacy Act (CCPA).

Sherpa Software solutions can uniquely help you discover sensitive data whether it’s stored on a cloud-based platform such as Box, in email, on network shares or on employee laptops. Our software can help you inventory and classify data for appropriate remediation.

Contact us today so we can help you adjust to today’s increased number of remote workers and the resulting increased organizational data exposure.

[1] https://support.box.com/hc/en-us/articles/360050013874-Understanding-the-Metadata-Query-API-and-Search-API