I recently returned from the ARMA 2014 show in San Diego. Aside from a chance to visit this beautiful city, it was a great opportunity to connect with our customers and industry professionals within the records management community, and also to learn about the information governance challenges that they are facing on a daily basis. In many of those conversations, the need to effectively manage content residing on network file shares was a recurring theme. Often, organizations have well-formed policies in place to manage the lifecycle of electronic content contained in areas such as email and ECM repositories; however, network file storage has been largely untouched from a records management perspective, and this content is particularly problematic for the following reasons:

  1. Most (if not all) employees are assigned a ‘home drive’ storage area on the network for business-related documents, so a lot of content tends to accumulate in these locations.
  2. This content has been created by many different source applications, so it is often difficult to police from a records management point of view.

Sherpa has been working with our clients to develop technology that helps them manage these network file storage areas. Today, our Altitude IG platform offers agents that can be deployed to monitor file content on either network file shares or user hard drives. These agents gather file metadata, search for particular keywords and statistically analyze content (for example, by file age). While this level of oversight can be helpful, it does not fully address one of the fundamental questions records managers often ask: “Is the information important?” To fully answer that question, it is necessary to analyze the content of the documents.

That is why we are excited to announce our newly formed partnership with Content Analyst, which has pioneered the use of latent semantic indexing (LSI) technology in its CAAT product. CAAT eliminates the time-consuming requirement for knowledge workers to focus on content classification or categorization by taking an entire collection of information and automatically sorting it into folders and subfolders by conceptual topics, even creating titles for each folder. This quickly organizes information in a logical fashion based on what it is about, not the words it contains. This analysis is accomplished by LSI, a machine-learning technique that enables CAAT technology to identify, represent and compare concepts that exist within a collection of documents or data.

Sherpa is currently integrating CAAT into our Altitude IG platform to provide automatic classification and categorization capabilities. Once that integration is complete, customers will be able to point their Altitude IG agents at a network file share and automatically classify the documents based upon their content (e.g., contracts, engineering specifications, or lunch menus).
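For readers curious about what "based on what it is about, not the words it contains" can look like in practice, here is a minimal Python sketch of the general LSI idea. It is not CAAT's or Altitude IG's implementation or API. The sample documents, the function names, and the choice of two latent dimensions are all assumptions made for illustration. To stay dependency-free, it uses raw word counts and a simple power iteration in place of the weighted term matrices and library SVD routines a production system would more likely use.

```python
import math
import random
from collections import Counter

# Hypothetical sample documents (illustration only, not real customer data).
DOCS = [
    "contract vendor agreement",
    "vendor agreement renewal terms",
    "renewal terms signature",
    "lunch menu soup salad",
    "lunch menu sandwich",
]


def count_vectors(docs):
    """Turn each document into a vector of raw word counts."""
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    return [[c[word] for word in vocab] for c in counts]


def cosine(a, b):
    """Cosine similarity of two vectors (0.0 if either one is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_eigenpairs(matrix, k, iterations=300, seed=0):
    """Top-k eigenpairs of a symmetric matrix: power iteration plus deflation.

    Assumes the matrix has at least k non-zero eigenvalues.
    """
    rng = random.Random(seed)
    n = len(matrix)
    work = [row[:] for row in matrix]
    pairs = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(n)]
        for _ in range(iterations):
            w = [sum(work[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        eigenvalue = sum(
            v[i] * work[i][j] * v[j] for i in range(n) for j in range(n)
        )
        pairs.append((eigenvalue, v))
        # Deflate: remove the component just found so the next pass finds the next one.
        for i in range(n):
            for j in range(n):
                work[i][j] -= eigenvalue * v[i] * v[j]
    return pairs


def lsi_vectors(docs, k=2):
    """Project documents into a k-dimensional latent 'concept' space.

    The document-by-document Gram matrix has the same eigenvectors as the
    right singular vectors of the term-document matrix, so scaling them by
    the square roots of the eigenvalues gives the usual LSI coordinates.
    """
    vectors = count_vectors(docs)
    gram = [[sum(x * y for x, y in zip(a, b)) for b in vectors] for a in vectors]
    pairs = top_eigenpairs(gram, k)
    return [
        [math.sqrt(max(val, 0.0)) * vec[d] for val, vec in pairs]
        for d in range(len(docs))
    ]


raw = count_vectors(DOCS)
latent = lsi_vectors(DOCS, k=2)
print("raw cosine, doc 0 vs doc 2:", round(cosine(raw[0], raw[2]), 3))
print("LSI cosine, doc 0 vs doc 2:", round(cosine(latent[0], latent[2]), 3))
print("LSI cosine, doc 0 vs doc 3:", round(cosine(latent[0], latent[3]), 3))
```

In this toy collection, the first and third documents share no words, so a plain word-overlap comparison treats them as unrelated. Because both overlap with the second document, the latent space places them on the same "contract" concept, well away from the lunch menus.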
This classification process also reveals the relationships between documents, helping managers uncover hidden insights and ultimately apply policies to govern documents based upon their content.

If you think that automatic classification and categorization could benefit your organization and would like to participate in our technology preview program for these Altitude features, please contact me either by email (rwilson@sherpasoftware.com) or on Twitter (@sherparick). In the meantime, stay tuned to this blog for more information about how classification and categorization will extend the capabilities of Altitude IG.