Anyone researching Information Governance (IG) will encounter the topic of File Analysis. Initially, these disciplines were related but not equivalent. However, as technology grows more all-encompassing, a new class of tools has emerged that offers end-to-end file analysis, classification and remediation (FACR). These processes form an essential building block of an effective IG strategy. Solid information lifecycle management, however, is made up of more than file analysis. Key components such as stakeholder involvement, policy creation and enforcement, eDiscovery plans, security and storage management must all be addressed to establish a complete framework. But none of these steps can happen without the most essential element: knowing where data resides, who uses it, and what form it takes. Without these key details, obtained through comprehensive file analytics, it is impossible to control or manage corporate information assets.

File Analysis is a term used in many technical disciplines, including forensics, anti-virus and records management. For the purposes of IG, it can be defined as the process of locating, highlighting and classifying the information assets of an organization. In the past, these assets would have been paper-based. In today’s interconnected world, they are almost entirely electronic, and this so-called ESI (electronically stored information) grows exponentially every year.

It is this unchecked growth of ESI that makes sound IG policy so important to an organization. Surveys highlight the fact that “69% of all information … has no business, legal or regulatory value”. Think of how much this redundant, outdated or trivial data costs a company in wasted time, money and operational resources. To address this problem, an organization must first gather information about its data assets (and detritus), including the size, age, location, type, and distribution of electronic and physical files. By scrutinizing these basic statistics, it can determine the severity and priority of compliance issues, storage constraints and bottlenecks. Gartner further states: ‘IT, data and storage managers use file analysis to deliver insight into information about the data, enabling better management and governance to improve business value, reduce risk and lower management cost.’
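As a rough illustration of what gathering these basic statistics can look like for electronic files, the following Python sketch walks a directory tree and records size, age, location and type for each file. It is a minimal example under stated assumptions, not a description of any particular FACR product: the names `FileRecord`, `inventory` and `summarize` are hypothetical, the file extension is used as a crude proxy for type, and the three-year staleness cutoff is arbitrary.

```python
import os
import time
from collections import Counter
from dataclasses import dataclass


@dataclass
class FileRecord:
    path: str        # location of the file
    size_bytes: int  # size as reported by the file system
    age_days: float  # days since last modification
    extension: str   # crude stand-in for file type (assumption)


def inventory(root, now=None):
    """Walk `root` and return one FileRecord per file that can be stat'ed."""
    now = time.time() if now is None else now
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # unreadable or vanished file: skip rather than abort
            ext = os.path.splitext(name)[1].lower() or "(none)"
            age = (now - info.st_mtime) / 86400  # seconds to days
            records.append(FileRecord(path, info.st_size, age, ext))
    return records


def summarize(records, stale_after_days=365 * 3):
    """Aggregate basic statistics; the default cutoff is an arbitrary assumption."""
    bytes_by_type = Counter()
    for r in records:
        bytes_by_type[r.extension] += r.size_bytes
    stale = [r for r in records if r.age_days > stale_after_days]
    return {
        "file_count": len(records),
        "total_bytes": sum(r.size_bytes for r in records),
        "bytes_by_type": dict(bytes_by_type),
        "stale_count": len(stale),
        "stale_bytes": sum(r.size_bytes for r in stale),
    }
```

Calling `summarize(inventory("/path/to/share"))` on a placeholder path would return a dictionary of totals that can be compared across locations. A real deployment would also need to cover mail stores, cloud repositories and physical records, which a file system walk like this does not reach.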

File Analysis can be used to shed light on so-called ‘dark data’: information assets that corporations generate or store on a day-to-day basis but do not use in other contexts. This can include legacy data and records held on BYOD and mobile devices or in cloud-based stores. Like the other types of corporate data mentioned above, dark data is rarely disposed of in a defensible manner. A thorough analysis can disclose how much expense, clutter and risk each of these areas poses to an organization.

As noted by AIIM, ‘…all of these good things flow from accurate metadata’. An inventory of files and other electronically generated data is necessary to organize and prioritize IG tasks effectively. More intensive analysis and remediation can then identify duplicates, measure growth rates and classify files. These steps increase the efficiency of searches, reduce redundancy, inform defensible deletion, streamline records management and generally provide context for the data. This, in turn, improves oversight and control of data and can be a boon to Disaster Recovery efforts. Furthermore, poor handling of highly regulated or confidential data (e.g. PHI, PII, PCI) can also be uncovered by this scrutiny.
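Duplicate identification is one of the more mechanical steps mentioned above, and a small sketch may make it concrete. The Python example below groups files first by size and then by SHA-256 digest, treating files with matching digests as identical copies. The function names `file_digest`, `find_duplicates` and `reclaimable_bytes` are hypothetical and chosen for illustration; commercial tools may use quite different techniques, such as near-duplicate detection.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return sorted groups of paths under `root` that share size and SHA-256 digest."""
    # Pass 1: bucket by size, which is cheap and rules out most files.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # unreadable file: skip

    # Pass 2: hash only files whose size collides with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot be a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            try:
                by_hash[file_digest(path)].append(path)
            except OSError:
                continue
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)


def reclaimable_bytes(groups):
    """Bytes freed if all but one copy in each duplicate group were removed."""
    return sum(os.path.getsize(g[0]) * (len(g) - 1) for g in groups)
```

Bucketing by size first avoids hashing files that cannot possibly match, which matters on large shares. Whether a duplicate may actually be deleted remains a policy decision, since retention rules or legal holds can apply to individual copies.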

It is important to maintain and automate the file analysis process. Doing so helps companies proactively audit policy, recognize business value and pinpoint trouble spots before they flare up, tasks that are nearly impossible to perform manually. File Analysis provides a starting point for understanding the nature of information assets and forms the bedrock for all policy and procedure. It is a key component in fulfilling and preserving the goals of any effective Information Governance plan.