Along with other information management challenges, classification of e-mail messages is becoming more important in some sectors.
E-mail classification is a way of flagging or tagging messages as being of a certain type. For example, a message might be classified as “privileged,” “confidential,” “secret,” “private,” or “business relevant.” In more complicated cases, message classifications may be hierarchical or relevant to only some people in the organization.

But why classify e-mail messages at all? The reason is to make the handling of messages clearer or easier.

Messages of different types may be handled differently for retention purposes, routing, or e-discovery. A “business relevant” message may be retained for three years, whereas a “personal” one may be purged after thirty days. A message marked “secret” may be restricted from leaving the organization’s e-mail environment unless it is encrypted. An e-discovery search request may be concerned only with “privileged” messages.
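
The handling rules in the examples above could be written down as a small policy table. The following is only a minimal sketch: the names RETENTION, purge_date, and may_leave_organization are hypothetical, and a real retention or routing policy would come from the organization's own requirements rather than from this code.

```python
from datetime import date, timedelta

# Hypothetical policy table. The periods mirror the examples in the text:
# three years for "business relevant", thirty days for "personal".
RETENTION = {
    "business relevant": timedelta(days=3 * 365),
    "personal": timedelta(days=30),
}


def purge_date(classification, received):
    """Return the date a message may be purged, or None if no rule applies."""
    period = RETENTION.get(classification)
    if period is None:
        return None
    return received + period


def may_leave_organization(classifications, encrypted):
    """A message marked "secret" may leave only if it is encrypted."""
    return encrypted or "secret" not in classifications
```

A message with no matching retention rule returns None here, which leaves the decision to whatever default the organization chooses.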

There are two basic kinds of e-mail classification: Machine Assisted and User Applied. Here is a breakdown of the differences.

Machine Assisted Classification

With this kind of classification, a computer process scans message headers and bodies to determine which classifications apply to each message, then updates the headers accordingly. It makes these decisions in various ways, but in essence it follows a set of rules.
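
As a rough illustration of the rule-following process described above, here is a minimal sketch using Python's standard email package. The rules, the patterns, and the header name X-Classification are all assumptions made for this example; they are not taken from any particular product, and real systems use far richer rule sets.

```python
import re
from email.message import EmailMessage

# Hypothetical rules: (classification, where to look, pattern to match).
# "body" means the plain-text body; anything else is a header name.
RULES = [
    ("privileged", "body", re.compile(r"attorney[- ]client", re.I)),
    ("confidential", "Subject", re.compile(r"\bconfidential\b", re.I)),
]

HEADER = "X-Classification"  # assumed header name, not a standard


def classify(msg):
    """Scan headers and body, record matching classifications in a header."""
    part = msg.get_body(preferencelist=("plain",))
    body = part.get_content() if part is not None else ""
    labels = []
    for label, where, pattern in RULES:
        text = body if where == "body" else str(msg.get(where, ""))
        if pattern.search(text) and label not in labels:
            labels.append(label)
    if labels:
        del msg[HEADER]  # drop any earlier value; no error if absent
        msg[HEADER] = ", ".join(labels)
    return labels
```

Because the rules are plain data, the same message always receives the same labels until someone edits the rule list.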

Limited human involvement. Once configured, the machine does all the heavy lifting. Potentially all inbound, outbound, and internal messages can be processed automatically.

Consistent decisions. Assuming the rules don’t change, the computer will faithfully apply the same classifications to like messages without deviation. However, some systems employ learning or adaptive logic that bases current decisions on past results. While some inconsistency may be evident early on, the intent is for these systems to classify messages more accurately over time.

Computers can’t think. They only do what they are told. This leads to varying degrees of accuracy when classifying messages based purely on content or addresses, even when sophisticated and expensive learning systems are employed.

User Applied Classification

This involves the sender applying a suitable classification to a message during composition, before it is sent. Sometimes the e-mail client software can assist the author in selecting a classification, or it may prevent a message from being sent if the classification is missing. Received messages (from external sources, for example) may also be classified after the fact by the recipient.
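
The "prevent a message from being sent" behavior might look something like the following check, run just before a message is handed off for delivery. This is a sketch under stated assumptions: the X-Classification header, the ALLOWED set, and the MissingClassification exception are hypothetical names, and actual e-mail clients implement this in their own ways.

```python
from email.message import EmailMessage

# Hypothetical set of labels, taken from the examples earlier in the article.
ALLOWED = {"privileged", "confidential", "secret", "private", "business relevant"}
HEADER = "X-Classification"  # assumed header name, not a standard


class MissingClassification(Exception):
    """Raised when the author has not classified the message."""


def check_before_send(msg):
    """Return the message's labels, or raise if none or unknown ones are set."""
    raw = str(msg.get(HEADER, ""))
    labels = [item.strip().lower() for item in raw.split(",") if item.strip()]
    if not labels:
        raise MissingClassification("Please classify this message before sending.")
    unknown = [label for label in labels if label not in ALLOWED]
    if unknown:
        raise ValueError("Unknown classification(s): " + ", ".join(unknown))
    return labels
```

A client could catch MissingClassification and prompt the author to pick a label instead of silently refusing to send.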


People can reason. Presumably the message author can make the best determination as to the nature of the message and, therefore, what classifications it needs. Or, if the message arrived unclassified from an external sender, the recipient may be able to make that determination.


Increased workload. Besides the additional, ongoing training, the extra work on the author’s part to think about and apply classifications (admittedly slight for a single message) can add up when multiplied over hundreds or thousands of messages over time.

Inconsistencies. Different people, and even the same person at different times, may make different decisions about two messages that should be classified the same. This can muddy the waters, as it were, when trying to manage the messages later as a group of a given type.

What is Right for Your Company?

Your company may not need to worry about e-mail classification at all. If it does, the kinds of classifications needed, available resources, and existing technologies will drive the approach you take. In some cases, a hybrid of the two methods will work best. Sherpa Software can help you get to a solution that works for your individual case.