A Hierarchical Spam Filter

When multiple filters are used to filter an email, each can assign a probability that the email is spam, and classification of the email as spam can be based upon a weighted average of the filter probabilities.
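The weighted approach can be sketched in a few lines of Python. This is only an illustration: the function names, the example weights, and the 0.5 decision threshold are assumptions made here, not values taken from Emmesmail.

```python
def weighted_spam_probability(probabilities, weights):
    """Return the weighted average of the per-filter spam probabilities."""
    if len(probabilities) != len(weights):
        raise ValueError("need exactly one weight per filter probability")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(p * w for p, w in zip(probabilities, weights))
    return weighted_sum / total_weight


def classify_weighted(probabilities, weights, threshold=0.5):
    """Classify as spam when the weighted average reaches the threshold.

    The default threshold of 0.5 is an assumed value for illustration.
    """
    score = weighted_spam_probability(probabilities, weights)
    return "spam" if score >= threshold else "non-spam"
```

For example, two equally weighted filters reporting 0.9 and 0.2 average to 0.55, which this sketch would classify as spam under the assumed threshold.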

Alternatively, the order in which the filters are applied can be specified, along with a set of rules for deciding, after each filter test, whether to classify the email based upon the result or to forward it to the next filter for further testing. We refer to this procedure as a hierarchical filter to distinguish it from the "weighted approach".

Emmesmail uses three filters in the following sequence: 1) a sender filter (using a locally generated, user-specific whitelist and blacklist); 2) a Bayesian filter; and 3) an appropriateness filter. Only emails that pass each of the filters in sequence avoid being classified as spam. Only emails classified as non-spam by whitelist filtering or as spam by blacklist filtering skip further filtering.
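The sequence above might be sketched as follows. This is a minimal illustration, not Emmesmail's actual code: the names `Verdict`, `sender_filter`, `bayesian_filter`, `appropriateness_filter`, and `classify` are hypothetical, the Bayesian scorer and the appropriateness check are passed in as placeholder callables, and the 0.9 threshold is an assumed value. The sketch also stops as soon as any filter reports spam; that is an assumption that does not change the outcome, since an email failing any one filter is classified as spam either way.

```python
from enum import Enum


class Verdict(Enum):
    SPAM = "spam"
    NOT_SPAM = "non-spam"
    CONTINUE = "continue"  # undecided: forward to the next filter


def sender_filter(email, whitelist, blacklist):
    """Stage 1: decide outright on a whitelist or blacklist match."""
    sender = email["sender"].lower()  # lists are assumed to hold lowercase addresses
    if sender in whitelist:
        return Verdict.NOT_SPAM
    if sender in blacklist:
        return Verdict.SPAM
    return Verdict.CONTINUE


def bayesian_filter(email, spam_probability, threshold=0.9):
    """Stage 2: spam_probability is a hypothetical scorer returning 0..1."""
    if spam_probability(email) >= threshold:  # threshold is an assumed value
        return Verdict.SPAM
    return Verdict.CONTINUE


def appropriateness_filter(email, is_appropriate):
    """Stage 3: is_appropriate is a hypothetical predicate on the email."""
    return Verdict.CONTINUE if is_appropriate(email) else Verdict.SPAM


def classify(email, whitelist, blacklist, spam_probability, is_appropriate):
    """Run the three stages in order and stop at the first decision."""
    stages = (
        lambda e: sender_filter(e, whitelist, blacklist),
        lambda e: bayesian_filter(e, spam_probability),
        lambda e: appropriateness_filter(e, is_appropriate),
    )
    for stage in stages:
        verdict = stage(email)
        if verdict is not Verdict.CONTINUE:
            return verdict
    # The email passed every filter without being flagged.
    return Verdict.NOT_SPAM
```

Because a whitelist match returns before the later stages run, a whitelisted sender is never scored by the Bayesian or appropriateness filters in this sketch, which mirrors the short-circuit described above.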

The complete set of rules is given at the beginning of the "How Emmesmail Fights Spam" webpage.



Emmes Technologies
Updated Nov 10, 2006
