SpamAssassin
A practical guide to integration and configuration

Packt Publishing


 


Chapter 9:
Bayesian Filtering

The Bayesian filter in SpamAssassin is one of the most effective techniques for filtering spam. Although Bayesian statistical analysis is a branch of mathematics, one doesn't necessarily need to understand the mathematics to use SpamAssassin's Bayesian filter.

Bayesian analysis involves teaching a system that a particular input gives a particular result. For spam filtering, this teaching is repeated many times over, with many spam and ham (legitimate) emails. Once this is done, the Bayesian system can be presented with a new email and will give the probability that it is spam. For best results, teaching should be a continuous process.
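The teach-then-classify cycle described above can be sketched with a toy classifier. This is a minimal illustration only, not SpamAssassin's implementation: the class name `ToyBayesFilter`, its methods, the word-based tokenizer, the add-one smoothing, and the sample messages are all assumptions made for the example.

```python
import math
import re
from collections import Counter


class ToyBayesFilter:
    """Toy word-based Bayesian classifier (illustrative, not SpamAssassin's code)."""

    def __init__(self):
        # Number of learned messages of each kind that contained each word.
        self.token_counts = {"spam": Counter(), "ham": Counter()}
        # Number of messages learned of each kind.
        self.message_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Lowercased words as a set, so each word counts once per message.
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    def learn(self, text, label):
        # "Teach" the filter that this message is spam or ham.
        self.message_counts[label] += 1
        self.token_counts[label].update(self.tokenize(text))

    def spam_probability(self, text):
        # Start from the prior log-odds of spam, then add the
        # log-likelihood ratio of every word present in the message.
        n_spam = self.message_counts["spam"]
        n_ham = self.message_counts["ham"]
        log_odds = math.log((n_spam + 1) / (n_ham + 1))
        for token in self.tokenize(text):
            # Add-one smoothing keeps unseen words from producing log(0).
            p_spam = (self.token_counts["spam"][token] + 1) / (n_spam + 2)
            p_ham = (self.token_counts["ham"][token] + 1) / (n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        # Convert log-odds back to a probability between 0 and 1.
        return 1 / (1 + math.exp(-log_odds))


# Teach the filter a few made-up messages, then score new ones.
demo = ToyBayesFilter()
demo.learn("cheap pills buy now", "spam")
demo.learn("win money now", "spam")
demo.learn("meeting agenda for monday", "ham")
demo.learn("lunch on monday?", "ham")

spammy = demo.spam_probability("buy cheap pills")
hammy = demo.spam_probability("monday meeting")
print(f"buy cheap pills -> {spammy:.2f}")
print(f"monday meeting  -> {hammy:.2f}")
```

With only four training messages the figures are crude, which is why the quality of the result depends on teaching the filter a large and continually refreshed set of both kinds of email.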

To filter spam, the system is taught with both ham and spam emails until the filter has learned to differentiate between the two. Emails passed through the filter are then assigned a probability of being spam. When Bayesian filtering is used in conjunction with SpamAssassin's other spam detection rules, SpamAssassin approaches 100% detection of spam, with false positives (legitimate emails misclassified as spam) close to 0%.
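How a Bayesian probability might work alongside other detection rules can be sketched as a simple points total compared against a threshold. Everything specific here is hypothetical: the rule names, the point values, the probability bands, and the helper functions `bayes_points` and `classify` are invented for illustration, the probability is expressed as a fraction between 0 and 1, and the threshold of 5.0 is an assumed setting rather than a recommendation.

```python
# Hypothetical bands: (minimum Bayes probability, points contributed).
# Real scores differ; these values only illustrate the idea.
BAYES_BANDS = [
    (0.99, 3.5),
    (0.80, 2.0),
    (0.60, 1.0),
    (0.40, 0.0),
    (0.20, -1.0),
]


def bayes_points(probability):
    """Translate a Bayes probability (0.0 to 1.0) into a points value."""
    for minimum, points in BAYES_BANDS:
        if probability >= minimum:
            return points
    # Very low probability: the email looks like ham, so subtract points.
    return -2.0


def classify(rule_points, bayes_probability, threshold=5.0):
    """Sum the points of all matched rules plus the Bayes contribution."""
    total = sum(rule_points.values()) + bayes_points(bayes_probability)
    return total, total >= threshold


# Two made-up rules matched on the same email in both cases.
matched = {"EXAMPLE_SUBJECT_ALL_CAPS": 1.5, "EXAMPLE_HTML_ONLY": 1.2}

total_high, is_spam_high = classify(matched, bayes_probability=0.995)
total_low, is_spam_low = classify(matched, bayes_probability=0.05)
print(f"{total_high:.1f}", is_spam_high)
print(f"{total_low:.1f}", is_spam_low)
```

In this sketch the same two rule matches are not enough to reach the threshold on their own; the Bayesian contribution tips the email over the threshold when the filter is confident it is spam, and pulls the total down when the filter is confident it is ham.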

Internally, the Bayesian engine provides a single probability figure for each email processed. This probability ranges from 0 (0% likelihood that an email is spam) up to 1 (100% likelihood).

This chapter focuses on users who have an account on the local machine. A Bayesian database can also be stored in an SQL database, and the principles described here apply equally to that setup. Creating an SQL Bayesian database is covered in Chapter 14.

  • Chapter 9: Table of Contents

    • Scoring

    • Training

    • Confirming Operation

    • Filter Training

      • User Involvement

      • Local Users

      • Unlearning

      • Auto-learn Thresholds

      • Bayesian Database Files

      • Removing a Bayesian Database

      • Sharing a Bayesian Database

    • Disabling Bayesian Filtering

    • Summary

BOOK DETAILS
  Paperback, 220 pages
Released: Sept 2004
ISBN: 1904811124
Author: Alistair McDonald
 
 

TABLE OF CONTENTS

Intro
1. Introducing Spam
2. Spam and Anti-Spam Techniques
3. Open Relays
4. Protecting Email Addresses
5. Detecting Spam
6. Installing SpamAssassin
7. Configuration Files
8. Using SpamAssassin
9. Bayesian Filtering
10. Look and Feel
11. Network Tests
12. Rules
13. Improving Filtering
14. Performance
15. Housekeeping and Reporting
16. Building an Anti-Spam Gateway
17. Email Clients
18. Choosing other Spam Tools
Appendix A
Index

 





 

 

  This website is owned and maintained by Packt Publishing Ltd, 2004. All rights reserved. Terms and Conditions