How do the automatic and manual learning principles operate?
The heuristic analysis in Pinjo revealer implements two learning mechanisms: automatic learning and manual learning.
When a message is learned, the Bayesian statistical analysis table is updated with the information found in the message.
The only thing the system needs to know is whether the message is spam or not. In this way it is possible to fine-tune the Bayesian tables to get even better results.
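To illustrate what "updating the table" means, here is a minimal sketch of a token table that is updated from a message and its spam/non-spam label. It is not Pinjo revealer's actual implementation: the class name BayesTable, the whitespace tokenizer, and the add-one smoothing are all assumptions chosen to keep the example short.

```python
from collections import Counter


class BayesTable:
    """Toy token table (hypothetical): per-token counts for spam and ham."""

    def __init__(self):
        self.spam_tokens = Counter()
        self.ham_tokens = Counter()
        self.spam_messages = 0
        self.ham_messages = 0

    def learn(self, text, is_spam):
        # Count each distinct token once per message (assumed tokenizer:
        # lowercase and split on whitespace).
        tokens = set(text.lower().split())
        if is_spam:
            self.spam_tokens.update(tokens)
            self.spam_messages += 1
        else:
            self.ham_tokens.update(tokens)
            self.ham_messages += 1

    def spam_probability(self, token):
        # Add-one smoothing so unseen tokens come out neutral (0.5).
        spam_rate = (self.spam_tokens[token] + 1) / (self.spam_messages + 2)
        ham_rate = (self.ham_tokens[token] + 1) / (self.ham_messages + 2)
        return spam_rate / (spam_rate + ham_rate)


table = BayesTable()
table.learn("cheap pills now", is_spam=True)
table.learn("meeting notes attached", is_spam=False)
```

After these two calls, a token seen only in the spam message leans towards spam, a token seen only in the ham message leans towards ham, and a token never seen stays neutral.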
Two methods of learning are available in Pinjo revealer:
Keep in mind that every message can be learned into the database only ONCE. If you feed the same message again, it is skipped automatically.
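The learn-once rule can be sketched as follows. How Pinjo revealer actually recognises an already learned message is not documented here, so identifying a message by a SHA-256 digest of its raw content is an assumption, as is the class name LearnedOnce.

```python
import hashlib


class LearnedOnce:
    """Hypothetical guard that lets each message be learned only once."""

    def __init__(self):
        self._seen = set()

    def learn(self, raw_message: bytes) -> bool:
        # Assumption: a message is identified by the digest of its raw bytes.
        digest = hashlib.sha256(raw_message).hexdigest()
        if digest in self._seen:
            return False  # already learned, skip it
        self._seen.add(digest)
        # The Bayesian tables would be updated at this point.
        return True


store = LearnedOnce()
first = store.learn(b"Subject: hello\n\nbody")
second = store.learn(b"Subject: hello\n\nbody")
```

Here the first call learns the message and the second call for the same message is skipped.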
Once an email has been heuristically scanned, the system checks the resulting score against the automatic learning thresholds. If the score is low enough to count as non-spam or high enough to count as spam, the message is learned automatically.
These learning thresholds can be adjusted in the ./config/local.cf file. The message-specific information is stored in the databases and improves the Bayesian statistical analysis.
The parameters to adjust the automatic learning process are:
bayes_auto_learn_threshold_nonspam
This is the score below which mail is learned as non-spam. By default this value is set to 0.1, which is reasonably safe. The lower the value, the more certain you can be that only genuine non-spam messages are learned.
bayes_auto_learn_threshold_spam
This is the score above which mail is learned as spam. By default this value is set to 12.0, which is almost certainly spam. The higher the value, the more certain you can be that only genuine spam messages are learned.
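The two parameters and the decision they control can be sketched as below. The embedded text shows how the two lines might look in ./config/local.cf with their default values. The helper names parse_thresholds and auto_learn_decision are hypothetical, and the real filter may apply additional conditions before it learns a message.

```python
# Assumed appearance of the two settings in ./config/local.cf (defaults).
EXAMPLE_LOCAL_CF = """
# automatic learning thresholds
bayes_auto_learn_threshold_nonspam 0.1
bayes_auto_learn_threshold_spam 12.0
"""


def parse_thresholds(config_text):
    """Read the auto-learn thresholds from 'name value' lines."""
    thresholds = {}
    for line in config_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].startswith("bayes_auto_learn_threshold_"):
            thresholds[parts[0]] = float(parts[1])
    return thresholds


def auto_learn_decision(score, thresholds):
    """Return 'ham', 'spam', or None when the message is not auto-learned."""
    if score < thresholds["bayes_auto_learn_threshold_nonspam"]:
        return "ham"
    if score > thresholds["bayes_auto_learn_threshold_spam"]:
        return "spam"
    return None


thresholds = parse_thresholds(EXAMPLE_LOCAL_CF)
```

With the default values, a message scoring -1.0 would be learned as non-spam, a message scoring 15.0 as spam, and a message scoring 5.0 would not be learned automatically at all.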
By manually feeding messages to the system you can improve the Bayesian score given to messages. Place ham and spam messages in the corresponding folders, ./ham and ./spam. Then press the ‘Learn now’ button on the Filtering tab and the messages are learned by the system.
For more information about manual learning, see the FAQ item:
How can I learn messages to the system?
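As a rough picture of what the manual learning step has to do, the sketch below collects every file from the ./ham and ./spam folders together with its label. What the ‘Learn now’ button does internally is not described here, so the function name collect_training_messages and the one-message-per-file layout are assumptions.

```python
from pathlib import Path


def collect_training_messages(base_dir="."):
    """Return (raw_message, is_spam) pairs from the ham and spam folders."""
    labelled = []
    for folder, is_spam in (("ham", False), ("spam", True)):
        directory = Path(base_dir) / folder
        if not directory.is_dir():
            continue  # a missing folder simply contributes no messages
        # Assumption: each file in the folder holds exactly one message.
        for path in sorted(directory.iterdir()):
            if path.is_file():
                labelled.append((path.read_bytes(), is_spam))
    return labelled
```

Each returned pair could then be handed to the learning routine, which would skip any message that has already been learned.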