Train SpamAssassin with manually moved messages
From Veximwiki
On my server, users can create a folder called salearn ($SAFOLDER in the script).
I copy false negatives to this folder, and a daily cron job runs the script so that SpamAssassin learns from them...
If the folder $SAFOLDER exists in a user's mailbox, the following script runs sa-learn on it. After learning, the e-mails are deleted from the folder.
#!/bin/bash
########## sat.sh ###############
# Feed every user's $SAFOLDER to sa-learn as spam, then empty the folder.
MAILBASE="/var/opt/vmail"
SAFOLDER=".salearn"
# Glob instead of parsing ls, so names with spaces are handled correctly.
for domaindir in "$MAILBASE"/*/
do
    domain=$(basename "$domaindir")
    for userdir in "$domaindir"*/
    do
        user=$(basename "$userdir")
        MAILTO="$user@$domain"
        MDIR="$MAILBASE/$domain/$user/Maildir"
        if [ -d "$MDIR/$SAFOLDER" ]
        then
            echo "learning from $MAILTO"
            for sub in new cur
            do
                # Delete the messages only if sa-learn reported success.
                if sa-learn --spam "$MDIR/$SAFOLDER/$sub"
                then
                    find "$MDIR/$SAFOLDER/$sub" -type f -exec rm -v {} +
                fi
            done
            echo ""
        fi
    done
done
Question from thocar@free.fr: since the SA web site says that making SA learn ham is even more important than making it learn spam, shouldn't this script be improved to run sa-learn --ham on all folders that are neither SAFOLDER nor INBOX?
Answer from henry.miller@itds.ch: I did it this way because I cannot be sure that all messages in other folders are ham. I have a lot of filters which deliver mail to different folders. You could add a config file where you define the ham folders, or simply create an explicit ham folder...
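The explicit ham folder suggested in the answer could be handled by a small helper that is called once per folder and mode. This is only a sketch, not part of the original script: the folder name .salearn-ham, the function name learn_folder and the SA_LEARN override variable are assumptions made for illustration. Only the sa-learn options --spam and --ham are used.

```shell
#!/bin/sh
# Sketch: learn spam from .salearn and ham from a hypothetical .salearn-ham folder.
# SA_LEARN is an illustrative override (e.g. for a dry run); it defaults to sa-learn.
SA_LEARN="${SA_LEARN:-sa-learn}"

# learn_folder MODE DIR
#   MODE is "spam" or "ham"; DIR is a Maildir folder containing new/ and cur/.
learn_folder() {
    mode=$1
    dir=$2
    [ -d "$dir" ] || return 0      # this user has not created the folder
    for sub in new cur
    do
        [ -d "$dir/$sub" ] || continue
        # Remove the messages only after the learn command reports success.
        if $SA_LEARN "--$mode" "$dir/$sub"
        then
            find "$dir/$sub" -type f -exec rm -f {} +
        fi
    done
}

# Intended use inside the per-user loop, where MDIR is the user's Maildir:
#   learn_folder spam "$MDIR/.salearn"
#   learn_folder ham  "$MDIR/.salearn-ham"
```

Because users move messages into the ham folder by hand, this avoids guessing whether mail sorted into other folders by filters is really ham.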
Question from martin@waschbuesch.de: Does this not mean that all users train the same database? What if what looks like spam to one user is valuable to another? (E.g. the classification of, say, the Microsoft newsletter has been the subject of many heated discussions between my brother and me)...
