Train SpamAssassin with manually moved messages

From Veximwiki

On my server, users can create a folder called salearn ( $SAFOLDER ).

I copy the false negatives to this folder, and a cron job runs the script daily so that SpamAssassin learns from them.

If the folder $SAFOLDER exists in a user's mailbox, the following script runs sa-learn on it. After learning, the e-mails are deleted from the folder.

 #!/bin/bash
 ########## sat.sh ###############
 # Feed each user's $SAFOLDER to sa-learn as spam, then empty the folder.
 MAILBASE="/var/opt/vmail"
 SAFOLDER=".salearn"
 # Globs instead of `ls` so that names containing spaces are handled
 for domaindir in "$MAILBASE"/*/
 do
        domain=$(basename "$domaindir")
        for userdir in "$domaindir"*/
        do
                user=$(basename "$userdir")
                MAILTO="$user@$domain"
                MDIR="$MAILBASE/$domain/$user/Maildir"
                # Only act on mailboxes where the user created the folder
                if [ -d "$MDIR/$SAFOLDER" ]
                then
                        echo "learning from $MAILTO"
                        for sub in new cur
                        do
                                # Skip subfolders that do not exist
                                [ -d "$MDIR/$SAFOLDER/$sub" ] || continue
                                # Delete the messages only if sa-learn succeeded
                                sa-learn --spam "$MDIR/$SAFOLDER/$sub" &&
                                        find "$MDIR/$SAFOLDER/$sub" -maxdepth 1 -type f -exec rm -v {} +
                        done
                        echo ""
                fi
        done
 done

Question from thocar@free.fr: The SA web site says that making SA learn ham is even more important than making it learn spam, so shouldn't this script be improved to run sa-learn --ham on all folders that are neither SAFOLDER nor INBOX?

Answer from henry.miller@itds.ch: I did it this way because I cannot be sure that all messages in the other folders are ham. I have a lot of filters which deliver mail to different folders. You could add a config file in which you define the ham folders, or simply create an explicit ham folder.
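The explicit ham folder suggested in the answer above could be handled along these lines. This is only a sketch, not part of the original script: the folder name ".salearn-ham", the function name learn_ham and the SA_LEARN variable are assumptions made for illustration. It also assumes that ham should stay in the folder after learning, unlike the spam folder, which is emptied.

```shell
#!/bin/bash
# Sketch only: HAMFOLDER, SA_LEARN and learn_ham are hypothetical names,
# none of them appear in the original sat.sh.
HAMFOLDER=".salearn-ham"
SA_LEARN="${SA_LEARN:-sa-learn}"

# learn_ham MDIR
# Feeds the "new" and "cur" subfolders of MDIR/$HAMFOLDER to sa-learn --ham.
# Does nothing if the user has not created the folder.
# The messages are left in place (assumption: users want to keep their ham).
learn_ham() {
    local mdir="$1" sub
    [ -d "$mdir/$HAMFOLDER" ] || return 0
    for sub in new cur
    do
        # Skip subfolders that do not exist
        [ -d "$mdir/$HAMFOLDER/$sub" ] || continue
        "$SA_LEARN" --ham "$mdir/$HAMFOLDER/$sub"
    done
}
```

Inside the user loop of sat.sh, the call would be learn_ham "$MDIR", placed next to the spam handling. Leaving the messages in place should be harmless on repeated runs, since sa-learn keeps track of messages it has already learned.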

Question from martin@waschbuesch.de: Does this not mean that all users train the same database? What if something that looks like spam to one user is valuable to another? (For example, the classification of, say, the Microsoft newsletter has been the subject of many heated discussions between my brother and me.)
