Using procmail to filter out Russian emails

May 28th, 2008

Lots of the spam I get uses the Cyrillic alphabet. I believe it’s Russian. I don’t correspond with anyone using the Cyrillic alphabet, so I’ve come up with a procmail recipe to filter this email out. But first some background.

It seems that the subject lines of many (all?) Cyrillic emails look something like this:

Subject: =?koi8-r?B?7e/06ffh4+nxIPTy9eTh?=

Which appears like this in your email reader:

МОТИВАЦИЯ ТРУДА

The “koi8-r” in the Subject line above names KOI8-R, a popular Cyrillic encoding, and tells the mail client that the text that follows is in that encoding. The “B” after it means the text is Base64-encoded. For more info, Wikipedia has a nice article on KOI8-R. There is another encoding, windows-1251, that is also used for Cyrillic, albeit less commonly than KOI8-R.
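As an illustration of how a mail client gets from the raw header to readable text, here is a minimal Python sketch using the standard library's email.header.decode_header. It is not part of the procmail setup, and the helper name decode_subject is made up for this example.

```python
from email.header import decode_header


def decode_subject(raw_value: str) -> str:
    """Decode an RFC 2047 encoded header value into readable text."""
    parts = []
    for payload, charset in decode_header(raw_value):
        if isinstance(payload, bytes):
            # Encoded words come back as bytes plus the charset named in
            # the header (for example "koi8-r"); fall back to ASCII.
            parts.append(payload.decode(charset or "ascii", errors="replace"))
        else:
            # A header with no encoded words comes back as a plain str.
            parts.append(payload)
    return "".join(parts)


# The Subject value from the example above.
print(decode_subject("=?koi8-r?B?7e/06ffh4+nxIPTy9eTh?="))  # МОТИВАЦИЯ ТРУДА
```

The charset name travels in plain ASCII inside the header, which is why a simple text match on the raw Subject line is enough to spot these messages.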

To filter out these messages, I added two super simple procmail recipes to my .procmailrc file:

# File mail whose Subject line names the KOI8-R charset.
:0:
* ^Subject:.*koi8-r
$HOME/Maildir/.crap/

# Same for the windows-1251 charset.
:0:
* ^Subject:.*windows-1251
$HOME/Maildir/.crap/

Keep in mind that for these recipes to work, the charset name has to appear in the email’s Subject line, which is the case for most of my spam. I haven’t done extensive testing, but I will let this run for the coming weeks and report how it works.
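To check which messages the recipes would catch before trusting them with real mail, a rough dry run can be sketched in Python. This only approximates procmail's matching: the patterns are anchored to the start of a header line and matched case-insensitively (procmail ignores case by default), and the function name would_be_filed is hypothetical.

```python
import re

# Approximations of the two recipe conditions. re.MULTILINE lets ^ match
# at the start of each header line; "." does not cross line breaks, so
# only the Subject line itself is examined.
CONDITIONS = [
    re.compile(r"^Subject:.*koi8-r", re.IGNORECASE | re.MULTILINE),
    re.compile(r"^Subject:.*windows-1251", re.IGNORECASE | re.MULTILINE),
]


def would_be_filed(headers: str) -> bool:
    """Return True if either condition matches the header block."""
    return any(cond.search(headers) for cond in CONDITIONS)


cyrillic_subject = (
    "From: someone@example.com\n"
    "Subject: =?koi8-r?B?7e/06ffh4+nxIPTy9eTh?=\n"
)
charset_only_in_body = (
    "Subject: Hello\n"
    "Content-Type: text/plain; charset=koi8-r\n"
)

print(would_be_filed(cyrillic_subject))      # True
print(would_be_filed(charset_only_in_body))  # False
```

The second sample shows the limitation mentioned above: a message that declares a Cyrillic charset only in its Content-Type header, with a plain ASCII subject, is not matched.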

9 Responses to “Using procmail to filter out Russian emails”

  1. Dave Says:

    Well, so far so good. I’ve had no false positives or false negatives so far. Procmail has moved every piece of Cyrillic email I’ve received into my “crap” folder, which is exactly where I want it!

    Thanks procmail!

  2. Robotech_Master Says:

    Thanks! These recipes look like just what I need. I’ve had the same email address for over 15 years, and I stubbornly refuse to change it, so I’m on the list of every spammer everywhere. So anything that lets me chop out some of the trash before it hits Spamassassin saves load on my sysadmin’s CPU, and everybody’s happy.

  3. Dave Says:

    I’ve been running this procmail recipe for nearly 2 months with no false positives. I’ve collected lots of Russian email spam now. It’s so great to have that junk out of my inbox.

  4. Village T-Shirts Says:

    Your blog beats most everybody’s today! Thanks for the sanity saving recipe. You are my dreamboat of the week :-)

  5. dee jones Says:

    so, your Cyrillic comment was motivational effort. (as best as Dan and I could figure out)
    You would think living in Russia and having the i.p. address we do that we would encounter all kinds of Russian spam but we don’t! (and if we do, we will use your filter to get it out so don’t erase this b/c Clinton would have to help us non tech oldie fogies out!)
    Dan and Dee

  6. Dave Says:

    Update: I’ve caught 337 pieces of spam with this rule now. :)

  7. Allan Says:

    Dave,

    Thanks for the koi8-r tip. I tried to use this as a filter on my incoming mail client, but the client doesn’t seem to be able to find the string in the subject line, even though it’s there in the raw source. Any hints as to how to use this on a client?

    Cheers,
    Allan
    Apple Mail 3.6 on Mac OS 10.5.8

  8. Dave Says:

    Allan: Good question. I’m not sure Apple Mail can do this, and I don’t have my Mac in front of me to play with it.

    Can it filter on email header values? If so, the subject is just a header and you could theoretically add a filter against it.

  9. Al Angelov Says:

    What about users with mail clients configured to post messages in win-1251 by default but sending mails in English?
