Using procmail to filter out Russian emails
Lots of the spam I get uses the Cyrillic alphabet. I believe it’s Russian. I don’t correspond with anyone using the Cyrillic alphabet, so I’ve come up with a procmail recipe to filter this email out. But first some background.
It seems that the subject lines of many (all?) Cyrillic emails look something like this:
Subject: =?koi8-r?B?7e/06ffh4+nxIPTy9eTh?=
Which appears like this in your email reader:
МОТИВАЦИЯ ТРУДА
The “koi8-r” you see in the above Subject line refers to KOI8-R, a popular Cyrillic encoding, and tells the mail client how to decode the text that follows. (The “B” after it means that text is Base64-encoded.) For more info, Wikipedia has a nice article on KOI8-R. There is another encoding, windows-1251, that is also used for Cyrillic, albeit less commonly than KOI8-R.
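If you're curious what one of these subjects actually says, here is a minimal Python sketch that decodes the example header above using the standard library's `email.header.decode_header`. The helper name `decode_subject` is my own invention for illustration; it is not part of procmail or of any mail client.

```python
from email.header import decode_header

# The encoded Subject value from the example above.
raw_subject = "=?koi8-r?B?7e/06ffh4+nxIPTy9eTh?="


def decode_subject(value):
    """Decode a Subject value that may contain RFC 2047 encoded words."""
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            # charset is None for undecoded plain segments; assume ASCII there.
            parts.append(chunk.decode(charset or "ascii", errors="replace"))
        else:
            parts.append(chunk)
    return "".join(parts)


print(decode_subject(raw_subject))
```

The charset name sits in plain ASCII at the front of the encoded word, which is why a simple text match on the raw header is enough to spot these messages without decoding anything.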
To filter out these messages, I added two super simple procmail recipes to my .procmailrc file:
# Maildir delivery needs no lockfile, so a plain :0 is enough.
:0
* ^Subject:.*koi8-r
$HOME/Maildir/.crap/
:0
* ^Subject:.*windows-1251
$HOME/Maildir/.crap/
Keep in mind that for these recipes to work, the charset name has to appear in the email’s Subject header, which is the case for most of my spam. I haven’t done extensive testing, but I’ll let this run for the coming weeks and report how it works.
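To try the idea against saved messages before trusting it with live mail, a rough Python approximation of the match might look like the sketch below. This is not how procmail works internally; the function name `would_be_filed_as_crap` and the sample messages are made up for illustration. The match is case-insensitive because procmail's regular expressions are case-insensitive by default.

```python
import re
from email import message_from_string

# Same charset names the recipes look for, matched case-insensitively.
CHARSET_PATTERN = re.compile(r"koi8-r|windows-1251", re.IGNORECASE)


def would_be_filed_as_crap(raw_message):
    """Return True if the raw Subject header names one of the charsets."""
    # With the default parser policy the Subject comes back undecoded,
    # so the "=?koi8-r?B?...?=" wrapper is still visible here.
    subject = message_from_string(raw_message).get("Subject", "")
    return bool(CHARSET_PATTERN.search(str(subject)))


# Hypothetical sample messages, not real mail.
spam = "From: a@example.com\nSubject: =?koi8-r?B?7e/06ffh4+nxIPTy9eTh?=\n\nbody\n"
ham = "From: b@example.com\nSubject: Lunch on Friday?\n\nkoi8-r in the body only\n"

print(would_be_filed_as_crap(spam))
print(would_be_filed_as_crap(ham))
```

Only the Subject header is inspected, so a message that mentions a charset name in its body alone is left untouched.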
9 comments to “Using procmail to filter out Russian emails”
Well, so far so good. I’ve had no false positives or false negatives. Procmail has moved every piece of Cyrillic email I’ve received into my “crap” folder, which is exactly where I want it!
Thanks procmail!
Thanks! These recipes look like just what I need. I’ve had the same email address for over 15 years, and I stubbornly refuse to change it, so I’m on the list of every spammer everywhere. So anything that lets me chop out some of the trash before it hits Spamassassin saves load on my sysadmin’s CPU, and everybody’s happy.
I’ve been running this procmail recipe for nearly 2 months with no false positives. I’ve collected lots of Russian email spam now. It’s so great to have that junk out of my inbox.
Your blog beats most everybody’s today! Thanks for the sanity saving recipe. You are my dreamboat of the week :-)
so, your Cyrillic comment was motivational effort. (as best as Dan and I could figure out)
You would think living in Russia and having the i.p. address we do that we would encounter all kinds of Russian spam but we don’t! (and if we do, we will use your filter to get it out so don’t erase this b/c Clinton would have to help us non tech oldie fogies out!)
Dan and Dee
Update: I’ve caught 337 pieces of spam with this rule now. :)
Dave,
Thanks for the koi8-r tip. I tried to use this as a filter on my incoming mail client, but the client doesn’t seem to be able to find the string in the subject line, even though it’s there in the raw source. Any hints as to how to use this on a client?
Cheers,
Allan
Apple Mail 3.6 on Mac OS 10.5.8
Allan: Good question. I’m not sure Apple Mail can do this, and I don’t have my Mac in front of me to play with it.
Can it filter on email header values? If so, the subject is just a header and you could theoretically add a filter against it.
What about users with mail clients configured to post messages in win-1251 by default but sending mails in English?