Sharing custom SpamAssassin Rules

micmc

@d19dotca said in Sharing custom SpamAssassin Rules:

Latest round of SpamAssassin rules I'm using, if anyone is interested.

The highlights here are just a couple of things:

A few new sources which come with new rules

Slight scoring tweaks on just a few rules

Of course, as they say... YMMV.

Amazing stuff mate! That's great work and thank you so much for sharing this with the folks here.

humpty

@necrevistonnezr That's right.

d19dotca

@necrevistonnezr Yes, you can simply copy & paste the entire thing.

Btw, here are my current rules (they haven't changed much from the previous set I shared):

# scoring DNSBLs (blocklists & allowlists)
score RCVD_IN_BL_SPAMCOP_NET 2.5
score RCVD_IN_DNSWL_BLOCKED 0.0
score RCVD_IN_DNSWL_HI -5.0
score RCVD_IN_DNSWL_LOW -1.0
score RCVD_IN_DNSWL_MED -2.5
score RCVD_IN_DNSWL_NONE 0.5
score RCVD_IN_GBUDB 4.5
score RCVD_IN_IADB_DK -0.5
score RCVD_IN_IADB_DOPTIN_GT50 -0.5
score RCVD_IN_IADB_DOPTIN_LT50 -0.5
score RCVD_IN_IADB_EDDB -0.5
score RCVD_IN_IADB_EPIA -0.5
score RCVD_IN_IADB_GOODMAIL -0.5
score RCVD_IN_IADB_LISTED -0.5
score RCVD_IN_IADB_LOOSE -0.5
score RCVD_IN_IADB_MI_CPEAR 0
score RCVD_IN_IADB_MI_CPR_30 0
score RCVD_IN_IADB_MI_CPR_MAT 0.0
score RCVD_IN_IADB_NOCONTROL -0.5
score RCVD_IN_IADB_OOO -0.5
score RCVD_IN_IADB_OPTIN -0.5
score RCVD_IN_IADB_OPTIN_GT50 -0.5
score RCVD_IN_IADB_OPTIN_LT50 -0.5
score RCVD_IN_IADB_OPTOUTONLY -0.5
score RCVD_IN_IADB_RDNS -0.5
score RCVD_IN_IADB_SENDERID -0.5
score RCVD_IN_IADB_SPF -0.5
score RCVD_IN_IADB_UNVERIFIED_1 -0.5
score RCVD_IN_IADB_UNVERIFIED_2 -0.5
score RCVD_IN_IADB_UT_CPEAR 0
score RCVD_IN_IADB_UT_CPR_30 0
score RCVD_IN_IADB_UT_CPR_MAT 0
score RCVD_IN_JMF_BL 2.5
score RCVD_IN_MSPIKE_BL 0.0
score RCVD_IN_MSPIKE_H2 0.0
score RCVD_IN_MSPIKE_H3 -0.5
score RCVD_IN_MSPIKE_H4 -2.0
score RCVD_IN_MSPIKE_H5 -3.0
score RCVD_IN_MSPIKE_L2 1.5
score RCVD_IN_MSPIKE_L3 2.5
score RCVD_IN_MSPIKE_L4 4.0
score RCVD_IN_MSPIKE_L5 5.0
score RCVD_IN_MSPIKE_WL 0.0
score RCVD_IN_MSPIKE_ZBI 4.0
score RCVD_IN_PBL 3.5
score RCVD_IN_SBL 3.5
score RCVD_IN_SBL_CSS 3.5
score RCVD_IN_SEM_BACKSCATTER 1.5
score RCVD_IN_SEM_BLACK 3.5
score RCVD_IN_SEM_NET_BLACK 2.5
score RCVD_IN_SORBS_BLOCK 2.5
score RCVD_IN_SORBS_DUL 2.5
score RCVD_IN_SORBS_HTTP 2.5
score RCVD_IN_SORBS_MISC 2.5
score RCVD_IN_SORBS_SMTP 2.5
score RCVD_IN_SORBS_SOCKS 2.5
score RCVD_IN_SORBS_SPAM 2.5
score RCVD_IN_SORBS_WEB 2.5
score RCVD_IN_SORBS_ZOMBIE 2.5
score RCVD_IN_SPAMRATS 2.0
score RCVD_IN_UCEPROTECT2 1.5
score RCVD_IN_XBL 3.5
score RCVD_IN_ZEN_BLOCKED 0.0
score RCVD_IN_ZEN_BLOCKED_OPENDNS 0.0

# scoring URIBLs
score URIBL_ABUSE_SURBL 4.0
score URIBL_BLACK 4.5
score URIBL_CR_SURBL 4.0
score URIBL_CSS 2.0
score URIBL_CSS_A 2.0
score URIBL_DBL_ABUSE_BOTCC 3.0
score URIBL_DBL_ABUSE_MALW 3.0
score URIBL_DBL_ABUSE_PHISH 3.0
score URIBL_DBL_ABUSE_REDIR 1.0
score URIBL_DBL_ABUSE_SPAM 3.0
score URIBL_DBL_BLOCKED 0.0
score URIBL_DBL_BLOCKED_OPENDNS 0.0
score URIBL_DBL_BOTNETCC 3.5
score URIBL_DBL_ERROR 0.0
score URIBL_DBL_MALWARE 3.5
score URIBL_DBL_PHISH 3.5
score URIBL_DBL_SPAM 3.5
score URIBL_GREY 1.0
score URIBL_MW_SURBL 4.0
score URIBL_PH_SURBL 4.0
score URIBL_RED 1.5
score URIBL_RHS_DOB 2.0
score URIBL_SBL 1.5
score URIBL_SBL_A 1.5
score URIBL_SEM 3.0
score URIBL_SEM_FRESH30 1.5
score URIBL_WS_SURBL 3.0
score URIBL_ZEN_BLOCKED 0.0
score URIBL_ZEN_BLOCKED_OPENDNS 0.0

# scoring DKIM & SPF
score DKIM_INVALID 1.5
score DKIM_SIGNED 0.0
score DKIM_VALID 0.0
score DKIM_VALID_AU 0.0
score DKIM_VALID_EF 0.0
score DKIM_VERIFIED 0.0
score DKIMWL_BL 3.0
score DKIMWL_WL_HIGH -3.5
score DKIMWL_WL_MED -1.5
score DKIMWL_WL_MEDHI -2.5
score FORGED_SPF_HELO 3.0
score SPF_FAIL 1.5
score SPF_HELO_FAIL 1.5
score SPF_HELO_NEUTRAL 1.0
score SPF_HELO_NONE 0.5
score SPF_HELO_PASS 0.0
score SPF_HELO_SOFTFAIL 1.5
score SPF_NEUTRAL 0.5
score SPF_NONE 0.5
score SPF_PASS 0.0
score SPF_SOFTFAIL 1.5

# scoring BAYES
score BAYES_00 -4.0
score BAYES_05 -3.0
score BAYES_20 0.5
score BAYES_40 1.5
score BAYES_50 2.0
score BAYES_60 3.0
score BAYES_80 3.5
score BAYES_95 4.5
score BAYES_99 5.0
score BAYES_999 1.5

# scoring HTML
score HTML_FONT_LOW_CONTRAST 0.5
score HTML_IMAGE_ONLY_04 1.5
score HTML_IMAGE_ONLY_08 2.0
score HTML_IMAGE_ONLY_12 2.0
score HTML_IMAGE_ONLY_16 2.0
score HTML_IMAGE_ONLY_20 2.0
score HTML_IMAGE_ONLY_24 2.5
score HTML_IMAGE_ONLY_28 2.5
score HTML_IMAGE_ONLY_32 3.0
score HTML_IMAGE_RATIO_02 0.0
score HTML_IMAGE_RATIO_04 0.0
score HTML_IMAGE_RATIO_06 0.0
score HTML_IMAGE_RATIO_08 0.0
score HTML_MESSAGE 0.0

# scoring HEADER & MISSING
score HEADER_FROM_DIFFERENT_DOMAINS 0.5
score HEADER_SPAM 2.5
score MISSING_DATE 3.0
score MISSING_FROM 1.5
score MISSING_HB_SEP 0.0
score MISSING_HEADERS 1.5
score MISSING_MID 1.0
score MISSING_MIMEOLE 2.0
score MISSING_SUBJECT 2.0

# scoring FREEMAIL
score FORGED_GMAIL_RCVD 2.5
score FORGED_YAHOO_RCVD 2.5
score FREEMAIL_ENVFROM_END_DIGIT 0.5
score FREEMAIL_FORGED_REPLYTO 0.5
score FREEMAIL_FROM 0
score FREEMAIL_REPLY 0.5
score FREEMAIL_REPLYTO 0.5
score FREEMAIL_REPLYTO_END_DIGIT 0.5
score MALFORMED_FREEMAIL 4.0

# additional scoring tweaks
score BILLION_DOLLARS 2.0
score BODY_URI_ONLY 1.5
score EMPTY_MESSAGE 1.5
score HELO_DYNAMIC_SPLIT_IP 2.0
score HK_RANDOM_ENVFROM 0.5
score HK_RANDOM_FROM 0.5
score LOTS_OF_MONEY 0.5
score MPART_ALT_DIFF 0.5
score MPART_ALT_DIFF_COUNT 1.0
score NO_DNS_FOR_FROM 0.5
score PDS_TONAME_EQ_TOLOCAL 0.5
score PDS_TONAME_EQ_TOLOCAL_VSHORT 0.5
score RDNS_NONE 1.5
score REPLYTO_WITHOUT_TO_CC 2.5
score UNPARSEABLE_RELAY 0.5
score URI_DQ_UNSUB 2.0

# add GBUDB TRUNCATE DNSBL
header RCVD_IN_GBUDB eval:check_rbl('gbudb', 'truncate.gbudb.net.')
describe RCVD_IN_GBUDB Listed in truncate.gbudb.net
tflags RCVD_IN_GBUDB net

# add JMF-Black DNSBL
header RCVD_IN_JMF_BL eval:check_rbl('jmf', 'black.junkemailfilter.com.')
describe RCVD_IN_JMF_BL Listed in black.junkemailfilter.com
tflags RCVD_IN_JMF_BL net

# add Spamrats DNSBL
header RCVD_IN_SPAMRATS eval:check_rbl('spamrats', 'all.spamrats.com.')
describe RCVD_IN_SPAMRATS Sender listed in all.spamrats.com
tflags RCVD_IN_SPAMRATS net

# add SpamEatingMonkey backscatter DNSBL
header RCVD_IN_SEM_BACKSCATTER eval:check_rbl('sem', 'backscatter.spameatingmonkey.net')
tflags RCVD_IN_SEM_BACKSCATTER net
describe RCVD_IN_SEM_BACKSCATTER Received from an IP listed by SEM-BACKSCATTER

# add SpamEatingMonkey network blacklist DNSBL
header RCVD_IN_SEM_NET_BLACK eval:check_rbl('sem', 'netbl.spameatingmonkey.net')
tflags RCVD_IN_SEM_NET_BLACK net
describe RCVD_IN_SEM_NET_BLACK Received from an IP listed by SpamEatingMonkeys

# add SpamEatingMonkey blacklist DNSBL
header RCVD_IN_SEM_BLACK eval:check_rbl('sem', 'bl.spameatingmonkey.net')
tflags RCVD_IN_SEM_BLACK net
describe RCVD_IN_SEM_BLACK Received from an IP listed by SpamEatingMonkeys

# add SpamEatingMonkey URIBL
urirhssub URIBL_SEM uribl.spameatingmonkey.net. A 2
body URIBL_SEM eval:check_uridnsbl('URIBL_SEM')
describe URIBL_SEM Contains a URI listed by SpamEatingMonkeys
tflags URIBL_SEM net

# add SpamEatingMonkey fresh domain URIBL
urirhssub URIBL_SEM_FRESH30 fresh30.spameatingmonkey.net. A 2
body URIBL_SEM_FRESH30 eval:check_uridnsbl('URIBL_SEM_FRESH30')
describe URIBL_SEM_FRESH30 From a domain registered less than 30 days ago
tflags URIBL_SEM_FRESH30 net

# add UCE DNSBL
header RCVD_IN_UCEPROTECT2 eval:check_rbl_txt('uceprotect2-lastexternal', 'dnsbl-2.uceprotect.net.')
describe RCVD_IN_UCEPROTECT2 Listed in dnsbl-2.uceprotect.net (open relay/proxy/dialup)
tflags RCVD_IN_UCEPROTECT2 net
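To get a feel for how the `score` lines above add up, here is a minimal Python sketch. It is not SpamAssassin's own code: the helper names `parse_scores` and `total_score`, the sample hits, and the threshold of 5.0 are assumptions for illustration only, so check the threshold configured on your own server.

```python
# Illustrative sketch only: SpamAssassin does the real scoring internally.

def parse_scores(rules_text):
    """Collect `score NAME VALUE` lines into a dict (a later line overwrites an earlier one here)."""
    scores = {}
    for line in rules_text.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "score":
            scores[parts[1]] = float(parts[2])
    return scores

def total_score(hits, scores):
    """Sum the scores of the rules that fired; rules missing from the dict count as 0 in this sketch."""
    return sum(scores.get(name, 0.0) for name in hits)

# A few lines copied from the rule set above.
RULES = """
# scoring BAYES
score BAYES_99  5.0
score BAYES_999 1.5
score RCVD_IN_DNSWL_HI -5.0
score URIBL_BLACK 4.5
"""

REQUIRED_SCORE = 5.0  # assumed spam threshold, not taken from this thread

scores = parse_scores(RULES)
hits = ["BAYES_99", "BAYES_999", "RCVD_IN_DNSWL_HI", "URIBL_BLACK"]  # hypothetical message
total = total_score(hits, scores)
print(total, total >= REQUIRED_SCORE)  # prints: 6.0 True
```

In this made-up case the strong allowlist hit (-5.0) cancels most of the Bayes score, and the URIBL hit is what pushes the message over the assumed threshold.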

I've also found the Abusix spam filtering to be very effective, with no false positives so far. The zone looks something like <UUID>.combined.mail.abusix.zone once you've registered (it's a free service for up to 5,000 queries per day).

necrevistonnezr

@d19dotca thanks!

I noticed you don’t use the SpamAssassin rules by Abusix mentioned here: https://docs.abusix.com/abusix-mail-intelligence/gbG8EcJ3x3fSUv8cMZLiwA/setting-up-different-systems/dmw9dcwSGSNQiLTssFAnBW#spamassassin ….?

Also, as Abusix has its headquarters in the US - is using their services still GDPR compliant? At least in theory, a DNSBL provider might see a lot of metadata from your mails (unless you rsync the list to your server, I believe).

d19dotca

@necrevistonnezr I'm based in Canada and we don't really have a GDPR, so I can't necessarily say with confidence, however in my understanding... the short answer I believe is "yes, it should be GDPR compliant".

Abusix just hosts an IP blacklist, which Cloudron's SMTP server queries to check whether an IP is on the list. Abusix has no idea about the email contents, since it never receives the message itself, only the originating IP address of the SMTP server that sent it. It's simply a DNSBL and doesn't act any differently from basically every other DNSBL. The only difference in this case is that I chose to use Abusix at the MTA layer instead of the application layer, which means the message is rejected before it consumes many resources on Cloudron (before it's technically accepted by the Cloudron SMTP server). I set up the other DNSBLs in SpamAssassin because they are usually good, but not good enough to use at the MTA layer: ideally there should be 0% false positives at the MTA layer, otherwise legitimate email will never arrive.
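For anyone wondering what a DNSBL lookup actually sends out, here is a minimal Python sketch of how the query name is conventionally built for an IPv4 address: the octets are reversed and the list's zone is appended. The function name `dnsbl_query_name` and the example IP (from the documentation range 192.0.2.0/24) are made up for illustration, and no DNS query is performed.

```python
import ipaddress

def dnsbl_query_name(ip, zone):
    """Build the DNS name that would be looked up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone.rstrip('.')}"

# Zone taken from the rules earlier in the thread; the IP is a placeholder.
print(dnsbl_query_name("192.0.2.25", "bl.spameatingmonkey.net"))
# prints: 25.2.0.192.bl.spameatingmonkey.net
```

The name encodes only the connecting IP, not any message content. The list operator's DNS servers do see that name, along with the address of whichever resolver asked.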

Hopefully that helps clarify it a bit.

necrevistonnezr

@d19dotca
Hi, I looked into the matter a bit yesterday.

At least under GDPR, even transmitting IP addresses can be a problem.

Under recital 30 of the GDPR, an IP address is an "identifier" and protected as personal data:

"Natural persons may be associated with online identifiers provided by their devices, applications, tools and protocols, such as internet protocol addresses, cookie identifiers or other identifiers such as radio frequency identification tags. This may leave traces which, in particular when combined with unique identifiers and other information received by the servers, may be used to create profiles of the natural persons and identify them." (source: https://gdpr.eu/recital-30-online-identifiers-for-profiling-and-identification/ and https://gdpr.eu/eu-gdpr-personal-data/).

Transmitting such data to a US entity without proper agreements and controls in place (which is currently virtually impossible after the Schrems II judgment, see https://www.cms-lawnow.com/ealerts/2022/05/schrems-ii-reactions-to-the-judgement-and-the-supervisory-authorities-recommendations-update-33) is non-compliant.

Therefore, professional providers (including my employer) rsync those lists to their local servers and conduct the matching with DNS-blacklists locally.
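A rough Python sketch of what local matching could look like once a list has been synced to your own server. The file format assumed here (one IP or CIDR range per line, with `#` comments) is hypothetical, as are the helper names `load_blocklist` and `is_listed`; real list providers each ship their own formats.

```python
import ipaddress

def load_blocklist(lines):
    """Parse lines containing an IP or CIDR range, ignoring blanks and '#' comments."""
    networks = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            # strict=False tolerates host bits being set, e.g. 192.0.2.5/24
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks

def is_listed(ip, networks):
    """Return True if the address falls inside any listed network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)

# Stand-in for a synced list file; the addresses are documentation ranges.
SAMPLE_LIST = [
    "# hypothetical local copy of a blocklist",
    "192.0.2.0/24   # example range",
    "198.51.100.7",
]

networks = load_blocklist(SAMPLE_LIST)
print(is_listed("192.0.2.25", networks))   # prints: True
print(is_listed("203.0.113.9", networks))  # prints: False
```

With this approach the sending IP never leaves your server, which is the point being made above. A linear scan like this is only meant to show the idea and would likely be too slow for a large list.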

This can be relevant for you if you sell your services as GDPR-compliant...

timconsidine

@d19dotca just letting you and others know that after implementing your list, spam on one account has reduced by 70%

Thank you so much !

d19dotca

@necrevistonnezr - Ah very interesting! I knew IP addresses were generally considered private data but didn't realize public SMTP server IP addresses were treated the same way in the EU for GDPR. Thanks for the insight, I'll definitely consider that going forward! I don't currently advertise as GDPR-compliant because my customers and I are all located in Canada where the GDPR doesn't exist (and they only offer their services in Canada too currently), but I try to follow the general philosophies / intentions of it such as respecting do-not-track cookies, logging as little personal info as possible, being able to delete the data upon request, etc. This is good info to have though, in case that changes for my customers or my own offerings and I need to pay more attention to GDPR.

@timconsidine - That's great to hear! Makes me happy to know that list is helping people.

necrevistonnezr

@d19dotca said in Sharing custom SpamAssassin Rules:

@necrevistonnezr - Ah very interesting! I knew IP addresses were generally considered private data but didn't realize public SMTP server IP addresses were treated the same way in the EU for GDPR.

Just to be clear (as I have understood): when you do a matching to a DNSBL, you send your SMTP server's IP as well as IPs / domains / patterns that seem suspicious (and hence you want to match with the DNS list) to the list operator - this combination of identifiers can contain quite a bit of information and therefore can be problematic to process.

timconsidine

@necrevistonnezr I avoid fussing about GDPR as I consider it a total mess but one important distinction to bear in mind is that while an IP address is indeed capable of indicating identity, the core issue is whether it is “PID” : personally identifiable data.

If the server is in a company name, I would think it questionable whether its IP address is PID.

Equally it is about protecting the privacy of those you have a relationship with, e.g. customers, prospects, suppliers, employees.

If someone is potentially spamming me, I don’t have any relationship with them, and I don’t see any issue with passing their data to anyone else for checking. They may not be a person either, so it may well not be PID.

Generally I think GDPR is “over-applied” and extended through excess caution, just to avoid any risk of non-compliance. Owing to my generally negative attitude towards GDPR, I err towards a restricted interpretation and a “f*ck ‘em” attitude. In the event of a complaint, run them in circles with a response challenging their interpretation of GDPR. They won’t bother to prove their case.

Let’s all stop pandering to the legislative cr@p invented by EU bureaucrats who generally knew close to nothing about the subject when they started their grand project (had to justify their existence and salaries somehow).

humpty

@timconsidine US resident here and just wanted to say that when I first heard about GDPR, I was like "WTF is that" and had the same "f*ck 'em" attitude. However, US privacy laws are dog shit so when I heard the EU fined Google for non-compliance, it made me chuckle. No law will be perfect, abusers will find loopholes, and politicians are.. well, why state the obvious. But if I had to choose between having GDPR or not, I prefer to have some protection laws.

timconsidine

@humptydumpty some privacy laws are definitely good and absolutely needed and should be global.
But not as structured / implemented in GDPR which is a mess.

For many businesses, it’s actually cheaper just to do “the right thing”, behave professionally with good security and ethics, and then risk a GDPR fine (which is highly unlikely anyway if you’ve done the preceding) than go through the expensive time-consuming nightmare bureaucracy that some suit behind a desk has thought a good idea.

We spent about 1-2 days on the important end-user customer-facing stuff, and about 12 months on the B2B contract ramifications, with everyone trying to shift the blame and responsibility while imposing stringent hand-cuffs.

No issues with privacy legislation itself, but I would point a flame-thrower at GDPR itself.

Think I better shut up now !
Too many people have a vested interest in promoting GDPR (lawyers, consultants, tech vendors etc.)

necrevistonnezr

@timconsidine said in Sharing custom SpamAssassin Rules:

@necrevistonnezr I avoid fussing about GDPR as I consider it a total mess but one important distinction to bear in mind is that while an IP address is indeed capable of indicating identity, the core issue is whether it is “PID” : personally identifiable data.

As I said, identifiers can be problematic in combination. And please think of false positives, where totally legit IPs are combined with your server domain - think a private domain and an IP linked to abortion information, stuff like that. BTW companies are not protected, only persons.

And while GDPR is exhausting (mostly, if not really understood) and needs improvements, it has led to many companies thinking about privacy when developing a product, not just afterwards. Or about how they need to secure private data with technical and organizational measures (Art 32 GDPR) - hardcoding passwords or keeping them unencrypted for example (think the current Twitter scandal) can lead to fines and liability. It’s a new mindset.

jdaviescoates

@d19dotca said in Sharing custom SpamAssassin Rules:

@necrevistonnezr Yes, you can simply copy & paste the entire thing.

Even though you link to the docs, methinks it's worth explicitly adding this to the top post too (perhaps also with a mention to scroll down to find your latest tweaks), i.e.

Just copy the latest set of rules (scroll down the thread to find the most recent) and then paste them into the Custom Spamassassin Rules box that pops up when you press the little pencil edit icon in Email -> Setting -> Spam filtering.

Thanks again (if you can count my previous upvote as a thanks) for sharing this!

d19dotca

@jdaviescoates said in Sharing custom SpamAssassin Rules:

Even though you link to the docs, methinks it's worth explicitly adding this to the top post too (perhaps also with a mention to scroll down to find your latest tweaks)

I'd like to do that but unfortunately (I guess too much time has passed so) it won't let me edit the original post anymore. All I can do is post updates to this one. Maybe @girish or somebody with admin privileges can allow me to do that or make the edit for me?

girish

I have elongated the post editing time frame.

wirkaholic

@girish, but do we get updates when @d19dotca just edits the post?

humpty

@wirkaholic you should get a notification on the forum but you can make sure you're "watching" this topic.

Also, in your account settings, you can have the forum email you when you have notifications. Go to Profile > Settings > Notifications > When a post is edited in a topic you are watching > Set it to "Notification & Email".

wirkaholic

@humptydumpty Yes, I can see! Thanks for your prompt reply!
