Spam filter doesn't always learn

    • imc67 (translator) wrote:
      #1

      Hi,

      For several weeks I have noticed that some spam arriving in a Cloudron mailbox keeps landing in the Inbox, even after I have trained the filter on it (by moving it to Spam in Roundcube).

      The same mailbox is connected to FreeScout, and there the spam detection (simply by sender address) is working.

      Is there something I can see/do in Haraka to optimize this?

      • d19dotca wrote:
        #2

        Unfortunately better spam control isn't here yet, but I sure hope it'll arrive in 6.0 or 6.1. 🙂

        In the meantime though... the Cloudron docs say it takes about 50+ messages before SpamAssassin has enough data to "start" to learn. So before it gets "good" I'd hazard a guess that it could take a couple hundred messages (depending on how much spam you get). I get a lot (~10-30/day) and thankfully, a few months after switching to Cloudron, most spam goes into my junk box now, but the first few months were bad. lol. This may have been improved in Cloudron updates since then too.

        So I guess my first question to you is... how many messages are currently in your spam folder? If it's not many, that may be why: it needs more to learn from.
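One way to answer the "how many messages has it learned from" question is to read the counters in SpamAssassin's Bayes database. The following is a minimal sketch, assuming you can get hold of the text printed by `sa-learn --dump magic` (how to run that inside Cloudron's mail container is not covered in this thread). The sample text is made up for illustration, `parse_bayes_counts` and `has_enough_training` are hypothetical helper names, and the minimum of 50 is just the figure quoted above; stock SpamAssassin defaults `bayes_min_spam_num` and `bayes_min_ham_num` to 200.

```python
# Illustrative sample of `sa-learn --dump magic` output (not real data).
SAMPLE_DUMP = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0         37          0  non-token data: nspam
0.000          0        412          0  non-token data: nham
0.000          0      51234          0  non-token data: ntokens
"""


def parse_bayes_counts(dump_output: str) -> dict:
    """Return the learned spam/ham message counts from the dump text."""
    counts = {}
    for line in dump_output.splitlines():
        if "non-token data:" not in line:
            continue
        fields, label = line.split("non-token data:", 1)
        label = label.strip()
        if label in ("nspam", "nham"):
            # The third numeric column holds the value for these entries.
            counts[label] = int(fields.split()[2])
    return counts


def has_enough_training(counts: dict, minimum: int = 50) -> bool:
    """True once both spam and ham counts reach the assumed minimum."""
    return counts.get("nspam", 0) >= minimum and counts.get("nham", 0) >= minimum


if __name__ == "__main__":
    counts = parse_bayes_counts(SAMPLE_DUMP)
    print(counts, has_enough_training(counts))
```

Bayes needs both spam and ham samples, so a mailbox with plenty of ham but only a handful of learned spam would still fall short in this check.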

        --
        Dustin Dauncey
        www.d19.ca

        • girish (Staff) wrote:
          #3

          Ideally, SA should be catching this. But the fact of the matter is that SA gives kinda low visibility into its learning or how it categorizes mail. I think maybe something like rspamd is better here, but it requires more resources to run and also has more packaging requirements than plain SA.

          @imc67 if you look into the raw mail headers, it will tell you how it assigned the scoring. Is there anything obvious there?
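For anyone who wants to pull the scoring out of the raw headers programmatically, here is a rough sketch. It assumes the message carries an `X-Spam-Status` header in SpamAssassin's usual `score=... required=... tests=...` form; the sample message and the helper name `parse_spam_status` are made up for illustration, and the exact headers added by Cloudron's mail stack may differ.

```python
import re
from email import message_from_string
from email.policy import default

# Made-up sample message; the header layout is an assumption.
SAMPLE = (
    "From: sender@example.com\n"
    "Subject: test\n"
    "X-Spam-Status: Yes, score=7.3 required=5.0 tests=BAYES_99,HTML_MESSAGE,\n"
    "\tRDNS_NONE autolearn=no version=3.4.2\n"
    "\n"
    "body\n"
)


def parse_spam_status(raw_message: str) -> dict:
    """Extract verdict, score, threshold and rule names from X-Spam-Status."""
    msg = message_from_string(raw_message, policy=default)
    # Collapse folded header lines and repeated whitespace into single spaces.
    flat = " ".join(str(msg.get("X-Spam-Status", "")).split())

    score = re.search(r"score=(-?\d+(?:\.\d+)?)", flat)
    required = re.search(r"required=(-?\d+(?:\.\d+)?)", flat)
    # The rule list runs until the next "key=" field or the end of the header.
    tests = re.search(r"tests=(.*?)(?:\s+\w+=|$)", flat)

    return {
        "flagged": flat.lower().startswith("yes"),
        "score": float(score.group(1)) if score else None,
        "required": float(required.group(1)) if required else None,
        "tests": [t.strip() for t in tests.group(1).split(",") if t.strip()] if tests else [],
    }


if __name__ == "__main__":
    print(parse_spam_status(SAMPLE))
```

The interesting part for this thread is whether any `BAYES_*` rule shows up in `tests` for the spam that keeps reaching the Inbox, and how much it contributes compared with the required score.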

          • MooCloud_Matt wrote:
            #4

            @girish said in Spam filter doesn't always learn:

            rspamd is better here but it does require more resources

            rspamd requires less training to work, and its configuration is less complicated.
            But without ClamAV it is pretty much equal to SpamAssassin with a little bit of training and tweaking.

            Antispam systems in general are really heavy to run and complicated; the out-of-the-box antispam in Plesk or cPanel is similar to Cloudron's.
            The only solution is to implement a more complete stack in general, with blacklists, signatures and DNSBLs used in a collaborative way, so that every Cloudron can report spam and especially ham.

            @imc67 if you want to improve things manually, you can start by adding the Barracuda DNSBL (b.barracudacentral.org). (This guide should work; back up your config file before editing it.)
            The issue on Cloudron is mostly resources, because an efficient antispam takes too much RAM and CPU, so we need to find an alternative solution, and my company is trying to help Cloudron improve on this.
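As background on what a DNSBL check does: the sender's IPv4 address is reversed octet by octet, the list's zone is appended, and the resulting name is looked up in DNS; an answer means "listed". A minimal sketch of that general mechanism follows. It is not how Haraka or Cloudron implement it, the helper names are hypothetical, and the zone is the one mentioned above (check the list operator's usage terms before querying it).

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str = "b.barracudacentral.org") -> str:
    """Build the DNS name used to ask a DNSBL about an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str = "b.barracudacentral.org") -> bool:
    """Return True if the DNSBL answers for this address (needs network)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No DNS record: the address is not listed (or the lookup failed).
        return False


if __name__ == "__main__":
    # 192.0.2.1 is a documentation address, used here only as an example.
    print(dnsbl_query_name("192.0.2.1"))
```

Because each check is a single DNS query, this kind of filter costs very little RAM or CPU compared with content scanning.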

            Matteo. R.
            Founder and Tech-Support Manager.
            MooCloud MSP
            Swiss Managed Service Provider

            • d19dotca wrote:
              #5

              Out of curiosity, is Cloudron configuring SpamAssassin for BAYES training with sa-learn commands? I tried searching the Cloudron Git but couldn't find any references to it. Perhaps this may be helpful for improving spam filtering with regards to BAYES and such?

              https://cwiki.apache.org/confluence/display/SPAMASSASSIN/BayesInSpamAssassin

              --
              Dustin Dauncey
              www.d19.ca

              • MooCloud_Matt wrote:
                #6

                @d19dotca
                From my understanding it does, but there is not enough data on a single server to make a significant difference.

                Matteo. R.
                Founder and Tech-Support Manager.
                MooCloud MSP
                Swiss Managed Service Provider

                • d19dotca wrote:
                  #7

                  @MooCloud_Matt I see the BAYES rules applied, but I find that I often have to move spam from inbox to spam or vice versa, and it doesn’t seem to learn very well at all. Makes me question if this can be improved at all. Is the sa-learn command running as a cron job, for example, to force learning from the inbox and archive folders as ham (rather than from the inbox alone) and from the junk folder as spam? Just seems very ineffective, anecdotally, unfortunately.
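To make the idea concrete, here is a sketch of what such a scheduled job could run. It only builds the `sa-learn` command lines (`--ham`, `--spam`, `--no-sync` and `--sync` are standard sa-learn options) and does not execute them. The Maildir paths are hypothetical placeholders, `build_sa_learn_commands` is a made-up helper name, and nothing in this thread confirms whether Cloudron already does something equivalent.

```python
import shlex


def build_sa_learn_commands(ham_dirs, spam_dirs):
    """Return sa-learn invocations: ham folders, spam folders, then one sync."""
    commands = []
    for path in ham_dirs:
        # --no-sync defers the database/journal sync until the end.
        commands.append(["sa-learn", "--no-sync", "--ham", path])
    for path in spam_dirs:
        commands.append(["sa-learn", "--no-sync", "--spam", path])
    # A single sync after all folders have been processed.
    commands.append(["sa-learn", "--sync"])
    return commands


if __name__ == "__main__":
    # Hypothetical Maildir layout; adjust to wherever the mailboxes live.
    ham = ["/path/to/maildir/cur", "/path/to/maildir/.Archive/cur"]
    spam = ["/path/to/maildir/.Junk/cur"]
    for cmd in build_sa_learn_commands(ham, spam):
        print(shlex.join(cmd))
```

Learning ham from the archive as well as the inbox gives Bayes a larger and more representative ham sample, which is the point raised in the question.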

                  --
                  Dustin Dauncey
                  www.d19.ca

                  • MooCloud_Matt wrote:
                    #8

                    @d19dotca
                    We saw an improvement once about 500 mails per day were being fed to our ML filter. BAYES normally needs less data, but if you don't have enough fresh data it will always be too late.

                    Matteo. R.
                    Founder and Tech-Support Manager.
                    MooCloud MSP
                    Swiss Managed Service Provider

                    • sponch wrote:
                      #9

                      Is the spam filter also trained when moving mails into the spam folder in e.g. the Apple Mail app, or does this only work via MailCube?

                      • d19dotca wrote:
                        #10

                        I've been noticing an increase in spam messages lately, and have been trying to determine what else I can do. I've tightened up the rules a bit so less ends up in the inbox (I'll update the other thread with updated rules soon), but I also want to force SpamAssassin to learn from the archive folders as ham and the junk folder as spam, to get a more accurate BAYES score for individual users.

                        --
                        Dustin Dauncey
                        www.d19.ca

                        • d19dotca wrote:
                          #11

                          @sponch said in Spam filter doesn't always learn:

                          Is the spam filter also trained when moving mails into the spam folder in e.g. the Apple Mail app, or does this only work via MailCube?

                          I believe so, yes. Any mail moved from the inbox to the spam folder should be learned as spam. I think the issue is that this doesn't happen reliably, but I don't think it has to do with the mail client in particular; it's more server-side.

                          --
                          Dustin Dauncey
                          www.d19.ca

                          • nebulon (Staff) wrote:
                            #12

                            I also had some pretty annoying spam which just wouldn't get marked as spam, despite my having marked the very same mail as spam more than once before. In my case tweaking the Bayes filter worked very well, but your mileage may vary (a lot of the spam I get is in German).

                            You can set custom SpamAssassin rules (https://docs.cloudron.io/email/#custom-spam-filtering-rules) and adjust the Bayes scores there like this:

                            score BAYES_999 2.0
                            score BAYES_99 4.5
                            

                            These values will probably have to be adjusted to your case, and I wouldn't bump up the weights too much, to avoid marking ham as spam too often. I hope we can improve on that in the future, as having to tweak such rules manually is really not a good strategy.
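To see why these two numbers matter, it helps to do the arithmetic. The sketch below makes two assumptions that are not stated in this thread: that the spam threshold is 5.0 (stock SpamAssassin's default `required_score`; Cloudron's may differ), and that a message with a very high Bayes probability hits both `BAYES_99` and `BAYES_999`, as with the stock rules. The helper names are made up for illustration.

```python
# Scores from the custom rules above.
CUSTOM_SCORES = {"BAYES_999": 2.0, "BAYES_99": 4.5}

# Assumption: stock SpamAssassin default threshold; Cloudron's may differ.
REQUIRED_SCORE = 5.0


def total_score(hits, scores):
    """Sum the scores of the rules a message hit (unknown rules count as 0)."""
    return sum(scores.get(name, 0.0) for name in hits)


def is_spam(hits, scores, required=REQUIRED_SCORE):
    """A message is flagged once its total reaches the threshold."""
    return total_score(hits, scores) >= required


if __name__ == "__main__":
    both = ["BAYES_99", "BAYES_999"]
    print(total_score(both, CUSTOM_SCORES), is_spam(both, CUSTOM_SCORES))
    print(is_spam(["BAYES_99"], CUSTOM_SCORES))
```

Under these assumptions a message that hits both rules totals 6.5 and is flagged on Bayes alone, while `BAYES_99` by itself (4.5) still needs at least 0.5 points from other rules. That is also why raising the weights further increases the risk of marking ham as spam.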
