    Cloudron Forum


    Spam filter doesn't always learn

    Support
    email spam
    • imc67
      imc67 translator last edited by imc67

      Hi,

      For weeks I have noticed that some spam arriving in my Cloudron mailbox, which I have trained on (by moving it to Spam in Roundcube), keeps landing in the Inbox.

      The same mailbox is connected to FreeScout, and spam detection there (simply by sender address) works.

      Is there something I can see/do in Haraka to optimize this?

      • d19dotca
        d19dotca last edited by d19dotca

        Unfortunately better spam control isn't here yet, but I sure hope it'll arrive in 6.0 or 6.1. 🙂

        In the meantime though... the Cloudron docs say it takes about 50+ messages before SpamAssassin has enough data to "start" learning. So before it gets "good", I'd hazard a guess that it could take a couple hundred messages (depending on how much spam you get). I get a lot (~10-30/day), and thankfully, a few months after switching to Cloudron, most spam goes to my junk box now, but the first few months were bad. lol. This may also have been improved in Cloudron updates since then.

        So I guess my first question to you is... how many messages are currently in your spam folder? If it's not many, that may be why: it needs more to learn from.

        --
        Dustin Dauncey
        www.d19.ca

        • girish
          girish Staff last edited by

          Ideally, SA should be catching these. But the fact of the matter is that SA gives fairly low visibility into its learning or how it categorizes mail. I think something like rspamd may be better here, but it requires more resources to run and has more packaging requirements than plain SA.

          @imc67 if you look into the raw mail headers, it will tell you how it assigned the scoring. Is there anything obvious there?
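For anyone who wants to pull the scoring out of the raw headers programmatically, here is a minimal Python sketch. It assumes the message carries an `X-Spam-Status` header in the common SpamAssassin style (`Yes/No, score=... required=... tests=...`); the exact header layout that Cloudron's mail stack writes may differ, and the function name `parse_spam_status` is only illustrative.

```python
import re
from email import message_from_string


def parse_spam_status(raw_message: str):
    """Return the verdict, score, threshold and rule names from X-Spam-Status.

    Returns None when the message has no such header.
    """
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status")
    if status is None:
        return None

    # Long headers are folded across lines; collapse all whitespace runs.
    status = " ".join(status.split())

    score = re.search(r"score=(-?\d+(?:\.\d+)?)", status)
    required = re.search(r"required=(-?\d+(?:\.\d+)?)", status)
    # Rule names run from "tests=" up to the next "key=" field (or the end).
    tests = re.search(r"tests=(.*?)(?=\s+\w+=|$)", status)

    return {
        "flagged": status.lower().startswith("yes"),
        "score": float(score.group(1)) if score else None,
        "required": float(required.group(1)) if required else None,
        "tests": [t.strip() for t in tests.group(1).split(",") if t.strip()]
        if tests
        else [],
    }
```

Rule names that start with `BAYES_` are the ones that reflect what the filter has learned, so a spam message that keeps landing in the Inbox with a low `BAYES_` rule is a hint that training has not caught up with it.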

          • MooCloud_Matt
            MooCloud_Matt @girish last edited by

            @girish said in Spam filter doesn't always learn:

            rspamd is better here but it does require more resources

            rspamd requires less training to work and has a less complicated configuration.
            But without ClamAV it is pretty much equal to SpamAssassin with a little training and tweaking.

            Antispam systems in general are really heavy to run and complicated; the out-of-the-box antispam in Plesk or cPanel is similar to the one on Cloudron.
            The only real solution is to implement a more complete stack, with blacklists, signatures, and DNSBLs, in a collaborative way, so that every Cloudron can report spam and especially ham.

            @imc67 if you want to improve things manually, you can start by adding the Barracuda DNSBL (b.barracudacentral.org). (This guide should work; back up your config file before editing it.)
            The issue with Cloudron is mostly resources: an efficient antispam setup takes too much RAM and CPU, so we need to find alternative solutions, and my company is trying to help Cloudron improve on this.
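To see what a DNSBL check actually does before touching any config, here is a rough Python sketch of the standard lookup: reverse the IPv4 octets, append the list's zone, and resolve the resulting name. A listed address resolves (conventionally to something in 127.0.0.0/8), while an unlisted one fails to resolve. The function names are illustrative, and this is not how Haraka or Cloudron wire it up internally.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str = "b.barracudacentral.org") -> str:
    """Build the DNS name to query for an IPv4 address on a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str = "b.barracudacentral.org") -> bool:
    """Return True if the DNSBL answers with a 127.x.x.x address for this IP.

    Any resolution failure is treated as "not listed", so a broken resolver
    will also produce False here.
    """
    try:
        answer = socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return answer.startswith("127.")
```

Checking the sending IP of a spam message that slipped through with `is_listed(...)` gives a quick idea of whether adding the list would have helped for that message.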

            Matteo. R.
            Founder and Tech-Support Manager.
            MooCloud MSP
            Swiss Managed Service Provider

            • d19dotca
              d19dotca last edited by

              Out of curiosity, is Cloudron configuring SpamAssassin for BAYES training with sa-learn commands? I tried searching the Cloudron Git but couldn't find any references to it. Perhaps this may be helpful for improving spam filtering with regards to BAYES and such?

              https://cwiki.apache.org/confluence/display/SPAMASSASSIN/BayesInSpamAssassin


              • MooCloud_Matt
                MooCloud_Matt @d19dotca last edited by

                @d19dotca
                From my understanding it does, but a single server doesn't have enough data to actually make a difference.
                Not a significant one, anyway.


                • d19dotca
                  d19dotca @MooCloud_Matt last edited by

                  @MooCloud_Matt I see the BAYES rules applied, but I find that I often have to move spam from inbox to spam or vice versa, and it doesn't seem to learn very well at all. Makes me question whether this can be improved. Is the sa-learn command run from a cron job, for example, to force learning from the inbox and archive folders (rather than the inbox alone) as ham, and from the junk folder as spam? It just seems very ineffective, anecdotally, unfortunately.
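To make the question concrete, this is roughly what such a scheduled training job could look like, sketched in Python. It only uses `sa-learn --ham` and `sa-learn --spam` on folder paths; the Maildir paths in the usage note are hypothetical, and whether Cloudron already runs something equivalent is not confirmed in this thread.

```python
import subprocess
from pathlib import Path
from typing import Dict, List


def sa_learn_command(kind: str, folder: Path) -> List[str]:
    """Build one sa-learn invocation for a folder of ham or spam."""
    if kind not in ("ham", "spam"):
        raise ValueError(f"kind must be 'ham' or 'spam', got {kind!r}")
    return ["sa-learn", f"--{kind}", str(folder)]


def train(folders: Dict[str, List[Path]], dry_run: bool = True) -> List[List[str]]:
    """Build (and optionally run) sa-learn for every configured folder.

    With dry_run=True nothing is executed; the commands are only returned.
    """
    commands = [
        sa_learn_command(kind, folder)
        for kind, paths in folders.items()
        for folder in paths
    ]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands
```

A cron entry could call `train({"ham": [Path("/path/to/Maildir/cur"), Path("/path/to/Maildir/.Archive/cur")], "spam": [Path("/path/to/Maildir/.Junk/cur")]}, dry_run=False)`. Afterwards, `sa-learn --dump magic` prints the ham and spam counts the Bayes database has seen, which is a quick way to check whether training is happening at all.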


                  • MooCloud_Matt
                    MooCloud_Matt @d19dotca last edited by

                    @d19dotca
                    We saw an improvement once we were feeding about 500 mails per day to our ML filter. BAYES normally needs less data, but if you don't have enough fresh data it will always be too late.

