Cloudron makes it easy to run web apps like WordPress, Nextcloud, GitLab on your server. Find out more or install now.


Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • Bookmarks
  • Search
Skins
  • Light
  • Brite
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Brand Logo

Cloudron Forum

Apps | Demo | Docs | Install
  1. Cloudron Forum
  2. Discuss
  3. Deploying Anubis (AI Crawler Filtering) on a Cloudron Server

Deploying Anubis (AI Crawler Filtering) on a Cloudron Server

Scheduled Pinned Locked Moved Discuss
8 Posts 4 Posters 511 Views 3 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • H Offline
    H Offline
    hareen
    wrote on last edited by
    #1

    I'm wondering whether it's possible to deploy something like Anubis (Github, Webpage) to challenge web scraper bots when visiting certain subdomains on a Cloudron instance.

    I currently have another VPS which I have deployed Anubis with Nginx proxy Mangager and it works pretty well at stopping crawlers as per metrics output of Anubis.

    If it can't or shouldn't be deployed on Cloudron itself, is it possible for the VPS with Nginx Proxy Manager to handle traffic and then forward it to Cloudron? Not sure how Cloudron will behave with addition of an additional reverse proxy in front of it.

    1 Reply Last reply
    3
    • jamesJ Offline
      jamesJ Offline
      james
      Staff
      wrote on last edited by
      #2

      Hello @hareen
      Good question!
      Can't give a definite answer until I looked deeper into this tool and maybe test it.

      1 Reply Last reply
      1
      • ? Offline
        ? Offline
        A Former User
        wrote on last edited by james
        #3

        You could technically place Anubis on your existing VPS as a reverse proxy in front of Cloudron, but you'd need to ensure that the proxy passes headers and SSL correctly to avoid breaking Cloudron services. Have you tested if Cloudron’s built-in reverse proxy supports custom middleware or chaining with another proxy? If not, isolating Anubis to only specific subdomains via your external Nginx setup may be the safer route.

        1 Reply Last reply
        4
        • H Offline
          H Offline
          hareen
          wrote last edited by hareen
          #4

          Just an update on this. I managed to get everything working.

          I set up an Anubis instance on the second VPS, with the TARGET configured as https://<cloudron-ip>:443. For the REDIRECT_DOMAINS, I specified the services I wanted to proxy through Anubis, like so:
          REDIRECT_DOMAINS=app1.example.cloud,app2.example.cloud

          After that, I added those domains to Nginx Proxy Manager and pointed it at Anubis container port, and everything is now working as expected.

          1 Reply Last reply
          3
          • jamesJ Offline
            jamesJ Offline
            james
            Staff
            wrote last edited by
            #5

            Hello @hareen
            That sounds great! Would you be willing to write up a detailed how to post?

            H 1 Reply Last reply
            1
            • SansGuidonS Offline
              SansGuidonS Offline
              SansGuidon
              wrote last edited by
              #6

              I'm interested too 😉

              About me / Now

              1 Reply Last reply
              0
              • jamesJ james

                Hello @hareen
                That sounds great! Would you be willing to write up a detailed how to post?

                H Offline
                H Offline
                hareen
                wrote last edited by
                #7

                @james I'd be happy to. I'll put together a guide and share it here over the weekend.

                1 Reply Last reply
                0
                • H Offline
                  H Offline
                  hareen
                  wrote last edited by hareen
                  #8

                  Setup Overview
                  In this setup, I will be using Nginx Proxy Manager, but these instructions will also apply to other reverse proxy setups. The goal is to proxy Cloudron traffic via Anubis without disrupting the existing Cloudron installation. To achieve this, I’ll be utilizing a second VPS to deploy Anubis and to proxy traffic. This arrangement allows to proxying of selected apps through Anubis instead of the entire server.

                  Note: This setup will not work for apps on Cloudron that require additional ports to be forwarded, beyond just port 443.

                  VPS Configuration

                  VPS 1: This VPS runs Cloudron with the apps you want to proxy.

                  VPS 2: This VPS runs Ubuntu Server and hosts three Docker containers:

                  • Nginx Proxy Manager: This acts as the reverse proxy for traffic going to Anubis.

                  • Anubis: This container forwards valid requests to the Cloudron server on VPS 1.

                  • Redis: Anubis stores completed challenge data in memory by default, which is lost on restart. The Redis container provides persistent storage for this data, ensuring that Anubis retains challenge information between restarts.

                  The following steps assume that Docker is installed and the user is familiar with deploying a Docker Compose file.

                  VPS 2 Docker Compose

                  See below for docker compose for above mentioned container stack. I've made comments below where modification's required. For additional information on Anubis Variables, visit this link.

                  services:
                    ngixn-proxy-manager:
                      image: jc21/nginx-proxy-manager:latest
                      container_name: npm
                      restart: unless-stopped
                      ports:
                        - "127:.0.0.1:80:80"
                        - "443:443"
                        - "127:.0.0.1:81:81"
                      volumes:
                        - /home/appdata/npm/data:/data #Change this path for npm data as required
                        - /home/appdata/npm/letsencrypt:/etc/letsencrypt #Change this path for npm certs as required
                      networks:
                        - anubis-cloudron
                      
                      
                    anubis-cloudron-redis:
                      image: redis:8-alpine
                      container_name: anubis-cloudron-redis
                      restart: always
                      volumes:
                        - redis_data:/data
                      networks:
                        - anubis-cloudron
                  
                  
                    anubis-cloudron:
                      image: ghcr.io/techarohq/anubis:latest
                      container_name: anubis-cloudron
                      ports:
                        - "127.0.0.1:10000:10000" #This port can be utilised with a prometheus container for Anubis metrics
                        - "127.0.0.1:8300:8300" #The network port that Anubis listens on. 
                      pull_policy: always
                      restart: always
                      depends_on:
                        - anubis-cloudron-redis
                      environment:
                        BIND: ":8300" #The network port that Anubis listens on. 
                        DIFFICULTY: "4" #The difficulty of the challenge
                        METRICS_BIND: ":10000" # Prometheus Metrics Port
                        SERVE_ROBOTS_TXT: "true" #If set true, Anubis will serve a default robots.txt file that disallows all known AI scrapers.
                        POLICY_FNAME: "/data/cfg/botPolicy.yaml" # Config file Internal location. This can be left as it is.
                        TARGET: "https://<CLOUDRON-VPS-IP-ADDRESS>" #Change this to the IP address of Cloudron Server.
                        TARGET_INSECURE_SKIP_VERIFY: "true" #Skip TLS certificate validation for targets that listen over https. This is required.
                        REDIRECT_DOMAINS: "app1.example.cloud, app2.example.cloud" #These should match the current cloudron app sub domains. This can be expanded as required.
                        COOKIE_DYNAMIC_DOMAIN: "true" #If set to true, automatically set cookie domain fields based on the hostname of the request. 
                        COOKIE_EXPIRATION_TIME: "168h" #The amount of time the authorization cookie is valid for.
                        COOKIE_SECURE: "true"
                        ED25519_PRIVATE_KEY_HEX: "4e7d024d97030b8e80f89de052494b31ff799d0ee83c238c6f22d01979ba8b54" #This is a sample key. Generate a new key by running 'openssl rand -hex 32' and paste it here
                        OG_PASSTHROUGH: "false" #If set to true, Anubis will enable Open Graph tag passthrough. 
                      volumes:
                        - "/home/appdata/anubis/cloudron.yaml:/data/cfg/botPolicy.yaml:ro" # Config file location. Change this to the location of file below in section below. This is to be manually created.
                  
                      networks:
                        - anubis-cloudron
                  
                  
                  volumes:
                    redis_data:
                    
                  networks:
                    anubis-cloudron:
                      driver: bridge 
                     
                  

                  Anubis - Config file

                  Below's the configuration file which I'm utilising for Cloudron. This setup allows both Mastodon and Pixelfed to be behind Anubis. Modify the configuration file as required depending on your Cloudron services and change mapping of config file path on docker compose file based on where this file is saved.

                    - import: (data)/bots/ai-robots-txt.yaml
                    - import: (data)/bots/cloudflare-workers.yaml
                    - import: (data)/bots/headless-browsers.yaml
                    - import: (data)/bots/us-ai-scraper.yaml
                    - import: (data)/crawlers/googlebot.yaml
                    - import: (data)/crawlers/bingbot.yaml
                    - import: (data)/crawlers/duckduckbot.yaml
                    - import: (data)/crawlers/qwantbot.yaml
                    - import: (data)/crawlers/internet-archive.yaml
                    - import: (data)/crawlers/kagibot.yaml
                    - import: (data)/crawlers/marginalia.yaml
                    - import: (data)/crawlers/mojeekbot.yaml
                    - import: (data)/clients/git.yaml
                    - import: (data)/common/keep-internet-working.yaml
                  
                  
                    - name: allow-uptime-kuma
                      user_agent_regex: Uptime-Kuma.*
                      action: ALLOW
                  
                    - name: allow-api
                      path_regex: ^/api/
                      action: ALLOW
                  
                    - name: allow-assets
                      action: ALLOW
                      path_regex: \.(eot|ttf|woff|woff2|css|js|jpg|jpeg|png|mp4|webp|svg)$
                  
                    - name: allow-website-logos
                      action: ALLOW
                      path_regex: ^/hareen/website-logos/.*$
                  
                    - name: allow-well-known
                      path_regex: ^/.well-known/.*$
                      action: ALLOW
                  
                    - name: allow-mastodon-actors-objects
                      path_regex: ^/users/[^/]+(/.*)?$
                      action: ALLOW
                  
                    - name: allow-shared-inbox
                      path_regex: ^/inbox$
                      action: ALLOW
                  
                    - name: allow-pixelfed-actors-objects
                      path_regex: ^/@[^/]+(/.*)?$
                      action: ALLOW
                  
                    - name: allow-user-inbox
                      path_regex: ^/@[^/]+/inbox$
                      action: ALLOW
                  
                    - name: allow-nodeinfo-webfinger
                      path_regex: ^/\\.well-known/(host-meta|webfinger|nodeinfo.*)$
                      action: ALLOW
                  
                    - name: generic-browser
                      user_agent_regex: >-
                        Mozilla|Opera
                      action: CHALLENGE
                  
                  dnsbl: false
                  
                  thresholds:
                    - name: minimal-suspicion
                      expression: weight <= 0
                      action: ALLOW
                  
                    - name: mild-suspicion
                      expression:
                        all:
                          - weight > 0
                          - weight < 10
                      action: CHALLENGE
                      challenge:
                        algorithm: metarefresh
                        difficulty: 1
                        report_as: 1
                  
                    - name: moderate-suspicion
                      expression:
                        all:
                          - weight >= 10
                          - weight < 20
                      action: CHALLENGE
                      challenge:
                        algorithm: fast
                        difficulty: 2
                        report_as: 2
                  
                    - name: extreme-suspicion
                      expression: weight >= 20
                      action: CHALLENGE
                      challenge:
                        algorithm: fast
                        difficulty: 4
                  
                  store:
                    backend: valkey
                    parameters:
                      url: "redis://anubis-cloudron-redis:6379/0"
                  
                  

                  Note: Redis is mapped in the configuration file and therefore creation of a config file is required prior to deploying the docker compose file above (Refer to 'store:' in above config).

                  Once the above setup is deployed, visit the Nginx Proxy Manager interface on Port 81, setup an account by following the on screen instructions.

                  DNS Provider Configuration

                  Visit the DNS provider, and change the ipv4 and ipv6 addresses of app1.example.cloud and app2.example.cloud, which points at VPS1 (Cloudron server) which were setup previously with Cloudron, to point at VPS2 (Anubis server) ipv4/ipv6 address instead.

                  Note: This change will disrupt these services till the next few steps are followed.

                  Nginx Proxy Mananger Configuration

                  First setup a SSL cerificate for *.example.cloud by visiting the SSL Certificate Tab → Let's Encrypt Certificate. DNS challange option can be utilised here for automated validation without needing to open port 80. Instructions will be shown when the slider's enabled.
                  Screenshot 2025-08-09 at 7.44.43 PM.png

                  Add a Proxy Host for app1.example.cloud. Set the scheme as http and point at the Anubis container (anubis-cloudron:8300 if above docker compose file was followed).
                  Screenshot 2025-08-09 at 6.03.56 PM.png

                  Next click SSL and select the certificate created in the previous step. Then enable Force SSL and HTTP/2 Support. Both HSTS options can also be enabled here based on the application being proxied.
                  Screenshot 2025-08-09 at 6.09.29 PM.png

                  Allow inbound traffic on port 443 on VPS 2, and ensure that UFW or any other firewall in use also allows traffic on this port. Once this is configured, app1.example.cloud (hosted on Cloudron) will be accessible with Anubis protection in place.

                  To add additional Cloudron subdomains, repeat the same steps. Don't forget to update the Docker Compose file to include the new subdomains in the REDIRECT_DOMAINS environment variable within the Anubis container configuration.

                  Optional Steps
                  A Prometheus docker container can be deployed which can be used with Anubis Metrics Port (10000 in above docker compose) to monitor the Anubis instance with Grafana to output data similar to this.
                  Screenshot 2025-08-09 at 6.31.40 PM.png

                  1 Reply Last reply
                  1
                  Reply
                  • Reply as topic
                  Log in to reply
                  • Oldest to Newest
                  • Newest to Oldest
                  • Most Votes


                  • Login

                  • Don't have an account? Register

                  • Login or register to search.
                  • First post
                    Last post
                  0
                  • Categories
                  • Recent
                  • Tags
                  • Popular
                  • Bookmarks
                  • Search