

    Cloudron Forum


    Set robots.txt defaults to the following

    Feature Requests
    robots
    • subtlecourage last edited by girish

      User-agent: ia_archiver
      Disallow: /

      User-agent: archive.org_bot
      Disallow: /

      User-agent: *
      Disallow: /

      User-agent: Rogerbot
      User-agent: Exabot
      User-agent: MJ12bot
      User-agent: Dotbot
      User-agent: Gigabot
      User-agent: Baiduspider
      User-agent: Ezooms
      User-agent: Nutch
      User-agent: archive.org_bot
      User-agent: MJ12bot
      User-agent: YandexBot
      User-agent: AhrefsBot
      User-agent: HTTrack
      User-agent: Wget
      User-agent: Zeus
      User-agent: BLEXBot
      User-agent: burroboot
      User-agent: DOC
      User-agent: MJ12Bot
      User-agent: SemrushBot
      User-agent: spbot
      User-agent: UbiCrawler
      User-agent: Zao
      User-agent: Netsparker
      User-agent: sitecheck.internetseer.com
      User-agent: Zealbot
      User-agent: MSIECrawler
      User-agent: SiteSnagger
      User-agent: WebStripper
      User-agent: WebCopier
      User-agent: Fetch
      User-agent: Offline Explorer
      User-agent: Teleport
      User-agent: TeleportPro
      User-agent: WebZIP
      User-agent: linko
      User-agent: Microsoft.URL.Control
      User-agent: Xenu
      User-agent: larbin
      User-agent: libwww
      User-agent: ZyBORG
      User-agent: Download Ninja
      User-agent: grub-client
      User-agent: k2spider
      User-agent: NPBot
      User-agent: WebReaper
      User-agent: CyotekWebCrawler
      User-agent: Whizbang
      User-agent: UniverseBot
      User-agent: SlySearch
      Disallow: /

      • fbartels (App Dev), in reply to @subtlecourage

        Making this "the default" would probably conflict with users who run public-facing websites on Cloudron (myself included).

        Besides:

        @subtlecourage said in Set robots.txt defaults to the following:

        User-agent: *
        Disallow: /

        is already added to the app's robots.txt if you go into the "security" tab of a website and select "Disable indexing". What makes you think that bots that ignore * (meaning all user agents) will stay away from your website just because you name their user agent individually?
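
To illustrate the point about the wildcard, here is a minimal sketch using Python's standard `urllib.robotparser`. It only models a crawler that actually follows robots.txt; the helper name `is_allowed` and the choice of sample agents are assumptions made for this example, not anything Cloudron ships. For a compliant parser, the two-line wildcard rule already covers every bot in the long list, so naming them individually adds nothing.

```python
from urllib.robotparser import RobotFileParser

# The two lines that "Disable indexing" is said to add, per the post above.
WILDCARD_ONLY = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str = "/") -> bool:
    """Return True if a compliant crawler with this user agent may fetch url.

    Hypothetical helper for illustration: parses robots.txt text in memory
    instead of fetching it from a server.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A few agents taken from the proposed list; none is named in WILDCARD_ONLY.
    for agent in ("AhrefsBot", "MJ12bot", "SemrushBot", "Baiduspider"):
        print(agent, "allowed" if is_allowed(WILDCARD_ONLY, agent) else "blocked")
```

A crawler that ignores robots.txt never runs this kind of check at all, so neither the wildcard nor a named entry would stop it.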

        • subtlecourage, in reply to @fbartels

          @fbartels Optimism and hope? /s

          Great points.
