Set robots.txt defaults to the following
```
User-agent: ia_archiver
Disallow: /

User-agent: archive.org_bot
Disallow: /

User-agent: *
Disallow: /

User-agent: Rogerbot
User-agent: Exabot
User-agent: MJ12bot
User-agent: Dotbot
User-agent: Gigabot
User-agent: Baiduspider
User-agent: Ezooms
User-agent: Nutch
User-agent: YandexBot
User-agent: AhrefsBot
User-agent: HTTrack
User-agent: Wget
User-agent: Zeus
User-agent: BLEXBot
User-agent: burroboot
User-agent: DOC
User-agent: SemrushBot
User-agent: spbot
User-agent: UbiCrawler
User-agent: Zao
User-agent: Netsparker
User-agent: sitecheck.internetseer.com
User-agent: Zealbot
User-agent: MSIECrawler
User-agent: SiteSnagger
User-agent: WebStripper
User-agent: WebCopier
User-agent: Fetch
User-agent: Offline Explorer
User-agent: Teleport
User-agent: TeleportPro
User-agent: WebZIP
User-agent: linko
User-agent: Microsoft.URL.Control
User-agent: Xenu
User-agent: larbin
User-agent: libwww
User-agent: ZyBORG
User-agent: Download Ninja
User-agent: grub-client
User-agent: k2spider
User-agent: NPBot
User-agent: WebReaper
User-agent: CyotekWebCrawler
User-agent: Whizbang
User-agent: UniverseBot
User-agent: SlySearch
Disallow: /
```
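For clarity on how that block parses: consecutive `User-agent` lines in robots.txt form a single group, and the rules that follow apply to every agent named in the group, so the long run of `User-agent` lines above shares the one `Disallow: /` at the end. A minimal illustration:

```
# Both agents share the single rule that follows
User-agent: AhrefsBot
User-agent: SemrushBot
Disallow: /
```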
Having this as "the default" would probably conflict with users who host public-facing websites on Cloudron (myself included), since the catch-all `User-agent: *` / `Disallow: /` block asks every well-behaved crawler, search engines included, to stay away.
Besides:
@subtlecourage said in Set robots.txt defaults to the following:

> User-agent: *
> Disallow: /

is already added to the app's robots.txt if you go to the "Security" tab of a website and select "Disable indexing". What makes you think that bots that ignore `*` (meaning all user agents) will not index your website if you name their user agents individually?
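To verify what an app actually serves after toggling that setting, a quick check from a shell works (illustrative only; `myapp.example.com` is a placeholder for your own domain):

```sh
# Fetch the robots.txt the app currently serves
curl -s https://myapp.example.com/robots.txt
# With "Disable indexing" enabled, the output should contain:
#   User-agent: *
#   Disallow: /
```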