Unable to enable robots.txt crawling
-
First, I was fighting Surfer, which kept changing the contents of what I put into robots.txt and oddly replacing it with the App configuration security setting of no crawling.
Then, after changing that, I can set the file to Allow crawling, but it still can't be crawled (according to Google Search Console).
Also: the "Disable indexing" link should perhaps be a toggle that both enables and disables, rather than only disabling.
-
@robi turns out that, while working on another app with the security settings unmodified, it became clear that the field needs to be blank to let robots crawl, rather than changing it with Allow directives.
Totally not obvious, even for me.
-
@robi if the text input is empty, Cloudron's reverse proxy will not respond to the robots.txt request but will forward it to the app, so the app can control it. Is it possible that the app instance in your case also had its own robots.txt, which made this confusing?
Also, do you have suggestions for how to improve the docs at https://docs.cloudron.io/apps/#robotstxt then?
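To make the mechanism concrete, here is a minimal sketch of the two cases, assuming an nginx-style reverse proxy (the directive values are illustrative, not Cloudron's actual generated config):

```nginx
# Field populated: the proxy answers /robots.txt itself with the
# configured contents, and the app never sees the request.
location = /robots.txt {
    default_type text/plain;
    return 200 "User-agent: *\nDisallow: /\n";
}

# Field empty: no special-case block is emitted, so /robots.txt
# falls through to the app like any other path.
location / {
    proxy_pass http://127.0.0.1:8080;  # app's internal address (hypothetical)
}
```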
-
@nebulon Yes, it took a while to figure that out.
The UI is the confusing part: if you see that screen for the first time and it is populated, you have no indication that you should clear it, and no button to turn it off. The logical move is to change what it says, e.g. Disallow to Allow, but that doesn't work either.
So a UX improvement would be a toggle that only displays the robots.txt proxy content while it's enabled, and while off gives a hint about what turning it on will actually do.
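Roughly what I have in mind (a hypothetical sketch with invented names, not Cloudron's actual dashboard code):

```typescript
// Proposed UX: the robots.txt editor only appears while the override
// toggle is on; while off, the UI explains what enabling will do.
interface RobotsOverrideState {
    enabled: boolean;   // the proposed toggle
    robotsTxt: string;  // proxy-served content, only used while enabled
}

function renderRobotsSection(state: RobotsOverrideState): string {
    if (!state.enabled) {
        return 'Toggle on to have Cloudron answer /robots.txt itself; '
             + 'while off, the request is passed through to the app.';
    }
    return 'Serving this from the reverse proxy:\n' + state.robotsTxt;
}
```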
That way the breadcrumbs lead to the loaf of understanding and satisfaction.