Any issues with including NetData on the root server and as an app add-on?
-
Has anyone tried to install the Netdata Agent on a Cloudron VPS in production? Does it work, any issues??
I just installed it on a local RaspberryPi with 6 docker containers and the information and metrics are really awesome!
-
-
@timconsidine said in Any issues with including NetData on the root server and as an app add-on?:
do the containers need some kind of agent to participate in data collection
Netdata uses the Docker socket to get data about running containers. So no additional agents are required.
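For anyone curious what "using the Docker socket" looks like in practice, here is a minimal Python sketch that asks the Docker Engine API for the running containers over the Unix socket. It is only an illustration, not how Netdata itself is implemented; the helper names (`UnixHTTPConnection`, `list_containers`, `summarize`) are made up for this example, and the default socket path `/var/run/docker.sock` is assumed.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5):
        # The host name is a placeholder; the socket path decides where we connect.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def list_containers(socket_path="/var/run/docker.sock"):
    """Return the raw container list from the Docker Engine API (running containers only)."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", "/containers/json")
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()


def summarize(containers):
    """Reduce the API response to (name, state) pairs; names come with a leading slash."""
    return [(c["Names"][0].lstrip("/"), c["State"]) for c in containers]
```

On a host where the current user may read the socket, `summarize(list_containers())` should list the same containers that `docker ps` shows, which is the kind of data a monitoring agent can pick up without installing anything inside the containers.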
-
Yesterday I migrated my Cloudron to a new Netcup RS. I first installed Netdata on the RS and then Cloudron. After installing both and migrating the data, everything works fine!
About Netdata:
Before this I used Zabbix for years, but it's much too hard to install, configure, update/upgrade and maintain. Netdata is installed and usable within seconds! I use the app.netdata.cloud dashboard, and the Netdata agent on the Netcup RS is "streaming" data (have a look at netdata.cloud for all the details).
The agent is streaming 494 (!!) different data sources and it is perfectly shown in summary dashboards, detailed graphs etc. right out of the box!
They implemented AI (buzz) that looks for correlations in the captured data, and with the default triggers it sends you emails with warnings at different levels.
Some first useful and surprising findings:
- I got a warning that my mounted (!) Hetzner Storagebox for backups was 97.03% full (that is handy to know and act on!)
- I got a warning that, according to the nginx logs, the ratio of successful HTTP requests was low during a certain period. That was during my attempts to repair a stuck app!
- All the Cloudron apps (docker containers) are monitored (alas, under their not so nice cryptic names)
- I got a warning that app.swap was almost fully used. Why?? The machine has 32GB RAM with only one third used, so I switched the slow app.swap off and deleted it (again a useful insight!).
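To double-check a swap warning like that by hand, the numbers can be read from `/proc/meminfo`. Below is a small Python sketch that computes the swap usage percentage from the `SwapTotal` and `SwapFree` fields; the function name `swap_usage_percent` is just an illustration, and this is not necessarily how Netdata calculates its own alert.

```python
def swap_usage_percent(meminfo_text):
    """Return swap usage in percent, or None if no swap is configured.

    Expects the text content of /proc/meminfo, where each line looks like
    "SwapTotal:  4194304 kB".
    """
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            # The first token is the numeric value (in kB for memory fields).
            fields[key.strip()] = int(parts[0])

    total = fields.get("SwapTotal", 0)
    free = fields.get("SwapFree", 0)
    if total == 0:
        return None
    return 100.0 * (total - free) / total
```

On the server itself you would pass in the content of `/proc/meminfo`, for example `swap_usage_percent(open("/proc/meminfo").read())`. A high value next to lots of free RAM is not automatically a problem, since the kernel may simply have parked rarely used pages there.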
Altogether: it is very simple and useful!
I offered @girish SSH access to check if everything works fine from a Cloudron developer's perspective so we can get a "green light" to use it "officially". Due to the 7.5 release his research will take some time.
-
-
@timconsidine I was just thinking that if the installation of Cloudron would fail it might be because Netdata was already installed. It didn't, so there might be no reason to not install it afterwards but @girish is the only one who can confirm no issues I guess.
-
@imc67 we always create a swap because we often see servers behave erratically when they have no swap. I am not an expert on this topic, but without swap bad things happen (tm) even if you have plenty of RAM. See also https://haydenjames.io/linux-performance-almost-always-add-swap-space/
-
-
Not too familiar with Netdata, but could I install it on a server at my office and use it to remotely monitor multiple instances of Cloudron in one panel?
-
@imc67 said in Any issues with including NetData on the root server and as an app add-on?:
I offered @girish SSH access to check if everything works fine from a Cloudron developer's perspective so we can get a "green light" to use it "officially". Due to the 7.5 release his research will take some time.
@girish do you have time to have a look at Netdata on my server? SSH access is still open for you.
-
btw: the latest versions do have a local dashboard so you don't really need the (free) cloud dashboard anymore. It's even possible to install it in a docker container and access the local dashboard.
-
I think this will take a while for us. @imc67 what sort of official support are you expecting?
If it helps, I can put together a guide in our docs on how to use Cloudron with Netdata. Pre-installing the Netdata agent into every Cloudron installation is not under consideration at the moment.
-
@girish after all these months and busy work on 7.5 I can imagine you can't remember our mail correspondence and the details of this thread.
Summary:
Many Cloudron users are longing for more detailed live data about their server. @BrutalBirdie and I used to use Zabbix, but it's difficult to get it working and keep it updated. @marcusquinn discovered Netdata, and I gave it a try on a RaspberryPi and was heavily impressed, so I also "pre-installed" it on my new Cloudron server before installing/migrating Cloudron. It works perfectly, and the Netdata triggers and Netdata AI have already given me many insights into things that were wrong (e.g. a crashed Cloudron firewall that Cloudron didn't warn me about!).
The big question from @robi, @timconsidine, @marcusquinn and me is: is it safe to install it on the server in a live Cloudron environment?
Second question: could it be an app candidate, as it is available as a docker container, is able to monitor the server itself including all other docker containers, and has its own web dashboard?
-
@imc67 Good questions. I think those are for @girish and @nebulon. I've only used it very occasionally, but have not had any issues from it on a production server.
TBH I think it's a great feature addition, and helps make Cloudron more appealing to enterprise system admins.
Perhaps it could be presented in an iFrame in the Cloudron Dashboard, too, for a sort of integrated experience for the prosumer users.
-
As FOSS goes, this is exceptionally fortunate for the world to have available. I like their ethics, too:
-
I looked into this a bit. The installation adds a custom apt repo and sets up automatic update stuff and installs a few other things. In general, it probably doesn't break anything but I find it hard to gauge what it could potentially break. My understanding is this is similar to DigitalOcean's and AWS monitoring tools (which are optionally installed on the server) to provide dashboards. We have not heard of people facing issues when they install the tools.
I think the best approach here is to deal with issues as they appear, @imc67. What do you think? I can't think of any better way. We can't possibly track Netdata releases or their installer code to understand all the changes that are happening.
-
@girish I think the utility being sought here is Application/Service level resource usage data, as opposed to the overall general CPU/RAM that host monitoring presents. As in, when something is hogging resources, it can be identified, same for when something is allocated more than it needs.
-
Using the install script is not even necessary, as Netdata can just as well be run as a container itself. So a simple docker-compose.yaml with the following is enough:

    version: '3'
    services:
      netdata:
        image: netdata/netdata
        container_name: netdata
        pid: host
        network_mode: host
        restart: unless-stopped
        cap_add:
          - SYS_PTRACE
          - SYS_ADMIN
        security_opt:
          - apparmor:unconfined
        volumes:
          - ./netdataconfig/netdata:/etc/netdata
          - netdatalib:/var/lib/netdata
          - netdatacache:/var/cache/netdata
          - /etc/passwd:/host/etc/passwd:ro
          - /etc/group:/host/etc/group:ro
          - /proc:/host/proc:ro
          - /sys:/host/sys:ro
          - /etc/os-release:/host/etc/os-release:ro
          #- /var/run/docker.sock:/var/run/docker.sock:ro
        environment:
          - DOCKER_HOST=127.0.0.1:2375
      cetusguard:
        image: hectorm/cetusguard:v1
        network_mode: host
        read_only: true
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock:ro
        environment:
          CETUSGUARD_BACKEND_ADDR: unix:///var/run/docker.sock
          CETUSGUARD_FRONTEND_ADDR: tcp://:2375
          CETUSGUARD_RULES: |
            ! Inspect a container
            GET %API_PREFIX_CONTAINERS%/%CONTAINER_ID_OR_NAME%/json
    volumes:
      netdatalib:
      netdatacache:

Afterwards one can just create an app proxy to http://127.0.0.1:19999 and Netdata can be "publicly" reached.
The above docker-compose.yaml actually comes from the Netdata documentation.
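Before pointing an app proxy at it, it may be worth checking that the agent really answers on that port. A minimal Python sketch, assuming the agent exposes its `/api/v1/info` endpoint on `http://127.0.0.1:19999` and that the JSON response contains a `version` field (both are my assumptions about the agent's REST API, so check them against the Netdata docs for your version); the function names are made up for this example.

```python
import json
import urllib.request


def info_url(base_url):
    """Build the URL of the agent's info endpoint from the dashboard base URL."""
    return base_url.rstrip("/") + "/api/v1/info"


def fetch_version(base_url="http://127.0.0.1:19999", timeout=5):
    """Return the agent version string, or None if the field is missing.

    Raises urllib.error.URLError if nothing is listening on that address.
    """
    with urllib.request.urlopen(info_url(base_url), timeout=timeout) as resp:
        return json.load(resp).get("version")
```

Running `fetch_version()` on the Cloudron host should return a version string if the container is up; a connection error means the proxy would have nothing to forward to.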