-
Today, I noticed my Baserow instance crashed during the weekend.
I really didn't expect it to crash, because there is very little usage of that DB at the moment (1 read/write operation every 10 seconds).
I looked at the logs and noticed the "FATAL: too many connections for database" error in the Cloudron Postgres logs. I also found a mention of that same error within the Baserow logs.
Logs:
<30>1 2024-06-24T07:08:42Z unly-baserow 7fdba71d-461b-44c4-81e0-b8aa8ec62681 999 7fdba71d-461b-44c4-81e0-b8aa8ec62681 - django.db.utils.OperationalError: connection to server at "postgresql" (fd00:c107:d509::4), port 5432 failed: FATAL: too many connections for database "db7fdba71d461b44c481e0b8aa8ec62681"
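For anyone wanting to check how often this error shows up in an exported log file, here is a minimal sketch in Python. It only assumes the message format seen in the log line quoted in this post; the function name and the sample list are made up for illustration.

```python
import re

# Matches the PostgreSQL error text seen in the Baserow log line quoted in this post.
TOO_MANY = re.compile(r'FATAL:\s+too many connections for database "(?P<db>[^"]+)"')


def find_connection_limit_errors(lines):
    """Return (line_number, database_name) for every line containing the error."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = TOO_MANY.search(line)
        if match:
            hits.append((number, match.group("db")))
    return hits


# Sample input: the message from this thread plus an unrelated line.
sample = [
    'django.db.utils.OperationalError: connection to server at "postgresql" '
    '(fd00:c107:d509::4), port 5432 failed: FATAL: too many connections for '
    'database "db7fdba71d461b44c481e0b8aa8ec62681"',
    "some unrelated log line",
]
print(find_connection_limit_errors(sample))
```

Counting the hits per hour or per day gives a rough idea of when the connections started piling up.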
I went ahead and increased the RAM of the Cloudron postgresql service from 1GiB to 2GiB, as it was 98% used.
That didn't solve the issue, so I restarted Baserow, and that solved it.
But now, all of that raises a couple of questions and red flags:
-
What is the relationship between the Cloudron Postgresql service and the Baserow app? I wouldn't have thought they would be related.
-
What RAM is recommended for the Cloudron Postgresql service? It all ran without issues for months with just 1GiB, but now, it takes about 1.2GiB. Considering I haven't changed anything on the Baserow instance for a while, I find that surprising. (I did make a server migration 2 weeks ago, migrating from Digital Ocean to Hetzner, but I restored a Cloudron backup, so the Baserow instance's config itself shouldn't have changed at all)
-
The instance crashed on Saturday evening and I only noticed today (I did get emails about it, thanks to the awesome UptimeRobot). That raises concerns about using a self-hosted Baserow instance if it can just crash like that without any warning on my end. I wasn't aware the RAM of the Postgres service was slowly filling up, to the point of completely crashing (if that's even the root cause of the whole thing, I don't know).
This is a critical service and not only do I need a good notification system, but a much better way to anticipate those issues, and act proactively.
What's even more concerning to me: that instance handles almost no load!
It's a side project with crypto, and it only stores the value of ~20 coins. There is one n8n workflow reading/writing into it every 10 seconds, and that's basically it.
If I encounter downtime with such little load, it makes me wonder what kind of issues I'll be facing when truly using it at the core of our many apps.
-
@AmbroiseUnly said in Baserow instance crashed unexpectedly - postgres : FATAL: too many connections for database:
What is the relationship between the Cloudron Postgresql service and the Baserow app? I wouldn't have thought they would be related.
The Baserow app uses PostgreSQL. What you see as Services is used by apps. The database services are shared by apps. The services are not meant to be used directly by the end user and are meant for apps only! So, as your usage of Baserow grows (more data, more things to do), you have to give PostgreSQL more memory. I would say just give it some more memory and hopefully it works out.
This is a critical service and not only do I need a good notification system, but a much better way to anticipate those issues, and act proactively.
I am not sure about your server specs, but stability is a function of the resources available. Just have a big enough server with lots of RAM/CPU/Disk and then give Baserow and Postgres lots of memory. In general, there is no easy approach (that I know of) to anticipate the resource requirements of apps. It would also be beneficial to find out why the app crashed. If it's because of resources, there is not much you can do, other than give more resources...
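One possible way to dig into why the app crashed: PostgreSQL reports "too many connections for database" when a database's own connection limit (the datconnlimit column of pg_database) is reached, which is a separate thing from memory. The sketch below is in Python and rests on assumptions: that such a limit is set for the Baserow database, and that you can run the query with a role allowed to read pg_stat_activity. The helper name and the 80% threshold are made up for illustration, and the count is only approximate.

```python
# Query listing, per database, its connection limit and the current session count.
# pg_database.datconnlimit is -1 when no per-database limit is set.
CONNECTION_USAGE_SQL = """
SELECT d.datname, d.datconnlimit, count(a.pid) AS used
FROM pg_database AS d
LEFT JOIN pg_stat_activity AS a ON a.datid = d.oid
GROUP BY d.datname, d.datconnlimit
ORDER BY used DESC;
"""


def databases_near_limit(rows, threshold=0.8):
    """Return (datname, used, limit) for databases at or above the threshold.

    rows: iterable of (datname, datconnlimit, used) tuples, e.g. the result of
    running CONNECTION_USAGE_SQL with whatever PostgreSQL client you use.
    """
    flagged = []
    for name, limit, used in rows:
        if limit is None or limit < 0:
            continue  # no per-database limit configured
        if limit == 0 or used >= threshold * limit:
            flagged.append((name, used, limit))
    return flagged


# Hypothetical rows, only to show the shape of the input and output.
example_rows = [("db_a", -1, 50), ("db_b", 100, 100), ("db_c", 100, 10)]
print(databases_near_limit(example_rows))
```

If the Baserow database shows up as flagged while traffic is low, that would point towards connections not being released rather than towards a lack of RAM, which would also fit with the restart of Baserow fixing it.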
-
-
Yes, but I didn't know all of that. (relationship between Cloudron services and apps)
Had I known beforehand, I would have done the setup differently (allocating more RAM to postgres).
Also, had I been notified of the service's memory consumption (and its impact if not taken care of), I could have acted proactively, too.
Receiving an email like "Baserow app will soon crash because its postgres service is at 90% of memory. It is recommended to increase the service's memory now to avoid a downtime." would have been super efficient, and actionable.
The value brought by this kind of feature is very high, as it helps abstract away low-level technical details and lets not-too-technical people focus on things that are much more meaningful.
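To make the idea concrete, here is a minimal sketch in Python of the threshold check behind such an email. It is not an existing Cloudron feature: the function name is hypothetical, and the memory readings are assumed to be supplied by whatever monitoring tool you already have.

```python
def memory_alert(service, used_bytes, limit_bytes, threshold=0.90):
    """Return a warning message when usage reaches the threshold, else None."""
    if limit_bytes <= 0:
        raise ValueError("limit_bytes must be positive")
    ratio = used_bytes / limit_bytes
    if ratio < threshold:
        return None
    return (
        f"The {service} service is at {ratio:.0%} of its memory limit. "
        "It is recommended to increase the service's memory now to avoid downtime."
    )


GIB = 1024 ** 3
# 98% of a 1GiB limit, the situation described in the first post.
print(memory_alert("postgresql", int(0.98 * GIB), GIB))
```

Run on a schedule (cron, an n8n workflow, etc.), the returned message could be sent by email whenever it is not None.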
-
For anyone interested in configuring proper monitoring on your Cloudron server, I wrote a guide about it, and I hope you'll find it useful!
When this issue happened, I wished I had been warned before the disk ran out of space, and that's what monitoring is for.