Suddenly can't login to Vikunja!

scooke

This is the second time now this app just stopped working. I had added some Tasks and whatnot, nothing had changed though for several days. I was in fact nearing the end of a lengthy post about how using this app was a great solution... then it hung. When I refreshed the page, it wouldn't log me in. Here are the logs:

May 06 09:46:45 172.18.0.1 - - [06/May/2021:07:46:45 +0000] "GET /api/v1/info HTTP/1.1" 502 568 "https://vikunja.example.com/login" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36"
May 06 09:46:45 2021/05/06 07:46:45 [error] 220#220: *108 connect() failed (111: Connection refused) while connecting to upstream, client: 172.18.0.1, server: default_server, request: "GET /api/v1/info HTTP/1.1", upstream: "http://127.0.0.1:3456/api/v1/info", host: "vikunja.example.com", referrer: "https://vikunja.example.com/login"
May 06 09:46:45 2021/05/06 07:46:45 [error] 220#220: *108 connect() failed (111: Connection refused) while connecting to upstream, client: 172.18.0.1, server: default_server, request: "GET /api/v1/info HTTP/1.1", upstream: "http://127.0.0.1:3456/api/v1/info", host: "vikunja.example.com", referrer: "https://vikunja.example.com/login"
May 06 09:46:47 172.18.0.1 - - [06/May/2021:07:46:47 +0000] "GET /sw.js HTTP/1.1" 304 0 "https://vikunja.example.com/sw.js" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36"

I tried increasing a setting in the config.yml, maxitemsperpage: 100, from 50 to 100, as that seemed to be the only thing evident to me that could be affecting it. I restarted redis. I restarted PostgreSQL, I restarted the app, and I even rebooted the entire Cloudron! But the same error comes up, something up connection refused to upstream client.

If this is fixable, great. If it isn't, what a shame, as I really, really, really invested quite a bit of time into using it to manage my domains!!

Help!

scooke

@scooke FWIW, I can see the RAM usage in the app's Graphs plummeted from 133 to 18 at the same time as my login began failing. Seems like something isn't running. From my poking around the internet, the failed upstream message seems to have something to do with nginx. But I am not sure what to do to check more, or fix it. For now I will leave this post as the sole post, rather than also asking on the Vikunja Community forums, because it seems this is a Cloudron thing.

fbartels

@scooke that looks like the service that is providing the vikunja api is failing/has stopped. I would have expected that a restart of the whole app would have made it start up again, but if that is not the case there must be something else going on.

I would restart the app again and watch closely what kind of information are logged during the startup.

Disclaimer: not affiliated with Vikunja and also not using it

scooke

Seems like a time zone error in the config.yml is affecting the api. It was GMT+1... so I removed the +1.

May 06 13:17:33 box:settings initCache: pre-load settings
May 06 13:17:33 box:taskworker Starting task 7570. Logs are at /home/yellowtent/platformdata/logs/51667508-ab88-4d9d-ba18-be51560acaad/apptask.log
May 06 13:17:33 box:tasks 7570: {"percent":2,"error":null}
May 06 13:17:33 box:apptask vikunja.example.com startTask installationState: pending_restart runState: running
May 06 13:17:33 box:tasks 7570: {"percent":20,"message":"Restarting container"}
May 06 13:17:34 2021-05-06 11:17:34,060 WARN received SIGTERM indicating exit request
May 06 13:17:34 2021-05-06 11:17:34,061 INFO waiting for nginx to die
May 06 13:17:35 2021-05-06 11:17:35,100 INFO stopped: nginx (exit status 0)
May 06 13:17:36 ==> Changing ownership
May 06 13:17:36 ==> Configuring vikunja-api
May 06 13:17:36 box:tasks 7570: {"percent":100,"message":"Done"}
May 06 13:17:36 box:apptask vikunja.example.com updating app with values: {"installationState":"installed","error":null,"health":null}
May 06 13:17:36 box:taskworker Task took 3.191 seconds
May 06 13:17:36 box:tasks setCompleted - 7570: {"result":null,"error":null}
May 06 13:17:36 box:tasks 7570: {"percent":100,"result":null,"error":null}
May 06 13:17:48 ==> Starting vikunja frontend and API using supervisor
May 06 13:17:48 2021-05-06 11:17:48,745 CRIT Supervisor is running as root. Privileges were not dropped because no user is specified in the config file. If you intend to run as root, you can set user=root in the config file to avoid this message.
May 06 13:17:48 2021-05-06 11:17:48,745 INFO Included extra file "/etc/supervisor/conf.d/nginx.conf" during parsing
May 06 13:17:48 2021-05-06 11:17:48,745 INFO Included extra file "/etc/supervisor/conf.d/vikunja-api.conf" during parsing
May 06 13:17:48 2021-05-06 11:17:48,752 INFO RPC interface 'supervisor' initialized
May 06 13:17:48 2021-05-06 11:17:48,752 CRIT Server 'unix_http_server' running without any HTTP authentication checking
May 06 13:17:48 2021-05-06 11:17:48,753 INFO supervisord started with pid 1
May 06 13:17:49 2021-05-06 11:17:49,757 INFO spawned: 'nginx' with pid 216
May 06 13:17:49 2021-05-06 11:17:49,761 INFO spawned: 'vikunja-api' with pid 217
May 06 13:17:50 172.18.0.1 - - [06/May/2021:11:17:50 +0000] "GET / HTTP/1.1" 200 1417 "-" "Mozilla (CloudronHealth)"
May 06 13:17:50 Error parsing time zone: unknown location GMT+12021/05/06 11:17:50 Using config file: /app/code/api/config.yml
May 06 13:17:50 2021/05/06 11:17:50 Redis initialized
May 06 13:17:50 2021-05-06 11:17:50,114 INFO exited: vikunja-api (exit status 1; not expected)
May 06 13:17:51 2021-05-06 11:17:51,116 INFO success: nginx entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
May 06 13:17:51 2021-05-06 11:17:51,120 INFO spawned: 'vikunja-api' with pid 235
May 06 13:17:51 2021/05/06 11:17:51 Using config file: /app/code/api/config.yml
May 06 13:17:51 2021/05/06 11:17:51 Redis initialized
May 06 13:17:51 Error parsing time zone: unknown location GMT+12021-05-06 11:17:51,436 INFO exited: vikunja-api (exit status 1; not expected)
May 06 13:17:53 2021-05-06 11:17:53,442 INFO spawned: 'vikunja-api' with pid 243
May 06 13:17:53 2021/05/06 11:17:53 Using config file: /app/code/api/config.yml
May 06 13:17:53 2021/05/06 11:17:53 Redis initialized
May 06 13:17:53 Error parsing time zone: unknown location GMT+12021-05-06 11:17:53,698 INFO exited: vikunja-api (exit status 1; not expected)
May 06 13:17:56 2021-05-06 11:17:56,706 INFO spawned: 'vikunja-api' with pid 253
May 06 13:17:56 2021/05/06 11:17:56 Using config file: /app/code/api/config.yml
May 06 13:17:56 2021/05/06 11:17:56 Redis initialized
May 06 13:17:56 Error parsing time zone: unknown location GMT+12021-05-06 11:17:56,981 INFO exited: vikunja-api (exit status 1; not expected)
May 06 13:17:57 2021-05-06 11:17:57,983 INFO gave up: vikunja-api entered FATAL state, too many start retries too quickly
May 06 13:18:00 172.18.0.1 - - [06/May/2021:11:18:00 +0000] "GET / HTTP/1.1" 200 1417 "-" "Mozilla (CloudronHealth)"

fbartels

@scooke I can confirm that changing the timezone for the vikunja api makes the service fail. Switching it back to gmt makes the service start again.

Seems like a case for the Vikunja bugtracker after all

scooke

Well, I changed the time zone from GMT+1 to GMT as per the details here (https://vikunja.io/docs/config-options/#timezone - I would have thought something like "GMT+1" would have been correct), but now, even though it started up and I can log in, the logsnow shw a redis error:

May 06 13:29:51 2021-05-06T11:29:51.950576113Z: ERROR ▶ [DATABASE] 1738 [redis_cacher] decode failed: gob: wrong type (models.RelatedTaskMap) for received field .RelatedTasks

girish

@scooke Maybe the redis went corrupt. Looks like it is just a cache for the app, maybe you can try clearing it:

First make a note of the app id.
If you do docker stop redis-<appid>
rm /home/yellowtent/platformdata/redis/<appid>/dump.rdb. This clears the previous redis database.
docker start redis-<appid>
Restart the app as well to re-establish redis connection

kolaente

@scooke That redis error is a problem in the caching layer provided by the orm I'm using for Vikunja. A few more details here.
The only solution right now is to disable the cache. If you're not running a very large installation, you won't notice the difference. I found that using redis can even slow down the overall speed because of the network overhead it has.

There should be logs from Vikunja somewhere too, the logs you posted are only the ones from the cloudron runtime if I understood that correctly. Maybe @girish can help with that?

scooke

@kolaente Sorry, I should have been clearer that setting the time zone was what got the app running again. I don't recall clearing any cache, despite @girish's kind advice. I'll try disabling the cache, even though I can't say I've noticed any slowdowns. It either has worked well, or not worked at all! Thanks.

Cloudron makes it easy to run web apps like WordPress, Nextcloud, GitLab on your server. Find out more or install now.

Cloudron Forum

Suddenly can't login to Vikunja!