@LoudLemur Thank you very much! I'll update the post to include the docker image link.
I'm glad to see so much interest in Actual. I am really looking forward to seeing it on Cloudron as well!
It would be really great if this app could be packaged before the start of June. I'll be starting a new job then, and it would be a good time to upgrade my budgeting workflow.
@jdaviescoates Thank you!
I'll edit the main post to include the correct link. I am quite excited for Actual, as it is polished and well-made, and I hope to see it on Cloudron soon!
Actual is a self-hosted, open-source financial planning and budgeting system. It's designed to be fast, local-first, and privacy-friendly, and has many powerful features for budgeting and financial planning.
Actual was originally a closed-source software offering by James Long, but it was recently released as open source under the MIT license.
The Actual website and landing page are available at:
And their source code is available at:
https://github.com/actualbudget/actual-server
https://github.com/actualbudget/actual
Actual already comes with a Docker image:
https://github.com/actualbudget/actual-server#running-via-docker
They have an online demo that you can use to try out the features here:
I recommend that Cloudron package Actual, because it is a modern, high-quality financial planning app that adds value to a category where Cloudron does not have many applications. Actual is fast and unusually powerful, with a great feature set and UX.
Hey there, @FeelNiceInc. I'm glad to hear that my tutorial was helpful for you, and I'm sorry that the regex I provided did not work.
I think @girish's solution is the best: by changing Cloudron's Nginx webserver to use the default `combined` log format, Matomo's log import script will automatically recognise and import the logs without needing a special regex.
The regex I provided in my tutorial was written specifically to accommodate Cloudron's idiosyncratic `combined2` log format; otherwise it provides little benefit.
I'm not sure why the regex didn't work for you, as it works for me. For future readers who stumble upon this thread, I would recommend following @girish's advice and simply changing Cloudron to use the `combined` format.
However, if you already have an archive of logs in the `combined2` format that you need to import, I recommend working out the correct regex by hand. I use a regex visualiser called RegExr, which makes it easier to craft custom regular expressions.
The RegExr link for the `combined2` log format is here:
I recommend taking a few lines of your server logs and pasting them into RegExr to see what matches and what doesn't. The regex defines a few named capture groups, which are as follows:

- `(?P<ip>[\w*.:-]+)`: IP address
- `(?P<date>.*?)`: Date
- `(?P<timezone>.*?)`: Timezone
- `(?P<method>\S+)`: HTTP request method (e.g. POST, GET)
- `(?P<path>.*?)`: HTTP request path (e.g. /homepage.html)
- `(?P<status>\d+)`: HTTP request status
- `(?P<generation_time_milli>\d*\.?\d+)`: Amount of time for the server to respond
- `(?P<referrer>.*?)`: Referrer header
- `(?P<host>[\w\-\.]*)`: Host
- `(?P<user_agent>.*?)`: User agent (what browser, device, etc.)

All the odd bits like `\s` or `.+` in between simply account for things like spaces in the log lines. Try playing around with the regex until it matches everything in your logs. The RegExr website makes it all very visual and easy to understand.
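If you prefer testing locally rather than in the browser, here is a minimal Python sketch of the same idea. The log lines below are fabricated (reserved example IP and domains), and the pattern is the full `combined2` regex from my tutorial:

```python
import re

# The combined2 pattern from the tutorial (note the escaped \[ and \]).
PATTERN = re.compile(
    r'(?P<ip>[\w*.:-]+)\s+\S+\s+\[(?P<date>.*?)\s+(?P<timezone>.*?)\]\s+'
    r'"(?P<method>\S+)\s+(?P<path>.*?)\s+\S+"\s+(?P<status>\d+)\s+'
    r'(?P<length>\S+)\s+(?P<generation_time_milli>\d*\.?\d+)\s+'
    r'"(?P<referrer>.*?)"\s"(?P<host>[\w\-\.]*)"\s"(?P<user_agent>.*?)"'
)

# Fabricated sample lines: the first is well-formed combined2, the second is not.
lines = [
    '192.0.2.10 - [10/Feb/2022:13:55:36 +0000] "GET /index.html HTTP/1.1" '
    '200 512 0.004 "https://example.org/" "blog.example.com" "Mozilla/5.0"',
    'this line is not in combined2 format',
]

# Report which lines the pattern can parse, like RegExr's match highlighting.
results = [PATTERN.match(line) is not None for line in lines]
for line, ok in zip(lines, results):
    print('MATCH' if ok else 'NO MATCH', '->', line[:40])
```

Paste a handful of real lines from your own `access.log.1` into `lines` and any `NO MATCH` entries will show you exactly which log entries the regex still fails on.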
I'm glad that you were able to get log analytics working. I hope this helps!
This is a guide to setting up Log Analytics with Matomo. Matomo is an open-source, self-hosted, privacy-friendly analytics platform that is available as a Cloudron app. Standard Matomo installations ingest data through a JavaScript tracking script that you must embed in each website you wish to enable analytics on.
Matomo also offers Log Analytics, where instead of using a client-side JavaScript tracker, it ingests data directly from your Nginx log files (`access.log`). Compared to the JavaScript tracker, server-side Log Analytics has the following benefits:
I am using Log Analytics primarily out of privacy considerations for my website's visitors. I want to understand where my visitors come from, but in the most respectful, privacy-friendly way possible. Server-side log analytics means I don't inject any code at all, which is much friendlier in my opinion.
Broadly speaking, the process for sending logs to your Matomo installation looks like this. We will be automating it using a cronjob.

1. Nginx writes log files called `access.log` to `/var/log/nginx`.
2. Using Matomo's `import_logs.py` script (GitHub), we send the log files to your Matomo installation URL (e.g. `https://matomo.example.com`).
3. Matomo's `archive` job is run, and the data is now available on the dashboard.

The biggest difficulty in this setup is step 2. As of Cloudron box version `7.0.1`, Cloudron's Nginx is configured to use a niche log format called `combined2`. This log format seems to be used only by `collectd` and nobody else, hence Matomo's `import_logs.py` script cannot parse it. We will have to use a custom regex pattern in order to allow Matomo's import script to work.
Note: According to @girish, Cloudron will revert to the default Nginx `combined` log format for the 7.1 release (source). Hence, if you are following this guide from the future, feel free to omit the custom regex pattern.
The `combined2` log format that Cloudron uses is slightly different from Nginx's default `combined` log format. Here's a comparison of their structure:
The `combined2` format:
$remote_addr - [$time_local] "$request" $status $body_bytes_sent $request_time "$http_referer" "$host" "$http_user_agent"
The `combined` format:
$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent"
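For concreteness, here is the same (entirely fabricated) request rendered in each format. Note that `combined2` drops the remote user field but adds the request time and the host:

```
# combined2:
192.0.2.10 - [10/Feb/2022:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 512 0.004 "https://example.org/" "blog.example.com" "Mozilla/5.0"

# combined:
192.0.2.10 - - [10/Feb/2022:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 512 "https://example.org/" "Mozilla/5.0"
```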
As you can see, the fields differ just enough that the `import_logs.py` script cannot parse them. Thankfully, we can specify a custom regex pattern using the `--log-format-regex` option.
This is the regex pattern that you need to use to parse `combined2` logs successfully:
(?P<ip>[\w*.:-]+)\s+\S+\s+\[(?P<date>.*?)\s+(?P<timezone>.*?)\]\s+"(?P<method>\S+)\s+(?P<path>.*?)\s+\S+"\s+(?P<status>\d+)\s+(?P<length>\S+)\s+(?P<generation_time_milli>\d*\.?\d+)\s+"(?P<referrer>.*?)"\s"(?P<host>[\w\-\.]*)"\s"(?P<user_agent>.*?)"
Essentially, it defines a bunch of named capture groups, such as `<date>` or `<path>`, which `import_logs.py` can understand.
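As a quick illustration of those named groups, here is a sketch using Python's `re` module against a fabricated `combined2` line (reserved example IP and domains, invented request data):

```python
import re

# The combined2 regex from above, with \[ and \] escaping the date brackets.
COMBINED2_REGEX = (
    r'(?P<ip>[\w*.:-]+)\s+\S+\s+\[(?P<date>.*?)\s+(?P<timezone>.*?)\]\s+'
    r'"(?P<method>\S+)\s+(?P<path>.*?)\s+\S+"\s+(?P<status>\d+)\s+'
    r'(?P<length>\S+)\s+(?P<generation_time_milli>\d*\.?\d+)\s+'
    r'"(?P<referrer>.*?)"\s"(?P<host>[\w\-\.]*)"\s"(?P<user_agent>.*?)"'
)

# A fabricated combined2 log line.
line = ('192.0.2.10 - [10/Feb/2022:13:55:36 +0000] "GET /index.html HTTP/1.1" '
        '200 512 0.004 "https://example.org/" "blog.example.com" "Mozilla/5.0"')

# groupdict() returns each named capture group and the text it captured,
# which is exactly the field breakdown import_logs.py relies on.
groups = re.match(COMBINED2_REGEX, line).groupdict()
for name, value in groups.items():
    print(f'{name}: {value}')
```

Running this prints one field per line (`ip: 192.0.2.10`, `method: GET`, and so on), which is a handy way to double-check the regex before handing it to the import script.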
Here's a nice, visual explanation of the regex format, complete with some example log data (IP addresses are fake, sourced from reserved ranges):
The `import_logs.py` script

In order to import our server logs into Matomo, we must use their provided Python 3 import script. We can get the script from their official GitHub repository:
https://github.com/matomo-org/matomo-log-analytics
I will show you where to download it in a moment.
The `import_logs.py` script requires three parameters:

- `--url`: The URL of your Matomo installation. It must include the `https://` prefix!
- `--token-auth`: An API authentication token from Matomo. You must generate it from the dashboard.
- `--log-format-regex`: Tells the script to use your custom regex pattern, so it can understand Cloudron's `combined2` format.

Once again, if you are following this guide from the future (e.g. version 7.1 and above), you do not need to specify `--log-format-regex`.
This is how the command should look:
python3 import_logs.py \
--url=https://matomo.example.com \
--token-auth=KEEP_THIS_SECRET \
--log-format-regex='(?P<ip>[\w*.:-]+)\s+\S+\s+\[(?P<date>.*?)\s+(?P<timezone>.*?)\]\s+"(?P<method>\S+)\s+(?P<path>.*?)\s+\S+"\s+(?P<status>\d+)\s+(?P<length>\S+)\s+(?P<generation_time_milli>\d*\.?\d+)\s+"(?P<referrer>.*?)"\s"(?P<host>[\w\-\.]*)"\s"(?P<user_agent>.*?)"' \
/var/log/nginx/access.log.1
Now we are ready to get `import_logs.py` working. First, we must log in to your base Cloudron server. Cloudron does not run Nginx on a per-application basis (i.e. in every Docker container), but rather runs Nginx on the base server itself. Hence all the logs are there, and we need to execute the script there.
First, login to your server using SSH:
ssh root@my.example.com
Next, we will download the script from Matomo, and go inside the directory that contains it.
cd ~
git clone https://github.com/matomo-org/matomo-log-analytics.git
cd matomo-log-analytics
Now we run the above command. Make sure to use the correct `--url` and `--token-auth` parameters, as well as the right log file, which should be `/var/log/nginx/access.log.1`.
python3 import_logs.py \
--url=https://matomo.example.com \
--token-auth=KEEP_THIS_SECRET \
--log-format-regex='(?P<ip>[\w*.:-]+)\s+\S+\s+\[(?P<date>.*?)\s+(?P<timezone>.*?)\]\s+"(?P<method>\S+)\s+(?P<path>.*?)\s+\S+"\s+(?P<status>\d+)\s+(?P<length>\S+)\s+(?P<generation_time_milli>\d*\.?\d+)\s+"(?P<referrer>.*?)"\s"(?P<host>[\w\-\.]*)"\s"(?P<user_agent>.*?)"' \
/var/log/nginx/access.log.1
If it is successful, you should see output that looks like this:
Parsing log /var/log/nginx/access.log.1...
x lines parsed, x lines recorded, x records/sec (avg), x records/sec (current)
...
Processing your log data
------------------------
In order for your logs to be processed by Matomo, you may need to run the following command:
./console core:archive --force-all-websites --url='https://matomo.example.com'
Now your logs should have been ingested by Matomo. If you have any additional logs, such as `access.log.2`, `access.log.3`, et cetera, this is the time to import them as well.
In order for Matomo's dashboard to update, we have to tell it to run an `archive` job. Thankfully, the default Matomo Cloudron installation is already configured to archive automatically every 15 minutes. If you wish to perform a manual `archive`, simply open a terminal in the Matomo container (you can do this from the browser) and run the `archive` cronjob.
Check your Matomo dashboard now
Do you see any data? If not, it may be because you have not set up a website in the Matomo dashboard. By default, Matomo rejects log entries that do not correspond to a website in the dashboard. If your Matomo install is brand new, this is the time to add your websites. Then run the import commands again.
Now your dashboard should be updated with the log analytics.
Now, we must import the logs every day. The best way to automate this is to put the command into a bash script and set up a cronjob to run it. This is the script you can use:
#!/usr/bin/env bash
# Change into the directory where we cloned matomo-log-analytics earlier,
# so cron can find import_logs.py regardless of its working directory.
cd /root/matomo-log-analytics
python3 import_logs.py \
--url=https://matomo.example.com \
--token-auth=KEEP_THIS_SECRET \
--log-format-regex='(?P<ip>[\w*.:-]+)\s+\S+\s+\[(?P<date>.*?)\s+(?P<timezone>.*?)\]\s+"(?P<method>\S+)\s+(?P<path>.*?)\s+\S+"\s+(?P<status>\d+)\s+(?P<length>\S+)\s+(?P<generation_time_milli>\d*\.?\d+)\s+"(?P<referrer>.*?)"\s"(?P<host>[\w\-\.]*)"\s"(?P<user_agent>.*?)"' \
/var/log/nginx/access.log.1
Make sure that the log file is `access.log.1`. Nginx automatically rotates the log files once a day at midnight, when the log files shift like this:
access.log -> access.log.1 -> access.log.2.gz -> access.log.3.gz
Since we are running our cronjob once a day at 1:00 am, we always want `access.log.1`, which represents "yesterday's" logs, fresh right after the rotation. If you ran the import on `access.log` instead, you would get an empty file, since the logs were just rotated.
Save it somewhere like `/root/import-cronjob.sh`, and make it executable with `chmod +x /root/import-cronjob.sh` so that cron can run it.
Now, all you have to do is add the cronjob to the `root` crontab. To do so, run:
crontab -e
Follow the on-screen instructions to choose an editor, and then add the following cronjob:
0 1 * * * /root/import-cronjob.sh >/dev/null 2>&1
This tells the server to run the script once a day, at 01:00 (1:00 am), and to silence all output.
Save the crontab, and now your setup should be complete. Congratulations, your Matomo instance on Cloudron is now using server-side Log Analytics!
I hope this tutorial has been helpful. If you need any help, feel free to ask questions.
Keywords to aid search
Matomo, log analytics, nginx, logs, combined2, log format.