-
@girish thanks, it does indeed work. But I think what Cloudflare does is entirely different from how it works through Cloudfront. So maybe someone else will be able to help us out. Also I wonder what your next Let's Encrypt re-new in Cloudron will do? Wouldn't it fail because now you're not resolving back to the Cloudron machine? You'd need to set up the Cloudflare self-signed cert for full ssl, right?
-
@Prinzhorn said in Hide Cloudron (Cloudfront, CDN, multiple IPs):
@girish but it's not behind Cloudflare? It's behind nginx? The question is how much does Matomo actually know what's happening in front of nginx.
(As you probably know), cloudflare and cloudfront are very different kinds of proxies. Cloudflare has a "limitation" that the domain also has to be hosted there (in my case,
cloudron.site
is hosted on cloudflare). Cloudfront does not have this limitation.
TBH, I don't completely understand the issue you are facing. I have to try Cloudfront to understand better. I am going to try this shortly.
-
@Prinzhorn said in Hide Cloudron (Cloudfront, CDN, multiple IPs):
Also I wonder what your next Let's Encrypt re-new in Cloudron will do? Wouldn't it fail because now you're not resolving back to the Cloudron machine? You'd need to set up the Cloudflare self-signed cert for full ssl, right?
Cloudflare requires the domain to be hosted on Cloudflare. This means that we can get certs via Let's Encrypt DNS automation. So, the site always has valid certs even when behind Cloudflare.
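To illustrate why DNS-based validation does not need the domain to resolve to the Cloudron machine, here is a minimal sketch, assuming "DNS automation" refers to the ACME DNS-01 challenge (RFC 8555). The function name, token and thumbprint are placeholders made up for this example; a real ACME client receives the token from the CA and derives the thumbprint from its account key.

```python
import base64
import hashlib


def dns01_txt_record(domain: str, token: str, account_thumbprint: str) -> tuple[str, str]:
    """Return the (name, value) of the TXT record for an ACME DNS-01 challenge."""
    # Key authorization: the token, a dot, then the account key thumbprint.
    key_authorization = f"{token}.{account_thumbprint}"
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    # The TXT value is the unpadded base64url encoding of the SHA-256 digest.
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"_acme-challenge.{domain}", value


# Placeholder inputs, for illustration only.
name, value = dns01_txt_record("matomo.cloudron.example.com", "example-token", "example-thumbprint")
print(name, value)
```

The CA only looks up that TXT record, so the A record can point at a proxy instead of the server itself.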
-
@girish yes, they appear to be completely different. The main issue and my main point is that I would access Matomo from a domain that neither Matomo nor Cloudron know about. A domain that points to the Cloudfront distribution. Thanks for looking into this!
-
@Prinzhorn I ran into the same issue as you when using Cloudfront. So, the issue is that Cloudfront is trying to serve the files incorrectly.
When accessing directly, the content-type is
text/css
. When accessing via Cloudfront, the content-type changes to
text/html
, I have no idea why.
This is the reason the resources don't load properly in the browser. The css/js is all treated as html.
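A quick way to spot this kind of mismatch for a static asset is to compare the Content-Type header with what the file extension implies. This is only a sketch: the helper name and the example path are made up, and it does not cover assets that are generated by a script rather than served from a file.

```python
import mimetypes


def served_as_wrong_type(path: str, content_type_header: str) -> bool:
    """True if the Content-Type header disagrees with what the file extension implies."""
    expected, _ = mimetypes.guess_type(path)
    # Drop parameters such as "; charset=utf-8" before comparing.
    actual = content_type_header.split(";")[0].strip().lower()
    return expected is not None and actual != expected


# Hypothetical asset path, checked against the two headers seen above.
print(served_as_wrong_type("/assets/style.css", "text/css"))
print(served_as_wrong_type("/assets/style.css", "text/html; charset=UTF-8"))
```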
-
@robi said in Hide Cloudron (Cloudfront, CDN, multiple IPs):
Hence it's used for assets not fronting applications.
you the man! I fundamentally misunderstood the service. So, @Prinzhorn you can't easily front matomo with Cloudfront. I think if you want to hide the IP, we have to go another route.
To take a step back, is your only issue that http://ip is showing some Cloudron page?
-
I'm sorry, but this is just wrong. How is https://www.freecodecamp.org/news/do-not-use-s3-for-static-assets/ relevant to what we're talking about here at all? First they say that Cloudfront does not support brotli (ok?) and their conclusion is that Cloudfront is too expensive? The article says nothing about dynamic content, it sounds more like a random collection of facts about S3 / Cloudfront for SEO?
I'm not saying Cloudfront is amazing or anything. And I'm far from being an expert. But saying it's not possible to serve dynamic content is completely misleading. And backing that by a random source that is entirely unrelated doesn't make your argument more convincing.
https://aws.amazon.com/cloudfront/dynamic-content/
If you are serving dynamic content such as web applications or APIs directly from an Amazon Elastic Load Balancer (ELB) or Amazon EC2 instances to end users on the internet, you can improve the performance, availability, and security of your content by using Amazon CloudFront as your content delivery network. With Amazon CloudFront, your end users connections are terminated at CloudFront locations closer to them, which helps in reducing the overall round trip time required to establish a connection.
For Amazon Elastic Compute Cloud (Amazon EC2) origins serving dynamic and/or static content, you should also consider using CloudFront because it offers a number of additional benefits. In this blog post, we will provide an overview of the performance, security, and cost benefits you get when using CloudFront to serve dynamic and/or static assets from Amazon EC2.
You'll find countless sources, both official and unofficial:
https://docs.aws.amazon.com/whitepapers/latest/best-practices-wordpress/dynamic-content.html
https://blog.shikisoft.com/serving-dynamic-website-with-amazon-cloudfront/
@girish regarding stuff being served as
text/html
, I already solved that as described above. I had to configure Cloudfront to pass along query strings as well, or else Cloudron will only get
index.php
as the URL, without the CSS/JS params.
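A small sketch of why the query string matters here, assuming the CSS/JS are requested through index.php with query parameters. The parameter names in the example are illustrative and the helper is hypothetical; it only models what the origin receives when the proxy drops or forwards the query string.

```python
from urllib.parse import urlsplit, urlunsplit


def url_seen_by_origin(viewer_url: str, forward_query_string: bool) -> str:
    """Return the path (plus query, if forwarded) that the origin would receive."""
    parts = urlsplit(viewer_url)
    query = parts.query if forward_query_string else ""
    return urlunsplit(("", "", parts.path, query, ""))


# Illustrative asset URL; the parameters are placeholders.
css_url = "https://foo.cloudfront.net/index.php?module=Proxy&action=getCss"
print(url_seen_by_origin(css_url, forward_query_string=False))
print(url_seen_by_origin(css_url, forward_query_string=True))
```

Without the query string, every such request looks like a request for the HTML page, which would explain the text/html responses.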
Let's take a step back. How does Cloudron's nginx talk to Matomo? From what I can tell Matomo does not know that the domain I'm using to access it (e.g.
matomo.cloudron.example.com
) even exists. Only
localhost
is configured as
trusted_hosts[]
. So I assume nginx already does some rewriting to make it work at all, correct? E.g. rewrite the known domain to localhost? Does that include things like the
Origin
header? Or am I on the wrong track?
-
@Prinzhorn The article is probably wrong, tbh I didn't read it. I was commenting on Cloudfront being a CDN.
When a CDN is placed in front of an app, the app needs to understand there is a CDN in front of it. The app has to be specifically coded and have configuration options to support a CDN. Some specific examples:
-
The app could be hardcoded to query the origin - both for assets and REST/API queries. This means that if you "view source", you might find it has origin URLs in the assets (css/js), thus leaking your origin (and thus IP). It might do some REST API calls straight to origin and you can see them in the Network tab.
-
The app could set a CSP which does not work with a CDN. As an example, the Cloudron dashboard sets a very restrictive CSP and if you put a CDN in front of it, it won't work. This happens even with Cloudflare+CDN. See this - our own dashboard breaks exactly like matomo's.
-
As you probably know, trust proxy setting needs to be supported for an app to trust the headers. With a CDN, the app has to "support" trust proxy configuration.
This is why I concluded that a pure CDN service is not a good way to "hide the IP". It won't solve your problem: an app that does not know a CDN is in front of it will break, now or later, in many ways and might leak your IP. Are you OK with this? (I thought the end goal is to hide the IP, not to use a CDN somehow.)
Finally, cloudflare is specifically designed to hide the IP. It makes sure the origin domain and the app domain are the same and this is how it solves all the above issues.
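One rough way to check for the first kind of leak is to scan the HTML that comes back through the CDN for absolute URLs that still name the origin host. This is a minimal sketch with made-up names; it only looks at href and src attributes, so it would miss URLs embedded in scripts, inline styles or API responses.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class OriginLeakFinder(HTMLParser):
    """Collect href/src attribute values whose host is the origin hostname."""

    def __init__(self, origin_host: str):
        super().__init__()
        self.origin_host = origin_host.lower()
        self.leaks: list[str] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Relative URLs have no hostname and are never reported.
                if urlsplit(value).hostname == self.origin_host:
                    self.leaks.append(value)


def find_origin_leaks(html: str, origin_host: str) -> list[str]:
    finder = OriginLeakFinder(origin_host)
    finder.feed(html)
    return finder.leaks


page = '<link href="https://matomo.cloudron.example.com/a.css"><script src="/b.js"></script>'
print(find_origin_leaks(page, "matomo.cloudron.example.com"))
```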
-
-
@Prinzhorn said in Hide Cloudron (Cloudfront, CDN, multiple IPs):
Let's take a step back. How does Cloudron's nginx talk to Matomo? From what I can tell Matomo does not know that the domain I'm using to access it (e.g. matomo.cloudron.example.com) even exists. Only localhost is configured as trusted_hosts[]. So I assume nginx already does some rewriting to make it work at all, correct? E.g. rewrite the known domain to localhost? Does that include things like Origin header? Or am I on the wrong track?
The app package itself is here - https://git.cloudron.io/cloudron/matomo-app. Cloudron's nginx is just a reverse proxy without any understanding of matomo specific paths. So, it will just forward anything addressed to
matomo.cloudron.example.com
to the matomo "container". The matomo container has an apache instance configured with this - https://git.cloudron.io/cloudron/matomo-app/-/blob/master/apache/matomo.conf
In cloudron's case, the trusted host check in matomo is disabled (
enable_trusted_host_check
is 0). This is fine because the container is not reachable by anyone other than cloudron's nginx. You can see https://git.cloudron.io/cloudron/matomo-app/-/blob/master/start.sh#L37
Finally, the Origin header is only needed for CORS i.e when a tracked website is making a call into matomo. It doesn't come into picture when loading matomo's login screen. It might appear to come into picture, but this is only because we have a separate domain when using a CDN and this goes back to figuring out if matomo supports a CDN based setup (this, I don't know and I cannot find any information in their docs or the forum).
-
@girish said in Hide Cloudron (Cloudfront, CDN, multiple IPs):
This means that if you "view source", you might find it has origin URLs in the assets (css/js), thus leaking your origin (and thus IP).
FWIW, open https://analytics.cloudron.io -> view source . It has
https://analytics.cloudron.io/plugins/Morpheus/images/logo.svg?matomo
inside the source. So, the IP will get leaked even if you put a CDN in front. I realize some CDNs can probably patch HTML contents, but I was just trying to explain the bigger point.
-
When a CDN is placed in front of an app, the app needs to understand there is a CDN in front of it. The app has to be specifically coded and have configuration options to support a CDN.
I think all these things you've listed apply to your nginx in the same way they apply to any reverse proxy you put in front of it (including CloudFront). And you've already solved them, otherwise Matomo and other apps would respond with
localhost:8080
in the HTML and break all the time.
Finally, the Origin header is only needed for CORS i.e when a tracked website is making a call into matomo. It doesn't come into picture when loading matomo's login screen. It might appear to come into picture, but this is only because we have a separate domain when using a CDN and this goes back to figuring out if matomo supports a CDN based setup (this, I don't know and I cannot find any information in their docs or the forum).
No and yes. As I assumed earlier, I'm running into Origin-based CSRF protection, and the Matomo source confirms that: https://github.com/matomo-org/matomo/blob/e4a7b579622da9f585a9e662efc114b08b6fe08d/core/Nonce.php#L132-L139
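The general idea of an Origin-based check can be sketched as follows. This is a simplified illustration, not Matomo's actual code: the function name and the rule of accepting a missing header are assumptions made for the example.

```python
from typing import Optional
from urllib.parse import urlsplit


def origin_is_acceptable(origin_header: Optional[str], accepted_hosts: set[str]) -> bool:
    """Simplified Origin check: accept a missing header or a host on the allow list."""
    if not origin_header:
        # Assumption for this sketch: requests without an Origin header pass.
        return True
    host = urlsplit(origin_header).hostname
    return host is not None and host in accepted_hosts


hosts = {"matomo.cloudron.example.com"}
print(origin_is_acceptable("https://matomo.cloudron.example.com", hosts))
print(origin_is_acceptable("https://foo.cloudfront.net", hosts))
```

Under a check like this, a form posted from the CDN domain is rejected unless the proxy rewrites the Origin header or the app is told about the extra host.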
Thanks for all the information regarding how nginx and Matomo interact. I've been able to successfully access Matomo through CloudFront by including
Origin: https://matomo.cloudron.example.com
in the origin request. However, login redirects me from
foo.cloudfront.net
to
matomo.cloudron.example.com
, which is not surprising, since that's the Host header Matomo sees.
From all the information I have I think all I need is the following: a way to associate a domain with an app without all the DNS/TLS setup that Cloudron otherwise needs. The only downside would be that it wouldn't be full SSL and traffic between EC2 and CloudFront would be unencrypted inside the AWS network.
So instead of
https://matomo.cloudron.example.com
-> nginx -> docker
it would be
https://foo.cloudfront.net
-> CloudFront ->
http://foo.cloudfront.net
-> nginx -> docker
what I have right now is
https://foo.cloudfront.net
-> CloudFront ->
https://matomo.cloudron.example.com
-> nginx -> docker
I can configure CloudFront to forward the Host header. So there just needs to be an additional nginx config to forward my arbitrary host to the Matomo container. Matomo itself doesn't know about this, but would see the CloudFront Host + Origin header and be happy. Right now it doesn't know about
foo.cloudfront.net
, because CloudFront looks to nginx like any user accessing Matomo at
matomo.cloudron.example.com
. With this new setup there cannot be a leak of
matomo.cloudron.example.com
, because it isn't part of the request or response at all and Matomo is not aware of its existence. As you said,
enable_trusted_host_check
is disabled and I just need a second trusted host (trusted by nginx) that nginx forwards to the container.
-
@girish did you get a chance to look at my recent post? Is this something Cloudron is interested in supporting (connecting domains to an app without TLS)? Alternatively (and more flexible) will https://forum.cloudron.io/topic/6713/where-can-i-put-custom-nginx-conf-that-won-t-be-overwritten go somewhere? You said
I think to properly support this, we can look into this for the next release
If there was a way to customize the nginx config then I could wire everything up myself. That would probably be both easier to implement on your end (doesn't even need a UI in the first iteration) and also way more flexible for different use-cases. Maybe some sort of fs based heuristic where you include custom nginx configs for each app in a specific location or that follow a specific name pattern?
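To make the name-pattern idea concrete, here is a minimal sketch. The pattern "<app_id>.<anything>.conf" and the helper are purely hypothetical and do not describe an existing Cloudron feature; the point is only that a fixed naming rule is enough to pick up per-app files without any UI.

```python
from fnmatch import fnmatchcase


def custom_configs_for_app(filenames: list[str], app_id: str) -> list[str]:
    """Pick the files named '<app_id>.<anything>.conf', in sorted order."""
    pattern = f"{app_id}.*.conf"  # hypothetical naming convention
    return sorted(name for name in filenames if fnmatchcase(name, pattern))


files = ["matomo.cloudfront.conf", "wordpress.extra.conf", "matomo.conf", "matomo.notes.txt"]
print(custom_configs_for_app(files, "matomo"))
```

Sorting gives a stable include order, so two custom files for the same app are always applied in the same sequence.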
-
@Prinzhorn I can look into this further but did you see my note here - https://forum.cloudron.io/topic/6773/hide-cloudron-cloudfront-cdn-multiple-ips/24 ? If you do view source, even with a CDN, the cloudron domain will leak because CDN integration is not "built into" matomo. Are you OK with that? Because I thought the goal was to hide the IP in a secure way (or maybe in your use case it's enough to make it hard).
-
-
@girish I don't think I understand the problem here. There is no "cloudron domain" that could be leaked. I create a CloudFront distribution, say
example.cloudfront.net
(or whatever custom domain I configure in CloudFront). This distribution forwards requests with this host to my Cloudron box (over plain HTTP). And there I have a custom nginx config that connects
example.cloudfront.net
to the Matomo container. There is no other domain involved.
I think the way Cloudron works I still need at least one domain like
matomo.cloudron.example.com
connected to the app (what you called the "cloudron domain"). Because Cloudron wouldn't know that my custom nginx config already handles connection of the domain. But I don't use it and Matomo doesn't know it exists. Matomo also doesn't know about CloudFront. To Matomo it is indistinguishable from the original setup. Because Matomo is not behind CloudFront, Matomo is behind Cloudron's nginx. How could it leak?
The only thing I need to take care of is that if someone finds the Cloudron Domain (which I don't need, but Cloudron does), they cannot correlate it to the same Matomo that is served through CloudFront. I think the way I'll do this is by using a domain specifically for Cloudron (e.g. not
cloudron.my-company.com
but
my-cloudron.com
), so that it cannot be found by enumeration of our subdomains (not sure how well you can protect DNS). And if the custom nginx config takes precedence over Cloudron's own, then I think I can also disable the custom domain and return a 400 instead of forwarding requests to Matomo? So someone "discovering" the domain would still not know it's actually the same Matomo instance served via CloudFront (CloudFront doesn't leak the origin).
Is there something I'm misunderstanding?
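For concreteness, the routing described in this post can be modelled as a lookup on the Host header. This is a toy sketch, not nginx configuration: the table, the upstream name and the status codes are assumptions chosen to mirror the text (only the CloudFront-facing host reaches the app, every other host gets a 400).

```python
from typing import Optional

# Hypothetical routing table: only the CloudFront-facing host maps to the app.
ROUTES = {"example.cloudfront.net": "matomo-container"}


def route(host_header: str) -> tuple[int, Optional[str]]:
    """Return (status, upstream) for a request, based only on its Host header."""
    # Ignore an optional port and normalise case before the lookup.
    host = host_header.split(":", 1)[0].strip().lower()
    upstream = ROUTES.get(host)
    if upstream is None:
        # Unknown hosts, including the domain Cloudron itself needs, are rejected.
        return 400, None
    return 200, upstream


print(route("example.cloudfront.net"))
print(route("matomo.my-cloudron.com"))
```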
-
I am yet to investigate this a second time but I just chanced upon https://matomo.org/faq/how-to/faq_132/ which suggests using https://github.com/matomo-org/tracker-proxy . It seems to provide a way to hide the matomo server URL.
-
@Prinzhorn said in Hide Cloudron (Cloudfront, CDN, multiple IPs):
... We don't want clients to know that we are running Cloudron and that multiple projects are running on the same machine (let's just take this as a given).
It's a point I and some others have raised here and there: I believe Cloudron should not be seen by or exposed to anyone other than its own client. There are many reasons for that, plus the one submitted here. Meaning, other than the CR's client, any other end users and visitors to a site maintained on a CR instance, or anyone else who stumbles on a domain which is mounted on a CR instance but has no app installed, for example, should NOT have any clue about "a Cloudron" being in use, let alone be shown a direct link to cloudron.io, which is irrelevant to a visitor who isn't aware of Cloudron in any way. The reasons to avoid this may be tactical, competitive and, moreover, ethical: that traffic belongs to the CR's owner (or client) and should not be redirected in any way, let alone confuse the visitor by presenting something that will NOT solve their problem at that moment. 2 cents, again, about this issue.