Deploying apps that treat code as data

girish

The problem is this: upstream authors often don't think about deployment and frequently treat "code" as data. This means they write plugin and module infrastructure that writes all over the codebase. For the sake of this discussion, let's assume that we cannot convince the upstream authors that their approach is not good.

Cloudron was designed with the idea of reproducible and immutable deployments. One of the biggest things is that it marks code as read-only. This makes sure that apps can't write all over the place and that backups are actually correct. It is also needed for updates and restores to work reliably. I have to say this works very well for most apps, and yet use cases come up now and then that make us wonder if we should let go of some of these rigid requirements.

Here are some situations:

WordPress plugins want to write all over. Some caching plugins even write to the root WordPress directory, no idea why.
Dolibarr/Odoo are like frameworks and you have to deploy a whole bunch of plugins to make them functional. Unfortunately, these plugins also use relative paths like include "../foo", which tends to break when we symlink mutable paths.
NodeBB plugins are node_modules. Latest versions of node don't allow symlinking node_modules anymore. This means we possibly have to put the entire code in some writable directory.
Koel wants to write to the public directory to store cover art. This is also the folder that contains all the js/css/webpack output! It also has code accessing things via relative paths. So, symlinking public to a writable location doesn't work.
Etherpad creates a file inside every node_module plugin directory.
Nextcloud plugins sometimes access relative paths as well.

(On a separate note: none of the apps has a proper sandboxing and/or versioning mechanism for plugins. This means if a plugin breaks, nothing works.)

Any ideas welcome on how such apps are maintained and updated outside of Cloudron.

mehdi

In short, I am quite against letting go of these restrictions. I think their benefit is huge. I know it makes it a pain to package some apps, but I still think it's worth it.

As for more specific points:

@girish said in Deploying apps that treat code as data:

Latest versions of node don't allow symlinking node_modules anymore

Do you have a source on this ? I am very surprised, as a lot of nodejs utilities use symlinks in node_modules, things like Lerna (https://lerna.js.org/), or even native npm link.

@girish said in Deploying apps that treat code as data:

Koel wants to write to the public directory to store cover art. This is also the folder that contains all the js/css/webpack output! It also has code accessing things via relative paths. So, symlinking public to a writable location doesn't work.

You could do it the other way around :

in the Dockerfile, rename the folder /app/code/public to /app/code/public2
create a symlink from /app/code/public to /app/data/public
in start.sh create /app/data/public, and inside it create symlinks to everything in /app/code/public2

A bit hacky, but it should work, I think.
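
The three steps above can be sketched as a shell script. This is only an illustration under assumptions: the helper name link_readonly_entries is hypothetical, and a scratch directory stands in for /app so the script can be tried outside a container. In a real package the first symlink would be created in the Dockerfile and the helper would be called from start.sh.

```shell
#!/bin/sh
set -eu

# Hypothetical helper (not part of Cloudron): fill a writable directory with
# symlinks to every entry of a read-only directory.
link_readonly_entries() {
    src="$1"    # read-only directory, e.g. /app/code/public2
    dst="$2"    # writable directory, e.g. /app/data/public
    mkdir -p "$dst"
    for entry in "$src"/* "$src"/.[!.]*; do
        [ -e "$entry" ] || continue            # unmatched glob, nothing to link
        target="$dst/$(basename "$entry")"
        # Leave real files or directories written by the app untouched.
        if [ -e "$target" ] && [ ! -L "$target" ]; then
            continue
        fi
        ln -sfn "$entry" "$target"
    done
}

# Demo in a scratch directory standing in for /app.
root="$(mktemp -d)"
mkdir -p "$root/code/public2/js" "$root/data"
echo "console.log('hi')" > "$root/code/public2/js/app.js"
echo "<html></html>" > "$root/code/public2/index.html"

# Build-time step (Dockerfile in the real setup): public -> data/public.
ln -s "$root/data/public" "$root/code/public"

# Start-time step (start.sh in the real setup).
link_readonly_entries "$root/code/public2" "$root/data/public"

# A write through the public path now lands in the writable area.
echo "cover" > "$root/code/public/cover.png"
ls -l "$root/data/public"
```

One caveat: at the filesystem level, a ".." reached through the symlinked public directory resolves against /app/data, not /app/code, so code that climbs out of public with relative paths may still need attention.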

murgero

@mehdi Reverse the symlink to make it work but otherwise, this is what I use to package certain apps as well

mehdi

@murgero Are you sure? I have seen (and made) some apps with a simple symlink, but I have never seen this sort of double symlink (and no, I don't think there's anything to reverse in my explanation).

robi

@mehdi I think he means in ln -s syntax it's reversed.

@murgero is that right?
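
For reference, a small sketch of the ln -s argument order that this exchange seems to be about. The paths are placeholders in a scratch directory, not the real /app layout.

```shell
#!/bin/sh
set -eu

# Scratch directory so the example does not touch real paths.
dir="$(mktemp -d)"
mkdir "$dir/data"

# ln -s TARGET LINK_NAME: the existing location comes first,
# the name of the new symlink comes second.
ln -s "$dir/data" "$dir/public"

# readlink prints where the link points.
readlink "$dir/public"
```

So "a symlink from /app/code/public to /app/data/public" is written with /app/data/public as the first argument, which is likely why the description read as reversed.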

robi

@girish It may also be useful to have this concept available as a Tut/Writeup in the docs which you can link/send to upstream app devs to think about their app arch and adjust it for x, y, z benefits and easier inclusion into the Cloudron ecosystem.

If they ignore it, we have workarounds.
If they take it and make it, we all win.

infogulch

My initial thought was along the lines of @robi's latest comment: as a first step, let's have some good posts espousing the benefits of RW^X in app container directories and describing the pitfalls of letting everything devolve into a giant mess.

Ideally the first thing you do when you find an app that treats its code frivolously is link the authors to posts that describe why what they're doing is a Bad Idea (tm) and how the problem they're trying to solve with this crazy code-as-data strategy can be solved more sanely. (Obviously you'll want to be somewhat more respectful than this when opening a random issue on someone's free project lol, but really guys... don't do this.)

As for the initial question, where we attempt to proceed assuming we can't tame the app's behavior: maybe a "messy mode" for apps, where backups include more directories than just /data. Backups would be much larger. Images maybe couldn't be based on the Cloudron base image. Updates would become very messy (maybe just overwrite files with ones from the new image?). At that point I'm not sure it would be sustainable to officially support such apps at all; they might require white-glove support for individual installations.

girish

@mehdi said in Deploying apps that treat code as data:

@girish said in Deploying apps that treat code as data:

Latest versions of node don't allow symlinking node_modules anymore

Do you have a source on this ? I am very surprised, as a lot of nodejs utilities use symlinks in node_modules, things like Lerna (https://lerna.js.org/), or even native npm link.

Yes, see https://github.com/npm/cli/issues/3669 and (related) https://github.com/nodejs/node/issues/22922 . We symlink node_modules in the nodebb app which doesn't work anymore because of this. Not 100% sure but I think npm link creates symlinks inside node_modules which is different from this one where node_modules itself is symlinked.
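
The distinction drawn here can be made concrete with a sketch of the two directory layouts. It is an illustration only: the package name some-plugin and all paths are hypothetical, and the script does not run npm or node, so it shows the shapes being compared rather than the failure itself.

```shell
#!/bin/sh
set -eu

root="$(mktemp -d)"

# Layout A: node_modules itself is a symlink into a writable area
# (the arrangement described above for the nodebb app).
mkdir -p "$root/a/code" "$root/a/data/node_modules/some-plugin"
ln -s "$root/a/data/node_modules" "$root/a/code/node_modules"

# Layout B: node_modules is a real directory, and one package inside it
# is a symlink (the shape that npm link produces).
mkdir -p "$root/b/code/node_modules" "$root/b/elsewhere/some-plugin"
ln -s "$root/b/elsewhere/some-plugin" "$root/b/code/node_modules/some-plugin"

# Report which path in each layout is the symlink.
for p in "$root/a/code/node_modules" "$root/b/code/node_modules" \
         "$root/b/code/node_modules/some-plugin"; do
    if [ -L "$p" ]; then kind="symlink"; else kind="directory"; fi
    echo "${p#"$root"/}: $kind"
done
```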

privsec

Why not offer an option within the app to determine whether or not it is write protected?

infogulch

@privsec If the user installs a plugin which is stored in the code area, and the backup only backs up the data area, when the app is deleted and restored from a backup it will lose the plugin and potentially also data.

privsec

@infogulch
Could the backup locations be updated?

murgero

@robi Ah yes - this is what I was thinking about.

@mehdi - My apologies, I may have been thinking of alternate syntax to the symlinking command.

MooCloud_Matt

@girish

A possible solution is to allow multiple Volumes to be mounted to the container, so there is no need for a symlink and it is still easy to back up: the backup script knows which volumes to back up for each container.
It is a bit more work in the backup script, but it should allow an easier port to Cloudron.

Or, I see as an alternative the release of the Manifest standard as a separate Spec from Cloudron, like the "Compose Spec".
This should allow developers to better understand what Cloudron and the manifest support and how they work, and hopefully start to act more in line with safer containerization standards.

timconsidine

@girish : it will for sure be impossible to persuade others to do it the Cloudron way, so the problem has to be addressed.

Personally I feel the Cloudron stability and reliability is SO IMPORTANT that it must be preserved and it is the priority. If it is impossible to find a solution, we have to accept a situation like "sorry, this XXX app is such a mess, it is not possible to make it comply with Cloudron standards, and even though it is good functionality, and we could possibly package it, we're not going to. Because it's a mess."

While I don't have a technical solution approach to the problem, I think it's almost inevitable that we end up with another status of apps :

OFFICIAL (package tested, released, maintained, conforms to Cloudron standards)
UNSTABLE (although I don't like this word : should be Beta or not-yet-officially-released)
MESS or "Pig's Breakfast" (packaged, works, use it, but maintained as best endeavours, esp ref backups, because upstream dev does not conform to Cloudron standards)

Not ideal, but better than lowering Cloudron standards. Would rather accept not having XXX app in the store.

mehdi

I quite agree with @timconsidine . We could have these apps on Cloudron, but with a big fat red warning that says the app is unsupported :

zero user support
larger backups
no auto-update

If unhappy: bother the original author.

robi

@moocloud_matt said in Deploying apps that treat code as data:

a possible solution is to allow multiple Volumes to be mounted to the container

This is along the lines of OpenEBS storage discussion here.

Elegant way to install an "elevator" to any directory in the container and swap out files we might need, without disrupting the app.

MooCloud_Matt

@robi
In general, having support for multiple drivers for Volumes is a good idea, but adding support for multiple volumes to the Manifest standard could solve the issue, at least partially.

Cloudron Forum