Solved Paperless - indexing and archiving scanned documents
-
Paperless is an app for indexing and archiving scanned paper documents.
It supports Docker; see its documentation.
This app was forked and is now maintained at https://github.com/jonaswinkler/paperless-ng
-
Would really love to see this in the near future!
-
This looks to be packaged already: https://git.cloudron.io/cloudron/paperless-app
Is there any reason why it is not published in the appstore?
-
I notice it's been 15 months since the last upstream release:
https://github.com/the-paperless-project/paperless/releases
-
@Hillside502 The author says it is feature-complete ("I am no longer doing new development on Paperless as it does exactly what I need it to" - https://github.com/the-paperless-project/paperless). However, it should be working.
-
I'd appreciate it. Paperless has the potential to replace the Scansnap cloud. I have the matching Fujitsu IX 500 scanner.
-
@stantropics I tried installing this on my cloudron server to no avail. I think the repo is out of date. Really looking for a document manager on cloudron...
-
@ergabergblerg can you try the repo in https://forum.cloudron.io/topic/1066/mayan-edms/10
-
There is a forked version being maintained at
https://github.com/jonaswinkler/paperless-ng
-
@bubonicfred I am starting to package this fork since I am stuck on Papermerge. Hopefully this one will be a bit more cooperative. Will post link when I get something resembling functional.
-
@doodlemania2 My initial commit is here: https://git.cloudron.io/doodlemania2/paperless-ng
THIS DOES NOT YET WORK
But I need to do these things and then I think it's done:
1. use sed to substitute the environment variables into the config
2. create the start.sh (with #1 above)
3. set up the three services in systemd
I've done all of the above before; I'm just short on time, so if anyone can give me an assist, I'd greatly appreciate it! If not, I'll continue as quickly as I can.
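The "sed the environment variables into the config" step can be sketched roughly as below. This is only an illustration, not the code in the repo: the `##...##` placeholder syntax, the `EXAMPLE_*` variable names and the file names are all assumptions.

```shell
#!/bin/sh
# render_config TEMPLATE OUTPUT
# Copies TEMPLATE to OUTPUT, replacing ##PLACEHOLDER## markers with values
# taken from environment variables. All names here are hypothetical.
render_config() {
    template="$1"
    output="$2"
    # "|" is used as the sed delimiter so values containing "/" (URLs, paths)
    # survive; values containing "|" or "&" would need escaping first.
    sed -e "s|##APP_ORIGIN##|${EXAMPLE_APP_ORIGIN}|g" \
        -e "s|##DATA_DIR##|${EXAMPLE_DATA_DIR}|g" \
        "$template" > "$output"
}
```

A start.sh could call something like `render_config /app/code/paperless.conf.template /app/data/paperless.conf` before launching the services (both paths are placeholders).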
-
@doodlemania2 latest checkin has the start.sh and systemd files getting checked in. Still iterating as time permits.
-
@doodlemania2 Latest check-in (simple one tonight): I switched from supervisor to the built-in gunicorn runner and was able to create a username/password and log in! Woot
Now it's throwing errors about an attempt to write to a read-only database, so I suspect the auth mechanism is writing something in /app/code that I'll need to symlink.
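For context, Cloudron mounts /app/code read-only at runtime while /app/data is writable, so anything the app writes under the code tree has to be redirected. A minimal sketch of the symlink approach follows; the directory name `data` and the function names are hypothetical, and which paths paperless-ng actually writes to would need to be checked.

```shell
#!/bin/sh
# CODE_DIR and DATA_DIR default to the usual Cloudron locations; they can be
# overridden for local experiments.
CODE_DIR="${CODE_DIR:-/app/code}"
DATA_DIR="${DATA_DIR:-/app/data}"

# Build time (the image is still writable): replace a directory inside the
# code tree with a symlink that points into the data volume.
link_into_data() {
    name="$1"
    # ":?" aborts if CODE_DIR is empty, so this never expands to "rm -rf /name".
    rm -rf "${CODE_DIR:?}/$name"
    ln -s "$DATA_DIR/$name" "$CODE_DIR/$name"
}

# Run time (for example from start.sh): make sure the symlink target exists
# before the app starts writing to it.
ensure_data_dir() {
    mkdir -p "$DATA_DIR/$1"
}
```

In this sketch `link_into_data data` would run from the Dockerfile and `ensure_data_dir data` from start.sh, so a SQLite file created under /app/code/data ends up in /app/data/data.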
After that is:
- Get the supervisors to work (some weird ini file error)
- Test the app
- Cleanup
- Final commit before handing it off for cloudron test dev in case anyone wants to see it in the app store.
-
@doodlemania2 Bravo!
-
This app is now up and functional. Please test! I'll submit to the appstore for WIP in a day or so after some more cleanup and testing.
https://git.cloudron.io/doodlemania2/paperless-ng/
Note: inbound mail is enabled, but you have to configure it yourself in paperless config. I haven't tested that yet and from the looks of the repo, there are lots of issues with mail at the moment, so we may ship unstable without inbound mail.
cc: @girish
-
@doodlemania2 just an fyi - this is working well, but the "consumption" directory isn't being polled, so I'm still tinkering with that. I am able to upload manually, and auto tagging and pdf extraction are working. So we're about 95% there I think.
-
@doodlemania2 Keep up the good work!
-
@doodlemania2 Very excited about your efforts on this! I appreciate you, sir! I'm a vanilla PHP and JS developer for the most part, so node, docker, manifests, build scripts, all of it gets a little sketchy if I'm the one on it. I would love to learn how to package Cloudron apps, though. Maybe I'll have a look at some of the material that's available on that here later.
I have been aching for a good document management system with user permissions for probably the last 12 months or so.
-
@girish - I'm officially stuck. The app is ready and I think the tests might even work except for the consumption directory.
Described my issue here: https://github.com/jonaswinkler/paperless-ng/issues/438.
Tried moving the supervisors around, nothing.
If I run the consume script from shell, works like a champ. Open to ideas.
-
@doodlemania2 Is that a periodic job? Even if not, could we just hack around it, with a small caveat, by using the shell command and the scheduler add-on? It'd be a minor thing, I think, for the Cloudron package to run an "every minute" (or similar) consumption job.
-
@jimcavoli That's a fabulous idea - I will give that a try - it uses inotify on the FS to monitor for changes (intended to always run), but this could be a fabulous workaround! ON IT!
-
@doodlemania2 Sure thing! Also, didn't know it was inotify under the hood - that has a fraught history with dockerized runtimes, so this is likely your best option anyway
-
@jimcavoli alrighty - that seems to have worked (running it with cloudron scheduler). now just need someone to help me with tests and it can ship!
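One caveat with an every-minute scheduler job is that a slow run (large PDFs, OCR) can overlap with the next one. A small wrapper like the following sketch avoids that by taking a lock first. The function name and lock path are hypothetical, and the consume command shown in the comment is an assumption to be checked against the paperless-ng docs, not necessarily what the package runs.

```shell
#!/bin/sh
# run_exclusive LOCK_DIR COMMAND [ARGS...]
# Runs COMMAND unless a previous invocation still holds LOCK_DIR.
run_exclusive() {
    lock_dir="$1"
    shift
    # mkdir is atomic: it fails if the directory already exists, which means
    # an earlier run has not finished yet.
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "previous run still active, skipping"
        return 0
    fi
    # Run the command, remember its exit status, and always release the lock.
    status=0
    "$@" || status=$?
    rmdir "$lock_dir"
    return "$status"
}

# Example scheduler command (assumed invocation, verify before use):
#   run_exclusive /tmp/consume.lock python3 manage.py document_consumer --oneshot
```

If a run is killed mid-way the lock directory stays behind, so a real script would also want some stale-lock cleanup at app start.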
-
@doodlemania2 I marked this as WIP now! Great progress.
-
hi all,
I was actually working on this a few weeks ago. I forked @doodlemania2's repo and updated it to the latest version. I also enabled SFTP access. I am also working to implement the proxyauth addon to pass through to the django app (still wip), as it does not support LDAP natively.
I don't have access to fork in the cloudron gitlab so I had to copy to my github account. You can find the repo here:
https://github.com/campbellmcgregor/paperless-ng-cloudron
thanks
-
@ultraviolet would you send me a PR? I'd be delighted to merge those in!
-
@doodlemania2 sure next time I am at a desk I will send you a PR
-
@ultraviolet said in Paperless - indexing and archiving scanned documents:
I don't have access to fork in the cloudron gitlab so I had to copy to my github account
Whoops, sorry about that. By default, accounts cannot fork since we get too much spam. I assume you are https://git.cloudron.io/admin/users/ultraviolet ? I have increased the project limit already.
-
@girish yup that is me thanks!
-
@doodlemania2 sent you a PR
-
@ultraviolet Merged! That's fantastic - great work!!!
-
@doodlemania2 thanks, I may send you another PR in a day or so; I've hit a few walls with the proxyauth.
-
@ultraviolet @doodlemania2 Per https://github.com/jonaswinkler/paperless-ng/discussions/654, it seems if someone knows some python we can add proper ldap support.
-
@girish oooo shiny! @ultraviolet -- that you?
-
@doodlemania2 great work, thank you for making this available.
I built (very smooth) and installed on my Cloudron.
As a noob to paperless-ng, I am not sure what is needed for "processing".
I uploaded a few documents to test and they seem stuck in 'processing' in the dashboard view.
A tag and correspondent don't seem to get auto-applied.
Is this functionality working?
Do I need to do something?
-
@timconsidine Seems the last one was my noob issue.
The dashboard clears eventually and tags can be retro-applied with
root@<app-instance>:/app/code/src# python3 manage.py document_retagger -T
New Q: I'm looking to add scheduled tasks but cannot see in the docs a list of task.function. Is there one available?
-
@timconsidine Glad you got it working - I'm not sure how the scheduled tasks stuff works - since it does what I need out of the box, I just ignored those
-
@doodlemania2 no problem, I understand, and likewise I have started using it as it does everything I currently want 'out of the box'.
I just don't like to miss a trick when I could get even more value out of it by using 'advanced options'.
I have to say: thank you again. It's an awesome app, and solves a problem of coordinating docs between users (in my case purchase invoices needing review and input). Previously I have been using a shared folder in Nextcloud, and although that works, I have had issues when some user's sync to Nextcloud stops, and everyone starts shouting at each other because they are seeing different views of the world.
The only things not there which I can identify which would be nice:
- "tag tree" view, which I notice Teedy has. Just a UI advantage
- a way to remove tags w/o manually doing so, based on some conditions
But hey ho, just "nice to have", not essentials.
-
@timconsidine outstanding! really glad it works for you
-
@timconsidine Hello! To get this installed, did you download the source, cd into the directory and run cloudron build then cloudron install? I only once ever managed to install a non-official app but I totally forget how I did it!
One place I get stuck on conceptually is at https://docs.cloudron.io/custom-apps/tutorial/, where it tells us to build:
# build the app
$ docker build -t username/nodejs-app:1.0.0 .
What in the world is "username" supposed to be? Just a bit earlier it tells us to git clone https://git.cloudron.io/cloudron/tutorial-nodejs-app, so why wouldn't the build section just use cloudron as the username? And if it isn't cloudron but myself, does this mean it is going to be put in a directory in the directory I'm building in? Anyway, this stuff confuses the heck out of me.
-
@scooke No, I didn't use cloudron build.
My process is:
- install cloudron cli (but I guess you did this)
- have your own docker repository: this is what is meant by username. I think it is best described as reponame. I am not so sophisticated as to have my own private repository system. I just use the free account on https://hub.docker.com/
- install Docker Desktop for your platform
- on local machine create a directory e.g. /cloudron-custom/appname and change into the appname directory
- download a zip file of the app source and unzip into the appname directory
- make sure you have Docker Desktop running (which logs you into the free hub.docker account)
- then build using
  docker build -t <reponame>/<appname>:<tag> .
  NB my change of terminology, and remember the trailing dot
- if I understand correctly, tag is only to distinguish different builds of the app. Initially I just ignored these, but using them is probably good practice. They can be anything (date, codeword)
- the built app is now just on your local machine, so get it into the repository system you chose, e.g. hub.docker.com, using
  docker push <reponame>/<appname>:<tag>
- then it is available to cloudron cli so
  cloudron install --image <reponame>/<appname>:<tag>
  will ask you what location you want (e.g. appname.yourdomain.com)
- generally it is seamless from there, watch the pretty dots and when finished it should be in your Cloudron dashboard, ready to set access permissions or resources if you don't want the default.
I guess this is not a sophisticated process; it has multiple steps, but I find that makes it easier to debug when something goes wrong.
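The build, push and install steps above could be collected into one small script along these lines. It is only a sketch: the default `exampleuser` account, the date-based tag and the `DRY_RUN` switch are assumptions added for illustration, while the three commands are the ones from the list.

```shell
#!/bin/sh
# Placeholder values: replace with your own Docker Hub account, app name and
# a tag of your choosing (here the tag defaults to today's date).
REPONAME="${REPONAME:-exampleuser}"
APPNAME="${APPNAME:-paperless-ng}"
TAG="${TAG:-$(date +%Y%m%d)}"

# Builds the image reference in the form <reponame>/<appname>:<tag>.
image_ref() {
    printf '%s/%s:%s\n' "$1" "$2" "$3"
}

# With DRY_RUN=1 (the default in this sketch) commands are only printed;
# set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run from inside the unzipped app source directory (note the trailing dot).
deploy() {
    image=$(image_ref "$REPONAME" "$APPNAME" "$TAG")
    run docker build -t "$image" .
    run docker push "$image"
    run cloudron install --image "$image"
}
```

Calling `deploy` with the defaults just prints the three commands, which makes it easy to check the image name before running anything for real with `DRY_RUN=0`.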
I think my build of Paperless-NG is available publicly at tcmbp132021/paperless-ng:derek210709
I've never tried it but maybe you could save some effort and try just doing
cloudron install --image tcmbp132021/paperless-ng:derek210709
Maybe there will be some permission issues, but it says it is public. If you try that, let me know, it will be useful to know.
I hope that's not too granular for anybody; personally I just like having step-by-step instructions. Good luck and shout if you run into problems. You will succeed!
-
@timconsidine This is super helpful. I guess another thing that confuses me is, if I understand this process, the hub.docker.com site is going to end up (or already is) jam-packed with 1001 Docker images from everyone and their grandmother. Seems inefficient and confusing to me.
I'll try skipping to the install step with the filled-in <reponame>/<appname>:<tag> details you gave. Thanks!
-
@scooke I think that's Docker's problem!
No different to any other SaaS-style hosted service. Provided performance on your account is OK, it's probably nothing to worry about.
Or alternatively set up a private repository. Isn't there a Cloudron app already for that?
I've not been brave enough / had enough time to do that yet.
-
@doodlemania2 With the help of @timconsidine I have this installed also. Looks and runs smooth! Thank you!
-
So this app is published as unstable in the paperless-ng flavor, thanks to @doodlemania2 for the majority of the initial packaging!
The forum section for further feedback and issues is now https://forum.cloudron.io/category/137/paperless-ng so I will lock this topic.
-
nebulon