"run rules" on already imported docs

timconsidine

This may be staring me in the face, but I can't see it.

Paperless-NG does an excellent job of classifying a document on import, e.g. assigning a correspondent and tags.
But that depends on setting those up in advance.

I have a large archive of docs I want to bring into the app.
I can set up the "obvious" ones, but will probably miss a number (e.g. suppliers I have forgotten I dealt with but still have docs from).

Is it possible to set up e.g. a correspondent after importing and then "run rules" on already imported docs which were not categorised on initial import?

nebulon

@timconsidine I guess you have looked into https://paperless-ng.readthedocs.io/en/latest/advanced_usage.html# already. I also couldn't quickly find some obvious command to run.

But generally: do you put the documents into the storage folder and then just want Paperless to add them to the index? Or what did you mean by:

@timconsidine said in "run rules" on already imported docs:

But that depends on setting those up in advance.

timconsidine

@nebulon thank you
My plan is to:

1. upload a folder of several docs, too many to look at manually
2. allow Paperless to do its excellent matching and tagging
3. probably then find an unforeseeable number of docs which should be allocated to correspondents or tags but weren't, because I didn't set these up in advance
4. somehow (this is the question) get Paperless to run the matching process again on already imported docs

Maybe I am over-worrying.
Just conscious it will be a big job importing this archive and trying to reduce my manual work tagging/classifying post-import.

neurokrish

@timconsidine I think you are looking for the document retagger. I have used it already and it works great!

timconsidine

@neurokrish thank you !

EDIT : jeez, isn't Paperless-NG awesome !

ChristopherMag

PS, anyone needing to do this, open the cloudron terminal for the app and then run the following:

cd src
python3 manage.py document_retagger -T
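
If you expect to run this more than once, the two commands above can be wrapped in a small shell function. This is only a sketch: the function name retag and the PAPERLESS_SRC variable are made up for illustration and are not part of Paperless-NG or Cloudron. It assumes you start in the directory that contains src/, as in the Cloudron terminal session described above. The -c flag (correspondents) is my understanding of the retagger's options; run python3 manage.py document_retagger --help to confirm what your version supports.

```shell
# Hypothetical helper: "retag" and PAPERLESS_SRC are illustrative names only.
retag() {
  # Directory holding manage.py; defaults to ./src as in the commands above.
  src_dir="${PAPERLESS_SRC:-src}"
  if [ ! -f "$src_dir/manage.py" ]; then
    echo "manage.py not found in '$src_dir'; run this from the app's code directory" >&2
    return 1
  fi
  # Run in a subshell so the caller's working directory is left unchanged.
  (cd "$src_dir" && python3 manage.py document_retagger "$@")
}

# Usage (from the directory that contains src/):
#   retag -T       # re-run tag matching, same as the command above
#   retag -c -T    # assumed: also re-run correspondent matching
```

The guard on manage.py is there so the function reports a wrong starting directory clearly instead of leaving you with a Python error.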

jdaviescoates

@ChristopherMag said in "run rules" on already imported docs:

PS, anyone needing to do this, open the cloudron terminal for the app and then run the following:
python3 src/manage.py document_retagger -T

Sounds like something that'd be useful to add to the docs @Staff

girish

@ChristopherMag thanks, documented in https://docs.cloudron.io/apps/paperless-ngx/#document-retagger

ChristopherMag

@girish I thought you could run the command without first cd'ing into the src directory, but when I ran it again today I found out that was incorrect.

I have edited my post above to first include cd src and then python3 manage.py document_retagger -T.

Could you please update the documentation? What I originally posted doesn't work: for some reason, Python throws an exception if you don't cd into src first.
