Add Apache Tika to Paperless-ngx package.
-
To process office and .eml files, Paperless needs an Apache Tika server. It can be hosted with Docker and needs to be added to the Paperless config. That's it.
Could you please do that?
-
And Gotenberg too.
https://docs.paperless-ngx.com/configuration/#optional-services
-
-
@nebulon
I get the following error when I try to add an .eml file:

[2024-11-03 17:53:03,119] [ERROR] [paperless.tasks] ConsumeTaskPlugin failed: WG_ aktuelle Zahlen aus dem Fachdienst 35.eml: Error occurred while consuming document WG_ aktuelle Zahlen aus dem Fachdienst 35.eml: Error while converting email to PDF: [Errno 111] Connection refused
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py", line 72, in map_httpcore_exceptions
    yield
  File "/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py", line 236, in handle_request
    resp = self._pool.handle_request(req)
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_sync/connection_pool.py", line 216, in handle_request
    raise exc from None
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_sync/connection_pool.py", line 196, in handle_request
    response = connection.handle_request(
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_sync/connection.py", line 99, in handle_request
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_sync/connection.py", line 76, in handle_request
    stream = self._connect(request)
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_sync/connection.py", line 122, in _connect
    stream = self._network_backend.connect_tcp(**kwargs)
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_backends/sync.py", line 205, in connect_tcp
    with map_exceptions(exc_map):
  File "/usr/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/code/src/paperless_mail/parsers.py", line 354, in generate_pdf_from_mail
    .run()
  File "/usr/local/lib/python3.10/dist-packages/gotenberg_client/_base.py", line 113, in run
    resp = self._client.post(url=self._route, headers=self._headers, data=self._form_data, files=self._get_files())
  File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 1157, in post
    return self.request(
  File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 837, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 926, in send
    response = self._send_handling_auth(
  File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 954, in _send_handling_auth
    response = self._send_handling_redirects(
  File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 991, in _send_handling_redirects
    response = self._send_single_request(request)
  File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 1027, in _send_single_request
    response = transport.handle_request(request)
  File "/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py", line 235, in handle_request
    with map_httpcore_exceptions():
  File "/usr/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py", line 89, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/asgiref/sync.py", line 327, in main_wrap
    raise exc_info[1]
  File "/app/code/src/documents/consumer.py", line 476, in run
    document_parser.parse(self.working_copy, mime_type, self.filename)
  File "/app/code/src/paperless_mail/parsers.py", line 183, in parse
    self.archive_path = self.generate_pdf(mail)
  File "/app/code/src/paperless_mail/parsers.py", line 223, in generate_pdf
    mail_pdf_file = self.generate_pdf_from_mail(mail_message)
  File "/app/code/src/paperless_mail/parsers.py", line 357, in generate_pdf_from_mail
    raise ParseError(
documents.parsers.ParseError: Error while converting email to PDF: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/code/src/documents/tasks.py", line 148, in consume_file
    msg = plugin.run()
  File "/app/code/src/documents/consumer.py", line 508, in run
    self._fail(
  File "/app/code/src/documents/consumer.py", line 151, in _fail
    raise ConsumerError(f"{self.filename}: {log_message or message}") from exception
documents.consumer.ConsumerError: WG_ aktuelle Zahlen aus dem Fachdienst 35.eml: Error occurred while consuming document WG_ aktuelle Zahlen aus dem Fachdienst 35.eml: Error while converting email to PDF: [Errno 111] Connection refused
-
-
-
@nebulon This is super, Nebulon. There are so many apps Cloudron has made available, but adding "small" requests like this to existing apps makes Cloudron all the more appealing. You've just saved me the cost of a separate VPS on which I had been running Tika and Gotenberg! Has there been a better improvement to any app than this?! Thank you!
PAPERLESS_TIKA_ENABLED=true
PAPERLESS_TIKA_ENDPOINT=http://localhost:9998
PAPERLESS_TIKA_GOTENBERG_ENDPOINT=http://localhost:3000
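
In case it helps anyone who still sees the "[Errno 111] Connection refused" error from earlier in this thread: that error usually means nothing is listening on the configured host and port. Below is a minimal sketch, not part of Paperless-ngx or the Cloudron package, that reads the three settings above and tries a plain TCP connection to each endpoint. The helper names (`parse_settings`, `host_and_port`, `is_reachable`) are made up for illustration, and a successful connection only shows that the port is open, not that Tika or Gotenberg is answering correctly.

```python
import socket
from urllib.parse import urlsplit

# The settings exactly as posted above in this thread.
SETTINGS = """\
PAPERLESS_TIKA_ENABLED=true
PAPERLESS_TIKA_ENDPOINT=http://localhost:9998
PAPERLESS_TIKA_GOTENBERG_ENDPOINT=http://localhost:3000
"""


def parse_settings(text):
    """Turn KEY=VALUE lines into a dict, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def host_and_port(url):
    """Extract (host, port) from a URL, falling back to 80 or 443."""
    parts = urlsplit(url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def is_reachable(url, timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_and_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    settings = parse_settings(SETTINGS)
    for key in ("PAPERLESS_TIKA_ENDPOINT", "PAPERLESS_TIKA_GOTENBERG_ENDPOINT"):
        url = settings[key]
        state = "reachable" if is_reachable(url) else "NOT reachable"
        print(f"{key} -> {url}: {state}")
```

Run it from the same environment as Paperless (for example the app's web terminal), since `localhost` refers to whichever machine or container runs the script.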