Add Apache Tika to Paperless-ngx package.
Solved
Paperless-ngx
-
To process office and .eml files, Paperless needs an Apache Tika server. It can be hosted with Docker and needs to be added to the Paperless config. That's it.
Could you please do that?
-
And Gotenberg too.
https://docs.paperless-ngx.com/configuration/#optional-services
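For anyone wiring this up by hand, the optional-services section linked above comes down to three settings. Here is a minimal sketch that assembles them, assuming Tika and Gotenberg listen on their default ports (9998 and 3000) on localhost. The URLs and the helper name `tika_settings` are illustrative assumptions, not something taken from the Cloudron package.

```python
def tika_settings(tika_url="http://localhost:9998",
                  gotenberg_url="http://localhost:3000"):
    """Return the Paperless-ngx settings that switch on Tika and Gotenberg.

    Both URLs are assumed defaults; replace them with wherever the two
    services actually run.
    """
    return {
        "PAPERLESS_TIKA_ENABLED": "1",
        "PAPERLESS_TIKA_ENDPOINT": tika_url,
        "PAPERLESS_TIKA_GOTENBERG_ENDPOINT": gotenberg_url,
    }


# Print the settings as KEY=value lines, ready to paste into an env file.
for key, value in tika_settings().items():
    print(f"{key}={value}")
```

If the services run in separate containers, pass their hostnames instead of localhost, for example `tika_settings(gotenberg_url="http://gotenberg:3000")`.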
-
-
@nebulon
I get the following error when I try to add an .eml file:

[2024-11-03 17:53:03,119] [ERROR] [paperless.tasks] ConsumeTaskPlugin failed: WG_ aktuelle Zahlen aus dem Fachdienst 35.eml: Error occurred while consuming document WG_ aktuelle Zahlen aus dem Fachdienst 35.eml: Error while converting email to PDF: [Errno 111] Connection refused
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py", line 72, in map_httpcore_exceptions
    yield
  File "/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py", line 236, in handle_request
    resp = self._pool.handle_request(req)
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_sync/connection_pool.py", line 216, in handle_request
    raise exc from None
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_sync/connection_pool.py", line 196, in handle_request
    response = connection.handle_request(
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_sync/connection.py", line 99, in handle_request
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_sync/connection.py", line 76, in handle_request
    stream = self._connect(request)
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_sync/connection.py", line 122, in _connect
    stream = self._network_backend.connect_tcp(**kwargs)
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_backends/sync.py", line 205, in connect_tcp
    with map_exceptions(exc_map):
  File "/usr/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/code/src/paperless_mail/parsers.py", line 354, in generate_pdf_from_mail
    .run()
  File "/usr/local/lib/python3.10/dist-packages/gotenberg_client/_base.py", line 113, in run
    resp = self._client.post(url=self._route, headers=self._headers, data=self._form_data, files=self._get_files())
  File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 1157, in post
    return self.request(
  File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 837, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 926, in send
    response = self._send_handling_auth(
  File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 954, in _send_handling_auth
    response = self._send_handling_redirects(
  File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 991, in _send_handling_redirects
    response = self._send_single_request(request)
  File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 1027, in _send_single_request
    response = transport.handle_request(request)
  File "/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py", line 235, in handle_request
    with map_httpcore_exceptions():
  File "/usr/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py", line 89, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/asgiref/sync.py", line 327, in main_wrap
    raise exc_info[1]
  File "/app/code/src/documents/consumer.py", line 476, in run
    document_parser.parse(self.working_copy, mime_type, self.filename)
  File "/app/code/src/paperless_mail/parsers.py", line 183, in parse
    self.archive_path = self.generate_pdf(mail)
  File "/app/code/src/paperless_mail/parsers.py", line 223, in generate_pdf
    mail_pdf_file = self.generate_pdf_from_mail(mail_message)
  File "/app/code/src/paperless_mail/parsers.py", line 357, in generate_pdf_from_mail
    raise ParseError(
documents.parsers.ParseError: Error while converting email to PDF: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/code/src/documents/tasks.py", line 148, in consume_file
    msg = plugin.run()
  File "/app/code/src/documents/consumer.py", line 508, in run
    self._fail(
  File "/app/code/src/documents/consumer.py", line 151, in _fail
    raise ConsumerError(f"{self.filename}: {log_message or message}") from exception
documents.consumer.ConsumerError: WG_ aktuelle Zahlen aus dem Fachdienst 35.eml: Error occurred while consuming document WG_ aktuelle Zahlen aus dem Fachdienst 35.eml: Error while converting email to PDF: [Errno 111] Connection refused
-
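The traceback in the previous post fails inside `gotenberg_client` with "Connection refused", which suggests nothing is listening on the configured Gotenberg endpoint. A quick way to confirm that from inside the app is a plain TCP check. This is only a sketch: the two URLs are the assumed defaults (9998 for Tika, 3000 for Gotenberg) and the helper name `is_reachable` is made up for illustration.

```python
import socket
from urllib.parse import urlparse


def is_reachable(url, timeout=3.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    # Fall back to the scheme's usual port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused", timeouts and name resolution errors.
        return False


if __name__ == "__main__":
    # Assumed endpoints; use the values from your own Paperless config.
    endpoints = {
        "Tika": "http://localhost:9998",
        "Gotenberg": "http://localhost:3000",
    }
    for name, url in endpoints.items():
        state = "reachable" if is_reachable(url) else "NOT reachable"
        print(f"{name} ({url}): {state}")
```

A "NOT reachable" result for Gotenberg would match the error above and point at the service itself rather than at Paperless.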
-
Here is what's going on.
Gotenberg isn't able to start up. It always crashes instantly and gives up after a few attempts. I have reinstalled it multiple times now and it still doesn't help.
[no timestamp] 2024-11-08T16:19:51Z mail 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 866 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 - _____ __ __
[no timestamp] 2024-11-08T16:19:51Z mail 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 866 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 - / ___/__ / /____ ___ / / ___ _______ _
[no timestamp] 2024-11-08T16:19:51Z mail 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 866 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 - / (_ / _ \/ __/ -_) _ \/ _ \/ -_) __/ _ '/
[no timestamp] 2024-11-08T16:19:51Z mail 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 866 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 - \___/\___/\__/\__/_//_/_.__/\__/_/ \_, /
[no timestamp] 2024-11-08T16:19:51Z mail 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 866 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 - /___/
[no timestamp] 2024-11-08T16:19:51Z mail 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 866 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 -
[no timestamp] 2024-11-08T16:19:51Z mail 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 866 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 - A Docker-powered stateless API for PDF files.
[no timestamp] 2024-11-08T16:19:51Z mail 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 866 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 - Version: 8.13.0
[no timestamp] 2024-11-08T16:19:51Z mail 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 866 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 - -------------------------------------------------------
[no timestamp] 2024-11-08T16:19:51Z mail 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 866 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 - [SYSTEM] modules: api chromium exiftool libreoffice libreoffice-api libreoffice-pdfengine logging pdfcpu pdfengines pdftk prometheus qpdf webhook
[no timestamp] 2024-11-08T16:19:51Z mail 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 866 03b9fdda-0a4b-4a05-a704-d2ca51f0d0e0 - [FATAL] provision module api: get routers: provision module chromium: get PDF engine provider: get module: provision module pdfengines: get PDF engines: provision module pdfcpu: PDFCPU_BIN_PATH environment variable is not set
Nov 08 17:19:51 - 2024-11-08 16:19:51,135 INFO exited: gotenberg (exit status 1; not expected)
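The [FATAL] entry is the relevant part: Gotenberg's pdfcpu module refuses to start because `PDFCPU_BIN_PATH` is not set in the environment the process runs in. When the log is long, a small filter can pull out every variable reported this way. The sketch below assumes the message wording stays as in the log above; the helper name `missing_env_vars` is hypothetical.

```python
import re

# Matches messages of the form "<NAME> environment variable is not set".
MISSING_ENV = re.compile(r"\b([A-Z][A-Z0-9_]*) environment variable is not set")


def missing_env_vars(log_text):
    """Return the variables reported as unset, in order, without duplicates."""
    names = []
    for name in MISSING_ENV.findall(log_text):
        if name not in names:
            names.append(name)
    return names


# The FATAL message from the log above, without the timestamp prefix.
sample = (
    "[FATAL] provision module api: get routers: provision module chromium: "
    "get PDF engine provider: get module: provision module pdfengines: "
    "get PDF engines: provision module pdfcpu: "
    "PDFCPU_BIN_PATH environment variable is not set"
)
print(missing_env_vars(sample))  # prints ['PDFCPU_BIN_PATH']
```

Gotenberg stops at the first missing variable, so after setting one it is worth re-running the filter on the fresh log in case another turns up.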