PDF to Text/RTF fails
-
Error Internal Server Error:Command process failed with exit code 77. Error message: Warning: failed to launch javaldx - java may not function correctly (process:70): dconf-CRITICAL **: 08:57:54.827: unable to create directory '/home/cloudron/.cache/dconf': Permission denied. dconf will not work properly. LibreOffice 7.3 - Fatal Error: The application cannot be started. User installation could not be completed.
-
I made that error go away but it always produces empty text.
Running
soffice --infilter=writer_pdf_import --convert-to txt:Text --outdir /tmp/invoice Invoice.pdf
locally also doesn't work -
I tried a couple but I think maybe it needs text in the PDF for this feature to work (i.e this is not OCR).
-
@girish said in PDF to Text/RTF fails:
I think maybe it needs text in the PDF for this feature to work (i.e this is not OCR)
That's what I was going to say it could very likely be.
Edit: but I don't think that is the case. I just tried with a PDF that has text and got the same error:
java.io.IOException: Command process failed with exit code 77. Error message: javaldx failed! Warning: failed to read path from javaldx LibreOffice 7.3 - Fatal Error: The application cannot be started. User installation could not be completed. at stirling.software.SPDF.utils.ProcessExecutor.runCommandWithOutputHandling(ProcessExecutor.java:93) at stirling.software.SPDF.utils.PDFToFile.processPdfToOfficeFormat(PDFToFile.java:56) at stirling.software.SPDF.controller.api.converters.ConvertPDFToOffice.processPdfToRTForTXT(ConvertPDFToOffice.java:48) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:207) at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:152) at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:118) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:884) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:797) at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87) at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1081) at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:974) at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1011) at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:914) at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:590) at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:885) at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:658) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:205) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:51) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at stirling.software.SPDF.config.MetricsFilter.doFilterInternal(MetricsFilter.java:41) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at org.springframework.web.filter.ServerHttpObservationFilter.doFilterInternal(ServerHttpObservationFilter.java:109) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:166) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:90) at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:482) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:115) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:93) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74) at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:738) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:341) at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:390) at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:63) at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:894) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1741) at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:52) at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191) at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.base/java.lang.Thread.run(Thread.java:833)
-
@jdaviescoates the javaldx issue is sorted out in latest update atleast (I think).
-
@girish said in PDF to Text/RTF fails:
@jdaviescoates the javaldx issue is sorted out in latest update atleast (I think).
I still seem to get the exact same issue:
java.io.IOException: Command process failed with exit code 77. Error message: javaldx failed! Warning: failed to read path from javaldx LibreOffice 7.3 - Fatal Error: The application cannot be started. User installation could not be completed. at stirling.software.SPDF.utils.ProcessExecutor.runCommandWithOutputHandling(ProcessExecutor.java:93) at stirling.software.SPDF.utils.PDFToFile.processPdfToOfficeFormat(PDFToFile.java:56) at stirling.software.SPDF.controller.api.converters.ConvertPDFToOffice.processPdfToRTForTXT(ConvertPDFToOffice.java:48) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:207) at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:152) at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:118) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:884) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:797) at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87) at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1081) at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:974) at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1011) at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:914) at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:590) at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:885) at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:658) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:205) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:51) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at stirling.software.SPDF.config.MetricsFilter.doFilterInternal(MetricsFilter.java:41) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at org.springframework.web.filter.ServerHttpObservationFilter.doFilterInternal(ServerHttpObservationFilter.java:109) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:166) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:90) at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:482) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:115) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:93) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74) at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:738) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:341) at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:390) at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:63) at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:894) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1741) at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:52) at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191) at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.base/java.lang.Thread.run(Thread.java:833)
-
-
@girish There is also a problem when converting any file to PDF.
java.io.IOException: Command process failed with exit code 1. Error message: /usr/local/bin/unoconv:19: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives from distutils.version import LooseVersion unoconv: Cannot find a suitable office installation on your system. ERROR: Please locate your office installation and send your feedback to: http://github.com/dagwieers/unoconv/issues at stirling.software.SPDF.utils.ProcessExecutor.runCommandWithOutputHandling(ProcessExecutor.java:93) at stirling.software.SPDF.controller.api.converters.ConvertOfficeController.convertToPdf(ConvertOfficeController.java:42) at stirling.software.SPDF.controller.api.converters.ConvertOfficeController.processPdfWithOCR(ConvertOfficeController.java:74) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:207) at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:152) at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:118) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:884) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:797) at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87) at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1081) at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:974) at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1011) at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:914) at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:590) at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:885) at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:658) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:205) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:51) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at stirling.software.SPDF.config.MetricsFilter.doFilterInternal(MetricsFilter.java:41) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at org.springframework.web.filter.ServerHttpObservationFilter.doFilterInternal(ServerHttpObservationFilter.java:109) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:166) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:90) at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:482) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:115) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:93) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74) at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:738) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:341) at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:390) at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:63) at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:894) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1741) at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:52) at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191) at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.base/java.lang.Thread.run(Thread.java:833)
-
@archos I think this one is fixed with the latest update.
-
-
OCR should also work better now. It was missing a bunch of packages.
-
For me, it’s doing something but doesn’t let me access the file after conversion - nothing happens:
Jul 01 23:05:45 Running command: soffice --infilter=writer_pdf_import --convert-to doc --outdir /tmp/output_12731918207241714492 /tmp/input_12820457046940439487.pdfCommand output: Jul 01 23:05:45 convert /tmp/input_12820457046940439487.pdf -> /tmp/output_12731918207241714492/input_12820457046940439487.doc using filter : MS Word 97 Jul 01 23:06:04 Running command: soffice --infilter=writer_pdf_import --convert-to docx --outdir /tmp/output_13110194667904771198 /tmp/input_2013883566298576050.pdfCommand output: Jul 01 23:06:04 convert /tmp/input_2013883566298576050.pdf -> /tmp/output_13110194667904771198/input_2013883566298576050.docx using filter : MS Word 2007 XML Jul 01 23:06:35 Running command: soffice --infilter=writer_pdf_import --convert-to rtf --outdir /tmp/output_12759007429764915141 /tmp/input_5102818123428729636.pdfCommand output: Jul 01 23:06:35 convert /tmp/input_5102818123428729636.pdf -> /tmp/output_12759007429764915141/input_5102818123428729636.rtf using filter : Rich Text Format
-
@necrevistonnezr I noticed that for some of the conversions. The processing button would be disabled but the browser already downloaded the file under Downloads.
-
@archos please try with v1.1.0 package . Think I have fixed it for real now.
-
@LoudLemur said in PDF to Text/RTF fails:
I tried this yesterday, pdf-to-rtf, pdf-to-txt. Both failed, and created empty files instead.
Same.
@nebulon said in PDF to Text/RTF fails:
Were there any errors shown
No errors were shown on screen nor in the app logs.
@nebulon said in PDF to Text/RTF fails:
And just to be sure, have you tried other pdf documents?
Yes, I've tried numerous PDFs all with the same empty .txt file as the result.