tesseract 5 support
Solved
Stirling-PDF
-
Does the Cloudron instance support Tesseract 5? Stirling-PDF recently switched from Tesseract 4 to Tesseract 5 and no longer checks the 4.00 path for its OCR.
For the Stirling-PDF Docker image we did:
add-apt-repository ppa:alex-p/tesseract-ocr5 && apt install -y --no-install-recommends tesseract-ocr
https://github.com/Frooodle/Stirling-PDF/blob/main/DockerfileBase#L26
This is needed because the Ubuntu base image (Debian too) ships Tesseract 4.x.
-
https://git.cloudron.io/cloudron/stirling-pdf-app/-/commit/335acf606a060260314e17312f1a709a0a011817
# tesseract --version
tesseract 5.3.2
leptonica-1.82.0
libgif 5.1.9 : libjpeg 8d (libjpeg-turbo 2.1.1) : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11 : libwebp 1.2.2 : libopenjp2 2.4.0
Found AVX512BW
Found AVX512F
Found AVX2
Found AVX
Found FMA
Found SSE4.1
Found OpenMP 201511
Found libarchive 3.6.0 zlib/1.2.11 liblzma/5.2.5 bz2lib/1.0.8 liblz4/1.9.3 libzstd/1.4.8
Found libcurl/7.81.0 OpenSSL/3.0.2 zlib/1.2.11 brotli/1.0.9 zstd/1.4.8 libidn2/2.3.2 libpsl/0.21.0 (+libidn2/2.3.2) libssh/0.9.6/openssl/zlib nghttp2/1.43.0 librtmp/2.3 OpenLDAP/2.5.16
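If anyone wants to check the installed major version from a script rather than by eye, here is a minimal sketch. It assumes the first line of `tesseract --version` has the form `tesseract X.Y.Z`, as in the output in this thread. The helper name `tesseract_major` is made up for illustration and is not part of Tesseract or Stirling-PDF.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tesseract): print the major version
# found on the first line of `tesseract --version` output,
# e.g. "tesseract 5.3.2" -> "5". Prints nothing if the line does not match.
tesseract_major() {
    printf '%s\n' "$1" | sed -n '1s/^tesseract \([0-9][0-9]*\)\..*/\1/p'
}

# Only query the binary when it is actually installed.
if command -v tesseract >/dev/null 2>&1; then
    # Merge stderr into stdout in case the version banner is written there.
    major=$(tesseract_major "$(tesseract --version 2>&1)")
    if [ "$major" = "5" ]; then
        echo "Tesseract 5 detected"
    else
        echo "Tesseract major version: ${major:-unknown}"
    fi
fi
```

On a machine without Tesseract the script prints nothing. On the package above it should report Tesseract 5.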