Their Docker setup is so nice that I was able to get it up and running on a bare-metal server in five minutes. No joke.
Check out their demo site.
"pd3f is an Open-source PDF text extraction pipeline that is self-hosted, local-first and Docker-based.
pd3f reconstructs the original continuous text with the help of machine learning."