    Cloudron Forum


    FYI size of n-gram data sets

    LanguageTool
    • luckow (translator)

      EN is around 8 GB

      Pronouns: he/him | Primary language: German

      • nebulon (Staff)

        Good point, we will put that in the docs.

        • RazielKanos

          How can I add another language? Do I just set
          NGRAM_DATASET=("en,de")?

          • luckow (translator), replying to @RazielKanos

            @RazielKanos I first thought NGRAM_DATASET=("en;de") worked for me.
            Sorry, that is not true 🙂


            • vladimir.d, replying to @RazielKanos

              @RazielKanos said in FYI size of n-gram data sets:

              how can i add another language? do i just
              NGRAM_DATASET=("en,de")?

              Basically, it's a Bash array variable, so you should separate the values with whitespace.

              NGRAM_DATASET=("en" "de")
              

              I'm not a German speaker but I heard it works very well.
              Just wondering how it works with two languages.
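
To make the difference between the comma form and the whitespace form concrete, here is a minimal sketch in plain Bash. Only the NGRAM_DATASET name comes from this thread; the `wrong` variable and the loop are hypothetical illustration and are not taken from the Cloudron package script.

```shell
#!/usr/bin/env bash
# Hypothetical illustration; only NGRAM_DATASET appears in the thread.

# One element: inside the quotes the comma is just part of the string.
wrong=("en,de")

# Two elements: Bash separates array literal entries on whitespace.
NGRAM_DATASET=("en" "de")

echo "wrong has ${#wrong[@]} element(s)"
echo "NGRAM_DATASET has ${#NGRAM_DATASET[@]} element(s)"

# A script consuming the array would typically loop over it like this.
for lang in "${NGRAM_DATASET[@]}"; do
    echo "language code: ${lang}"
done
```

With the comma form, a loop like this would run once with the single value "en,de", which is presumably why that form does not select two languages.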

              • girish (Staff)

                The warning is now in the docs: https://docs.cloudron.io/apps/languagetool/#n-grams . Also, the way to install n-grams has changed slightly.

                • necrevistonnezr, replying to @luckow

                  @luckow said in FYI size of n-gram data sets:

                  EN is around 8 GB

                  This is the download size. Unpacked, it takes 14.34 GB of server space for English and 3.06 GB for German.
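
As a rough sketch, the unpacked sizes above can be added up and compared with the free space before installing both data sets. The variable names are hypothetical, the check runs against the current directory rather than the app's actual data directory, and decimal and binary megabytes are treated loosely, so take the result as an estimate only.

```shell
#!/usr/bin/env bash
# Unpacked sizes reported in this thread, written in MB (14.34 GB and 3.06 GB).
en_mb=14340
de_mb=3060
required_mb=$((en_mb + de_mb))

# Free space on the filesystem holding the current directory, in MB.
# df -Pk prints POSIX-format output in 1024-byte blocks; column 4 is "Available".
available_mb=$(df -Pk . | awk 'NR==2 {print int($4 / 1024)}')

echo "required: ${required_mb} MB, available: ${available_mb} MB"
if [ "${available_mb}" -lt "${required_mb}" ]; then
    echo "not enough free space for both data sets"
fi
```

English and German together come to about 17.4 GB unpacked, before counting the space the downloaded archives need.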

