FYI size of n-gram data sets
-
EN is around 8 GB
-
How can I add another language? Do I just set
NGRAM_DATASET=("en,de")? -
@RazielKanos NGRAM_DATASET=("en;de")
works for me.
Sorry, that's not true. -
@RazielKanos said in FYI size of n-gram data sets:
how can i add another language? do i just
NGRAM_DATASET=("en,de")?
Basically, it's a Bash array variable, so you should separate the values with whitespace:
NGRAM_DATASET=("en" "de")
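Here is a quick sketch of why the whitespace matters, assuming plain Bash. COMMA_VARIANT is a made-up name used only for comparison; it is not something the app reads.

```bash
#!/usr/bin/env bash

# Hypothetical comparison variable: one quoted string, so the array
# holds a single element that happens to contain a comma.
COMMA_VARIANT=("en,de")

# Whitespace-separated values: each quoted string is its own element.
NGRAM_DATASET=("en" "de")

# ${#array[@]} expands to the number of elements in the array.
echo "comma variant elements: ${#COMMA_VARIANT[@]}"
echo "whitespace variant elements: ${#NGRAM_DATASET[@]}"

# Looping over the array visits each language code separately.
for lang in "${NGRAM_DATASET[@]}"; do
  echo "language: ${lang}"
done
```

The comma version ends up as one element, en,de, while the whitespace version holds en and de as separate elements, which is what a loop over the array needs.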
I'm not a German speaker but I heard it works very well.
Just wondering how it works with two languages. -
@luckow said in FYI size of n-gram data sets:
EN is around 8 GB
This is the download size. Unpacked, it takes 14.34 GB of server space for English and 3.06 GB for German.