the best option for predicting a language
npm install which-langIn a search for the _best_ option for predicting a language from text which didn't require a large machine learning model, it appeared that fast-text, created by FaceBook, was the best option (https://towardsdatascience.com/benchmarking-language-detection-for-nlp-8250ea8b67c).
```
npm install which-lang
Note: This will install the fast-text model by facebook which is about 150MB. You also need python installed, if you're running an alipine docker see how to easily do this here
_Testing_
`js
import LanguageDetection from 'which-lang';
async function run(){
const lid = new LanguageDetection()
console.log(await lid.predict('FastText-LID provides a great language identification'))
console.log(await lid.predict('FastText-LID bietet eine hervorragende Sprachidentifikation'))
console.log(await lid.predict('FastText-LID fornisce un ottimo linguaggio di identificazione'))
console.log(await lid.predict('FastText-LID fournit une excellente identification de la langue'))
console.log(await lid.predict('FastText-LID proporciona una gran identificación de idioma'))
console.log(await lid.predict('FastText-LID обеспечивает отличную идентификацию языка'))
console.log(await lid.predict('这个case我想close.'))
console.log(await lid.predict('FastText-LID提供了很好的語言識別'))
}
run()
`
> The second argument is the number of returned responses, i.e. lid.predict(text, 10) will return an array of 10 results
_Output_
``
[ { lang: 'en', prob: 0.6313226222991943, isReliableLanguage: true } ]
[ { lang: 'de', prob: 0.9137917160987854, isReliableLanguage: true } ]
[ { lang: 'it', prob: 0.974501371383667, isReliableLanguage: true } ]
[ { lang: 'fr', prob: 0.7358829379081726, isReliableLanguage: true } ]
[ { lang: 'es', prob: 0.9211937189102173, isReliableLanguage: true } ]
[ { lang: 'ru', prob: 0.9899846911430359, isReliableLanguage: true } ]
[ { lang: 'zh', prob: 0.9437162280082703, isReliableLanguage: true } ]
[ { lang: 'zh', prob: 0.8515647649765015, isReliableLanguage: true } ]
> isReliableLanguage` is true if there were 10 + test results and accuracy was 95% or more