A Node.js wrapper for the Tesseract OCR API

First, install Tesseract itself. Installation instructions for all platforms are available on the project site. On Debian/Ubuntu:
```bash
apt-get install tesseract-ocr
```
Once Tesseract is installed, install the npm package:
```bash
npm install node-tesseract-ocr
```
```js
const tesseract = require("node-tesseract-ocr")

const config = {
  lang: "eng", // recognition language
  oem: 1, // OCR engine mode: 1 = LSTM only
  psm: 3, // page segmentation mode: 3 = fully automatic, no OSD
}

tesseract
  .recognize("image.jpg", config)
  .then((text) => {
    console.log("Result:", text)
  })
  .catch((error) => {
    console.log(error.message)
  })
```
You can also pass a Buffer:
```js
const fs = require("fs")

// Read the image into memory and pass the Buffer instead of a path
const img = fs.readFileSync("image.jpg")

tesseract
  .recognize(img, config)
  .then((text) => {
    console.log("Result:", text)
  })
  .catch((error) => {
    console.log(error.message)
  })
```
Or a URL:
```js
const img = "https://tesseract.projectnaptha.com/img/eng_bw.png"

tesseract
  .recognize(img, config)
  .then((text) => {
    console.log("Result:", text)
  })
  .catch((error) => {
    console.log(error.message)
  })
```
To process multiple images in a single run, pass an array:
```js
const images = ["./test/samples/file1.png", "./test/samples/file2.png"]

tesseract
  .recognize(images, config)
  .then((text) => {
    console.log("Result:", text)
  })
  .catch((error) => {
    console.log(error.message)
  })
```
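If one unreadable file should not abort the whole batch, the images can instead be recognized one call at a time. The following is a minimal sketch of that idea. `recognizeEach` is a hypothetical helper, not part of node-tesseract-ocr, and it takes the recognize function as a parameter so that anything shaped like `tesseract.recognize(input, config)` can be plugged in.

```js
// Hypothetical helper (not part of node-tesseract-ocr): issues one recognition
// call per image and reports a per-file outcome instead of failing the batch.
// `recognize` is any function shaped like (input, config) => Promise<string>.
async function recognizeEach(images, config, recognize) {
  const settled = await Promise.allSettled(
    // The async wrapper turns synchronous throws into rejections as well
    images.map(async (image) => recognize(image, config))
  )
  return settled.map((outcome, i) =>
    outcome.status === "fulfilled"
      ? { image: images[i], text: outcome.value }
      : {
          image: images[i],
          error:
            outcome.reason instanceof Error
              ? outcome.reason.message
              : String(outcome.reason),
        }
  )
}

// Possible usage, assuming `tesseract` and `config` from the examples above:
// recognizeEach(images, config, (input, cfg) => tesseract.recognize(input, cfg))
//   .then((results) => console.log(results))
```

Note that this sketch starts all calls at once; for large batches it is probably worth limiting how many run concurrently.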
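The config object accepts any Tesseract OCR option. You can also pass any control parameter, or use ready-made config files (such as hocr) via `presets`:
```js
// `await` must run inside an async function in CommonJS modules
async function main() {
  const result = await tesseract.recognize("image.jpg", {
    load_system_dawg: 0, // control parameter
    tessedit_char_whitelist: "0123456789", // control parameter: digits only
    presets: ["tsv"], // ready-made config file: TSV output
  })
  console.log(result)
}

main().catch((error) => console.log(error.message))
```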
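To make the three kinds of settings more concrete, here is a rough sketch of how such a config object could map onto Tesseract command-line arguments. The flags themselves (`-l`, `--oem`, `--psm`, `--dpi`, `-c key=value`, and trailing config file names) are standard Tesseract CLI options, but `toTesseractArgs` and `OCR_OPTIONS` are hypothetical names used for illustration; the package's actual implementation may differ.

```js
// Illustrative only: a hypothetical mapping from a config object to Tesseract
// CLI arguments. This is not the package's own code.
const OCR_OPTIONS = ["tessdata-dir", "user-words", "user-patterns", "dpi", "oem", "psm"]

function toTesseractArgs(config) {
  const args = []
  for (const [key, value] of Object.entries(config)) {
    if (key === "presets") continue // config files are appended last
    if (key === "lang") {
      args.push("-l", String(value)) // language option
    } else if (OCR_OPTIONS.includes(key)) {
      args.push(`--${key}`, String(value)) // OCR option, e.g. --psm 3
    } else {
      args.push("-c", `${key}=${value}`) // anything else: control parameter
    }
  }
  // Config file names (e.g. "tsv", "hocr") come after all options
  return args.concat(config.presets || [])
}
```

Under these assumptions, `toTesseractArgs({ lang: "eng", psm: 3, tessedit_char_whitelist: "0123456789", presets: ["tsv"] })` yields `["-l", "eng", "--psm", "3", "-c", "tessedit_char_whitelist=0123456789", "tsv"]`.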