PDF to HTML or text conversion using Apache Tika. Also generates PDF thumbnails using Apache PDFBox.

pdf2html converts PDF files to HTML or text using Apache Tika. It can also generate a thumbnail image of a PDF file using Apache PDFBox.

via yarn:

```
yarn add pdf2html
```

via npm:

```
npm install --save pdf2html
```

A Java runtime environment (JRE) is required to run this module.
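Because the module depends on a JRE, it can help to confirm that `java` is reachable before calling any conversion function. The following is a minimal sketch, not part of pdf2html: `canRun` is a hypothetical helper name, and the check assumes `java` is expected on the `PATH`.

```javascript
const { spawnSync } = require('child_process')

// Hypothetical helper (not part of pdf2html): reports whether an executable
// can be started and exits successfully.
function canRun(command, args) {
    const result = spawnSync(command, args, { stdio: 'ignore' })
    // result.error is set (for example ENOENT) when the executable is not found
    return !result.error && result.status === 0
}

if (!canRun('java', ['-version'])) {
    console.error('No usable Java runtime found on PATH; pdf2html needs a JRE.')
}
```

Running this once at application startup gives a clearer message than a failed conversion later on.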
#### Convert to HTML
```javascript
const pdf2html = require('pdf2html')

pdf2html.html('sample.pdf', (err, html) => {
    if (err) {
        console.error('Conversion error: ' + err)
    } else {
        console.log(html)
    }
})
```
#### Convert to text
```javascript
pdf2html.text('sample.pdf', (err, text) => {
    if (err) {
        console.error('Conversion error: ' + err)
    } else {
        console.log(text)
    }
})
```
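The conversion functions shown here take a Node-style `(err, result)` callback, so they can be wrapped with `util.promisify` for use with `async`/`await`. The sketch below is an illustration rather than part of pdf2html: `convertAll` is a hypothetical helper, and it assumes the converter passed in does not rely on `this`.

```javascript
const { promisify } = require('util')

// Hypothetical helper (not part of pdf2html): converts several files in
// sequence using any Node-style converter, such as pdf2html.text or pdf2html.html.
async function convertAll(convert, files) {
    const convertAsync = promisify(convert)
    const results = []
    for (const file of files) {
        // Each conversion finishes before the next one starts
        results.push({ file, output: await convertAsync(file) })
    }
    return results
}

// Usage (assumes pdf2html is installed and the files exist):
// const pdf2html = require('pdf2html')
// convertAll(pdf2html.text, ['a.pdf', 'b.pdf']).then(console.log, console.error)
```

Converting sequentially keeps only one Java process running at a time, which is a conservative choice for large batches.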
#### Convert as pages
```javascript
pdf2html.pages('sample.pdf', (err, htmlPages) => {
    if (err) {
        console.error('Conversion error: ' + err)
    } else {
        console.log(htmlPages)
    }
})
```

```javascript
const options = { text: true }
pdf2html.pages('sample.pdf', options, (err, textPages) => {
    if (err) {
        console.error('Conversion error: ' + err)
    } else {
        console.log(textPages)
    }
})
```
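A common next step after a page-wise conversion is to save each page to its own file. The sketch below assumes the pages result is an array of strings, one entry per page, which this README does not state explicitly. `writePages` is a hypothetical helper, not a pdf2html function.

```javascript
const fs = require('fs')
const path = require('path')

// Hypothetical helper (not part of pdf2html): writes each page to
// <outDir>/page-<n>.<ext> and returns the paths it created.
// Assumes `pages` is an array of strings, one entry per page.
function writePages(pages, outDir, ext) {
    fs.mkdirSync(outDir, { recursive: true })
    return pages.map((content, index) => {
        const target = path.join(outDir, `page-${index + 1}.${ext}`)
        fs.writeFileSync(target, content)
        return target
    })
}

// Usage inside a pdf2html.pages callback (assumption: htmlPages is an array of strings):
// writePages(htmlPages, 'out', 'html')
```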
#### Extract metadata
```javascript
pdf2html.meta('sample.pdf', (err, meta) => {
    if (err) {
        console.error('Conversion error: ' + err)
    } else {
        console.log(meta)
    }
})
```
#### Generate thumbnail
```javascript
pdf2html.thumbnail('sample.pdf', (err, thumbnailPath) => {
    if (err) {
        console.error('Conversion error: ' + err)
    } else {
        console.log(thumbnailPath)
    }
})
```

```javascript
const options = { page: 1, imageType: 'png', width: 160, height: 226 }
pdf2html.thumbnail('sample.pdf', options, (err, thumbnailPath) => {
    if (err) {
        console.error('Conversion error: ' + err)
    } else {
        console.log(thumbnailPath)
    }
})
```
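The 160 × 226 size used in the thumbnail options is close to the 1 : √2 aspect ratio of ISO 216 paper sizes such as A4. For a different width, a matching height can be computed as in this sketch. `portraitHeight` is a hypothetical helper, and the calculation assumes a portrait A-series page; how pdf2html treats other aspect ratios is not covered here.

```javascript
// Hypothetical helper (not part of pdf2html): thumbnail height for a given
// width, assuming a portrait page with the ISO 216 ratio of 1 : sqrt(2).
function portraitHeight(width) {
    return Math.round(width * Math.SQRT2)
}

// Same option names as in the thumbnail example, with a larger width
const largeOptions = { page: 1, imageType: 'png', width: 320, height: portraitHeight(320) }
console.log(largeOptions.height) // 453
```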
#### Manually download dependencies
Downloading the dependencies during installation can be too slow, or can fail entirely behind an HTTP proxy. Follow the steps below to fetch them manually and skip the automatic download.
```bash
cd node_modules/pdf2html/vendor
# These URLs come from https://github.com/shebinleo/pdf2html/blob/master/postinstall.js#L6-L7
wget https://dlcdn.apache.org/pdfbox/2.0.26/pdfbox-app-2.0.26.jar
wget https://dlcdn.apache.org/tika/2.3.0/tika-app-2.3.0.jar
```
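After a manual download, a short script can confirm that both JAR files ended up in the vendor directory. This is a minimal sketch: `missingJars` is a hypothetical helper, the file names are taken from the `wget` commands in this section, and the vendor path assumes the script runs from the project root.

```javascript
const fs = require('fs')
const path = require('path')

// JAR names taken from the wget commands in this section; adjust if the versions change.
const EXPECTED_JARS = ['pdfbox-app-2.0.26.jar', 'tika-app-2.3.0.jar']

// Hypothetical helper (not part of pdf2html): lists expected JARs missing from vendorDir.
function missingJars(vendorDir, jarNames = EXPECTED_JARS) {
    return jarNames.filter((name) => !fs.existsSync(path.join(vendorDir, name)))
}

const missing = missingJars(path.join('node_modules', 'pdf2html', 'vendor'))
if (missing.length > 0) {
    console.error('Missing vendor JARs: ' + missing.join(', '))
}
```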