Convert PDF to Markdown, with TypeScript support.
`npm install pdf2md-ts`

Forked from @opendocsg/pdf2md.
[2024-3-2]
1. Add types to the package, since TypeScript needs types.
2. Return markdown page by page, as a `string[]`.
3. Remove CLI scripts.
```bash
npm install --save pdf2md-ts
# or
yarn add pdf2md-ts
```
ES5
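```js
const fs = require('fs')
const pdf2md = require('pdf2md-ts')

// Placeholder path; replace with your own PDF
const filePath = 'example.pdf'
const pdfBuffer = fs.readFileSync(filePath)

// Optional callbacks object (left empty here); the second argument can be omitted
const callbacks = {}

pdf2md(pdfBuffer, callbacks)
  .then(text => {
    // text is a string[] of markdown, one entry per page
    console.log(text.join('\n'))
  })
  .catch(err => {
    console.error(err)
  })
```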
ES6 & TS
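```ts
import { readFileSync } from 'fs'
import pdf2md from 'pdf2md-ts'

// Placeholder path; replace with your own PDF
const path = 'example.pdf'
const buffer = readFileSync(path)
const res = await pdf2md(buffer)
console.log(res) // string[], markdown by pages
```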
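Because the result is a `string[]` with one entry per page, you may want to merge it back into a single document while keeping the page boundaries visible. The following is a minimal sketch of one way to do that; `joinPages` is a hypothetical helper written for illustration and is not part of pdf2md-ts, and the HTML comment marker is just one possible convention.

```typescript
// Hypothetical helper (not part of pdf2md-ts): merges per-page markdown
// into one string, prefixing each page with an HTML comment marker.
function joinPages(pages: string[]): string {
  return pages
    .map((page, i) => `<!-- page ${i + 1} -->\n${page.trim()}`)
    .join('\n\n')
}

// Example with hand-written page strings standing in for real output
const merged = joinPages(['# Title\n', 'Second page text'])
console.log(merged)
```

With the ES6 & TS usage, this could be called as `joinPages(res)`.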
- @opendocsg/pdf2md - the project this repo is forked from
- pdf-to-markdown - original project by Johannes Zillmann
- pdf.js - Mozilla's PDF parsing & rendering platform which is used as a raw parser