Showing 1-20 of 86 packages
An npm module wrapping the `pdftotext` utility
Extracts text from PDFs with pdftotext (Poppler)
A simple, lightweight React package to extract plain text from a PDF file.
pdftotext wrapper that generates JSON with bounding box data. Takes care of duplicate characters.
Use pdftotext stdin without specifying an output file.
A reliable Model Context Protocol server for PDF text extraction using pdftotext from poppler-utils
Poppler's `pdftotext` compiled to WebAssembly with Emscripten.
Extract text from PDFs that contain searchable text
A document recognition and extraction library that uses pdftotext and tesseract to perform document recognition and reading.
Includes the Windows executable for pdftotext and has a nicely typed interface for calling it from Node.js code.
Extract the text from PDF files
A reliable Model Context Protocol server for PDF text extraction using pdftotext from poppler-utils
A transformer stream wrapper around the pdftotext command-line tool.
Another simple Node.js wrapper for the popular `pdftotext` library.
PDF text extraction in TypeScript
Poppler's pdftotext for Firebase Functions
Extract the text from PDF files, plus other utilities
Node PDF is a set of tools that takes in PDF files and converts them to usable formats for data processing. The library supports both extracting text from searchable PDF files and performing OCR on PDFs that are just scanned images of text.
A simple, lightweight React package to extract plain text from a PDF file.
Extracts text from files of various types, including html, pdf, doc, docx, xls, xlsx, csv, pptx, png, jpg, gif, rtf, text/*, and various OpenOffice formats.