Parsely is a lightweight JavaScript library for parsing documents and web pages. It provides easy-to-use parsers for extracting and processing data from PDFs, Word documents, Excel spreadsheets, and web pages.
Installation
```bash
npm install "@bj.dev/parsely"
```
---
Usage
Importing the Parsers
```javascript
import { PDFParser, DocxParser, XlsxParser, WebParser } from "@bj.dev/parsely";
```
Parsing a PDF
```javascript
const pdfParser = new PDFParser("path/to/document.pdf");

(async () => {
  // parse() resolves to an object with metadata and text content
  const result = await pdfParser.parse();
  console.log(result);
})();
```
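The README does not say how `parse()` reports failures such as a missing file. The following is a minimal sketch, assuming the returned promise rejects on error. `safeParse` is a hypothetical helper, not part of Parsely, and the two stand-in parser objects exist only so the sketch runs without the library installed.

```javascript
// Hypothetical helper (not part of Parsely): wraps any object that exposes
// an async parse() method and returns a result object instead of throwing.
async function safeParse(parser) {
  try {
    const value = await parser.parse();
    return { ok: true, value };
  } catch (error) {
    return { ok: false, error };
  }
}

// Stand-in parsers for illustration only. In real code you would pass
// a Parsely instance, e.g. new PDFParser("path/to/document.pdf").
const workingParser = { parse: async () => "parsed content" };
const failingParser = {
  parse: async () => {
    throw new Error("file not found");
  },
};

(async () => {
  console.log(await safeParse(workingParser));
  console.log(await safeParse(failingParser));
})();
```

Returning an `{ ok, value }` or `{ ok, error }` object keeps the calling code free of nested `try`/`catch` blocks when several documents are parsed in sequence.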
Parsing a Word Document
```javascript
const docxParser = new DocxParser("path/to/document.docx");

(async () => {
  // parse() resolves to the raw text of the document
  const text = await docxParser.parse();
  console.log(text);
})();
```
Parsing an Excel Spreadsheet
```javascript
const xlsxParser = new XlsxParser("path/to/spreadsheet.xlsx");

(async () => {
  // parse() resolves to an array of sheet data
  const data = await xlsxParser.parse();
  console.log(data);
})();
```
Parsing a Web Page
```javascript
const webParser = new WebParser("https://example.com");

(async () => {
  const html = await webParser.parse();
  console.log(html);
})();
```
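Since all four parsers share the same constructor-plus-`parse()` shape, one way to choose between them is by inspecting the input string. The sketch below is an illustration only: `parserNameFor` is a hypothetical helper, not something Parsely provides, and the extension-to-class mapping is an assumption based on the file types shown in the examples above.

```javascript
// Hypothetical helper (not part of Parsely): returns the name of the parser
// class that matches an input string, or null when nothing matches.
function parserNameFor(input) {
  // Treat anything starting with http:// or https:// as a web page.
  if (/^https?:\/\//i.test(input)) return "WebParser";

  // Otherwise look at the file extension, case-insensitively.
  const match = /\.([a-z0-9]+)$/i.exec(input);
  const extension = match ? match[1].toLowerCase() : "";

  switch (extension) {
    case "pdf":
      return "PDFParser";
    case "docx":
      return "DocxParser";
    case "xlsx":
      return "XlsxParser";
    default:
      return null;
  }
}

console.log(parserNameFor("path/to/document.pdf"));
console.log(parserNameFor("https://example.com"));
console.log(parserNameFor("notes.txt"));
```

With the library imported, the returned name could index into an object such as `{ PDFParser, DocxParser, XlsxParser, WebParser }` to construct the matching parser.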
---
API Reference
PDFParser
- Constructor: `new PDFParser(filePath)`
  - `filePath` (String): Path to the PDF file.
- Method: `parse()`
  - Returns a Promise resolving to an object with metadata and text content.
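The exact shape of the resolved object is not documented here. As a rough sketch, assuming it exposes a `metadata` object and a `text` string (both property names are assumptions), a defensive consumer might look like the following. `describePdfResult` and the sample object are hypothetical and stand in for a real `parse()` result.

```javascript
// Hypothetical helper: summarizes a PDF parse result without assuming
// that either property is present. The property names "metadata" and
// "text" are assumptions, not confirmed by the library's documentation.
function describePdfResult(result) {
  const text = typeof result?.text === "string" ? result.text : "";
  const metadata = result?.metadata ?? {};
  const trimmed = text.trim();
  const wordCount = trimmed === "" ? 0 : trimmed.split(/\s+/).length;
  return { metadataKeys: Object.keys(metadata), wordCount };
}

// Made-up sample standing in for the value resolved by pdfParser.parse().
const samplePdfResult = {
  metadata: { title: "Example" },
  text: "Hello from a PDF",
};

console.log(describePdfResult(samplePdfResult));
```

Logging the real result once, as in the usage example, is the quickest way to confirm which properties are actually available before relying on them.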
DocxParser
- Constructor: `new DocxParser(filePath)`
  - `filePath` (String): Path to the DOCX file.
- Method: `parse()`
  - Returns a Promise resolving to the raw text of the document.
XlsxParser
- Constructor: `new XlsxParser(filePath)`
  - `filePath` (String): Path to the XLSX file.
- Method: `parse()`
  - Returns a Promise resolving to an array of sheet data.
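The structure of each entry in the resolved array is not specified. The sketch below assumes each entry is itself an array of rows, which may not match the actual output. `summarizeSheets` and the sample data are hypothetical and only illustrate how the array could be walked.

```javascript
// Hypothetical helper: reports the row count of each sheet.
// Assumes each entry is an array of rows; entries of any other
// shape are reported with a row count of 0.
function summarizeSheets(sheets) {
  if (!Array.isArray(sheets)) {
    throw new TypeError("expected an array of sheet data");
  }
  return sheets.map((sheet, index) => ({
    index,
    rowCount: Array.isArray(sheet) ? sheet.length : 0,
  }));
}

// Made-up sample standing in for the value resolved by xlsxParser.parse().
const sampleSheets = [
  [
    ["name", "qty"],
    ["apples", 3],
  ],
  [],
];

console.log(summarizeSheets(sampleSheets));
```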
WebParser
- Constructor: `new WebParser(webUrl)`
  - `webUrl` (String): URL of the web page to parse.
- Method: `parse()`