A PowerPoint (PPTX) parser that extracts text content with preserved formatting
## Installation
```bash
npm install node-pptx-parser
```
## Usage
Once the package is installed, you can load it with either an `import` or a `require` statement:
```javascript
// ESM import:
import PptxParser from "node-pptx-parser";

// CommonJS require:
const PptxParser = require("node-pptx-parser").default;
```
### Basic Text Extraction
```typescript
import PptxParser from "node-pptx-parser";

async function main() {
  const parser = new PptxParser("presentation.pptx");
  try {
    // Extract text from all slides
    const textContent = await parser.extractText();
    // Print text from each slide
    textContent.forEach((slide) => {
      console.log(`\nSlide ${slide.id}:`);
      console.log(slide.text.join("\n"));
    });
  } catch (error) {
    // `error` is `unknown` under strict TypeScript, so narrow it before reading `.message`
    console.error("Error:", error instanceof Error ? error.message : error);
  }
}

main();
```
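Once the text is extracted, a common next step is to search it. The sketch below is not part of the library: `findSlidesContaining` is a hypothetical helper, and the sample data (including the `slide1`/`slide2` ids) is made up to stand in for the result of `await parser.extractText()`. It relies only on the `id` and `text` fields documented in this README.

```typescript
// Minimal shape of the objects returned by extractText(), per the types in this README.
interface SlideText {
  id: string;
  text: string[];
}

// Hypothetical helper: ids of slides whose text contains `term` (case-insensitive).
function findSlidesContaining(slides: SlideText[], term: string): string[] {
  const needle = term.toLowerCase();
  return slides
    .filter((slide) => slide.text.some((line) => line.toLowerCase().includes(needle)))
    .map((slide) => slide.id);
}

// Made-up sample data standing in for `await parser.extractText()`.
const sample: SlideText[] = [
  { id: "slide1", text: ["Quarterly Review", "Revenue grew"] },
  { id: "slide2", text: ["Roadmap"] },
];

console.log(findSlidesContaining(sample, "revenue")); // [ 'slide1' ]
```

In real use, the array returned by `extractText()` can be passed in place of `sample`.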
### Parsing the Full Presentation
```typescript
import PptxParser from "node-pptx-parser";

async function main() {
  const parser = new PptxParser("presentation.pptx");
  try {
    // Get complete parsed presentation content
    const parsedContent = await parser.parse();
    // Access presentation structure
    console.log(parsedContent.presentation.parsed);
    // Access individual slides
    parsedContent.slides.forEach((slide) => {
      console.log(`Slide ${slide.id}:`, slide.parsed);
    });
    // Access raw XML if needed
    console.log(parsedContent.presentation.xml);
  } catch (error) {
    // `error` is `unknown` under strict TypeScript, so narrow it before reading `.message`
    console.error("Error:", error instanceof Error ? error.message : error);
  }
}

main();
```
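The raw `xml` string is useful when the `parsed` object does not expose what you need. As a rough sketch, the hypothetical helper below (`countTextRuns` is not a library function) counts the `<a:t>` text elements that hold visible text in slide XML. The sample XML is hand-written for illustration, and a regular expression is only a quick approximation of real XML parsing.

```typescript
// Hypothetical helper: counts <a:t> text elements in a slide's raw XML string.
// The pattern accepts attributes (e.g. xml:space) but skips other tags such as <a:tab/>.
function countTextRuns(xml: string): number {
  const matches = xml.match(/<a:t(?:\s[^>]*)?>/g);
  return matches ? matches.length : 0;
}

// Hand-written fragment standing in for `slide.xml`.
const sampleXml =
  "<p:sp><p:txBody><a:p>" +
  "<a:r><a:t>Hello</a:t></a:r>" +
  '<a:r><a:t xml:space="preserve"> world</a:t></a:r>' +
  "</a:p></p:txBody></p:sp>";

console.log(countTextRuns(sampleXml)); // 2
```

With a parsed presentation, the same function could be applied to each `slide.xml` in `parsedContent.slides`.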
## API Reference
### PptxParser
The main class for parsing PPTX files.
#### Constructor
```typescript
constructor(filePath: string)
```
Creates a new instance of PptxParser.
- filePath: Path to the PPTX file to be parsed
#### Methods
##### parse()
```typescript
async parse(): Promise<ParsedPresentation>
```
Parses the entire PPTX file and returns its content.
- Returns: Promise resolving to a ParsedPresentation object containing the complete presentation structure
##### extractText()
```typescript
async extractText(): Promise<SlideTextContent[]>
```
Extracts formatted text content from all slides.
- Returns: Promise resolving to an array of SlideTextContent objects
### Types
#### ParsedPresentation
```typescript
interface ParsedPresentation {
  presentation: {
    path: string;
    xml: string;
    parsed: any;
  };
  relationships: {
    path: string;
    xml: string;
    parsed: any;
  };
  slides: ParsedSlide[];
}
```
#### ParsedSlide
```typescript
interface ParsedSlide {
  id: string;
  path: string;
  xml: string;
  parsed: any;
}
```
#### SlideTextContent
```typescript
interface SlideTextContent extends ParsedSlide {
  text: string[];
}
```
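To show how these types fit together, here is a small sketch that flattens an array of `SlideTextContent` objects into one plain-text string. The interfaces are copied from above so the snippet stands alone; `toPlainText`, the `# id` heading format, and the sample slide (including its `path`) are assumptions made for illustration, not library behavior.

```typescript
// Interfaces copied from this README so the snippet is self-contained.
interface ParsedSlide {
  id: string;
  path: string;
  xml: string;
  parsed: any;
}

interface SlideTextContent extends ParsedSlide {
  text: string[];
}

// Hypothetical helper: one "# <id>" heading per slide, followed by its non-empty lines.
function toPlainText(slides: SlideTextContent[]): string {
  return slides
    .map((slide) => {
      const lines = slide.text
        .map((line) => line.trim())
        .filter((line) => line.length > 0);
      return [`# ${slide.id}`, ...lines].join("\n");
    })
    .join("\n\n");
}

// Made-up slide standing in for one element of `await parser.extractText()`.
const slides: SlideTextContent[] = [
  { id: "slide1", path: "ppt/slides/slide1.xml", xml: "", parsed: {}, text: ["Title", "  ", "Body"] },
];

console.log(toPlainText(slides));
```

Because `SlideTextContent` extends `ParsedSlide`, the same objects still carry `xml` and `parsed` if you need more than the text.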
## Error Handling
The library throws errors in the following cases:
- Invalid PPTX file structure
- File reading errors
- XML parsing errors
Example error handling:
```typescript
try {
  const parser = new PptxParser("presentation.ppt");
  const content = await parser.extractText();
} catch (error) {
  // `error` is `unknown` under strict TypeScript, so narrow it before reading `.message`
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes("Invalid PPTX file structure")) {
    console.error("The PPTX file is corrupted or invalid");
  } else {
    console.error("An error occurred:", message);
  }
}
```
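If several call sites need the same handling, the message check can be pulled into one function. This is a sketch only: `describeParseError` is a hypothetical helper, and it assumes the error message contains the string "Invalid PPTX file structure", as in the example above. Matching on message text is fragile, so treat it as a convenience rather than a guarantee.

```typescript
// Hypothetical helper: turns any thrown value into a user-facing message.
function describeParseError(error: unknown): string {
  // Thrown values are not always Error instances, so fall back to String().
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes("Invalid PPTX file structure")) {
    return "The PPTX file is corrupted or invalid";
  }
  return `An error occurred: ${message}`;
}

// Simulated errors standing in for what the parser might throw.
console.log(describeParseError(new Error("Invalid PPTX file structure")));
// The PPTX file is corrupted or invalid
console.log(describeParseError("disk unavailable"));
// An error occurred: disk unavailable
```

Inside a `catch` block, this reduces the handler to `console.error(describeParseError(error))`.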