A library-agnostic service for converting HTML content to Microsoft Word DocX documents. Works in both Angular frontend applications and Node.js backend environments.


- Library Agnostic: Accepts any DOM Document object, not tied to specific HTML parsers
- Node.js Compatible: Works in server environments using JSDOM
- Browser Compatible: Works in frontend applications using native DOMParser
- Comprehensive HTML Support: Handles formatting, lists, images, headers, and more
- Document Styling: Configurable fonts, sizes, and citation formats (APA, MLA, Chicago)
- Self-Contained: All dependencies are local to avoid circular imports
```bash
npm install @packback/html-to-docx
# or
yarn add @packback/html-to-docx
```
For detailed instructions on testing local changes in both frontend and backend environments, see LOCAL_DEVELOPMENT.md (this file is not published to NPM and is only available in the source repository).
Quick start for frontend:
- Uncomment the path mapping in frontend/questions-frontend/src/tsconfig.app.dev.json
- Restart your dev server
Quick start for backend:
```bash
cd /questions/backend/app-api
npm run link-local-html-to-docx
```
See the full guide for rebuild workflows, Docker setup, cleanup steps, and troubleshooting.
```typescript
import { HtmlToDocxService } from '@packback/html-to-docx';

// Sample input; the markup was stripped from the original example, so <p> is assumed
const htmlContent = '<p>Hello world!</p>';

// Convert an HTML string
const fromString = await HtmlToDocxService.convertHtmlToDocument({
  htmlContent,
  documentSettings: { font_family: 'open-sans', font_size: 12 }
});

// Convert a pre-parsed document (library agnostic)
const parser = new DOMParser();
const parsedDocument = parser.parseFromString(htmlContent, 'text/html');
const fromDocument = await HtmlToDocxService.convertHtmlToDocument({
  htmlContent: '', // Not used when document is provided
  document: parsedDocument,
  documentSettings: { font_family: 'open-sans', font_size: 12 }
});

// With references/bibliography page
const sources = [
  {
    citation: [
      { resolved: true, text: 'Smith, J.' },
      { resolved: true, text: ' (2023). ' },
      { resolved: true, text: 'Book Title', format: 'italic' },
      { resolved: true, text: '. Publisher.' }
    ],
    citation_format: 'apa'
  }
];
const withReferences = await HtmlToDocxService.convertHtmlToDocument({
  htmlContent,
  // The original example is cut off after htmlContent. Passing the array as
  // `sources` and selecting APA through format_style are assumptions.
  sources,
  documentSettings: { font_family: 'open-sans', font_size: 12, format_style: 'apa' }
});
```
```javascript
import { HtmlToDocxService } from '@packback/html-to-docx';
import { JSDOM } from 'jsdom';
import { Packer } from 'docx';
import fs from 'fs/promises';

// Make Node constants available globally
const jsdom = new JSDOM('');
global.Node = jsdom.window.Node;

async function convertHtml(htmlContent, outputPath) {
  // Parse HTML using JSDOM
  const jsdom = new JSDOM(htmlContent);
  const document = jsdom.window.document;

  // Convert to DocX
  const docxDocument = await HtmlToDocxService.convertHtmlToDocument({
    htmlContent: '',
    document,
    documentSettings: {
      font_family: 'times-new-roman',
      font_size: 11,
      format_style: 'mla'
    }
  });

  // Save to file
  const buffer = await Packer.toBuffer(docxDocument);
  await fs.writeFile(outputPath, buffer);
}
```
Every generated document includes metadata in its custom properties. This can be helpful for troubleshooting or tracking document generation parameters:
- Document title
- Font family and size
- Format style (APA, MLA, Chicago)
- Header/footer settings
- Preview mode status
- Title page presence
To view in Microsoft Word: File > Info > Properties > Advanced Properties > Custom tab
```typescript
const doc = await HtmlToDocxService.convertHtmlToDocument({
  htmlContent: '<p>Content</p>',
  documentSettings: { font_family: 'arial', font_size: 12 }
});
// doc.CustomProperties contains: fontFamily='arial', fontSize='12', etc.
```
Font families (font_family):
- arial - Arial
- open-sans - Open Sans (default)
- times-new-roman - Times New Roman

Font sizes (font_size):
- 10 - 10 point
- 11 - 11 point
- 12 - 12 point (default)

Format styles (format_style):
- apa - APA formatting
- chicago - Chicago style
- mla - MLA formatting
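
The option values above can be captured as types so that invalid settings fail early. The following is only a sketch: the `DocumentSettings` interface and the `normalizeSettings` helper are hypothetical names that are not part of the package, and the defaults simply mirror the ones listed above.

```typescript
// Hypothetical types mirroring the documented option values.
type FontFamily = 'arial' | 'open-sans' | 'times-new-roman';
type FontSize = 10 | 11 | 12;
type FormatStyle = 'apa' | 'chicago' | 'mla';

interface DocumentSettings {
  font_family: FontFamily;
  font_size: FontSize;
  format_style?: FormatStyle;
}

const FONT_FAMILIES: readonly string[] = ['arial', 'open-sans', 'times-new-roman'];
const FONT_SIZES: readonly number[] = [10, 11, 12];
const FORMAT_STYLES: readonly string[] = ['apa', 'chicago', 'mla'];

// Hypothetical helper: validates loose input and fills in the documented defaults
// (open-sans, 12 point). No default is applied to format_style.
function normalizeSettings(input: {
  font_family?: string;
  font_size?: number;
  format_style?: string;
}): DocumentSettings {
  const font_family = input.font_family ?? 'open-sans';
  const font_size = input.font_size ?? 12;
  if (!FONT_FAMILIES.includes(font_family)) {
    throw new Error(`Unsupported font_family: ${font_family}`);
  }
  if (!FONT_SIZES.includes(font_size)) {
    throw new Error(`Unsupported font_size: ${font_size}`);
  }
  if (input.format_style !== undefined && !FORMAT_STYLES.includes(input.format_style)) {
    throw new Error(`Unsupported format_style: ${input.format_style}`);
  }
  return {
    font_family: font_family as FontFamily,
    font_size: font_size as FontSize,
    ...(input.format_style !== undefined
      ? { format_style: input.format_style as FormatStyle }
      : {}),
  };
}
```

For example, `normalizeSettings({ font_family: 'arial' })` would yield Arial at the default 12 point size, while `normalizeSettings({ font_size: 13 })` throws.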
The package supports automatic generation of properly formatted references/bibliography pages based on citation data. When sources are provided, a references page is automatically appended to the document with appropriate formatting for the selected citation style.
- Automatic Page Break: A page break is inserted before the references section
- Style-Specific Formatting:
  - APA: "References" title (bold), double-spaced entries
  - MLA: "Works Cited" title, double-spaced entries
  - Chicago: "Bibliography" title, single-spaced within entries, double-spaced between
- Hanging Indentation: All entries use 0.5-inch hanging indentation
- Alphabetical Sorting: Sources are automatically sorted by first author/text
- Format Preservation: Italics and other formatting from citations are preserved
- Filtering: Only resolved citation pieces are included; placeholder text is omitted
Sources should be provided as an array of objects with:
- citation: Array of citation pieces (text, resolved status, optional format)
- citation_format: The citation style ('apa', 'mla', or 'chicago')
Only sources matching the document's format_style will be included in the references page.
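
The filtering and sorting rules described above can be illustrated with a short sketch. This is not the package's implementation: `citationText` and `prepareReferences` are hypothetical helpers, the type shapes are inferred from the examples in this README, and dropping sources with no resolved text is an assumption.

```typescript
// Shapes inferred from the examples in this README.
interface CitationPiece {
  resolved: boolean;
  text: string;
  format?: string;
}
type CitationFormat = 'apa' | 'mla' | 'chicago';
interface Source {
  citation: CitationPiece[];
  citation_format: CitationFormat;
}

// Section titles as documented for each style.
const REFERENCE_TITLES: Record<CitationFormat, string> = {
  apa: 'References',
  mla: 'Works Cited',
  chicago: 'Bibliography',
};

// Hypothetical helper: plain text of a source, built from resolved pieces only.
function citationText(source: Source): string {
  return source.citation
    .filter((piece) => piece.resolved)
    .map((piece) => piece.text)
    .join('');
}

// Hypothetical helper: keep sources matching the style, drop entries with no
// resolved text (an assumption), and sort the rest alphabetically.
function prepareReferences(sources: Source[], style: CitationFormat): string[] {
  return sources
    .filter((source) => source.citation_format === style)
    .map(citationText)
    .filter((text) => text.trim().length > 0)
    .sort((a, b) => a.localeCompare(b, 'en', { sensitivity: 'base' }));
}
```

Calling `prepareReferences(sources, 'apa')` returns only the APA entries in alphabetical order, and `REFERENCE_TITLES.apa` gives the matching section title.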

Supported HTML elements and classes:
- Bold: `<strong>`, `<b>`
- Italic: `<em>`, `<i>`
- Underline: `<u>`
- Subscript: `<sub>`
- Superscript: `<sup>`
- Paragraphs: `<p>`
- Headers: Custom Quill header classes (dd-title-header, dd-h1-header, etc.)
- Lists: `<ol>`, `<ul>` with data-list attributes
- Links: `<a>` with proper hyperlink styling
- Alignment: .ql-align-center, .ql-align-right, .ql-align-justify
- Indentation: .ql-indent-1 through .ql-indent-9
- Line Height: .ql-line-height-1, .ql-line-height-1-5, .ql-line-height-2
- Page Breaks: .page-break class
- Images: `<img>` elements with URL support
- Alt Text: Proper fallback handling for failed image loads

Command Line Interface
The package includes a CLI tool for converting HTML files to DOCX from the command line, useful for testing and integration with other systems (e.g., PHP applications).
The CLI reads files without validation. Never pass user-controlled input as file paths, as
attackers could read sensitive files. Always validate and sanitize paths before use
(restrict directories, validate extensions, block path traversal).
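
One way to apply the three checks named above before handing a path to the CLI is sketched here. `resolveSafePath` is a hypothetical helper, not something the package provides, and it does not resolve symlinks, so a hardened version would likely also compare real paths.

```typescript
import * as path from 'node:path';

// Hypothetical helper: resolves a user-supplied path inside an allowed base
// directory and rejects anything that leaves it or has an unexpected extension.
function resolveSafePath(
  baseDir: string,
  userPath: string,
  allowedExtensions: string[],
): string {
  const base = path.resolve(baseDir);
  const resolved = path.resolve(base, userPath);

  // Block path traversal: the resolved path must stay strictly under the base.
  const relative = path.relative(base, resolved);
  const escapes =
    relative === '' ||
    relative === '..' ||
    relative.startsWith('..' + path.sep) ||
    path.isAbsolute(relative);
  if (escapes) {
    throw new Error(`Path is not inside the allowed directory: ${userPath}`);
  }

  // Validate the extension against an allow-list (case-insensitive).
  const ext = path.extname(resolved).toLowerCase();
  if (!allowedExtensions.includes(ext)) {
    throw new Error(`Unsupported file extension: ${ext || '(none)'}`);
  }
  return resolved;
}
```

A caller might use `resolveSafePath('/srv/uploads', name, ['.html'])` for the input file and a second call with `['.docx']` for the output, passing only the returned values to the CLI.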
```bash
# Install globally
npm install -g @packback/html-to-docx

# Or build locally and use node directly (recommended for development)
cd packages/html-to-docx
yarn install && yarn build
```
```bash
# Using node directly (preserves quotes properly)
node dist/cli.js input.html output.docx

# With custom font and size
node dist/cli.js input.html output.docx --font arial --size 11

# With formatting style
node dist/cli.js input.html output.docx --style apa

# If installed globally
html-to-docx input.html output.docx --style apa
```
- --font - Font family: arial, open-sans, times-new-roman (default: open-sans)
- --size - Font size: 10, 11, 12 (default: 12)
- --style - Format style: apa, mla, chicago
- --header-title - Page header title
- --header-last-name - Page header last name
- --header-page-numbers - Include page numbers in header
- --footer - Footer text
- --sources - Path to JSON file containing sources for references/bibliography page
- -h, --help - Show help message

Note: All documents include metadata in custom properties (font, size, style, etc.), accessible via File > Info > Properties > Advanced Properties in Microsoft Word.
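
When the CLI is driven from another program, building the argument list as an array avoids shell quoting problems with multi-word values. The sketch below assumes a hypothetical `CliOptions` shape and `buildCliArgs` helper; only the flag names come from the option list above.

```typescript
// Hypothetical options object; each field maps to one documented CLI flag.
interface CliOptions {
  font?: 'arial' | 'open-sans' | 'times-new-roman';
  size?: 10 | 11 | 12;
  style?: 'apa' | 'mla' | 'chicago';
  headerTitle?: string;
  headerLastName?: string;
  headerPageNumbers?: boolean;
  footer?: string;
  sources?: string;
}

// Builds the argument array: positional input and output first, then flags.
function buildCliArgs(input: string, output: string, options: CliOptions = {}): string[] {
  const args: string[] = [input, output];
  if (options.font) args.push('--font', options.font);
  if (options.size) args.push('--size', String(options.size));
  if (options.style) args.push('--style', options.style);
  if (options.headerTitle) args.push('--header-title', options.headerTitle);
  if (options.headerLastName) args.push('--header-last-name', options.headerLastName);
  if (options.headerPageNumbers) args.push('--header-page-numbers');
  if (options.footer) args.push('--footer', options.footer);
  if (options.sources) args.push('--sources', options.sources);
  return args;
}
```

The resulting array can be passed to Node's `child_process.execFile`, for example as `execFile('node', ['dist/cli.js', ...args])`, which runs the command without an intermediate shell.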
Sample HTML files are provided in the `examples/` directory:

```bash
# Simple example with basic formatting
node dist/cli.js examples/simple-example.html output.docx

# Full Quill document with title page and MLA style
node dist/cli.js examples/sample-quill.html output.docx --font times-new-roman --size 12 --style mla

# With page header and numbers (APA style) - use quotes for multi-word values
node dist/cli.js examples/sample-quill.html output.docx \
  --style apa \
  --header-title 'The Baroque Period' \
  --header-last-name Koves \
  --header-page-numbers

# With custom footer
node dist/cli.js examples/simple-example.html output.docx \
  --footer 'Copyright 2025 - All Rights Reserved'

# With references/bibliography page from sources.json
node dist/cli.js examples/sample-quill.html output.docx \
  --style apa \
  --sources examples/sources.json
```

The `sources.json` file should contain an array of sources with citation data:

```json
[
  {
    "citation": [
      { "resolved": true, "text": "Smith, J." },
      { "resolved": true, "text": " (2023). " },
      { "resolved": true, "text": "Book Title", "format": "italic" },
      { "resolved": true, "text": ". Publisher." }
    ],
    "citation_format": "apa"
  }
]
```

With these sources and `--style apa`, the APA-formatted references page should appear as the last page of the output document.
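
A sources file can be checked for the expected shape before it is passed to `--sources`. The following sketch assumes the shape shown in the JSON example; `parseSources` and the type guards are hypothetical helpers, and the package may accept or reject inputs differently.

```typescript
// Shapes inferred from the sources.json example.
interface CitationPiece {
  resolved: boolean;
  text: string;
  format?: string;
}
interface Source {
  citation: CitationPiece[];
  citation_format: string;
}

const CITATION_FORMATS = ['apa', 'mla', 'chicago'];

function isCitationPiece(value: unknown): value is CitationPiece {
  if (typeof value !== 'object' || value === null) return false;
  const piece = value as Record<string, unknown>;
  return (
    typeof piece.resolved === 'boolean' &&
    typeof piece.text === 'string' &&
    (piece.format === undefined || typeof piece.format === 'string')
  );
}

function isSource(value: unknown): value is Source {
  if (typeof value !== 'object' || value === null) return false;
  const source = value as Record<string, unknown>;
  return (
    Array.isArray(source.citation) &&
    source.citation.every(isCitationPiece) &&
    typeof source.citation_format === 'string' &&
    CITATION_FORMATS.includes(source.citation_format)
  );
}

// Hypothetical helper: parses JSON text and throws if it is not an array of sources.
function parseSources(json: string): Source[] {
  const data: unknown = JSON.parse(json);
  if (!Array.isArray(data) || !data.every(isSource)) {
    throw new Error('sources.json must be an array of { citation, citation_format } objects');
  }
  return data as Source[];
}
```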
Node.js Compatibility
When running in Node.js environments:
1. Set `global.Node = jsdom.window.Node` to provide DOM constants
2. Use the document parameter instead of htmlContent
3. Import JSDOM for HTML parsing

Dependencies
- docx: DocX document generation
- Local utilities: Self-contained formatting and styling utilities
- DOM API: Browser DOMParser or Node.js JSDOM
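
Because the DOM API comes from either the browser or JSDOM, code shared between both environments may want to set the global `Node` only when it is missing. This is a sketch under that assumption: `ensureGlobalNode` and `WindowLike` are hypothetical names, and the argument is expected to be something like `new JSDOM('').window`.

```typescript
// Minimal structural type for the part of a JSDOM window used here.
interface WindowLike {
  Node: unknown;
}

// Hypothetical helper: copies the Node constructor onto the global object only
// when the environment does not already provide one (for example, a browser).
// Returns true if the global was set by this call.
function ensureGlobalNode(windowLike: WindowLike): boolean {
  const globals = globalThis as unknown as { Node?: unknown };
  if (typeof globals.Node !== 'undefined') {
    return false;
  }
  globals.Node = windowLike.Node;
  return true;
}
```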
Development
```bash
# Install dependencies
yarn install

# Build the package
yarn build

# Run tests
yarn test

# Run tests in watch mode
yarn test:watch

# Lint code
yarn lint
```
The package includes comprehensive tests that run in both browser and Node.js environments using Jest with JSDOM.
Run tests:
```bash
yarn test # Run all tests
yarn test:watch # Run tests in watch mode
yarn test:coverage # Run tests with coverage report
yarn test -- --testPathPattern=filename # Run a specific test file
```

Code Coverage
After running `yarn test:coverage`, open `coverage/index.html` in your browser for a detailed interactive coverage report.

The TypeScript source is compiled to CommonJS format in the `dist/` directory with type definitions.

License: MIT - see LICENSE file for details.