CLI html to plain text converter
Command line interface for the html-to-text Node.js package.
- almost all html-to-text options can be specified via command line arguments or json config (the only exception is functions such as custom formatters);
- a couple of presets for common use cases (human reading in terminal and machine indexing/search).
The changelog is available here: CHANGELOG.md
```
npm i -g @html-to/text-cli
```
- Old versions of the `html-to-text` package expose a command with the same name. Make sure that package is no longer installed globally.
- There is an old, abandoned CLI package that exposes a command with the same name and has nothing to do with the `html-to-text` package. Make sure to use only the namespaced package `@html-to/text-cli`.
- Use the `html-to-text` command (`html-to-text.cmd` in PowerShell);
- Pipe HTML to `stdin`;
- Get plain text from `stdout`;
- Pass converter options as command arguments.
```shell
> cat ./input.html | html-to-text [commands...] [keys and values...] > ./output.txt
```
In PowerShell:
```shell
PS> Get-Content .\input.html | html-to-text.cmd [commands...] [keys and values...] > .\output.txt
```
The `.ps1` wrapper installed by npm might not work with `stdin`, so use the `.cmd` wrapper instead.
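As a concrete illustration of the usage pattern above, here is a minimal sketch for a POSIX shell. The file names `sample.html` and `sample.txt` and the sample markup are hypothetical; the `--wordwrap=100` option is taken from the options examples in this README. The conversion step is guarded so the script still runs when the CLI is not installed.

```shell
# Create a small sample document (hypothetical file name and content).
printf '%s\n' '<h1>Hello</h1><p>Plain text, please.</p>' > ./sample.html

# Convert only when the CLI is actually on PATH; otherwise report and move on.
if command -v html-to-text >/dev/null 2>&1; then
  # HTML goes in through stdin, plain text comes out through stdout.
  cat ./sample.html | html-to-text --wordwrap=100 > ./sample.txt
else
  echo 'html-to-text is not installed; skipping conversion' >&2
fi
```

The guard is only there to keep the sketch safe to paste; in everyday use the pipeline line on its own is enough.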
| Command   | Alias | Argument        | Description
| --------- | ----- | --------------- | -----------
| `json`    | `-j`  | `<file_name>`   | Read options from the given JSON file.
| `preset`  | `-p`  | `<preset_name>` | Apply one of the presets listed below.
| `inspect` | `-i`  |                 | Pretty print the parsed options object and exit. Useful as a dry run to check how options are parsed.
| `unparse` | `-u`  |                 | Print the parsed options object back as an args string and exit. Can be used to check which arguments produce a result equivalent to a given JSON file.
| `help`    | `-h`  |                 | Print the help message and exit.
| `version` | `-v`  |                 | Print the version number and exit.
Note: short aliases cannot be merged.
| Preset    | Description
| --------- | -----------
| `human`   | Some options more suitable for human reading in terminal (ensure line length of 80 characters, format tables visually)
| `machine` | Some options more suitable for machine processing (no line length limit, format tables and cells as blocks)
Refer to html-to-text help output for brief syntax information.
Refer to aspargvs readme for more detailed information.
Note: PowerShell requires escaping quotes and curly braces.
All options that are representable in JSON format (that is, all except functions) can be specified via CLI arguments. Below are some examples.
| JSON | CLI
| --------------------- | ---
| `{ preserveNewlines: true }` | `--preserveNewlines`
| `{ wordwrap: 100 }` | `--wordwrap=100`
| `{ wordwrap: false }` | `--!wordwrap`
| `{ baseElements: { orderBy: 'occurrence' } }` | `--baseElements.orderBy=occurrence`
| `{ selectors: [ { selector: 'img', format: 'skip' } ] }` | `--selectors[] {} :selector=img :format=skip`
| `{ selectors: [ { selector: 'h1', options: { uppercase: false } }, { selector: 'h2', options: { uppercase: false } } ] }` | `--selectors[] {} :selector=h1 :!options.uppercase {} :selector=h2 :!options.uppercase`
| `{ selectors: [ { selector: 'table', format: 'dataTable', options: { uppercaseHeaderCells: false } } ] }` | `--selectors[] {} :selector=table :format=dataTable :options.uppercase-header-cells=false`
| `{ selectors: [ { selector: 'a', options: { linkBrackets: ['<', '>'] } } ] }` | `--selectors[] {} :selector=a :options.linkBrackets=['<','>']`
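Several of the CLI forms in the table contain characters that a POSIX shell may interpret itself (`!`, `[]`, `{}`, `<`, `>`, quotes). The sketch below does not call `html-to-text` at all; it uses a hypothetical helper, `show_args`, to print what a program would actually receive once the shell has processed the quoting. Which exact form the argument parser expects is not covered here; the `inspect` command described above is the way to check that.

```shell
# Hypothetical helper: print every argument on its own line,
# exactly as a program would receive it from the shell.
show_args() { printf '%s\n' "$@"; }

# Single quotes stop an interactive bash from treating `!` as history expansion.
show_args '--!wordwrap'

# Double quotes keep the inner single quotes and the angle brackets intact.
show_args '--selectors[]' '{}' ':selector=a' ":options.linkBrackets=['<','>']"
```

Swapping `show_args` for `html-to-text -i` would then show how the same arguments are parsed into an options object.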