CLI for DocLD document processing - parse, extract, edit, and split documents from the command line
Parse, extract, and edit documents from the command line.

```bash
npm install -g docld-cli
```
Before using the CLI, authenticate by running:
```bash
docld login
```
This opens your browser to DocLD where you can securely authenticate. Alternatively, provide an API key directly:
```bash
docld login --key your_api_key_here
```
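In scripts and CI jobs, where no browser is available, it can help to fail early when no key has been supplied. The sketch below assumes the key is provided through the `DOCLD_API_KEY` environment variable listed under Environment Variables. `require_docld_key` is a hypothetical helper name, not part of the CLI, and it does not detect a session created by an interactive `docld login`.

```bash
# Hypothetical helper (not part of docld): stop early when no API key is set.
require_docld_key() {
  if [ -z "${DOCLD_API_KEY:-}" ]; then
    echo "DOCLD_API_KEY is not set; run 'docld login' or export the variable" >&2
    return 1
  fi
  return 0
}

# Usage (not run here): require_docld_key && docld parse document.pdf
```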
### Parse

Convert documents into structured markdown output.

```bash
# Parse a single file
docld parse document.pdf
```

Output: Creates `.parse.md` files containing:

- YAML frontmatter with job ID, page count, and studio link
- Structured markdown content
### Extract

Extract structured data from documents using JSON schemas.

```bash
# Extract with a schema file
docld extract invoice.pdf -s schemas/invoice.json

# Extract from multiple files
docld extract ./invoices -s schemas/invoice.json

# Include source citations
docld extract invoice.pdf -s schema.json --citations
```

Output: Creates `.extract.json` files containing the extracted data.

Extraction automatically reuses existing `.parse.md` files when available to speed up processing.
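To see which documents would benefit from that reuse, a script can check for parse output before running an extraction. This is a minimal sketch under one loud assumption: that the parse output sits next to the source file and replaces its extension with `.parse.md` (for example `report.pdf` becomes `report.parse.md`). The README does not state the exact naming rule, and both function names are hypothetical.

```bash
# Assumption: parse output lives beside the source file and swaps the
# extension for .parse.md (report.pdf -> report.parse.md).
parse_output_for() {
  printf '%s\n' "${1%.*}.parse.md"
}

# Print every PDF in a directory that has no parse output yet.
list_unparsed() {
  dir="$1"
  for f in "$dir"/*.pdf; do
    [ -e "$f" ] || continue   # the glob matched nothing
    [ -f "$(parse_output_for "$f")" ] || printf '%s\n' "$f"
  done
}

# Usage (not run here): list_unparsed ./invoices
```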
### Edit

Modify documents with natural language instructions.

```bash
# Fill a form
docld edit form.pdf -i "Fill the client name as 'Acme Corp' and date as 'January 15, 2024'"

# Edit multiple documents
docld edit ./contracts -i "Replace 'OLD COMPANY' with 'NEW COMPANY' throughout"
```

Output: Creates files with the modifications applied.

## Options
### Global Options

- `--help` - Show help for any command
- `--version` - Show CLI version
### Parse Options

| Flag | Description |
|------|-------------|
| `--agentic` | Enable AI enhancement for text, tables, and figures |
| `--change-tracking` | Enable change tracking for document revisions |
| `--hyperlinks` | Include hyperlinks in output |
| `--comments` | Include document comments in output |
| `--highlights` | Include highlighted text in output |
| `-o, --output` | Output directory |
### Extract Options

| Flag | Description |
|------|-------------|
| `-s, --schema` | Path to JSON schema file (required) |
| `--citations` | Include source citations in output |
| `-o, --output` | Output directory |
### Edit Options

| Flag | Description |
|------|-------------|
| `-i, --instructions` | Natural language editing instructions (required) |
| `-o, --output` | Output directory |

## Schema Format
Extraction schemas must be valid JSON Schema documents with `type: "object"`:

```json
{
  "type": "object",
  "properties": {
    "invoice_number": { "type": "string" },
    "total_amount": { "type": "number" },
    "line_items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "description": { "type": "string" },
          "quantity": { "type": "number" },
          "price": { "type": "number" }
        }
      }
    }
  },
  "required": ["invoice_number", "total_amount"]
}
```

## Supported File Types
| Format | Extensions |
|--------|------------|
| PDF | `.pdf` |
| Images | `.png`, `.jpg`, `.jpeg` |
| Office documents | `.doc`, `.docx`, `.ppt`, `.pptx` |
| Spreadsheets | `.xls`, `.xlsx` |

## Environment Variables
| Variable | Description |
|----------|-------------|
| `DOCLD_API_KEY` | API key (alternative to `docld login`) |
| `DOCLD_API_URL` | Custom API URL for self-hosted instances |

## Configuration
Configuration is stored in `~/.config/docld/config.json` (Linux/macOS) or `%APPDATA%/docld/config.json` (Windows).

## Examples
```bash
# Parse all invoices
docld parse ./invoices --agentic

# Extract data from each
docld extract ./invoices -s schemas/invoice.json

# Results are in .extract.json files
```
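Before pointing a command at a mixed folder, it may be useful to check which files fall under the Supported File Types table. The helper below is a sketch, not CLI behavior: `is_supported` is a hypothetical name, and treating extensions case-insensitively is an assumption the README does not confirm.

```bash
# Hypothetical helper: succeed only for extensions in the Supported File Types table.
is_supported() {
  # A name without a dot has no extension to check.
  case "$1" in
    *.*) ;;
    *) return 1 ;;
  esac
  # Lowercase the extension so REPORT.PDF is treated like report.pdf
  # (assumption: the README does not say whether matching is case-sensitive).
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    pdf|png|jpg|jpeg|doc|docx|ppt|pptx|xls|xlsx) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (not run here):
# for f in ./invoices/*; do is_supported "$f" || echo "skipping $f"; done
```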
```bash
# Update company name across all contracts
docld edit ./contracts -i "Replace 'Old Corp' with 'New Corp' throughout the document"
```
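When the old and new names come from variables, the nested quotes in the instruction string are easy to get wrong. One way to keep them straight is to build the instruction with `printf`. `replace_instruction` is a hypothetical helper used only for illustration, and the wording it produces simply mirrors the example above.

```bash
# Hypothetical helper: build a replace instruction from two values.
# The values are passed as printf arguments, so they are inserted literally.
replace_instruction() {
  printf "Replace '%s' with '%s' throughout the document\n" "$1" "$2"
}

# Usage (not run here):
# docld edit ./contracts -i "$(replace_instruction 'Old Corp' 'New Corp')"
```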
```bash
# Process entire document folder
docld parse ./documents -o ./parsed
docld extract ./documents -s schema.json -o ./extracted
```

## License

MIT