An n8n node for extracting metadata from files (PDF, images, ebooks, archives, office documents, audio, video, Markdown), with namespace support for Qdrant filtering.
- Extract metadata from PDF files
- Extract EXIF data from images
- Process ebooks (EPUB)
- Extract archive information (ZIP)
- Parse Word documents
- Read Excel spreadsheets
- Get audio metadata
- Extract video information
- Parse markdown frontmatter
- NEW: Automatic namespace generation for Qdrant vector store filtering
This version automatically adds a namespace field to the metadata based on the document title:
- Extracts `title` or `info.Title` from the document metadata
- Sanitizes the title to create a valid namespace (alphanumeric characters and underscores only)
- Limits namespace length to 96 characters for Qdrant compatibility
- Enables filtering with queries like:
```json
{
  "must": [
    {
      "key": "metadata.namespace",
      "match": {
        "value": "Writing_521A_Creative_Writing"
      }
    }
  ]
}
```
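The namespace rules listed above (title lookup, sanitizing, 96-character cap) can be sketched roughly as follows. This is an illustration only, not the node's actual source: `toNamespace`, `DocumentMetadata`, and `MAX_NAMESPACE_LENGTH` are hypothetical names, and the exact replacement rules (collapsing runs of other characters into a single underscore) are an assumption inferred from the example value `Writing_521A_Creative_Writing`.

```typescript
// Hypothetical sketch; the node's real implementation may differ.
const MAX_NAMESPACE_LENGTH = 96; // length cap stated in this README

interface DocumentMetadata {
  title?: string;
  info?: { Title?: string };
}

function toNamespace(metadata: DocumentMetadata): string | undefined {
  // Prefer `title`, fall back to `info.Title` (also when `title` is empty).
  const title = metadata.title || metadata.info?.Title;
  if (!title) return undefined;

  const sanitized = title
    .replace(/[^A-Za-z0-9]+/g, "_") // assumption: runs of other characters become one underscore
    .slice(0, MAX_NAMESPACE_LENGTH) // enforce the 96-character limit
    .replace(/^_+|_+$/g, ""); // drop leading and trailing underscores

  return sanitized.length > 0 ? sanitized : undefined;
}

console.log(toNamespace({ title: "Writing 521A Creative Writing" }));
```

Truncating before trimming keeps the result from ending in an underscore when the cut falls on a separator.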
```bash
npm install n8n-nodes-file-metadata@1.0.1
```
1. Add the File Metadata Extractor node to your workflow
2. Connect it to your document source
3. Configure the binary property name (default: 'data')
4. The node will output metadata including the new namespace field
5. Use the namespace for filtering in Qdrant vector stores
```json
{
  "title": "Writing 521A Creative Writing",
  "author": "John Doe",
  "fileType": "PDF",
  "numberOfPages": 25,
  "namespace": "Writing_521A_Creative_Writing",
  "creationDate": "2024-01-15T10:30:00.000Z"
}
```
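To connect the node output to the Qdrant filter shown earlier, a downstream step could build the filter from the `namespace` field. The sketch below is a minimal illustration: `namespaceFilter` and `NamespaceFilter` are hypothetical names, and only the filter shape and the `metadata.namespace` key come from this README. The resulting object would typically be passed as the `filter` field of a Qdrant search or scroll request.

```typescript
// Hypothetical helper; only the filter shape is taken from this README.
interface NamespaceFilter {
  must: Array<{ key: string; match: { value: string } }>;
}

function namespaceFilter(namespace: string): NamespaceFilter {
  return {
    must: [{ key: "metadata.namespace", match: { value: namespace } }],
  };
}

// Sample node output, reduced to the fields used here.
const nodeOutput = {
  title: "Writing 521A Creative Writing",
  namespace: "Writing_521A_Creative_Writing",
};

const filter = namespaceFilter(nodeOutput.namespace);
console.log(JSON.stringify(filter, null, 2));
```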
MIT