Docusaurus plugin for generating LLM-friendly documentation following the llmstxt.org standard
- Easy integration with Docusaurus
- Zero config required, works out of the box
- Highly customizable with multiple options
- Creates llms.txt with section links
- Produces llms-full.txt with all content in one file
- Document ordering control for custom sequence
- Path transformation to customize URL construction
- Option to include blog posts
- Custom LLM files for specific documentation sections
- Cleans HTML and normalizes content for optimal LLM consumption
- Optional import statement removal for cleaner MDX content
- Optional duplicate heading removal for concise output
- Provides statistics about generated documentation
- Installation
- Configuration Options
- Available Options
- Path Transformation Examples
- Document Ordering Examples
- Custom LLM Files
- Content Cleaning Options
- Best Practices
- How It Works
- Implementation Details
- Testing
- Future Enhancements
- License
## Installation

```bash
npm install docusaurus-plugin-llms --save-dev
```

Then add the plugin to your Docusaurus configuration:

```js
module.exports = {
  // ... your existing Docusaurus config
  plugins: [
    'docusaurus-plugin-llms',
    // ... your other plugins
  ],
};
```
## Configuration Options

You can configure the plugin by passing options:

```js
module.exports = {
  // ... your existing Docusaurus config
  plugins: [
    [
      'docusaurus-plugin-llms',
      {
        // Options here
        generateLLMsTxt: true,
        generateLLMsFullTxt: true,
        docsDir: 'docs',
        ignoreFiles: ['advanced/*', 'private/*'],
        title: 'My Project Documentation',
        description: 'Complete reference documentation for My Project',
        includeBlog: true,
        // Content cleaning options
        excludeImports: true,
        removeDuplicateHeadings: true,
        // Generate individual markdown files following llmstxt.org specification
        generateMarkdownFiles: true,
        // Control documentation order
        includeOrder: [
          'getting-started/*',
          'guides/*',
          'api/*',
        ],
        includeUnmatchedLast: true,
        // Path transformation options
        pathTransformation: {
          // Paths to ignore when constructing URLs (will be removed if found)
          ignorePaths: ['docs'],
          // Paths to add when constructing URLs (will be prepended if not already present)
          addPaths: ['api'],
        },
        // Custom LLM files for specific documentation sections
        customLLMFiles: [
          {
            filename: 'llms-python.txt',
            includePatterns: ['api/python/**/*.md', 'guides/python/*.md'],
            fullContent: true,
            title: 'Python API Documentation',
            description: 'Complete reference for Python API'
          },
          {
            filename: 'llms-tutorials.txt',
            includePatterns: ['tutorials/**/*.md'],
            fullContent: false,
            title: 'Tutorial Documentation',
            description: 'All tutorials in a single file'
          }
        ],
      },
    ],
    // ... your other plugins
  ],
};
```
### Available Options

| Option | Type | Default | Description |
|----------------------------------|----------|-------------------|---------------------------------------------------------------|
| `description` | string | Site tagline | Custom description to use in generated files |
| `docsDir` | string | `'docs'` | Base directory for documentation files |
| `excludeImports` | boolean | `false` | Remove import statements from generated content |
| `generateLLMsFullTxt` | boolean | `true` | Whether to generate the full content file |
| `generateLLMsTxt` | boolean | `true` | Whether to generate the links file |
| `ignoreFiles` | string[] | `[]` | Array of glob patterns for files to ignore |
| `includeBlog` | boolean | `false` | Whether to include blog content |
| `includeOrder` | string[] | `[]` | Array of glob patterns for files to process in specific order |
| `includeUnmatchedLast` | boolean | `true` | Whether to include unmatched files at the end |
| `llmsFullTxtFilename` | string | `'llms-full.txt'` | Custom filename for the full content file |
| `llmsTxtFilename` | string | `'llms.txt'` | Custom filename for the links file |
| `pathTransformation.addPaths` | string[] | `[]` | Path segments to add when constructing URLs |
| `pathTransformation.ignorePaths` | string[] | `[]` | Path segments to ignore when constructing URLs |
| `pathTransformation` | object | `undefined` | Path transformation options for URL construction |
| `removeDuplicateHeadings` | boolean | `false` | Remove redundant content that duplicates heading text |
| `title` | string | Site title | Custom title to use in generated files |
| `version` | string | `undefined` | Global version to include in all generated files |
| `customLLMFiles` | array | `[]` | Array of custom LLM file configurations |
| `generateMarkdownFiles` | boolean | `false` | Generate individual markdown files and link to them from llms.txt |
| `keepFrontMatter` | string[] | `[]` | Preserve selected front matter items when generating individual markdown files |
| `preserveDirectoryStructure` | boolean | `true` | Preserve full directory structure in generated markdown files (e.g., docs/server/config.md instead of server/config.md) |
| `processingBatchSize` | number | `100` | Batch size for processing documents to prevent out-of-memory errors on large sites |
| `rootContent` | string | (see below) | Custom content to include at the root level of llms.txt |
| `fullRootContent` | string | (see below) | Custom content to include at the root level of llms-full.txt |
| `logLevel` | string | `'normal'` | Logging level for plugin output: `'quiet'`, `'normal'`, or `'verbose'` |
The rootContent and fullRootContent options allow you to customize the introductory content that appears in your generated files, following the llmstxt.org standard which allows "zero or more markdown sections (e.g. paragraphs, lists, etc) of any type except headings" after the title and description.
#### Default Content
If not specified, the plugin uses these defaults:
- llms.txt: "This file contains links to documentation sections following the llmstxt.org standard."
- llms-full.txt: "This file contains all documentation content in a single document following the llmstxt.org standard."
#### Custom Content Examples
Example 1: Add project-specific context

```js
rootContent: `Welcome to the MyProject documentation.

This documentation covers:
- Installation and setup
- API reference
- Advanced usage guides
- Troubleshooting

For the latest updates, visit https://myproject.dev/changelog`
```
Example 2: Add technical specifications

```js
fullRootContent: `Complete offline documentation bundle for MyProject v2.0.

Format: Markdown with code examples
Languages: JavaScript, TypeScript, Python
Last Generated: ${new Date().toISOString()}

> Note: Some features require authentication tokens.
> See the Authentication section for details.`
```
Example 3: Add navigation hints for AI assistants

```js
rootContent: `This documentation is optimized for AI assistants and LLMs.

Quick navigation:
- For API endpoints, search for "API:"
- For code examples, search for "Example:"
- For configuration, search for "Config:"

All code examples are MIT licensed unless otherwise noted.`
```
#### Custom Root Content for Custom LLM Files
You can also specify root content for each custom LLM file:
```js
customLLMFiles: [
  {
    filename: 'llms-api.txt',
    includePatterns: ['api/**/*.md'],
    fullContent: true,
    title: 'API Documentation',
    rootContent: `Complete API reference for all REST endpoints.

Authentication required for all endpoints except /health.
Base URL: https://api.example.com/v2`
  }
]
```
The plugin includes batch processing to prevent out-of-memory errors when processing very large documentation sites. By default, documents are processed in batches of 100, but you can configure this using the processingBatchSize option.
When to adjust batch size:
- Large sites (1000+ documents): Reduce batch size (e.g., 50) to lower memory usage
- Small sites (< 100 documents): Default value is fine
- Memory-constrained environments: Reduce batch size to prevent OOM errors
- High-memory systems: Increase batch size (e.g., 200) for faster processing
Example configuration:
```js
module.exports = {
  plugins: [
    [
      'docusaurus-plugin-llms',
      {
        processingBatchSize: 50, // Process 50 documents at a time
        // ... other options
      },
    ],
  ],
};
```
How it works:
- Documents are processed in chunks of the specified batch size
- Each batch is processed sequentially to control memory usage
- Document order is preserved across batches
- Progress is logged when processing multiple batches (in verbose mode)
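
The batching behavior described above can be pictured with a small sketch. This is not the plugin's source: `processInBatches` and the `processDocument` callback are hypothetical names, and running the items of one batch concurrently is an assumption made for illustration.

```javascript
// Hypothetical sketch of sequential batch processing; not the plugin's code.
async function processInBatches(items, batchSize, processDocument) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const results = [];
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    // Assumption: items inside a batch run concurrently,
    // while batches themselves run one after another.
    const batchResults = await Promise.all(batch.map((item) => processDocument(item)));
    // Promise.all resolves in input order, so document order is preserved.
    results.push(...batchResults);
  }
  return results;
}
```

Only one batch of results is in flight at a time, which is why a smaller `processingBatchSize` would be expected to lower peak memory at the cost of more sequential steps.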
### Path Transformation Examples

The path transformation feature lets you control how URLs are constructed from file paths.

Example 1: Remove 'docs' from the URL path

```js
pathTransformation: {
  ignorePaths: ['docs'],
}
```

File path: `/content/docs/manual/decorators.md` → URL: `https://example.com/manual/decorators`

Example 2: Add 'api' to the URL path

```js
pathTransformation: {
  addPaths: ['api'],
}
```

File path: `/content/manual/decorators.md` → URL: `https://example.com/api/manual/decorators`

Example 3: Combine both transformations

```js
pathTransformation: {
  ignorePaths: ['docs'],
  addPaths: ['api'],
}
```

File path: `/content/docs/manual/decorators.md` → URL: `https://example.com/api/manual/decorators`
The configuration supports multiple path segments in both arrays.
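
As a rough illustration of the two transformations, the sketch below removes `ignorePaths` segments and then prepends any `addPaths` segments that are not already present. The function name `buildUrl` and the assumption that the input path is relative to the site root are both hypothetical; the plugin's real logic may differ.

```javascript
// Hypothetical helper illustrating ignorePaths/addPaths; not the plugin's API.
function buildUrl(siteUrl, relativePath, { ignorePaths = [], addPaths = [] } = {}) {
  // Drop the .md/.mdx extension and split the path into segments.
  let segments = relativePath
    .replace(/\.mdx?$/, '')
    .split('/')
    .filter(Boolean);

  // Remove every segment listed in ignorePaths.
  segments = segments.filter((segment) => !ignorePaths.includes(segment));

  // Prepend addPaths segments that are not already part of the path.
  const missing = addPaths.filter((segment) => !segments.includes(segment));
  segments = [...missing, ...segments];

  return `${siteUrl.replace(/\/+$/, '')}/${segments.join('/')}`;
}
```

Under these assumptions, `buildUrl('https://example.com', 'docs/manual/decorators.md', { ignorePaths: ['docs'], addPaths: ['api'] })` mirrors Example 3.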
### Document Ordering Examples

The document ordering feature lets you control the sequence in which files appear in the generated output.
#### Pattern Matching Behavior
Patterns in `includeOrder`, `ignoreFiles`, and `customLLMFiles.includePatterns` are matched against both site-relative and docs-relative paths for maximum flexibility:
- Site-relative path: The path relative to your site root (e.g., `docs/quickstart/file.md`)
- Docs-relative path: The path relative to your docs directory (e.g., `quickstart/file.md`)

This means both of these patterns will match the same file:

```js
includeOrder: [
  'docs/quickstart/*', // Matches site-relative path
  'quickstart/*'       // Matches docs-relative path (more intuitive!)
]
```
Recommended approach: Use docs-relative paths (without the docs/ prefix) as they are more intuitive and portable across different configurations.
Example 1: Basic Section Ordering

```js
includeOrder: [
  'getting-started/*', // Matches docs/getting-started/*.md
  'guides/*',          // Matches docs/guides/*.md
  'api/*',             // Matches docs/api/*.md
  'advanced/*'         // Matches docs/advanced/*.md
]
```

Result: Files will appear in the generated output following this section order.
Example 2: Strict Inclusion List

```js
includeOrder: [
  'public-docs/**/*.md' // Matches docs/public-docs/**/*.md
],
includeUnmatchedLast: false
```

Result: Only files matching `public-docs/**/*.md` are included; all others are excluded.
Example 3: Detailed Ordering with Specific Files First

```js
includeOrder: [
  'getting-started/installation.md', // Specific file first
  'getting-started/quick-start.md',  // Another specific file
  'getting-started/*.md',            // Rest of getting-started
  'api/core/*.md',                   // Core API docs
  'api/plugins/*.md',                // Plugin API docs
  'api/**/*.md'                      // All other API docs
]
```

Result: Installation and quick-start guides appear first, followed by other getting-started files, then API documentation in a specific order.
Example 4: Nested Directory Patterns

```js
includeOrder: [
  'tutorials/beginner/**/*',  // All beginner tutorials (deeply nested)
  'tutorials/intermediate/*', // Intermediate tutorials (one level)
  'tutorials/**/*'            // All other tutorials
]
```

Result: Beginner tutorials appear first (regardless of nesting depth), then intermediate, then everything else.
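
The ordering rules in these examples can be sketched as a first-match-wins pass over the patterns. Everything here is an assumption for illustration: `globToRegExp` is a deliberately tiny matcher that only understands `*` and `**`, `orderDocuments` is a hypothetical name, and the sketch matches docs-relative paths only, whereas the plugin is described as checking site-relative paths too.

```javascript
// Minimal glob matcher (assumption: only "*" and "**" are supported).
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  const pattern = escaped
    .replace(/\*\*\//g, '\u0000') // "**/" spans zero or more directories
    .replace(/\*\*/g, '\u0001')   // bare "**" spans anything
    .replace(/\*/g, '[^/]*')      // "*" stays inside one path segment
    .replace(/\u0000/g, '(?:.*/)?')
    .replace(/\u0001/g, '.*');
  return new RegExp(`^${pattern}$`);
}

// Hypothetical sketch of includeOrder / includeUnmatchedLast semantics.
function orderDocuments(paths, includeOrder, includeUnmatchedLast = true) {
  const remaining = new Set(paths); // keeps original order for leftovers
  const ordered = [];
  for (const glob of includeOrder) {
    const matcher = globToRegExp(glob);
    for (const docPath of paths) {
      // A file is placed by the first pattern that matches it.
      if (remaining.has(docPath) && matcher.test(docPath)) {
        ordered.push(docPath);
        remaining.delete(docPath);
      }
    }
  }
  return includeUnmatchedLast ? [...ordered, ...remaining] : ordered;
}
```

Because each file is claimed by the first matching pattern, listing a specific file before a broader glob (as in Example 3) is enough to pull it to the front.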
The plugin fully supports Docusaurus partials - reusable MDX content files that can be imported into other documents.
#### How It Works
1. Partial files (MDX files starting with underscore, e.g., _shared-config.mdx) are automatically excluded from the generated llms*.txt files
2. Import statements for partials are resolved and the content is inlined when processing documents
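
The exclusion rule in step 1 amounts to a filename check. The sketch below is an assumption about how such a check could look; `isPartial` and `listIndexableDocs` are hypothetical names, and treating `.md` files the same as `.mdx` files is also an assumption.

```javascript
const path = require('path');

// Hypothetical check: a partial is a Markdown/MDX file whose name starts with "_".
function isPartial(filePath) {
  const base = path.basename(filePath);
  return base.startsWith('_') && /\.mdx?$/.test(base);
}

// Keep only the documents that should appear in the generated llms*.txt files.
function listIndexableDocs(filePaths) {
  return filePaths.filter((filePath) => !isPartial(filePath));
}
```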
#### Example
Given a partial file `_api-config.mdx`:

````mdx
## API Configuration

Set your API endpoint:

```javascript
const API_URL = 'https://api.example.com';
```
````

And a document that imports it:

```mdx
---
title: Getting Started
---

import ApiConfig from './_api-config.mdx';

<ApiConfig />

Now you can make API calls...
```

The plugin will:
- Exclude `_api-config.mdx` from `llms.txt`
- Replace the import and the `<ApiConfig />` element with the actual content in the processed document
### Custom LLM Files

In addition to the standard `llms.txt` and `llms-full.txt` files, you can generate custom LLM-friendly files for different sections of your documentation with the `customLLMFiles` option:

```js
customLLMFiles: [
  {
    filename: 'llms-python.txt',
    includePatterns: ['api/python/**/*.md', 'guides/python/*.md'],
    fullContent: true,
    title: 'Python API Documentation',
    description: 'Complete reference for Python API'
  },
  {
    filename: 'llms-tutorials.txt',
    includePatterns: ['tutorials/**/*.md'],
    fullContent: false,
    title: 'Tutorial Documentation',
    description: 'All tutorials in a single file'
  }
]
```
#### Custom LLM File Configuration
Each custom LLM file is defined by an object with the following properties:
| Option | Type | Required | Description |
|-----------------------|----------|----------|----------------------------------------------|
| `filename` | string | Yes | Name of the output file (e.g., `'llms-python.txt'`) |
| `includePatterns` | string[] | Yes | Glob patterns for files to include |
| `fullContent` | boolean | Yes | `true` for full content like llms-full.txt, `false` for links only like llms.txt |
| `title` | string | No | Custom title for this file (defaults to site title) |
| `description` | string | No | Custom description for this file (defaults to site description) |
| `ignorePatterns` | string[] | No | Additional patterns to exclude (combined with global `ignoreFiles`) |
| `orderPatterns` | string[] | No | Order patterns for controlling file ordering (similar to `includeOrder`) |
| `includeUnmatchedLast` | boolean | No | Whether to include unmatched files last (default: `false`) |
| `version` | string | No | Version information for this LLM file (overrides global version) |
#### Use Cases
##### Language-Specific Documentation
Create separate files for different programming languages:
```js
customLLMFiles: [
  {
    filename: 'llms-python.txt',
    includePatterns: ['api/python/**/*.md', 'guides/python/*.md'],
    fullContent: true,
    title: 'Python API Documentation'
  },
  {
    filename: 'llms-javascript.txt',
    includePatterns: ['api/javascript/**/*.md', 'guides/javascript/*.md'],
    fullContent: true,
    title: 'JavaScript API Documentation'
  }
]
```
##### Content Type Separation
Separate tutorials from API reference:
```js
customLLMFiles: [
  {
    filename: 'llms-tutorials.txt',
    includePatterns: ['tutorials/**/*.md', 'guides/**/*.md'],
    fullContent: true,
    title: 'Tutorials and Guides'
  },
  {
    filename: 'llms-api.txt',
    includePatterns: ['api/**/*.md', 'reference/**/*.md'],
    fullContent: true,
    title: 'API Reference'
  }
]
```
##### Beginner-Friendly Documentation
Create a beginner-focused file with carefully ordered content:
```js
customLLMFiles: [
  {
    filename: 'llms-getting-started.txt',
    includePatterns: ['**/*.md'],
    ignorePatterns: ['advanced/**/*.md', 'internal/**/*.md'],
    orderPatterns: [
      'introduction.md',
      'getting-started/*.md',
      'tutorials/basic/*.md',
      'examples/simple/*.md'
    ],
    fullContent: true,
    title: 'Getting Started Guide',
    description: 'Beginner-friendly documentation with essential concepts'
  }
]
```
##### Versioned Documentation
Include version information in your documentation files:
```js
plugins: [
  [
    'docusaurus-plugin-llms',
    {
      // Global version applies to all files
      version: '2.0.0',
      // Custom LLM files with specific versions
      customLLMFiles: [
        {
          filename: 'api-reference.txt',
          title: 'API Reference Documentation',
          description: 'Complete API reference for developers',
          includePatterns: ['**/api/**/*.md', '**/reference/**/*.md'],
          fullContent: true,
          version: '1.0.0' // Overrides global version
        },
        {
          filename: 'tutorials.txt',
          title: 'Tutorials and Guides',
          description: 'Step-by-step tutorials and guides',
          includePatterns: ['**/tutorials/**/*.md', '**/guides/**/*.md'],
          fullContent: true,
          version: '0.9.5-beta' // Overrides global version
        }
      ]
    }
  ],
]
```
The generated files will include the version information under the description:
```
# API Reference Documentation

> Complete API reference for developers

Version: 1.0.0

This file contains all documentation content in a single document following the llmstxt.org standard.
```
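
For readers who want to see how such a header could be assembled, here is a minimal sketch. `resolveVersion` and `buildHeader` are hypothetical names, and the exact spacing the plugin emits is an assumption; only the ordering (title, description blockquote, version, root content) is taken from the sample output above.

```javascript
// Hypothetical: a per-file version wins over the global one.
function resolveVersion(fileConfig, globalOptions) {
  return fileConfig.version ?? globalOptions.version;
}

// Hypothetical: assemble title, description, optional version, and root content.
function buildHeader({ title, description, version, rootContent }) {
  const parts = [`# ${title}`, `> ${description}`];
  if (version) parts.push(`Version: ${version}`);
  if (rootContent) parts.push(rootContent);
  // Blank lines between parts are an assumption about the output layout.
  return `${parts.join('\n\n')}\n`;
}
```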
The plugin includes a configurable logging system that allows you to control the amount of output during the build process.
The plugin supports three logging levels:
- quiet: Suppresses all output except errors
- normal (default): Shows standard informational messages and warnings
- verbose: Shows detailed progress information including file-by-file processing
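
One way to picture the three levels is a logger that gates each message on a numeric threshold. This is only a sketch: `createLogger`, its method names, and the injectable `sink` are hypothetical, and the message prefix is copied from the sample output shown in this README rather than from the plugin's source.

```javascript
const LEVELS = { quiet: 0, normal: 1, verbose: 2 };

// Hypothetical level-gated logger; not the plugin's actual logging API.
function createLogger(logLevel = 'normal', sink = console) {
  if (!Object.prototype.hasOwnProperty.call(LEVELS, logLevel)) {
    throw new Error(`Unknown logLevel: ${logLevel}`);
  }
  const level = LEVELS[logLevel];
  const prefix = '[docusaurus-plugin-llms]';
  return {
    // Errors are always shown, even in quiet mode.
    error: (message) => sink.error(`${prefix} ERROR: ${message}`),
    // Warnings and standard messages need at least "normal".
    warn: (message) => { if (level >= LEVELS.normal) sink.warn(`${prefix} ${message}`); },
    info: (message) => { if (level >= LEVELS.normal) sink.log(`${prefix} ${message}`); },
    // File-by-file details are shown only in "verbose".
    verbose: (message) => { if (level >= LEVELS.verbose) sink.log(`${prefix} ${message}`); },
  };
}
```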
```js
module.exports = {
  plugins: [
    [
      'docusaurus-plugin-llms',
      {
        logLevel: 'verbose', // Options: 'quiet', 'normal', 'verbose'
        // Other configuration options...
      },
    ],
  ],
};
```
#### Quiet Mode (`logLevel: 'quiet'`)

Only errors are displayed. Use this for clean builds in CI/CD environments or when you don't need build feedback.

```js
{
  logLevel: 'quiet'
}
```

Output:

```
[docusaurus-plugin-llms] ERROR: Error generating LLM documentation: ...
```

#### Normal Mode (`logLevel: 'normal'`) - Default

Shows standard progress messages, warnings, and errors. This is the recommended setting for most users.

```js
{
  logLevel: 'normal' // or omit - this is the default
}
```

Output:

```
[docusaurus-plugin-llms] Generating LLM-friendly documentation...
[docusaurus-plugin-llms] Generating individual markdown files...
[docusaurus-plugin-llms] Generated: /path/to/llms.txt
[docusaurus-plugin-llms] Generated: /path/to/llms-full.txt
[docusaurus-plugin-llms] Stats: 42 total available documents processed
```

#### Verbose Mode (`logLevel: 'verbose'`)

Shows detailed information about every file being processed. Use this for debugging or when you need detailed feedback.

```js
{
  logLevel: 'verbose'
}
```

Output:

```
[docusaurus-plugin-llms] Generating LLM-friendly documentation...
[docusaurus-plugin-llms] Generating file: /path/to/llms.txt, version: undefined
[docusaurus-plugin-llms] Processed 42 documentation files for standard LLM files
[docusaurus-plugin-llms] Generating individual markdown files...
[docusaurus-plugin-llms] Generated markdown file: getting-started.md
[docusaurus-plugin-llms] Generated markdown file: api/reference.md
[docusaurus-plugin-llms] Generated: /path/to/llms.txt
[docusaurus-plugin-llms] Generated: /path/to/llms-full.txt
[docusaurus-plugin-llms] Stats: 42 total available documents processed
```
#### Development

Use normal or verbose mode during development to see what's being generated:

```js
{
  logLevel: 'verbose',
  generateMarkdownFiles: true
}
```

#### Production/CI

Use quiet mode in production builds or CI/CD to reduce log noise:

```js
{
  logLevel: 'quiet',
  generateLLMsTxt: true,
  generateLLMsFullTxt: true
}
```

#### Debugging

Use verbose mode when troubleshooting issues:

```js
{
  logLevel: 'verbose',
  excludeImports: true,
  removeDuplicateHeadings: true
}
```
## Content Cleaning Options

The plugin provides content cleaning options that optimize your documentation for LLM consumption by removing elements that clutter the output.
The excludeImports option removes JavaScript/TypeScript import statements from your MDX files, which are typically not useful for LLMs and can create noise in the generated documentation.
#### When to Use
- Your documentation uses MDX files with React components
- You have many import statements for UI components
- You want cleaner, more readable output for LLMs
#### Example
Before (with `excludeImports: false`):

```markdown
import ApiTabs from "@theme/ApiTabs";
import DiscriminatorTabs from "@theme/DiscriminatorTabs";
import MethodEndpoint from "@theme/ApiExplorer/MethodEndpoint";
import SecuritySchemes from "@theme/ApiExplorer/SecuritySchemes";
import MimeTabs from "@theme/MimeTabs";
import ParamsItem from "@theme/ParamsItem";

# Create User Account

This endpoint creates a new user account...
```

After (with `excludeImports: true`):

```markdown
# Create User Account

This endpoint creates a new user account...
```

#### Configuration

```js
{
  excludeImports: true, // Remove all import statements
}
```
The removeDuplicateHeadings option removes redundant content that simply repeats the heading text immediately after the heading, which is common in auto-generated API documentation.
#### When to Use
- Your documentation has redundant content that repeats heading text
- You have auto-generated API docs with minimal content
- You want to eliminate repetitive patterns for cleaner LLM consumption
#### Example
Before (with `removeDuplicateHeadings: false`):

```markdown
# Create Deliverable

Create Deliverable

---

# Update User Profile

Update User Profile

---
```

After (with `removeDuplicateHeadings: true`):

```markdown
# Create Deliverable

---

# Update User Profile

---
```

#### Configuration

```js
{
  removeDuplicateHeadings: true, // Remove redundant heading text
}
```
For optimal LLM-friendly output, you can combine both options:
```js
module.exports = {
  plugins: [
    [
      'docusaurus-plugin-llms',
      {
        // Enable both content cleaning options for optimal LLM output
        excludeImports: true,
        removeDuplicateHeadings: true,
        // Other configuration options...
        generateLLMsTxt: true,
        generateLLMsFullTxt: true,
        docsDir: 'docs',
      },
    ],
  ],
};
```
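
To make the two cleaning passes concrete, here is a simplified sketch of what each could do to a Markdown string. It is not the plugin's implementation: `removeImports` and `removeDuplicateHeadings` are hypothetical standalone functions, the import matcher handles single-line imports only, and neither function skips fenced code blocks.

```javascript
// Hypothetical: drop single-line ES import statements such as
//   import ApiTabs from "@theme/ApiTabs";
function removeImports(markdown) {
  const importLine = /^import\s+(?:.+\s+from\s+)?['"][^'"]+['"];?\s*$/;
  return markdown
    .split('\n')
    .filter((line) => !importLine.test(line))
    .join('\n');
}

// Hypothetical: drop the first paragraph line after a heading
// when it merely repeats the heading text.
function removeDuplicateHeadings(markdown) {
  const output = [];
  let pendingHeading = null; // heading text still waiting for its first content line
  for (const line of markdown.split('\n')) {
    const heading = /^#{1,6}\s+(.*\S)\s*$/.exec(line);
    if (heading) {
      pendingHeading = heading[1];
      output.push(line);
    } else if (line.trim() === '') {
      output.push(line); // blank lines do not reset the pending heading
    } else if (pendingHeading !== null && line.trim() === pendingHeading) {
      pendingHeading = null; // skip the duplicate line
    } else {
      pendingHeading = null;
      output.push(line);
    }
  }
  return output.join('\n');
}
```

Running both passes in sequence, for example `removeDuplicateHeadings(removeImports(source))`, corresponds to enabling both options in the configuration above.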
#### Minimal Cleanup (Default Behavior)

```js
{
  excludeImports: false,
  removeDuplicateHeadings: false
}
```

- Preserves all original content
- Suitable when you want to keep import statements for reference
- Good for documentation that doesn't have redundant patterns

#### Import Cleanup Only

```js
{
  excludeImports: true,
  removeDuplicateHeadings: false
}
```

- Removes import statements but keeps all content
- Good for MDX-heavy documentation sites
- Maintains content structure while removing technical imports

#### Full Cleanup (Recommended for LLMs)

```js
{
  excludeImports: true,
  removeDuplicateHeadings: true
}
```
- Maximum cleanup for LLM consumption
- Removes both imports and redundant content
- Recommended for API documentation and auto-generated content
- Produces the cleanest, most concise output
```js
{
  excludeImports: true,          // Remove React component imports
  removeDuplicateHeadings: true, // Remove redundant API endpoint descriptions
  generateLLMsFullTxt: true,     // Create comprehensive single file
}
```

```js
{
  excludeImports: true,           // Remove any MDX imports
  removeDuplicateHeadings: false, // Keep all content as written
  includeOrder: [                 // Organize content logically
    'getting-started/*',
    'tutorials/*',
    'advanced/*'
  ]
}
```

```js
{
  excludeImports: true,
  removeDuplicateHeadings: true,
  customLLMFiles: [
    {
      filename: 'llms-python.txt',
      includePatterns: ['**/python/**/*.md'],
      fullContent: true,
      title: 'Python Documentation'
    },
    {
      filename: 'llms-javascript.txt',
      includePatterns: ['**/javascript/**/*.md'],
      fullContent: true,
      title: 'JavaScript Documentation'
    }
  ]
}
```
Both options default to `false`, ensuring existing configurations continue to work without changes. Only users who explicitly enable these features will see the cleaned output.

## Markdown File Generation (`generateMarkdownFiles`)

The `generateMarkdownFiles` option makes the plugin generate an individual markdown file for each documentation page, following the llmstxt.org specification more closely. When enabled, it creates separate `.md` files for LLM consumption instead of linking to your original documentation pages.

Default behavior (`generateMarkdownFiles: false`):
- Generates `llms.txt` with links to your original documentation pages
- Example: Getting Started

With `generateMarkdownFiles: true`:
- Generates individual markdown files (e.g., `getting-started.md`, `api-reference.md`)
- Generates `llms.txt` with links to these generated markdown files
- Example: Getting Started

1. Standards Compliance: Follows the llmstxt.org specification by providing individual markdown files rather than linking to HTML pages
2. LLM Optimization: Generated files contain clean, processed markdown optimized for LLM consumption
3. Self-Contained: All necessary content is available in markdown format without requiring HTML parsing
4. Flexible Naming: Automatically generates readable filenames based on document titles
```js
module.exports = {
  plugins: [
    [
      'docusaurus-plugin-llms',
      {
        generateMarkdownFiles: true,   // Enable individual markdown file generation
        generateLLMsTxt: true,         // Generate index file linking to markdown files
        excludeImports: true,          // Clean up import statements
        removeDuplicateHeadings: true, // Remove redundant content
        // Other options work normally
        includeOrder: ['getting-started/*', 'guides/*', 'api/*'],
        pathTransformation: {
          ignorePaths: ['docs']
        }
      }
    ]
  ]
}
```
#### Preserving Directory Structure (`preserveDirectoryStructure`)

By default (`preserveDirectoryStructure: true`), generated markdown files keep the same directory structure as your HTML output, making them accessible at matching URL paths.

With `preserveDirectoryStructure: true` (default):

```
docs/server/config.md → build/docs/server/config.md
```

With `preserveDirectoryStructure: false`:

```
docs/server/config.md → build/server/config.md
```

This is particularly useful when you want markdown files to sit alongside HTML files in the build output, so that they can be served from the same URL path with a `.md` extension.

Example configuration:

```js
{
  generateMarkdownFiles: true,
  preserveDirectoryStructure: true, // Default: matches HTML output structure
  docsDir: 'docs'
}
```

With this configuration, if your HTML is at `https://yoursite.com/docs/server/config.html`, the markdown will be at `https://yoursite.com/docs/server/config.md`.
With `generateMarkdownFiles: true`, your output directory will contain:

```
build/
├── llms.txt                # Index file with links to generated markdown files
├── llms-full.txt           # Full content file (if enabled)
├── docs/                   # Preserves directory structure (default)
│   ├── getting-started.md
│   ├── api/
│   │   └── reference.md
│   └── server/
│       └── config.md
└── ...                     # Other generated markdown files
```

Or with `preserveDirectoryStructure: false`:

```
build/
├── llms.txt                # Index file with links to generated markdown files
├── llms-full.txt           # Full content file (if enabled)
├── getting-started.md      # Flat structure (old behavior)
├── api/
│   └── reference.md
└── server/
    └── config.md
```
The plugin generates readable filenames using this priority:

1. Document Title: Converted to kebab-case (e.g., "Getting Started" → `getting-started.md`)
2. URL Path: If title is unavailable, uses the document's URL path
3. Uniqueness: Automatically appends numbers for duplicate names (e.g., `getting-started-1.md`)
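
The priority list above can be approximated as follows. This is a sketch under stated assumptions: `toKebabCase` and `makeFilenameGenerator` are hypothetical names, the URL-path fallback is flattened into a single kebab-case name for simplicity, and the `untitled` fallback is invented for the case where neither title nor path is usable.

```javascript
// Hypothetical: lowercase, replace runs of non-alphanumerics with "-", trim dashes.
function toKebabCase(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}

// Hypothetical: returns a function that hands out unique .md filenames.
function makeFilenameGenerator() {
  const used = new Set();
  return function filenameFor({ title, urlPath } = {}) {
    // 1. title, 2. URL path, 3. assumed last-resort name.
    const base = toKebabCase(title || '') || toKebabCase(urlPath || '') || 'untitled';
    let candidate = base;
    let suffix = 0;
    // Append -1, -2, ... until the name is unused.
    while (used.has(candidate)) {
      suffix += 1;
      candidate = `${base}-${suffix}`;
    }
    used.add(candidate);
    return `${candidate}.md`;
  };
}
```

Keeping the set of used names inside a closure means one generator instance corresponds to one build, so uniqueness is tracked per run rather than globally.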
Generated markdown files include:

- Document title as H1 heading
- Document description as blockquote (following llmstxt.org format)
- Processed content with optional cleaning (import removal, duplicate heading removal)
- Proper markdown formatting optimized for LLM consumption
Input documentation about "API Authentication" would generate `api-authentication.md`:

````markdown
# API Authentication

> Learn how to authenticate with our API using various methods

## Overview

This guide covers all authentication methods supported by our API...

## API Key Authentication

Use your API key to authenticate requests:

```javascript
const client = new Client({ apiKey: 'your-key' });
```
````
#### Standards-Compliant Documentation

Perfect for projects that want to follow the llmstxt.org specification exactly:

```js
{
  generateMarkdownFiles: true,
  generateLLMsTxt: true,
  generateLLMsFullTxt: false // Optional: disable if only individual files are needed
}
```

#### LLM Training Data

Generate clean markdown files for LLM training or fine-tuning:

```js
{
  generateMarkdownFiles: true,
  excludeImports: true,
  removeDuplicateHeadings: true,
  customLLMFiles: [
    {
      filename: 'training-data.txt',
      includePatterns: ['**/*.md'],
      fullContent: true
    }
  ]
}
```

#### Multi-Format Output

Generate both original links and markdown files for different use cases:

```js
{
  generateLLMsTxt: true,                // Links to original pages
  generateMarkdownFiles: true,          // Also generate individual markdown files
  llmsTxtFilename: 'llms-original.txt', // Original links file
  // The markdown-linked version will be in llms.txt
}
```
- Fully backward compatible: Defaults to `false`, so existing configurations are unchanged
- Works with all existing options: Path transformations, custom LLM files, content cleaning
- Respects ordering: Generated files keep the order configured with `includeOrder`
- Custom LLM files: Also support markdown file generation when the global option is enabled

## How It Works
This plugin automatically generates the following files during the build process:
- llms.txt: Contains links to all sections of your documentation
- llms-full.txt: Contains all documentation content in a single file
- Custom LLM files: Additional files based on your custom configurations
These files follow the llmstxt standard, making your documentation optimized for use with Large Language Models (LLMs).
## Implementation Details

The plugin:

1. Scans your `docs` directory recursively for all Markdown files
2. Optionally includes blog content
3. Orders documents according to specified glob patterns (if provided)
4. Extracts metadata, titles, and content from each file
5. Creates proper URL links to each document section
6. Applies path transformations according to configuration (removing or adding path segments)
7. Generates a table of contents in llms.txt
8. Combines all documentation content in llms-full.txt
9. Creates custom LLM files based on specified configurations
10. Provides statistics about the generated documentation

## Testing

The plugin includes comprehensive tests in the `tests` directory:

- Unit tests: Test the path transformation functionality in isolation
- Integration tests: Simulate a Docusaurus build with various configurations

To run the tests:

```bash
# Run all tests
npm test

# Run just the unit tests
npm run test:unit

# Run just the integration tests
npm run test:integration
```

For more detailed testing instructions, see tests/TESTING.md.
## Future Enhancements

Planned features for future versions:
- Advanced glob pattern matching for file filtering
- Support for i18n content
- Specific content tags for LLM-only sections