PaddleOCR Skills - Install PP-OCRv5 and PaddleOCR-VL 1.5 for Claude Code
```bash
npx paddleocr-skills
```
The installer will:
1. Prompt you to select skills (PP-OCRv5 and/or PaddleOCR-VL)
2. Copy skill files to your project
3. Install Python dependencies
4. Guide you through API configuration
5. Verify the installation
What's Included
PP-OCRv5
- Fast text recognition for images and documents
- Adaptive quality modes (auto/fast/quality)
- Supports URLs and local files
- Confidence scoring and quality metrics
PaddleOCR-VL 1.5
- Advanced document structure analysis
- Table, formula, and chart recognition
- Layout detection (headers, footers, page numbers)
- Complete document parsing with reading order
Prerequisites
- Node.js: 14.0.0 or higher
- Python: 3.7 or higher
- Claude Code: Installed and configured
- API Access: Get your API credentials at Baidu AI Studio
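The version requirements above can be checked before running the installer. The following is a minimal sketch, not part of the package: the helper names (`parse_version`, `node_version`, `check_prerequisites`) are hypothetical, and it assumes `node --version` prints a string such as `v18.17.1`.

```python
import re
import shutil
import subprocess
import sys

MIN_NODE = (14, 0, 0)  # from the Prerequisites list
MIN_PYTHON = (3, 7)    # from the Prerequisites list


def parse_version(text):
    """Extract the first dotted version number from text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def node_version():
    """Return the Node.js version tuple, or None if node is missing or fails to run."""
    node = shutil.which("node")
    if node is None:
        return None
    try:
        output = subprocess.check_output([node, "--version"], universal_newlines=True)
        return parse_version(output)
    except (OSError, subprocess.CalledProcessError, ValueError):
        return None


def check_prerequisites():
    """Return a list of problems; an empty list means both version checks passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python 3.7+ required, found " + sys.version.split()[0])
    found = node_version()
    if found is None:
        problems.append("Node.js not found on PATH (or its version could not be read)")
    elif found < MIN_NODE:
        problems.append("Node.js 14.0.0+ required, found " + ".".join(map(str, found)))
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Problem:", problem)
```

This sketch does not check Claude Code or API access, which have to be verified separately.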
Installation
Interactive Installation
```bash
npx paddleocr-skills
```
The installer will guide you through:
- Skill selection
- Python dependency installation
- API configuration
Configure Later
If you want to configure later, run the installer and decline the configuration step:
```bash
npx paddleocr-skills
# Choose "No" when asked about configuration
```
Then configure manually:
```bash
# For PP-OCRv5
python scripts/ppocrv5/configure.py

# For PaddleOCR-VL
python scripts/paddleocr-vl-1.5/configure.py
```
Usage
After installation, use the skills in your Claude Code session:
PP-OCRv5
```bash
# Extract text from an image
python scripts/ppocrv5/ocr_caller.py --file-url "https://example.com/image.jpg" --pretty

# Save result to file
python scripts/ppocrv5/ocr_caller.py --file-path "document.pdf" --output result.json --pretty
```
PaddleOCR-VL
```bash
# Parse a complex document
python scripts/paddleocr-vl-1.5/vl_caller.py --file-url "https://example.com/paper.pdf" --pretty

# Save result to file
python scripts/paddleocr-vl-1.5/vl_caller.py --file-path "invoice.pdf" --output result.json --pretty
```
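If you want to call the scripts from your own Python code rather than the shell, the commands above can be assembled programmatically. This is a rough sketch that uses only the flags shown in the examples (`--file-url`, `--file-path`, `--output`, `--pretty`). The helper names `build_command` and `run_to_json` are hypothetical, and the sketch assumes `--output` writes a JSON file, as the `result.json` name suggests. The structure of that JSON is not described here, so it is returned as loaded.

```python
import json
import subprocess
import sys
from pathlib import Path

# Script locations as listed in this README, relative to the project root.
OCR_SCRIPT = Path("scripts/ppocrv5/ocr_caller.py")
VL_SCRIPT = Path("scripts/paddleocr-vl-1.5/vl_caller.py")


def build_command(script, source, output=None):
    """Build the argument list for one caller script.

    Sources starting with http:// or https:// are passed as --file-url,
    everything else as --file-path.
    """
    is_url = source.startswith(("http://", "https://"))
    flag = "--file-url" if is_url else "--file-path"
    command = [sys.executable, str(script), flag, source, "--pretty"]
    if output is not None:
        command += ["--output", str(output)]
    return command


def run_to_json(script, source, output):
    """Run a caller script, then load the file it wrote (assumed to be JSON)."""
    subprocess.run(build_command(script, source, output), check=True)
    with open(output, encoding="utf-8") as handle:
        return json.load(handle)
```

For example, `run_to_json(VL_SCRIPT, "invoice.pdf", "result.json")` would mirror the last shell command above, provided it is run from the project root with credentials already configured.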
Project Structure
After installation, your project will have:
```
your-project/
├── skills/
│   ├── ppocrv5/
│   │   └── SKILL.md
│   └── paddleocr-vl-1.5/
│       └── SKILL.md
├── scripts/
│   ├── ppocrv5/
│   │   ├── ocr_caller.py
│   │   ├── configure.py
│   │   ├── smoke_test.py
│   │   └── requirements.txt
│   └── paddleocr-vl-1.5/
│       ├── vl_caller.py
│       ├── configure.py
│       ├── smoke_test.py
│       └── requirements.txt
├── references/
│   ├── ppocrv5/
│   └── paddleocr-vl-1.5/
└── .env.example
```
Configuration
Manual Configuration
If you skipped auto-configuration, create a .env file:
```bash
# Copy the example file
cp .env.example .env

# Edit with your credentials
nano .env
```
Add your API credentials:
```env
# PP-OCRv5
API_URL=https://your-api-url.aistudio-app.com/ocr
TOKEN=your-token-here

# PaddleOCR-VL
VL_API_URL=https://your-vl-api-url.com/v1
VL_TOKEN=your-vl-token-here
```
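A quick way to catch a missing or empty credential is to parse the .env file and compare it against the variable names in the example above. The sketch below is an illustration only: `parse_env` and `missing_keys` are hypothetical helpers, the parser handles only plain `KEY=VALUE` lines, and the bundled scripts may load the file differently. If you installed just one skill, the entry for the other can be ignored.

```python
from pathlib import Path

# Variable names taken from the .env example in this README.
REQUIRED = {
    "PP-OCRv5": ("API_URL", "TOKEN"),
    "PaddleOCR-VL": ("VL_API_URL", "VL_TOKEN"),
}


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_keys(values):
    """Map each skill name to the variables that are absent or empty."""
    report = {}
    for skill, keys in REQUIRED.items():
        absent = [key for key in keys if not values.get(key)]
        if absent:
            report[skill] = absent
    return report


if __name__ == "__main__":
    env_file = Path(".env")
    text = env_file.read_text(encoding="utf-8") if env_file.exists() else ""
    for skill, keys in missing_keys(parse_env(text)).items():
        print(f"{skill}: missing {', '.join(keys)}")
```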
Using the Configuration Script
```bash
# Configure PP-OCRv5
python scripts/ppocrv5/configure.py --api-url "YOUR_URL" --token "YOUR_TOKEN"

# Configure PaddleOCR-VL
python scripts/paddleocr-vl-1.5/configure.py --api-url "YOUR_URL" --token "YOUR_TOKEN"
```
Verification
Test your installation:
```bash
# Test PP-OCRv5
python scripts/ppocrv5/smoke_test.py

# Test PaddleOCR-VL
python scripts/paddleocr-vl-1.5/smoke_test.py
```
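If only one skill is installed, one of the two commands above will fail simply because the script is absent. The following sketch runs whichever smoke tests exist and collects their exit codes. It assumes, without confirmation from the scripts themselves, that each `smoke_test.py` exits with a non-zero status on failure. The function names are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Smoke test locations as listed in this README, relative to the project root.
SMOKE_TESTS = {
    "PP-OCRv5": Path("scripts/ppocrv5/smoke_test.py"),
    "PaddleOCR-VL": Path("scripts/paddleocr-vl-1.5/smoke_test.py"),
}


def installed_tests(root="."):
    """Return {skill: script path} for the smoke tests that exist under root."""
    base = Path(root)
    found = {}
    for name, relative in SMOKE_TESTS.items():
        script = base / relative
        if script.is_file():
            found[name] = script
    return found


def run_all(root="."):
    """Run each installed smoke test and return {skill: exit code}."""
    results = {}
    for name, script in installed_tests(root).items():
        results[name] = subprocess.run([sys.executable, str(script)]).returncode
    return results


def main():
    """Print one line per skill; return 1 if any test reported a non-zero exit code."""
    outcome = run_all()
    for name, code in outcome.items():
        status = "ok" if code == 0 else f"failed (exit code {code})"
        print(f"{name}: {status}")
    return 1 if any(outcome.values()) else 0
```

Saved as a script, it can be wired up with `raise SystemExit(main())` at the bottom and run from the project root.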
Troubleshooting
Python Not Found
Ensure Python is in your PATH:
```bash
python --version
```
If not found, install Python 3.7+ from python.org.
Missing API Credentials
Get your API credentials:
1. Visit Baidu AI Studio
2. Create a new task or use an existing one
3. Copy the API URL and TOKEN
4. Run the configuration script
Permission Errors
On Windows, run your terminal as Administrator if you encounter permission errors.
Documentation
Each skill includes comprehensive documentation:
- skills/ppocrv5/SKILL.md: PP-OCRv5 usage guide
- skills/paddleocr-vl-1.5/SKILL.md: PaddleOCR-VL usage guide
- references/: Technical reference documentation
License
MIT
Support
For issues and questions:
- GitHub Issues: Report a bug
- Documentation: See the skills/*/SKILL.md files