AI-powered dataset discovery, quality analysis, and preparation MCP server with multimodal support (text, image, audio, video)
```bash
npm install -g @vespermcp/mcp-server
```
### Install from GitHub

```bash
npm install -g git+https://github.com/vespermcp/mcp-server.git
```
The postinstall script will automatically:
- Install Python dependencies (opencv-python, librosa, etc.)
- Create data directories in ~/.vesper
- Display setup instructions
### Install Python dependencies manually

```bash
pip install opencv-python pillow numpy librosa soundfile
```
## ⚙️ MCP Configuration

### Cursor

1. Go to Settings > Features > MCP
2. Click Add New MCP Server
3. Enter:
   - Name: `vesper`
   - Type: `command`
   - Command: `vesper`
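### Claude Desktop

Vesper attempts to configure itself automatically. Restart Claude and check whether the `vesper` server is listed. If it is not, add the following to your MCP configuration: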
```json
{
  "mcpServers": {
    "vesper": {
      "command": "vesper",
      "args": [],
      "env": {
        "HF_TOKEN": "your-huggingface-token"
      }
    }
  }
}
```
> Note: If the `vesper` command isn't found, use the absolute path to the executable as the `command` value instead.
### Environment Variables

- `KAGGLE_USERNAME` and `KAGGLE_KEY`: For Kaggle dataset access
- `HF_TOKEN`: For private HuggingFace datasets
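If you prefer to generate the configuration rather than edit it by hand, the following is a minimal sketch in Python. It assumes that the Kaggle variables can be passed through the same `env` block as `HF_TOKEN`, which this README does not state explicitly; `build_vesper_entry` is a hypothetical helper name, not part of Vesper.

```python
import json
import os


def build_vesper_entry(env_source=None):
    """Return an `mcpServers` mapping for Vesper.

    Only credentials that are actually set (non-empty) are copied into `env`.
    Assumption: Kaggle credentials belong in the same `env` block as HF_TOKEN.
    """
    env_source = os.environ if env_source is None else env_source
    wanted = ("HF_TOKEN", "KAGGLE_USERNAME", "KAGGLE_KEY")
    env = {name: env_source[name] for name in wanted if env_source.get(name)}
    return {"mcpServers": {"vesper": {"command": "vesper", "args": [], "env": env}}}


# Print a config fragment built from a placeholder token.
print(json.dumps(build_vesper_entry({"HF_TOKEN": "your-huggingface-token"}), indent=2))
```

Calling `build_vesper_entry()` with no argument reads the credentials from the current process environment, so tokens never need to be typed into the file directly.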
## 🚀 Quick Start
After installation and configuration, restart your AI assistant and try:
```
search_datasets(query="sentiment analysis", limit=5)
```

```
prepare_dataset(query="image classification cats vs dogs")
```

```
generate_quality_report(
  dataset_id="huggingface:imdb",
  dataset_path="/path/to/data"
)
```
## 📚 Available Tools

### Dataset Discovery

#### search_datasets

Search for datasets across multiple sources.

Parameters:
- `query` (string): Search query
- `limit` (number, optional): Max results (default: 10)
- `min_quality_score` (number, optional): Minimum quality threshold

Example:
```
search_datasets(query="medical imaging", limit=5, min_quality_score=70)
```
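To make the interaction of `limit` and `min_quality_score` concrete, here is a rough Python sketch of one plausible reading: drop results below the threshold, then keep the top `limit`. The `quality_score` field name and the best-first ordering are assumptions for illustration; Vesper's actual ranking is not documented here.

```python
def filter_results(results, limit=10, min_quality_score=None):
    """Keep results at or above the threshold, best first, capped at `limit`.

    Assumption: each result is a dict with a numeric `quality_score` key.
    """
    kept = [
        r for r in results
        if min_quality_score is None or r["quality_score"] >= min_quality_score
    ]
    kept.sort(key=lambda r: r["quality_score"], reverse=True)
    return kept[:limit]


# Hypothetical search results used only to exercise the function.
sample = [
    {"id": "a", "quality_score": 90},
    {"id": "b", "quality_score": 60},
    {"id": "c", "quality_score": 75},
]
top = filter_results(sample, limit=5, min_quality_score=70)
```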
---
### Data Preparation

#### prepare_dataset

Download, analyze, and prepare a dataset for use.

Parameters:
- `query` (string): Dataset search query or ID

Example:
```
prepare_dataset(query="squad")
```
---
#### export_dataset

Export a prepared dataset to a custom directory with format conversion.

Parameters:
- `dataset_id` (string): Dataset identifier
- `target_dir` (string): Export directory
- `format` (string, optional): Output format (`csv`, `json`, `parquet`)

Example:
```
export_dataset(
  dataset_id="huggingface:imdb",
  target_dir="./my-data",
  format="csv"
)
```
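For readers who want a feel for what format conversion into a target directory involves, the sketch below writes a list of rows as CSV or JSON using only the Python standard library. It is an illustration under stated assumptions, not Vesper's implementation: `export_rows` and the `name` parameter are hypothetical, and Parquet is left out because it needs a third-party library.

```python
import csv
import json
from pathlib import Path


def export_rows(rows, target_dir, fmt="csv", name="data"):
    """Write a list of dict rows to `target_dir` as CSV or JSON; return the file path."""
    if fmt not in ("csv", "json"):
        raise ValueError(f"unsupported format in this sketch: {fmt}")
    out_dir = Path(target_dir)
    out_dir.mkdir(parents=True, exist_ok=True)  # create the export directory if missing
    out_path = out_dir / f"{name}.{fmt}"
    if fmt == "json":
        out_path.write_text(json.dumps(rows, indent=2), encoding="utf-8")
    else:
        # Column order follows the keys of the first row.
        fieldnames = list(rows[0].keys()) if rows else []
        with out_path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
    return out_path
```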
---
### Quality Analysis

#### analyze_image_quality

Analyze image datasets for resolution, corruption, and blur.

Parameters:
- `path` (string): Path to image file or folder

Example:
```
analyze_image_quality(path="/path/to/images")
```
---
#### analyze_media_quality

Analyze audio/video files for quality metrics.

Parameters:
- `path` (string): Path to media file or folder

Example:
```
analyze_media_quality(path="/path/to/audio")
```
---
#### generate_quality_report

Generate a comprehensive unified quality report for multimodal datasets.

Parameters:
- `dataset_id` (string): Dataset identifier
- `dataset_path` (string): Path to dataset directory

Example:
```
generate_quality_report(
  dataset_id="my-dataset",
  dataset_path="/path/to/data"
)
```
---
### Data Splitting

#### split_dataset

Split a dataset into train/test/validation sets.

Parameters:
- `dataset_id` (string): Dataset identifier
- `train_ratio` (number): Training set ratio (0-1)
- `test_ratio` (number): Test set ratio (0-1)
- `val_ratio` (number, optional): Validation set ratio (0-1)

Example:
```
split_dataset(
  dataset_id="my-dataset",
  train_ratio=0.7,
  test_ratio=0.2,
  val_ratio=0.1
)
```
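The ratio parameters can be illustrated with a small Python sketch of a shuffled ratio split. It assumes the three ratios are expected to sum to 1 and that the split is a seeded random shuffle; neither is confirmed by this README, and `split_records` is a hypothetical helper rather than the tool's implementation.

```python
import random


def split_records(records, train_ratio, test_ratio, val_ratio=0.0, seed=0):
    """Shuffle records and cut them into (train, test, val) lists.

    Assumption: the ratios must sum to 1 (within floating-point tolerance).
    """
    total = train_ratio + test_ratio + val_ratio
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"ratios must sum to 1, got {total}")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = round(len(shuffled) * train_ratio)
    n_val = round(len(shuffled) * val_ratio)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the test set absorbs any rounding remainder
    return train, test, val


train, test, val = split_records(range(10), train_ratio=0.7, test_ratio=0.2, val_ratio=0.1)
```

Because the test set takes whatever is left after the train and validation cuts, every record lands in exactly one split even when the ratios do not divide the dataset size evenly.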