JSON Batch Processor MCP Server

A Model Context Protocol (MCP) server that enables AI IDEs to process large JSON arrays in manageable batches. This tool solves the context window limitation problem by breaking down large datasets into smaller chunks that can be processed incrementally.

Features

- JSON File Binding: Bind JSON files for batch processing with automatic structure analysis
- Flexible Task Generation: Create batch processing tasks with customizable batch sizes and field paths
- Batch Reading: Read specific batches of data from large JSON arrays
- Progress Tracking: Persistent progress tracking that survives system restarts
- Result Merging: Automatically merge processed batches back into a complete JSON file
- MCP Integration: Seamless integration with AI IDEs through the Model Context Protocol

Installation

### Prerequisites

- Node.js v18 or higher
- npm or yarn package manager

### Install from npm

```bash
npm install -g json-batch-processor-mcp
```

### Install from Source

```bash
git clone <repository-url>
cd json-batch-processor-mcp
npm install
npm run build
npm link
```

Configuration

### MCP Settings

Add the following configuration to your MCP settings file:

- Workspace level: `.kiro/settings/mcp.json` (project-specific)
- User level: `~/.kiro/settings/mcp.json` (global, all projects)

#### Option 1: Development/Local Installation

If you cloned the repository or installed from source:

```json
{
  "mcpServers": {
    "json-batch-processor": {
      "command": "node",
      "args": ["/absolute/path/to/json-batch-processor-mcp/dist/index.js"],
      "disabled": false,
      "autoApprove": []
    }
  }
}
```

Replace `/absolute/path/to/json-batch-processor-mcp` with the actual path to your installation.

#### Option 2: Global npm Installation

If you installed globally via `npm install -g`:

```json
{
  "mcpServers": {
    "json-batch-processor": {
      "command": "json-batch-processor",
      "args": [],
      "disabled": false,
      "autoApprove": []
    }
  }
}
```

#### Configuration Options

- `command`: The executable command or path to the server
- `args`: Command-line arguments (empty for global install)
- `disabled`: Set to `true` to disable the server without removing the configuration
- `autoApprove`: List of tool names to auto-approve (e.g., `["bind_json_file", "read_batch"]`)
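As a rough illustration of how these options fit together, the sketch below builds either of the two server entries shown above from one helper. `buildServerEntry` is a hypothetical name introduced here, not something the package provides; only the field names and values come from this README.

```javascript
// Hypothetical helper: builds an mcp.json entry for this server.
// Pass `entryPoint` for a local/source install; omit it for a global npm install.
function buildServerEntry({ entryPoint, autoApprove = [], disabled = false } = {}) {
  const server = entryPoint
    ? { command: "node", args: [entryPoint] }
    : { command: "json-batch-processor", args: [] };
  return {
    mcpServers: {
      "json-batch-processor": { ...server, disabled, autoApprove },
    },
  };
}

// Example: global install with two tools auto-approved.
const config = buildServerEntry({ autoApprove: ["bind_json_file", "read_batch"] });
console.log(JSON.stringify(config, null, 2));
```

The printed object can be pasted into `mcp.json` as-is.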

#### Restart MCP Server

After updating the configuration, reconnect the server:

1. Open the MCP Server view in your IDE
2. Click "Reconnect" next to the `json-batch-processor` server

Alternatively, restart your IDE.

### Data Storage

The server stores all data in `~/.kiro/mcp-data/json-batch-processor/`.

Each binding creates a subdirectory with:

- `binding.json` - Binding metadata
- `tasks.json` - Task list
- `progress.json` - Progress tracking
- `results/` - Processed batch results
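When inspecting or cleaning up stored data from a script, it can help to compute these locations rather than hard-code them. The following is a minimal sketch based only on the layout described above; `bindingPaths` and `DATA_ROOT` are hypothetical names, and the server's own storage code may differ.

```javascript
const os = require("node:os");
const path = require("node:path");

// Root directory as documented above; everything else here is illustrative.
const DATA_ROOT = path.join(os.homedir(), ".kiro", "mcp-data", "json-batch-processor");

// Hypothetical helper: returns the documented file locations for one binding.
function bindingPaths(bindingId, root = DATA_ROOT) {
  const dir = path.join(root, bindingId);
  return {
    dir,
    binding: path.join(dir, "binding.json"),
    tasks: path.join(dir, "tasks.json"),
    progress: path.join(dir, "progress.json"),
    results: path.join(dir, "results"),
  };
}

console.log(bindingPaths("uuid-string"));
```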

Available MCP Tools

### bind_json_file

Bind a JSON file for batch processing.

Parameters:

- `filePath` (string, required): Absolute path to the JSON file

Returns:

```json
{
  "bindingId": "uuid-string",
  "structure": {
    "type": "object",
    "arrayPaths": ["$.data", "$.items"],
    "totalElements": 1000
  }
}
```
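To give a feel for what a structure scan that produces `arrayPaths` might involve, here is a minimal sketch that walks a parsed document and records a JSONPath-style path for every array it reaches through object keys. `findArrayPaths` is a hypothetical function; the server's actual analysis is not documented here and may behave differently (for example with arrays nested inside arrays, or keys that need bracket notation).

```javascript
// Recursively collect JSONPath-style paths to every array reachable through
// object keys. Arrays nested inside other arrays are not descended into.
function findArrayPaths(value, path = "$", found = []) {
  if (Array.isArray(value)) {
    found.push({ path, length: value.length });
  } else if (value !== null && typeof value === "object") {
    for (const [key, child] of Object.entries(value)) {
      findArrayPaths(child, `${path}.${key}`, found);
    }
  }
  return found;
}

const doc = { data: [1, 2, 3], meta: { tags: ["a", "b"] }, items: [] };
console.log(findArrayPaths(doc).map((entry) => entry.path));
```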

### generate_task_list

Generate a list of batch processing tasks.

Parameters:

- `bindingId` (string, required): The binding ID from `bind_json_file`
- `batchSize` (number, required): Number of elements per batch
- `fieldPath` (string, optional): JSONPath to the array (e.g., `"$.data.items"`)

Returns:

```json
{
  "bindingId": "uuid-string",
  "totalTasks": 20,
  "batchSize": 50,
  "fieldPath": "$.data",
  "tasks": [
    {
      "id": "uuid-batch-0",
      "batchIndex": 0,
      "startIndex": 0,
      "endIndex": 49,
      "status": "pending"
    }
  ]
}
```
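The relationship between `batchSize`, `totalTasks`, and the index fields can be reproduced with a few lines of arithmetic. The sketch below assumes, based on the response above, that `endIndex` is inclusive and that the final batch is simply shorter when the array length is not a multiple of `batchSize`; `planBatches` is a hypothetical name and omits the task `id`.

```javascript
// Split `totalElements` items into consecutive batches of at most `batchSize`.
function planBatches(totalElements, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const tasks = [];
  for (let start = 0, index = 0; start < totalElements; start += batchSize, index++) {
    tasks.push({
      batchIndex: index,
      startIndex: start,
      // endIndex is inclusive, matching the response shown above.
      endIndex: Math.min(start + batchSize, totalElements) - 1,
      status: "pending",
    });
  }
  return tasks;
}

const tasks = planBatches(1000, 50);
console.log(tasks.length, tasks[0], tasks[tasks.length - 1]);
```

With 1000 elements and a batch size of 50 this yields 20 tasks, the first covering indices 0 to 49 and the last covering 950 to 999.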

### read_batch

Read data for a specific batch.

Parameters:

- `bindingId` (string, required): The binding ID
- `taskId` (string, required): The task ID from the task list

Returns:

```json
{
  "taskId": "uuid-batch-0",
  "batchIndex": 0,
  "data": [...],
  "startIndex": 0,
  "endIndex": 49,
  "totalElements": 1000
}
```
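Because `endIndex` appears to be inclusive (0 to 49 for a 50-element batch), extracting a batch from an in-memory array needs an off-by-one adjustment. A small sketch under that assumption, with `sliceBatch` as a hypothetical helper:

```javascript
// Array.prototype.slice treats its end argument as exclusive, so add 1 to
// the inclusive endIndex.
function sliceBatch(items, { startIndex, endIndex }) {
  return items.slice(startIndex, endIndex + 1);
}

const items = Array.from({ length: 120 }, (_, i) => i);
const batch = sliceBatch(items, { startIndex: 50, endIndex: 99 });
console.log(batch.length, batch[0], batch[batch.length - 1]);
```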

### update_task_status

Update the status of a task and optionally save processed results.

Parameters:

- `bindingId` (string, required): The binding ID
- `taskId` (string, required): The task ID
- `status` (string, required): Either `"pending"` or `"completed"`
- `result` (array, optional): The processed batch data

Returns:

```json
{
  "success": true,
  "taskId": "uuid-batch-0",
  "status": "completed"
}
```

### get_progress

Get the current processing progress.

Parameters:

- `bindingId` (string, required): The binding ID

Returns:

```json
{
  "bindingId": "uuid-string",
  "totalTasks": 20,
  "completedTasks": 5,
  "pendingTasks": 15,
  "percentage": 25,
  "tasks": [
    {
      "taskId": "uuid-batch-0",
      "status": "completed",
      "completedAt": "2025-11-08T10:05:00Z",
      "hasResult": true
    }
  ]
}
```
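The summary fields follow directly from the per-task statuses. The sketch below shows one way they could be derived; `summarizeProgress` is a hypothetical function, and rounding the percentage to a whole number is an assumption rather than documented server behavior.

```javascript
// Derive the summary counters from a list of task records.
function summarizeProgress(tasks) {
  const totalTasks = tasks.length;
  const completedTasks = tasks.filter((t) => t.status === "completed").length;
  return {
    totalTasks,
    completedTasks,
    pendingTasks: totalTasks - completedTasks,
    // Rounding is an assumption; the server may report fractions differently.
    percentage: totalTasks === 0 ? 0 : Math.round((completedTasks / totalTasks) * 100),
  };
}

// 20 tasks, the first 5 completed, as in the response above.
const sample = Array.from({ length: 20 }, (_, i) => ({
  status: i < 5 ? "completed" : "pending",
}));
console.log(summarizeProgress(sample));
```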

### merge_results

Merge all processed batches into a final JSON file.

Parameters:

- `bindingId` (string, required): The binding ID
- `outputPath` (string, required): Path for the output JSON file

Returns:

```json
{
  "success": true,
  "outputPath": "/path/to/output.json",
  "totalElements": 1000,
  "message": "Successfully merged 20 batches"
}
```
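Conceptually, merging means concatenating the stored batch results in batch order, and refusing to do so while any batch is unfinished. The sketch below illustrates that idea only; `mergeBatches` and the in-memory `{ batchIndex, status, result }` shape are assumptions, as is treating a completed task without a `result` as incomplete. It does not write a file or re-insert the array into the original document.

```javascript
// Concatenate batch results in batchIndex order.
// `batches` is an assumed in-memory shape: { batchIndex, status, result }.
function mergeBatches(batches) {
  const incomplete = batches.filter(
    (b) => b.status !== "completed" || !Array.isArray(b.result)
  );
  if (incomplete.length > 0) {
    const err = new Error(`${incomplete.length} batch(es) not completed`);
    err.code = "IncompleteTasksError"; // error code name taken from this README
    throw err;
  }
  return [...batches]
    .sort((a, b) => a.batchIndex - b.batchIndex)
    .flatMap((b) => b.result);
}

const merged = mergeBatches([
  { batchIndex: 1, status: "completed", result: [3, 4] },
  { batchIndex: 0, status: "completed", result: [1, 2] },
]);
console.log(merged);
```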

Quick Start

Here's a minimal example to get started:

```javascript
// `callTool` stands in for however your MCP client invokes a tool.

// 1. Bind a JSON file
const binding = await callTool("bind_json_file", {
  filePath: "/path/to/data.json"
});

// 2. Generate tasks (50 items per batch)
const tasks = await callTool("generate_task_list", {
  bindingId: binding.bindingId,
  batchSize: 50,
  fieldPath: "$.items" // Optional: specify array path
});

// 3. Process each batch
for (const task of tasks.tasks) {
  // Read batch
  const batch = await callTool("read_batch", {
    bindingId: binding.bindingId,
    taskId: task.id
  });

  // Process data (your custom logic)
  const processed = batch.data.map(item => ({ ...item, processed: true }));

  // Save results
  await callTool("update_task_status", {
    bindingId: binding.bindingId,
    taskId: task.id,
    status: "completed",
    result: processed
  });
}

// 4. Merge all results
const result = await callTool("merge_results", {
  bindingId: binding.bindingId,
  outputPath: "/path/to/output.json"
});
```
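Since progress is persisted, an interrupted run does not need to start over: a `get_progress` response is enough to work out which tasks remain. The sketch below is one way to do that; `pendingTaskIds` is a hypothetical helper, and it relies on the `taskId` and `status` fields shown in the `get_progress` response.

```javascript
// Return the IDs of tasks that still need processing, in their listed order.
function pendingTaskIds(progress) {
  return progress.tasks
    .filter((task) => task.status !== "completed")
    .map((task) => task.taskId);
}

// Example response trimmed to the fields this helper reads.
const progress = {
  tasks: [
    { taskId: "uuid-batch-0", status: "completed" },
    { taskId: "uuid-batch-1", status: "pending" },
    { taskId: "uuid-batch-2", status: "pending" },
  ],
};
console.log(pendingTaskIds(progress));
```

Each returned ID can then be passed to `read_batch` and `update_task_status` exactly as in step 3.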

For a complete walkthrough with detailed examples, see EXAMPLE.md.

Error Handling

The server returns structured error responses:

```json
{
  "error": {
    "code": "FileNotFoundError",
    "message": "JSON file not found at path: /path/to/file.json",
    "details": {}
  }
}
```

Common error codes:

- `FileNotFoundError`: JSON file doesn't exist
- `InvalidJSONError`: Invalid JSON format
- `BindingNotFoundError`: Binding ID not found
- `TaskNotFoundError`: Task ID not found
- `InvalidBatchIndexError`: Batch index out of range
- `IncompleteTasksError`: Not all tasks completed
- `InvalidFieldPathError`: Invalid JSONPath or path doesn't point to an array
- `StorageError`: File system operation failed
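On the client side, it is often convenient to turn this error shape into a thrown exception so that a processing loop stops at the first failure. A minimal sketch, assuming the parsed tool response is a plain object shaped like the example above; `unwrap` is a hypothetical helper.

```javascript
// Throw if the response carries the structured error shape shown above;
// otherwise return the response unchanged.
function unwrap(response) {
  if (response && response.error) {
    const { code, message, details } = response.error;
    const err = new Error(`${code}: ${message}`);
    err.code = code;
    err.details = details;
    throw err;
  }
  return response;
}

try {
  unwrap({
    error: {
      code: "BindingNotFoundError",
      message: "Binding ID not found",
      details: {},
    },
  });
} catch (err) {
  console.log(err.code, "-", err.message);
}
```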

Performance

- Binding operation: < 1 second (10MB file)
- Task generation: < 500ms (1000 tasks)
- Batch reading: < 100ms (50 element batch)
- Progress update: < 50ms
- Result merging: < 5 seconds (1000 batches)

Limitations

- Maximum file size: 100MB (configurable)
- Maximum storage per binding: 500MB (configurable)
- Supports JSON format only (CSV, XML support planned)
- Requires Node.js runtime (not browser-compatible)

Use Cases

- Data Enrichment: Add information from external APIs to large datasets
- Data Transformation: Convert data structures in manageable chunks
- Data Validation: Validate large datasets batch by batch
- Data Migration: Transform and migrate data between formats
- AI Processing: Process large datasets within AI context window limits
- Data Analysis: Analyze large datasets incrementally

Architecture

The server consists of six core components:

1. Binding Manager: Validates and binds JSON files, scans structure
2. Task Manager: Generates and manages batch processing tasks
3. Batch Reader: Extracts specific batches from JSON arrays
4. Progress Tracker: Tracks and persists task completion status
5. Result Merger: Merges processed batches back into complete JSON
6. MCP Server: Exposes functionality through MCP protocol tools

Data is stored in `~/.kiro/mcp-data/json-batch-processor/{bindingId}/`:

- `binding.json` - File binding metadata
- `tasks.json` - Complete task list
- `progress.json` - Current progress state
- `results/` - Individual batch results

Development

### Build

```bash
npm run build
```

### Development Mode

```bash
npm run dev
```

### Clean Build

```bash
npm run build:clean
```

### Project Structure

```
json-batch-processor-mcp/
├── src/
│   ├── index.ts              # MCP server entry point
│   ├── managers/             # Core business logic
│   │   ├── BindingManager.ts
│   │   ├── TaskManager.ts
│   │   ├── BatchReader.ts
│   │   ├── ProgressTracker.ts
│   │   └── ResultMerger.ts
│   ├── types/                # TypeScript type definitions
│   │   └── index.ts
│   └── utils/                # Utility functions
│       ├── storage.ts
│       └── errors.ts
├── examples/                 # Sample JSON files
├── dist/                     # Compiled JavaScript
└── package.json
```

License

MIT

Contributing

Contributions are welcome! Please open an issue or submit a pull request.

Recent Updates

### Binding Persistence Fix

Fixed a critical bug where binding information was not persisted to disk, causing `read_batch` and subsequent operations to fail. The server now correctly saves `binding.json` files, ensuring reliable operation across server restarts.

What changed:

- Binding information (including JSON content) is now saved to disk
- `read_batch` operations work reliably even after server restart
- All batch processing workflows are now resumable

Migration: If you have existing bindings from before this fix, please re-bind your JSON files using `bind_json_file`.

Troubleshooting

### Server Not Appearing

Problem: MCP server doesn't appear in your IDE

Solutions:

1. Verify the configuration path in your `mcp.json` is correct
2. Check that the server is not disabled (`"disabled": false`)
3. Restart your IDE or reconnect the MCP server
4. Check the MCP server logs for errors

### File Not Found

Problem: `FileNotFoundError` when binding

Solutions:

1. Use absolute paths, not relative paths
2. Verify the file exists: `ls -la /path/to/file.json`
3. Check file permissions (must be readable)
4. Ensure no typos in the file path

### Invalid JSON

Problem: `InvalidJSONError` when binding

Solutions:

1. Validate your JSON: `cat file.json | jq .`
2. Check for trailing commas (not allowed in JSON)
3. Ensure proper quote escaping
4. Verify UTF-8 encoding
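If `jq` is not available, the same check can be done with Node.js itself, since `JSON.parse` rejects trailing commas and other syntax errors. A minimal sketch; `checkJson` is a hypothetical helper, and the wording of the error message comes from the JavaScript engine and varies between Node.js versions.

```javascript
// Report whether a string is valid JSON, with the parser's message if not.
function checkJson(text) {
  try {
    JSON.parse(text);
    return { ok: true };
  } catch (err) {
    return { ok: false, message: err.message };
  }
}

console.log(checkJson('{"items": [1, 2, 3]}'));
console.log(checkJson('{"items": [1, 2, 3,]}')); // trailing comma
```

To check a file, read it first (for example with `fs.readFileSync(path, "utf8")`) and pass the contents to `checkJson`.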

### Binding Not Found

Problem: `BindingNotFoundError` on subsequent operations

Solutions:

1. Verify you're using the correct binding ID from the bind response
2. Check if the binding data still exists in `~/.kiro/mcp-data/json-batch-processor/`
3. Re-bind the file if necessary

### Incomplete Tasks

Problem: `IncompleteTasksError` when merging

Solutions:

1. Call `get_progress` to see which tasks are pending
2. Complete all pending tasks before merging
3. Check for failed tasks and retry them

FAQ

Q: Can I process multiple JSON files simultaneously?
A: Yes! Each binding has a unique ID, so you can bind multiple files and process them in parallel.

Q: What happens if my process crashes?
A: All progress is saved to disk. Simply call `get_progress` to see where you left off and continue.

Q: Can I change the batch size after generating tasks?
A: No, you need to unbind and create a new binding with a different batch size.

Q: Does this work with nested arrays?
A: Yes! Use JSONPath syntax to specify the exact array path (e.g., `$.data.users[*].orders`).

Q: Can I process arrays in parallel?
A: Yes, tasks are independent. You can process multiple batches simultaneously if your logic allows.

Q: What if my JSON has multiple arrays?
A: Specify the exact array using the `fieldPath` parameter. If omitted, the first array is used.

Q: Can I modify the original file?
A: No, the original file is never modified. Results are always written to a new output file.

Q: How do I clean up old bindings?
A: Delete the binding directory: `rm -rf ~/.kiro/mcp-data/json-batch-processor/{bindingId}`
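The `fieldPath` answers above can be made more concrete with a small resolver. The sketch below handles only plain dotted paths such as `$.data.items`; wildcards like `[*]`, filters, and bracket notation are deliberately left out, so it is not a JSONPath implementation and not the server's own resolver. `resolveFieldPath` and `invalidPath` are hypothetical names; the error code is the one listed under Error Handling.

```javascript
// Resolve a plain dotted path such as "$.data.items" to the array it names.
function resolveFieldPath(doc, fieldPath) {
  const segments = fieldPath.split(".");
  if (segments[0] !== "$") {
    throw invalidPath(`path must start with "$": ${fieldPath}`);
  }
  let current = doc;
  for (const key of segments.slice(1)) {
    if (current === null || typeof current !== "object" || !(key in current)) {
      throw invalidPath(`no value at ${fieldPath}`);
    }
    current = current[key];
  }
  if (!Array.isArray(current)) {
    throw invalidPath(`${fieldPath} does not point to an array`);
  }
  return current;
}

// Build an Error tagged with the documented error code.
function invalidPath(message) {
  const err = new Error(message);
  err.code = "InvalidFieldPathError";
  return err;
}

const sampleDoc = { data: { items: [{ id: 1 }, { id: 2 }] }, name: "demo" };
console.log(resolveFieldPath(sampleDoc, "$.data.items").length);
```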

Support

For issues and questions, please open an issue on the GitHub repository.