The ultimate research MCP toolkit: Reddit mining, web search with CTR aggregation, AI-powered deep research, and intelligent web scraping - all in one modular package
The ultimate research toolkit for your AI coding assistant. It searches the web, mines Reddit, scrapes any URL, and synthesizes everything into perfectly structured context your LLM actually understands.
Get Started • Key Features • Usage & Examples • API Key Setup • Why This Slaps
---
research-powerpack-mcp is the research assistant your AI wishes it had. Stop asking your LLM to guess about things it doesn't know. This MCP server acts like a senior researcher, searching the web, mining Reddit discussions, scraping documentation, and synthesizing everything into perfectly structured context so your AI can actually give you answers worth a damn.
| Batch Web Search | Reddit Mining | Universal Scraping | Deep Research |
|:---:|:---:|:---:|:---:|
| 100 keywords in parallel | Real opinions, not marketing | JS rendering + geo-targeting | AI synthesis with citations |
How it slaps:
- You: "What's the best database for my use case?"
- AI + Powerpack: Searches Google, mines Reddit threads, scrapes docs, synthesizes findings.
- You: Get an actually informed answer with real community opinions and citations.
- Result: Ship better decisions. Skip the 47 browser tabs.
---
Manually researching is a vibe-killer. research-powerpack-mcp makes other methods look ancient.
| The Old Way (Pain) | The Powerpack Way (Glory) |
|:---|:---|
We're not just fetching random pages. We're building high-signal, low-noise context with CTR-weighted ranking, smart comment allocation, and intelligent token distribution that prevents massive responses from breaking your LLM's context window.
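
As a rough illustration of what CTR-weighted ranking could look like, the sketch below scores each URL by where it appears in each query's results and sums the scores across queries, so URLs that show up high and often float to the top. The weight values and the names `POSITION_WEIGHTS` and `rankByConsensus` are hypothetical, chosen for this example only; the package's actual scoring may differ.

```typescript
// Hypothetical position weights, a rough stand-in for click-through rates.
// These numbers are illustrative assumptions, not the package's real values.
const POSITION_WEIGHTS: number[] = [0.3, 0.15, 0.1, 0.07, 0.05];
const LONG_TAIL_WEIGHT = 0.02;

interface RankedUrl {
  url: string;
  score: number;       // sum of position weights across all queries
  appearances: number; // number of queries that returned this URL
}

// resultsByQuery[i] is the ordered list of result URLs for query i.
function rankByConsensus(resultsByQuery: string[][]): RankedUrl[] {
  const totals = new Map<string, RankedUrl>();
  for (const results of resultsByQuery) {
    const seen = new Set<string>();
    results.forEach((url, position) => {
      if (seen.has(url)) return; // count each URL once per query
      seen.add(url);
      const weight = POSITION_WEIGHTS[position] ?? LONG_TAIL_WEIGHT;
      const entry = totals.get(url) ?? { url, score: 0, appearances: 0 };
      entry.score += weight;
      entry.appearances += 1;
      totals.set(url, entry);
    });
  }
  // Highest combined score first.
  return Array.from(totals.values()).sort((a, b) => b.score - a.score);
}
```

A URL ranked second in three searches can outscore one ranked first in a single search, which is the "consensus" effect described above.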
---
```bash
npm install research-powerpack-mcp
```
| Client | Config File | Docs |
|:------:|:-----------:|:----:|
| Claude Desktop | `claude_desktop_config.json` | Setup |
| Claude Code | `~/.claude.json` or CLI | Setup |
| Cursor | `.cursor/mcp.json` | Setup |
| Windsurf | MCP settings | Setup |
#### Claude Desktop
Add to your claude_desktop_config.json:
```json
{
  "mcpServers": {
    "research-powerpack": {
      "command": "npx",
      "args": ["research-powerpack-mcp"],
      "env": {
        "SERPER_API_KEY": "your_key",
        "REDDIT_CLIENT_ID": "your_id",
        "REDDIT_CLIENT_SECRET": "your_secret",
        "SCRAPEDO_API_KEY": "your_key",
        "OPENROUTER_API_KEY": "your_key"
      }
    }
  }
}
```
Or quick install (macOS):

```bash
# Write to a temp file first: piping jq into tee on the same file can truncate it mid-read.
CONFIG="$HOME/Library/Application Support/Claude/claude_desktop_config.json"
jq '.mcpServers["research-powerpack"] = {
  "command": "npx",
  "args": ["research-powerpack-mcp@latest"],
  "disabled": false,
  "env": {
    "OPENROUTER_API_KEY": "xxx",
    "REDDIT_CLIENT_ID": "xxx",
    "REDDIT_CLIENT_SECRET": "xxx",
    "RESEARCH_MODEL": "xxxx",
    "SCRAPEDO_API_KEY": "xxx",
    "SERPER_API_KEY": "xxxx"
  }
}' "$CONFIG" > "$CONFIG.tmp" && mv "$CONFIG.tmp" "$CONFIG"
```
#### Claude Code (CLI)
One command to rule them all:
```bash
claude mcp add research-powerpack npx \
  --scope user \
  --env SERPER_API_KEY=your_key \
  --env REDDIT_CLIENT_ID=your_id \
  --env REDDIT_CLIENT_SECRET=your_secret \
  --env OPENROUTER_API_KEY=your_key \
  --env OPENROUTER_BASE_URL=https://openrouter.ai/api/v1 \
  --env RESEARCH_MODEL=x-ai/grok-4.1-fast \
  -- research-powerpack-mcp
```
Or manually add to ~/.claude.json:
```json
{
  "mcpServers": {
    "research-powerpack": {
      "command": "npx",
      "args": ["research-powerpack-mcp"],
      "env": {
        "SERPER_API_KEY": "your_key",
        "REDDIT_CLIENT_ID": "your_id",
        "REDDIT_CLIENT_SECRET": "your_secret",
        "OPENROUTER_API_KEY": "your_key",
        "OPENROUTER_BASE_URL": "https://openrouter.ai/api/v1",
        "RESEARCH_MODEL": "x-ai/grok-4.1-fast"
      }
    }
  }
}
```
#### Cursor/Windsurf
Add to .cursor/mcp.json or equivalent:
```json
{
  "mcpServers": {
    "research-powerpack": {
      "command": "npx",
      "args": ["research-powerpack-mcp"],
      "env": {
        "SERPER_API_KEY": "your_key"
      }
    }
  }
}
```
> Zero Crash Promise: Missing API keys? No problem. The server always starts. Tools just return helpful setup instructions instead of exploding.
---
| Feature | What It Does | Why You Care |
| :---: | :--- | :--- |
| **Batch Search**<br>100 keywords parallel | Search Google for up to 100 queries simultaneously | Cover every angle of a topic in one shot |
| **CTR Ranking**<br>Smart URL scoring | Identifies URLs that appear across multiple searches | Surfaces high-consensus authoritative sources |
| **Reddit Mining**<br>Real human opinions | Google-powered Reddit search + native API fetching | Get actual user experiences, not marketing fluff |
| **Smart Allocation**<br>Token-aware budgets | 1,000 comment budget distributed across posts | Deep dive on 2 posts or quick scan on 50 |
| **Universal Scraping**<br>Works on everything | Auto-fallback: basic → JS render → geo-targeting | Handles SPAs, paywalls, and geo-restricted content |
| **Deep Research**<br>AI-powered synthesis | Batch research with web search and citations | Get comprehensive answers to complex questions |
| **Modular Design**<br>Use what you need | Each tool works independently | Pay only for the APIs you actually use |
---
| `web_search` | `search_reddit` | `get_reddit_post` | `scrape_links` | `deep_research` |
|:---:|:---:|:---:|:---:|:---:|
| Batch Google search | Find Reddit discussions | Fetch posts + comments | Extract any URL | AI synthesis |

#### web_search

Batch web search using Google via Serper API. Search up to 100 keywords in parallel.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| keywords | string[] | Yes | Search queries (1-100). Use distinct keywords for maximum coverage. |
Supports Google operators: site:, -exclusion, "exact phrase", filetype:
```json
{
  "keywords": [
    "best IDE 2025",
    "VS Code alternatives",
    "Cursor vs Windsurf comparison"
  ]
}
```
---
#### search_reddit

Search Reddit via Google with automatic `site:reddit.com` filtering.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `queries` | string[] | Yes | Search queries (max 10) |
| `date_after` | string | No | Filter results after date (YYYY-MM-DD) |

Search operators: `intitle:keyword`, `"exact phrase"`, `OR`, `-exclude`

```json
{
  "queries": [
    "best mechanical keyboard 2025",
    "intitle:keyboard recommendation"
  ],
  "date_after": "2024-01-01"
}
```
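
To make the automatic filtering concrete, here is a minimal sketch of how a query and an optional `date_after` value might be turned into a Google query string. It assumes the server appends `site:reddit.com` and Google's `after:` date operator; the function name `buildRedditQuery` is hypothetical and the real implementation may build the query differently.

```typescript
// Assumption: the filter is applied by appending operators to the user's query.
function buildRedditQuery(query: string, dateAfter?: string): string {
  const parts = [query.trim(), "site:reddit.com"];
  if (dateAfter !== undefined) {
    // Reject anything that is not in the documented YYYY-MM-DD format.
    if (!/^\d{4}-\d{2}-\d{2}$/.test(dateAfter)) {
      throw new Error(`date_after must be YYYY-MM-DD, got "${dateAfter}"`);
    }
    parts.push(`after:${dateAfter}`);
  }
  return parts.join(" ");
}
```

User-supplied operators such as `intitle:` pass through untouched, since the sketch only appends to the original query.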
---
#### get_reddit_post

Fetch Reddit posts with smart comment allocation (1,000 comment budget distributed automatically).

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `urls` | string[] | Yes | - | Reddit post URLs (2-50) |
| `fetch_comments` | boolean | No | true | Whether to fetch comments |
| `max_comments` | number | No | auto | Override comment allocation |

Smart Allocation:
- 2 posts → ~500 comments/post (deep dive)
- 10 posts → ~100 comments/post
- 50 posts → ~20 comments/post (quick scan)
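
The figures above are consistent with an even split of the 1,000-comment budget across posts. The sketch below shows that arithmetic; the function name `commentsPerPost` is hypothetical, and the even split is an assumption inferred from the three examples rather than a confirmed description of the server's logic.

```typescript
const COMMENT_BUDGET = 1000; // total budget stated in the tool description

// Returns the per-post comment allowance for a batch of `postCount` posts.
// `maxComments` mirrors the optional max_comments override.
function commentsPerPost(postCount: number, maxComments?: number): number {
  if (!Number.isInteger(postCount) || postCount < 2 || postCount > 50) {
    throw new RangeError("postCount must be an integer between 2 and 50");
  }
  if (maxComments !== undefined) return maxComments; // explicit override wins
  return Math.floor(COMMENT_BUDGET / postCount);     // assumed even split
}
```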
```json
{
  "urls": [
    "https://reddit.com/r/programming/comments/abc123/post_title",
    "https://reddit.com/r/webdev/comments/def456/another_post"
  ]
}
```
---
#### scrape_links

Universal URL content extraction with automatic fallback modes.

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `urls` | string[] | Yes | - | URLs to scrape (3-50) |
| `timeout` | number | No | 30 | Timeout per URL (seconds) |
| `use_llm` | boolean | No | false | Enable AI extraction |
| `what_to_extract` | string | No | - | Extraction instructions for AI |

Automatic Fallback: Basic → JS rendering → JS + US geo-targeting

```json
{
  "urls": [
    "https://example.com/article1",
    "https://example.com/article2",
    "https://example.com/article3"
  ],
  "use_llm": true,
  "what_to_extract": "Extract the main arguments and key statistics"
}
```
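
The fallback order can be pictured as a loop that tries each mode in turn and stops at the first one that returns content. This is a minimal sketch under that assumption; the mode names, the `Fetcher` type, and `scrapeWithFallback` are hypothetical, and the actual request to the scraping API is injected as a function so the example stays self-contained.

```typescript
type ScrapeMode = "basic" | "js_render" | "js_render_geo_us";
type Fetcher = (url: string, mode: ScrapeMode) => Promise<string>;

// Cheapest mode first, most expensive last.
const FALLBACK_ORDER: ScrapeMode[] = ["basic", "js_render", "js_render_geo_us"];

async function scrapeWithFallback(
  url: string,
  fetchPage: Fetcher,
): Promise<{ mode: ScrapeMode; content: string }> {
  let lastError: unknown = new Error("no attempt made");
  for (const mode of FALLBACK_ORDER) {
    try {
      const content = await fetchPage(url, mode);
      // Treat an empty body as a failure so the next mode gets a chance.
      if (content.trim().length > 0) return { mode, content };
      lastError = new Error(`empty response in ${mode} mode`);
    } catch (error) {
      lastError = error; // remember why this mode failed, then escalate
    }
  }
  throw new Error(`all modes failed for ${url}: ${String(lastError)}`);
}
```

Escalating only on failure keeps the common case cheap, since the more capable modes are tried only when the simpler ones return nothing.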
---
#### deep_research

AI-powered batch research with web search and citations.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `questions` | object[] | Yes | Research questions (2-10) |
| `questions[].question` | string | Yes | The research question |
| `questions[].file_attachments` | object[] | No | Files to include as context |

Token Allocation: 32,000 tokens distributed across questions:
- 2 questions → 16,000 tokens/question (deep dive)
- 10 questions → 3,200 tokens/question (rapid multi-topic)

```json
{
  "questions": [
    { "question": "What are the current best practices for React Server Components in 2025?" },
    { "question": "Compare Bun vs Node.js for production workloads with benchmarks." }
  ]
}
```
---
Research Powerpack uses a modular architecture. Tools are automatically enabled based on which API keys you provide:
| ENV Variable | Tools Enabled | Free Tier |
|:------------:|:-------------:|:---------:|
| `SERPER_API_KEY` | `web_search`, `search_reddit` | 2,500 queries/mo |
| `REDDIT_CLIENT_ID` + `REDDIT_CLIENT_SECRET` | `get_reddit_post` | Unlimited |
| `SCRAPEDO_API_KEY` | `scrape_links` | 1,000 credits/mo |
| `OPENROUTER_API_KEY` | `deep_research` + AI in `scrape_links` | Pay-as-you-go |
| `RESEARCH_MODEL` | Model for `deep_research` | Default: `perplexity/sonar-deep-research` |
| `LLM_EXTRACTION_MODEL` | Model for AI extraction in `scrape_links` | Default: `openrouter/gpt-oss-120b:nitro` |

```bash
# Search-only mode (just web_search and search_reddit)
SERPER_API_KEY=xxx
```
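
The key-to-tool mapping in the table can be expressed as a small function. The sketch below is one way to derive the enabled tools and model defaults from an environment object; `resolveConfig` and its return shape are hypothetical names for illustration, not the package's API. The tool names, variable names, and default model IDs come from the table above.

```typescript
type Env = Record<string, string | undefined>;

interface ResolvedConfig {
  tools: string[];         // tools that would respond with real results
  aiExtraction: boolean;   // AI extraction inside scrape_links
  researchModel: string;
  extractionModel: string;
}

function resolveConfig(env: Env): ResolvedConfig {
  // A key counts as configured only if it is present and non-blank.
  const has = (key: string): boolean => (env[key] ?? "").trim() !== "";

  const tools: string[] = [];
  if (has("SERPER_API_KEY")) tools.push("web_search", "search_reddit");
  if (has("REDDIT_CLIENT_ID") && has("REDDIT_CLIENT_SECRET")) tools.push("get_reddit_post");
  if (has("SCRAPEDO_API_KEY")) tools.push("scrape_links");
  if (has("OPENROUTER_API_KEY")) tools.push("deep_research");

  return {
    tools,
    aiExtraction: has("SCRAPEDO_API_KEY") && has("OPENROUTER_API_KEY"),
    researchModel: env["RESEARCH_MODEL"] ?? "perplexity/sonar-deep-research",
    extractionModel: env["LLM_EXTRACTION_MODEL"] ?? "openrouter/gpt-oss-120b:nitro",
  };
}
```

Calling `resolveConfig({ SERPER_API_KEY: "xxx" })` yields only `web_search` and `search_reddit`, matching the search-only mode shown above.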
---
## API Key Setup Guides

### Serper API (Google Search): FREE, 2,500 queries/month

- Fast Google search results via API
- Enables `web_search` and `search_reddit` tools

1. Go to serper.dev
2. Click "Get API Key" (top right)
3. Sign up with email or Google
4. Copy your API key from the dashboard
5. Add to your config:

```
SERPER_API_KEY=your_key_here
```

- Free: 2,500 queries/month
- Paid: $50/month for 50,000 queries
### Reddit OAuth: FREE, unlimited access

- Full Reddit API access
- Fetch posts and comments with upvote sorting
- Enables `get_reddit_post` tool

1. Go to reddit.com/prefs/apps
2. Scroll down and click "create another app..."
3. Fill in:
   - Name: research-powerpack (or any name)
   - App type: Select "script" (important!)
   - Redirect URI: http://localhost:8080
4. Click "create app"
5. Copy your credentials:
   - Client ID: The string under your app name
   - Client Secret: The "secret" field
6. Add to your config:

```
REDDIT_CLIENT_ID=your_client_id
REDDIT_CLIENT_SECRET=your_client_secret
```
### Scrape.do (Web Scraping): FREE, 1,000 credits/month

- JavaScript rendering support
- Geo-targeting and CAPTCHA handling
- Enables `scrape_links` tool

1. Go to scrape.do
2. Click "Start Free"
3. Sign up with email
4. Copy your API key from the dashboard
5. Add to your config:

```
SCRAPEDO_API_KEY=your_key_here
```

- Basic scrape: 1 credit
- JavaScript rendering: 5 credits
- Geo-targeting: +25 credits
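
To get a feel for how far the free tier stretches, the credit costs listed above can be plugged into a quick estimate. The sketch assumes the "+25" for geo-targeting is added on top of the base cost of the request; that reading, along with the names `creditsFor` and `pagesPerMonth`, is an assumption for illustration and should be checked against Scrape.do's own pricing.

```typescript
interface ScrapeOptions {
  jsRender: boolean;
  geoTargeting: boolean;
}

const FREE_CREDITS_PER_MONTH = 1000;

// Cost of a single request, using the per-feature credits listed above.
function creditsFor(options: ScrapeOptions): number {
  const base = options.jsRender ? 5 : 1;
  // Assumption: geo-targeting adds 25 credits on top of the base cost.
  return base + (options.geoTargeting ? 25 : 0);
}

// How many requests of a given cost fit into the monthly free tier.
function pagesPerMonth(creditsPerPage: number): number {
  return Math.floor(FREE_CREDITS_PER_MONTH / creditsPerPage);
}
```

Under these assumptions the free tier covers 1,000 basic scrapes or 200 JS-rendered ones per month, and considerably fewer once geo-targeting is involved.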
### OpenRouter (AI Models): Pay-as-you-go

- Access to 100+ AI models via one API
- Enables `deep_research` tool
- Enables AI extraction in `scrape_links`

1. Go to openrouter.ai
2. Sign up with Google/GitHub/email
3. Go to openrouter.ai/keys
4. Click "Create Key"
5. Copy the key (starts with `sk-or-...`)
6. Add to your config:

```
OPENROUTER_API_KEY=sk-or-v1-xxxxx
```

```bash
# Default (optimized for research)
RESEARCH_MODEL=perplexity/sonar-deep-research

# Fast and capable
RESEARCH_MODEL=x-ai/grok-4.1-fast

# High quality
RESEARCH_MODEL=anthropic/claude-3.5-sonnet

# Budget-friendly
RESEARCH_MODEL=openai/gpt-4o-mini
```

```bash
# Default (fast and cost-effective for extraction)
LLM_EXTRACTION_MODEL=openrouter/gpt-oss-120b:nitro

# High quality extraction
LLM_EXTRACTION_MODEL=anthropic/claude-3.5-sonnet

# Budget-friendly
LLM_EXTRACTION_MODEL=openai/gpt-4o-mini
```

> Note: `RESEARCH_MODEL` and `LLM_EXTRACTION_MODEL` are independent. You can use a powerful model for deep research and a faster/cheaper model for content extraction, or vice versa.

---
## Recommended Workflows

Comparing technologies:

```
1. web_search → ["React vs Vue 2025", "Next.js vs Nuxt comparison"]
2. search_reddit → ["best frontend framework 2025", "Next.js production experience"]
3. get_reddit_post → [URLs from step 2]
4. scrape_links → [Documentation and blog URLs from step 1]
5. deep_research → [Synthesize findings into specific questions]
```

Competitor analysis:

```
1. web_search → ["competitor name review", "competitor vs alternatives"]
2. scrape_links → [Competitor websites, review sites]
3. search_reddit → ["competitor name experience", "switching from competitor"]
4. get_reddit_post → [URLs from step 3]
```

Debugging an error:

```
1. web_search → ["exact error message", "error + framework name"]
2. search_reddit → ["error message", "framework + error type"]
3. get_reddit_post → [URLs with solutions]
4. scrape_links → [Stack Overflow answers, GitHub issues]
```

---
## Enable Full Power Mode

For the best research experience, configure all four API keys:

```bash
SERPER_API_KEY=your_serper_key          # Free: 2,500 queries/month
REDDIT_CLIENT_ID=your_reddit_id         # Free: Unlimited
REDDIT_CLIENT_SECRET=your_reddit_secret
SCRAPEDO_API_KEY=your_scrapedo_key      # Free: 1,000 credits/month
OPENROUTER_API_KEY=your_openrouter_key  # Pay-as-you-go
```

This unlocks:
- 5 research tools working together
- AI-powered content extraction in scrape_links
- Deep research with web search and citations
- Complete Reddit mining (search โ fetch โ analyze)
Total setup time: ~10 minutes. Total free tier value: ~$50/month equivalent.
---
## Development

```bash
# Clone
git clone https://github.com/yigitkonur/research-powerpack-mcp.git
cd research-powerpack-mcp

# Install
npm install

# Development
npm run dev

# Build
npm run build

# Type check
npm run typecheck
```

---
## Architecture (v3.4.0+)

The codebase uses a YAML-driven configuration system with aggressive LLM optimization (v3.5.0+):

| Component | File | Purpose |
|-----------|------|---------|
| Tool Definitions | `src/config/yaml/tools.yaml` | Single source of truth for all tool metadata |
| Handler Registry | `src/tools/registry.ts` | Declarative tool registration + `executeTool` wrapper |
| YAML Loader | `src/config/loader.ts` | Parses YAML, generates MCP-compatible definitions (cached) |
| Concurrency Utils | `src/utils/concurrency.ts` | Bounded parallel execution (`pMap`/`pMapSettled`) |
| Shared Utils | `src/tools/utils.ts` | Common utility functions |

Adding a new tool:
1. Add tool definition to `tools.yaml`
2. Create handler in `src/tools/`
3. Register in `src/tools/registry.ts`

See `docs/refactoring/04-migration-guide.md` for detailed instructions.

All parallel operations use bounded concurrency to prevent CPU spikes and API rate limits:
| Operation | Before | After |
|-----------|--------|-------|
| Reddit search queries | 50 concurrent | 8 concurrent |
| Web scraping batches | 30 concurrent | 10 concurrent |
| Deep research questions | Unbounded | 3 concurrent |
| Reddit post fetching | 10 concurrent | 5 concurrent |
| File attachments | Unbounded | 5 concurrent |
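
Bounded concurrency of this kind is commonly implemented as a fixed pool of workers pulling from a shared index. The sketch below shows that general technique. It borrows the name `pMap` from the table of components, but it is an independent illustration and not the code in `src/utils/concurrency.ts`, whose signature may differ.

```typescript
// Runs `mapper` over `items` with at most `concurrency` calls in flight.
// Results are returned in input order.
async function pMap<T, R>(
  items: readonly T[],
  mapper: (item: T, index: number) => Promise<R>,
  concurrency: number,
): Promise<R[]> {
  if (!Number.isInteger(concurrency) || concurrency < 1) {
    throw new RangeError("concurrency must be a positive integer");
  }
  const results = new Array<R>(items.length);
  let next = 0; // shared cursor: each worker claims the next unprocessed index

  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await mapper(items[index] as T, index);
    }
  }

  // Start only as many workers as there is work for.
  const workerCount = Math.min(concurrency, items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

With a limit of 3, as in the deep research row above, a batch of ten questions would run as a rolling window of three requests. Note that this version rejects on the first failure; a "settled" variant would collect errors per item instead.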
Additional optimizations:
- YAML config cached in memory (no repeated disk reads)
- Async file I/O (no event loop blocking)
- Pre-compiled regex patterns for hot paths
- Reddit auth token deduplication (prevents concurrent token requests)
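
Token deduplication usually means that concurrent callers share one in-flight request instead of each starting their own. Here is a minimal sketch of that pattern, assuming a token fetcher passed in as a function; `createTokenProvider` is a hypothetical name, and the sketch deliberately leaves out token caching and expiry handling.

```typescript
type TokenFetcher = () => Promise<string>;

// Wraps a token fetcher so that overlapping calls reuse the same pending request.
function createTokenProvider(fetchToken: TokenFetcher): () => Promise<string> {
  let inFlight: Promise<string> | null = null;
  return () => {
    if (inFlight === null) {
      // First caller starts the request; clear the slot once it settles
      // so a later call can fetch a fresh token.
      inFlight = fetchToken().finally(() => {
        inFlight = null;
      });
    }
    return inFlight;
  };
}
```

If five post fetches start at once, only the first triggers a token request and the other four await the same promise.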

All tools include aggressive guidance to force LLMs to use them optimally:

| Feature | Description |
|---------|-------------|
| Configurable Limits | All min/max values in YAML (`limits` section) |
| BAD vs GOOD Examples | Every tool shows anti-patterns and perfect usage |
| Aggressive Phrasing | Changed from "you can" to "you MUST" |
| Visual Formatting | Emoji headers, section dividers, icons for visual scanning |
| Templates | Structured formats for questions, extractions, file descriptions |

Key Enhancements:
- `search_reddit`: Minimum 10 queries (was 3), 10-category formula
- `deep_research`: 7-section question template, file attachment requirements
- `scrape_links`: Extraction template with OR statements, `use_llm=true` push
- `web_search`: Minimum 3 keywords, search operator examples
- `file_attachments`: Numbered 5-section description template

See `docs/refactoring/07-llm-optimization-summary.md` for full details.

---
## Common Issues & Quick Fixes

| Problem | Solution |
| :--- | :--- |
| Tool returns "API key not configured" | Add the required ENV variable to your MCP config. The error message tells you exactly which key is missing. |
| Reddit posts returning empty | Check your `REDDIT_CLIENT_ID` and `REDDIT_CLIENT_SECRET`. Make sure you created a "script" type app. |
| Scraping fails on JavaScript sites | This is expected on the first attempt. The tool auto-retries with JS rendering. If it still fails, the site may be blocking scrapers. |
| Deep research taking too long | Use a faster model like `x-ai/grok-4.1-fast` instead of `perplexity/sonar-deep-research`. |

---
Built with 🔥 because manually researching for your AI is a soul-crushing waste of time.
MIT © Yiğit Konur