CLI tools for AI agents
cursor-tools is optimized for Cursor Composer Agent, but it can be used by any coding agent that can execute commands. It includes a browser agent command for complex multi-step tasks.
After installation, to see AI teamwork in action just ask Cursor Composer to use Perplexity or Gemini.
cursor-tools provides a CLI that your AI agent can use to expand its capabilities. cursor-tools is designed to be installed globally, providing system-wide access to its powerful features. When you run cursor-tools install we automatically add a prompt section to your Cursor project rules. During installation, you can choose between:
- The new .cursor/rules/cursor-tools.mdc file (recommended)
- The legacy .cursorrules file (for backward compatibility)
You can also control this using the USE_LEGACY_CURSORRULES environment variable:
- USE_LEGACY_CURSORRULES=true - Use legacy .cursorrules file
- USE_LEGACY_CURSORRULES=false - Use new .cursor/rules/cursor-tools.mdc file
- If not set, defaults to legacy mode for backward compatibility
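The selection rules above can be summarised in a small shell sketch. It only illustrates the documented behaviour and is not the installer's actual implementation: the function name rules_file_for is hypothetical, and the interactive choice offered during installation is not modelled.

```bash
# Hypothetical helper: prints the rules file implied by a given
# USE_LEGACY_CURSORRULES value, following the list above.
rules_file_for() (
  case "${1:-}" in
    false)     echo ".cursor/rules/cursor-tools.mdc" ;;  # new rules file
    true | "") echo ".cursorrules" ;;                    # legacy file, also the default when unset
    *)         echo "unrecognised value: $1" >&2         # other values are not documented
               exit 1 ;;
  esac
)

# Usage:
#   rules_file_for "${USE_LEGACY_CURSORRULES:-}"
```

Setting USE_LEGACY_CURSORRULES=false in your shell profile is therefore the way to opt in to the recommended .mdc file without being prompted.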
cursor-tools requires a Perplexity API key and a Google AI API key.
cursor-tools is a node package that should be installed globally.
Install cursor-tools globally:
```bash
npm install -g cursor-tools
```
Then run the interactive setup:
```bash
cursor-tools install .
```
This command will:
1. Guide you through API key configuration
2. Update your Cursor project rules for Cursor integration (using .cursor/rules/cursor-tools.mdc or existing .cursorrules)
- Node.js 18 or later
- Perplexity API key
- Google Gemini API key
- For browser commands:
  - Playwright (npm install --global playwright)
  - OpenAI API key or Anthropic API key (for act, extract, observe, and agent commands)
cursor-tools uses Gemini 2.0 because it is the only good LLM with a context window that goes up to 2 million tokens, enough to handle an entire codebase in one shot. The Gemini 2.0 experimental models that we use by default are currently free to use on Google. You need a Google Cloud project to create an API key.
cursor-tools uses Perplexity because Perplexity has the best web search API and indexes, and it does not hallucinate. Perplexity Pro users can get an API key with their Pro account and receive $5/month of free credits (at time of writing). Support for Google search grounding is coming soon, but so far testing has shown it still frequently hallucinates things like APIs and libraries that don't exist.
- Ask Cursor Agent to have Gemini review its work
- Ask Cursor Agent to generate documentation for external dependencies and write it to a local-docs/ folder
If you do something cool with cursor-tools please let me know on twitter or make a PR to add to this section!
Use Cursor Composer in agent mode with command execution (not sure what this means? See the section below on Cursor Agent configuration). If you have installed the cursor-tools prompt to your .cursorrules (or equivalent), just ask your AI coding agent/assistant to use "cursor-tools" to do things.
- cursor-tools ask allows direct querying of any model from any provider. It's best for simple questions where you want to use a specific model or compare responses from different models.
- cursor-tools web uses an AI teammate with web search capability to answer questions. web is best for finding up-to-date information from the web that is not specific to the repository, such as how to use a library, known issues and error messages, or suggestions on how to do something. Web is a teammate who knows tons of stuff and is always up to date.
- cursor-tools repo uses an AI teammate with large context window capability to answer questions. repo sends the entire repo as context, so it is ideal for questions about how things work or where to find something. It is also great for code review, debugging and planning. Repo is a teammate who knows the entire codebase inside out and understands how everything works together.
- cursor-tools plan uses an AI teammate with reasoning capability to plan complex tasks. plan uses a two-step process. First it does a whole-repo search with a large context window model to find relevant files. Then it sends only those files as context to a thinking model to generate a plan. It is great for planning complex tasks and for debugging and refactoring. Plan is a teammate who is really smart on a well-defined problem, although it doesn't consider the bigger picture.
- cursor-tools doc uses an AI teammate with large context window capability to generate documentation for local or GitHub-hosted repositories by sending the entire repo as context. doc can be given precise documentation tasks or can be asked to generate complete docs from scratch. It is great for generating docs updates or for generating local documentation for a library or API that you use! Doc is a teammate who is great at summarising and explaining code, in this repo or in any other repo!
- cursor-tools browser uses an AI teammate with browser control (aka operator) capability to operate web browsers. browser can operate in a hidden (headless) mode to invisibly test and debug web apps, or it can connect to an existing browser session to interactively share your browser with Cursor agent. It is great for testing and debugging web apps and for carrying out any task that can be done in a browser, such as reading information from a bug ticket or even filling out a form. Browser is a teammate who can help you test and debug web apps, and can share control of your browser to perform small browser-based tasks.

Note: For repo, doc and plan commands, the repository content that is sent as context can be reduced by filtering out files in a .repomixignore file.
When using cursor-tools with Cursor Composer, you can use these nicknames:
- "Gemini" is a nickname for cursor-tools repo
- "Perplexity" is a nickname for cursor-tools web
- "Stagehand" is a nickname for cursor-tools browser
- "Operator" is a nickname for cursor-tools browser agent
"Please implement country specific stripe payment pages for the USA, UK, France and Germany. Use cursor-tools web to check the available stripe payment methods in each country."

Note: in most cases you can say "ask Perplexity" instead of "use cursor-tools web" and it will work the same.

"Let's refactor our User class to allow multiple email aliases per user. Use cursor-tools repo to ask for a plan including a list of all files that need to be changed."

Note: in most cases you can say "ask Gemini" instead of "use cursor-tools repo" and it will work the same.

"Use cursor-tools to generate documentation for the Github repo https://github.com/kait-http/kaito and write it to docs/kaito.md"

Note: in most cases you can say "generate documentation" instead of "use cursor-tools doc" and it will work the same.

"Use cursor-tools github to fetch issue 123 and suggest a solution to the user's problem"

"Use cursor-tools github to fetch PR 321 and see if you can fix Andy's latest comment"

Note: in most cases you can say "fetch issue 123" or "fetch PR 321" instead of "use cursor-tools github" and it will work the same.

"Use cursor-tools to open the users page and check the error in the console logs, fix it"

"Use cursor-tools to test the form field validation logic. Take screenshots of each state"

"Use cursor-tools to open https://example.com/foo and check the error in the network logs, what could be causing it?"

"Use cursor-tools browser agent to analyze the login page and complete the authentication process"

"Use cursor-tools browser agent to find products under $50 with at least 4-star rating and add them to cart"

"Use cursor-tools browser agent to debug this form submission issue by exploring the page and trying different inputs"

Note: in most cases you can say "Use Stagehand" instead of "use cursor-tools" and it will work the same.

"Use cursor-tools ask to compare how different models answer this question: 'What are the key differences between REST and GraphQL?'"

"Ask OpenAI's o3-mini model to explain the concept of dependency injection."

Note: The ask command requires both --provider and --model parameters to be specified. This command is generally less useful than other commands like repo or plan because it does not include any context from your codebase or repository.
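Because ask needs both a provider and a model, a thin wrapper can catch a missing argument before any request is made. The following is a minimal sketch, not part of cursor-tools: the function name ask_model and the CURSOR_TOOLS_BIN override are hypothetical, and the --provider=/--model= spelling is assumed to follow the same form as the other options shown in this README.

```bash
# Hypothetical wrapper: refuses to call `cursor-tools ask` unless a provider,
# a model and a question are all supplied.
# CURSOR_TOOLS_BIN is an assumed override (not a cursor-tools setting) that
# allows a dry run, e.g. CURSOR_TOOLS_BIN=echo.
ask_model() (
  provider="${1:-}"
  model="${2:-}"
  question="${3:-}"
  if [ -z "$provider" ] || [ -z "$model" ] || [ -z "$question" ]; then
    echo "usage: ask_model <provider> <model> <question>" >&2
    exit 2
  fi
  "${CURSOR_TOOLS_BIN:-cursor-tools}" ask "$question" --provider="$provider" --model="$model"
)

# Usage:
#   ask_model openai o3-mini "Explain the concept of dependency injection."
```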
Authentication and API Keys
cursor-tools requires API keys for Perplexity AI, Google Gemini, and optionally for OpenAI, Anthropic and OpenRouter. These can be configured in two ways:
1. Interactive Setup: Run cursor-tools install and follow the prompts
2. Manual Setup: Create ~/.cursor-tools/.env in your home directory or .cursor-tools.env in your project root:
```env
PERPLEXITY_API_KEY="your-perplexity-api-key"
GEMINI_API_KEY="your-gemini-api-key"
OPENAI_API_KEY="your-openai-api-key" # Optional, for Stagehand
ANTHROPIC_API_KEY="your-anthropic-api-key" # Optional, for Stagehand and MCP
OPENROUTER_API_KEY="your-openrouter-api-key" # Optional, for MCP
GITHUB_TOKEN="your-github-token" # Optional, for enhanced GitHub access
```
* At least one of ANTHROPIC_API_KEY and OPENROUTER_API_KEY must be provided to use the mcp commands.
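For the manual setup, the key file can be created with restricted permissions so that the keys are readable only by your user. This is a sketch under stated assumptions: the helper name init_cursor_tools_env is hypothetical, and it writes only the two required keys as placeholders that you then edit by hand.

```bash
# Hypothetical helper: creates a key file with placeholder values if none
# exists yet. Defaults to the documented global location.
init_cursor_tools_env() (
  env_file="${1:-$HOME/.cursor-tools/.env}"
  [ -e "$env_file" ] && { echo "exists: $env_file"; exit 0; }
  mkdir -p "$(dirname "$env_file")"
  umask 077   # the new file is created with mode 600
  cat > "$env_file" <<'EOF'
PERPLEXITY_API_KEY="your-perplexity-api-key"
GEMINI_API_KEY="your-gemini-api-key"
EOF
  echo "created: $env_file"
)

# Usage:
#   init_cursor_tools_env                    # global file in ~/.cursor-tools/.env
#   init_cursor_tools_env .cursor-tools.env  # project-local file
```

An existing file is never overwritten, so the helper is safe to run again.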
cursor-tools supports multiple authentication methods for accessing the Google Gemini API, providing flexibility for different environments and security requirements. You can choose from the following methods:
1. API Key (Default)
   - This is the simplest method and continues to be supported for backward compatibility.
   - Set the GEMINI_API_KEY environment variable to your API key string obtained from Google AI Studio.
   - Example:
```env
GEMINI_API_KEY="your-api-key-here"
```
2. Service Account JSON Key File
   - For enhanced security, especially in production environments, use a service account JSON key file.
   - Set the GEMINI_API_KEY environment variable to the path of your downloaded service account JSON key file.
   - Example:
```env
GEMINI_API_KEY="./path/to/service-account.json"
```
   - This method enables access to the latest Gemini models available through Vertex AI, such as gemini-2.0-flash.
3. Application Default Credentials (ADC) (Recommended for Google Cloud Environments)
   - ADC is ideal when running cursor-tools within Google Cloud environments (e.g., Compute Engine, Kubernetes Engine) or for local development using gcloud.
   - Set the GEMINI_API_KEY environment variable to adc.
   - Example:
```env
GEMINI_API_KEY="adc"
```
   - Setup Instructions: First, authenticate locally using gcloud:
```bash
gcloud auth application-default login
```
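The three methods are all selected through the same GEMINI_API_KEY variable, so it can help to check which one a given value would select. The sketch below is illustrative only. The function name is hypothetical, and treating any value ending in .json as a key file path is an assumed heuristic, since this README does not say how cursor-tools tells a path from a key string.

```bash
# Hypothetical helper: reports which documented auth method a
# GEMINI_API_KEY value corresponds to.
gemini_auth_method() (
  case "${1:-}" in
    "")     echo "GEMINI_API_KEY is empty" >&2
            exit 1 ;;
    adc)    echo "application-default-credentials" ;;
    *.json) echo "service-account-key-file" ;;   # assumed heuristic: path ends in .json
    *)      echo "api-key" ;;
  esac
)

# Usage:
#   gemini_auth_method "$GEMINI_API_KEY"
```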
AI Team Features
Use Perplexity AI to get up-to-date information directly within Cursor:
```bash
cursor-tools web "What's new in TypeScript 5.7?"
```
Leverage Google Gemini 2.0 models with 1M+ token context windows for codebase-aware assistance and implementation planning:
```bash
# Get context-aware assistance
cursor-tools repo "Explain the authentication flow in this project, which files are involved?"

# Generate implementation plans
cursor-tools plan "Add user authentication to the login page"
```
The plan command uses multiple AI models to:
1. Identify relevant files in your codebase (using Gemini by default)
2. Extract content from those files
3. Generate a detailed implementation plan (using o3-mini by default)

Plan Command Options:
- --fileProvider=: Provider for file identification (gemini, openai, anthropic, perplexity, modelbox, or openrouter)
- --thinkingProvider=: Provider for plan generation (gemini, openai, anthropic, perplexity, modelbox, or openrouter)
- --fileModel=: Model to use for file identification
- --thinkingModel=: Model to use for plan generation
- --fileMaxTokens=: Maximum tokens for file identification
- --thinkingMaxTokens=: Maximum tokens for plan generation
- --debug: Show detailed error information

Repository context is created using Repomix. See the Repomix configuration section below for details on how to change Repomix behaviour.
Above 1M tokens cursor-tools will always send requests to Gemini 2.0 Pro as it is the only model that supports 1M+ tokens.
The Gemini 2.0 Pro context limit is 2M tokens. If your repomix context is above this limit, add filters to .repomixignore to reduce it.
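The two thresholds can be read as a simple decision on the size of the packed repository context. The sketch below only restates those limits. The function name context_route is hypothetical, the token count is assumed to be a whole number you obtained elsewhere, and the routing inside cursor-tools may differ in detail.

```bash
# Hypothetical helper: classifies a repository context size (in tokens)
# against the limits stated above.
context_route() (
  tokens="$1"
  if [ "$tokens" -gt 2000000 ]; then
    echo "too large: add filters to .repomixignore"
  elif [ "$tokens" -gt 1000000 ]; then
    echo "Gemini 2.0 Pro"
  else
    echo "configured model"
  fi
)

# Usage:
#   context_route 1500000
```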
Automate browser interactions for web scraping, testing, and debugging:

Important: The browser command requires the Playwright package to be installed separately in your project:
```bash
npm install playwright
# or
yarn add playwright
# or
pnpm add playwright
```
1. open - Open a URL and capture page content:
```bash
# Open and capture HTML content, console logs and network activity (enabled by default)
cursor-tools browser open "https://example.com" --html

# Take a screenshot
cursor-tools browser open "https://example.com" --screenshot=page.png

# Debug in an interactive browser session
cursor-tools browser open "https://example.com" --connect-to=9222
```
2. act - Execute actions using natural language - Agent tells the browser-use agent what to do:
```bash
# Single action
cursor-tools browser act "Login as 'user@example.com'" --url "https://example.com/login"

# Multi-step workflow using pipe separator
cursor-tools browser act "Click Login | Type 'user@example.com' into email | Click Submit" --url "https://example.com"

# Record interaction video
cursor-tools browser act "Fill out registration form" --url "https://example.com/signup" --video="./recordings"
```
3. observe - Analyze interactive elements:
```bash
# Get overview of interactive elements
cursor-tools browser observe "What can I interact with?" --url "https://example.com"

# Find specific elements
cursor-tools browser observe "Find the login form" --url "https://example.com"
```
4. extract - Extract data using natural language:
```bash
# Extract specific content
cursor-tools browser extract "Get all product prices" --url "https://example.com/products"

# Save extracted content
cursor-tools browser extract "Get article text" --url "https://example.com/blog" --html > article.html

# Extract with network monitoring
cursor-tools browser extract "Get API responses" --url "https://example.com/api-test" --network
```
#### Browser Command Options
All browser commands (open, act, observe, extract, agent) support these options:
- --console: Capture browser console logs (enabled by default, use --no-console to disable)
- --html: Capture page HTML content (disabled by default)
- --network: Capture network activity (enabled by default, use --no-network to disable)
- --screenshot=: Save a screenshot of the page
- --timeout=: Set navigation timeout (default: 120000ms for Stagehand operations, 30000ms for navigation)
- --viewport=: Set viewport size (e.g., 1280x720)
- --headless: Run browser in headless mode (default: true)
- --no-headless: Show browser UI (non-headless mode) for debugging
- --connect-to=: Connect to existing Chrome instance. Special values: 'current' (use existing page), 'reload-current' (refresh existing page)
- --wait=: Wait after page load (e.g., 'time:5s', 'selector:#element-id')
- --video=: Save a video recording (1280x720 resolution, timestamped subdirectory). Not available when using --connect-to
- --url=: Required for act, observe, extract, and agent commands
- --evaluate=: JavaScript code to execute in the browser before the main command

Additional options for the agent subcommand:
- --provider=: AI provider to use (openai, anthropic)
- --model=: Model to use for the agent:
  - For OpenAI: computer-use-preview-2025-03-11
  - For Anthropic: claude-3-5-sonnet-20240620 or claude-3-7-sonnet-20250219

Notes on connecting to an existing browser session with --connect-to:
- DO NOT ask browser act to "wait" for anything; the wait command is currently disabled in Stagehand.
- When using --connect-to, viewport is only changed if --viewport is explicitly provided
- Video recording is not available when using --connect-to
- Special --connect-to values:
  - current: Use the existing page without reloading
  - reload-current: Use the existing page and refresh it (useful in development)

#### Video Recording
All browser commands support video recording of the browser interaction in headless mode (not supported with --connect-to):
- Use --video= to enable recording
- Videos are saved at 1280x720 resolution in timestamped subdirectories
- Recording starts when the browser opens and ends when it closes
- Videos are saved as .webm files

Example:
```bash
# Record a video of filling out a form
cursor-tools browser act "Fill out registration form with name John Doe" --url "http://localhost:3000/signup" --video="./recordings"
```
#### Browser Agent
The browser agent subcommand provides autonomous browser operation for complex multi-step tasks:
```bash
# Execute an autonomous browser task with a single instruction
cursor-tools browser agent "Analyze the login page, fill out the form with test@example.com and password123, then submit it" --url "https://example.com/login"

# Browser agent with custom model
cursor-tools browser agent "Find and click on all broken image links" --url "https://example.com" --provider openai --model computer-use-preview-2025-03-11

# Record a video of the agent's work
cursor-tools browser agent "Complete the multi-page checkout process" --url "https://example.com/cart" --video="./recordings"
```
The browser agent:
- Makes decisions based on page content without requiring step-by-step instructions
- Handles unexpected situations and errors more robustly than act/extract commands
- Supports both OpenAI and Anthropic Computer Using Agent (CUA) models
- Works well for complex workflows that involve decision-making based on dynamic content
#### Console and Network Logging
Console logs and network activity are captured by default:
- Use --no-console to disable console logging
- Use --no-network to disable network logging
- Logs are displayed in the command output

#### Complex Actions
The act command supports chaining multiple actions using the pipe (|) separator:
```bash
# Login sequence with console/network logging (enabled by default)
cursor-tools browser act "Click Login | Type 'user@example.com' into email | Click Submit" --url "http://localhost:3000/login"

# Form filling with multiple fields
cursor-tools browser act "Select 'Mr' from title | Type 'John' into first name | Type 'Doe' into last name | Click Next" --url "http://localhost:3000/register"

# Record complex interaction
cursor-tools browser act "Fill form | Submit | Verify success" --url "http://localhost:3000/signup" --video="./recordings"
```
#### Troubleshooting Browser Commands
Common issues and solutions:
1. Element Not Found Errors
   - Use --no-headless to visually debug the page
   - Use browser observe to see what elements Stagehand can identify
   - Check if the element is in an iframe or shadow DOM
   - Ensure the page has fully loaded (try increasing --timeout)
2. Stagehand API Errors
   - Verify your OpenAI or Anthropic API key is set correctly
   - Check if you have sufficient API credits
   - Try switching models using --model
3. Network Errors
   - Check your internet connection
   - Verify the target website is accessible
   - Try increasing the timeout with --timeout
   - Check if the site blocks automated access
4. Video Recording Issues
   - Ensure the target directory exists and is writable
   - Check disk space
   - Video recording is not available with --connect-to
5. Performance Issues
   - Use --headless mode for better performance (default)
   - Reduce the viewport size with --viewport
   - Consider using --connect-to for development
Skills
Access GitHub issues and pull requests directly from the command line with rich formatting and full context:
```bash
# List recent PRs or issues
cursor-tools github pr
cursor-tools github issue

# View specific PR or issue with full discussion
cursor-tools github pr 123
cursor-tools github issue 456
```
The GitHub commands provide:
- View of 10 most recent open PRs or issues when no number specified
- Detailed view of specific PR/issue including:
  - PR/Issue description and metadata
  - Code review comments grouped by file (PRs only)
  - Full discussion thread
  - Labels, assignees, milestones and reviewers
- Support for both local repositories and remote GitHub repositories
- Markdown-formatted output for readability

Authentication Methods:
The commands support multiple authentication methods:
1. GitHub token via environment variable: GITHUB_TOKEN=your_token_here
2. GitHub CLI integration (if gh is installed and logged in)
3. Git credentials (stored tokens or Basic Auth)

Without authentication:
- Public repositories: Limited to 60 requests per hour
- Private repositories: Not accessible

With authentication:
- Public repositories: 5,000 requests per hour
- Private repositories: Full access (with appropriate token scopes)
Automate iOS app building, testing, and running in the simulator:
```bash
# Available subcommands
cursor-tools xcode build  # Build Xcode project and report errors
cursor-tools xcode run    # Build and run app in simulator
cursor-tools xcode lint   # Analyze code and offer to fix warnings
```
Build Command Options:
```bash
# Specify custom build path (derived data)
cursor-tools xcode build buildPath=/custom/build/path

# Specify target device
cursor-tools xcode build destination="platform=iOS Simulator,name=iPhone 15"
```
Run Command Options:
```bash
# Run on iPhone simulator (default)
cursor-tools xcode run iphone

# Run on iPad simulator
cursor-tools xcode run ipad

# Run on specific device with custom build path
cursor-tools xcode run device="iPhone 16 Pro" buildPath=/custom/build/path
```
The Xcode commands provide:
- Automatic project/workspace detection
- Dynamic app bundle identification
- Build output streaming with error parsing
- Simulator device management
- Support for both iPhone and iPad simulators
- Custom build path specification to control derived data location
Generate comprehensive documentation for your repository or any GitHub repository:
```bash
# Document local repository and save to file
cursor-tools doc --save-to=docs.md

# Document remote GitHub repository (both formats supported)
cursor-tools doc --from-github=username/repo-name@branch
cursor-tools doc --from-github=https://github.com/username/repo-name@branch

# Save documentation to file (with and without a hint)
# This is really useful to generate local documentation for libraries and dependencies
cursor-tools doc --from-github=eastlondoner/cursor-tools --save-to=docs/CURSOR-TOOLS.md
cursor-tools doc --from-github=eastlondoner/cursor-tools --save-to=docs/CURSOR-TOOLS.md --hint="only information about the doc command"
```
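To keep local documentation for several dependencies, the same two flags can be wrapped in a small helper and called once per repository. This is a sketch, not a cursor-tools feature: the function name doc_dependency and the CURSOR_TOOLS_BIN override are hypothetical, and the local-docs/ output folder follows the suggestion made earlier in this README.

```bash
# Hypothetical helper: documents one GitHub repository into local-docs/<repo>.md.
# CURSOR_TOOLS_BIN is an assumed override (not a cursor-tools setting) that
# allows a dry run, e.g. CURSOR_TOOLS_BIN=echo.
doc_dependency() (
  repo="$1"                          # e.g. eastlondoner/cursor-tools
  out="local-docs/${repo##*/}.md"    # keep only the part after the last slash
  mkdir -p local-docs
  "${CURSOR_TOOLS_BIN:-cursor-tools}" doc --from-github="$repo" --save-to="$out"
)

# Usage:
#   for repo in eastlondoner/cursor-tools; do doc_dependency "$repo"; done
```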
Configuration
Customize cursor-tools behavior by creating a cursor-tools.config.json file. This file can be created either globally in ~/.cursor-tools/cursor-tools.config.json or locally in your project root.

The cursor-tools.config.json file configures the local default behaviour for each command and provider.
Here is an example of a typical cursor-tools.config.json file, showing some of the most common configuration options:
```json
{
  // Commands
  "repo": {
    "provider": "openrouter",
    "model": "google/gemini-2.0-pro-exp-02-05:free"
  },
  "doc": {
    "provider": "openrouter",
    "model": "anthropic/claude-3.7-sonnet",
    "maxTokens": 4096
  },
  "web": {
    "provider": "gemini",
    "model": "gemini-2.0-pro-exp"
  },
  "plan": {
    "fileProvider": "gemini",
    "thinkingProvider": "perplexity",
    "thinkingModel": "r1-1776"
  },
  "browser": {
    "headless": false
  },
  //...

  // Providers
  "stagehand": {
    "model": "claude-3-7-sonnet-latest", // For Anthropic provider
    "provider": "anthropic", // or "openai"
    "timeout": 90000
  },
  "openai": {
    "model": "gpt-4o"
  },
  //...
}
```
For details of all configuration options and how to use them, see CONFIGURATION.md.
The GitHub commands support several authentication methods:
1. Environment Variable: Set GITHUB_TOKEN in your environment:
```env
GITHUB_TOKEN=your_token_here
```
2. GitHub CLI: If you have the GitHub CLI (gh) installed and are logged in, cursor-tools will automatically use it to generate tokens with the necessary scopes.
3. Git Credentials: If you have authenticated git with GitHub (via HTTPS), cursor-tools will automatically:
   - Use your stored GitHub token if available (credentials starting with ghp_ or gho_)
   - Fall back to using Basic Auth with your git credentials

To set up git credentials:
1. Configure git to use HTTPS instead of SSH:
```bash
git config --global url."https://github.com/".insteadOf git@github.com:
```
2. Store your credentials:
```bash
git config --global credential.helper store  # Permanent storage
# Or for macOS keychain:
git config --global credential.helper osxkeychain
```
3. The next time you perform a git operation requiring authentication, your credentials will be stored

Authentication Status:
- Without authentication:
  - Public repositories: Limited to 60 requests per hour
  - Private repositories: Not accessible
  - Some features may be restricted
- With authentication (any method):
  - Public repositories: 5,000 requests per hour
  - Private repositories: Full access (if token has required scopes)

cursor-tools will automatically try these authentication methods in order:
1. GITHUB_TOKEN environment variable
2. GitHub CLI token (if gh is installed and logged in)
3. Git credentials (stored token or Basic Auth)

If no authentication is available, it will fall back to unauthenticated access with rate limits.
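To see which of these methods is likely to be picked up on your machine, the order above can be approximated from the shell. This is a rough sketch rather than the logic cursor-tools uses: the function name is hypothetical, and checking for a configured credential helper is only an assumed stand-in for "stored git credentials".

```bash
# Hypothetical helper: reports the first documented auth method that looks
# available, checked in the same order as the list above.
github_auth_source() (
  if [ -n "${GITHUB_TOKEN:-}" ]; then
    echo "GITHUB_TOKEN environment variable"
  elif command -v gh >/dev/null 2>&1 && gh auth status >/dev/null 2>&1; then
    echo "GitHub CLI"
  elif [ -n "$(git config --get credential.helper 2>/dev/null)" ]; then
    echo "git credentials (helper configured)"   # assumed proxy for stored credentials
  else
    echo "none: unauthenticated access with rate limits"
  fi
)

# Usage:
#   github_auth_source
```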
When generating documentation, cursor-tools uses Repomix to analyze your repository. By default, it excludes certain files and directories that are typically not relevant for documentation:
- Node modules and package directories (node_modules/, packages/, etc.)
- Build output directories (dist/, build/, etc.)
- Version control directories (.git/)
- Test files and directories (test/, tests/, __tests__/, etc.)
- Configuration files (.env, .config, etc.)
- Log files and temporary files
- Binary files and media files

You can customize the files and folders to exclude by adding a .repomixignore file to your project root.

Example .repomixignore file for a Laravel project:
```
vendor/
public/
database/
storage/
.idea
.env
```
This ensures that the documentation focuses on your actual source code and documentation files.
Support to customize the input files to include is coming soon - open an issue if you run into problems here.
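If you adjust exclusions often, a small helper can add patterns to .repomixignore without creating duplicates. The following is a minimal sketch. The function name repomix_ignore is hypothetical, and it assumes you run it from the project root, where the file is expected to live.

```bash
# Hypothetical helper: appends patterns to .repomixignore in the current
# directory, skipping any pattern that is already listed.
repomix_ignore() (
  file=".repomixignore"
  touch "$file"
  for pattern in "$@"; do
    # -x matches the whole line, -F treats the pattern as a literal string
    grep -qxF -- "$pattern" "$file" || printf '%s\n' "$pattern" >> "$file"
  done
)

# Usage (from the project root):
#   repomix_ignore "vendor/" "storage/" ".env"
```

Running it twice with the same patterns leaves the file unchanged, so it is safe to call from setup scripts.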
#### Model Selection
The browser commands support different AI models for processing. You can select the model using the --model option:
```bash
# Use gpt-4o
cursor-tools browser act "Click Login" --url "https://example.com" --model=gpt-4o

# Use Claude 3.7 Sonnet
cursor-tools browser act "Click Login" --url "https://example.com" --model=claude-3-7-sonnet-latest
```