@arizeai/phoenix-cli

A command-line interface for Arize Phoenix. Fetch traces, list datasets, and export experiment results directly from your terminal—or pipe them into AI coding agents like Claude Code, Cursor, Codex, and Gemini CLI.

Installation

``bash npm install -g @arizeai/phoenix-cli`

Or run directly with npx:

`bash npx @arizeai/phoenix-cli`

`Quick Start`

`bash

`Configure your Phoenix instance`


export PHOENIX_HOST=http://localhost:6006
export PHOENIX_PROJECT=my-project
export PHOENIX_API_KEY=your-api-key  # if authentication is enabled
Fetch the most recent trace

px traces --limit 1
Fetch a specific trace by ID

px trace abc123def456
Export traces to a directory

px traces ./my-traces --limit 50


Environment Variables

| Variable | Description | | ------------------------ | ---------------------------------------------------- | |PHOENIX_HOST | Phoenix API endpoint (e.g., http://localhost:6006) | |PHOENIX_PROJECT| Project name or ID | |PHOENIX_API_KEY| API key for authentication (if required) | |PHOENIX_CLIENT_HEADERS | Custom headers as JSON string |

CLI flags take priority over environment variables.

`Commands`

`$3`

List all available projects.

`bash px projects px projects --format raw # JSON output for piping`

`$3`

Fetch recent traces from the configured project.

`bash px traces --limit 10 # Output to stdout px traces ./my-traces --limit 10 # Save to directory px traces --last-n-minutes 60 --limit 20 # Filter by time px traces --since 2026-01-13T10:00:00Z # Since timestamp px traces --format raw --no-progress | jq # Pipe to jq`

| Option | Description | Default | | --------------------------- | ----------------------------------------- | -------- | |[directory]| Save traces as JSON files to directory | stdout | |-n, --limit | Number of traces to fetch (newest first) | 10 | |--last-n-minutes | Only fetch traces from the last N minutes | — | |--since | Fetch traces since ISO timestamp | — | |--format | pretty, json, or raw | pretty| |--no-progress| Disable progress output | — | |--include-annotations | Include span annotations in trace export | — |

`$3`

Fetch a specific trace by ID.

`bash px trace abc123def456 px trace abc123def456 --file trace.json # Save to file px trace abc123def456 --format raw | jq # Pipe to jq`

| Option | Description | Default | | ----------------------- | ---------------------------------------- | -------- | |--file | Save to file instead of stdout | stdout | |--format | pretty, json, or raw | pretty| |--include-annotations | Include span annotations in trace export | — |

`$3`

List all available datasets.

`bash px datasets px datasets --format json # JSON output px datasets --format raw --no-progress | jq # Pipe to jq`

| Option | Description | Default | | ------------------- | -------------------------- | -------- | |--format | pretty, json, or raw | pretty| |--limit | Maximum number of datasets | — |

`$3`

Fetch examples from a dataset.

`bash px dataset query_response # Fetch all examples px dataset query_response --split train # Filter by split px dataset query_response --split train --split test # Multiple splits px dataset query_response --version # Specific version px dataset query_response --file dataset.json # Save to file px dataset query_response --format raw | jq '.examples[].input'`

| Option | Description | Default | | ---------------- | ---------------------------------------- | -------- | |--split | Filter by split (can be used repeatedly) | — | |--version | Fetch from specific dataset version | latest | |--file | Save to file instead of stdout | stdout | |--format | pretty, json, or raw | pretty |

`$3`

List experiments for a dataset, optionally exporting full data to files.

`bash px experiments --dataset my-dataset # List experiments px experiments --dataset my-dataset --format json # JSON output px experiments --dataset my-dataset ./experiments # Export to directory`

| Option | Description | Default | | ------------------------ | ----------------------------------------- | -------- | |--dataset | Dataset name or ID (required) | — | |[directory]| Export experiment JSON files to directory | stdout | |--format | pretty, json, or raw | pretty| |--limit | Maximum number of experiments | — |

`$3`

Fetch a single experiment with all run data.

`bash px experiment RXhwZXJpbWVudDox px experiment RXhwZXJpbWVudDox --file exp.json # Save to file px experiment RXhwZXJpbWVudDox --format json # JSON output`

| Option | Description | Default | | ------------------- | ------------------------------ | -------- | |--file | Save to file instead of stdout | stdout | |--format | pretty, json, or raw | pretty |

`Output Formats`

pretty (default) — Human-readable tree view:

`┌─ Trace: abc123def456 │ │ Input: What is the weather in San Francisco? │ Output: The weather is currently sunny... │ │ Spans: │ └─ ✓ agent_run (CHAIN) - 1250ms │ ├─ ✓ llm_call (LLM) - 800ms │ └─ ✓ tool_execution (TOOL) - 400ms └─`

json — Formatted JSON with indentation.

raw — Compact JSON for piping to jq or other tools.

`JSON Structure`

`json { "traceId": "abc123def456", "spans": [ { "name": "chat_completion", "context": { "trace_id": "abc123def456", "span_id": "span-1" }, "span_kind": "LLM", "parent_id": null, "start_time": "2026-01-17T10:00:00.000Z", "end_time": "2026-01-17T10:00:01.250Z", "status_code": "OK", "attributes": { "llm.model_name": "gpt-4", "llm.token_count.prompt": 512, "llm.token_count.completion": 256, "input.value": "What is the weather?", "output.value": "The weather is sunny..." } } ], "rootSpan": { ... }, "startTime": "2026-01-17T10:00:00.000Z", "endTime": "2026-01-17T10:00:01.250Z", "duration": 1250, "status": "OK" }`

Spans include OpenInference semantic attributes like llm.model_name, llm.token_count., input.value, output.value, tool.name, and exception..

`Examples`

`$3`

`bash px traces --limit 20 --format raw --no-progress | jq '.[] | select(.status == "ERROR")'`

`$3`

`bash px traces --limit 10 --format raw --no-progress | jq 'sort_by(-.duration) | .[0:3]'`

`$3`

`bash px traces --limit 50 --format raw --no-progress | \ jq -r '.[].spans[] | select(.span_kind == "LLM") | .attributes["llm.model_name"]' | sort -u`

`$3`

`bash px traces --limit 100 --format raw --no-progress | jq '[.[] | select(.status == "ERROR")] | length'`

`$3`

`bash

`List all datasets`


px datasets --format raw --no-progress | jq '.[].name'
Output: "query_response"
List experiments for a dataset

px experiments --dataset query_response --format raw --no-progress | \
  jq '.[] | {id, successful_run_count, failed_run_count}'
Output: {"id":"RXhwZXJpbWVudDox","successful_run_count":249,"failed_run_count":1}
Export all experiment data for a dataset to a directory

px experiments --dataset query_response ./experiments/

$3

`bash

`Get input queries and latency from an experiment`


px experiment RXhwZXJpbWVudDox --format raw --no-progress | \
  jq '.[] | {query: .input.query, latency_ms, trace_id}'
Find failed runs in an experiment

px experiment RXhwZXJpbWVudDox --format raw --no-progress | \
  jq '.[] | select(.error != null) | {query: .input.query, error}'
Output: {"query":"looking for complex fodmap meal ideas","error":"peer closed connection..."}
Calculate average latency across runs

px experiment RXhwZXJpbWVudDox --format raw --no-progress | \
  jq '[.[].latency_ms] | add / length'

---

Community

- 🌍 Slack community
- 📚 Documentation
- 🌟 GitHub
- 🐞 Report bugs
- 𝕏 @ArizePhoenix

@arizeai/phoenix-cli

Installation

``bash npm install -g @arizeai/phoenix-cli`

Or run directly with npx:

`bash npx @arizeai/phoenix-cli`

`Quick Start`

`bash

`Configure your Phoenix instance`


export PHOENIX_HOST=http://localhost:6006
export PHOENIX_PROJECT=my-project
export PHOENIX_API_KEY=your-api-key  # if authentication is enabled
Fetch the most recent trace

px traces --limit 1
Fetch a specific trace by ID

px trace abc123def456
Export traces to a directory

px traces ./my-traces --limit 50


Environment Variables

CLI flags take priority over environment variables.

`Commands`

`$3`

List all available projects.

`bash px projects px projects --format raw # JSON output for piping`

`$3`

Fetch recent traces from the configured project.

`$3`

Fetch a specific trace by ID.

`bash px trace abc123def456 px trace abc123def456 --file trace.json # Save to file px trace abc123def456 --format raw | jq # Pipe to jq`

`$3`

List all available datasets.

`bash px datasets px datasets --format json # JSON output px datasets --format raw --no-progress | jq # Pipe to jq`

`$3`

Fetch examples from a dataset.

`$3`

List experiments for a dataset, optionally exporting full data to files.

`bash px experiments --dataset my-dataset # List experiments px experiments --dataset my-dataset --format json # JSON output px experiments --dataset my-dataset ./experiments # Export to directory`

`$3`

Fetch a single experiment with all run data.

`bash px experiment RXhwZXJpbWVudDox px experiment RXhwZXJpbWVudDox --file exp.json # Save to file px experiment RXhwZXJpbWVudDox --format json # JSON output`

`Output Formats`

pretty (default) — Human-readable tree view:

json — Formatted JSON with indentation.

raw — Compact JSON for piping to jq or other tools.

`JSON Structure`

Spans include OpenInference semantic attributes like llm.model_name, llm.token_count., input.value, output.value, tool.name, and exception..

`Examples`

`$3`

`bash px traces --limit 20 --format raw --no-progress | jq '.[] | select(.status == "ERROR")'`

`$3`

`bash px traces --limit 10 --format raw --no-progress | jq 'sort_by(-.duration) | .[0:3]'`

`$3`

`bash px traces --limit 50 --format raw --no-progress | \ jq -r '.[].spans[] | select(.span_kind == "LLM") | .attributes["llm.model_name"]' | sort -u`

`$3`

`bash px traces --limit 100 --format raw --no-progress | jq '[.[] | select(.status == "ERROR")] | length'`

`$3`

`bash

`List all datasets`


px datasets --format raw --no-progress | jq '.[].name'
Output: "query_response"
List experiments for a dataset

px experiments --dataset query_response --format raw --no-progress | \
  jq '.[] | {id, successful_run_count, failed_run_count}'
Output: {"id":"RXhwZXJpbWVudDox","successful_run_count":249,"failed_run_count":1}
Export all experiment data for a dataset to a directory

px experiments --dataset query_response ./experiments/

$3

`bash

`Get input queries and latency from an experiment`


px experiment RXhwZXJpbWVudDox --format raw --no-progress | \
  jq '.[] | {query: .input.query, latency_ms, trace_id}'
Find failed runs in an experiment

px experiment RXhwZXJpbWVudDox --format raw --no-progress | \
  jq '.[] | select(.error != null) | {query: .input.query, error}'
Output: {"query":"looking for complex fodmap meal ideas","error":"peer closed connection..."}
Calculate average latency across runs

px experiment RXhwZXJpbWVudDox --format raw --no-progress | \
  jq '[.[].latency_ms] | add / length'

---

Community

- 🌍 Slack community
- 📚 Documentation
- 🌟 GitHub
- 🐞 Report bugs
- 𝕏 @ArizePhoenix