FlatAgents

Define LLM agents in YAML. Run them anywhere.

For LLM/machine readers: see MACHINES.md for comprehensive reference.

Why?

- Composition over inheritance — compose stateless agents and checkpointable machines
- Compact structure — easy for LLMs to read and generate
- Simple hook interfaces — escape hatches without complexity; webhook ready
- Inspectable — every agent and machine is readable config
- Language-agnostic — reduce code in any particular runtime
- Common TypeScript interface — single schema for agents, single schema for machines
- Limitations — machine topologies can get complex at scale

Inspired by Kubernetes manifests and character card specifications.

Versioning

All specs (flatagent.d.ts, flatmachine.d.ts, profiles.d.ts) and SDKs (Python, JS) use lockstep versioning. A single version number applies across the entire repository.

Core Concepts

Use machines to write flatagents and flatmachines, they are designed for LLMs.

| Term | What it is |
|------|------------|
| FlatAgent | A single LLM call: model + prompts + output schema |
| FlatMachine | A state machine that orchestrates multiple agents, actions, and state machines |

Use FlatAgent alone for simple tasks. Use FlatMachine when you need multi-step workflows, branching, or error handling.

Examples

| Example | What it demonstrates |
|---------|---------------------|
| helloworld | Minimal setup — single agent, single state machine |
| writer_critic | Multi-agent loop — writer drafts, critic reviews, iterates |
| story_writer | Multi-step creative workflow with chapter generation |
| human-in-the-loop | Pause execution for human approval via hooks |
| error_handling | Error recovery and retry patterns at state machine level |
| dynamic_agent | On-the-fly agent generation from runtime context |
| character_card | Loading agent config from character card format |
| mdap | MDAP voting execution — multi-sample consensus |
| gepa_self_optimizer | Self-optimizing prompts via reflection and critique |
| research_paper_analysis | Document analysis with structured extraction |
| multi_paper_synthesizer | Cross-document synthesis with dynamic machine launching |
| support_triage_json | JSON input/output with classification pipeline |
| parallelism | Parallel machines, dynamic foreach, fire-and-forget launches |

Quick Start

``bash pip install flatagents[all]`

`python from flatagents import FlatAgent

agent = FlatAgent(config_file="reviewer.yml") result = await agent.call(query="Review this code...") print(result.output)`

`Example Agent`

reviewer.yml`yaml spec: flatagent spec_version: "0.8.2"

data: name: code-reviewer

model: "smart-expensive" # Reference profile from profiles.yml

system: | You are a senior code reviewer. Analyze code for bugs, style issues, and potential improvements.

user: | Review this code: {{ input.code }}

output: issues: type: list items: type: str description: "List of issues found" rating: type: str enum: ["good", "needs_work", "critical"] description: "Overall code quality"`

What the fields mean:

- spec/spec_version — Format identifier and version - data.name — Agent identifier - data.model — Profile name, inline config, or profile with overrides - data.system — System prompt (sets behavior) - data.user — User prompt template (uses Jinja2,{{ input.* }}for runtime values) - data.output — Structured output schema (the runtime extracts these fields)

`Model Profiles`

Centralize model configurations in profiles.yml and reference them by name:

profiles.yml`yaml spec: flatprofiles spec_version: "0.8.2"

data: model_profiles: fast-cheap: provider: cerebras name: zai-glm-4.6 temperature: 0.6 max_tokens: 2048

smart-expensive: provider: anthropic name: claude-3-opus-20240229 temperature: 0.3 max_tokens: 4096

default: fast-cheap # Fallback when agent has no model # override: smart-expensive # Uncomment to force all agents`

Agent usage:`yaml

`String shorthand — profile lookup`


model: "fast-cheap"
Profile with overrides

model:
  profile: "fast-cheap"
  temperature: 0.9
Inline config (no profile)

model:
  provider: openai
  name: gpt-4
  temperature: 0.3


Resolution order (low → high): default profile → named profile → inline overrides → override profile
Output Types

`yaml output: answer: { type: str } count: { type: int } score: { type: float } valid: { type: bool } raw: { type: json } items: { type: list, items: { type: str } } metadata: { type: object, properties: { key: { type: str } } }`

Use enum: [...] to constrain string values.

`Multi-Agent Workflows`

For orchestration, use FlatMachine (full docs in MACHINES.md):

`python from flatagents import FlatMachine

machine = FlatMachine(config_file="workflow.yml") result = await machine.execute(input={"query": "..."})`

FlatMachine provides: state transitions, conditional branching, loops, retry with backoff, and error recovery—all in YAML.

`Features`

- Checkpoint and restore - Python SDK (TypeScript SDK in progress) - MACHINES.md — LLM-optimized reference docs - Decider agents and machines - On-the-fly agent and machine definitions - Webhook hooks for remote state machine handling - Metrics and logging - Error recovery and exception handling at the state machine level - Parallel machine execution (machine: [a, b, c]) - Dynamic parallelism withforeach- Fire-and-forget launches for background tasks

`Planned`

- Distributed execution — cross-network machine peering, inter-machine strategies - SQL persistence backend - TypeScript SDK -max_depthconfig to limit machine launch nesting - Checkpoint pruning to prevent storage explosion -$root/path prefix — resolve agent/machine refs from workspace root, not config dir - Input size validation — warn when prompt exceeds model context window - Serialization warnings — flag non-JSON-serializable context values before checkpoint

`Specs`

TypeScript definitions are the source of truth: -flatagent.d.ts-flatmachine.d.ts-profiles.d.ts

`Python SDK`

`bash pip install flatagents[litellm]`

`$3`

`python from flatagents import LiteLLMBackend, AISuiteBackend

`LiteLLM (default)`


agent = FlatAgent(config_file="agent.yml")
AISuite

backend = AISuiteBackend(model="openai:gpt-4o")
agent = FlatAgent(config_file="agent.yml", backend=backend)


$3
Extend machine behavior with Python hooks:

`python from flatagents import FlatMachine, MachineHooks

class CustomHooks(MachineHooks): def on_state_enter(self, state: str, context: dict) -> dict: context["entered_at"] = time.time() return context

def on_action(self, action: str, context: dict) -> dict: if action == "fetch_data": context["data"] = fetch_from_api() return context

machine = FlatMachine(config_file="machine.yml", hooks=CustomHooks())`

Available hooks: on_machine_start, on_machine_end, on_state_enter, on_state_exit, on_transition, on_error, on_action

`$3`

`yaml execution: type: retry # retry | parallel | mdap_voting backoffs: [2, 8, 16, 35] # Seconds between retries jitter: 0.1 # ±10% random variation`

| Type | Use Case | |------|----------| |default| Single call | |retry| Rate limit handling with backoff | |parallel | Multiple samples (n_samples) | |mdap_voting | Consensus voting (k_margin, max_candidates) |

`$3`

`python from flatagents import validate_flatagent_config, validate_flatmachine_config

warnings = validate_flatagent_config(config) warnings = validate_flatmachine_config(config)`

`$3`

`python from flatagents import setup_logging, get_logger

setup_logging(level="INFO") # Respects FLATAGENTS_LOG_LEVEL env var logger = get_logger(__name__)`

Env vars: FLATAGENTS_LOG_LEVEL (DEBUG/INFO/WARNING/ERROR), FLATAGENTS_LOG_FORMAT (standard/json/simple)

For OpenTelemetry metrics:

`bash pip install flatagents[metrics] export FLATAGENTS_METRICS_ENABLED=true`

Metrics are enabled by default and print to stdout every 5s. Redirect to file or use OTLP for production:

`bash

`Metrics print to stdout by default`


python your_script.py
Save to file

python your_script.py >> metrics.log 2>&1
Disable if needed

FLATAGENTS_METRICS_ENABLED=false python your_script.py
Send to OTLP collector for production

OTEL_METRICS_EXPORTER=otlp \
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 \
python your_script.py


Env vars for metrics:

| Variable | Default | Purpose | |----------|---------|---------| |FLATAGENTS_METRICS_ENABLED | true| Enable OpenTelemetry metrics | |OTEL_METRICS_EXPORTER | console | console (stdout) or otlp(production) | |OTEL_EXPORTER_OTLP_ENDPOINT| — | OTLP collector endpoint | |OTEL_METRIC_EXPORT_INTERVAL | 5000 / 60000| Export interval in ms (5s for console, 60s for otlp) | |OTEL_SERVICE_NAME | flatagents` | Service name in metrics |

FlatAgents

Define LLM agents in YAML. Run them anywhere.

For LLM/machine readers: see MACHINES.md for comprehensive reference.

Why?

Inspired by Kubernetes manifests and character card specifications.

Versioning

All specs (flatagent.d.ts, flatmachine.d.ts, profiles.d.ts) and SDKs (Python, JS) use lockstep versioning. A single version number applies across the entire repository.

Core Concepts

Use machines to write flatagents and flatmachines, they are designed for LLMs.

Use FlatAgent alone for simple tasks. Use FlatMachine when you need multi-step workflows, branching, or error handling.

Examples

Quick Start

``bash pip install flatagents[all]`

`python from flatagents import FlatAgent

agent = FlatAgent(config_file="reviewer.yml") result = await agent.call(query="Review this code...") print(result.output)`

`Example Agent`

reviewer.yml`yaml spec: flatagent spec_version: "0.8.2"

data: name: code-reviewer

model: "smart-expensive" # Reference profile from profiles.yml

system: | You are a senior code reviewer. Analyze code for bugs, style issues, and potential improvements.

user: | Review this code: {{ input.code }}

output: issues: type: list items: type: str description: "List of issues found" rating: type: str enum: ["good", "needs_work", "critical"] description: "Overall code quality"`

What the fields mean:

`Model Profiles`

Centralize model configurations in profiles.yml and reference them by name:

profiles.yml`yaml spec: flatprofiles spec_version: "0.8.2"

data: model_profiles: fast-cheap: provider: cerebras name: zai-glm-4.6 temperature: 0.6 max_tokens: 2048

smart-expensive: provider: anthropic name: claude-3-opus-20240229 temperature: 0.3 max_tokens: 4096

default: fast-cheap # Fallback when agent has no model # override: smart-expensive # Uncomment to force all agents`

Agent usage:`yaml

`String shorthand — profile lookup`


model: "fast-cheap"
Profile with overrides

model:
  profile: "fast-cheap"
  temperature: 0.9
Inline config (no profile)

model:
  provider: openai
  name: gpt-4
  temperature: 0.3


Resolution order (low → high): default profile → named profile → inline overrides → override profile
Output Types

Use enum: [...] to constrain string values.

`Multi-Agent Workflows`

For orchestration, use FlatMachine (full docs in MACHINES.md):

`python from flatagents import FlatMachine

machine = FlatMachine(config_file="workflow.yml") result = await machine.execute(input={"query": "..."})`

FlatMachine provides: state transitions, conditional branching, loops, retry with backoff, and error recovery—all in YAML.

`Features`

`Planned`

`Specs`

TypeScript definitions are the source of truth: -flatagent.d.ts-flatmachine.d.ts-profiles.d.ts

`Python SDK`

`bash pip install flatagents[litellm]`

`$3`

`python from flatagents import LiteLLMBackend, AISuiteBackend

`LiteLLM (default)`


agent = FlatAgent(config_file="agent.yml")
AISuite

backend = AISuiteBackend(model="openai:gpt-4o")
agent = FlatAgent(config_file="agent.yml", backend=backend)


$3
Extend machine behavior with Python hooks:

`python from flatagents import FlatMachine, MachineHooks

class CustomHooks(MachineHooks): def on_state_enter(self, state: str, context: dict) -> dict: context["entered_at"] = time.time() return context

def on_action(self, action: str, context: dict) -> dict: if action == "fetch_data": context["data"] = fetch_from_api() return context

machine = FlatMachine(config_file="machine.yml", hooks=CustomHooks())`

Available hooks: on_machine_start, on_machine_end, on_state_enter, on_state_exit, on_transition, on_error, on_action

`$3`

`yaml execution: type: retry # retry | parallel | mdap_voting backoffs: [2, 8, 16, 35] # Seconds between retries jitter: 0.1 # ±10% random variation`

`$3`

`python from flatagents import validate_flatagent_config, validate_flatmachine_config

warnings = validate_flatagent_config(config) warnings = validate_flatmachine_config(config)`

`$3`

`python from flatagents import setup_logging, get_logger

setup_logging(level="INFO") # Respects FLATAGENTS_LOG_LEVEL env var logger = get_logger(__name__)`

Env vars: FLATAGENTS_LOG_LEVEL (DEBUG/INFO/WARNING/ERROR), FLATAGENTS_LOG_FORMAT (standard/json/simple)

For OpenTelemetry metrics:

`bash pip install flatagents[metrics] export FLATAGENTS_METRICS_ENABLED=true`

Metrics are enabled by default and print to stdout every 5s. Redirect to file or use OTLP for production:

`bash

`Metrics print to stdout by default`


python your_script.py
Save to file

python your_script.py >> metrics.log 2>&1
Disable if needed

FLATAGENTS_METRICS_ENABLED=false python your_script.py
Send to OTLP collector for production

OTEL_METRICS_EXPORTER=otlp \
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 \
python your_script.py


Env vars for metrics:

@memgrafter/flatagents

FlatAgents

Why?

Versioning

Core Concepts

Examples

Quick Start

Example Agent

Model Profiles

String shorthand — profile lookup

Profile with overrides

Inline config (no profile)

Output Types

Multi-Agent Workflows

Features

Planned

Specs

Python SDK

$3

LiteLLM (default)

AISuite

$3

$3

$3

$3

Metrics print to stdout by default

Save to file

Disable if needed

Send to OTLP collector for production

@memgrafter/flatagents

FlatAgents

Why?

Versioning

Core Concepts

Examples

Quick Start

Example Agent

Model Profiles

String shorthand — profile lookup

Profile with overrides

Inline config (no profile)

Output Types

Multi-Agent Workflows

Features

Planned

Specs

Python SDK

$3

LiteLLM (default)

AISuite

$3

$3

$3

$3

Metrics print to stdout by default

Save to file

Disable if needed

Send to OTLP collector for production

`Example Agent`

`Model Profiles`

`String shorthand — profile lookup`

`Multi-Agent Workflows`

`Features`

`Planned`

`Specs`

`Python SDK`

`$3`

`LiteLLM (default)`

`$3`

`$3`

`$3`

`Metrics print to stdout by default`

`Example Agent`

`Model Profiles`

`String shorthand — profile lookup`

`Multi-Agent Workflows`

`Features`

`Planned`

`Specs`

`Python SDK`

`$3`

`LiteLLM (default)`

`$3`

`$3`

`$3`

`Metrics print to stdout by default`