# Databricks node for n8n
```bash
npm install n8n-nodes-databricks
```

This is an n8n community node that provides comprehensive integration with Databricks APIs, including Genie AI, SQL, Unity Catalog, Model Serving, Files, and Vector Search capabilities.

## Features
- 🤖 Genie AI Assistant: Start conversations, send messages, and execute SQL queries through Databricks' AI assistant
- 🤖 AI Agent with MLflow Tracking: Full-featured AI agent with optional MLflow observability for comprehensive tracing
- 📁 File Operations: Upload, download, list, and manage files in Databricks volumes (up to 5 GiB)
- 🗄️ Databricks SQL: Execute SQL queries and manage statements
- 📚 Unity Catalog: Manage catalogs, schemas, tables, and volumes
- 🤖 Model Serving: Query AI models and manage endpoints
- 🔍 Vector Search: Perform vector similarity searches
## Prerequisites

You need the following installed on your development machine:
* git
* Node.js and pnpm. Minimum version Node 18. You can find instructions on how to install both using nvm (Node Version Manager) for Linux, Mac, and WSL here. For Windows users, refer to Microsoft's guide to Install NodeJS on Windows.
* Install n8n with:

  ```bash
  pnpm install n8n -g
  ```

* A Databricks workspace with a personal access token
## Installation

1. Clone this repository:

   ```bash
   git clone https://github.com/
   cd n8n-nodes-databricks
   ```

2. Install dependencies:

   ```bash
   pnpm install
   ```

3. Build the node:

   ```bash
   pnpm build
   ```

4. Link to your n8n installation:

   ```bash
   npm link
   cd ~/.n8n/custom
   npm link n8n-nodes-databricks
   ```
Alternatively, install it as a community package:

```bash
npm install n8n-nodes-databricks
```
## Credentials

To use this node, you need to configure Databricks credentials:
1. Host: Your Databricks workspace URL (e.g., https://adb-1234567890123456.7.azuredatabricks.net)
2. Token: Your Databricks personal access token
To generate a token:
1. Log into your Databricks workspace
2. Go to User Settings → Access Tokens
3. Click "Generate New Token"
4. Copy and save the token securely
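To sanity-check a host/token pair outside n8n, a minimal Node 18+ (ESM) script like the following works; the SCIM `Me` endpoint is just a convenient authenticated GET, not something this node itself calls:

```typescript
// Quick credential check (Node 18+, run as an ES module).
// DATABRICKS_HOST / DATABRICKS_TOKEN are placeholders for your own values.
const host = process.env.DATABRICKS_HOST!;   // e.g. https://adb-1234567890123456.7.azuredatabricks.net
const token = process.env.DATABRICKS_TOKEN!; // personal access token

const res = await fetch(`${host}/api/2.0/preview/scim/v2/Me`, {
  headers: { Authorization: `Bearer ${token}` },
});
console.log(res.ok ? 'Credentials OK' : `Auth failed: HTTP ${res.status}`);
```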
## Project Structure

```
n8n-nodes-databricks/
│
├── 🎯 Main Node Entry Point
│   └── nodes/Databricks/Databricks.node.ts
│       ├── Class: Databricks (implements INodeType)
│       ├── Properties:
│       │   ├── displayName: 'Databricks'
│       │   ├── version: 1
│       │   ├── usableAsTool: true (can be used as an AI agent tool)
│       │   └── requestDefaults: { baseURL, Authorization }
│       │
│       ├── Node Configuration:
│       │   ├── Resource selector (dropdown):
│       │   │   ├── Genie (AI Assistant)
│       │   │   ├── Databricks SQL
│       │   │   ├── Unity Catalog
│       │   │   ├── Model Serving
│       │   │   ├── Files
│       │   │   └── Vector Search
│       │   │
│       │   ├── Operations (per resource)
│       │   └── Parameters (per resource)
│       │
│       └── Execute Method:
│           ├── Process each input item
│           ├── Handle special cases (custom logic)
│           └── Error handling with continueOnFail support
│
├── 📚 Resource Definitions
│   └── nodes/Databricks/resources/
│       ├── index.ts (exports all operations & parameters)
│       │
│       ├── 🤖 genie/
│       │   ├── operations.ts
│       │   │   └── Operations: [6 operations]
│       │   │       ├── startConversation
│       │   │       ├── createMessage
│       │   │       ├── getMessage
│       │   │       ├── executeMessageQuery
│       │   │       ├── getQueryResults
│       │   │       └── getSpace
│       │   │
│       │   └── parameters.ts
│       │       └── Parameters: spaceId, conversationId, messageId, etc.
│       │
│       ├── 📁 files/
│       │   ├── operations.ts
│       │   │   └── Operations: [7 operations]
│       │   │       ├── uploadFile (PUT binary data)
│       │   │       ├── downloadFile (GET file content)
│       │   │       ├── deleteFile (DELETE)
│       │   │       ├── getFileInfo (HEAD metadata)
│       │   │       ├── listDirectory (GET directory contents)
│       │   │       ├── createDirectory (PUT)
│       │   │       └── deleteDirectory (DELETE)
│       │   │
│       │   └── parameters.ts
│       │
│       ├── 🗄️ databricksSql/
│       ├── 📚 unityCatalog/
│       ├── 🤖 modelServing/
│       └── 🔍 vectorSearch/
│
├── 🤖 AI Agent Node
│   └── nodes/agents/DatabricksAiAgent/
│       ├── DatabricksAiAgent.node.ts (node definition)
│       ├── execute.ts (agent execution with MLflow)
│       ├── CallbackHandler.ts (MLflow tracing)
│       ├── description.ts (node properties)
│       ├── utils.ts (input configuration)
│       └── src/
│           ├── constants.ts (MLflow constants)
│           ├── types/ (TypeScript types)
│           └── utils/ (helper functions)
│
├── 🔐 Credentials
│   └── credentials/Databricks.credentials.ts
│       └── DatabricksCredentials interface:
│           ├── host: string (Databricks workspace URL)
│           └── token: string (Personal access token)
│
└── 🎨 Assets
    ├── databricks.svg (light mode icon)
    └── databricks.dark.svg (dark mode icon)
```
## Execution Flow

```
User Input (n8n workflow)
    ↓
1. User selects RESOURCE (e.g., "Genie")
    ↓
2. User selects OPERATION (e.g., "Start Conversation")
    ↓
3. UI displays relevant PARAMETERS (using displayOptions.show)
    ↓
4. User fills in parameters (spaceId, initialMessage, etc.)
    ↓
5. Execute method is called
    ↓
6. Two execution paths:
    ↓
├── Path A: Declarative Routing (most operations)
│   ├── n8n uses 'routing' config from operations.ts
│   ├── Automatically builds HTTP request
│   ├── Substitutes parameters using {{$parameter.xxx}}
│   └── Sends request with credentials from requestDefaults
│
└── Path B: Custom Logic (special cases)
    ├── Files.uploadFile → Custom binary data handling
    └── Genie operations → Custom switch statement
        ├── Build URL dynamically
        ├── Create request body
        ├── Call this.helpers.httpRequest()
        └── Return response
    ↓
7. Return INodeExecutionData[][]
```
## Key Design Patterns

#### 1. Resource-Based Organization
Each Databricks API category is a separate "resource" with its own operations and parameters.
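For illustration, the resource selector plausibly looks like the sketch below; the display names come from the list above, while the `value` strings and default are assumptions, not copied from the source:

```typescript
import type { INodeProperties } from 'n8n-workflow';

// Illustrative resource selector; 'value' strings are assumed, not from the repo.
const resourceSelector: INodeProperties = {
  displayName: 'Resource',
  name: 'resource',
  type: 'options',
  noDataExpression: true,
  options: [
    { name: 'Genie', value: 'genie' },
    { name: 'Databricks SQL', value: 'databricksSql' },
    { name: 'Unity Catalog', value: 'unityCatalog' },
    { name: 'Model Serving', value: 'modelServing' },
    { name: 'Files', value: 'files' },
    { name: 'Vector Search', value: 'vectorSearch' },
  ],
  default: 'genie',
};
```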
#### 2. Declarative Routing
Most operations use n8n's declarative routing configuration:

```typescript
routing: {
  request: {
    method: 'POST',
    url: '=/api/2.0/genie/spaces/{{$parameter.spaceId}}/conversations',
    body: {
      initial_message: '={{$parameter.initialMessage}}'
    }
  }
}
```
#### 3. Conditional Parameter Display
Parameters appear/hide based on selected resource and operation:
```typescript
displayOptions: {
  show: {
    resource: ['genie'],
    operation: ['startConversation']
  }
}
```
#### 4. Two Execution Modes
- Declarative: n8n handles HTTP requests automatically (most operations)
- Imperative: Custom logic in the execute() method (file upload, Genie operations); see the sketch below
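A rough sketch of the imperative path, assuming it runs inside the per-item loop of `execute()` (with `i`, `returnData`, and resolved `credentials` in scope; parameter names follow the examples in this README):

```typescript
// Sketch only: Path B for a Genie operation inside execute()'s item loop.
const resource = this.getNodeParameter('resource', i) as string;
const operation = this.getNodeParameter('operation', i) as string;

if (resource === 'genie' && operation === 'startConversation') {
  const spaceId = this.getNodeParameter('spaceId', i) as string;

  // Build the URL dynamically, create the request body, and call the HTTP helper.
  const response = await this.helpers.httpRequest({
    method: 'POST',
    url: `${credentials.host}/api/2.0/genie/spaces/${spaceId}/conversations`,
    headers: { Authorization: `Bearer ${credentials.token}` },
    body: { initial_message: this.getNodeParameter('initialMessage', i) },
    json: true,
  });
  returnData.push({ json: response });
}
```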
#### 5. Error Handling
Comprehensive error handling with three types:
- API Errors: Status code + error details
- Network Errors: Connection failures
- Other Errors: General exceptions
All support continueOnFail mode for resilient workflows.
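In code, this is the conventional n8n pattern inside the per-item loop (a sketch, not the exact implementation):

```typescript
// Per-item error handling with continueOnFail support.
try {
  // ... perform the API call for item i ...
} catch (error) {
  if (this.continueOnFail()) {
    // Surface the error as item output and keep processing the remaining items.
    returnData.push({ json: { error: (error as Error).message }, pairedItem: { item: i } });
    continue;
  }
  throw error;
}
```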
## AI Agent Execution Flow

```
User Input → Databricks AI Agent Node
    ↓
1. Load Configuration
   ├─ Get Chat Model (required)
   ├─ Get Tools (optional)
   ├─ Get Memory (optional)
   ├─ Get Output Parser (optional)
   └─ Check MLflow enabled
    ↓
2. MLflow Setup (if enabled)
   ├─ Validate Databricks credentials
   ├─ Get/Create experiment: /Shared/n8n-workflows-{workflow-id}
   └─ Initialize MLflow CallbackHandler
    ↓
3. Agent Execution
   ├─ Create LangChain ToolCallingAgent
   ├─ Setup fallback model (if configured)
   └─ Execute with streaming or standard mode
    ↓
4. Processing Loop (for each iteration)
   ├─ LLM Call → MLflow CHAT_MODEL span
   ├─ Tool Calls → MLflow TOOL spans
   └─ Continue until final answer
    ↓
5. MLflow Tracing (if enabled)
   ├─ Log AGENT span with full execution
   ├─ Record token usage and latency
   └─ Capture all intermediate steps
    ↓
6. Return Result
   └─ Output text/structured data to n8n workflow
```

## Development

Build:

```bash
pnpm build
```

Lint:

```bash
pnpm lint
```

Or auto-fix:

```bash
pnpm lintfix
```

Test:

```bash
pnpm test
```
## Databricks AI Agent

The Databricks AI Agent node provides a full-featured AI agent built on LangChain's ToolCallingAgent with optional MLflow observability for comprehensive tracing of your agent's reasoning, tool usage, and LLM interactions.
#### AI Agent Capabilities
- Tool Calling - Supports any LangChain tool or MCP toolkit
- Memory - Conversation history with BaseChatMemory
- Structured Output - Optional output parser for validated JSON responses
- Streaming - Real-time token streaming support
- Fallback Models - Automatic failover to secondary model
- Binary Images - Automatic passthrough of images to vision models
#### MLflow Observability (Optional)
- Toggle On/Off - Enable MLflow logging with a simple checkbox
- Automatic Tracing - Creates MLflow spans for every step when enabled
- Span Types:
  - AGENT - Overall agent execution
  - CHAT_MODEL - LLM calls with token usage
  - TOOL - Tool invocations with arguments and results
  - RETRIEVER - Vector store retrievals (if used)
- Metrics - Latency, token counts, model info
- Tags & Metadata - Full context for filtering and analysis
#### Enabling MLflow (Optional)
MLflow logging is disabled by default. To enable it:
1. Add the Databricks AI Agent node to your workflow
2. Toggle "Enable MLflow Tracking" to ON
3. Configure Databricks credentials (credential selector appears when enabled)
4. The node will automatically use your workflow ID as the experiment name
#### MLflow Experiment Management (Automatic)
When MLflow tracking is enabled, the node automatically manages experiments:
- Experiment Name: Automatically set to /Shared/n8n-workflows-{workflow-id}
- Auto-Creation: If the experiment doesn't exist, it's created automatically
- Auto-Reuse: If the experiment exists, it's reused automatically
- One Workflow = One Experiment: Each n8n workflow gets its own dedicated MLflow experiment
- Shared Workspace: Experiments are created in /Shared/ for team accessibility
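This get-or-create behavior maps onto two stock MLflow 2.0 REST endpoints; a sketch under that assumption (the repo's actual code may differ):

```typescript
// Get-or-create an MLflow experiment via the Databricks MLflow REST API.
async function getOrCreateExperiment(host: string, token: string, workflowId: string): Promise<string> {
  const name = `/Shared/n8n-workflows-${workflowId}`;
  const headers = { Authorization: `Bearer ${token}`, 'Content-Type': 'application/json' };

  // Reuse the experiment if it already exists...
  const get = await fetch(
    `${host}/api/2.0/mlflow/experiments/get-by-name?experiment_name=${encodeURIComponent(name)}`,
    { headers },
  );
  if (get.ok) {
    const { experiment } = await get.json();
    return experiment.experiment_id;
  }

  // ...otherwise create it.
  const create = await fetch(`${host}/api/2.0/mlflow/experiments/create`, {
    method: 'POST',
    headers,
    body: JSON.stringify({ name }),
  });
  const { experiment_id } = await create.json();
  return experiment_id;
}
```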
#### Basic Agent Setup
1. Add Agent Node - Drag "Databricks AI Agent" to your workflow
2. Connect Chat Model - Add OpenAI, Databricks, or compatible model
3. Connect Tools (optional) - Add n8n tools or MCP clients
4. Connect Memory (optional) - Add chat memory for conversations
5. Configure Input - Map user message to the agent
#### Node Inputs
The node accepts these connections:
- Chat Model (required) - The LLM to use
- Tools (optional) - Zero or more tools the agent can call
- Memory (optional) - For conversation history
- Output Parser (optional) - For structured JSON validation
#### MLflow Trace Details

Every agent execution creates a trace in Databricks MLflow with:
- Agent Span - Overall execution with messages and system prompt
- Chat Model Spans - Each LLM call with:
- Input messages
- Model parameters (temperature, max_tokens, etc.)
- Response with token usage
- Latency metrics
- Tool Spans - Each tool invocation with:
- Tool name and description
- Input arguments
- Output results
- Execution time
Metrics captured per trace:
- Total latency
- Total cost
- Total tokens (input + output)
- LLM calls count
- Tool calls count
#### Quick Start

1. Add the Databricks AI Agent node to your workflow
2. Connect a Chat Model node (e.g., OpenAI or Databricks Chat Model)
3. (Optional) Connect Tools - Add any n8n tools you want the agent to use
4. (Optional) Connect Memory - Add chat memory for conversation history
5. Toggle "Enable MLflow Tracking" to ON
6. Select your Databricks credentials
7. Configure the input prompt
8. Run the workflow - traces will appear in MLflow under /Shared/n8n-workflows-{workflow-id}
## Usage Examples

#### Example: Start a Genie Conversation

1. Add the Databricks node to your workflow
2. Select Resource: Genie
3. Select Operation: Start Conversation
4. Enter your Space ID
5. Enter your Initial Message: "Show me sales data for last quarter"
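For reference, the underlying HTTP call (mirroring the routing example earlier in this README; `host`, `token`, and `spaceId` as in the credential sketch above) is roughly:

```typescript
// Roughly what the node sends for Start Conversation.
const res = await fetch(`${host}/api/2.0/genie/spaces/${spaceId}/conversations`, {
  method: 'POST',
  headers: { Authorization: `Bearer ${token}`, 'Content-Type': 'application/json' },
  body: JSON.stringify({ initial_message: 'Show me sales data for last quarter' }),
});
const conversation = await res.json();
```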
#### Example: Upload a File to a Volume

1. Add the Databricks node after a node that provides binary data
2. Select Resource: Files
3. Select Operation: Upload File
4. Configure:
   - Data Field Name: data
   - Catalog: main
   - Schema: default
   - Volume: my_volume
   - Path: reports/report.pdf
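Those values resolve to a Unity Catalog volume path; the equivalent direct call against the Databricks Files API would look roughly like this (the node's internal handling of n8n binary data differs):

```typescript
import { readFile } from 'node:fs/promises';

// PUT raw bytes to /Volumes/{catalog}/{schema}/{volume}/{path} via the Files API.
const volumePath = '/Volumes/main/default/my_volume/reports/report.pdf';
const bytes = await readFile('report.pdf');

await fetch(`${host}/api/2.0/fs/files${volumePath}?overwrite=true`, {
  method: 'PUT',
  headers: { Authorization: `Bearer ${token}`, 'Content-Type': 'application/octet-stream' },
  body: bytes,
});
```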
#### Example: Query a Vector Search Index

1. Add the Databricks node to your workflow
2. Select Resource: Vector Search
3. Select Operation: Query Index
4. Configure your query parameters
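Under the hood this targets the Vector Search query endpoint; in the sketch below the index name, columns, and query text are made-up examples:

```typescript
// Query a Vector Search index directly; replace the index name and columns with yours.
const res = await fetch(
  `${host}/api/2.0/vector-search/indexes/my_catalog.my_schema.my_index/query`,
  {
    method: 'POST',
    headers: { Authorization: `Bearer ${token}`, 'Content-Type': 'application/json' },
    body: JSON.stringify({
      columns: ['id', 'text'],
      query_text: 'quarterly sales summary',
      num_results: 5,
    }),
  },
);
const matches = await res.json();
```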
## Extending the Node

To extend this node with new operations:

1. Navigate to the appropriate resource folder in nodes/Databricks/resources/
2. Add the new operation to operations.ts:

   ```typescript
   {
     name: 'My New Operation',
     value: 'myNewOperation',
     description: 'Description of what it does',
     action: 'Perform my new operation',
     routing: {
       request: {
         method: 'GET',
         url: '=/api/2.0/path/{{$parameter.id}}'
       }
     }
   }
   ```

3. Add required parameters to parameters.ts
4. Rebuild and test
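For step 3, a matching entry in parameters.ts might look like the sketch below; the `id` parameter pairs with `{{$parameter.id}}` in the routing URL above, and the resource value is a placeholder:

```typescript
import type { INodeProperties } from 'n8n-workflow';

// Parameter backing {{$parameter.id}} in the routing URL above.
const myNewOperationId: INodeProperties = {
  displayName: 'ID',
  name: 'id',
  type: 'string',
  required: true,
  default: '',
  description: 'Identifier passed to /api/2.0/path/{id}',
  displayOptions: {
    show: {
      resource: ['myResource'],      // placeholder: your resource's value
      operation: ['myNewOperation'],
    },
  },
};
```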
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
For issues, questions, or contributions, please visit the GitHub repository.
## Resources

- n8n Documentation
- Databricks API Documentation
- n8n Community