# SignalK Parquet Data Store

A comprehensive SignalK plugin and webapp that saves SignalK data directly to Parquet files, with manual and automated regimen-based archiving and advanced querying features, including a REST API built on the SignalK History API, Claude AI analysis of historical data, and spatial (geographic) analysis capabilities.
#### Benefits of Proper Data Types
Using correct data types in Parquet files provides significant advantages:
- Storage Efficiency: Numeric data stored as DOUBLE uses ~50% less space than string representations
- Query Performance: Native numeric operations are 5-10x faster than string parsing during analysis
- Data Integrity: Type validation prevents data corruption and ensures consistent analysis results
- Analytics Compatibility: Proper types enable advanced statistical analysis and machine learning applications
- Compression: Parquet's columnar compression works optimally with correctly typed data
#### Validation Process
The validation system checks each Parquet file for:
- Field Type Consistency: Ensures numeric marine data (position, speed, depth) is stored as DOUBLE
- Boolean Representation: Validates true/false values are stored as BOOLEAN, not strings
- Metadata Alignment: Compares file schemas against SignalK metadata for units like meters, volts, amperes
- Schema Standards: Enforces data best practices for long-term data integrity
- Unit conversion through the `signalk-units-preference` plugin
- Add `?convertUnits=true` to any history query
- Add `?convertTimesToLocal=true` to convert timestamps to local time
- Add `&timezone=America/New_York` for a custom IANA timezone
- Timestamps carry the zone offset (e.g. `2025-10-20T12:34:04-04:00`)
- Wildcard vessel contexts (`vessels.*`) with MMSI-based exclusion filtering
- A matching threshold sets the command to its `activateOnMatch` state (ON/OFF). False evaluations leave the command untouched, so use a second threshold if you want a different level to switch it back.
- Without the `signalk-units-preference` plugin, `convertUnits=true` will have no effect

```bash
# Navigate to folder
cd ~/.signalk/node_modules/

# Install from npm (after publishing)
npm install signalk-parquet

# Or install from GitHub
npm install motamman/signalk-parquet

cd ~/.signalk/node_modules/signalk-parquet
npm run build

# Restart SignalK
sudo systemctl restart signalk
```

## ⚠️ IMPORTANT IF UPGRADING FROM 0.5.0-beta.3: Consolidation Bug Fix

THIS FIXES A RECURSIVE BUG THAT WAS CREATING NESTED PROCESSED DIRECTORIES AND REPEATEDLY PROCESSING THE SAME FILES. THIS SHOULD FIX THAT PROBLEM, BUT ANY `processed` FOLDERS NESTED INSIDE A `processed` FOLDER SHOULD BE MANUALLY DELETED.

No action is likely needed if upgrading from 0.5.0-beta.4 or later. If you're upgrading from an earlier version, you may have nested `processed` directories that need cleanup:
```bash
# Check for nested processed directories
find data -name "processed" -type d | head -20

# See the deepest nesting levels
find data -name "processed" -type d | awk -F'/' '{print NF-1, $0}' | sort -nr | head -5

# Count files in nested processed directories
find data -path "*/processed/processed/*" -type f | wc -l

# Remove ALL nested processed directories (RECOMMENDED)
find data -name "processed" -type d -exec rm -rf {} +

# Verify cleanup completed
find data -path "*/processed/processed/*" -type f | wc -l  # Should show 0
```

Note: The `processed` directories only contain files that were moved during consolidation - removing them does not delete your original data.
```bash
# Clone or copy the signalk-parquet directory
cd signalk-parquet

# Install dependencies
npm install

# Build the TypeScript code
npm run build

# Copy to SignalK plugins directory
cp -r . ~/.signalk/node_modules/signalk-parquet/

# Restart SignalK
sudo systemctl restart signalk
```
```bash
# Build for production
npm run build

# The compiled JavaScript will be in the dist/ directory
```

## Configuration
Navigate to SignalK Admin → Server → Plugin Config → SignalK Parquet Data Store

Configure basic plugin settings (path configuration is managed separately in the web interface):

| Setting | Description | Default |
|---------|-------------|---------|
| Buffer Size | Number of records to buffer before writing | 1000 |
| Save Interval | How often to save buffered data (seconds) | 30 |
| Output Directory | Directory to save data files | SignalK data directory |
| Filename Prefix | Prefix for generated filenames | `signalk_data` |
| File Format | Output format (parquet, json, csv) | `parquet` |
| Retention Days | Days to keep processed files | 7 |
| Unit Conversion Cache Duration | How long to cache unit conversions before reloading (minutes) | 5 |

> Note: The Unit Conversion Cache Duration setting controls how quickly changes to unit preferences (in the `signalk-units-preference` plugin) are reflected in the history API. Lower values (1-2 minutes) reflect changes faster but use more resources. Higher values (30-60 minutes) reduce overhead but take longer to reflect changes. The default of 5 minutes provides a good balance for most users.
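The trade-off behind the cache duration can be pictured with a small time-limited cache. This is only a sketch of the general technique, not the plugin's code: `TtlCache`, its constructor arguments, and the injectable clock are all hypothetical names introduced for illustration.

```typescript
// Hypothetical sketch of a cache that reloads its value after a fixed duration.
class TtlCache<T> {
  private readonly ttlMs: number;
  private readonly load: () => T;
  private readonly now: () => number;
  private value: T | undefined = undefined;
  private loadedAt = 0;
  private loaded = false;

  constructor(ttlMinutes: number, load: () => T, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMinutes * 60000;
    this.load = load;
    this.now = now; // injectable clock, convenient for testing
  }

  get(): T {
    const expired = !this.loaded || this.now() - this.loadedAt >= this.ttlMs;
    if (expired) {
      // Reload from the source (for example, the unit preferences) and restart the timer.
      this.value = this.load();
      this.loadedAt = this.now();
      this.loaded = true;
    }
    return this.value as T;
  }
}

// Usage: with a 5-minute duration, a preference change becomes visible
// at most 5 minutes after the previous reload.
const unitPrefsCache = new TtlCache<Record<string, string>>(5, () => ({ speed: "knots" }));
const currentUnitPrefs = unitPrefsCache.get();
```

A shorter duration means more reloads (more overhead) but fresher preferences, which is the balance the note above describes.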
Configure S3 upload settings in the plugin configuration:
| Setting | Description | Default |
|---------|-------------|---------|
| Enable S3 Upload | Enable uploading to Amazon S3 | `false` |
| Upload Timing | When to upload (realtime/consolidation) | consolidation |
| S3 Bucket | Name of S3 bucket | - |
| AWS Region | AWS region for S3 bucket | us-east-1 |
| Key Prefix | S3 object key prefix | - |
| Access Key ID | AWS credentials (optional) | - |
| Secret Access Key | AWS credentials (optional) | - |
| Delete After Upload | Delete local files after upload | false |
Configure Claude AI integration in the plugin configuration for advanced data analysis:
| Setting | Description | Default |
|---------|-------------|---------|
| Enable Claude Integration | Enable AI-powered data analysis | `false` |
| API Key | Anthropic Claude API key (required) | - |
| Model | Claude model to use for analysis | claude-3-7-sonnet-20250219 |
| Max Tokens | Maximum tokens for AI responses | 4000 |
| Temperature | AI creativity level (0-1) | 0.3 |

#### Supported Claude Models

| Model | Description | Use Case |
|-------|-------------|----------|
| `claude-opus-4-1-20250805` | Latest Opus model - highest intelligence | Complex analysis, detailed insights |
| `claude-opus-4-20250514` | Opus model - very high intelligence | Advanced analysis |
| `claude-sonnet-4-20250514` | Sonnet model - balanced performance | Recommended default |

#### Getting a Claude API Key
1. Visit Anthropic Console
2. Create an account or sign in
3. Navigate to API Keys section
4. Generate a new API key
5. Copy the key and paste it in the plugin configuration
Note: Claude AI analysis requires an active Anthropic API subscription. Usage is billed based on tokens consumed during analysis.
## Path Configuration

Important: Path configuration is managed exclusively through the web interface, not in the SignalK admin interface. This provides a more intuitive interface for managing data collection paths.

1. Navigate to: http://localhost:3000/plugins/signalk-parquet
2. Click the Path Configuration tab
Use the web interface to configure which SignalK paths to collect:
1. Click Add New Path
2. Configure the path settings:
   - SignalK Path: The SignalK data path (e.g., `navigation.position`)
   - Always Enabled: Collect data regardless of regimen state
   - Regimen Control: Command name that controls collection
   - Source Filter: Only collect from specific sources
   - Context: SignalK context (`vessels.self`, `vessels.*`, or specific vessel)
   - Exclude MMSI: For the `vessels.*` context, exclude specific MMSI numbers
3. Click Add Path

- Edit Path: Click the Edit button to modify path settings
- Delete Path: Click the Remove button to delete a path
- Refresh: Click Refresh Paths to reload configuration
- Show/Hide Commands: Toggle button to show/hide command paths in the table
The plugin streamlines command management with automatic path configuration:
1. Register Command: Commands are automatically registered with enabled path configurations
2. Start Command: Click Start button to activate a command regimen
3. Stop Command: Click Stop button to deactivate a command regimen
4. Remove Command: Click Remove button to delete a command and its path configuration
This eliminates the previous 3-step process of registering commands, adding paths, and enabling them separately.
Path configurations are stored separately from plugin configuration in:
```
~/.signalk/signalk-parquet/webapp-config.json
```

This allows for:
- Independent management of path configurations
- Better separation of concerns
- Easier backup and migration of path settings
- More intuitive web-based configuration interface
Regimens allow you to control data collection based on SignalK commands:
Example: Weather data collection with source filtering
```json
{
  "path": "environment.wind.angleApparent",
  "enabled": false,
  "regimen": "captureWeather",
  "source": "mqtt-weatherflow-udp",
  "context": "vessels.self"
}
```

Note: Source filtering accesses raw data before SignalK server arbitration, allowing collection of data from specific sources that might otherwise be filtered out.

Multi-Vessel Example: Collect navigation data from all vessels except specific MMSI numbers

```json
{
  "path": "navigation.position",
  "enabled": true,
  "context": "vessels.*",
  "excludeMMSI": ["123456789", "987654321"]
}
```

Command Path: Command paths are automatically created when registering commands

```json
{
  "path": "commands.captureWeather",
  "enabled": true,
  "context": "vessels.self"
}
```

This path will only collect data when the command `commands.captureWeather` is active.

## TypeScript Architecture
The plugin uses comprehensive TypeScript interfaces:
```typescript
interface PluginConfig {
  bufferSize: number;
  saveIntervalSeconds: number;
  outputDirectory: string;
  filenamePrefix: string;
  fileFormat: 'json' | 'csv' | 'parquet';
  paths: PathConfig[];
  s3Upload: S3UploadConfig;
}

interface PathConfig {
  path: string;
  enabled: boolean;
  regimen?: string;
  source?: string;
  context: string;
  excludeMMSI?: string[];
}

interface DataRecord {
  received_timestamp: string;
  signalk_timestamp: string;
  context: string;
  path: string;
  value: any;
  source_label?: string;
  meta?: string;
}
```
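To make the interplay of `enabled`, `regimen`, `source`, `context`, and `excludeMMSI` concrete, here is a minimal sketch of a collection gate. It is not the plugin's implementation: `shouldCollect` is a hypothetical helper, and the assumption that vessel contexts end in `mmsi:<number>` follows the usual SignalK URN form rather than anything stated above.

```typescript
interface PathConfig {
  path: string;
  enabled: boolean;
  regimen?: string;
  source?: string;
  context: string;
  excludeMMSI?: string[];
}

// Hypothetical helper: decide whether one incoming value should be recorded.
function shouldCollect(
  cfg: PathConfig,
  activeRegimens: Set<string>,
  context: string,
  sourceLabel?: string
): boolean {
  // Collect when the path is always enabled, or when its regimen command is active.
  const regimenActive = cfg.regimen !== undefined && activeRegimens.has(cfg.regimen);
  if (!cfg.enabled && !regimenActive) return false;

  // Optional source filter: only accept values from the configured source.
  if (cfg.source !== undefined && cfg.source !== sourceLabel) return false;

  // Wildcard context: any vessel, minus the excluded MMSI numbers.
  if (cfg.context === "vessels.*") {
    if (!context.startsWith("vessels.")) return false;
    const excluded = cfg.excludeMMSI ?? [];
    return !excluded.some((mmsi) => context.endsWith(`mmsi:${mmsi}`));
  }
  return cfg.context === context;
}

// Usage with the weather example: nothing is collected until captureWeather is active.
const weatherPath: PathConfig = {
  path: "environment.wind.angleApparent",
  enabled: false,
  regimen: "captureWeather",
  source: "mqtt-weatherflow-udp",
  context: "vessels.self",
};
const collectingWeather = shouldCollect(
  weatherPath,
  new Set(["captureWeather"]),
  "vessels.self",
  "mqtt-weatherflow-udp"
);
```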
The plugin maintains typed state:
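```typescript
interface PluginState {
  unsubscribes: Array<() => void>;
  // Type arguments below are assumed: buffered records keyed by path.
  dataBuffers: Map<string, DataRecord[]>;
  activeRegimens: Set<string>;
  subscribedPaths: Set<string>;
  parquetWriter?: ParquetWriter;
  s3Client?: any;
  currentConfig?: PluginConfig;
}
```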
API routes are fully typed:
```typescript
router.get('/api/paths',
  (_: TypedRequest, res: TypedResponse) => {
    // Typed request/response handling
  }
);
```

## Data Output Structure
```
output_directory/
├── vessels/
│   └── self/
│       ├── navigation/
│       │   ├── position/
│       │   │   ├── signalk_data_20250716T120000.parquet
│       │   │   └── signalk_data_20250716_consolidated.parquet
│       │   └── speedOverGround/
│       └── environment/
│           └── wind/
│               └── angleApparent/
└── processed/
    └── [moved files after consolidation]
```
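The layout above suggests a simple mapping from a SignalK context and path to the glob that a DuckDB `read_parquet` call would use. The sketch below assumes that mapping: `parquetGlob` is a hypothetical helper, and the rule that characters such as `:` in a vessel identifier become `_` is inferred from directory names like `urn_mrn_imo_mmsi_368396230`, not confirmed.

```typescript
// Hypothetical helper: build the Parquet glob for one context and path.
function parquetGlob(outputDir: string, context: string, path: string): string {
  // Split the context on its first dot only, so a vessel identifier
  // such as urn:mrn:imo:mmsi:368396230 stays in one directory segment.
  const dot = context.indexOf(".");
  const root = dot === -1 ? context : context.slice(0, dot);
  const id = dot === -1 ? "" : context.slice(dot + 1);

  // Assumed sanitization: anything outside letters, digits, "_" and "-" becomes "_".
  const idDir = id.replace(/[^A-Za-z0-9_-]/g, "_");

  // Each dot-separated part of the SignalK path becomes a directory level.
  const pathDir = path.split(".").join("/");

  return [outputDir, root, idDir, pathDir, "*.parquet"]
    .filter((segment) => segment !== "")
    .join("/");
}

// Usage
const selfPositionGlob = parquetGlob("data", "vessels.self", "navigation.position");
// "data/vessels/self/navigation/position/*.parquet"
```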
Each record contains:
| Field | Type | Description |
|-------|------|-------------|
| `received_timestamp` | string | When the plugin received the data |
| `signalk_timestamp` | string | Original SignalK timestamp |
| `context` | string | SignalK context (e.g., `vessels.self`) |
| `path` | string | SignalK path |
| `value` | DOUBLE/BOOLEAN/INT64/UTF8 | Smart typed values - numbers stored as DOUBLE, booleans as BOOLEAN, etc. |
| `value_json` | string | JSON representation for complex values |
| `source` | string | Complete source information |
| `source_label` | string | Source label |
| `source_type` | string | Source type |
| `source_pgn` | number | PGN number (if applicable) |
| `meta` | string | Metadata information |

#### Smart Data Types

The plugin now intelligently detects and preserves native data types:

- Numbers: Stored as `DOUBLE` (floating point) or `INT64` (integers)
- Booleans: Stored as `BOOLEAN`
- Strings: Stored as `UTF8`
- Objects: Serialized to JSON and stored as `UTF8`
- Mixed Types: Falls back to `UTF8` when a path contains multiple data types

This provides better compression, faster queries, and proper type safety for data analysis.
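The type rules above can be sketched as a small inference function. This is an illustration under assumptions, not the plugin's detection code: `inferColumnType` is a hypothetical name, and treating a mix of integers and floats as `DOUBLE` (rather than `UTF8`) is an assumption of this sketch.

```typescript
type ParquetType = "DOUBLE" | "INT64" | "BOOLEAN" | "UTF8";

// Classify a single value according to the list above.
function typeOfValue(v: unknown): ParquetType {
  if (typeof v === "boolean") return "BOOLEAN";
  if (typeof v === "number") return Number.isInteger(v) ? "INT64" : "DOUBLE";
  return "UTF8"; // strings, and objects once serialized to JSON
}

// Hypothetical helper: pick one column type for all values seen on a path.
function inferColumnType(values: unknown[]): ParquetType {
  const present = values.filter((v) => v !== null && v !== undefined);
  const kinds = Array.from(new Set(present.map(typeOfValue)));
  if (kinds.length === 0) return "UTF8";

  // Purely numeric columns: INT64 only if every value is an integer (assumption).
  if (kinds.every((k) => k === "INT64" || k === "DOUBLE")) {
    return kinds.length === 1 && kinds[0] === "INT64" ? "INT64" : "DOUBLE";
  }
  if (kinds.length === 1 && kinds[0] === "BOOLEAN") return "BOOLEAN";

  // Mixed or non-primitive content falls back to UTF8.
  return "UTF8";
}

// Usage
const speedColumnType = inferColumnType([5.2, 5.84, 6]); // "DOUBLE"
```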
## Web Interface
- Path Configuration: Manage data collection paths with multi-vessel support
- Command Management: Streamlined command registration and control
- Data Exploration: Browse available data paths
- SQL Queries: Execute DuckDB queries against Parquet files
- History API: Query historical data using SignalK History API endpoints
- S3 Status: Test S3 connectivity and configuration
- Responsive Design: Works on desktop and mobile
- MMSI Filtering: Exclude specific vessels from wildcard contexts
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/paths` | GET | List available data paths |
| `/api/files/:path` | GET | List files for a path |
| `/api/sample/:path` | GET | Sample data from a path |
| `/api/query` | POST | Execute SQL query |
| `/api/config/paths` | GET/POST/PUT/DELETE | Manage path configurations |
| `/api/test-s3` | POST | Test S3 connection |
| `/api/health` | GET | Health check |
| Claude AI Analysis API | | |
| `/api/analyze` | POST | Perform AI analysis on data |
| `/api/analyze/templates` | GET | Get available analysis templates |
| `/api/analyze/followup` | POST | Follow-up analysis questions |
| `/api/analyze/history` | GET | Get analysis history |
| `/api/analyze/test-connection` | POST | Test Claude API connection |
| SignalK History API | | |
| `/signalk/v1/history/values` | GET | SignalK History API - Get historical values |
| `/signalk/v1/history/contexts` | GET | SignalK History API - Get available contexts |
| `/signalk/v1/history/paths` | GET | SignalK History API - Get available paths |

## DuckDB Integration
#### Basic Queries
```sql
-- Get latest 10 records from navigation position
SELECT * FROM read_parquet('/path/to/navigation/position/*.parquet', union_by_name=true)
ORDER BY received_timestamp DESC LIMIT 10;

-- Count total records
SELECT COUNT(*) FROM read_parquet('/path/to/navigation/position/*.parquet', union_by_name=true);

-- Filter by source
SELECT * FROM read_parquet('/path/to/environment/wind/*.parquet', union_by_name=true)
WHERE source_label = 'mqtt-weatherflow-udp'
ORDER BY received_timestamp DESC LIMIT 100;

-- Aggregate by hour
SELECT
  DATE_TRUNC('hour', received_timestamp::timestamp) as hour,
  AVG(value::double) as avg_value,
  COUNT(*) as record_count
FROM read_parquet('/path/to/data/*.parquet', union_by_name=true)
GROUP BY hour
ORDER BY hour;
```

#### Spatial Analysis Queries
```sql
-- Calculate distance traveled over time
WITH ordered_positions AS (
  SELECT
    signalk_timestamp,
    ST_Point(value_longitude, value_latitude) as position,
    LAG(ST_Point(value_longitude, value_latitude)) OVER (ORDER BY signalk_timestamp) as prev_position
  FROM read_parquet('data/vessels/urn_mrn_imo_mmsi_368396230/navigation/position/*.parquet', union_by_name=true)
  WHERE signalk_timestamp >= '2025-09-27T16:00:00Z'
    AND signalk_timestamp <= '2025-09-27T23:59:59Z'
    AND value_latitude IS NOT NULL AND value_longitude IS NOT NULL
),
distances AS (
  SELECT *,
    CASE
      WHEN prev_position IS NOT NULL
      THEN ST_Distance_Sphere(position, prev_position)
      ELSE 0
    END as distance_meters
  FROM ordered_positions
)
SELECT
  strftime(date_trunc('hour', signalk_timestamp::TIMESTAMP), '%Y-%m-%dT%H:%M:%SZ') as time_bucket,
  AVG(value_latitude) as avg_lat,
  AVG(value_longitude) as avg_lon,
  ST_AsText(ST_Centroid(ST_Collect(position))) as centroid,
  SUM(distance_meters) as total_distance_meters,
  COUNT(*) as position_records,
  ST_AsText(ST_ConvexHull(ST_Collect(position))) as movement_area
FROM distances
GROUP BY time_bucket
ORDER BY time_bucket;

-- Multi-vessel proximity analysis
SELECT
  v1.context as vessel1,
  v2.context as vessel2,
  ST_Distance_Sphere(
    ST_Point(v1.value_longitude, v1.value_latitude),
    ST_Point(v2.value_longitude, v2.value_latitude)
  ) as distance_meters,
  v1.signalk_timestamp
FROM read_parquet('data/vessels/*/navigation/position/*.parquet', union_by_name=true) v1
JOIN read_parquet('data/vessels/*/navigation/position/*.parquet', union_by_name=true) v2
  ON v1.signalk_timestamp = v2.signalk_timestamp AND v1.context != v2.context
WHERE v1.signalk_timestamp >= '2025-09-27T00:00:00Z'
  AND ST_Distance_Sphere(
    ST_Point(v1.value_longitude, v1.value_latitude),
    ST_Point(v2.value_longitude, v2.value_latitude)
  ) < 1000 -- Within 1km
ORDER BY distance_meters;

-- Advanced movement analysis with bounding boxes
WITH ordered_positions AS (
  SELECT
    signalk_timestamp,
    ST_Point(value_longitude, value_latitude) as position,
    value_latitude,
    value_longitude,
    LAG(ST_Point(value_longitude, value_latitude)) OVER (ORDER BY signalk_timestamp) as prev_position,
    strftime(date_trunc('hour', signalk_timestamp::TIMESTAMP), '%Y-%m-%dT%H:%M:%SZ') as time_bucket
  FROM read_parquet('data/vessels/urn_mrn_imo_mmsi_368396230/navigation/position/*.parquet', union_by_name=true)
  WHERE signalk_timestamp >= '2025-09-27T16:00:00Z'
    AND signalk_timestamp <= '2025-09-27T23:59:59Z'
    AND value_latitude IS NOT NULL AND value_longitude IS NOT NULL
),
distances AS (
  SELECT *,
    CASE
      WHEN prev_position IS NOT NULL
      THEN ST_Distance_Sphere(position, prev_position)
      ELSE 0
    END as distance_meters
  FROM ordered_positions
)
SELECT
  time_bucket,
  AVG(value_latitude) as avg_lat,
  AVG(value_longitude) as avg_lon,
  -- Calculate bounding box manually
  MIN(value_latitude) as min_lat,
  MAX(value_latitude) as max_lat,
  MIN(value_longitude) as min_lon,
  MAX(value_longitude) as max_lon,
  -- Distance and movement metrics
  SUM(distance_meters) as total_distance_meters,
  ROUND(SUM(distance_meters) / 1000.0, 2) as total_distance_km,
  COUNT(*) as position_records,
  -- Movement area approximation using bounding box
  (MAX(value_latitude) - MIN(value_latitude)) * 111320 *
  (MAX(value_longitude) - MIN(value_longitude)) * 111320 *
  COS(RADIANS(AVG(value_latitude))) as approx_area_m2
FROM distances
GROUP BY time_bucket
ORDER BY time_bucket;
```

#### Available Spatial Functions

- `ST_Point(longitude, latitude)` - Create point geometries
- `ST_Distance_Sphere(point1, point2)` - Calculate distances in meters
- `ST_AsText(geometry)` - Convert to Well-Known Text format
- `ST_Centroid(ST_Collect(points))` - Find center of multiple points
- `ST_ConvexHull(ST_Collect(points))` - Create movement boundary polygons

## History API Integration
The plugin provides full SignalK History API compliance, allowing you to query historical data using standard SignalK API endpoints with enhanced performance and filtering capabilities.
| Endpoint | Description | Parameters |
|----------|-------------|------------|
| `/signalk/v1/history/values` | Get historical values for specified paths | Standard patterns (see below). Optional: `resolution`, `refresh`, `includeMovingAverages`, `useUTC` |
| `/signalk/v1/history/contexts` | Get available vessel contexts for time range | Time Range: Any standard pattern (see below). Returns only contexts with data in specified range |
| `/signalk/v1/history/paths` | Get available SignalK paths for time range | Time Range: Any standard pattern (see below). Returns only paths with data in specified range |
The History API supports 5 standard SignalK time query patterns:
| Pattern | Parameters | Description | Example |
|---------|-----------|-------------|---------|
| 1 | `duration` | Query back from now | `?duration=1h` |
| 2 | `from` + `duration` | Query forward from start | `?from=2025-01-01T00:00:00Z&duration=1h` |
| 3 | `to` + `duration` | Query backward to end | `?to=2025-01-01T12:00:00Z&duration=1h` |
| 4 | `from` | From start to now | `?from=2025-01-01T00:00:00Z` |
| 5 | `from` + `to` | Specific range | `?from=2025-01-01T00:00:00Z&to=2025-01-02T00:00:00Z` |

Legacy Support: The `start` parameter (used with `duration`) is deprecated but still supported for backward compatibility. A console warning will be shown. Use standard patterns instead.
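The five patterns reduce to one rule for deriving a concrete `from`/`to` window. The sketch below shows one way to express that rule; `resolveTimeRange` and `parseDuration` are hypothetical names, the supported units (`s`, `m`, `h`, `d`) are taken from the duration examples in this document, and the deprecated `start` parameter is deliberately left out.

```typescript
interface TimeQuery {
  from?: string;
  to?: string;
  duration?: string;
}

const UNIT_MS: Record<string, number> = { s: 1000, m: 60000, h: 3600000, d: 86400000 };

// Parse strings such as "30s", "15m", "2h", "1d" into milliseconds.
function parseDuration(text: string): number {
  const match = /^(\d+)([smhd])$/.exec(text);
  const unitMs = match ? UNIT_MS[match[2]] : undefined;
  if (!match || unitMs === undefined) throw new Error(`Unsupported duration: ${text}`);
  return Number(match[1]) * unitMs;
}

// Hypothetical helper: map the five query patterns to an explicit time window.
function resolveTimeRange(q: TimeQuery, now: Date = new Date()): { from: Date; to: Date } {
  const durationMs = q.duration !== undefined ? parseDuration(q.duration) : undefined;

  if (q.from !== undefined && q.to !== undefined) {
    return { from: new Date(q.from), to: new Date(q.to) }; // pattern 5
  }
  if (q.from !== undefined && durationMs !== undefined) {
    const start = new Date(q.from);
    return { from: start, to: new Date(start.getTime() + durationMs) }; // pattern 2
  }
  if (q.to !== undefined && durationMs !== undefined) {
    const end = new Date(q.to);
    return { from: new Date(end.getTime() - durationMs), to: end }; // pattern 3
  }
  if (q.from !== undefined) {
    return { from: new Date(q.from), to: now }; // pattern 4
  }
  if (durationMs !== undefined) {
    return { from: new Date(now.getTime() - durationMs), to: now }; // pattern 1
  }
  throw new Error("A time range needs duration, from, or to + duration");
}

// Usage: pattern 3, one hour ending at noon UTC
const noonWindow = resolveTimeRange({ to: "2025-01-01T12:00:00Z", duration: "1h" });
```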
| Parameter | Description | Format | Examples |
|-----------|-------------|---------|----------|
| Required for `/values`: | | | |
| `paths` | SignalK paths with optional aggregation | `path:method,path:method` | `navigation.position:first,wind.speed:average` |
| Time Range: | Use one of the 5 standard patterns above | | |
| `duration` | Time period | `[number][unit]` | `1h`, `30m`, `15s`, `2d` |
| `from` | Start time (ISO 8601) | ISO datetime | `2025-01-01T00:00:00Z` |
| `to` | End time (ISO 8601) | ISO datetime | `2025-01-01T06:00:00Z` |
| Optional: | | | |
| `context` | Vessel context | `vessels.self` or `vessels.<id>` | `vessels.self` (default) |
| `resolution` | Time bucket size in milliseconds | Number | `60000` (1 minute buckets) |
| `refresh` | Enable auto-refresh (pattern 1 only) | `true` or `1` | `refresh=true` |
| `includeMovingAverages` | Include EMA/SMA calculations | `true` or `1` | `includeMovingAverages=true` |
| `useUTC` | Treat datetime inputs as UTC | `true` or `1` | `useUTC=true` |
| `convertUnits` | Convert to preferred units (requires `signalk-units-preference` plugin) | `true` or `1` | `convertUnits=true` |
| `convertTimesToLocal` | Convert timestamps to local/specified timezone | `true` or `1` | `convertTimesToLocal=true` |
| `timezone` | IANA timezone ID (used with `convertTimesToLocal`) | IANA timezone | `timezone=America/New_York` |
| Deprecated: | | | |
| `start` | ⚠️ Use standard patterns instead | `now` or ISO datetime | Deprecated, use `duration` or `from`/`to` |
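When calling the API from code rather than `curl`, it can help to assemble the query string from a typed object. The following is a sketch only: `buildHistoryValuesUrl` and `HistoryValuesParams` are hypothetical names, and the parameter names are the ones listed in the table above. Values are percent-encoded, which standard HTTP servers decode transparently.

```typescript
interface HistoryValuesParams {
  paths: string[]; // e.g. ["navigation.position:first"]
  duration?: string;
  from?: string;
  to?: string;
  context?: string;
  resolution?: number;
  includeMovingAverages?: boolean;
  useUTC?: boolean;
  convertUnits?: boolean;
  convertTimesToLocal?: boolean;
  timezone?: string;
}

// Hypothetical helper: build a /signalk/v1/history/values URL.
function buildHistoryValuesUrl(baseUrl: string, p: HistoryValuesParams): string {
  const pairs: Array<[string, string]> = [["paths", p.paths.join(",")]];

  // Skip parameters that are unset or false so the URL stays minimal.
  const add = (key: string, value: string | number | boolean | undefined): void => {
    if (value !== undefined && value !== false) pairs.push([key, String(value)]);
  };
  add("duration", p.duration);
  add("from", p.from);
  add("to", p.to);
  add("context", p.context);
  add("resolution", p.resolution);
  add("includeMovingAverages", p.includeMovingAverages);
  add("useUTC", p.useUTC);
  add("convertUnits", p.convertUnits);
  add("convertTimesToLocal", p.convertTimesToLocal);
  add("timezone", p.timezone);

  const query = pairs.map(([key, value]) => `${key}=${encodeURIComponent(value)}`).join("&");
  return `${baseUrl}/signalk/v1/history/values?${query}`;
}

// Usage
const speedHistoryUrl = buildHistoryValuesUrl("http://localhost:3000", {
  paths: ["navigation.speedOverGround"],
  duration: "2d",
  convertUnits: true,
});
```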
#### Pattern 1: Duration Only (Query back from now)
```bash
# Last hour of wind data
curl "http://localhost:3000/signalk/v1/history/values?duration=1h&paths=environment.wind.speedApparent"

# Last 30 minutes with moving averages
curl "http://localhost:3000/signalk/v1/history/values?duration=30m&paths=environment.wind.speedApparent&includeMovingAverages=true"

# Real-time with auto-refresh
curl "http://localhost:3000/signalk/v1/history/values?duration=15m&paths=navigation.position&refresh=true"
```

#### Pattern 2: From + Duration (Query forward)

```bash
# 6 hours forward from specific time
curl "http://localhost:3000/signalk/v1/history/values?from=2025-01-01T00:00:00Z&duration=6h&paths=navigation.position"
```

#### Pattern 3: To + Duration (Query backward)

```bash
# 2 hours backward to specific time
curl "http://localhost:3000/signalk/v1/history/values?to=2025-01-01T12:00:00Z&duration=2h&paths=environment.wind.speedApparent"
```

#### Pattern 4: From Only (From start to now)

```bash
# From specific time until now
curl "http://localhost:3000/signalk/v1/history/values?from=2025-01-01T00:00:00Z&paths=navigation.speedOverGround"
```

#### Pattern 5: From + To (Specific range)

```bash
# Specific 24-hour period
curl "http://localhost:3000/signalk/v1/history/values?from=2025-01-01T00:00:00Z&to=2025-01-02T00:00:00Z&paths=navigation.position"
```

#### Advanced Query Examples
Multiple paths with time alignment:
```bash
curl "http://localhost:3000/signalk/v1/history/values?duration=6h&paths=environment.wind.angleApparent,environment.wind.speedApparent,navigation.position&resolution=60000"
```

Multiple aggregations of same path:

```bash
curl "http://localhost:3000/signalk/v1/history/values?from=2025-01-01T00:00:00Z&to=2025-01-01T06:00:00Z&paths=environment.wind.speedApparent:average,environment.wind.speedApparent:min,environment.wind.speedApparent:max&resolution=60000"
```

With moving averages for trend analysis:

```bash
curl "http://localhost:3000/signalk/v1/history/values?duration=24h&paths=electrical.batteries.512.voltage&includeMovingAverages=true&resolution=300000"
```

Different temporal samples:

```bash
curl "http://localhost:3000/signalk/v1/history/values?duration=1h&paths=navigation.position:first,navigation.position:middle_index,navigation.position:last&resolution=60000"
```

#### Context and Path Discovery

Get contexts with data in last hour:

```bash
curl "http://localhost:3000/signalk/v1/history/contexts?duration=1h"
```

Get contexts for specific time range:

```bash
curl "http://localhost:3000/signalk/v1/history/contexts?from=2025-01-01T00:00:00Z&to=2025-01-07T00:00:00Z"
```

Get available paths with recent data:

```bash
curl "http://localhost:3000/signalk/v1/history/paths?duration=24h"
```

Get all paths (no time filter):

```bash
curl "http://localhost:3000/signalk/v1/history/paths"
```

#### Unit Conversion (NEW in v0.6.0)
Convert to user's preferred units:
```bash
# Speed in knots (if configured in signalk-units-preference)
curl "http://localhost:3000/signalk/v1/history/values?duration=2d&paths=navigation.speedOverGround&convertUnits=true"

# Wind speed in preferred units (knots, km/h, or mph)
curl "http://localhost:3000/signalk/v1/history/values?duration=1h&paths=environment.wind.speedApparent&convertUnits=true"

# Temperature in preferred units (°C or °F)
curl "http://localhost:3000/signalk/v1/history/values?duration=24h&paths=environment.outside.temperature&convertUnits=true"
```

Response includes conversion metadata:

```json
{
  "values": [{"path": "navigation.speedOverGround", "method": "average"}],
  "data": [["2025-10-20T16:12:14Z", 5.2]],
  "units": {
    "converted": true,
    "conversions": [{
      "path": "navigation.speedOverGround",
      "baseUnit": "m/s",
      "targetUnit": "knots",
      "symbol": "kn"
    }]
  }
}
```

#### Timezone Conversion (NEW in v0.6.0)
Convert to server's local time:
```bash
curl "http://localhost:3000/signalk/v1/history/values?duration=2d&paths=environment.wind.speedApparent&convertTimesToLocal=true"
```

Convert to specific timezone:

```bash
# New York time (Eastern)
curl "http://localhost:3000/signalk/v1/history/values?duration=2d&paths=navigation.position&convertTimesToLocal=true&timezone=America/New_York"

# London time
curl "http://localhost:3000/signalk/v1/history/values?duration=1h&paths=environment.wind.speedApparent&convertTimesToLocal=true&timezone=Europe/London"

# Tokyo time
curl "http://localhost:3000/signalk/v1/history/values?duration=1h&paths=navigation.speedOverGround&convertTimesToLocal=true&timezone=Asia/Tokyo"
```

Response includes timezone metadata:

```json
{
  "range": {
    "from": "2025-10-20T12:12:19-04:00",
    "to": "2025-10-20T13:12:19-04:00"
  },
  "data": [
    ["2025-10-20T12:12:14-04:00", 5.84],
    ["2025-10-20T12:12:28-04:00", 5.26]
  ],
  "timezone": {
    "converted": true,
    "targetTimezone": "America/New_York",
    "offset": "-04:00",
    "description": "Converted to user-specified timezone: America/New_York (-04:00)"
  }
}
```

Combine both conversions:

```bash
# Convert values to knots AND timestamps to New York time
curl "http://localhost:3000/signalk/v1/history/values?duration=2d&paths=navigation.speedOverGround,environment.wind.speedApparent&convertUnits=true&convertTimesToLocal=true&timezone=America/New_York"
```

Common IANA Timezone IDs:

- `America/New_York` - Eastern Time (US)
- `America/Chicago` - Central Time (US)
- `America/Denver` - Mountain Time (US)
- `America/Los_Angeles` - Pacific Time (US)
- `Europe/London` - UK
- `Europe/Paris` - Central European Time
- `Asia/Tokyo` - Japan
- `Pacific/Auckland` - New Zealand
- `Australia/Sydney` - Australian Eastern Time

#### Duration Formats

- `30s` - 30 seconds
- `15m` - 15 minutes
- `2h` - 2 hours
- `1d` - 1 day
Local time conversion (default behavior):
```bash
# 8:00 AM local time → automatically converted to UTC
curl "http://localhost:3000/signalk/v1/history/values?context=vessels.self&start=2025-08-13T08:00:00&duration=1h&paths=navigation.position"
```

UTC time mode:

```bash
# 8:00 AM UTC (not converted)
curl "http://localhost:3000/signalk/v1/history/values?context=vessels.self&start=2025-08-13T08:00:00&duration=1h&paths=navigation.position&useUTC=true"
```

Explicit timezone (always respected):

```bash
# Explicit UTC timezone
curl "http://localhost:3000/signalk/v1/history/values?context=vessels.self&start=2025-08-13T08:00:00Z&duration=1h&paths=navigation.position"

# Explicit timezone offset
curl "http://localhost:3000/signalk/v1/history/values?context=vessels.self&start=2025-08-13T08:00:00-04:00&duration=1h&paths=navigation.position"
```

Timezone behavior:

- Default (`useUTC=false`): Datetime strings without timezone info are treated as local time and automatically converted to UTC
- UTC mode (`useUTC=true`): Datetime strings without timezone info are treated as UTC time
- Explicit timezone: Strings with `Z`, `+HH:MM`, or `-HH:MM` are always parsed as-is regardless of the `useUTC` setting
- `start=now`: Always uses current UTC time regardless of the `useUTC` setting

Get available contexts:

```bash
curl "http://localhost:3000/signalk/v1/history/contexts"
```
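The timezone behavior rules for datetime inputs can be sketched in a few lines. This is an illustration, not the plugin's parser: `parseDatetimeInput` is a hypothetical name, and "local time" here means the timezone of the process running the code, which is assumed to match the server's.

```typescript
// Hypothetical helper: interpret a datetime input according to the useUTC rules.
function parseDatetimeInput(text: string, useUTC: boolean = false, now: Date = new Date()): Date {
  // "now" always means the current instant, regardless of useUTC.
  if (text === "now") return now;

  // Explicit zone info (Z, +HH:MM, -HH:MM) is always parsed as-is.
  const hasZone = /(Z|[+-]\d{2}:\d{2})$/.test(text);
  if (hasZone) return new Date(text);

  // No zone info: JavaScript parses an ISO date-time without an offset as
  // local time, so appending "Z" is what switches it to UTC mode.
  return useUTC ? new Date(text + "Z") : new Date(text);
}

// Usage
const utcEight = parseDatetimeInput("2025-08-13T08:00:00", true);
const offsetEight = parseDatetimeInput("2025-08-13T08:00:00-04:00");
```

Whichever branch is taken, the result is a single instant, so later steps can work in UTC without caring how the input was written.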
The History API automatically aligns data from different paths using time bucketing to solve the common problem of misaligned timestamps. This enables:
- Plotting: Data points align properly on charts
- Correlation: Compare values from different sensors at the same time
- Export: Clean, aligned datasets for analysis
Key Features:
- Smart Type Handling: Automatically handles numeric values (wind speed) and JSON objects (position)
- Robust Aggregation: Uses proper SQL type casting to prevent type errors
- Configurable Resolution: Time bucket size in milliseconds (default: auto-calculated based on time range)
- Multiple Aggregation Methods: `average` for numeric data, `first` for complex objects

Parameters:

- `resolution` - Time bucket size in milliseconds (default: auto-calculated)
- Aggregation methods: `average`, `min`, `max`, `first`, `last`, `mid`, `middle_index`

Aggregation Methods:

- `average` - Average value in time bucket (default for numeric data)
- `min` - Minimum value in time bucket
- `max` - Maximum value in time bucket
- `first` - First value in time bucket (default for objects)
- `last` - Last value in time bucket
- `mid` - Median value (average of middle values for even counts)
- `middle_index` - Middle value by index (first of two middle values for even counts)

When to Use Each Method:

- Numeric data (wind speed, voltage, etc.): Use `average`, `min`, `max` for statistics
- Position data: Use `first`, `last`, `middle_index` for specific readings
- String/object data: Avoid `mid` (unpredictable), prefer `first`, `last`, `middle_index`
- Multiple stats: Query same path with different methods (e.g., `wind:average,wind:max`)
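Time bucketing and the aggregation methods above can be illustrated in memory. The plugin itself aggregates in SQL; the sketch below only mirrors the definitions for numeric samples. `bucketize`, `aggregate`, and `Sample` are hypothetical names, and `middle_index` is assumed to index values in arrival order.

```typescript
type Method = "average" | "min" | "max" | "first" | "last" | "mid" | "middle_index";

interface Sample {
  t: number; // timestamp in milliseconds
  value: number;
}

// Reduce the values of one bucket (in arrival order) with the chosen method.
function aggregate(values: number[], method: Method): number {
  if (values.length === 0) throw new Error("empty bucket");
  const n = values.length;
  const sorted = values.slice().sort((a, b) => a - b);
  switch (method) {
    case "average":
      return values.reduce((sum, v) => sum + v, 0) / n;
    case "min":
      return sorted[0];
    case "max":
      return sorted[n - 1];
    case "first":
      return values[0];
    case "last":
      return values[n - 1];
    case "mid": // median: average of the two middle values for even counts
      return n % 2 === 1 ? sorted[(n - 1) / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2;
    case "middle_index": // first of the two middle values for even counts
      return values[Math.floor((n - 1) / 2)];
  }
}

// Group samples into fixed-width buckets and aggregate each bucket.
function bucketize(samples: Sample[], resolutionMs: number, method: Method): Array<[number, number]> {
  const buckets = new Map<number, number[]>();
  for (const s of samples) {
    const start = Math.floor(s.t / resolutionMs) * resolutionMs;
    const list = buckets.get(start) ?? [];
    list.push(s.value);
    buckets.set(start, list);
  }
  return Array.from(buckets.entries())
    .sort((a, b) => a[0] - b[0])
    .map(([start, vals]): [number, number] => [start, aggregate(vals, method)]);
}

// Usage: four wind-speed samples in one-minute buckets
const windSamples: Sample[] = [
  { t: 0, value: 10 },
  { t: 30000, value: 20 },
  { t: 60000, value: 30 },
  { t: 90000, value: 50 },
];
const perMinuteAverage = bucketize(windSamples, 60000, "average"); // [[0, 15], [60000, 40]]
```

Because every path is reduced to the same bucket start times, rows from different sensors line up on a shared timestamp, which is what makes plotting and correlation straightforward.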
The History API returns time-aligned data in standard SignalK format.
#### Default Response (without moving averages)
```json
{
  "context": "vessels.self",
  "range": {
    "from": "2025-01-01T00:00:00Z",
    "to": "2025-01-01T06:00:00Z"
  },
  "values": [
    {
      "path": "environment.wind.speedApparent",
      "method": "average"
    },
    {
      "path": "navigation.position",
      "method": "first"
    }
  ],
  "data": [
    ["2025-01-01T00:00:00Z", 12.5, {"latitude": 37.7749, "longitude": -122.4194}],
    ["2025-01-01T00:01:00Z", 13.2, {"latitude": 37.7750, "longitude": -122.4195}],
    ["2025-01-01T00:02:00Z", 11.8, {"latitude": 37.7751, "longitude": -122.4196}]
  ]
}
```

#### With Moving Averages (includeMovingAverages=true)

```json
{
  "context": "vessels.self",
  "range": {
    "from": "2025-01-01T00:00:00Z",
    "to": "2025-01-01T06:00:00Z"
  },
  "values": [
    {
      "path": "environment.wind.speedApparent",
      "method": "average"
    },
    {
      "path": "environment.wind.speedApparent.ema",
      "method": "ema"
    },
    {
      "path": "environment.wind.speedApparent.sma",
      "method": "sma"
    },
    {
      "path": "navigation.position",
      "method": "first"
    }
  ],
  "data": [
    ["2025-01-01T00:00:00Z", 12.5, 12.5, 12.5, {"latitude": 37.7749, "longitude": -122.4194}],
    ["2025-01-01T00:01:00Z", 13.2, 12.64, 12.85, {"latitude": 37.7750, "longitude": -122.4195}],
    ["2025-01-01T00:02:00Z", 11.8, 12.45, 12.5, {"latitude": 37.7751, "longitude": -122.4196}]
  ]
}
```

Notes:

- Each data array element is `[timestamp, value1, value2, ...]` corresponding to the paths in the `values` array
- Moving averages (EMA/SMA) are opt-in - add `includeMovingAverages=true` to include them
- EMA/SMA are only calculated for numeric values; non-numeric values (objects, strings) show `null` for their EMA/SMA columns
- Without `includeMovingAverages`, response size is ~66% smaller

## Claude AI Analysis
The plugin integrates Claude AI to provide intelligent analysis of maritime data, offering insights that would be difficult to extract through traditional querying methods.
Claude AI can generate interactive charts and visualizations directly from your data using Plotly.js specifications. Charts are automatically embedded in analysis responses when analysis would benefit from visualization.
Supported Chart Types:
- Line Charts: Time series trends for navigation, environmental, and performance data
- Bar Charts: Categorical analysis and frequency distributions
- Scatter Plots: Correlation analysis between different parameters
- Wind Roses/Radar Charts: Professional wind direction and speed frequency analysis
- Multiple Series Charts: Compare multiple data streams on the same chart
- Polar Charts: Wind patterns, compass headings, and directional data
Marine-Specific Chart Features:
- Wind Analysis: Automated wind rose generation with Beaufort scale categories
- Navigation Plots: Course over ground, speed trends, and position tracking
- Environmental Monitoring: Temperature, pressure, and weather pattern visualization
- Performance Analysis: Fuel efficiency, battery usage, and system performance charts
- Multi-Vessel Comparisons: Side-by-side analysis of multiple vessels
Chart Data Integrity:
- All chart data is sourced directly from database queries - no fabricated or estimated data
- Charts display exact data points from query results with full traceability
- Automatic validation ensures chart data matches query output
- Time-aligned data from History API ensures accurate multi-parameter visualization
Example Chart Generation:
When you ask Claude to "analyze wind patterns over the last 48 hours", it will:
1. Query your wind direction and speed data
2. Generate a wind rose chart showing frequency by compass direction
3. Color-code by wind speed categories (calm, light breeze, strong breeze, etc.)
4. Display the chart as an interactive Plotly.js visualization in the web interface
Charts are automatically included when analysis benefits from visualization, or you can explicitly request specific chart types like "create a line chart" or "show me a wind rose".
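As a rough illustration of the kind of Plotly.js specification involved, the sketch below bins wind samples into eight compass sectors and builds a single `barpolar` trace. The `WindSample` type, the `binBySector` and `buildWindRose` helpers, and the eight-sector scheme are assumptions made for this example; they are not taken from the plugin's code.

```typescript
// Hypothetical input shape: direction in degrees true, speed in m/s.
interface WindSample {
  directionDeg: number;
  speed: number;
}

const SECTORS = ['N', 'NE', 'E', 'SE', 'S', 'SW', 'W', 'NW'];

// Count samples per 45-degree sector, centred on each compass point.
function binBySector(samples: WindSample[]): number[] {
  const counts: number[] = new Array(SECTORS.length).fill(0);
  for (const s of samples) {
    // Normalise to [0, 360) so negative or wrapped headings still bin correctly.
    const normalized = ((s.directionDeg % 360) + 360) % 360;
    const index = Math.round(normalized / 45) % SECTORS.length;
    counts[index] += 1;
  }
  return counts;
}

// Build a Plotly.js figure spec (data + layout) for a simple wind rose.
function buildWindRose(samples: WindSample[]) {
  const counts = binBySector(samples);
  return {
    data: [{ type: 'barpolar', r: counts, theta: SECTORS, name: 'Samples' }],
    layout: {
      title: { text: 'Wind direction frequency' },
      // Put north at the top and run clockwise, like a compass.
      polar: { angularaxis: { direction: 'clockwise', rotation: 90 } },
    },
  };
}

const figure = buildWindRose([
  { directionDeg: 10, speed: 5 },
  { directionDeg: 350, speed: 6 },
  { directionDeg: 95, speed: 4 },
]);
console.log(JSON.stringify(figure.data[0].r));
```

This sketch ignores `speed`; color-coding by wind speed category would typically use one `barpolar` trace per speed band, each with its own counts.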
### Analysis Templates
The following are examples of possible pre-built analysis templates, which provide ready-to-use analysis for common maritime operations:
#### Navigation & Routing Templates
- Navigation Summary: Comprehensive analysis of navigation patterns and route efficiency
- Route Optimization: Identify opportunities to optimize routes for efficiency and safety
- Anchoring Analysis: Analyze anchoring patterns, duration, and safety considerations
#### Weather & Environment Templates
- Weather Impact Analysis: Analyze how weather conditions affect vessel performance
- Wind Pattern Analysis: Detailed wind analysis for sailing optimization
#### Electrical System Templates
- Battery Health Assessment: Comprehensive battery performance and charging pattern analysis
- Power Consumption Analysis: Analyze electrical power usage patterns and efficiency
#### Safety & Monitoring Templates
- Safety Anomaly Detection: Detect unusual patterns that might indicate safety concerns
- Equipment Health Monitoring: Monitor equipment performance and predict maintenance needs
#### Performance & Efficiency Templates
- Fuel Efficiency Analysis: Analyze fuel consumption patterns and identify efficiency opportunities
- Overall Performance Trends: Comprehensive vessel performance analysis over time
### Using Claude AI Analysis
#### Via Web Interface
1. Navigate to the plugin's web interface
2. Go to the AI Analysis tab
3. Select a data path to analyze
4. Choose an analysis template or create custom analysis
5. Configure time range and analysis parameters
6. Click Analyze Data to generate insights
#### Via API
Test Claude Connection:
```bash
curl -X POST http://localhost:3000/plugins/signalk-parquet/api/analyze/test-connection
```
Get Available Templates:
```bash
curl http://localhost:3000/plugins/signalk-parquet/api/analyze/templates
```
Custom Analysis:
```bash
curl -X POST http://localhost:3000/plugins/signalk-parquet/api/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "dataPath": "environment.wind.speedTrue,navigation.speedOverGround",
    "analysisType": "custom",
    "customPrompt": "Analyze the relationship between wind speed and vessel speed. Identify optimal wind conditions for best performance.",
    "timeRange": {
      "start": "2025-01-01T00:00:00Z",
      "end": "2025-01-07T00:00:00Z"
    },
    "aggregationMethod": "average",
    "resolution": "3600000"
  }'
```

### Analysis Response Format
Claude AI analysis returns structured insights:
```json
{
  "id": "analysis_1234567890_abcdef123",
  "analysis": "Main analysis text with detailed insights",
  "insights": [
    "Key insight 1",
    "Key insight 2",
    "Key insight 3"
  ],
  "recommendations": [
    "Actionable recommendation 1",
    "Actionable recommendation 2"
  ],
  "anomalies": [
    {
      "timestamp": "2025-01-01T12:00:00Z",
      "value": 25.5,
      "expectedRange": {"min": 10.0, "max": 20.0},
      "severity": "medium",
      "description": "Wind speed higher than normal range",
      "confidence": 0.87
    }
  ],
  "confidence": 0.92,
  "dataQuality": "High quality data with 98% completeness",
  "timestamp": "2025-01-01T15:30:00Z",
  "metadata": {
    "dataPath": "environment.wind.speedTrue",
    "analysisType": "summary",
    "recordCount": 1440,
    "timeRange": {
      "start": "2025-01-01T00:00:00Z",
      "end": "2025-01-02T00:00:00Z"
    }
  }
}
```

### Analysis History
All Claude AI analyses are automatically saved and can be retrieved:
Get Analysis History:
```bash
curl "http://localhost:3000/plugins/signalk-parquet/api/analyze/history?limit=10"
```
History files are stored in `data/analysis-history/analysis_*.json`.

### Best Practices
1. Data Quality: Ensure good data coverage for more reliable analysis
2. Time Ranges: Use appropriate time ranges - longer for trends, shorter for anomalies
3. Path Selection: Combine related paths for correlation analysis
4. Template Usage: Start with templates then customize prompts as needed
5. API Limits: Be mindful of Anthropic API token limits and costs
6. Model Selection: Use Opus for complex analysis, Sonnet for general use, Haiku for quick insights
### Troubleshooting Claude AI
Common Issues:
- "Claude not enabled": Check plugin configuration and enable Claude integration
- "API key missing": Add valid Anthropic API key in plugin settings
- "Analysis timeout": Reduce data size or use faster model (Haiku)
- "Token limit exceeded": Reduce time range or use data sampling
Debug Claude Integration:
```bash
# Test API connection
curl -X POST http://localhost:3000/plugins/signalk-parquet/api/analyze/test-connection

# Check plugin logs for Claude-specific messages
journalctl -u signalk -f | grep -i claude
```

## Moving Averages (EMA & SMA)
The plugin calculates Exponential Moving Average (EMA) and Simple Moving Average (SMA) for numeric values when explicitly requested via the `includeMovingAverages` parameter, providing enhanced trend analysis capabilities.

### Enabling Moving Averages
History API:
```bash
# Add includeMovingAverages=true to any query
curl "http://localhost:3000/signalk/v1/history/values?duration=1h&paths=environment.wind.speedApparent&includeMovingAverages=true"
```
Default Behavior (v0.5.6+):
- Moving averages are opt-in and not included by default
- Reduces response size by ~66% when not needed
- Better API compliance with SignalK specification

Legacy Behavior (pre-v0.5.6):
- Moving averages were automatically included for all queries
- To maintain old behavior, add `includeMovingAverages=true` to all requests

### Calculation Details
#### Exponential Moving Average (EMA)
- Period: ~10 equivalent (α = 0.2)
- Formula: `EMA = α × currentValue + (1 - α) × previousEMA`
- Characteristic: Responds faster to recent changes, emphasizes recent data
- Use Case: Trend detection, rapid response to data changes

#### Simple Moving Average (SMA)
- Period: 10 data points
- Formula: Average of the last 10 values
- Characteristic: Smooths out fluctuations, equal weight to all values in window
- Use Case: Noise reduction, general trend analysis
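The two definitions above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the plugin's implementation: the function and type names are hypothetical, and it assumes the EMA is seeded with the first value and that rounding to 3 decimal places is applied only to the reported values.

```typescript
const ALPHA = 0.2;      // EMA smoothing factor stated above
const SMA_WINDOW = 10;  // SMA window size stated above

interface SmoothedPoint {
  value: number;
  ema: number;
  sma: number;
}

// Round to 3 decimal places to suppress floating-point noise.
const round3 = (x: number): number => Math.round(x * 1000) / 1000;

function movingAverages(values: number[]): SmoothedPoint[] {
  const recent: number[] = [];          // rolling window for the SMA
  let previousEma: number | null = null;
  const out: SmoothedPoint[] = [];

  for (const value of values) {
    // Assumption: the first EMA equals the first value; later points use the recurrence.
    const ema: number =
      previousEma === null ? value : ALPHA * value + (1 - ALPHA) * previousEma;
    previousEma = ema;

    // Keep at most SMA_WINDOW values and average whatever is in the window.
    recent.push(value);
    if (recent.length > SMA_WINDOW) recent.shift();
    const sma = recent.reduce((a, b) => a + b, 0) / recent.length;

    out.push({ value, ema: round3(ema), sma: round3(sma) });
  }
  return out;
}

console.log(movingAverages([5.0, 6.0, 4.0, 7.0, 5.5]));
```

Because the EMA only needs the previous EMA and the SMA only needs the last 10 values, the same loop can continue across incremental updates by keeping `previousEma` and `recent` between calls.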
### Calculation Example
```text
// Initial Data Load (isIncremental: false)
Point 1: Value=5.0, EMA=5.0, SMA=5.0
Point 2: Value=6.0, EMA=5.2, SMA=5.5
Point 3: Value=4.0, EMA=4.96, SMA=5.0

// Incremental Updates (isIncremental: true)
Point 4: Value=7.0, EMA=5.368, SMA=5.5  // Continues from previous EMA
Point 5: Value=5.5, EMA=5.394, SMA=5.5  // Rolling 10-point SMA window
```

### Key Features
- Opt-In: Add `includeMovingAverages=true` to enable (v0.5.6+)
- Memory Efficient: SMA maintains rolling 10-point window
- Non-Numeric Handling: Non-numeric values (strings, objects) show `null` for EMA/SMA
- Precision: Values rounded to 3 decimal places to prevent floating-point noise
- Performance: Smaller response sizes when not needed

### Use Cases
Marine Data Examples:
- Wind Speed: EMA detects gusts quickly, SMA shows general wind conditions
- Battery Voltage: EMA shows charging/discharging trends, SMA indicates overall battery health
- Engine RPM: EMA responds to throttle changes, SMA shows average operating level
- Water Temperature: EMA detects thermal changes, SMA provides stable baseline
Available in:
- History API: Add `includeMovingAverages=true` to include EMA/SMA calculations
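When consuming a History API response that includes moving averages, it can help to turn each positional `data` row into an object keyed by path. The sketch below assumes only the response shape shown in the examples in this README (`values` plus `data`); the type names and the `rowsToRecords` helper are hypothetical.

```typescript
// Shapes inferred from the example responses; other fields are omitted.
interface HistoryValueSpec {
  path: string;
  method: string;
}
type HistoryRow = [string, ...unknown[]];
interface HistoryResponse {
  values: HistoryValueSpec[];
  data: HistoryRow[];
}

// Map each row to { timestamp, [path]: value }, using the order of `values`.
function rowsToRecords(res: HistoryResponse): Array<Record<string, unknown>> {
  return res.data.map(([timestamp, ...columns]) => {
    const record: Record<string, unknown> = { timestamp };
    res.values.forEach((spec, i) => {
      // Missing columns are reported as null rather than undefined.
      record[spec.path] = columns[i] ?? null;
    });
    return record;
  });
}

const sample: HistoryResponse = {
  values: [
    { path: 'environment.wind.speedApparent', method: 'average' },
    { path: 'environment.wind.speedApparent.ema', method: 'ema' },
    { path: 'environment.wind.speedApparent.sma', method: 'sma' },
  ],
  data: [
    ['2025-01-01T00:00:00Z', 12.5, 12.5, 12.5],
    ['2025-01-01T00:01:00Z', 13.2, 12.64, 12.85],
  ],
};
console.log(rowsToRecords(sample));
```

Keying by path keeps client code independent of column order, so the same code works whether or not the EMA/SMA columns were requested.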
## S3 Integration
### Upload Timing
Real-time Upload: Files are uploaded immediately after creation
```json
{
  "s3Upload": {
    "enabled": true,
    "timing": "realtime"
  }
}
```
Consolidation Upload: Files are uploaded after daily consolidation
```json
{
  "s3Upload": {
    "enabled": true,
    "timing": "consolidation"
  }
}
```

### S3 Key Structure
With prefix `marine-data/`:
```
marine-data/vessels/self/navigation/position/signalk_data_20250716_consolidated.parquet
marine-data/vessels/self/environment/wind/angleApparent/signalk_data_20250716_120000.parquet
```

## File Consolidation
The plugin automatically consolidates files daily at midnight UTC:
1. File Discovery: Finds all files for the previous day
2. Merging: Combines files by SignalK path
3. Sorting: Sorts records by timestamp
4. Cleanup: Moves source files to `processed/` directory
5. S3 Upload: Uploads consolidated files if configured

## Performance Characteristics
- Memory Usage: Configurable buffer sizes (default 1000 records)
- Disk I/O: Efficient batch writes with configurable intervals
- CPU Usage: Minimal - mostly I/O bound operations
- Network: Optional S3 uploads with retry logic
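To make the buffering idea concrete, here is a minimal sketch of a size-triggered record buffer. It is an illustration under stated assumptions, not the plugin's code: the `BufferedRecord` shape, the `RecordBuffer` class, and the flush callback are all hypothetical, and only the default of 1000 records comes from the list above.

```typescript
// Hypothetical record shape and flush callback, for illustration only.
interface BufferedRecord {
  path: string;
  timestamp: string;
  value: unknown;
}
type FlushFn = (records: BufferedRecord[]) => void;

class RecordBuffer {
  private records: BufferedRecord[] = [];
  private readonly flushFn: FlushFn;
  private readonly maxSize: number;

  constructor(flushFn: FlushFn, maxSize: number = 1000) {
    this.flushFn = flushFn;
    this.maxSize = maxSize;
  }

  // Queue a record and write a batch once the buffer is full.
  add(record: BufferedRecord): void {
    this.records.push(record);
    if (this.records.length >= this.maxSize) this.flush();
  }

  // Hand the current batch to the writer and start a new one.
  flush(): void {
    if (this.records.length === 0) return;
    const batch = this.records;
    this.records = [];
    this.flushFn(batch);
  }
}

const batchSizes: number[] = [];
const buffer = new RecordBuffer((batch) => batchSizes.push(batch.length), 3);
for (let i = 0; i < 7; i++) {
  buffer.add({ path: 'navigation.speedOverGround', timestamp: `t${i}`, value: i });
}
buffer.flush(); // write whatever is left
console.log(batchSizes);
```

An interval-based write could call `flush()` from a timer as well, so that slow-changing paths are still written within a bounded delay.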
## Development
### Project Structure
```
signalk-parquet/
├── src/
│   ├── index.ts              # Main plugin entry point and lifecycle (~340 lines)
│   ├── commands.ts           # Command management system (~400 lines)
│   ├── data-handler.ts       # Data processing, subscriptions, S3 (~650 lines)
│   ├── api-routes.ts         # Web API endpoints (~600 lines)
│   ├── types.ts              # TypeScript interfaces (~360 lines)
│   ├── parquet-writer.ts     # File writing logic
│   ├── HistoryAPI.ts         # SignalK History API implementation
│   ├── HistoryAPI-types.ts   # History API type definitions
│   └── utils/
│       └── path-helpers.ts   # Path utility functions
├── dist/                     # Compiled JavaScript
├── public/
│   ├── index.html            # Web interface
│   └── parquet.png           # Plugin icon
├── tsconfig.json             # TypeScript configuration
├── package.json              # Dependencies and scripts
└── README.md                 # This file
```

### Code Architecture
The plugin uses a modular TypeScript architecture for maintainability:
- `index.ts`: Plugin lifecycle, configuration, and initialization
- `commands.ts`: SignalK command registration, execution, and management
- `data-handler.ts`: Data subscriptions, buffering, consolidation, and S3 operations
- `api-routes.ts`: REST API endpoints for web interface
- `types.ts`: Comprehensive TypeScript type definitions
- `utils/`: Utility functions and helpers

### Adding New Features
1. API Endpoints: Add to `src/api-routes.ts`
2. Data Processing: Extend `src/data-handler.ts`
3. Commands: Modify `src/commands.ts`
4. Types: Add interfaces to `src/types.ts`
5. Claude AI Models: Update `src/claude-models.ts` (see below)
6. Update Documentation: Update README and inline comments

#### Updating Claude AI Models
When Anthropic releases new models, update the single source of truth in `src/claude-models.ts`:
```typescript
export const CLAUDE_MODELS = {
  OPUS_4_1: 'claude-opus-4-1-20250805',
  OPUS_4: 'claude-opus-4-20250514',
  SONNET_4: 'claude-sonnet-4-20250514',
  SONNET_4_5: 'claude-sonnet-4-5-20250929',
  // Add new models here
} as const;

export const SUPPORTED_CLAUDE_MODELS = [
  CLAUDE_MODELS.OPUS_4_1,
  CLAUDE_MODELS.OPUS_4,
  CLAUDE_MODELS.SONNET_4,
  CLAUDE_MODELS.SONNET_4_5,
  // Add to supported list
] as const;

export const DEFAULT_CLAUDE_MODEL = CLAUDE_MODELS.SONNET_4_5; // Update default if needed

export const CLAUDE_MODEL_DESCRIPTIONS = {
  [CLAUDE_MODELS.OPUS_4_1]: 'Claude Opus 4.1 (Most Capable & Intelligent)',
  [CLAUDE_MODELS.OPUS_4]: 'Claude Opus 4 (Previous Flagship)',
  [CLAUDE_MODELS.SONNET_4]: 'Claude Sonnet 4 (Balanced Performance)',
  [CLAUDE_MODELS.SONNET_4_5]: 'Claude Sonnet 4.5 (Latest Sonnet)',
  // Add descriptions for new models
} as const;
```
Why this matters:
- All model definitions are centralized in one file
- Type safety across the entire codebase
- Automatic migration of outdated models on plugin startup
- Prevents form validation errors when users have old model values saved
- No need to update multiple files when adding new models
The plugin automatically migrates old/invalid model values to the current default on startup, preventing configuration save failures.
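The startup migration described above could look roughly like the following. This is a sketch only: the model IDs and constant names are copied from the listing in this section, while `isSupportedModel` and `migrateModel` are hypothetical helper names, and the plugin's actual migration code may differ.

```typescript
// Model IDs copied from the listing above; helper names are hypothetical.
const SUPPORTED_CLAUDE_MODELS = [
  'claude-opus-4-1-20250805',
  'claude-opus-4-20250514',
  'claude-sonnet-4-20250514',
  'claude-sonnet-4-5-20250929',
] as const;

type ClaudeModel = (typeof SUPPORTED_CLAUDE_MODELS)[number];

const DEFAULT_CLAUDE_MODEL: ClaudeModel = 'claude-sonnet-4-5-20250929';

// Type guard: true only for strings that appear in the supported list.
function isSupportedModel(value: unknown): value is ClaudeModel {
  return (
    typeof value === 'string' &&
    (SUPPORTED_CLAUDE_MODELS as readonly string[]).includes(value)
  );
}

// Keep a supported saved value; replace anything else with the default.
function migrateModel(saved: unknown): ClaudeModel {
  return isSupportedModel(saved) ? saved : DEFAULT_CLAUDE_MODEL;
}

console.log(migrateModel('some-retired-model'));
console.log(migrateModel('claude-opus-4-20250514'));
```

Deriving the `ClaudeModel` type from the supported list is what gives the "single source of truth" property: adding a model ID to the array updates both the runtime check and the compile-time type.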
### TypeScript Configuration
The plugin uses strict TypeScript configuration:
```json
{
  "compilerOptions": {
    "strict": true,
    "noImplicitAny": true,
    "noImplicitReturns": true,
    "strictNullChecks": true
  }
}
```

## Troubleshooting
### Common Issues
Build Errors
```bash
# Clean and rebuild
npm run clean
npm run build
```
DuckDB Not Available
- Check that `@duckdb/node-api` is installed
- Verify Node.js version compatibility (>=16.0.0)

S3 Upload Failures
- Verify AWS credentials and permissions
- Check S3 bucket exists and is accessible
- Test connection using web interface
No Data Collection
- Verify path configurations are correct
- Check if regimens are properly activated
- Review SignalK logs for subscription errors
### Debug Logging
Enable debug logging in SignalK:
```json
{
  "settings": {
    "debug": "signalk-parquet*"
  }
}
```
### Runtime Dependencies
- `@dsnp/parquetjs`: Parquet file format support
- `@duckdb/node-api`: SQL query engine
- `@aws-sdk/client-s3`: S3 upload functionality
- `fs-extra`: Enhanced file system operations
- `glob`: File pattern matching
- `express`: Web server framework

### Development Dependencies
- `typescript`: TypeScript compiler
- `@types/node`: Node.js type definitions
- `@types/express`: Express type definitions
- `@types/fs-extra`: fs-extra type definitions

## License
MIT License - See LICENSE file for details.
## Testing
Comprehensive testing procedures are documented in `TESTING.md`. The testing guide covers:
- Installation and build verification
- Plugin configuration testing
- Web interface functionality
- Data collection validation
- Regimen control testing
- File output verification
- S3 integration testing
- API endpoint testing
- Performance testing
- Error handling validation
### Quick Test
```bash
# Test plugin health
curl http://localhost:3000/plugins/signalk-parquet/api/health

# Test path configuration
curl http://localhost:3000/plugins/signalk-parquet/api/config/paths

# Test data collection
curl http://localhost:3000/plugins/signalk-parquet/api/paths

# Test History API
curl "http://localhost:3000/signalk/v1/history/contexts"
```

## TODO
- [x] Implement startup consolidation for missed previous days (exclude current day)
- [x] Add history API integration
- [ ] Incorporate user preferences from units-preference in the regimen filter system
- [ ] Expose recorded spatial event via api endpoint (geojson)
- [ ] Add Grafana integration
## Contributing
1. Fork the repository
2. Create a feature branch
3. Add TypeScript types for new features
4. Include tests and documentation
5. Follow the testing procedures in `TESTING.md`
6. Submit a pull request

## Changelog
See CHANGELOG.md for complete version history.
$3
- SignalK History API Compliance: Full support for all 5 standard time range patterns
- Backward Compatibility: Legacy `start` parameter supported with deprecation warnings
- Optional Moving Averages: EMA/SMA now opt-in via `includeMovingAverages` parameter
- Time-Filtered Discovery: Paths and contexts endpoints accept time range parameters
- Performance: 4.3x faster context discovery (13s → 3s) with SQL optimization and caching

$3
- Threshold Automation State Machine Fix: Fixed automation enable/disable transitions to properly execute state changes
  - When enabling automation (`.auto = true`): Command is now set to OFF, then all thresholds are immediately evaluated
  - When disabling automation (`.auto = false`): Threshold monitoring stops and command state remains unchanged
  - Default command state is hardcoded to OFF on server side
  - Fixed `autoPutHandler` in `src/commands.ts` to execute one-time transition operations when `.auto` path toggles
  - Ensures thresholds take control immediately upon automation enable instead of waiting for next SignalK delta update
  - Removed user-configurable default state dropdown from UI (both add and edit command forms)

$3
- Front-end Modularization: Replaced the 5,000-line inline dashboard script with focused JS modules under `public/js`, improving readability and maintainability.
- Threshold Automation Fix: Threshold monitoring now listens to raw SignalK values via `getSelfStream`, so saved trigger conditions reliably toggle their commands.