Node.js client for StatsD, DogStatsD, and Telegraf
npm install hot-shots
A Node.js client for Datadog's DogStatsD server, InfluxDB's Telegraf StatsD server, the OpenTelemetry Collector StatsD receiver, and Etsy's StatsD server.
This project was originally a fork of node-statsd. It includes all changes
in the latest node-statsd and many additional changes, including:
* uds (Unix domain socket) protocol support
* raw stream protocol support
* TypeScript types
* Telegraf support
* events
* child clients
* tcp protocol support
* mock mode
* asyncTimer
* asyncDistTimer
* debug logging
* much more, including many bug fixes
You can read about all changes in the changelog.
hot-shots supports Node 16.x and higher.
All initialization parameters are optional.
Parameters (specified as one object passed into hot-shots):
* host: The host to send stats to. If not set, the constructor tries to retrieve it from the DD_AGENT_HOST environment variable. default: undefined, which per the UDP/datagram socket docs results in 127.0.0.1 or ::1 being used.
* port: The port to send stats to. If not set, the constructor tries to retrieve it from the DD_DOGSTATSD_PORT environment variable. default: 8125
* prefix: What to prefix each stat name with. default: ''. A period separator is automatically added if not present (e.g. my_prefix becomes my_prefix.).
* suffix: What to suffix each stat name with. default: ''. A period separator is automatically added if not present (e.g. my_suffix becomes .my_suffix).
* tagPrefix: The character that prefixes the tag list. default: '#'. Note: does not work with the telegraf option.
* tagSeparator: The character that separates tags. default: ','. Note: does not work with the telegraf option.
* globalize: Expose this StatsD instance globally. default: false
* cacheDns: Caches the DNS lookup of host for cacheDnsTtl milliseconds. Only used when protocol is udp. default: false
* cacheDnsTtl: Time-to-live of DNS lookups in milliseconds, when cacheDns is enabled. default: 60000
* mock: Create a mock StatsD instance, using a mock transport that doesn't create real sockets.
Stats are not sent to the server but can be read from mockBuffer for testing. Note that
mockBuffer will keep growing, so only use for testing or clear out periodically. default: false
* globalTags: Tags that will be added to every metric. Can be either an object or list of tags. default: {}.
* includeDataDogTags: Whether to add Datadog tags to the global tags. default: true. The following Datadog tags are appended to globalTags from the corresponding environment variable if the latter is set:
  * dd.internal.entity_id from DD_ENTITY_ID (docs)
  * env from DD_ENV (docs)
  * service from DD_SERVICE (docs)
  * version from DD_VERSION (docs)
* maxBufferSize: If larger than 0, metrics will be buffered and only sent when the string length is greater than the size. default: 0 for udp and tcp. default: 8192 for uds.
* bufferFlushInterval: If buffering is in use, this is the time in ms to always flush any buffered metrics. default: 1000
* telegraf: Use Telegraf's StatsD line protocol, which is slightly different from the others. default: false
* sampleRate: Sends only a sample of data to StatsD for all StatsD methods. Can be overridden at the method level. default: 1
* errorHandler: A function with one argument. It is called to handle various errors. default: none, errors are thrown or logged to the console
* useDefaultRoute: Use the default interface on a Linux system. Useful when running in containers
* protocol: Use tcp option for TCP protocol, or uds for the Unix Domain Socket protocol or stream for the raw stream. Defaults to udp otherwise.
* path: Used only when the protocol is uds. Defaults to /var/run/datadog/dsd.socket.
* stream: Reference to a stream instance. Used only when the protocol is stream.
* tcpGracefulErrorHandling: Used only when the protocol is tcp. Boolean indicating whether to handle socket errors gracefully. Defaults to true.
* tcpGracefulRestartRateLimit: Used only when the protocol is tcp. Time (ms) between re-creating the socket. Defaults to 1000.
* udsGracefulErrorHandling: Used only when the protocol is uds. Boolean indicating whether to handle socket errors gracefully. Defaults to true.
* udsGracefulRestartRateLimit: Used only when the protocol is uds. Time (ms) between re-creating the socket. Defaults to 1000.
* closingFlushInterval: Before closing, StatsD will check for inflight messages. Time (ms) between each check. Defaults to 50.
* udsRetryOptions: Used only when the protocol is uds. Retry/backoff options for UDS sends:
  * retries: Number of retry attempts for failed packet sends. Defaults to 3.
  * retryDelayMs: Initial delay in milliseconds before retrying a failed packet send. Defaults to 100.
  * maxRetryDelayMs: Maximum delay in milliseconds between retry attempts (caps exponential backoff). Defaults to 1000.
  * backoffFactor: Exponential backoff multiplier for retry delays. Defaults to 2.
* udpSocketOptions: Used only when the protocol is udp. Specify the options passed into dgram.createSocket(). The socket type (udp4 or udp6) is auto-detected based on the host: IPv6 addresses (e.g., ::1) use udp6, IPv4 addresses use udp4, and hostnames default to udp4. You can override auto-detection by explicitly setting type (e.g., { type: 'udp6' }).
* includeDatadogTelemetry: Enable client-side telemetry to track metrics about the client itself. This helps diagnose high-throughput metric delivery issues. Telemetry metrics are prefixed with datadog.dogstatsd.client. and are not billed as custom metrics. default: false. See Client-Side Telemetry for details.
* telemetryFlushInterval: When telemetry is enabled, how often (in ms) to send telemetry metrics. default: 10000
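The udsRetryOptions values describe a capped exponential backoff. As a rough sketch of the delays those defaults would imply, assuming the delay before each retry is retryDelayMs multiplied by backoffFactor once per previous attempt and capped at maxRetryDelayMs (the helper name backoffDelays is hypothetical and is not part of hot-shots):

```javascript
// Hypothetical helper, not part of hot-shots: lists the delay before each retry,
// assuming delay = retryDelayMs * backoffFactor^attempt, capped at maxRetryDelayMs.
function backoffDelays(options) {
  const opts = Object.assign(
    { retries: 3, retryDelayMs: 100, maxRetryDelayMs: 1000, backoffFactor: 2 },
    options
  );
  const delays = [];
  for (let attempt = 0; attempt < opts.retries; attempt++) {
    const delay = opts.retryDelayMs * Math.pow(opts.backoffFactor, attempt);
    delays.push(Math.min(delay, opts.maxRetryDelayMs));
  }
  return delays;
}

console.log(backoffDelays());               // delays of 100, 200 and 400 ms
console.log(backoffDelays({ retries: 6 })); // 100, 200, 400, 800, then capped at 1000, 1000
```

Under these assumptions the defaults add at most 700 ms of waiting before a packet is given up on, which may help when choosing values for a high-throughput service.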
All StatsD methods other than event, close, and check have the same API:
* name: Stat name required
* value: Stat value required except in increment/decrement where it defaults to 1/-1 respectively
* sampleRate: Sends only a sample of data to StatsD default: 1
* tags: The tags to add to metrics. Can be either an object { tag: "value"} or an array of tags. default: []
* callback: The callback to execute once the metric has been sent or buffered
Alternatively, you can pass an options object in place of sampleRate and tags:
* options: An object with optional properties:
  * sampleRate: Sends only a sample of data to StatsD default: 1
  * tags: The tags to add to metrics default: []
  * timestamp: A timestamp to associate with the metric. Can be a Date object or Unix timestamp in seconds. (DogStatsD only, ignored for Telegraf)
* callback: The callback to execute once the metric has been sent or buffered
If an array is specified as the name parameter each item in that array will be sent along with the specified value.
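To make the parameters above more concrete, here is a minimal sketch of how a name (or array of names), a value, a sample rate, and tags might map onto DogStatsD-style datagram lines of the form name:value|type|@sampleRate|#tags. The helper formatMetricLines is hypothetical and only illustrates the wire format; hot-shots' own formatting code may differ in details such as prefixes, global tags, and Telegraf mode.

```javascript
// Hypothetical helper, not hot-shots' implementation: builds DogStatsD-style lines
// of the form name:value|type|@sampleRate|#tag1,tag2
function formatMetricLines(name, value, type, options) {
  const opts = options || {};
  // An array of names produces one line per name, all with the same value.
  const names = Array.isArray(name) ? name : [name];
  // Tags may be given as an object ({ tag: 'value' }) or as an array (['tag:value']).
  const tags = Array.isArray(opts.tags)
    ? opts.tags
    : Object.entries(opts.tags || {}).map(([key, val]) => `${key}:${val}`);

  return names.map((metricName) => {
    let line = `${metricName}:${value}|${type}`;
    if (opts.sampleRate !== undefined && opts.sampleRate < 1) {
      line += `|@${opts.sampleRate}`;
    }
    if (tags.length > 0) {
      line += `|#${tags.join(',')}`;
    }
    return line;
  });
}

console.log(formatMetricLines('my_counter', 1, 'c', { sampleRate: 0.25, tags: ['foo', 'bar'] }));
// one line: my_counter:1|c|@0.25|#foo,bar
console.log(formatMetricLines(['a', 'b'], 42, 'g', { tags: { env: 'dev' } }));
// two lines: a:42|g|#env:dev and b:42|g|#env:dev
```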
#### close
The close method has the following API:
* callback: The callback to execute once close is complete. All other calls to statsd will fail once this is called.
#### event
The event method has the following API:
* title: Event title required
* text: Event description default is title
* options: Options for the event
  * date_happened Assign a timestamp to the event default is now
  * hostname Assign a hostname to the event.
  * aggregation_key Assign an aggregation key to the event, to group it with some others.
  * priority Can be ‘normal’ or ‘low’ default: normal
  * source_type_name Assign a source type to the event.
  * alert_type Can be ‘error’, ‘warning’, ‘info’ or ‘success’ default: info
* tags: The tags to add to metrics. Can be either an object { tag: "value"} or an array of tags. default: []
* callback: The callback to execute once the metric has been sent.
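The event options correspond to fields of the DogStatsD event datagram. The sketch below shows roughly how they might be laid out on the wire, assuming the standard DogStatsD event layout (_e{title length,text length}:title|text followed by optional d:, h:, k:, p:, s:, t: fields and tags). The helper formatEvent is hypothetical, and date_happened is assumed to already be a Unix timestamp in seconds.

```javascript
// Hypothetical helper, not hot-shots' implementation.
function formatEvent(title, text, options, tags) {
  const opts = options || {};
  const body = text || title; // the text defaults to the title
  let line = `_e{${Buffer.byteLength(title)},${Buffer.byteLength(body)}}:${title}|${body}`;
  // Each option maps to a one-letter field key in the datagram.
  const fields = [
    ['d', opts.date_happened],
    ['h', opts.hostname],
    ['k', opts.aggregation_key],
    ['p', opts.priority],
    ['s', opts.source_type_name],
    ['t', opts.alert_type],
  ];
  for (const [key, val] of fields) {
    if (val !== undefined) line += `|${key}:${val}`;
  }
  if (tags && tags.length > 0) line += `|#${tags.join(',')}`;
  return line;
}

console.log(formatEvent('my_title', 'description', { alert_type: 'warning' }, ['env:dev']));
// _e{8,11}:my_title|description|t:warning|#env:dev
```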
#### check
The check method has the following API:
* name: Check name required
* status: Check status required
* options: Options for the check
  * date_happened Assign a timestamp to the check default is now
  * hostname Assign a hostname to the check.
  * message Assign a message to the check.
* tags: The tags to add to metrics. Can be either an object { tag: "value"} or an array of tags. default: []
* callback: The callback to execute once the metric has been sent.
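Service checks follow a similar pattern. As a sketch, assuming the standard DogStatsD service check layout (_sc|name|status followed by optional d:, h:, tags, and m: fields, with the message last), the parameters above might be serialized like this. The helper formatCheck and the STATUS constant are hypothetical stand-ins; in hot-shots the status values are available on client.CHECKS.

```javascript
// DogStatsD service check status codes (hypothetical local copy for this sketch).
const STATUS = { OK: 0, WARNING: 1, CRITICAL: 2, UNKNOWN: 3 };

// Hypothetical helper, not hot-shots' implementation. date_happened is assumed to
// already be a Unix timestamp in seconds.
function formatCheck(name, status, options, tags) {
  const opts = options || {};
  let line = `_sc|${name}|${status}`;
  if (opts.date_happened !== undefined) line += `|d:${opts.date_happened}`;
  if (opts.hostname !== undefined) line += `|h:${opts.hostname}`;
  if (tags && tags.length > 0) line += `|#${tags.join(',')}`;
  if (opts.message !== undefined) line += `|m:${opts.message}`; // the message goes last
  return line;
}

console.log(formatCheck('service.up', STATUS.OK, { hostname: 'host-1' }, ['foo', 'bar']));
// _sc|service.up|0|h:host-1|#foo,bar
```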
```javascript
// errorHandler receives any error raised by the client
function errorHandler(error) {
  console.error('hot-shots error:', error);
}

var StatsD = require('hot-shots'),
    client = new StatsD({
      port: 8020,
      globalTags: { env: process.env.NODE_ENV },
      errorHandler: errorHandler,
    });

// Increment: Increments a stat by a value (default is 1)
client.increment('my_counter');

// Decrement: Decrements a stat by a value (default is -1)
client.decrement('my_counter');

// Histogram: send data for histogram stat (DataDog and Telegraf only)
client.histogram('my_histogram', 42);

// Distribution: Tracks the statistical distribution of a set of values across your infrastructure.
// (DataDog v6)
client.distribution('my_distribution', 42);

// Gauge: Gauge a stat by a specified amount
client.gauge('my_gauge', 123.45);

// Gauge delta: Change a gauge by a specified amount rather than setting it
client.gaugeDelta('my_gauge', -10);
client.gaugeDelta('my_gauge', 4);

// Set: Counts unique occurrences of a stat (alias of unique)
client.set('my_unique', 'foobar');
client.unique('my_unique', 'foobarbaz');

// Event: sends the titled event (DataDog only)
client.event('my_title', 'description');

// Check: sends a service check (DataDog only)
client.check('service.up', client.CHECKS.OK, { hostname: 'host-1' }, ['foo', 'bar']);

// Incrementing multiple items
client.increment(['these', 'are', 'different', 'stats']);

// Incrementing with tags
client.increment('my_counter', ['foo', 'bar']);

// Sampling: sends the stat 25% of the time; the StatsD daemon compensates for sampling
client.increment('my_counter', 1, 0.25);

// Tags, this will add user-defined tags to the data
// (DataDog and Telegraf only)
client.histogram('my_histogram', 42, ['foo', 'bar']);

// Options object, allows combining sampleRate, tags, and timestamp
// (DataDog only for timestamp)
client.gauge('my_gauge', 42, { sampleRate: 0.25, tags: ['foo', 'bar'] });

// Timestamp: send a metric with a specific timestamp (DataDog only)
client.gauge('my_gauge', 42, { timestamp: new Date('2022-01-01') });
client.increment('my_counter', 1, { timestamp: 1640995200 }); // Unix seconds

// Using the callback. This is the same format for the callback
// with all non-close calls
client.set(['foo', 'bar'], 42, function(error, bytes){
  // this only gets called once after all messages have been sent
  if(error){
    console.error('Oh noes! There was an error:', error);
  } else {
    console.log('Successfully sent', bytes, 'bytes');
  }
});

// Timing: sends a timing command with the specified milliseconds
client.timing('response_time', 42);

// Timing: also accepts a Date object; the time elapsed since that date is sent
client.timing('response_time', new Date());

// Timing: measuring elapsed time with Date.now()
var startTime = Date.now();
// ... your code here ...
client.timing('response_time', Date.now() - startTime);

// Timer: Returns a function that you call to record how long the first
// parameter takes to execute (in milliseconds) and then sends that value
// using 'client.timing'.
// The parameters after the first one (in this case 'fn')
// match those in 'client.timing'.
var fn = function(a, b) { return a + b };
client.timer(fn, 'fn_execution_time')(2, 2);

// Async timer: Similar to timer above, but you instead pass in a function
// that returns a Promise. And then it returns a Promise that will record the timing.
// The 100 ms delay stands in for real asynchronous work.
var asyncFn = function () { return new Promise(function (resolve, reject) { setTimeout(resolve, 100); }); };
var instrumented = client.asyncTimer(asyncFn, 'fn_execution_time');
instrumented().then(function() {
  console.log('Code run and metric sent');
});

// Async dist timer: Similar to asyncTimer above, but it instead emits a distribution.
var instrumentedDist = client.asyncDistTimer(asyncFn, 'fn_execution_time');
instrumentedDist().then(function() {
  console.log('Code run and metric sent');
});

// Sampling, tags and callback are optional and could be used in any combination (DataDog and Telegraf only)
// 'next' is any callback taking (error, bytes)
var next = function (error, bytes) {};
client.histogram('my_histogram', 42, 0.25); // 25% Sample Rate
client.histogram('my_histogram', 42, { tag: 'value'}); // User-defined tag
client.histogram('my_histogram', 42, ['tag:value']); // Tags as an array
client.histogram('my_histogram', 42, next); // Callback
client.histogram('my_histogram', 42, 0.25, ['tag']);
client.histogram('my_histogram', 42, 0.25, next);
client.histogram('my_histogram', 42, { tag: 'value'}, next);
client.histogram('my_histogram', 42, 0.25, { tag: 'value'}, next);

// Use a child client to add more context to the client.
// Clients can be nested.
var childClient = client.childClient({
  prefix: 'additionalPrefix.',
  suffix: '.additionalSuffix',
  globalTags: { globalTag1: 'forAllMetricsFromChildClient'}
});
childClient.increment('my_counter_with_more_tags');

// Close statsd. This will ensure all stats are sent and stop statsd
// from doing anything more.
client.close(function(err) {
  if (err) {
    console.log('The close did not work quite right: ', err);
  }
});

// UDS client with automatic retry on packet failures
var udsClient = new StatsD({
  protocol: 'uds',
  path: '/var/run/datadog/dsd.socket',
  udsRetryOptions: {
    // Retry options (all optional, showing defaults):
    // retries: 3,            // Number of retry attempts (set to 0 to disable)
    // retryDelayMs: 100,     // Initial delay in ms
    // maxRetryDelayMs: 1000, // Maximum delay cap in ms
    // backoffFactor: 2       // Exponential backoff multiplier
  }
});
```
Some of the functionality mentioned above is specific to certain backends and will not work with others.
* globalTags parameter - DogStatsD, Telegraf, or OpenTelemetry
* tags parameter - DogStatsD, Telegraf, or OpenTelemetry
* histogram method - DogStatsD, Telegraf, or OpenTelemetry
* telegraf parameter - Telegraf
* uds option in protocol parameter - DogStatsD
* distribution method - DogStatsD
* set / unique method - DogStatsD or Telegraf (not OpenTelemetry)
* event method - DogStatsD
* check method - DogStatsD
* timestamp option - DogStatsD
* includeDatadogTelemetry parameter - DogStatsD
* telemetryFlushInterval parameter - DogStatsD
hot-shots is compatible with the OpenTelemetry Collector's StatsD receiver. The following features work out of the box:
| Feature | hot-shots Method | OTel Support |
|---------|------------------|--------------|
| Counter | increment(), decrement() | Yes |
| Gauge | gauge() | Yes |
| Gauge delta (+/-) | gaugeDelta() | Yes |
| Timer | timing() | Yes (converted to gauge/summary/histogram) |
| Histogram | histogram() | Yes (treated as timer) |
| Sample rate | All methods | Yes |
| Tags | All methods | Yes |
Example configuration for OpenTelemetry Collector:
```javascript
var client = new StatsD({
  host: 'localhost',
  port: 8125,
  protocol: 'udp' // or 'tcp'
});

// These all work with OpenTelemetry
client.increment('requests');
client.gauge('queue_size', 100);
client.gaugeDelta('connections', 1);
client.timing('response_time', 250);
client.histogram('request_size', 1024);
```
To prevent malformed packets, hot-shots automatically replaces protocol-breaking characters with underscores (_).
* Metric names: `:`, `|`, `\n`
* Tag keys: `:`, `|`, `,`, `\n`, plus `@` and `#` for StatsD/DogStatsD
* Tag values: `|`, `,`, `\n`, plus `@` and `#` for StatsD/DogStatsD
Colons are allowed in tag values (e.g., url:https://example.com:8080).
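As an illustration of these rules, the following sketch replaces the listed characters with underscores for the StatsD/DogStatsD (non-Telegraf) case. The three helper functions are hypothetical and only mirror the character lists above as read here; they are not hot-shots' internal functions.

```javascript
// Hypothetical helpers mirroring the sanitization rules for StatsD/DogStatsD mode.
// Metric names: colon, pipe, newline.
const sanitizeName = (name) => String(name).replace(/[:|\n]/g, '_');
// Tag keys: colon, pipe, comma, newline, plus @ and #.
const sanitizeTagKey = (key) => String(key).replace(/[:|,\n@#]/g, '_');
// Tag values: pipe, comma, newline, plus @ and # (colons are kept).
const sanitizeTagValue = (value) => String(value).replace(/[|,\n@#]/g, '_');

console.log(sanitizeName('http:requests|total'));          // http_requests_total
console.log(sanitizeTagKey('user@host'));                  // user_host
console.log(sanitizeTagValue('https://example.com:8080')); // https://example.com:8080
```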
As usual, callbacks will have an error as their first parameter. You can have an error in both the message and close callbacks.
If the optional callback is not given, an error is thrown in some
cases and a console.log message is used in others. An error is only
thrown explicitly when no callback is given or when it points to a potential configuration issue that needs fixing.
If you would like to ensure all errors are caught, specify an errorHandler in your root
client. This will catch errors in socket setup, sending of messages,
and closing of the socket. If you specify an errorHandler and a callback, the callback will take precedence.
```javascript
// Using errorHandler
var client = new StatsD({
  errorHandler: function (error) {
    console.log("Socket errors caught here: ", error);
  }
})
```
If you get an error like Error sending hot-shots message: Error: congestion with an error code of 1,
it is probably because you are sending large volumes of metrics to a single agent/server.
This error only arises when using the UDS protocol and means that packets are being dropped.
Take a look at the Datadog docs for some tips on tuning your connection.
Metrics sent from process.on('exit') handlers will not be delivered. This is a fundamental Node.js limitation, not a bug in hot-shots. When the exit event fires, the event loop has stopped processing async operations, so socket send callbacks will never execute.
The same applies to process.on('uncaughtExceptionMonitor') since that handler is also synchronous.
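The behavior can be reproduced without hot-shots. The sketch below spawns a child Node.js process whose exit handler does one synchronous log and schedules one timer; only the synchronous log appears, because queued asynchronous work is abandoned once the exit event fires. The timer here is a stand-in for an asynchronous socket send.

```javascript
const { spawnSync } = require('child_process');

// Child script: the 'exit' handler logs synchronously and also schedules a timer.
const childScript = `
  process.on('exit', () => {
    setTimeout(() => console.log('async work'), 0); // abandoned: the loop no longer runs
    console.log('sync work');
  });
`;

// Run the script in a separate Node.js process and capture what it printed.
const result = spawnSync(process.execPath, ['-e', childScript], { encoding: 'utf8' });
console.log(JSON.stringify(result.stdout)); // "sync work\n"
```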
Alternatives that work:
Use beforeExit for graceful shutdown (fires when event loop is empty but before exit):
```javascript
process.on('beforeExit', (code) => {
  client.increment('app.shutdown');
  client.close();
});
```
Use signal handlers for external shutdown requests:
```javascript
function gracefulShutdown(signal) {
  client.increment('app.shutdown', [`signal:${signal}`]);
  client.close(() => {
    process.exit(0);
  });
}
process.on('SIGTERM', () => gracefulShutdown('SIGTERM'));
process.on('SIGINT', () => gracefulShutdown('SIGINT'));
```
For uncaught exceptions, use uncaughtException (not uncaughtExceptionMonitor) and delay exit:
```javascript
process.on('uncaughtException', (err) => {
  client.increment('app.crash');
  client.close(() => {
    console.error('Uncaught exception:', err);
    process.exit(1);
  });
});
```
If you're having issues with metrics not being sent or want to understand what hot-shots is doing
in detail, you can enable debug logging using Node.js's built-in NODE_DEBUG environment variable:
```bash
NODE_DEBUG=hot-shots node your-app.js
```
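NODE_DEBUG is the switch read by Node's util.debuglog, which creates a logger that stays silent unless its section name is listed in that variable. The sketch below shows the mechanism with a made-up section name (my-section); that hot-shots registers its logger under the hot-shots section is an assumption based on the command above.

```javascript
const util = require('util');

// 'my-section' is a hypothetical section name used only for this demo.
const debug = util.debuglog('my-section');

// Prints to stderr only when NODE_DEBUG includes my-section, for example:
//   NODE_DEBUG=my-section node demo.js
debug('sending %d bytes to %s', 42, 'localhost:8125');

// debug.enabled reports whether the section is currently switched on.
console.log('my-section logging enabled:', debug.enabled);
```

Multiple sections can be combined with commas, so NODE_DEBUG=hot-shots,net would also turn on Node's own net module logging alongside the client's output.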
The 'uds' protocol option supports Unix Domain Sockets for Datadog. It has the following limitations:
- It only works where 'node-gyp' works. If you don't know what this is, this
is probably fine for you. If you have had any trouble with libraries that
use 'node-gyp' before, you will have problems here as well.
- It does not work on Windows
The above will cause the underlying library that is used, unix-dgram,
to not install properly. Given the library is listed as an
optionalDependency, and how it's used in the codebase, this install
failure will not cause any problems. It only means that you can't use
the uds feature.
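If an application needs to run both where uds is available and where it is not, one option is to check for the optional dependency before choosing a protocol. This is only a sketch: the helper pickProtocol is hypothetical, and it assumes unix-dgram is resolvable from your application's node_modules (which depends on how your package manager lays out dependencies).

```javascript
// Hypothetical helper: picks 'uds' only when the optional unix-dgram module can be
// resolved, otherwise falls back to 'udp'.
function pickProtocol() {
  if (process.platform === 'win32') {
    return 'udp'; // uds is not supported on Windows
  }
  try {
    require.resolve('unix-dgram'); // throws if the optional dependency is not installed
    return 'uds';
  } catch (err) {
    return 'udp';
  }
}

const protocol = pickProtocol();
console.log('Using protocol:', protocol);
// The result could then be passed as the protocol option, e.g. new StatsD({ protocol }).
```

Note that the two protocols take different connection settings (path for uds, host and port for udp), so both sets would need to be configured for the fallback to be useful.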
When includeDatadogTelemetry is enabled, the client automatically sends telemetry metrics about itself to help diagnose metric delivery issues in high-throughput scenarios. This feature should match the behavior of official Datadog clients as described in the docs.
Telemetry is automatically disabled when using mock: true, telegraf: true, or in child clients.
The following metrics are sent every telemetryFlushInterval milliseconds (default: 10 seconds):
| Metric | Description |
|--------|-------------|
| datadog.dogstatsd.client.metrics | Total number of metrics sent |
| datadog.dogstatsd.client.metrics_by_type | Metrics broken down by type (gauge, count, set, timing, histogram, distribution) |
| datadog.dogstatsd.client.events | Total number of events sent |
| datadog.dogstatsd.client.service_checks | Total number of service checks sent |
| datadog.dogstatsd.client.bytes_sent | Total bytes successfully sent |
| datadog.dogstatsd.client.bytes_dropped | Total bytes dropped |
| datadog.dogstatsd.client.packets_sent | Total packets successfully sent |
| datadog.dogstatsd.client.packets_dropped | Total packets dropped |
The metric_dropped_on_receive from the official Datadog clients is intentionally omitted. That metric tracks drops on an internal receive channel, which doesn't apply to hot-shots' architecture. bytes_dropped_queue is also omitted because it does not fit how hot-shots works.
All telemetry metrics include these tags:
* `client:nodejs` - Identifies the hot-shots client
* `client_version:<version>` - The hot-shots version
* `client_transport:<protocol>` - The transport protocol (udp, tcp, uds, stream)
```javascript
var client = new StatsD({
  host: 'localhost',
  includeDatadogTelemetry: true,
  telemetryFlushInterval: 10000 // Optional, default is 10 seconds
});
```
Thanks for considering making any updates to this project! This project is entirely community-driven, and so your changes are important. Here are the steps to take in your fork:
1. Run "npm install"
2. Add your changes in your fork as well as any new tests needed
3. Run "npm test"
4. Update README.md with any needed documentation
5. If you have made any API changes, update types.d.ts
6. Push your changes and create the PR
When you've done all this we're happy to try to get this merged in right away.
Versions will attempt to follow semantic versioning, with major changes only coming in major versions.
npm publishing is possible by one person, bdeitte, who has two-factor authentication enabled for publishes. Publishes only contain one additional library, unix-dgram.
hot-shots is licensed under the MIT license.