clustering-tfjs

![npm version](https://www.npmjs.com/package/clustering-tfjs)
![Build Status](https://github.com/CRJFisher/clustering-js/actions)
![License: MIT](https://opensource.org/licenses/MIT)
![TypeScript](https://www.typescriptlang.org/)

Native TypeScript implementation of clustering algorithms powered by TensorFlow.js with full browser and Node.js support.

Features

- ✅ Pure TypeScript/JavaScript (no Python required)
- ✅ Multiple clustering algorithms (K-Means, Spectral, Agglomerative)
- ✅ Powered by TensorFlow.js for performance
- ✅ Works in both Node.js and browsers
- ✅ Platform-optimized bundles (49KB for browser, 163KB for Node.js)
- ✅ TypeScript support with full type definitions
- ✅ GPU acceleration available (WebGL in browser, CUDA in Node.js)
- ✅ Automatic backend selection
- ✅ Extensively tested for parity with scikit-learn

1. Quick Start
2. Installation
3. Algorithms
4. Validation Metrics
5. Backend Selection
6. API Reference
7. Examples
8. Performance
9. Migration from scikit-learn
10. Contributing
11. License

Quick Start

$3

``bash

`For Node.js with acceleration`


npm install clustering-tfjs @tensorflow/tfjs-node
For Node.js with GPU support

npm install clustering-tfjs @tensorflow/tfjs-node-gpu
For browser usage (TensorFlow.js loaded separately)

npm install clustering-tfjs


> Note: For Windows users or if you encounter native binding issues, see our Windows Compatibility Guide.
$3
#### Browser

`html

#### Node.js

`typescript import { Clustering } from 'clustering-tfjs';

// Initialize (optional - auto-detects best backend) await Clustering.init();

// Use algorithms const kmeans = new Clustering.KMeans({ nClusters: 3 }); const data = [[1, 2], [1.5, 1.8], [5, 8], [8, 8], [1, 0.6], [9, 11]]; const labels = await kmeans.fitPredict(data); console.log(labels); // [0, 0, 1, 1, 0, 2]`

`Installation`

`$3`

`bash

`Basic installation (pure JavaScript backend)`


npm install clustering-tfjs
Recommended: With native acceleration

npm install clustering-tfjs @tensorflow/tfjs-node
Optional: With GPU support

npm install clustering-tfjs @tensorflow/tfjs-node-gpu


$3
The browser bundle is available via CDN:

`html

Or install via npm and use with a bundler:

`bash npm install clustering-tfjs @tensorflow/tfjs`

`Algorithms`

`$3`

- Classic centroid-based clustering - Supports custom initialization methods - K-Means++ initialization by default

`$3`

- Graph-based clustering using eigendecomposition - Ideal for non-convex clusters - Supports custom affinity functions

`$3`

- Hierarchical bottom-up clustering - Multiple linkage criteria (ward, complete, average, single) - Memory efficient implementation

`Validation Metrics`

The library includes three validation metrics to evaluate clustering quality and optimize the number of clusters:

`$3`

Measures how similar an object is to its own cluster compared to other clusters. Range: [-1, 1], higher is better.

`$3`

Evaluates intra-cluster and inter-cluster distances. Range: [0, ∞), lower is better.

`$3`

Ratio of between-cluster to within-cluster dispersion. Range: [0, ∞), higher is better.

`$3`

The library includes a built-in findOptimalClusters function that automatically determines the optimal number of clusters:

`typescript import { findOptimalClusters } from 'clustering-tfjs';

// Find optimal k between 2 and 10 clusters const result = await findOptimalClusters(data, { minClusters: 2, maxClusters: 10, algorithm: 'kmeans' // or 'spectral', 'agglomerative' });

console.log(Optimal number of clusters: ${result.optimal.k}); console.log(Silhouette score: ${result.optimal.silhouette}); console.log(All evaluations:, result.evaluations);

// Advanced usage with custom scoring const customResult = await findOptimalClusters(data, { maxClusters: 8, algorithm: 'spectral', algorithmParams: { affinity: 'nearest_neighbors' }, metrics: ['silhouette', 'calinskiHarabasz'], // Skip Davies-Bouldin scoringFunction: (evaluation) => evaluation.silhouette * 2 + evaluation.calinskiHarabasz });`

`Platform Detection & Backend Selection`

The library automatically detects your environment and selects the best backend:

`typescript import { Clustering } from 'clustering-tfjs';

// Check current platform console.log('Platform:', Clustering.platform); // 'browser' or 'node'

// Check available features console.log('Features:', Clustering.features); // { // gpuAcceleration: true, // wasmSimd: false, // nodeBindings: true, // webgl: false // }

// Manually select backend await Clustering.init({ backend: 'webgl' }); // Browser await Clustering.init({ backend: 'tensorflow' }); // Node.js`

`$3`

| Backend | Environment | Use Case | Performance | |---------|------------|----------|-------------| |cpu| Both | Pure JS fallback | Baseline | |webgl| Browser | GPU acceleration | 5-10x faster | |wasm| Browser | CPU optimization | 2-3x faster | |tensorflow | Node.js | Native bindings | 10-20x faster |

The library automatically selects the best available backend if not specified.

`API Reference`

`$3`

All algorithms implement the same interface:

`typescript interface ClusteringAlgorithm { fit(X: Tensor2D | number[][]): Promise; predict(X: Tensor2D | number[][]): Promise; fitPredict(X: Tensor2D | number[][]): Promise; }`

`$3`

`typescript new KMeans({ nClusters: number; init?: 'k-means++' | 'random' | number[][]; nInit?: number; maxIter?: number; tol?: number; // backend selection coming in future version })`

`$3`

`typescript new SpectralClustering({ nClusters: number; affinity?: 'rbf' | 'nearest_neighbors'; gamma?: number; nNeighbors?: number; // backend selection coming in future version })`

`$3`

`typescript new AgglomerativeClustering({ nClusters: number; linkage?: 'ward' | 'complete' | 'average' | 'single'; // backend selection coming in future version })`

`$3`

`typescript // Silhouette Score: [-1, 1], higher is better silhouetteScore(X: Tensor2D | number[][], labels: number[]): Promise

// Davies-Bouldin Index: [0, ∞), lower is better daviesBouldin(X: Tensor2D | number[][], labels: number[]): Promise

// Calinski-Harabasz Index: [0, ∞), higher is better calinskiHarabasz(X: Tensor2D | number[][], labels: number[]): Promise`

`Examples`

Coming soon: Example notebooks and CodePen demos

`Performance`

Based on our benchmarks:

- K-Means: 0.5ms - 200ms depending on dataset size - Spectral: 10ms - 2s (includes eigendecomposition) - Agglomerative: 5ms - 500ms

See benchmarks/ for detailed performance data.

`Migration from scikit-learn`

`python

`scikit-learn`


from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=3)
labels = kmeans.fit_predict(X)

`typescript // clustering-js import { KMeans } from 'clustering-tfjs'; const kmeans = new KMeans({ nClusters: 3 }); const labels = await kmeans.fitPredict(X);`

`$3`

This library has been extensively tested for numerical parity with scikit-learn. Our test suite includes:

- Step-by-step comparisons with sklearn implementations - Identical results for standard datasets - Matching behavior for edge cases

See tools/sklearn_comparison/ for detailed comparison scripts and test/` for parity tests.

Contributing

See CONTRIBUTING.md for guidelines on contributing to this project.

License

MIT

---

Note: This library is under active development. APIs may change in future versions.

clustering-tfjs

Native TypeScript implementation of clustering algorithms powered by TensorFlow.js with full browser and Node.js support.

Features

1. Quick Start
2. Installation
3. Algorithms
4. Validation Metrics
5. Backend Selection
6. API Reference
7. Examples
8. Performance
9. Migration from scikit-learn
10. Contributing
11. License

Quick Start

$3

``bash

`For Node.js with acceleration`


npm install clustering-tfjs @tensorflow/tfjs-node
For Node.js with GPU support

npm install clustering-tfjs @tensorflow/tfjs-node-gpu
For browser usage (TensorFlow.js loaded separately)

npm install clustering-tfjs


> Note: For Windows users or if you encounter native binding issues, see our Windows Compatibility Guide.
$3
#### Browser

`html

#### Node.js

`typescript import { Clustering } from 'clustering-tfjs';

// Initialize (optional - auto-detects best backend) await Clustering.init();

`Installation`

`$3`

`bash

`Basic installation (pure JavaScript backend)`


npm install clustering-tfjs
Recommended: With native acceleration

npm install clustering-tfjs @tensorflow/tfjs-node
Optional: With GPU support

npm install clustering-tfjs @tensorflow/tfjs-node-gpu


$3
The browser bundle is available via CDN:

`html

Or install via npm and use with a bundler:

`bash npm install clustering-tfjs @tensorflow/tfjs`

`Algorithms`

`$3`

- Classic centroid-based clustering - Supports custom initialization methods - K-Means++ initialization by default

`$3`

- Graph-based clustering using eigendecomposition - Ideal for non-convex clusters - Supports custom affinity functions

`$3`

- Hierarchical bottom-up clustering - Multiple linkage criteria (ward, complete, average, single) - Memory efficient implementation

`Validation Metrics`

The library includes three validation metrics to evaluate clustering quality and optimize the number of clusters:

`$3`

Measures how similar an object is to its own cluster compared to other clusters. Range: [-1, 1], higher is better.

`$3`

Evaluates intra-cluster and inter-cluster distances. Range: [0, ∞), lower is better.

`$3`

Ratio of between-cluster to within-cluster dispersion. Range: [0, ∞), higher is better.

`$3`

The library includes a built-in findOptimalClusters function that automatically determines the optimal number of clusters:

`typescript import { findOptimalClusters } from 'clustering-tfjs';

// Find optimal k between 2 and 10 clusters const result = await findOptimalClusters(data, { minClusters: 2, maxClusters: 10, algorithm: 'kmeans' // or 'spectral', 'agglomerative' });

console.log(Optimal number of clusters: ${result.optimal.k}); console.log(Silhouette score: ${result.optimal.silhouette}); console.log(All evaluations:, result.evaluations);

`Platform Detection & Backend Selection`

The library automatically detects your environment and selects the best backend:

`typescript import { Clustering } from 'clustering-tfjs';

// Check current platform console.log('Platform:', Clustering.platform); // 'browser' or 'node'

// Check available features console.log('Features:', Clustering.features); // { // gpuAcceleration: true, // wasmSimd: false, // nodeBindings: true, // webgl: false // }

// Manually select backend await Clustering.init({ backend: 'webgl' }); // Browser await Clustering.init({ backend: 'tensorflow' }); // Node.js`

`$3`

The library automatically selects the best available backend if not specified.

`API Reference`

`$3`

All algorithms implement the same interface:

`typescript interface ClusteringAlgorithm { fit(X: Tensor2D | number[][]): Promise; predict(X: Tensor2D | number[][]): Promise; fitPredict(X: Tensor2D | number[][]): Promise; }`

`$3`

`typescript new KMeans({ nClusters: number; init?: 'k-means++' | 'random' | number[][]; nInit?: number; maxIter?: number; tol?: number; // backend selection coming in future version })`

`$3`

`typescript new SpectralClustering({ nClusters: number; affinity?: 'rbf' | 'nearest_neighbors'; gamma?: number; nNeighbors?: number; // backend selection coming in future version })`

`$3`

`typescript new AgglomerativeClustering({ nClusters: number; linkage?: 'ward' | 'complete' | 'average' | 'single'; // backend selection coming in future version })`

`$3`

`typescript // Silhouette Score: [-1, 1], higher is better silhouetteScore(X: Tensor2D | number[][], labels: number[]): Promise

// Davies-Bouldin Index: [0, ∞), lower is better daviesBouldin(X: Tensor2D | number[][], labels: number[]): Promise

// Calinski-Harabasz Index: [0, ∞), higher is better calinskiHarabasz(X: Tensor2D | number[][], labels: number[]): Promise`

`Examples`

Coming soon: Example notebooks and CodePen demos

`Performance`

Based on our benchmarks:

- K-Means: 0.5ms - 200ms depending on dataset size - Spectral: 10ms - 2s (includes eigendecomposition) - Agglomerative: 5ms - 500ms

See benchmarks/ for detailed performance data.

`Migration from scikit-learn`

`python

`scikit-learn`


from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=3)
labels = kmeans.fit_predict(X)

`typescript // clustering-js import { KMeans } from 'clustering-tfjs'; const kmeans = new KMeans({ nClusters: 3 }); const labels = await kmeans.fitPredict(X);`

`$3`

This library has been extensively tested for numerical parity with scikit-learn. Our test suite includes:

- Step-by-step comparisons with sklearn implementations - Identical results for standard datasets - Matching behavior for edge cases

See tools/sklearn_comparison/ for detailed comparison scripts and test/` for parity tests.

Contributing

See CONTRIBUTING.md for guidelines on contributing to this project.

License

MIT

---

Note: This library is under active development. APIs may change in future versions.

clustering-tfjs

clustering-tfjs

Features

Table of Contents

Quick Start

$3

For Node.js with acceleration

For Node.js with GPU support

For browser usage (TensorFlow.js loaded separately)

$3

Installation

$3

Basic installation (pure JavaScript backend)

Recommended: With native acceleration

Optional: With GPU support

$3

Algorithms

$3

$3

$3

Validation Metrics

$3

$3

$3

$3

Platform Detection & Backend Selection

$3

API Reference

$3

$3

$3

$3

$3

Examples

Performance

Migration from scikit-learn

scikit-learn

$3

Contributing

License

clustering-tfjs

clustering-tfjs

Features

Table of Contents

Quick Start

$3

For Node.js with acceleration

For Node.js with GPU support

For browser usage (TensorFlow.js loaded separately)

$3

Installation

$3

Basic installation (pure JavaScript backend)

Recommended: With native acceleration

Optional: With GPU support

$3

Algorithms

$3

$3

$3

Validation Metrics

$3

$3

$3

$3

Platform Detection & Backend Selection

$3

API Reference

$3

$3

$3

$3

$3

Examples

Performance

Migration from scikit-learn

scikit-learn

$3

Contributing

License

`For Node.js with acceleration`

`Installation`

`$3`

`Basic installation (pure JavaScript backend)`

`Algorithms`

`$3`

`$3`

`$3`

`Validation Metrics`

`$3`

`$3`

`$3`

`$3`

`Platform Detection & Backend Selection`

`$3`

`API Reference`

`$3`

`$3`

`$3`

`$3`

`$3`

`Examples`

`Performance`

`Migration from scikit-learn`

`scikit-learn`

`$3`

`For Node.js with acceleration`

`Installation`

`$3`

`Basic installation (pure JavaScript backend)`

`Algorithms`

`$3`

`$3`

`$3`

`Validation Metrics`

`$3`

`$3`

`$3`

`$3`

`Platform Detection & Backend Selection`

`$3`

`API Reference`

`$3`

`$3`

`$3`

`$3`

`$3`

`Examples`

`Performance`

`Migration from scikit-learn`

`scikit-learn`

`$3`