High-performance TypeScript clustering algorithms (K-Means, Spectral, Agglomerative) with TensorFlow.js acceleration and scikit-learn compatibility
npm install clustering-tfjs



Native TypeScript implementation of clustering algorithms powered by TensorFlow.js with full browser and Node.js support.
- ✅ Pure TypeScript/JavaScript (no Python required)
- ✅ Multiple clustering algorithms (K-Means, Spectral, Agglomerative)
- ✅ Powered by TensorFlow.js for performance
- ✅ Works in both Node.js and browsers
- ✅ Platform-optimized bundles (49KB for browser, 163KB for Node.js)
- ✅ TypeScript support with full type definitions
- ✅ GPU acceleration available (WebGL in browser, CUDA in Node.js)
- ✅ Automatic backend selection
- ✅ Extensively tested for parity with scikit-learn
1. Quick Start
2. Installation
3. Algorithms
4. Validation Metrics
5. Backend Selection
6. API Reference
7. Examples
8. Performance
9. Migration from scikit-learn
10. Contributing
11. License
## Quick Start

```bash
# For Node.js with acceleration
npm install clustering-tfjs @tensorflow/tfjs-node
```

> Note: For Windows users or if you encounter native binding issues, see our Windows Compatibility Guide.
#### Browser
```html
```

#### Node.js

```typescript
import { Clustering } from 'clustering-tfjs';

// Initialize (optional - auto-detects best backend)
await Clustering.init();

// Use algorithms
const kmeans = new Clustering.KMeans({ nClusters: 3 });
const data = [[1, 2], [1.5, 1.8], [5, 8], [8, 8], [1, 0.6], [9, 11]];
const labels = await kmeans.fitPredict(data);
console.log(labels); // [0, 0, 1, 1, 0, 2]
```

## Installation
### Node.js

```bash
# Basic installation (pure JavaScript backend)
npm install clustering-tfjs

# Recommended: With native acceleration
npm install clustering-tfjs @tensorflow/tfjs-node

# Optional: With GPU support
npm install clustering-tfjs @tensorflow/tfjs-node-gpu
```

### Browser
The browser bundle is available via CDN:
```html
```

Or install via npm and use with a bundler:

```bash
npm install clustering-tfjs @tensorflow/tfjs
```

## Algorithms
### K-Means
- Classic centroid-based clustering
- Supports custom initialization methods
- K-Means++ initialization by default
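
To make the points above concrete, here is a minimal plain-TypeScript sketch of k-means++ seeding followed by Lloyd iterations. It is an illustration only and not the library's TensorFlow.js implementation; `seededRandom`, `kmeansPlusPlus`, and `lloyd` are hypothetical helpers introduced for this example.

```typescript
type Point = number[];

// Small seeded PRNG so the sketch is reproducible (any uniform [0, 1) source works).
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function sqDist(a: Point, b: Point): number {
  let s = 0;
  for (let i = 0; i < a.length; i++) s += (a[i] - b[i]) ** 2;
  return s;
}

// k-means++ seeding: the first centroid is drawn uniformly; each further centroid
// is drawn with probability proportional to its squared distance to the nearest
// centroid chosen so far.
function kmeansPlusPlus(data: Point[], k: number, rand: () => number): Point[] {
  const centroids: Point[] = [data[Math.floor(rand() * data.length)]];
  while (centroids.length < k) {
    const d2 = data.map((p) => Math.min(...centroids.map((c) => sqDist(p, c))));
    const total = d2.reduce((s, v) => s + v, 0);
    let r = rand() * total;
    let idx = d2.length - 1;
    for (let i = 0; i < d2.length; i++) {
      r -= d2[i];
      if (r < 0) { idx = i; break; }
    }
    centroids.push(data[idx]);
  }
  return centroids.map((c) => [...c]);
}

// Lloyd iterations: assign each point to its nearest centroid, then move every
// centroid to the mean of its members, until the assignment stops changing.
function lloyd(data: Point[], init: Point[], maxIter = 100): number[] {
  let centroids = init.map((c) => [...c]);
  let labels: number[] = new Array<number>(data.length).fill(-1);
  for (let iter = 0; iter < maxIter; iter++) {
    const next = data.map((p) => {
      let best = 0;
      for (let j = 1; j < centroids.length; j++) {
        if (sqDist(p, centroids[j]) < sqDist(p, centroids[best])) best = j;
      }
      return best;
    });
    const changed = next.some((l, i) => l !== labels[i]);
    labels = next;
    if (!changed) break;
    centroids = centroids.map((c, j) => {
      const members = data.filter((_, i) => labels[i] === j);
      if (members.length === 0) return c; // keep the centroid of an empty cluster
      return c.map((_, d) => members.reduce((s, p) => s + p[d], 0) / members.length);
    });
  }
  return labels;
}

const samplePoints: Point[] = [[1, 2], [1.5, 1.8], [1, 0.6], [8, 8], [9, 11], [8.5, 9]];
const sampleLabels = lloyd(samplePoints, kmeansPlusPlus(samplePoints, 2, seededRandom(42)));
```

The first three points end up in one cluster and the last three in the other; which cluster receives label `0` depends on the seeding.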
### Spectral Clustering
- Graph-based clustering using eigendecomposition
- Ideal for non-convex clusters
- Supports custom affinity functions
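
As a rough illustration of the graph-construction step, the sketch below builds an RBF (Gaussian) affinity matrix and the symmetric normalized Laplacian whose eigenvectors a spectral method would then embed and cluster. `rbfAffinity` and `normalizedLaplacian` are hypothetical helpers written for this example, not library exports, and the eigendecomposition itself is omitted.

```typescript
// RBF (Gaussian) affinity: A[i][j] = exp(-gamma * ||x_i - x_j||^2).
function rbfAffinity(data: number[][], gamma: number): number[][] {
  return data.map((a) =>
    data.map((b) => {
      let d2 = 0;
      for (let i = 0; i < a.length; i++) d2 += (a[i] - b[i]) ** 2;
      return Math.exp(-gamma * d2);
    })
  );
}

// Symmetric normalized Laplacian: L = I - D^(-1/2) A D^(-1/2),
// where D is the diagonal matrix holding the row sums of A.
function normalizedLaplacian(affinity: number[][]): number[][] {
  const invSqrtDeg = affinity.map((row) => 1 / Math.sqrt(row.reduce((s, v) => s + v, 0)));
  return affinity.map((row, i) =>
    row.map((a, j) => (i === j ? 1 : 0) - invSqrtDeg[i] * a * invSqrtDeg[j])
  );
}

// Two nearby points and one distant point: the distant point is almost disconnected.
const toyAffinity = rbfAffinity([[0, 0], [0, 1], [5, 5]], 0.5);
const toyLaplacian = normalizedLaplacian(toyAffinity);
```

A larger `gamma` makes the affinity fall off faster with distance, so the graph keeps only very local connections.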
### Agglomerative Clustering
- Hierarchical bottom-up clustering
- Multiple linkage criteria (ward, complete, average, single)
- Memory-efficient implementation
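
The linkage criteria differ only in how the dissimilarity between two clusters is measured; bottom-up clustering then repeatedly merges the pair with the smallest value. The sketch below shows the textbook definitions in plain TypeScript. `clusterDistance` is a hypothetical helper, and the Ward value is the raw increase in within-cluster sum of squares, which may be scaled differently from what a given library reports.

```typescript
type Linkage = 'ward' | 'complete' | 'average' | 'single';

function euclidean(a: number[], b: number[]): number {
  let s = 0;
  for (let i = 0; i < a.length; i++) s += (a[i] - b[i]) ** 2;
  return Math.sqrt(s);
}

function centroid(points: number[][]): number[] {
  return points[0].map((_, d) => points.reduce((s, p) => s + p[d], 0) / points.length);
}

// Dissimilarity between two clusters under each linkage criterion.
function clusterDistance(a: number[][], b: number[][], linkage: Linkage): number {
  if (linkage === 'ward') {
    // Ward: increase in total within-cluster sum of squares if a and b were merged.
    const gap = euclidean(centroid(a), centroid(b));
    return ((a.length * b.length) / (a.length + b.length)) * gap * gap;
  }
  const pairwise: number[] = [];
  for (const p of a) for (const q of b) pairwise.push(euclidean(p, q));
  if (linkage === 'single') return Math.min(...pairwise); // closest pair
  if (linkage === 'complete') return Math.max(...pairwise); // farthest pair
  return pairwise.reduce((s, v) => s + v, 0) / pairwise.length; // mean over all pairs
}

const clusterA = [[0, 0], [0, 2]];
const clusterB = [[3, 0], [5, 0]];
const singleLink = clusterDistance(clusterA, clusterB, 'single');
const completeLink = clusterDistance(clusterA, clusterB, 'complete');
const averageLink = clusterDistance(clusterA, clusterB, 'average');
const wardCost = clusterDistance(clusterA, clusterB, 'ward');
```

Single linkage tends to chain elongated clusters together, while complete and Ward linkage favor compact ones.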
## Validation Metrics
The library includes three validation metrics to evaluate clustering quality and optimize the number of clusters:
### Silhouette Score
Measures how similar an object is to its own cluster compared to other clusters. Range: [-1, 1], higher is better.
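
For reference, the standard definition can be written out directly. For each sample, `a` is the mean distance to the other members of its own cluster and `b` is the smallest mean distance to any other cluster; the coefficient is `(b - a) / max(a, b)`. The function below is a plain-TypeScript sketch named `silhouetteSketch` to keep it distinct from the library's own `silhouetteScore`; it assumes Euclidean distance and at least two clusters.

```typescript
function pointDistance(a: number[], b: number[]): number {
  let s = 0;
  for (let i = 0; i < a.length; i++) s += (a[i] - b[i]) ** 2;
  return Math.sqrt(s);
}

// Mean silhouette coefficient over all samples.
function silhouetteSketch(data: number[][], labels: number[]): number {
  const clusters = Array.from(new Set(labels));
  let total = 0;
  data.forEach((x, i) => {
    // Mean distance from x to the members of one cluster, excluding x itself.
    const meanDistTo = (cluster: number): number => {
      let sum = 0;
      let count = 0;
      data.forEach((y, j) => {
        if (j !== i && labels[j] === cluster) {
          sum += pointDistance(x, y);
          count++;
        }
      });
      return count === 0 ? 0 : sum / count;
    };
    const a = meanDistTo(labels[i]); // cohesion
    const others = clusters.filter((c) => c !== labels[i]);
    const b = Math.min(...others.map((c) => meanDistTo(c))); // separation
    const size = labels.filter((l) => l === labels[i]).length;
    // Singleton clusters (and degenerate zero distances) contribute 0 by convention.
    total += size === 1 || Math.max(a, b) === 0 ? 0 : (b - a) / Math.max(a, b);
  });
  return total / data.length;
}

const silhouetteExample = silhouetteSketch([[0], [1], [10], [11]], [0, 0, 1, 1]);
```

For the two tight, well-separated pairs in the example, the score comes out close to 0.9.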
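### Davies-Bouldin Index
Average similarity between each cluster and its most similar neighbor, where similarity compares within-cluster scatter to the distance between centroids. Range: [0, ∞), lower is better.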
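
A minimal sketch of the usual definition: for each cluster, take the worst ratio `(s_i + s_j) / d(c_i, c_j)` against any other cluster, where `s` is the mean distance of members to their centroid and `d` is the distance between centroids, then average over clusters. `daviesBouldinSketch` is a hypothetical standalone function, not the library's `daviesBouldin`, and it assumes Euclidean distance.

```typescript
function dist(a: number[], b: number[]): number {
  let s = 0;
  for (let i = 0; i < a.length; i++) s += (a[i] - b[i]) ** 2;
  return Math.sqrt(s);
}

function daviesBouldinSketch(data: number[][], labels: number[]): number {
  const clusters = Array.from(new Set(labels));
  const groups = clusters.map((c) => data.filter((_, i) => labels[i] === c));
  const centroids = groups.map((members) =>
    members[0].map((_, d) => members.reduce((s, p) => s + p[d], 0) / members.length)
  );
  // Scatter: mean distance from each member to its cluster centroid.
  const scatter = groups.map(
    (members, q) => members.reduce((s, p) => s + dist(p, centroids[q]), 0) / members.length
  );
  let total = 0;
  for (let i = 0; i < clusters.length; i++) {
    let worst = 0;
    for (let j = 0; j < clusters.length; j++) {
      if (j === i) continue;
      worst = Math.max(worst, (scatter[i] + scatter[j]) / dist(centroids[i], centroids[j]));
    }
    total += worst;
  }
  return total / clusters.length;
}

const daviesBouldinExample = daviesBouldinSketch([[0], [1], [10], [11]], [0, 0, 1, 1]);
```

Here each cluster has scatter 0.5 and the centroids are 10 apart, so the index is (0.5 + 0.5) / 10 = 0.1.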
### Calinski-Harabasz Index
Ratio of between-cluster to within-cluster dispersion. Range: [0, ∞), higher is better.
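
Written out, the ratio is `(B / (k - 1)) / (W / (n - k))`, where `B` is the size-weighted squared distance of cluster centroids from the overall mean, `W` is the summed squared distance of samples from their own centroid, `k` is the number of clusters, and `n` is the number of samples. The sketch below follows that formula; `calinskiHarabaszSketch` is a hypothetical standalone function rather than the library's `calinskiHarabasz`.

```typescript
function calinskiHarabaszSketch(data: number[][], labels: number[]): number {
  const mean = (pts: number[][]): number[] =>
    pts[0].map((_, d) => pts.reduce((s, p) => s + p[d], 0) / pts.length);
  const sq = (a: number[], b: number[]): number =>
    a.reduce((s, v, i) => s + (v - b[i]) ** 2, 0);

  const overall = mean(data);
  const clusters = Array.from(new Set(labels));
  let between = 0; // B: between-cluster dispersion
  let within = 0; // W: within-cluster dispersion
  for (const c of clusters) {
    const members = data.filter((_, i) => labels[i] === c);
    const center = mean(members);
    between += members.length * sq(center, overall);
    within += members.reduce((s, p) => s + sq(p, center), 0);
  }
  const n = data.length;
  const k = clusters.length;
  return between / (k - 1) / (within / (n - k));
}

const calinskiHarabaszExample = calinskiHarabaszSketch([[0], [1], [10], [11]], [0, 0, 1, 1]);
```

For the example, B = 100 and W = 1 with k = 2 and n = 4, giving (100 / 1) / (1 / 2) = 200.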
### Finding the Optimal Number of Clusters

The library includes a built-in `findOptimalClusters` function that automatically determines the optimal number of clusters:

```typescript
import { findOptimalClusters } from 'clustering-tfjs';

// Find optimal k between 2 and 10 clusters
const result = await findOptimalClusters(data, {
  minClusters: 2,
  maxClusters: 10,
  algorithm: 'kmeans' // or 'spectral', 'agglomerative'
});

console.log(`Optimal number of clusters: ${result.optimal.k}`);
console.log(`Silhouette score: ${result.optimal.silhouette}`);
console.log(`All evaluations:`, result.evaluations);

// Advanced usage with custom scoring
const customResult = await findOptimalClusters(data, {
  maxClusters: 8,
  algorithm: 'spectral',
  algorithmParams: { affinity: 'nearest_neighbors' },
  metrics: ['silhouette', 'calinskiHarabasz'], // Skip Davies-Bouldin
  scoringFunction: (evaluation) => evaluation.silhouette * 2 + evaluation.calinskiHarabasz
});
```

## Platform Detection & Backend Selection
The library automatically detects your environment and selects the best backend:
```typescript
import { Clustering } from 'clustering-tfjs';

// Check current platform
console.log('Platform:', Clustering.platform); // 'browser' or 'node'

// Check available features
console.log('Features:', Clustering.features);
// {
//   gpuAcceleration: true,
//   wasmSimd: false,
//   nodeBindings: true,
//   webgl: false
// }

// Manually select backend
await Clustering.init({ backend: 'webgl' }); // Browser
await Clustering.init({ backend: 'tensorflow' }); // Node.js
```
| Backend | Environment | Use Case | Performance |
|---------|------------|----------|-------------|
| `cpu` | Both | Pure JS fallback | Baseline |
| `webgl` | Browser | GPU acceleration | 5-10x faster |
| `wasm` | Browser | CPU optimization | 2-3x faster |
| `tensorflow` | Node.js | Native bindings | 10-20x faster |

The library automatically selects the best available backend if not specified.
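
The README does not spell out the selection rules, so the following is only a sketch of how a "fastest available first" policy could be expressed. The priority order is an assumption taken from the Performance column of the table, `pickBackend` is a hypothetical function, and the `Features` shape simply mirrors the object printed by `Clustering.features` in the example above.

```typescript
type Platform = 'browser' | 'node';
type Backend = 'cpu' | 'webgl' | 'wasm' | 'tensorflow';

// Mirrors the feature flags shown in the Clustering.features example.
interface Features {
  gpuAcceleration: boolean;
  wasmSimd: boolean;
  nodeBindings: boolean;
  webgl: boolean;
}

// Assumed priority: fastest available backend first, pure-JS `cpu` as the fallback.
function pickBackend(platform: Platform, features: Features): Backend {
  if (platform === 'node') return features.nodeBindings ? 'tensorflow' : 'cpu';
  if (features.webgl) return 'webgl';
  if (features.wasmSimd) return 'wasm'; // assumption: prefer wasm only when SIMD is present
  return 'cpu';
}

const nodeChoice = pickBackend('node', {
  gpuAcceleration: true, wasmSimd: false, nodeBindings: true, webgl: false,
});
const browserChoice = pickBackend('browser', {
  gpuAcceleration: false, wasmSimd: true, nodeBindings: false, webgl: false,
});
```

Passing an explicit `backend` to `Clustering.init` remains the way to override whatever the automatic choice is.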
## API Reference
All algorithms implement the same interface:
```typescript
interface ClusteringAlgorithm {
  fit(X: Tensor2D | number[][]): Promise<void>;
  predict(X: Tensor2D | number[][]): Promise<number[]>;
  fitPredict(X: Tensor2D | number[][]): Promise<number[]>;
}
```

### KMeans
```typescript
new KMeans({
  nClusters: number;
  init?: 'k-means++' | 'random' | number[][];
  nInit?: number;
  maxIter?: number;
  tol?: number;
  // backend selection coming in future version
})
```

### SpectralClustering
```typescript
new SpectralClustering({
  nClusters: number;
  affinity?: 'rbf' | 'nearest_neighbors';
  gamma?: number;
  nNeighbors?: number;
  // backend selection coming in future version
})
```

### AgglomerativeClustering
```typescript
new AgglomerativeClustering({
  nClusters: number;
  linkage?: 'ward' | 'complete' | 'average' | 'single';
  // backend selection coming in future version
})
```

### Validation Metrics
```typescript
// Silhouette Score: [-1, 1], higher is better
silhouetteScore(X: Tensor2D | number[][], labels: number[]): Promise<number>

// Davies-Bouldin Index: [0, ∞), lower is better
daviesBouldin(X: Tensor2D | number[][], labels: number[]): Promise<number>

// Calinski-Harabasz Index: [0, ∞), higher is better
calinskiHarabasz(X: Tensor2D | number[][], labels: number[]): Promise<number>
```

## Examples
Coming soon: Example notebooks and CodePen demos
## Performance
Based on our benchmarks:
- K-Means: 0.5ms - 200ms depending on dataset size
- Spectral: 10ms - 2s (includes eigendecomposition)
- Agglomerative: 5ms - 500ms
See `benchmarks/` for detailed performance data.
## Migration from scikit-learn

```python
# scikit-learn
from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=3)
labels = kmeans.fit_predict(X)
```

```typescript
// clustering-tfjs
import { KMeans } from 'clustering-tfjs';
const kmeans = new KMeans({ nClusters: 3 });
const labels = await kmeans.fitPredict(X);
```
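
The comparison shows the naming convention at work: scikit-learn's `n_clusters` becomes `nClusters`. When porting a configuration, that renaming can be applied mechanically. The helper below is a hypothetical convenience, not something the library exports, and it assumes every option you pass follows the same snake_case to camelCase rule and is actually supported by the target class.

```typescript
// Convert a snake_case option name (scikit-learn style) to camelCase.
function toCamelCase(name: string): string {
  return name.replace(/_([a-z0-9])/g, (_match: string, ch: string) => ch.toUpperCase());
}

// Rename every key of a parameter object; values are passed through unchanged.
function portParams(params: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  Object.keys(params).forEach((key) => {
    out[toCamelCase(key)] = params[key];
  });
  return out;
}

const ported = portParams({ n_clusters: 3, max_iter: 300, n_init: 10 });
// ported is { nClusters: 3, maxIter: 300, nInit: 10 }
```

String values such as `'k-means++'` or `'nearest_neighbors'` are left untouched, since only keys are renamed.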
This library has been extensively tested for numerical parity with scikit-learn. Our test suite includes:
- Step-by-step comparisons with sklearn implementations
- Identical results for standard datasets
- Matching behavior for edge cases
See `tools/sklearn_comparison/` for detailed comparison scripts and `test/` for parity tests.

## Contributing

See CONTRIBUTING.md for guidelines on contributing to this project.

## License

MIT
---
Note: This library is under active development. APIs may change in future versions.