# node-svm

Support Vector Machine (SVM) library for Node.js.

![NPM](https://nodei.co/npm/node-svm/)
![Build Status](https://travis-ci.org/nicolaspanel/node-svm) ![Coverage Status](https://coveralls.io/r/nicolaspanel/node-svm?branch=master)

# Support Vector Machines

From Wikipedia:

>Support vector machines are supervised learning models that analyze data and recognize patterns.
>A special property is that they simultaneously minimize the empirical classification error and maximize the geometric margin; hence they are also known as maximum margin classifiers.
>![Wikipedia image](http://en.wikipedia.org/wiki/File:Kernel_Machine.png)

# Installation

`npm install --save node-svm`

# Quick start

If you are not familiar with SVMs, I highly recommend this guide.

Here's an example of using node-svm to approximate the XOR function:

```javascript
var svm = require('node-svm');

var xor = [
    [[0, 0], 0],
    [[0, 1], 1],
    [[1, 0], 1],
    [[1, 1], 0]
];

// initialize a new predictor
var clf = new svm.CSVC();

clf.train(xor).done(function () {
    // predict things
    xor.forEach(function(ex){
        var prediction = clf.predictSync(ex[0]);
        console.log('%d XOR %d => %d', ex[0][0], ex[0][1], prediction);
    });
});

/* CONSOLE
0 XOR 0 => 0
0 XOR 1 => 1
1 XOR 0 => 1
1 XOR 1 => 0
*/
```

More examples are available here.

__Note__: There's no reason to use an SVM to figure out XOR, by the way.

# API

## Classifiers

Possible classifiers are:

| Classifier  | Type                   | Params         | Initialization                |
|-------------|------------------------|----------------|-------------------------------|
| C_SVC       | multi-class classifier | `c`            | `= new svm.CSVC(opts)`        |
| NU_SVC      | multi-class classifier | `nu`           | `= new svm.NuSVC(opts)`       |
| ONE_CLASS   | one-class classifier   | `nu`           | `= new svm.OneClassSVM(opts)` |
| EPSILON_SVR | regression             | `c`, `epsilon` | `= new svm.EpsilonSVR(opts)`  |
| NU_SVR      | regression             | `c`, `nu`      | `= new svm.NuSVR(opts)`       |

## Kernels

Possible kernels are:

| Kernel  | Parameters               |
|---------|--------------------------|
| LINEAR  | No parameter             |
| POLY    | `degree`, `gamma`, `r`   |
| RBF     | `gamma`                  |
| SIGMOID | `gamma`, `r`             |
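For intuition, the kernels listed above can be written as plain functions. This is a minimal sketch that assumes the standard libsvm kernel definitions; the function names are illustrative only and are not part of the node-svm API, which evaluates kernels inside the native library.

```javascript
// Illustrative kernel functions (not part of the node-svm API).
// u and v are 1d arrays of numbers with the same length.
function dot(u, v) {
    return u.reduce(function (sum, ui, i) { return sum + ui * v[i]; }, 0);
}

// LINEAR: u'v
function linear(u, v) {
    return dot(u, v);
}

// POLY: (gamma * u'v + r)^degree
function poly(u, v, degree, gamma, r) {
    return Math.pow(gamma * dot(u, v) + r, degree);
}

// RBF: exp(-gamma * |u - v|^2)
function rbf(u, v, gamma) {
    var squaredDistance = u.reduce(function (sum, ui, i) {
        var d = ui - v[i];
        return sum + d * d;
    }, 0);
    return Math.exp(-gamma * squaredDistance);
}

// SIGMOID: tanh(gamma * u'v + r)
function sigmoid(u, v, gamma, r) {
    return Math.tanh(gamma * dot(u, v) + r);
}

console.log(linear([1, 2], [3, 4]));   // 11
console.log(rbf([0, 1], [0, 1], 0.5)); // 1 (identical points)
```

The parameter names mirror the table: `degree`, `gamma` and `r` only matter for the kernels that list them.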

## Parameters and options

Possible parameters/options are:

| Name               | Default value(s)       | Description |
|--------------------|------------------------|-------------|
| `svmType`          | `C_SVC`                | Used classifier |
| `kernelType`       | `RBF`                  | Used kernel |
| `c`                | `[0.01,0.125,0.5,1,2]` | Cost for `C_SVC`, `EPSILON_SVR` and `NU_SVR`. Can be a `Number` or an `Array` of numbers |
| `nu`               | `[0.01,0.125,0.5,1]`   | For `NU_SVC`, `ONE_CLASS` and `NU_SVR`. Can be a `Number` or an `Array` of numbers |
| `epsilon`          | `[0.01,0.125,0.5,1]`   | For `EPSILON_SVR`. Can be a `Number` or an `Array` of numbers |
| `degree`           | `[2,3,4]`              | For `POLY` kernel. Can be a `Number` or an `Array` of numbers |
| `gamma`            | `[0.001,0.01,0.5]`     | For `POLY`, `RBF` and `SIGMOID` kernels. Can be a `Number` or an `Array` of numbers |
| `r`                | `[0.125,0.5,0,1]`      | For `POLY` and `SIGMOID` kernels. Can be a `Number` or an `Array` of numbers |
| `kFold`            | `4`                    | `k` parameter for k-fold cross validation. `k` must be >= 1. If `k === 1`, the entire dataset is used for both testing and training. |
| `normalize`        | `true`                 | Whether to use mean normalization during data pre-processing |
| `reduce`           | `true`                 | Whether to use PCA to reduce the dataset's dimensions during data pre-processing |
| `retainedVariance` | `0.99`                 | Define the acceptable impact on data integrity (requires `reduce` to be `true`) |
| `eps`              | `1e-3`                 | Tolerance of termination criterion |
| `cacheSize`        | `200`                  | Cache size in MB |
| `shrinking`        | `true`                 | Whether to use the shrinking heuristics |
| `probability`      | `false`                | Whether to train an SVC or SVR model for probability estimates |

The example below shows how to use them:

```javascript
var svm = require('node-svm');

var clf = new svm.SVM({
    svmType: 'C_SVC',
    c: [0.03125, 0.125, 0.5, 2, 8],

    // kernels parameters
    kernelType: 'RBF',
    gamma: [0.03125, 0.125, 0.5, 2, 8],

    // training options
    kFold: 4,
    normalize: true,
    reduce: true,
    retainedVariance: 0.99,
    eps: 1e-3,
    cacheSize: 200,
    shrinking : true,
    probability : false
});
```

__Notes__ :
 * You can override default values by creating a `.nodesvmrc` file (JSON) at the root of your project.
 * If at least one parameter has multiple values, node-svm will go through all possible combinations to see which one gives the best results (it performs grid-search to maximize f-score for classification and minimize Mean Squared Error for regression).
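To make "all possible combinations" concrete, the sketch below expands array-valued parameters into a grid. It only illustrates the idea: the `combinations` helper is hypothetical, and node-svm's internal grid-search code may be organized differently.

```javascript
// Illustrative helper (not part of the node-svm API): expand every
// array-valued parameter into the full list of combinations.
function combinations(params) {
    return Object.keys(params).reduce(function (combos, name) {
        var values = [].concat(params[name]); // a single value becomes [value]
        var expanded = [];
        combos.forEach(function (combo) {
            values.forEach(function (value) {
                var next = {};
                Object.keys(combo).forEach(function (k) { next[k] = combo[k]; });
                next[name] = value;
                expanded.push(next);
            });
        });
        return expanded;
    }, [{}]);
}

var grid = combinations({
    kernelType: 'RBF',
    c: [0.03125, 0.125, 0.5, 2, 8],
    gamma: [0.03125, 0.125, 0.5, 2, 8]
});

console.log(grid.length); // 25
console.log(grid[0]);     // { kernelType: 'RBF', c: 0.03125, gamma: 0.03125 }
```

With five values each for `c` and `gamma`, 25 candidate models are trained and compared, so the grid grows quickly as more parameters receive multiple values.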

## Training

SVMs can be trained using the `svm#train(dataset)` method.

Pseudo code:
```javascript
var clf = new svm.SVM(options);

clf
.train(dataset)
.progress(function(rate){
    // ...
})
.spread(function(trainedModel, trainingReport){
    // ...
});
```

__Notes__ :
 * `trainedModel` can be used to restore the predictor later (see this example for more information).
 * `trainingReport` contains information about the predictor's accuracy (such as MSE, precision, recall, fscore, retained variance etc.)
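For readers unfamiliar with these metrics, here is a minimal sketch of how precision, recall, f-score and MSE are conventionally defined. The function names are hypothetical and the exact layout of node-svm's training report is not shown here; this only illustrates the standard formulas.

```javascript
// Illustrative metrics (not part of the node-svm API).
// expected and predicted are arrays of labels with the same length.
function classMetrics(expected, predicted, label) {
    var tp = 0, fp = 0, fn = 0;
    expected.forEach(function (actual, i) {
        if (predicted[i] === label && actual === label) { tp += 1; }
        else if (predicted[i] === label) { fp += 1; }
        else if (actual === label) { fn += 1; }
    });
    var precision = tp + fp === 0 ? 0 : tp / (tp + fp);
    var recall = tp + fn === 0 ? 0 : tp / (tp + fn);
    var fscore = precision + recall === 0
        ? 0
        : 2 * precision * recall / (precision + recall);
    return { precision: precision, recall: recall, fscore: fscore };
}

// Mean squared error, the usual metric for regression.
function meanSquaredError(expected, predicted) {
    var total = expected.reduce(function (sum, y, i) {
        var d = y - predicted[i];
        return sum + d * d;
    }, 0);
    return total / expected.length;
}

var metrics = classMetrics([1, 1, 0, 0], [1, 0, 0, 0], 1);
console.log(metrics.precision, metrics.recall); // 1 0.5
console.log(meanSquaredError([0, 0], [1, 1]));  // 1
```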

## Prediction

Once trained, you can use the classifier object to predict values for new inputs. You can do so:
 * Synchronously using `clf#predictSync(inputs)`
 * Asynchronously using `clf#predict(inputs).then(function(predicted){ ... });`

If you enabled probabilities during initialization you can also predict probabilities for each class:
 * Synchronously using `clf#predictProbabilitiesSync(inputs)`.
 * Asynchronously using `clf#predictProbabilities(inputs).then(function(probabilities){ ... })`.

__Note__ : `inputs` must be a 1d array of numbers.

## Model evaluation

Once the predictor is trained it can be evaluated against a test set.

Pseudo code:
```javascript
var svm = require('node-svm');
var clf = new svm.SVM(options);

svm.read(trainFile)
.then(function(dataset){
    return clf.train(dataset);
})
.then(function(trainedModel, trainingReport){
    return svm.read(testFile);
})
.then(function(testset){
    return clf.evaluate(testset);
})
.done(function(report){
    console.log(report);
});
```
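Conceptually, evaluation means running the predictor over every example of the test set and comparing the outputs with the expected labels. The sketch below shows that loop for plain accuracy. It is not the node-svm implementation: `accuracy` and `xorPredictor` are hypothetical names, and the stand-in predictor exists only to keep the example runnable without a trained model.

```javascript
// Illustrative sketch (not the node-svm implementation).
// testset uses the same [inputs, label] format as the training data;
// predict is any function mapping a 1d array of numbers to a label.
function accuracy(testset, predict) {
    var correct = testset.filter(function (example) {
        return predict(example[0]) === example[1];
    }).length;
    return correct / testset.length;
}

// Stand-in predictor that hard-codes XOR.
function xorPredictor(inputs) {
    return inputs[0] !== inputs[1] ? 1 : 0;
}

var testset = [
    [[0, 0], 0],
    [[0, 1], 1],
    [[1, 0], 1],
    [[1, 1], 1] // deliberately mislabelled example
];

console.log(accuracy(testset, xorPredictor)); // 0.75
```

With a trained classifier, the same helper could be called as `accuracy(testset, function (inputs) { return clf.predictSync(inputs); })`.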

# CLI

node-svm comes with a built-in command-line interface.

To use it, install node-svm globally with `npm install -g node-svm`.

See `$ node-svm -h` for the complete command-line reference.

## help

```shell
$ node-svm help []
```

Display help information about node-svm

## train

```shell
$ node-svm train  [] []
```

Train a new model with the given data set.

__Note__: use `$ node-svm train -i` to set parameter values dynamically.

## evaluate

```shell
$ node-svm evaluate   []
```

Evaluate a model's accuracy against a test set.

# How it works

node-svm uses the official libsvm C++ library, version 3.20.

For more information see also:
 * libsvm web site
 * Chih-Chung Chang and Chih-Jen Lin, LIBSVM : a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1--27:27, 2011.
 * Wikipedia article about SVM
 * node addons

# Contributions

Feel free to fork and improve/enhance `node-svm` in any way you want.

If you feel that the community will benefit from your changes, please send a pull request:
 * Fork the project.
 * Make your feature addition or bug fix.
 * Add documentation if necessary.
 * Add tests for it. This is important so I don't break it in a future version unintentionally (run `grunt` or `npm test`).
 * Send a pull request to the `develop` branch.

# FAQ
### Segmentation fault
Q : Node returns 'segmentation fault' error during training. What's going on?

A1 : Your dataset is empty or its format is incorrect.

A2 : Your dataset is too big.
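For the first cause, a quick sanity check before calling `train` can rule out an empty or malformed dataset. The sketch below assumes the `[inputs, label]` layout used in the quick start, numeric labels, and the same number of inputs in every example; `checkDataset` is a hypothetical helper, not something node-svm provides.

```javascript
// Illustrative check (not part of the node-svm API).
// Returns a list of problems; an empty list means the dataset looks well formed.
function checkDataset(dataset) {
    if (!Array.isArray(dataset) || dataset.length === 0) {
        return ['dataset must be a non-empty array'];
    }
    var problems = [];
    var width = null;
    dataset.forEach(function (example, i) {
        var inputs = Array.isArray(example) ? example[0] : null;
        if (!Array.isArray(example) || example.length !== 2 || !Array.isArray(inputs)) {
            problems.push('example ' + i + ' must look like [inputs, label]');
            return;
        }
        var numeric = inputs.every(function (x) {
            return typeof x === 'number' && isFinite(x);
        });
        if (inputs.length === 0 || !numeric) {
            problems.push('example ' + i + ' has non-numeric or empty inputs');
        }
        if (typeof example[1] !== 'number' || !isFinite(example[1])) {
            problems.push('example ' + i + ' has a non-numeric label');
        }
        if (width === null) {
            width = inputs.length;
        } else if (inputs.length !== width) {
            problems.push('example ' + i + ' has ' + inputs.length +
                ' inputs, expected ' + width);
        }
    });
    return problems;
}

console.log(checkDataset([[[0, 0], 0], [[0, 1], 1]])); // []
console.log(checkDataset([]));                         // [ 'dataset must be a non-empty array' ]
console.log(checkDataset([[[0, 0], 0], [[1], 1]]));    // [ 'example 1 has 1 inputs, expected 2' ]
```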

### Difference between nu-SVC and C-SVC
Q : What is the difference between nu-SVC and C-SVC?

A : Answer here

### Other questions
* Take a look at libsvm's FAQ.
* Create an issue

# License

MIT
