
shikken

shikken (執権): the regent for the shogun-toolbox in the R province.

Overview

shikken is a library for the R programming language that provides an interface to the shogun-toolbox, a C++ library for "large scale" machine learning.

Although the shogun-toolbox already provides two interfaces you can use from R, (i) r_static and (ii) r_modular, their APIs are more consistent with "the shogun API" than with "idiomatic R". These interfaces have other (minor) disadvantages as well. For instance, the r_static interface can only work with one "shogun machine" at a time, while the r_modular interface is currently in its alpha stage and relies on SWIG, which still has a few problems working with R.

Goals

The goal of the shikken library is to provide a full-featured interface to the shogun-toolbox that fits "more naturally" within R, thereby making the shogun-toolbox more accessible to non-SVM experts and the R community at large.

shikken is programmed against the core libshogun C++ library, so it will (over time) expose a large majority of shogun's functionality while attempting to sidestep some of the disadvantages of the r_static and r_modular interfaces.

The shikken package will also be a "self-contained" package, allowing it to be installed through the normal R channels (e.g., CRAN). Users will not be required to download and install the shogun-toolbox separately, although they can choose to do so and link this package against a custom shogun install.

Examples

Building a 2-class SVM for the iris dataset works like so:

  > library(shikken)
  > iris.sub <- subset(iris, Species != "virginica")
  > svm <- SVM(Species ~ ., data=iris.sub)
  > table(predict(svm), iris.sub$Species)
  
          setosa versicolor virginica
    -1         0         50         0
     1        50          0         0
  
(Note that the labels are converted to -1/1 internally and are shown as such in the left column. The empty virginica column appears because subset() keeps the unused factor level.)
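
The -1/1 encoding can be illustrated in plain R without shikken. This is only a sketch: the helper names `to.numeric.labels` and `to.factor.labels` are hypothetical and not part of the package, and the rule "first factor level becomes 1, second becomes -1" is an assumption inferred from the table above, not a documented guarantee.

```r
# Two-class subset of iris; droplevels() removes the unused "virginica" level
iris.sub <- droplevels(subset(iris, Species != "virginica"))
y <- iris.sub$Species

# Hypothetical helper. Assumed encoding (inferred from the table above):
# first factor level -> 1, second factor level -> -1
to.numeric.labels <- function(f) {
  stopifnot(is.factor(f), nlevels(f) == 2)
  ifelse(f == levels(f)[1], 1, -1)
}

# Hypothetical inverse: turn -1/1 values back into the original factor levels
to.factor.labels <- function(x, lvls) {
  factor(ifelse(x == 1, lvls[1], lvls[2]), levels = lvls)
}

y.num  <- to.numeric.labels(y)
y.back <- to.factor.labels(y.num, levels(y))

# Cross-tabulate the numeric encoding against the original labels
table(y.num, y)
```

Applying the same inverse mapping to the output of `predict()` would give a confusion table with class names in both margins, assuming predictions come back as -1/1 as shown above.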

Installation

This package bundles a recent development release of the shogun toolbox (19-May-2011), which lets it be installed as a "self-contained unit". You can also install your own version (>= 0.11) of the libshogun C++ library/interface and link this package against that. More information is available on the wiki pages.

Notes

This library is in its early stages of development (started ~ June 20, 2011), and isn't quite ready for public consumption just yet. I am currently focusing on exposing more of the "vanilla" SVM functionality (i.e., linear, Gaussian, polynomial, and custom kernels), specifically in the 2-class classification setting. I actually started this project because I wanted access to shogun's string kernel functionality, so that will be coming up next.

My C++ is rather rusty. I'm using Rcpp to help with the R <-> C bridge code. The quality of that code will evolve (for the better!) as I get my C++ shoes back. If you have any suggestions on how to improve the code, I'm all ears.

The design of this package has been influenced by the kernlab package, and I've sampled "rather liberally" from some of the code in that package as well. A big thanks goes to the authors of that library.

Download

Packaged downloads are not available yet.

You will soon be able to clone the project with Git by running:

$ git clone git://github.com/lianos/shikken