This .NET library provides kernels for the kernel methods used in machine learning.
The abstract base class Kernel<T> expresses the definition of a kernel, where x1 and x2 are of type T and φ is some function, possibly unknown, which maps items of type T to a vector of finite or infinite dimensions:

$$k(x_1, x_2) = \langle \phi(x_1), \phi(x_2) \rangle$$
For a kernel implementation to be valid, meaning that there exists some function φ that expresses the kernel in the above dot product form, it has to be positive definite, i.e. it must satisfy Mercer's condition. The included implementations of Kernel<T> and their supported combinations all comply with this condition. For any new derivation of Kernel<T>, it is the developer's responsibility to ensure that the condition holds.
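As an illustration of the dot product form, the following minimal sketch checks a kernel against an explicit φ. It assumes a homogeneous quadratic kernel on 2-D vectors, which is not among the included kernels. SketchKernel<T>, QuadraticSketchKernel, Phi and Dot are hypothetical names introduced here; only the name Compute comes from the library, and its signature below is an assumption.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for the abstract base class; the real
// Kernel<T> signature may differ.
public abstract class SketchKernel<T>
{
    public abstract double Compute(T x1, T x2);
}

// Homogeneous quadratic kernel on 2-D vectors: k(x, y) = (x . y)^2.
public sealed class QuadraticSketchKernel : SketchKernel<double[]>
{
    public override double Compute(double[] x1, double[] x2)
    {
        double dot = x1[0] * x2[0] + x1[1] * x2[1];
        return dot * dot;
    }

    // Explicit feature map of this kernel:
    // phi(x) = (x0^2, sqrt(2) * x0 * x1, x1^2).
    public static double[] Phi(double[] x) =>
        new[] { x[0] * x[0], Math.Sqrt(2.0) * x[0] * x[1], x[1] * x[1] };

    // Plain dot product of two equally sized vectors.
    public static double Dot(double[] a, double[] b) =>
        a.Zip(b, (p, q) => p * q).Sum();
}

public static class QuadraticDemo
{
    public static void Main()
    {
        var kernel = new QuadraticSketchKernel();
        var x = new[] { 1.0, 2.0 };
        var y = new[] { 3.0, -1.0 };

        double viaKernel = kernel.Compute(x, y);
        double viaPhi = QuadraticSketchKernel.Dot(
            QuadraticSketchKernel.Phi(x), QuadraticSketchKernel.Phi(y));

        // Both routes should agree up to floating point rounding.
        Console.WriteLine(Math.Abs(viaKernel - viaPhi) < 1e-9);
    }
}
```

Here φ is known and finite-dimensional, so the identity can be checked directly. For kernels such as the Gaussian one, φ is infinite-dimensional and validity has to be argued from positive-definiteness instead.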
Kernel<T> implementations offer the above expression via the Compute method. Implementations also offer the ComputeSum method, which calculates weighted sums of kernel evaluations:

$$\sum_i c_i \, k(x_i, x)$$
The above 'representer form' is what kernel methods typically arrive at after training. In theory, it could be computed by repeatedly calling the Compute method and then scaling and summing the results, but for many kernels computing the 'representer form' directly saves considerable work. The ci and xi components of the formula are registered in the kernel using the AddComponent method. These components are the training state of the kernel method that uses the kernel. To save and load this state, use standard .NET serialization on the kernel.
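To make the savings concrete, here is a minimal sketch for a linear kernel. It is not the library's implementation. The member names Compute, ComputeSum and AddComponent are taken from the text above, but their signatures, the class LinearSketchKernel and the method ComputeSumNaive are assumptions made for illustration. For a linear kernel the weighted sum collapses to a single dot product with w = Σ ci xi, so the direct form costs O(d) instead of O(m·d) for m components of dimension d.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of a linear kernel holding its own components.
public sealed class LinearSketchKernel
{
    private readonly List<(double Weight, double[] Item)> components =
        new List<(double Weight, double[] Item)>();

    // Running value of w = sum_i c_i * x_i, updated on every AddComponent.
    private double[] weightedSum;

    public double Compute(double[] x1, double[] x2)
    {
        double sum = 0.0;
        for (int i = 0; i < x1.Length; i++) sum += x1[i] * x2[i];
        return sum;
    }

    public void AddComponent(double weight, double[] item)
    {
        components.Add((weight, item));
        if (weightedSum == null) weightedSum = new double[item.Length];
        for (int i = 0; i < item.Length; i++) weightedSum[i] += weight * item[i];
    }

    // Naive route: one kernel evaluation per registered component.
    public double ComputeSumNaive(double[] x)
    {
        double sum = 0.0;
        foreach (var (weight, item) in components) sum += weight * Compute(item, x);
        return sum;
    }

    // Direct route: a single dot product against the precomputed w.
    public double ComputeSum(double[] x) =>
        weightedSum == null ? 0.0 : Compute(weightedSum, x);
}

public static class RepresenterDemo
{
    public static void Main()
    {
        var kernel = new LinearSketchKernel();
        kernel.AddComponent(0.5, new[] { 1.0, 2.0 });
        kernel.AddComponent(-1.0, new[] { 3.0, 0.0 });

        var x = new[] { 2.0, 1.0 };
        Console.WriteLine(kernel.ComputeSumNaive(x)); // 0.5 * 4 + (-1) * 6
        Console.WriteLine(kernel.ComputeSum(x));      // (-2.5, 1) . (2, 1)
    }
}
```

Both calls evaluate the same sum. Only the second avoids iterating over the registered components at evaluation time.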
The hierarchy of the included kernels follows:
The included kernels are:
- LinearKernel and SparseLinearKernel: Linear kernels for dense and sparse vectors respectively.
- RbfKernel and SparseRbfKernel: Gaussian kernels for dense and sparse vectors respectively.
- StringKernel<C>: Kernel for generic strings where 'characters' can be of any type C. It is an implementation of the 'all-substrings kernel' as introduced by Vishwanathan and Smola (2004). Note that in order to preserve its linear performance O(n), where n is the length of the strings passed to a method, the GetHashCode and Equals implementations of type C must have O(1) performance.
- ScaledKernel<T>: Produces a kernel by scaling another kernel by a value, which must be positive to preserve positive-definiteness.
- OffsetKernel<T>: Produces a kernel by adding a value to another kernel. The value must be positive to preserve positive-definiteness.
- SumKernel<T>: Produces a kernel from a sum of other Kernel<T> implementations.
- GaussianKernel<T>: Takes as input an arbitrary Kernel<T> and 'gaussianizes' it. It is a generalization of RbfKernel and SparseRbfKernel, which can be seen as GaussianKernel<Vector> and GaussianKernel<SparseVector> respectively.
- MappingKernel<T, S>: Produces a Kernel<T> implementation by mapping instances of T to instances of S and delegating them to a Kernel<S> implementation.
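The combinations listed above can be sketched with plain delegates. This is an illustration under stated assumptions, not the library's classes: KernelCombinatorsSketch and all of its methods are hypothetical names. The 'gaussianizing' formula exp(-(k(x,x) - 2k(x,y) + k(y,y)) / (2σ²)) and the parameter sigma are the usual textbook construction from the squared feature-space distance, and the library's own parameterization may differ.

```csharp
using System;

// Hypothetical combinators over kernels represented as Func<T, T, double>.
public static class KernelCombinatorsSketch
{
    // Scaling by a positive factor keeps the kernel positive definite.
    public static Func<T, T, double> Scale<T>(Func<T, T, double> k, double factor)
    {
        if (factor <= 0.0) throw new ArgumentOutOfRangeException(nameof(factor));
        return (x, y) => factor * k(x, y);
    }

    // Adding a positive constant keeps the kernel positive definite.
    public static Func<T, T, double> Offset<T>(Func<T, T, double> k, double offset)
    {
        if (offset <= 0.0) throw new ArgumentOutOfRangeException(nameof(offset));
        return (x, y) => k(x, y) + offset;
    }

    // The sum of two kernels is again a kernel.
    public static Func<T, T, double> Sum<T>(Func<T, T, double> k1, Func<T, T, double> k2) =>
        (x, y) => k1(x, y) + k2(x, y);

    // ||phi(x) - phi(y)||^2 = k(x, x) - 2 k(x, y) + k(y, y), fed into a Gaussian.
    public static Func<T, T, double> Gaussianize<T>(Func<T, T, double> k, double sigma) =>
        (x, y) => Math.Exp(-(k(x, x) - 2.0 * k(x, y) + k(y, y)) / (2.0 * sigma * sigma));

    // Map T to S, then delegate to a kernel over S.
    public static Func<T, T, double> Map<T, S>(Func<S, S, double> k, Func<T, S> mapping) =>
        (x, y) => k(mapping(x), mapping(y));
}

public static class CombinatorsDemo
{
    public static void Main()
    {
        Func<double[], double[], double> linear =
            (a, b) => a[0] * b[0] + a[1] * b[1];

        // Gaussianizing the linear kernel gives an RBF kernel on vectors.
        var rbf = KernelCombinatorsSketch.Gaussianize(linear, 1.0);
        Console.WriteLine(rbf(new[] { 1.0, 0.0 }, new[] { 0.0, 1.0 }));

        // Strings compared through a (hypothetical) length-based mapping.
        var byLength = KernelCombinatorsSketch.Map<string, double[]>(
            linear, s => new[] { (double)s.Length, 1.0 });
        Console.WriteLine(byLength("ab", "cde"));
    }
}
```

With the linear kernel as input, Gaussianize reduces to exp(-||x - y||² / (2σ²)), which matches the description of RbfKernel as a gaussianized linear kernel.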
This project depends on the following projects, which must reside in sibling directories: