java.lang.Object
  |
  +--weka.classifiers.Classifier
        |
        +--weka.classifiers.functions.SMO
Implements John C. Platt's sequential minimal optimization algorithm for training a support vector classifier using polynomial or RBF kernels. This implementation globally replaces all missing values and transforms nominal attributes into binary ones. It also normalizes all attributes by default. (Note that the coefficients in the output are based on the normalized or standardized data, not the original data.) Multi-class problems are solved using pairwise classification. To obtain proper probability estimates, use the option that fits logistic regression models to the outputs of the support vector machine (-M). In the multi-class case, the predicted probabilities are coupled using Hastie and Tibshirani's pairwise coupling method. Note: for improved speed, standardization should be turned off when operating on SparseInstances.
For more information on the SMO algorithm, see:
J. Platt (1998). Fast Training of Support Vector Machines using Sequential Minimal Optimization. Advances in Kernel Methods - Support Vector Learning, B. Schölkopf, C. Burges, and A. Smola, eds., MIT Press.
S.S. Keerthi, S.K. Shevade, C. Bhattacharyya, K.R.K. Murthy (2001). Improvements to Platt's SMO Algorithm for SVM Classifier Design. Neural Computation, 13(3), pp. 637-649.
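The class description states that multi-class problems are solved using pairwise classification. The following is a minimal sketch of that idea, assuming one binary decision per unordered pair of classes and simple majority voting. The class `PairwiseVoteSketch` and the `BinaryDecider` interface are hypothetical names used only for illustration; they are not part of Weka.

```java
import java.util.Arrays;

/** Hypothetical illustration of pairwise (one-vs-one) voting; not Weka code. */
final class PairwiseVoteSketch {

    /** Decides between two classes for a fixed instance: true means class i beats class j. */
    interface BinaryDecider {
        boolean firstWins(int i, int j);
    }

    /** Casts one vote per class pair (i < j), so k classes need k*(k-1)/2 binary decisions. */
    static int[] votes(int numClasses, BinaryDecider decider) {
        int[] votes = new int[numClasses];
        for (int i = 0; i < numClasses; i++) {
            for (int j = i + 1; j < numClasses; j++) {
                if (decider.firstWins(i, j)) {
                    votes[i]++;
                } else {
                    votes[j]++;
                }
            }
        }
        return votes;
    }

    /** Returns the class index with the most votes; ties go to the lowest index. */
    static int winner(int[] votes) {
        int best = 0;
        for (int c = 1; c < votes.length; c++) {
            if (votes[c] > votes[best]) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy decider (assumption for the demo): the class with the larger index always wins.
        int[] v = votes(3, (i, j) -> i > j);
        System.out.println(Arrays.toString(v) + " -> class " + winner(v));
    }
}
```

In the real classifier each pairwise decision would come from a trained binary SMO model; here the decider is a stand-in so the voting logic can be run on its own.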
Valid options are:
-C num
The complexity constant C. (default 1)
-E num
The exponent for the polynomial kernel. (default 1)
-G num
Gamma for the RBF kernel. (default 0.01)
-N <0|1|2>
Whether to 0=normalize/1=standardize/2=neither. (default 0=normalize)
-F
Feature-space normalization (only for non-linear polynomial kernels).
-O
Use lower-order terms (only for non-linear polynomial kernels).
-R
Use the RBF kernel. (default poly)
-A num
Sets the size of the kernel cache. Should be a prime number. (default 1000003)
-T num
Sets the tolerance parameter. (default 1.0e-3)
-P num
Sets the epsilon for round-off error. (default 1.0e-12)
-M
Fit logistic models to SVM outputs.
-V num
Number of folds for cross-validation used to generate data for logistic models. (default -1, use training data)
-W num
Random number seed for cross-validation. (default 1)
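The options -E, -O, -F, -R and -G above select and parameterize the kernel. As a rough sketch of what those parameters control, the code below uses the common textbook forms of the polynomial and RBF kernels. These formulas are an assumption: this page names the options but does not spell out the exact expressions used by SMO, and `KernelSketch` is a hypothetical class, not part of Weka.

```java
/** Hypothetical kernel helpers using textbook formulas; not taken from the Weka source. */
final class KernelSketch {

    /** Plain dot product of two equally long vectors. */
    static double dot(double[] x, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * y[i];
        }
        return sum;
    }

    /** Polynomial kernel: (x.y)^exponent, or (x.y + 1)^exponent when lower-order terms are used. */
    static double poly(double[] x, double[] y, double exponent, boolean lowerOrder) {
        double base = dot(x, y) + (lowerOrder ? 1.0 : 0.0);
        return Math.pow(base, exponent);
    }

    /** Feature-space normalization (assumed form): K(x,y) / sqrt(K(x,x) * K(y,y)). */
    static double normalizedPoly(double[] x, double[] y, double exponent, boolean lowerOrder) {
        double kxy = poly(x, y, exponent, lowerOrder);
        double kxx = poly(x, x, exponent, lowerOrder);
        double kyy = poly(y, y, exponent, lowerOrder);
        return kxy / Math.sqrt(kxx * kyy);
    }

    /** RBF kernel: exp(-gamma * ||x - y||^2). */
    static double rbf(double[] x, double[] y, double gamma) {
        double squaredDistance = 0.0;
        for (int i = 0; i < x.length; i++) {
            double d = x[i] - y[i];
            squaredDistance += d * d;
        }
        return Math.exp(-gamma * squaredDistance);
    }

    public static void main(String[] args) {
        double[] a = {1.0, 2.0};
        double[] b = {3.0, 4.0};
        System.out.println("poly, E=2:              " + poly(a, b, 2.0, false));
        System.out.println("poly, E=2, lower-order: " + poly(a, b, 2.0, true));
        System.out.println("rbf, G=0.01:            " + rbf(a, b, 0.01));
    }
}
```

With the default exponent of 1 and no lower-order terms, the polynomial kernel in this sketch reduces to the plain dot product, that is, a linear kernel.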
Field Summary | |
static int |
FILTER_NONE
|
static int |
FILTER_NORMALIZE
The filter to apply to the training data |
static int |
FILTER_STANDARDIZE
|
static Tag[] |
TAGS_FILTER
|
Constructor Summary | |
SMO()
|
Method Summary | |
java.lang.String[][][] |
attributeNames()
Returns the attribute names. |
double[][] |
bias()
Returns the bias of each binary SMO. |
void |
buildClassifier(Instances insts)
Method for building the classifier. |
java.lang.String |
buildLogisticModelsTipText()
Returns the tip text for this property |
java.lang.String |
cacheSizeTipText()
Returns the tip text for this property |
java.lang.String[] |
classAttributeNames()
|
java.lang.String |
cTipText()
Returns the tip text for this property |
double[] |
distributionForInstance(Instance inst)
Estimates class probabilities for given instance. |
java.lang.String |
epsilonTipText()
Returns the tip text for this property |
java.lang.String |
exponentTipText()
Returns the tip text for this property |
java.lang.String |
featureSpaceNormalizationTipText()
Returns the tip text for this property |
java.lang.String |
filterTypeTipText()
Returns the tip text for this property |
java.lang.String |
gammaTipText()
Returns the tip text for this property |
boolean |
getBuildLogisticModels()
Get the value of buildLogisticModels. |
double |
getC()
Get the value of C. |
int |
getCacheSize()
Get the size of the kernel cache. |
double |
getEpsilon()
Get the value of epsilon. |
double |
getExponent()
Get the value of exponent. |
boolean |
getFeatureSpaceNormalization()
Check whether feature space is being normalized. |
SelectedTag |
getFilterType()
Gets how the training data will be transformed. |
double |
getGamma()
Get the value of gamma. |
boolean |
getLowerOrderTerms()
Check whether lower-order terms are being used. |
int |
getNumFolds()
Get the value of numFolds. |
java.lang.String[] |
getOptions()
Gets the current settings of the classifier. |
int |
getRandomSeed()
Get the value of randomSeed. |
double |
getToleranceParameter()
Get the value of tolerance parameter. |
boolean |
getUseRBF()
Check if the RBF kernel is to be used. |
java.lang.String |
globalInfo()
Returns a string describing classifier |
java.util.Enumeration |
listOptions()
Returns an enumeration describing the available options. |
java.lang.String |
lowerOrderTermsTipText()
Returns the tip text for this property |
static void |
main(java.lang.String[] argv)
Main method for testing this class. |
int |
numClassAttributeValues()
|
java.lang.String |
numFoldsTipText()
Returns the tip text for this property |
int[] |
obtainVotes(Instance inst)
Returns an array of votes for the given instance. |
double[] |
pairwiseCoupling(double[][] n,
double[][] r)
Implements pairwise coupling. |
java.lang.String |
randomSeedTipText()
Returns the tip text for this property |
void |
setBuildLogisticModels(boolean newbuildLogisticModels)
Set the value of buildLogisticModels. |
void |
setC(double v)
Set the value of C. |
void |
setCacheSize(int v)
Set the size of the kernel cache. |
void |
setEpsilon(double v)
Set the value of epsilon. |
void |
setExponent(double v)
Set the value of exponent. |
void |
setFeatureSpaceNormalization(boolean v)
Set whether feature space is normalized. |
void |
setFilterType(SelectedTag newType)
Sets how the training data will be transformed. |
void |
setGamma(double v)
Set the value of gamma. |
void |
setLowerOrderTerms(boolean v)
Set whether lower-order terms are to be used. |
void |
setNumFolds(int newnumFolds)
Set the value of numFolds. |
void |
setOptions(java.lang.String[] options)
Parses a given list of options. |
void |
setRandomSeed(int newrandomSeed)
Set the value of randomSeed. |
void |
setToleranceParameter(double v)
Set the value of tolerance parameter. |
void |
setUseRBF(boolean v)
Set if the RBF kernel is to be used. |
int[][][] |
sparseIndices()
Returns the indices in sparse format. |
double[][][] |
sparseWeights()
Returns the weights in sparse format. |
java.lang.String |
toleranceParameterTipText()
Returns the tip text for this property |
java.lang.String |
toString()
Prints out the classifier. |
void |
turnChecksOff()
Turns off checks for missing values, etc. |
void |
turnChecksOn()
Turns on checks for missing values, etc. |
java.lang.String |
useRBFTipText()
Returns the tip text for this property |
Methods inherited from class weka.classifiers.Classifier |
classifyInstance, debugTipText, forName, getDebug, makeCopies, setDebug |
Methods inherited from class java.lang.Object |
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
public static final int FILTER_NORMALIZE
public static final int FILTER_STANDARDIZE
public static final int FILTER_NONE
public static final Tag[] TAGS_FILTER
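The constants FILTER_NORMALIZE, FILTER_STANDARDIZE and FILTER_NONE correspond to the three values of the -N option. As an illustration of the difference between the first two, the sketch below assumes that normalizing rescales an attribute to the range [0, 1] and that standardizing shifts it to zero mean and unit sample standard deviation. The exact behavior of the filters SMO applies (for example the variance estimator, or the handling of constant attributes) is not stated on this page, and `ScalingSketch` is a hypothetical class.

```java
/** Hypothetical per-attribute scaling helpers; not the filters used by Weka. */
final class ScalingSketch {

    /** Min-max scaling to [0, 1]; a constant attribute maps to 0 (assumption). */
    static double[] normalize(double[] values) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double range = max - min;
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = range == 0.0 ? 0.0 : (values[i] - min) / range;
        }
        return out;
    }

    /** Zero mean, unit sample standard deviation (n - 1 in the denominator, an assumption). */
    static double[] standardize(double[] values) {
        int n = values.length;
        double mean = 0.0;
        for (double v : values) {
            mean += v;
        }
        mean /= n;
        double sumSquares = 0.0;
        for (double v : values) {
            sumSquares += (v - mean) * (v - mean);
        }
        double sd = n > 1 ? Math.sqrt(sumSquares / (n - 1)) : 0.0;
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = sd == 0.0 ? 0.0 : (values[i] - mean) / sd;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] attribute = {2.0, 4.0, 6.0};
        System.out.println(java.util.Arrays.toString(normalize(attribute)));
        System.out.println(java.util.Arrays.toString(standardize(attribute)));
    }
}
```

Either transformation changes the scale of the attributes, which is why the class description notes that the coefficients in the output refer to the transformed data rather than the original data.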
Constructor Detail |
public SMO()
Method Detail |
public java.lang.String globalInfo()
public void turnChecksOff()
public void turnChecksOn()
public void buildClassifier(Instances insts) throws java.lang.Exception
buildClassifier in class Classifier
insts - the set of training instances
java.lang.Exception - if the classifier can't be built successfully
public double[] distributionForInstance(Instance inst) throws java.lang.Exception
distributionForInstance in class Classifier
inst - the instance to be classified
java.lang.Exception - if distribution could not be computed successfully
public double[] pairwiseCoupling(double[][] n, double[][] r)
n - the sum of weights used to train each model
r - the probability estimate from each model
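To make the roles of `n` and `r` more concrete, here is a minimal sketch of an iterative pairwise coupling scheme in the spirit of Hastie and Tibshirani's method. It assumes that, for i < j, `r[i][j]` is the estimated probability that class i is preferred over class j and `n[i][j]` is the weight of that pair; only the upper triangle is read. The iteration cap, the convergence tolerance and the class name `PairwiseCouplingSketch` are assumptions, and this is not the Weka implementation.

```java
import java.util.Arrays;

/** Hypothetical sketch of iterative pairwise coupling; not the Weka implementation. */
final class PairwiseCouplingSketch {

    static double[] couple(double[][] n, double[][] r) {
        int k = r.length;
        double[] p = new double[k];
        Arrays.fill(p, 1.0 / k); // start from a uniform class distribution

        // Weighted pairwise evidence in favour of each class; fixed during the iteration.
        double[] target = new double[k];
        for (int i = 0; i < k; i++) {
            for (int j = i + 1; j < k; j++) {
                target[i] += n[i][j] * r[i][j];
                target[j] += n[i][j] * (1.0 - r[i][j]);
            }
        }

        for (int iter = 0; iter < 1000; iter++) { // iteration cap is an assumption
            // Evidence implied by the current p, using mu_ij = p_i / (p_i + p_j).
            double[] implied = new double[k];
            for (int i = 0; i < k; i++) {
                for (int j = i + 1; j < k; j++) {
                    double denom = p[i] + p[j];
                    double mu = denom > 0.0 ? p[i] / denom : 0.5;
                    implied[i] += n[i][j] * mu;
                    implied[j] += n[i][j] * (1.0 - mu);
                }
            }

            // Multiplicative update, followed by renormalization to sum to one.
            double[] next = new double[k];
            double total = 0.0;
            for (int i = 0; i < k; i++) {
                next[i] = (target[i] == 0.0 || implied[i] == 0.0)
                        ? 0.0
                        : p[i] * target[i] / implied[i];
                total += next[i];
            }
            if (total == 0.0) {
                break; // degenerate input: keep the current estimate
            }
            double maxChange = 0.0;
            for (int i = 0; i < k; i++) {
                next[i] /= total;
                maxChange = Math.max(maxChange, Math.abs(next[i] - p[i]));
            }
            p = next;
            if (maxChange < 1.0e-6) { // tolerance is an assumption
                break;
            }
        }
        return p;
    }

    public static void main(String[] args) {
        double[][] n = {{0, 1, 1}, {0, 0, 1}, {0, 0, 0}};
        double[][] r = {{0, 0.9, 0.8}, {0, 0, 0.6}, {0, 0, 0}};
        System.out.println(Arrays.toString(couple(n, r)));
    }
}
```

In the two-class case the coupled distribution in this sketch simply reproduces the single pairwise estimate, which is a quick sanity check for the update rule.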
public int[] obtainVotes(Instance inst) throws java.lang.Exception
inst - the instance
java.lang.Exception - if something goes wrong
public double[][][] sparseWeights()
public int[][][] sparseIndices()
public double[][] bias()
public int numClassAttributeValues()
public java.lang.String[] classAttributeNames()
public java.lang.String[][][] attributeNames()
public java.util.Enumeration listOptions()
listOptions in interface OptionHandler
listOptions in class Classifier
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-C num
The complexity constant C. (default 1)
-E num
The exponent for the polynomial kernel. (default 1)
-G num
Gamma for the RBF kernel. (default 0.01)
-N <0|1|2>
Whether to 0=normalize/1=standardize/2=neither. (default 0=normalize)
-F
Feature-space normalization (only for non-linear polynomial kernels).
-O
Use lower-order terms (only for non-linear polynomial kernels).
-R
Use RBF kernel (default poly).
-A num
Sets the size of the kernel cache. Should be a prime number. (default 1000003)
-T num
Sets the tolerance parameter. (default 1.0e-3)
-P num
Sets the epsilon for round-off error. (default 1.0e-12)
-M
Fit logistic models to SVM outputs.
-V num
Number of folds for cross-validation used to generate data for logistic models. (default -1, use training data)
-W num
Random number seed. (default 1)
setOptions in interface OptionHandler
setOptions in class Classifier
options - the list of options as an array of strings
java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
getOptions in interface OptionHandler
getOptions in class Classifier
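Since setOptions takes the options as an array of strings, each flag and each value is a separate array element. The sketch below builds such an array for an RBF kernel with logistic models and shows a small lookup helper. `OptionArraySketch`, `valueOf` and `hasFlag` are hypothetical helpers for illustration only. The commented-out SMO calls are limited to the constructor and setOptions as listed on this page; they require Weka on the classpath and are not executed here.

```java
import java.util.Arrays;

/** Hypothetical helper showing the shape of an option array; not part of Weka. */
final class OptionArraySketch {

    /** Options for C = 2, RBF kernel with gamma = 0.05, and logistic models (-M). */
    static String[] rbfOptions() {
        return new String[] {"-C", "2.0", "-R", "-G", "0.05", "-M"};
    }

    /**
     * Returns the element following a value-taking flag such as -C or -G,
     * or the given default when the flag is absent.
     */
    static String valueOf(String[] options, String flag, String defaultValue) {
        for (int i = 0; i + 1 < options.length; i++) {
            if (options[i].equals(flag)) {
                return options[i + 1];
            }
        }
        return defaultValue;
    }

    /** Returns true when a switch such as -R or -M is present. */
    static boolean hasFlag(String[] options, String flag) {
        return Arrays.asList(options).contains(flag);
    }

    public static void main(String[] args) {
        String[] options = rbfOptions();
        System.out.println("gamma    = " + valueOf(options, "-G", "0.01"));
        System.out.println("exponent = " + valueOf(options, "-E", "1"));
        System.out.println("use RBF  = " + hasFlag(options, "-R"));

        // With Weka on the classpath, the array would be handed to the classifier like this:
        //   SMO smo = new SMO();
        //   smo.setOptions(options); // throws java.lang.Exception if an option is not supported
    }
}
```

The defaults passed to `valueOf` in `main` mirror the defaults documented in the option list (gamma 0.01, exponent 1).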
public java.lang.String exponentTipText()
public double getExponent()
public void setExponent(double v)
v - Value to assign to exponent.
public java.lang.String gammaTipText()
public double getGamma()
public void setGamma(double v)
v - Value to assign to gamma.
public java.lang.String cTipText()
public double getC()
public void setC(double v)
v - Value to assign to C.
public java.lang.String toleranceParameterTipText()
public double getToleranceParameter()
public void setToleranceParameter(double v)
v - Value to assign to tolerance parameter.
public java.lang.String epsilonTipText()
public double getEpsilon()
public void setEpsilon(double v)
v - Value to assign to epsilon.
public java.lang.String cacheSizeTipText()
public int getCacheSize()
public void setCacheSize(int v)
v - Size of kernel cache.
public java.lang.String filterTypeTipText()
public SelectedTag getFilterType()
public void setFilterType(SelectedTag newType)
newType - the new filtering mode
public java.lang.String useRBFTipText()
public boolean getUseRBF()
public void setUseRBF(boolean v)
v - true if RBF
public java.lang.String featureSpaceNormalizationTipText()
public boolean getFeatureSpaceNormalization() throws java.lang.Exception
java.lang.Exception
public void setFeatureSpaceNormalization(boolean v) throws java.lang.Exception
v - true if feature space is to be normalized.
java.lang.Exception
public java.lang.String lowerOrderTermsTipText()
public boolean getLowerOrderTerms()
public void setLowerOrderTerms(boolean v)
v - Value to assign to lowerOrder.
public java.lang.String buildLogisticModelsTipText()
public boolean getBuildLogisticModels()
public void setBuildLogisticModels(boolean newbuildLogisticModels)
newbuildLogisticModels - Value to assign to buildLogisticModels.
public java.lang.String numFoldsTipText()
public int getNumFolds()
public void setNumFolds(int newnumFolds)
newnumFolds - Value to assign to numFolds.
public java.lang.String randomSeedTipText()
public int getRandomSeed()
public void setRandomSeed(int newrandomSeed)
newrandomSeed - Value to assign to randomSeed.
public java.lang.String toString()
toString in class java.lang.Object
public static void main(java.lang.String[] argv)
Copyright (c) 2003 David Lindsay, Computer Learning Research Centre, Dept. Computer Science, Royal Holloway, University of London