classifiers
Class AltDist_IBk

java.lang.Object
  |
  +--weka.classifiers.Classifier
        |
        +--classifiers.AltDist_IBk
All Implemented Interfaces:
java.lang.Cloneable, OptionHandler, java.io.Serializable, UpdateableClassifier, WeightedInstancesHandler

public class AltDist_IBk
extends Classifier
implements OptionHandler, UpdateableClassifier, WeightedInstancesHandler

K-nearest neighbour classifier. For more information, see

Lindsay, D. (2003) "Use of a compression based similarity metric in an instance based learner for analysing strings of differing length".

Valid options are:

-K num
Set the number of nearest neighbours to use in prediction (default 1)

-W num
Set a fixed window size for incremental train/testing. As new training instances are added, oldest instances are removed to maintain the number of training instances at this size. (default no window)

-D
Neighbours will be weighted by the inverse of their distance when voting. (default equal weighting)

-F
Neighbours will be weighted by their similarity when voting. (default equal weighting)

-X
Selects the number of neighbours to use by hold-one-out cross validation, with an upper limit given by the -K option.

-S
When k is selected by cross-validation for numeric class attributes, minimize mean-squared error. (default mean absolute error)

-N
Turns off normalization.

-E
KDTrees class and its options (can only use the same distance function as XMeans).

-Q Distance Metric
Specifies the distance metric to use (default Euclidean distance)

Version:
$Revision: 1.00 $
Author:
David Lindsay (davidl@cs.rhul.ac.uk)
See Also:
Serialized Form

Field Summary
static Tag[] TAGS_WEIGHTING
           
static int WEIGHT_INVERSE
           
static int WEIGHT_NONE
           
static int WEIGHT_SIMILARITY
           
 
Constructor Summary
AltDist_IBk()
          IB1 classifier.
AltDist_IBk(int k)
          IBk classifier.
 
Method Summary
 void buildClassifier(Instances instances)
          Generates the classifier.
 double[] distributionForInstance(Instance instance)
          Calculates the class membership probabilities for the given test instance.
 double getAttributeMax(int index)
          Gets an attribute's maximum observed value
 double getAttributeMin(int index)
          Gets an attribute's minimum observed value
 boolean getCrossValidate()
          Gets whether hold-one-out cross-validation will be used to select the best k value
 boolean getDebug()
          Get the value of Debug.
 java.lang.String getDistanceMetric()
          Gives details about the distance metric that has been chosen
 SelectedTag getDistanceWeighting()
          Gets the distance weighting method used.
 int getKNN()
          Gets the number of neighbours the learner will use.
 boolean getMeanSquared()
          Gets whether the mean squared error is used rather than mean absolute error when doing cross-validation.
 boolean getNoNormalization()
          Gets whether normalization is turned off.
 int getNumTraining()
          Get the number of training instances the classifier is currently using
 java.lang.String[] getOptions()
          Gets the current settings of IBk.
 int getWindowSize()
          Gets the maximum number of instances allowed in the training pool.
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] argv)
          Main method for testing this class.
 void setCrossValidate(boolean newCrossValidate)
          Sets whether hold-one-out cross-validation will be used to select the best k value
 void setDebug(boolean newDebug)
          Set the value of Debug.
 boolean setDistanceMetric(java.lang.String distName, java.lang.String[] distOptions)
          Set the distance metric
 void setDistanceWeighting(SelectedTag newMethod)
          Sets the distance weighting method used.
 void setKNN(int k)
          Set the number of neighbours the learner is to use.
 void setMeanSquared(boolean newMeanSquared)
          Sets whether the mean squared error is used rather than mean absolute error when doing cross-validation.
 void setNoNormalization(boolean v)
          Set whether normalization is turned off.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setWindowSize(int newWindowSize)
          Sets the maximum number of instances allowed in the training pool.
 java.lang.String toString()
          Returns a description of this classifier.
 void updateClassifier(Instance instance)
          Adds the supplied instance to the training set
 
Methods inherited from class weka.classifiers.Classifier
classifyInstance, debugTipText, forName, makeCopies
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

WEIGHT_NONE

public static final int WEIGHT_NONE
See Also:
Constant Field Values

WEIGHT_INVERSE

public static final int WEIGHT_INVERSE
See Also:
Constant Field Values

WEIGHT_SIMILARITY

public static final int WEIGHT_SIMILARITY
See Also:
Constant Field Values

TAGS_WEIGHTING

public static final Tag[] TAGS_WEIGHTING
Constructor Detail

AltDist_IBk

public AltDist_IBk(int k)
IBk classifier. Simple instance-based learner that uses the class of the nearest k training instances for the class of the test instances.

Parameters:
k - the number of nearest neighbours to use for prediction
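
The prediction rule described above can be sketched in a few lines. This is a minimal, self-contained illustration over plain numeric arrays with Euclidean distance, not the code of this class; KnnSketch, distance and predict are hypothetical names, and tie-breaking towards the lowest class index is an assumption.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration only; not part of AltDist_IBk or Weka.
class KnnSketch {

    // Euclidean distance between two equal-length numeric vectors.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Returns the majority class among the k training points nearest to the query.
    // Ties go to the lowest class index (an assumption of this sketch).
    static int predict(double[][] train, int[] labels, int numClasses,
                       double[] query, int k) {
        Integer[] order = new Integer[train.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort training indices by distance to the query, nearest first.
        Arrays.sort(order,
                Comparator.comparingDouble((Integer i) -> distance(train[i], query)));

        int[] votes = new int[numClasses];
        for (int n = 0; n < Math.min(k, order.length); n++) {
            votes[labels[order[n]]]++;
        }
        int best = 0;
        for (int c = 1; c < numClasses; c++) {
            if (votes[c] > votes[best]) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] train = {{0.0, 0.0}, {0.1, 0.2}, {1.0, 1.0}, {0.9, 1.1}};
        int[] labels = {0, 0, 1, 1};
        System.out.println(predict(train, labels, 2, new double[] {0.2, 0.1}, 3));
    }
}
```

With k set to 1 the same routine behaves like the IB1 case handled by the no-argument constructor.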

AltDist_IBk

public AltDist_IBk()
IB1 classifier. Instance-based learner. Predicts the class of the single nearest training instance for each test instance.

Method Detail

getDebug

public boolean getDebug()
Get the value of Debug.

Overrides:
getDebug in class Classifier
Returns:
Value of Debug.

setDebug

public void setDebug(boolean newDebug)
Set the value of Debug.

Overrides:
setDebug in class Classifier
Parameters:
newDebug - Value to assign to Debug.

setKNN

public void setKNN(int k)
Set the number of neighbours the learner is to use.

Parameters:
k - the number of neighbours.

getKNN

public int getKNN()
Gets the number of neighbours the learner will use.

Returns:
the number of neighbours

setDistanceMetric

public boolean setDistanceMetric(java.lang.String distName,
                                 java.lang.String[] distOptions)
Set the distance metric

Parameters:
distName - the distance metric class name in full
distOptions - the options specified for that distance metric
Returns:
true if successful
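
The paper cited in the class description concerns a compression-based similarity metric for strings of differing length. This page does not say which formula or compressor that metric uses, so the sketch below substitutes the widely known normalized compression distance (NCD) computed with java.util.zip.Deflater, purely to show what such a metric can look like. NcdSketch, compressedLength and ncd are hypothetical names and are not part of this class.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Hypothetical illustration only; the metric actually used by AltDist_IBk may differ.
class NcdSketch {

    // Number of bytes produced by DEFLATE-compressing the UTF-8 encoding of s.
    static int compressedLength(String s) {
        byte[] input = s.getBytes(StandardCharsets.UTF_8);
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[1024];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    // NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    // where C is the compressed length. Smaller values mean more similar strings.
    static double ncd(String x, String y) {
        int cx = compressedLength(x);
        int cy = compressedLength(y);
        int cxy = compressedLength(x + y);
        return (cxy - Math.min(cx, cy)) / (double) Math.max(cx, cy);
    }
}
```

Because the measure depends only on compressed lengths, the two strings do not need to have the same length.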

getDistanceMetric

public java.lang.String getDistanceMetric()
Gives details about the distance metric that has been chosen.

Returns:
a string describing the chosen distance metric

getWindowSize

public int getWindowSize()
Gets the maximum number of instances allowed in the training pool. The addition of new instances above this value will result in old instances being removed. A value of 0 signifies no limit to the number of training instances.

Returns:
Value of WindowSize

setWindowSize

public void setWindowSize(int newWindowSize)
Sets the maximum number of instances allowed in the training pool. The addition of new instances above this value will result in old instances being removed. A value of 0 signifies no limit to the number of training instances.

Parameters:
newWindowSize - Value to assign to WindowSize.
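
The window behaviour described above amounts to a first-in, first-out training pool. A minimal sketch, assuming instances are plain double arrays; WindowSketch and its methods are hypothetical names and not part of this class:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only; not part of AltDist_IBk or Weka.
class WindowSketch {
    private final Deque<double[]> pool = new ArrayDeque<>();
    private final int windowSize; // 0 means no limit, as documented above

    WindowSketch(int windowSize) {
        this.windowSize = windowSize;
    }

    // Adds a training instance, then evicts the oldest ones while over the limit.
    void add(double[] instance) {
        pool.addLast(instance);
        while (windowSize > 0 && pool.size() > windowSize) {
            pool.removeFirst();
        }
    }

    int size() {
        return pool.size();
    }

    // The oldest instance still in the pool, or null if the pool is empty.
    double[] oldest() {
        return pool.peekFirst();
    }
}
```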

getDistanceWeighting

public SelectedTag getDistanceWeighting()
Gets the distance weighting method used. Will be one of WEIGHT_NONE, WEIGHT_INVERSE, or WEIGHT_SIMILARITY

Returns:
the distance weighting method used.

setDistanceWeighting

public void setDistanceWeighting(SelectedTag newMethod)
Sets the distance weighting method used. Values other than WEIGHT_NONE, WEIGHT_INVERSE, or WEIGHT_SIMILARITY will be ignored.
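
To make the three weighting methods concrete, the sketch below maps a neighbour's distance to a voting weight. The exact formulas used by this class are not given on this page: the small constant added before inverting and the use of 1 - distance for similarity (which presumes distances scaled to [0, 1]) are assumptions. WeightingSketch and its enum are hypothetical and do not mirror the integer constants of this class.

```java
// Hypothetical illustration only; not part of AltDist_IBk or Weka.
class WeightingSketch {

    enum Weighting { NONE, INVERSE, SIMILARITY }

    // Voting weight of a neighbour at the given distance.
    static double weight(Weighting method, double distance) {
        switch (method) {
            case INVERSE:
                // Assumed: a small constant avoids division by zero.
                return 1.0 / (distance + 0.001);
            case SIMILARITY:
                // Assumed: distances are scaled to [0, 1].
                return 1.0 - distance;
            default:
                // Equal weighting: every neighbour counts once.
                return 1.0;
        }
    }
}
```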


getMeanSquared

public boolean getMeanSquared()
Gets whether the mean squared error is used rather than mean absolute error when doing cross-validation.

Returns:
true if so.

setMeanSquared

public void setMeanSquared(boolean newMeanSquared)
Sets whether the mean squared error is used rather than mean absolute error when doing cross-validation.

Parameters:
newMeanSquared - true if so.
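
For reference, the two error measures being chosen between can be written as follows. This is a generic sketch of mean absolute error and mean squared error over arrays of actual and predicted values; ErrorSketch is a hypothetical name and not part of this class.

```java
// Hypothetical illustration only; not part of AltDist_IBk or Weka.
class ErrorSketch {

    // Mean of |actual - predicted|.
    static double meanAbsoluteError(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(actual[i] - predicted[i]);
        }
        return sum / actual.length;
    }

    // Mean of (actual - predicted)^2; penalises large errors more heavily.
    static double meanSquaredError(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double diff = actual[i] - predicted[i];
            sum += diff * diff;
        }
        return sum / actual.length;
    }
}
```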

getCrossValidate

public boolean getCrossValidate()
Gets whether hold-one-out cross-validation will be used to select the best k value

Returns:
true if cross-validation will be used.

setCrossValidate

public void setCrossValidate(boolean newCrossValidate)
Sets whether hold-one-out cross-validation will be used to select the best k value

Parameters:
newCrossValidate - true if cross-validation should be used.
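
A rough sketch of how hold-one-out cross-validation can pick k for a nominal class: each training instance is held out in turn, predicted from the remaining ones, and the k with the fewest misclassifications (up to a given upper limit) is kept. The one-dimensional data layout, the names SelectKSketch, predictHeldOut and selectK, and the tie-breaking rules are all assumptions of this sketch rather than details of this class.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration only; not part of AltDist_IBk or Weka.
class SelectKSketch {

    // Majority vote among the k nearest neighbours of x[held], excluding x[held] itself.
    static int predictHeldOut(double[] x, int[] y, int numClasses, int held, int k) {
        Integer[] order = new Integer[x.length - 1];
        int n = 0;
        for (int i = 0; i < x.length; i++) {
            if (i != held) {
                order[n++] = i;
            }
        }
        final double query = x[held];
        Arrays.sort(order,
                Comparator.comparingDouble((Integer i) -> Math.abs(x[i] - query)));

        int[] votes = new int[numClasses];
        for (int j = 0; j < Math.min(k, order.length); j++) {
            votes[y[order[j]]]++;
        }
        int best = 0;
        for (int c = 1; c < numClasses; c++) {
            if (votes[c] > votes[best]) {
                best = c;
            }
        }
        return best;
    }

    // Returns the k in 1..maxK with the fewest hold-one-out errors (smallest k on ties).
    static int selectK(double[] x, int[] y, int numClasses, int maxK) {
        int bestK = 1;
        int bestErrors = Integer.MAX_VALUE;
        for (int k = 1; k <= maxK; k++) {
            int errors = 0;
            for (int held = 0; held < x.length; held++) {
                if (predictHeldOut(x, y, numClasses, held, k) != y[held]) {
                    errors++;
                }
            }
            if (errors < bestErrors) {
                bestErrors = errors;
                bestK = k;
            }
        }
        return bestK;
    }
}
```

For a numeric class the error count would be replaced by mean absolute or mean squared error, depending on the mean-squared setting.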

getNumTraining

public int getNumTraining()
Get the number of training instances the classifier is currently using


getAttributeMin

public double getAttributeMin(int index)
                       throws java.lang.Exception
Gets an attribute's minimum observed value

Throws:
java.lang.Exception

getAttributeMax

public double getAttributeMax(int index)
                       throws java.lang.Exception
Gets an attribute's maximum observed value

Throws:
java.lang.Exception

getNoNormalization

public boolean getNoNormalization()
Gets whether normalization is turned off.

Returns:
Value of DontNormalize.

setNoNormalization

public void setNoNormalization(boolean v)
Set whether normalization is turned off.

Parameters:
v - Value to assign to DontNormalize.
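
This page does not spell out the normalization itself. A common choice for nearest-neighbour learners, and the one assumed in this sketch, is min-max scaling of each numeric attribute using its minimum and maximum observed values. NormalizeSketch, range and normalize are hypothetical names and not part of this class.

```java
// Hypothetical illustration only; not part of AltDist_IBk or Weka.
class NormalizeSketch {

    // Returns {min, max} of the observed values of one attribute.
    static double[] range(double[] column) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : column) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return new double[] {min, max};
    }

    // Scales a value into [0, 1]; a constant attribute maps to 0 (an assumption).
    static double normalize(double value, double min, double max) {
        if (max == min) {
            return 0.0;
        }
        return (value - min) / (max - min);
    }
}
```

Scaling of this kind keeps attributes with large numeric ranges from dominating a Euclidean distance.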

buildClassifier

public void buildClassifier(Instances instances)
                     throws java.lang.Exception
Generates the classifier.

Specified by:
buildClassifier in class Classifier
Parameters:
instances - set of instances serving as training data
Throws:
java.lang.Exception - if the classifier has not been generated successfully

updateClassifier

public void updateClassifier(Instance instance)
                      throws java.lang.Exception
Adds the supplied instance to the training set

Specified by:
updateClassifier in interface UpdateableClassifier
Parameters:
instance - the instance to add
Throws:
java.lang.Exception - if instance could not be incorporated successfully

distributionForInstance

public double[] distributionForInstance(Instance instance)
                                 throws java.lang.Exception
Calculates the class membership probabilities for the given test instance.

Overrides:
distributionForInstance in class Classifier
Parameters:
instance - the instance to be classified
Returns:
predicted class probability distribution
Throws:
java.lang.Exception - if an error occurred during the prediction
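
One plausible way to turn neighbour votes into class membership probabilities is to sum the voting weight per class and divide by the total. The sketch below assumes the neighbours and their weights have already been determined; DistributionSketch and distribution are hypothetical names, and the actual computation in this class may differ (for example, by smoothing).

```java
// Hypothetical illustration only; not part of AltDist_IBk or Weka.
class DistributionSketch {

    // Sums the weight of each neighbour into its class, then normalises to sum to 1.
    static double[] distribution(int[] neighbourLabels, double[] neighbourWeights,
                                 int numClasses) {
        double[] dist = new double[numClasses];
        double total = 0.0;
        for (int i = 0; i < neighbourLabels.length; i++) {
            dist[neighbourLabels[i]] += neighbourWeights[i];
            total += neighbourWeights[i];
        }
        if (total > 0.0) {
            for (int c = 0; c < numClasses; c++) {
                dist[c] /= total;
            }
        }
        return dist;
    }
}
```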

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class Classifier
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-K num
Set the number of nearest neighbours to use in prediction (default 1)

-W num
Set a fixed window size for incremental train/testing. As new training instances are added, oldest instances are removed to maintain the number of training instances at this size. (default no window)

-D
Neighbours will be weighted by the inverse of their distance when voting. (default equal weighting)

-F
Neighbours will be weighted by their similarity when voting. (default equal weighting)

-X
Select the number of neighbours to use by hold-one-out cross validation, with an upper limit given by the -K option.

-S
When k is selected by cross-validation for numeric class attributes, minimize mean-squared error. (default mean absolute error)

-Q
Specifies which distance metric is being used (default is Euclidean)

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class Classifier
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
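
To show the shape of the option array, the following stand-alone sketch parses the simple flags listed above into a settings object. It is not the parser used by this class: OptionsSketch and its fields are hypothetical, and -Q is left out because the format of its arguments is not specified on this page.

```java
// Hypothetical illustration only; not part of AltDist_IBk or Weka.
class OptionsSketch {
    int k = 1;              // -K num (default 1)
    int window = 0;         // -W num (0 = no window)
    boolean inverse;        // -D
    boolean similarity;     // -F
    boolean crossValidate;  // -X
    boolean meanSquared;    // -S

    // Reads the integer that must follow a flag such as -K or -W.
    private static int intArg(String[] options, int index) {
        if (index >= options.length) {
            throw new IllegalArgumentException("Missing value after " + options[index - 1]);
        }
        return Integer.parseInt(options[index]);
    }

    // Parses an option array such as {"-K", "3", "-D"}.
    static OptionsSketch parse(String[] options) {
        OptionsSketch s = new OptionsSketch();
        for (int i = 0; i < options.length; i++) {
            switch (options[i]) {
                case "-K": s.k = intArg(options, ++i); break;
                case "-W": s.window = intArg(options, ++i); break;
                case "-D": s.inverse = true; break;
                case "-F": s.similarity = true; break;
                case "-X": s.crossValidate = true; break;
                case "-S": s.meanSquared = true; break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
        return s;
    }
}
```

An array such as {"-K", "3", "-W", "100", "-D"} would then request three neighbours, a window of 100 training instances, and inverse-distance weighting.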

getOptions

public java.lang.String[] getOptions()
Gets the current settings of IBk.

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class Classifier
Returns:
an array of strings suitable for passing to setOptions()

toString

public java.lang.String toString()
Returns a description of this classifier.

Overrides:
toString in class java.lang.Object
Returns:
a description of this classifier as a string.

main

public static void main(java.lang.String[] argv)
Main method for testing this class.

Parameters:
argv - should contain command line options (see setOptions)


Copyright (c) 2003 David Lindsay, Computer Learning Research Centre, Dept. Computer Science, Royal Holloway, University of London