Set the class attribute. If set, then class based evaluation of clustering
is performed.
- Version:
- $Revision: 1.23 $
- Author:
- Mark Hall (mhall@cs.waikato.ac.nz)
Method Summary
java.lang.String clusterResultsToString()
- Return the results of clustering.
static java.lang.String crossValidateModel(java.lang.String clustererString, Instances data, int numFolds, java.lang.String[] options, java.util.Random random)
- Performs a cross-validation for a distribution clusterer on a set of instances.
static java.lang.String evaluateClusterer(Clusterer clusterer, java.lang.String[] options)
- Evaluates a clusterer with the options given in an array of strings.
void evaluateClusterer(Instances test)
- Evaluate the clusterer on a set of instances.
int[] getClassesToClusters()
- Return the array (ordered by cluster number) of minimum error class to cluster mappings.
double[] getClusterAssignments()
- Return an array of cluster assignments corresponding to the most recent set of instances clustered.
int getNumClusters()
- Return the number of clusters found for the most recent call to evaluateClusterer.
static void main(java.lang.String[] args)
- Main method for testing this class.
void setClusterer(Clusterer clusterer)
- Set the clusterer.
void setDoXval(boolean x)
- Set whether or not to do cross validation.
void setFolds(int folds)
- Set the number of folds to use for cross validation.
void setSeed(int s)
- Set the seed to use for cross validation.
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ClusterEvaluation
public ClusterEvaluation()
- Constructor. Sets defaults for each member variable. Default Clusterer
is EM.
setClusterer
public void setClusterer(Clusterer clusterer)
- Set the clusterer.
- Parameters:
clusterer
- the clusterer to use
setDoXval
public void setDoXval(boolean x)
- Set whether or not to do cross validation.
- Parameters:
x
- true if cross validation is to be done
setFolds
public void setFolds(int folds)
- Set the number of folds to use for cross validation.
- Parameters:
folds
- the number of folds
setSeed
public void setSeed(int s)
- Set the seed to use for cross validation.
- Parameters:
s
- the seed.
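The fold count and seed together determine how instances are split for cross validation. The following is a minimal sketch of that idea using only the Java standard library; it does not reproduce this class's internal splitting, and the class name FoldSketch and the method partition are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class FoldSketch {
    /** Shuffles the indices 0..n-1 with the given seed and deals them into folds. */
    public static List<List<Integer>> partition(int n, int folds, int seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        // The seed fixes the shuffle order, so the split is repeatable.
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> result = new ArrayList<>();
        for (int f = 0; f < folds; f++) {
            result.add(new ArrayList<>());
        }
        // Deal the shuffled indices round-robin into the folds.
        for (int i = 0; i < n; i++) {
            result.get(i % folds).add(indices.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> first = partition(10, 3, 1);
        List<List<Integer>> second = partition(10, 3, 1);
        System.out.println(first.equals(second)); // prints true: same seed, same folds
        System.out.println(first);
    }
}
```

Reusing a seed makes a cross-validation run repeatable; changing it gives a different but equally valid split.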
clusterResultsToString
public java.lang.String clusterResultsToString()
- Return the results of clustering.
- Returns:
- a string detailing the results of clustering a data set
getNumClusters
public int getNumClusters()
- Return the number of clusters found for the most recent call to
evaluateClusterer
- Returns:
- the number of clusters found
getClusterAssignments
public double[] getClusterAssignments()
- Return an array of cluster assignments corresponding to the most
recent set of instances clustered.
- Returns:
- an array of cluster assignments
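Because the assignments come back as a double[] rather than an int[], callers typically convert each entry to a cluster index before using it. The sketch below tallies cluster sizes from such an array. It assumes that each entry holds a whole-number cluster index starting at 0, which this page does not state explicitly; AssignmentTally and clusterSizes are hypothetical names.

```java
import java.util.Arrays;

public class AssignmentTally {
    /** Counts how many instances fall into each cluster index. */
    public static int[] clusterSizes(double[] assignments, int numClusters) {
        int[] sizes = new int[numClusters];
        for (double a : assignments) {
            int cluster = (int) a;
            // Skip values that are not a whole number in [0, numClusters).
            if (cluster == a && cluster >= 0 && cluster < numClusters) {
                sizes[cluster]++;
            }
        }
        return sizes;
    }

    public static void main(String[] args) {
        // Stand-in for the array returned by getClusterAssignments().
        double[] assignments = {0.0, 1.0, 1.0, 2.0, 0.0, 1.0};
        System.out.println(Arrays.toString(clusterSizes(assignments, 3))); // prints [2, 3, 1]
    }
}
```

In real use, the second argument would come from getNumClusters().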
getClassesToClusters
public int[] getClassesToClusters()
- Return the array (ordered by cluster number) of minimum error class to
cluster mappings
- Returns:
- an array of class to cluster mappings
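A minimum error mapping pairs each cluster with at most one class so that as few instances as possible end up in a cluster labelled with a different class. The sketch below finds such a mapping by exhaustive search over a small count table. It is an illustration of the concept, not this class's actual search; the names MappingSketch and minErrorMapping, and the use of -1 for a cluster with no class, are assumptions.

```java
import java.util.Arrays;

public class MappingSketch {
    /**
     * counts[c][k] is the number of instances of class k placed in cluster c.
     * Returns, for each cluster, the class assigned to it, or -1 for none.
     */
    public static int[] minErrorMapping(int[][] counts) {
        int numClusters = counts.length;
        int numClasses = numClusters == 0 ? 0 : counts[0].length;
        int[] current = new int[numClusters];
        Arrays.fill(current, -1);
        int[] best = current.clone();
        int[] bestCorrect = {0};
        search(counts, 0, new boolean[numClasses], current, 0, best, bestCorrect);
        return best;
    }

    private static void search(int[][] counts, int cluster, boolean[] used,
                               int[] current, int correct, int[] best, int[] bestCorrect) {
        if (cluster == counts.length) {
            // Maximising correctly labelled instances minimises the error.
            if (correct > bestCorrect[0]) {
                bestCorrect[0] = correct;
                System.arraycopy(current, 0, best, 0, current.length);
            }
            return;
        }
        // Option 1: leave this cluster without a class.
        current[cluster] = -1;
        search(counts, cluster + 1, used, current, correct, best, bestCorrect);
        // Option 2: give it any class not yet taken by an earlier cluster.
        for (int k = 0; k < used.length; k++) {
            if (!used[k]) {
                used[k] = true;
                current[cluster] = k;
                search(counts, cluster + 1, used, current,
                        correct + counts[cluster][k], best, bestCorrect);
                used[k] = false;
            }
        }
        current[cluster] = -1;
    }

    public static void main(String[] args) {
        // Three clusters, two classes.
        int[][] counts = {{8, 1}, {2, 9}, {3, 0}};
        System.out.println(Arrays.toString(minErrorMapping(counts))); // prints [0, 1, -1]
    }
}
```

Exhaustive search is only practical for a handful of clusters and classes, which is why it is shown here as a sketch rather than a general method.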
evaluateClusterer
public void evaluateClusterer(Instances test)
throws java.lang.Exception
- Evaluate the clusterer on a set of instances. Calculates clustering
statistics and stores cluster assignments for the instances in
m_clusterAssignments
- Parameters:
test
- the set of instances to cluster
- Throws:
java.lang.Exception
- if something goes wrong
evaluateClusterer
public static java.lang.String evaluateClusterer(Clusterer clusterer,
java.lang.String[] options)
throws java.lang.Exception
- Evaluates a clusterer with the options given in an array of
strings. It takes the string indicated by "-t" as training file, the
string indicated by "-T" as test file.
If the test file is missing, a stratified ten-fold
cross-validation is performed (distribution clusterers only).
Using "-x" you can change the number of
folds to be used, and using "-s" the random seed.
If the "-p" option is present it outputs the classification for
each test instance. If you provide the name of an object file using
"-l", a clusterer will be loaded from the given file. If you provide the
name of an object file using "-d", the clusterer built from the
training data will be saved to the given file.
- Parameters:
clusterer
- machine learning clusterer
options
- the array of strings containing the options
- Returns:
- a string describing the results
- Throws:
java.lang.Exception
- if model could not be evaluated successfully
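To make the option handling concrete, the sketch below builds an options array and mirrors the decision described above: with "-T" present the test file is used, otherwise cross-validation runs with ten folds unless "-x" overrides that. It does not call this class. OptionSketch, optionValue and describe are hypothetical helpers, and the .arff file names are placeholders.

```java
public class OptionSketch {
    /** Returns the value following the flag, or null if the flag is absent. */
    public static String optionValue(String flag, String[] options) {
        for (int i = 0; i < options.length - 1; i++) {
            if (options[i].equals(flag)) {
                return options[i + 1];
            }
        }
        return null;
    }

    /** Summarises which evaluation mode the options select. */
    public static String describe(String[] options) {
        String train = optionValue("-t", options);
        String test = optionValue("-T", options);
        if (test != null) {
            return "train on " + train + ", evaluate on " + test;
        }
        // No test file: fall back to cross-validation, ten folds by default.
        String x = optionValue("-x", options);
        int folds = (x == null) ? 10 : Integer.parseInt(x);
        return "train on " + train + ", " + folds + "-fold cross-validation";
    }

    public static void main(String[] args) {
        String[] options = {"-t", "train.arff", "-x", "5", "-s", "42"};
        System.out.println(describe(options)); // prints train on train.arff, 5-fold cross-validation
    }
}
```

An array shaped like the one in main is what would be passed as the options argument of evaluateClusterer(Clusterer, String[]).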
crossValidateModel
public static java.lang.String crossValidateModel(java.lang.String clustererString,
Instances data,
int numFolds,
java.lang.String[] options,
java.util.Random random)
throws java.lang.Exception
- Performs a cross-validation
for a distribution clusterer on a set of instances.
- Parameters:
clustererString
- a string naming the class of the clusterer
data
- the data on which the cross-validation is to be performed
numFolds
- the number of folds for the cross-validation
options
- the options to the clusterer
random
- a random number generator
- Returns:
- a string containing the cross validated log likelihood
- Throws:
java.lang.Exception
- if a clusterer could not be generated
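A cross-validated log likelihood scores a density model on data it was not fitted to. The sketch below illustrates the idea with a single one-dimensional Gaussian standing in for a distribution clusterer; this stand-in, the per-instance averaging, and the names CvLogLikelihoodSketch and crossValidatedLogLikelihood are assumptions for illustration, not this method's actual computation. It expects numFolds to be at least 2 and no larger than the number of values.

```java
import java.util.Random;

public class CvLogLikelihoodSketch {
    /** Average held-out log likelihood of a Gaussian fitted per fold. */
    public static double crossValidatedLogLikelihood(double[] data, int numFolds, Random random) {
        double[] shuffled = data.clone();
        // Fisher-Yates shuffle driven by the supplied generator.
        for (int i = shuffled.length - 1; i > 0; i--) {
            int j = random.nextInt(i + 1);
            double tmp = shuffled[i];
            shuffled[i] = shuffled[j];
            shuffled[j] = tmp;
        }
        double total = 0.0;
        for (int fold = 0; fold < numFolds; fold++) {
            // Fit the mean on every value outside the current fold.
            double sum = 0.0;
            int n = 0;
            for (int i = 0; i < shuffled.length; i++) {
                if (i % numFolds != fold) {
                    sum += shuffled[i];
                    n++;
                }
            }
            double mean = sum / n;
            // Fit the variance on the same training values.
            double squares = 0.0;
            for (int i = 0; i < shuffled.length; i++) {
                if (i % numFolds != fold) {
                    squares += (shuffled[i] - mean) * (shuffled[i] - mean);
                }
            }
            // Floor the variance so the log density stays finite.
            double variance = Math.max(squares / n, 1e-6);
            // Score the held-out values under the fitted Gaussian.
            for (int i = 0; i < shuffled.length; i++) {
                if (i % numFolds == fold) {
                    double d = shuffled[i] - mean;
                    total += -0.5 * Math.log(2 * Math.PI * variance) - d * d / (2 * variance);
                }
            }
        }
        // Every value is held out exactly once, so this is a per-instance mean.
        return total / shuffled.length;
    }

    public static void main(String[] args) {
        double[] data = {1.0, 1.2, 0.8, 1.1, 0.9, 1.0, 1.3, 0.7, 1.05, 0.95};
        System.out.println(crossValidatedLogLikelihood(data, 5, new Random(1)));
    }
}
```

Higher values mean the fitted density explains unseen data better, which is what makes the figure useful for comparing distribution clusterers.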
main
public static void main(java.lang.String[] args)
- Main method for testing this class.
- Parameters:
args
- the options
Copyright (c)
2003 David Lindsay, Computer Learning Research Centre, Dept. Computer Science, Royal Holloway, University of London