weka.classifiers.trees.j48
Class C45Split

java.lang.Object
  |
  +--weka.classifiers.trees.j48.ClassifierSplitModel
        |
        +--weka.classifiers.trees.j48.C45Split
All Implemented Interfaces:
java.lang.Cloneable, java.io.Serializable

public class C45Split
extends ClassifierSplitModel

Class implementing a C4.5-type split on an attribute.

Version:
$Revision: 1.8 $
Author:
Eibe Frank (eibe@cs.waikato.ac.nz)
See Also:
Serialized Form

Constructor Summary
C45Split(int attIndex, int minNoObj, double sumOfWeights)
          Initializes the split model.
 
Method Summary
 int attIndex()
          Returns index of attribute for which split was generated.
 void buildClassifier(Instances trainInstances)
          Creates a C4.5-type split on the given data.
 double classProb(int classIndex, Instance instance, int theSubset)
          Gets class probability for instance.
 double codingCost()
          Returns coding cost for split (used in rule learner).
 double gainRatio()
          Returns (C4.5-type) gain ratio for the generated split.
 double infoGain()
          Returns (C4.5-type) information gain for the generated split.
 java.lang.String leftSide(Instances data)
          Prints left side of condition.
 double[][] minsAndMaxs(Instances data, double[][] minsAndMaxs, int index)
          Returns the minsAndMaxs of the index-th subset.
 void resetDistribution(Instances data)
          Sets distribution associated with model.
 java.lang.String rightSide(int index, Instances data)
          Prints the condition satisfied by instances in a subset.
 void setSplitPoint(Instances allInstances)
          Sets split point to the greatest value in the given data that is smaller than or equal to the old split point.
 java.lang.String sourceExpression(int index, Instances data)
          Returns a string containing Java source code equivalent to the test made at this node.
 double[] weights(Instance instance)
          Returns weights if instance is assigned to more than one subset.
 int whichSubset(Instance instance)
          Returns index of subset instance is assigned to.
 
Methods inherited from class weka.classifiers.trees.j48.ClassifierSplitModel
checkModel, classifyInstance, classProbLaplace, clone, distribution, dumpLabel, dumpModel, numSubsets, sourceClass, split
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

C45Split

public C45Split(int attIndex,
                int minNoObj,
                double sumOfWeights)
Initializes the split model.

Method Detail

buildClassifier

public void buildClassifier(Instances trainInstances)
                     throws java.lang.Exception
Creates a C4.5-type split on the given data. Assumes that none of the class values is missing.

Specified by:
buildClassifier in class ClassifierSplitModel
Throws:
java.lang.Exception - if something goes wrong

attIndex

public final int attIndex()
Returns index of attribute for which split was generated.


classProb

public final double classProb(int classIndex,
                              Instance instance,
                              int theSubset)
                       throws java.lang.Exception
Gets class probability for instance.

Overrides:
classProb in class ClassifierSplitModel
Throws:
java.lang.Exception - if something goes wrong

codingCost

public final double codingCost()
Returns coding cost for split (used in rule learner).

Overrides:
codingCost in class ClassifierSplitModel

gainRatio

public final double gainRatio()
Returns (C4.5-type) gain ratio for the generated split.


infoGain

public final double infoGain()
Returns (C4.5-type) information gain for the generated split.
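
The following is a minimal, self-contained sketch of how information gain and gain ratio are conventionally computed from per-subset class weights. It is not the code behind infoGain() or gainRatio(): the class name GainRatioSketch and all of its methods are hypothetical, and the sketch ignores missing-value weighting and any further corrections this class may apply.

```java
// Illustrative sketch only; this is not Weka's implementation.
final class GainRatioSketch {

    /** Sum of all entries in a weight vector. */
    static double sum(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    /** Shannon entropy, in bits, of a vector of non-negative weights. */
    static double entropy(double[] weights) {
        double total = sum(weights);
        if (total <= 0.0) {
            return 0.0;
        }
        double h = 0.0;
        for (double w : weights) {
            if (w > 0.0) {
                double p = w / total;
                h -= p * Math.log(p) / Math.log(2.0);
            }
        }
        return h;
    }

    /** Information gain; subsets[i][k] is the weight of class k in subset i. */
    static double infoGain(double[][] subsets) {
        double[] parent = new double[subsets[0].length];
        double[] sizes = new double[subsets.length];
        for (int i = 0; i < subsets.length; i++) {
            sizes[i] = sum(subsets[i]);
            for (int k = 0; k < parent.length; k++) {
                parent[k] += subsets[i][k];
            }
        }
        double total = sum(sizes);
        if (total <= 0.0) {
            return 0.0;
        }
        // Weighted average entropy of the subsets after the split.
        double remainder = 0.0;
        for (int i = 0; i < subsets.length; i++) {
            remainder += (sizes[i] / total) * entropy(subsets[i]);
        }
        return entropy(parent) - remainder;
    }

    /** Gain ratio: information gain divided by the entropy of the subset sizes. */
    static double gainRatio(double[][] subsets) {
        double[] sizes = new double[subsets.length];
        for (int i = 0; i < subsets.length; i++) {
            sizes[i] = sum(subsets[i]);
        }
        double splitInfo = entropy(sizes);
        return splitInfo > 0.0 ? infoGain(subsets) / splitInfo : 0.0;
    }
}
```

For a split that separates two equally frequent classes into two pure subsets, both methods of the sketch return 1.0.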


leftSide

public final java.lang.String leftSide(Instances data)
Prints left side of condition.

Specified by:
leftSide in class ClassifierSplitModel
Parameters:
data - training set.

rightSide

public final java.lang.String rightSide(int index,
                                        Instances data)
Prints the condition satisfied by instances in a subset.

Specified by:
rightSide in class ClassifierSplitModel
Parameters:
index - index of the subset
data - training set.

sourceExpression

public final java.lang.String sourceExpression(int index,
                                               Instances data)
Returns a string containing Java source code equivalent to the test made at this node. The instance being tested is called "i".

Specified by:
sourceExpression in class ClassifierSplitModel
Parameters:
index - index of the nominal value tested
data - the data containing instance structure info
Returns:
the Java source expression as a String

setSplitPoint

public final void setSplitPoint(Instances allInstances)
Sets split point to the greatest value in the given data that is smaller than or equal to the old split point. (C4.5 does this for some strange reason).
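
The adjustment described above can be sketched on a plain array of attribute values. The class SplitPointSketch and its method snapSplitPoint are hypothetical names, not part of this API. Two further assumptions of the sketch: NaN stands in for a missing value, and the old split point is kept when no value qualifies.

```java
// Illustrative sketch only; this is not Weka's implementation.
final class SplitPointSketch {

    /**
     * Returns the greatest value in values that is smaller than or equal to
     * oldSplit. NaN entries are skipped as missing (an assumption of this sketch).
     * If no value qualifies, oldSplit is returned unchanged (also an assumption).
     */
    static double snapSplitPoint(double[] values, double oldSplit) {
        boolean found = false;
        double best = oldSplit;
        for (double v : values) {
            if (Double.isNaN(v)) {
                continue;
            }
            if (v <= oldSplit && (!found || v > best)) {
                best = v;
                found = true;
            }
        }
        return best;
    }
}
```

With values {1.0, 2.0, 3.0, 4.0} and an old split point of 2.5, the sketch moves the split point to 2.0, so the threshold coincides with a value that actually occurs in the data.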


minsAndMaxs

public final double[][] minsAndMaxs(Instances data,
                                    double[][] minsAndMaxs,
                                    int index)
Returns the minsAndMaxs of the index-th subset.


resetDistribution

public void resetDistribution(Instances data)
                       throws java.lang.Exception
Sets distribution associated with model.

Overrides:
resetDistribution in class ClassifierSplitModel
Throws:
java.lang.Exception

weights

public final double[] weights(Instance instance)
Returns weights if instance is assigned to more than one subset. Returns null if instance is only assigned to one subset.

Specified by:
weights in class ClassifierSplitModel

whichSubset

public final int whichSubset(Instance instance)
                      throws java.lang.Exception
Returns index of subset instance is assigned to. Returns -1 if instance is assigned to more than one subset.

Specified by:
whichSubset in class ClassifierSplitModel
Throws:
java.lang.Exception - if something goes wrong
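
The contract shared by weights and whichSubset (an index for a single subset, or -1 plus a weight vector when the instance falls into several subsets) can be sketched as follows. This is not the code of this class. The names in SubsetAssignmentSketch are hypothetical, and the sketch assumes a binary split on a numeric attribute, a missing value (NaN here) as the reason for assigning an instance to more than one subset, and weights proportional to the training weight of each subset.

```java
// Illustrative sketch only; this is not Weka's implementation.
final class SubsetAssignmentSketch {

    /** Returns 0 or 1 for a known value, or -1 if the value is missing (NaN). */
    static int whichSubset(double value, double splitPoint) {
        if (Double.isNaN(value)) {
            return -1; // instance belongs to more than one subset
        }
        return value <= splitPoint ? 0 : 1;
    }

    /**
     * Returns null for a known value. For a missing value, returns each
     * subset's share of the total training weight (an assumption of this sketch).
     */
    static double[] weights(double value, double[] subsetWeights) {
        if (!Double.isNaN(value)) {
            return null; // instance belongs to exactly one subset
        }
        double total = 0.0;
        for (double w : subsetWeights) {
            total += w;
        }
        double[] shares = new double[subsetWeights.length];
        for (int i = 0; i < shares.length; i++) {
            shares[i] = subsetWeights[i] / total;
        }
        return shares;
    }
}
```

A caller would typically check whichSubset first and consult weights only when it returns -1.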


Copyright (c) 2003 David Lindsay, Computer Learning Research Centre, Dept. Computer Science, Royal Holloway, University of London