weka.filters.unsupervised.attribute
Class RandomProjection

java.lang.Object
  |
  +--weka.filters.Filter
        |
        +--weka.filters.unsupervised.attribute.RandomProjection
All Implemented Interfaces:
OptionHandler, java.io.Serializable, UnsupervisedFilter

public class RandomProjection
extends Filter
implements UnsupervisedFilter, OptionHandler

Reduces the dimensionality of the data by projecting it onto a lower-dimensional subspace using a random matrix with columns of unit length. This reduces the number of attributes in the data while preserving much of its variation, as PCA does, but at a much lower computational cost.
It first applies the NominalToBinary filter to convert all attributes to numeric before reducing the dimension. It preserves the class attribute.
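
The projection described above can be sketched in plain Java. This is a minimal illustration, not the filter's actual implementation: the class and method names (RandomProjectionSketch, randomMatrix, project) are hypothetical, the matrix entries are drawn from a Gaussian for simplicity, and handling of the class attribute and of nominal attributes is left out.

```java
import java.util.Random;

public class RandomProjectionSketch {

    /** Builds a d-by-k random matrix whose columns are scaled to unit length. */
    static double[][] randomMatrix(int d, int k, long seed) {
        Random rnd = new Random(seed);
        double[][] r = new double[d][k];
        for (int j = 0; j < k; j++) {
            double sumSq = 0.0;
            for (int i = 0; i < d; i++) {
                r[i][j] = rnd.nextGaussian();   // assumption: Gaussian entries
                sumSq += r[i][j] * r[i][j];
            }
            double norm = Math.sqrt(sumSq);
            for (int i = 0; i < d; i++) {
                r[i][j] /= norm;                // column j now has Euclidean length 1
            }
        }
        return r;
    }

    /** Projects one d-dimensional row onto the k columns of r. */
    static double[] project(double[] x, double[][] r) {
        int k = r[0].length;
        double[] y = new double[k];
        for (int j = 0; j < k; j++) {
            for (int i = 0; i < x.length; i++) {
                y[j] += x[i] * r[i][j];
            }
        }
        return y;
    }

    public static void main(String[] args) {
        double[][] r = randomMatrix(10, 3, 42L);
        double[] x = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        double[] y = project(x, r);
        System.out.println("Reduced from " + x.length + " to " + y.length + " values");
    }
}
```

Each output value is the dot product of the input row with one unit-length column, so the cost per instance is proportional to d times k, with no eigen-decomposition as in PCA.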

Valid filter-specific options are:

-N
The number of dimensions (attributes) the data should be reduced to (exclusive of the class attribute).

-P
The percentage of dimensions (attributes) the data should be reduced to (exclusive of the class attribute). The -N option is ignored if this option is present or greater than zero.

-D
The distribution to use for calculating the random matrix.

  • 1 - Sparse distribution of: (default)
    sqrt(3)*{+1 with prob(1/6), 0 with prob(2/3), -1 with prob(1/6)}
  • 2 - Sparse distribution of:
    {+1 with prob(1/2), -1 with prob(1/2)}
  • 3 - Gaussian distribution

-M
Replace missing values using the ReplaceMissingValues filter.

-R
Specify the random seed for the random number generator for calculating the random matrix.
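
The three -D distributions can be sampled as in the following sketch. It is an illustration under assumptions rather than the filter's own code: the class and method names are hypothetical, and option 3 is assumed to be a standard Gaussian (mean 0, variance 1), which the option text does not state.

```java
import java.util.Random;

public class DistributionSketch {

    /** Distribution 1: sqrt(3) * {+1 with prob 1/6, 0 with prob 2/3, -1 with prob 1/6}. */
    static double sparse1(Random rnd) {
        int u = rnd.nextInt(6);              // uniform over 0..5
        if (u == 0) return Math.sqrt(3.0);   // 1 of 6 outcomes
        if (u == 1) return -Math.sqrt(3.0);  // 1 of 6 outcomes
        return 0.0;                          // remaining 4 of 6 outcomes, prob 2/3
    }

    /** Distribution 2: {+1 with prob 1/2, -1 with prob 1/2}. */
    static double sparse2(Random rnd) {
        return rnd.nextBoolean() ? 1.0 : -1.0;
    }

    /** Distribution 3: Gaussian (assumed standard normal here). */
    static double gaussian(Random rnd) {
        return rnd.nextGaussian();
    }

    /** Draws one matrix entry for the given -D value (1, 2 or 3). */
    static double sample(int distribution, Random rnd) {
        switch (distribution) {
            case 1: return sparse1(rnd);
            case 2: return sparse2(rnd);
            case 3: return gaussian(rnd);
            default:
                throw new IllegalArgumentException("Unknown distribution: " + distribution);
        }
    }

    public static void main(String[] args) {
        Random rnd = new Random(1L);         // fixed seed, analogous to -R
        for (int d = 1; d <= 3; d++) {
            System.out.println("distribution " + d + ": " + sample(d, rnd));
        }
    }
}
```

With distribution 1, two thirds of the matrix entries are zero on average, so most multiplications in the projection can be skipped.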

    Version:
    1.0 - 22 July 2003 - Initial version (Ashraf M. Kibriya)
    Author:
    Ashraf M. Kibriya (amk14@cs.waikato.ac.nz)
    See Also:
    Serialized Form

    Field Summary
    static Tag[] TAGS_DSTRS_TYPE
               
     
    Constructor Summary
    RandomProjection()
               
     
    Method Summary
     boolean batchFinished()
              Signify that this batch of input to the filter is finished.
     java.lang.String distributionTipText()
              Returns the tip text for this property
     SelectedTag getDistribution()
              Returns the distribution currently used for calculating the random matrix
     int getNumberOfAttributes()
              Gets the current number of attributes (dimensionality) to which the data will be reduced.
     java.lang.String[] getOptions()
              Gets the current settings of the filter.
     double getPercent()
              Gets the percentage of attributes (dimensions) the data will be reduced to
     long getRandomSeed()
              Gets the random seed of the random number generator
     boolean getReplaceMissingValues()
              Gets the current setting for using ReplaceMissingValues filter
     java.lang.String globalInfo()
              Returns a string describing this filter
     boolean input(Instance instance)
              Input an instance for filtering.
     java.util.Enumeration listOptions()
              Returns an enumeration describing the available options.
    static void main(java.lang.String[] argv)
              Main method for testing this class.
     java.lang.String numberOfAttributesTipText()
              Returns the tip text for this property
     java.lang.String percentTipText()
              Returns the tip text for this property
     java.lang.String randomSeedTipText()
              Returns the tip text for this property
     java.lang.String replaceMissingValuesTipText()
              Returns the tip text for this property
     void setDistribution(SelectedTag newDstr)
              Sets the distribution to use for calculating the random matrix
     boolean setInputFormat(Instances instanceInfo)
              Sets the format of the input instances.
     void setNumberOfAttributes(int newAttNum)
              Sets the number of attributes (dimensions) the data should be reduced to
     void setOptions(java.lang.String[] options)
              Parses the options for this object.
     void setPercent(double newPercent)
              Sets the percentage of attributes (dimensions) the data should be reduced to
     void setRandomSeed(long seed)
              Sets the random seed of the random number generator
     void setReplaceMissingValues(boolean t)
              Sets whether to use the ReplaceMissingValues filter
     
    Methods inherited from class weka.filters.Filter
    batchFilterFile, filterFile, getOutputFormat, inputFormat, isOutputFormatDefined, numPendingOutput, output, outputFormat, outputPeek, useFilter
     
    Methods inherited from class java.lang.Object
    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
     

    Field Detail

    TAGS_DSTRS_TYPE

    public static final Tag[] TAGS_DSTRS_TYPE
    Constructor Detail

    RandomProjection

    public RandomProjection()
    Method Detail

    listOptions

    public java.util.Enumeration listOptions()
    Returns an enumeration describing the available options.

    Specified by:
    listOptions in interface OptionHandler
    Returns:
    an enumeration of all the available options.

    setOptions

    public void setOptions(java.lang.String[] options)
                    throws java.lang.Exception
    Parses the options for this object. Valid options are:

    -N
    The number of dimensions (attributes) the data should be reduced to (exclusive of the class attribute).

    -P
    The percentage of dimensions (attributes) the data should be reduced to (exclusive of the class attribute). The -N option is ignored if this option is present or greater than zero.

    -D
    The distribution to use for calculating the random matrix.

    • 1 - Sparse distribution of: (default)
      sqrt(3)*{+1 with prob(1/6), 0 with prob(2/3), -1 with prob(1/6)}
    • 2 - Sparse distribution of:
      {+1 with prob(1/2), -1 with prob(1/2)}
    • 3 - Gaussian distribution

    -M
    Replace missing values using the ReplaceMissingValues filter.

    -R
    Specify the random seed for the random number generator for calculating the random matrix.

    Specified by:
    setOptions in interface OptionHandler
    Parameters:
    options - the list of options as an array of strings
    Throws:
    java.lang.Exception - if an option is not supported

    getOptions

    public java.lang.String[] getOptions()
    Gets the current settings of the filter.

    Specified by:
    getOptions in interface OptionHandler
    Returns:
    an array of strings suitable for passing to setOptions

    globalInfo

    public java.lang.String globalInfo()
    Returns a string describing this filter

    Returns:
    a description of the filter suitable for displaying in the explorer/experimenter gui

    numberOfAttributesTipText

    public java.lang.String numberOfAttributesTipText()
    Returns the tip text for this property

    Returns:
    tip text for this property suitable for displaying in the explorer/experimenter gui

    setNumberOfAttributes

    public void setNumberOfAttributes(int newAttNum)
    Sets the number of attributes (dimensions) the data should be reduced to


    getNumberOfAttributes

    public int getNumberOfAttributes()
    Gets the current number of attributes (dimensionality) to which the data will be reduced.


    percentTipText

    public java.lang.String percentTipText()
    Returns the tip text for this property

    Returns:
    tip text for this property suitable for displaying in the explorer/experimenter gui

    setPercent

    public void setPercent(double newPercent)
    Sets the percentage of attributes (dimensions) the data should be reduced to


    getPercent

    public double getPercent()
    Gets the percentage of attributes (dimensions) the data will be reduced to
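
    How the percentage and the fixed attribute count interact can be sketched as follows. This is a hedged illustration only: the class and method names are hypothetical, the percentage is assumed to be given on a 0-100 scale, and truncating the result to an integer is an assumption about rounding that this page does not confirm.

```java
public class TargetDimensionSketch {

    /**
     * Resolves the number of output attributes (excluding the class attribute).
     * A percentage greater than zero takes precedence over the fixed count n.
     */
    static int targetDimension(int numAttributes, boolean hasClass, int n, double percent) {
        // The class attribute is not counted among the dimensions to reduce.
        int nonClass = hasClass ? numAttributes - 1 : numAttributes;
        if (percent > 0) {
            // Assumption: percent is on a 0-100 scale and the result is truncated.
            return (int) (nonClass * percent / 100.0);
        }
        return n;
    }

    public static void main(String[] args) {
        // 100 non-class attributes plus a class attribute.
        System.out.println(targetDimension(101, true, 10, 50.0));
        System.out.println(targetDimension(101, true, 10, 0.0));
    }
}
```

    In this sketch, a data set with 100 non-class attributes and a percentage of 50 is reduced to 50 attributes, while a percentage of 0 falls back to the fixed count.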


    randomSeedTipText

    public java.lang.String randomSeedTipText()
    Returns the tip text for this property

    Returns:
    tip text for this property suitable for displaying in the explorer/experimenter gui

    setRandomSeed

    public void setRandomSeed(long seed)
    Sets the random seed of the random number generator


    getRandomSeed

    public long getRandomSeed()
    Gets the random seed of the random number generator


    distributionTipText

    public java.lang.String distributionTipText()
    Returns the tip text for this property

    Returns:
    tip text for this property suitable for displaying in the explorer/experimenter gui

    setDistribution

    public void setDistribution(SelectedTag newDstr)
    Sets the distribution to use for calculating the random matrix


    getDistribution

    public SelectedTag getDistribution()
    Returns the distribution currently used for calculating the random matrix


    replaceMissingValuesTipText

    public java.lang.String replaceMissingValuesTipText()
    Returns the tip text for this property

    Returns:
    tip text for this property suitable for displaying in the explorer/experimenter gui

    setReplaceMissingValues

    public void setReplaceMissingValues(boolean t)
    Sets whether to use the ReplaceMissingValues filter


    getReplaceMissingValues

    public boolean getReplaceMissingValues()
    Gets the current setting for using ReplaceMissingValues filter


    setInputFormat

    public boolean setInputFormat(Instances instanceInfo)
                           throws java.lang.Exception
    Sets the format of the input instances.

    Overrides:
    setInputFormat in class Filter
    Parameters:
    instanceInfo - an Instances object containing the input instance structure (any instances contained in the object are ignored - only the structure is required).
    Returns:
    true if the outputFormat may be collected immediately
    Throws:
    java.lang.Exception - if the input format can't be set successfully

    input

    public boolean input(Instance instance)
                  throws java.lang.Exception
    Input an instance for filtering.

    Overrides:
    input in class Filter
    Parameters:
    instance - the input instance
    Returns:
    true if the filtered instance may now be collected with output().
    Throws:
    java.lang.IllegalStateException - if no input format has been set
    java.lang.Exception - if the input instance was not of the correct format or if there was a problem with the filtering.

    batchFinished

    public boolean batchFinished()
                          throws java.lang.Exception
    Signify that this batch of input to the filter is finished.

    Overrides:
    batchFinished in class Filter
    Returns:
    true if there are instances pending output
    Throws:
    java.lang.NullPointerException - if no input structure has been defined
    java.lang.Exception - if there was a problem finishing the batch.

    main

    public static void main(java.lang.String[] argv)
    Main method for testing this class.

    Parameters:
    argv - should contain arguments to the filter: use -h for help


    Copyright (c) 2003 David Lindsay, Computer Learning Research Centre, Dept. Computer Science, Royal Holloway, University of London