weka.experiment
Class PairedTTester

java.lang.Object
  |
  +--weka.experiment.PairedTTester
All Implemented Interfaces:
OptionHandler
Direct Known Subclasses:
PairedCorrectedTTester

public class PairedTTester
extends java.lang.Object
implements OptionHandler

Calculates T-Test statistics on data stored in a set of instances.

Valid options from the command-line are:

-D num,num2...
The column numbers that uniquely specify a dataset. (default last)

-R num
The column number containing the run number. (default last)

-F num
The column number containing the fold number. (default none)

-S num
The significance level for T-Tests. (default 0.05)

-G num,num2...
The column numbers that uniquely specify one result generator (e.g. scheme name plus options). (default last)

Version:
$Revision: 1.17 $
Author:
Len Trigg (trigg@cs.waikato.ac.nz)

Constructor Summary
PairedTTester()
           
 
Method Summary
 PairedStats calculateStatistics(Instance datasetSpecifier, int resultset1Index, int resultset2Index, int comparisonColumn)
          Computes a paired t-test comparison for a specified dataset between two resultsets.
 Range getDatasetKeyColumns()
          Get the value of DatasetKeyColumns.
 int getFoldColumn()
          Get the value of FoldColumn.
 Instances getInstances()
          Get the value of Instances.
 int getNumDatasets()
          Gets the number of datasets in the resultsets.
 int getNumResultsets()
          Gets the number of resultsets in the data.
 java.lang.String[] getOptions()
          Gets current settings of the PairedTTester.
 boolean getProduceLatex()
          Get whether LaTeX is output.
 Range getResultsetKeyColumns()
          Get the value of ResultsetKeyColumns.
 java.lang.String getResultsetName(int index)
          Gets a string descriptive of the specified resultset.
 int getRunColumn()
          Get the value of RunColumn.
 boolean getShowStdDevs()
          Returns true if standard deviations have been requested.
 double getSignificanceLevel()
          Get the value of SignificanceLevel.
 java.lang.String header(int comparisonColumn)
          Creates a "header" string describing the current resultsets.
 java.util.Enumeration listOptions()
          Lists options understood by this object.
static void main(java.lang.String[] args)
          Test the class from the command line.
 java.lang.String multiResultsetFull(int baseResultset, int comparisonColumn)
          Creates a comparison table where a base resultset is compared to the other resultsets.
 java.lang.String multiResultsetRanking(int comparisonColumn)
           
 java.lang.String multiResultsetSummary(int comparisonColumn)
          Carries out a comparison between all resultsets, counting the number of datasets where one resultset outperforms the other.
 int[][] multiResultsetWins(int comparisonColumn)
          Carries out a comparison between all resultsets, counting the number of datasets where one resultset outperforms the other.
 java.lang.String resultsetKey()
          Creates a key that maps resultset numbers to their descriptions.
 void setDatasetKeyColumns(Range newDatasetKeyColumns)
          Set the value of DatasetKeyColumns.
 void setFoldColumn(int newFoldColumn)
          Set the value of FoldColumn.
 void setInstances(Instances newInstances)
          Set the value of Instances.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setProduceLatex(boolean l)
          Set whether LaTeX is output.
 void setResultsetKeyColumns(Range newResultsetKeyColumns)
          Set the value of ResultsetKeyColumns.
 void setRunColumn(int newRunColumn)
          Set the value of RunColumn.
 void setShowStdDevs(boolean s)
          Set whether standard deviations are displayed or not.
 void setSignificanceLevel(double newSignificanceLevel)
          Set the value of SignificanceLevel.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PairedTTester

public PairedTTester()
Method Detail

setProduceLatex

public void setProduceLatex(boolean l)
Set whether LaTeX is output.

Parameters:
l - true if tables are to be produced in LaTeX format

getProduceLatex

public boolean getProduceLatex()
Get whether LaTeX is output.

Returns:
true if LaTeX is to be output

setShowStdDevs

public void setShowStdDevs(boolean s)
Set whether standard deviations are displayed or not.

Parameters:
s - true if standard deviations are to be displayed

getShowStdDevs

public boolean getShowStdDevs()
Returns true if standard deviations have been requested.

Returns:
true if standard deviations are to be displayed.

getNumDatasets

public int getNumDatasets()
Gets the number of datasets in the resultsets.

Returns:
the number of datasets in the resultsets

getNumResultsets

public int getNumResultsets()
Gets the number of resultsets in the data.

Returns:
the number of resultsets in the data

getResultsetName

public java.lang.String getResultsetName(int index)
Gets a string descriptive of the specified resultset.

Parameters:
index - the index of the resultset
Returns:
a descriptive string for the resultset

calculateStatistics

public PairedStats calculateStatistics(Instance datasetSpecifier,
                                       int resultset1Index,
                                       int resultset2Index,
                                       int comparisonColumn)
                                throws java.lang.Exception
Computes a paired t-test comparison for a specified dataset between two resultsets.

Parameters:
datasetSpecifier - the dataset specifier
resultset1Index - the index of the first resultset
resultset2Index - the index of the second resultset
comparisonColumn - the column containing values to compare
Returns:
the results of the paired comparison
Throws:
java.lang.Exception - if an error occurs
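Example:
The paired comparison can be pictured with a minimal sketch of the standard paired t statistic. This is not the Weka implementation and does not use PairedStats; the class name PairedTSketch and the method tStatistic are hypothetical, and the two arrays are assumed to hold the comparison-column values of the two resultsets, matched by run (and fold, if a fold column is set).

```java
/** Hypothetical helper for illustration; not part of weka.experiment. */
final class PairedTSketch {

    /**
     * Returns the paired t statistic for two matched samples:
     * mean(d) / sqrt(var(d) / n), where d[i] = x[i] - y[i] and
     * var is the sample variance (divisor n - 1).
     * If all differences are identical the variance is zero and the
     * result is infinite or NaN.
     */
    static double tStatistic(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException(
                "need two equal-length samples with at least 2 values");
        }
        int n = x.length;

        // Mean of the pairwise differences.
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += x[i] - y[i];
        }
        double mean = sum / n;

        // Sample variance of the differences.
        double sumSquares = 0.0;
        for (int i = 0; i < n; i++) {
            double dev = (x[i] - y[i]) - mean;
            sumSquares += dev * dev;
        }
        double variance = sumSquares / (n - 1);

        return mean / Math.sqrt(variance / n);
    }
}
```

The statistic would then be compared against a Student t distribution with n - 1 degrees of freedom at the configured significance level; that lookup is omitted here.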

resultsetKey

public java.lang.String resultsetKey()
Creates a key that maps resultset numbers to their descriptions.

Returns:
the key as a string

header

public java.lang.String header(int comparisonColumn)
Creates a "header" string describing the current resultsets.

Parameters:
comparisonColumn - the index of the comparison column
Returns:
the header string

multiResultsetWins

public int[][] multiResultsetWins(int comparisonColumn)
                           throws java.lang.Exception
Carries out a comparison between all resultsets, counting the number of datasets where one resultset outperforms the other.

Parameters:
comparisonColumn - the index of the comparison column
Returns:
a 2d array where element [i][j] is the number of times resultset j performed significantly better than resultset i.
Throws:
java.lang.Exception - if an error occurs
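Example:
The [i][j] layout of the returned array can be illustrated with a small standalone sketch. It is not the Weka implementation: the class WinsSketch, the method countWins, the tStats input, and the caller-supplied critical value are all assumptions made for illustration, and larger values in the comparison column are assumed to be better.

```java
/** Hypothetical helper for illustration; not part of weka.experiment. */
final class WinsSketch {

    /**
     * Counts significant wins per pair of resultsets.
     *
     * tStats[d][i][j] is assumed to hold the paired t statistic of
     * (resultset j minus resultset i) on dataset d. Resultset j is counted
     * as significantly better than resultset i on that dataset when the
     * statistic exceeds criticalValue.
     *
     * Returns wins, where wins[i][j] is the number of datasets on which
     * resultset j was significantly better than resultset i.
     */
    static int[][] countWins(double[][][] tStats, double criticalValue) {
        int k = tStats.length == 0 ? 0 : tStats[0].length;
        int[][] wins = new int[k][k];
        for (double[][] dataset : tStats) {
            for (int i = 0; i < k; i++) {
                for (int j = 0; j < k; j++) {
                    // A resultset is never compared with itself.
                    if (i != j && dataset[i][j] > criticalValue) {
                        wins[i][j]++;
                    }
                }
            }
        }
        return wins;
    }
}
```

Reading along row i gives the resultsets that beat resultset i; reading down column j gives the resultsets that resultset j beats.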

multiResultsetSummary

public java.lang.String multiResultsetSummary(int comparisonColumn)
                                       throws java.lang.Exception
Carries out a comparison between all resultsets, counting the number of datasets where one resultset outperforms the other. The results are summarized in a table.

Parameters:
comparisonColumn - the index of the comparison column
Returns:
the results in a string
Throws:
java.lang.Exception - if an error occurs

multiResultsetRanking

public java.lang.String multiResultsetRanking(int comparisonColumn)
                                       throws java.lang.Exception
Throws:
java.lang.Exception

multiResultsetFull

public java.lang.String multiResultsetFull(int baseResultset,
                                           int comparisonColumn)
                                    throws java.lang.Exception
Creates a comparison table where a base resultset is compared to the other resultsets. Results are presented for every dataset.

Parameters:
baseResultset - the index of the base resultset
comparisonColumn - the index of the column to compare over
Returns:
the comparison table string
Throws:
java.lang.Exception - if an error occurs

listOptions

public java.util.Enumeration listOptions()
Lists options understood by this object.

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of Options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-D num,num2...
The column numbers that uniquely specify a dataset. (default last)

-R num
The column number containing the run number. (default last)

-F num
The column number containing the fold number. (default none)

-S num
The significance level for T-Tests. (default 0.05)

-G num,num2...
The column numbers that uniquely specify one result generator (e.g. scheme name plus options). (default last)

-V
Show standard deviations

-L
Produce comparison tables in LaTeX table format

Specified by:
setOptions in interface OptionHandler
Parameters:
options - an array containing options to set.
Throws:
java.lang.Exception - if invalid options are given
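Example:
The options array mixes flags that take a value (-D, -R, -F, -S, -G) with bare switches (-V, -L). The sketch below builds one such array and reads it back with two small helpers. The helpers and the class OptionsSketch are hypothetical and are not how setOptions parses its input; the column numbers are made-up values chosen only to show the format.

```java
/** Hypothetical helper for illustration; not part of weka.experiment. */
final class OptionsSketch {

    /** An options array in the documented format; column numbers are invented. */
    static String[] example() {
        return new String[] {
            "-D", "1",      // dataset key column(s)
            "-G", "2,3",    // result generator key columns
            "-R", "4",      // run column
            "-S", "0.01",   // significance level
            "-V"            // show standard deviations
        };
    }

    /** Returns the value that follows the given flag, or null if the flag is absent. */
    static String valueOf(String flag, String[] options) {
        for (int i = 0; i < options.length - 1; i++) {
            if (options[i].equals(flag)) {
                return options[i + 1];
            }
        }
        return null;
    }

    /** Returns true if the bare switch is present. */
    static boolean hasFlag(String flag, String[] options) {
        for (String option : options) {
            if (option.equals(flag)) {
                return true;
            }
        }
        return false;
    }
}
```

An array shaped like the one returned by example() is what would be passed to setOptions; omitted flags fall back to the defaults listed above.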

getOptions

public java.lang.String[] getOptions()
Gets current settings of the PairedTTester.

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings containing current options.

getResultsetKeyColumns

public Range getResultsetKeyColumns()
Get the value of ResultsetKeyColumns.

Returns:
Value of ResultsetKeyColumns.

setResultsetKeyColumns

public void setResultsetKeyColumns(Range newResultsetKeyColumns)
Set the value of ResultsetKeyColumns.

Parameters:
newResultsetKeyColumns - Value to assign to ResultsetKeyColumns.

getSignificanceLevel

public double getSignificanceLevel()
Get the value of SignificanceLevel.

Returns:
Value of SignificanceLevel.

setSignificanceLevel

public void setSignificanceLevel(double newSignificanceLevel)
Set the value of SignificanceLevel.

Parameters:
newSignificanceLevel - Value to assign to SignificanceLevel.

getDatasetKeyColumns

public Range getDatasetKeyColumns()
Get the value of DatasetKeyColumns.

Returns:
Value of DatasetKeyColumns.

setDatasetKeyColumns

public void setDatasetKeyColumns(Range newDatasetKeyColumns)
Set the value of DatasetKeyColumns.

Parameters:
newDatasetKeyColumns - Value to assign to DatasetKeyColumns.

getRunColumn

public int getRunColumn()
Get the value of RunColumn.

Returns:
Value of RunColumn.

setRunColumn

public void setRunColumn(int newRunColumn)
Set the value of RunColumn.

Parameters:
newRunColumn - Value to assign to RunColumn.

getFoldColumn

public int getFoldColumn()
Get the value of FoldColumn.

Returns:
Value of FoldColumn.

setFoldColumn

public void setFoldColumn(int newFoldColumn)
Set the value of FoldColumn.

Parameters:
newFoldColumn - Value to assign to FoldColumn.

getInstances

public Instances getInstances()
Get the value of Instances.

Returns:
Value of Instances.

setInstances

public void setInstances(Instances newInstances)
Set the value of Instances.

Parameters:
newInstances - Value to assign to Instances.

main

public static void main(java.lang.String[] args)
Test the class from the command line.

Parameters:
args - the command-line options for the t-tests


Copyright (c) 2003 David Lindsay, Computer Learning Research Centre, Dept. Computer Science, Royal Holloway, University of London