FeatureFilterLite

Author: Loretta Auvil
Creation date: 06/01/2011
Firing policy: all
Package: org.seasr.meandre.components.transform.table

DESCRIPTION

Overview: This module scans the input SparseTable and removes every term (column) whose support falls outside the specified range. This can greatly reduce the total number of features used for learning, which can improve both accuracy and performance.
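The range check described in the overview can be sketched as follows. This is a minimal illustration, not the component's actual code: it assumes that "support" means the percentage of rows in which a column has an entry, and the class and method names (SupportFilterSketch, supportPercent, columnsToKeep) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class SupportFilterSketch {

    /** Percent of rows in which a column has an entry (assumed definition of support). */
    static double supportPercent(int entryCount, int numRows) {
        return numRows == 0 ? 0.0 : 100.0 * entryCount / numRows;
    }

    /**
     * Returns the indices of the columns whose support lies within
     * [lowerBound, upperBound], both given in percent.
     * entryCounts[c] is the number of rows that have an entry in column c.
     */
    static List<Integer> columnsToKeep(int[] entryCounts, int numRows,
                                       double lowerBound, double upperBound) {
        List<Integer> kept = new ArrayList<>();
        for (int c = 0; c < entryCounts.length; c++) {
            double support = supportPercent(entryCounts[c], numRows);
            // Columns below the lower bound or above the upper bound are dropped.
            if (support >= lowerBound && support <= upperBound) {
                kept.add(c);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Four columns over 10 rows: supports are 10%, 50%, 100% and 0%.
        int[] counts = {1, 5, 10, 0};
        System.out.println(columnsToKeep(counts, 10, 20.0, 90.0)); // prints [1]
    }
}
```

With a lower bound of 20 and an upper bound of 90, only the column with 50% support survives.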

Data Type Restrictions: The input Table must be an instance of a SparseTable.

Data Handling: A new SparseTable instance is created, and only the columns to be kept are added to it.

Scalability: The module creates a second table of the same order of size as the original. Columns from the first table are inserted directly into the second table; no copies are made. The algorithm makes one pass over the table columns and one pass over the table data.
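To illustrate what inserting columns without copying means, here is a small sketch. The SparseColumn class and the filter method are hypothetical stand-ins, not the SparseTable API; the point is only that the filtered list holds references to the original column objects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ColumnReuseSketch {

    /** Hypothetical sparse column: maps a row index to the value stored there. */
    static final class SparseColumn {
        final String label;
        final Map<Integer, Double> entries = new HashMap<>();

        SparseColumn(String label) {
            this.label = label;
        }
    }

    /**
     * Makes a single pass over the columns and collects those whose entry
     * count lies within [minEntries, maxEntries]. Kept columns are added
     * by reference, so no column data is duplicated.
     */
    static List<SparseColumn> filter(List<SparseColumn> columns,
                                     int minEntries, int maxEntries) {
        List<SparseColumn> kept = new ArrayList<>();
        for (SparseColumn col : columns) {
            int n = col.entries.size();
            if (n >= minEntries && n <= maxEntries) {
                kept.add(col); // same object as in the input list
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        SparseColumn rare = new SparseColumn("rare");
        rare.entries.put(0, 1.0);
        SparseColumn common = new SparseColumn("common");
        common.entries.put(0, 1.0);
        common.entries.put(1, 1.0);

        List<SparseColumn> kept = filter(List.of(rare, common), 2, 10);
        System.out.println(kept.get(0) == common); // prints true
    }
}
```

Under this design the second structure costs only one reference per kept column, while the entry data stays shared with the input.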

INPUTS

Name Description Example
sparseTable
The input data table for transformation.
TYPE: org.seasr.datatypes.datamining.table.sparse.SparseTable
 

OUTPUTS

Name Description Example
sparseTable
The resulting modified table.
TYPE: org.seasr.datatypes.datamining.table.sparse.SparseTable
 
error
This port is used to output any unhandled errors encountered during the execution of this component.
 

PROPERTIES

Name Description Default value
removeColumnsWithOnlyOneEntry
Remove Columns With Only One Entry: If a column in a sparse table has only one entry, remove that column. NOTE: If the lower bound is set to a positive value, this property is ignored.
true
verbose
Verbose output.
false
upperBoundSupport
Percent Support for Upper Bounds Cutoff: The percent of support above which a given feature (column) will be removed. NOTE: If this property is set to a positive value, the "removeColumnsWithAllEntries" property is ignored.
100
_debug_level
Controls the verbosity of debug messages printed by the component during execution.
Possible values are: off, severe, warning, info, config, fine, finer, finest, all
Append ',mirror' to any of the values above to mirror that output to the server logs.
info
lowerBoundSupport
Percent Support for Lower Bounds Cutoff: The percent of support below which a given feature (column) will be removed. NOTE: If this property is set to a positive value, the "removeColumnsWithOnlyOneEntry" property is ignored.
0
removeColumnsWithAllEntries
Remove Columns With All Entries Present: If a column in a sparse table has every possible entry, remove that column. NOTE: If the upper bound is set to a positive value, this property is ignored.
true
_ignore_errors
Set to 'true' to ignore all unhandled exceptions and prevent the flow from being terminated. Setting this property to 'false' will result in the flow being terminated if an unhandled exception is thrown during the execution of this component.
false
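The interaction between the two bound properties and the two "remove" flags can be sketched as below. This is a literal reading of the NOTEs in the table above, not the component's source: the Settings holder and shouldRemove method are hypothetical, and support is assumed to be the percentage of rows with an entry. Under this reading the default upperBoundSupport of 100 already counts as positive, so removeColumnsWithAllEntries only takes effect when the upper bound is 0; the real component may behave differently.

```java
class FilterRulesSketch {

    /** Hypothetical holder for the four filtering properties, with the documented defaults. */
    static final class Settings {
        double lowerBoundSupport = 0;
        double upperBoundSupport = 100;
        boolean removeColumnsWithOnlyOneEntry = true;
        boolean removeColumnsWithAllEntries = true;
    }

    /** Decides whether a column with entryCount entries over numRows rows is removed. */
    static boolean shouldRemove(int entryCount, int numRows, Settings s) {
        double support = numRows == 0 ? 0.0 : 100.0 * entryCount / numRows;

        // Lower side: a positive lower bound takes precedence over the one-entry flag.
        if (s.lowerBoundSupport > 0) {
            if (support < s.lowerBoundSupport) {
                return true;
            }
        } else if (s.removeColumnsWithOnlyOneEntry && entryCount == 1) {
            return true;
        }

        // Upper side: a positive upper bound takes precedence over the all-entries flag.
        if (s.upperBoundSupport > 0) {
            if (support > s.upperBoundSupport) {
                return true;
            }
        } else if (s.removeColumnsWithAllEntries && entryCount == numRows) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Settings defaults = new Settings();
        System.out.println(shouldRemove(1, 10, defaults)); // single-entry column
        System.out.println(shouldRemove(5, 10, defaults)); // column with 50% support
    }
}
```

With the defaults, the single-entry column is removed by the flag because the lower bound is 0, while the column with 50% support is kept.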
Labels: filter, feature