Table To Item Sets

Author: Boris Capitanu
Creation date: 06/01/2011
Firing policy: all
Package: org.seasr.meandre.components.discovery.ruleassociation

DESCRIPTION

This module reads a Table and extracts from it items for use in mining association rules with the Apriori algorithm.

Detailed Description: This module takes as input a Table or an Example Table, and extracts items that are used by the Apriori rule association algorithm. An item is an [attribute,value] pair that occurs in the input table. The module uses information from the original table to determine which attributes should be used to form items being considered as possible rule antecedents and rule consequents. A compact representation is created indicating which items are contained in rows in the original table. The items and other information used by the Apriori algorithm are written to the Item Sets output port.
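The extraction step described above can be illustrated with a minimal sketch. This is not the component's actual code: the function name `table_to_item_sets`, the dict-per-row table format, and the integer item ids are all assumptions made for illustration. It shows how [attribute,value] pairs can be turned into items, and how each row can be stored compactly as a list of item ids.

```python
def table_to_item_sets(rows):
    """Hypothetical helper: rows is a list of dicts mapping attribute -> value."""
    item_ids = {}   # (attribute, value) pair -> integer item id
    row_items = []  # compact per-row representation: sorted item ids
    for row in rows:
        ids = set()
        for attribute, value in row.items():
            item = (attribute, value)
            if item not in item_ids:
                # Assign the next free id the first time an item is seen.
                item_ids[item] = len(item_ids)
            ids.add(item_ids[item])
        row_items.append(sorted(ids))
    return item_ids, row_items


# Illustrative data only; not taken from the component documentation.
rows = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "rainy", "play": "yes"},
    {"outlook": "sunny", "play": "yes"},
]
item_ids, row_items = table_to_item_sets(rows)
print(item_ids)
print(row_items)
```

Storing each row as a short list of integer ids, rather than repeating the attribute names and values, is one way such a representation can end up smaller than the original data.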

If a Table, or an Example Table with no specified input or output attributes, is loaded, all attributes (columns) will be used to form items considered as possible antecedents and consequents for the association rules. If an Example Table with only input attributes or only output attributes is loaded, the chosen attributes will be used to form items considered as both possible rule antecedents and possible rule consequents. If an Example Table with both input and output attributes is loaded, the inputs will be used to form items considered as possible rule antecedents, and the outputs will be used to form items considered as possible rule consequents.
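The three cases above can be summarized in a short sketch. The function `choose_item_attributes` and its parameters are hypothetical names used only to restate the rules; they are not part of the component's API.

```python
def choose_item_attributes(all_columns, inputs=(), outputs=()):
    """Return (antecedent_attributes, consequent_attributes).

    Hypothetical restatement of the selection rules described in the text.
    """
    inputs, outputs = list(inputs), list(outputs)
    if inputs and outputs:
        # Both specified: inputs feed antecedents, outputs feed consequents.
        return inputs, outputs
    # Only one kind specified: use it for both roles.
    # Nothing specified (or a plain Table): use every column for both roles.
    chosen = inputs or outputs or list(all_columns)
    return list(chosen), list(chosen)


columns = ["outlook", "humidity", "play"]
print(choose_item_attributes(columns))
print(choose_item_attributes(columns, inputs=["outlook"]))
print(choose_item_attributes(columns, inputs=["outlook"], outputs=["play"]))
```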

The computational complexity of the Apriori algorithm depends on the number of possible antecedents and consequents, so narrowing the search prior to this step is highly recommended. Use the module Choose Attributes to specify the subset of table attributes that are of interest. If the table has continuous attributes as possible rule antecedents or consequents, a Binning module should be used prior to this module to reduce the number of possible values for those continuous attributes.
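To see why binning matters, consider a continuous column in which nearly every row has a distinct value: each distinct value would become its own item. The sketch below uses equal-width binning as one possible strategy. The function `equal_width_bin` and the bin labels are assumptions for illustration; the Binning module may use a different method.

```python
def equal_width_bin(values, num_bins):
    """Replace each numeric value with the label of its equal-width bin."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when all values are identical.
    width = (hi - lo) / num_bins or 1.0
    labels = []
    for v in values:
        # Clamp so the maximum value lands in the last bin.
        index = min(int((v - lo) / width), num_bins - 1)
        labels.append(f"bin{index}")
    return labels


# Illustrative data only.
temps = [61.0, 64.5, 70.2, 75.0, 88.9, 91.0]
binned = equal_width_bin(temps, 3)
print(len(set(temps)), "distinct values before binning")
print(len(set(binned)), "distinct values after binning")
```

With fewer distinct values per attribute, fewer [attribute,value] items are formed, which keeps the number of candidate itemsets manageable.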

In a typical itinerary, the Item Sets output port of this module is connected to a Generate Multiple Outputs module, and from there to an Apriori module, which forms frequent itemsets based on a minimum support value, and to a Compute Confidence module, which forms association rules that satisfy a minimum confidence value.
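The roles of minimum support and minimum confidence in the downstream steps can be sketched as follows. This is a textbook-style illustration, not the code of the Apriori or Compute Confidence modules; the function names and the list-of-item-ids row format are assumptions. Support is the fraction of rows containing an itemset, and the confidence of a rule is the support of the whole itemset divided by the support of its antecedent.

```python
from itertools import combinations


def frequent_itemsets(row_items, min_support):
    """Level-wise search: return {itemset: support} for all frequent itemsets."""
    rows = [frozenset(r) for r in row_items]
    n = len(rows)
    frequent = {}
    current = [frozenset([i]) for i in sorted({i for r in rows for i in r})]
    while current:
        # Count the rows that contain each candidate itemset.
        counts = {c: sum(1 for r in rows if c <= r) for c in current}
        level = {c: k / n for c, k in counts.items() if k / n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into candidate (k+1)-itemsets.
        current = list({a | b for a, b in combinations(list(level), 2)
                        if len(a | b) == len(a) + 1})
    return frequent


def rules_single_consequent(frequent, min_confidence):
    """Form rules with exactly one item in the consequent."""
    rules = []
    for itemset, support in frequent.items():
        if len(itemset) < 2:
            continue
        for consequent in itemset:
            antecedent = itemset - {consequent}
            # Every subset of a frequent itemset is itself frequent,
            # so the antecedent is guaranteed to be in the dictionary.
            confidence = support / frequent[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return rules


# Illustrative rows of item ids.
row_items = [[0, 1, 2], [0, 1], [0, 2], [1, 2]]
frequent = frequent_itemsets(row_items, min_support=0.5)
rules = rules_single_consequent(frequent, min_confidence=0.6)
for antecedent, consequent, support, confidence in rules:
    print(sorted(antecedent), "->", consequent, support, round(confidence, 3))
```

The rule-forming function only places a single item in the consequent, mirroring the limitation noted for the Apriori and Compute Confidence modules.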

Limitations: The Apriori and Compute Confidence modules currently build rules with a single item in the consequent.

Data Handling: This module does not modify the input Table in any way.

Scalability: A representation of each row of the table is stored in memory. The representation is usually smaller than the original data.

INPUTS

Name | Description | Example
table | The table that items and sets will be extracted from |

OUTPUTS

Name | Description | Example
error | This port is used to output any unhandled errors encountered during the execution of this component |
item_sets | The items of interest that were found in the table and a representation of the items that occur together in the table |

PROPERTIES

Name | Description | Default value
_debug_level | Controls the verbosity of debug messages printed by the component during execution. Possible values are: off, severe, warning, info, config, fine, finer, finest, all. Append ',mirror' to any of the values above to mirror that output to the server logs. | info
_ignore_errors | Set to 'true' to ignore all unhandled exceptions and prevent the flow from being terminated. Setting this property to 'false' will result in the flow being terminated in the event an unhandled exception is thrown during the execution of this component. | false