| Author | Boris Capitanu |
| Creation date | 06/01/2011 |
| Firing policy | all |
| Package | org.seasr.meandre.components.discovery.ruleassociation |
DESCRIPTION
This module reads a Table and extracts from it items for use in mining association rules with the Apriori algorithm.
Detailed Description: This module takes a Table or an Example Table as input and extracts the items used by the Apriori association rule algorithm. An item is an [attribute,value] pair that occurs in the input table. The module uses information from the original table to determine which attributes are used to form items considered as possible rule antecedents and which are used to form items considered as possible rule consequents. It also builds a compact representation that records which items occur in each row of the original table. The items and the other information needed by the Apriori algorithm are written to the Item Sets output port.
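The extraction described above can be illustrated with a minimal sketch. It is not the component's actual implementation: the function name `extract_items`, the integer item ids, and the use of one frozenset per row as the "compact representation" are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

Item = Tuple[str, str]  # an (attribute, value) pair


def extract_items(
    columns: Sequence[str], rows: Sequence[Sequence[object]]
) -> Tuple[Dict[Item, int], List[FrozenSet[int]]]:
    """Hypothetical helper: map each distinct (attribute, value) pair to an
    integer id and represent every row as the set of ids it contains."""
    item_ids: Dict[Item, int] = {}
    row_sets: List[FrozenSet[int]] = []
    for row in rows:
        ids = set()
        for attr, value in zip(columns, row):
            item = (attr, str(value))
            if item not in item_ids:
                # Assign ids in order of first appearance.
                item_ids[item] = len(item_ids)
            ids.add(item_ids[item])
        row_sets.append(frozenset(ids))
    return item_ids, row_sets


# Toy table with two nominal attributes (made-up data).
columns = ["outlook", "play"]
rows = [["sunny", "no"], ["rainy", "yes"], ["sunny", "yes"]]
item_ids, row_sets = extract_items(columns, rows)
# item_ids maps ("outlook", "sunny") to 0 and ("play", "yes") to 3,
# so the third row is represented as frozenset({0, 3}).
```

The input rows are only read, never changed, which matches the Data Handling note for this module.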
If a Table or an Example Table with no specified input or output attributes is loaded, all attributes (columns) will be used to form items being considered as possible antecedents and consequents for the association rules. If an Example Table with only input attributes or only output attributes is loaded, the chosen attributes will be used to form items considered as possible rule antecedents and possible rule consequents. If an Example Table with both input and output attributes is loaded, the inputs will be used to form items considered as possible rule antecedents, and the outputs used to form items considered as possible rule consequents.
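The three cases above amount to a small decision rule, sketched here. The function `choose_rule_attributes` and its parameters are hypothetical names used only to restate the rule; they are not part of the component's API.

```python
from typing import List, Optional, Sequence, Tuple


def choose_rule_attributes(
    all_columns: Sequence[str],
    inputs: Optional[Sequence[str]] = None,
    outputs: Optional[Sequence[str]] = None,
) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: return (antecedent attributes, consequent attributes)."""
    ins = list(inputs or [])
    outs = list(outputs or [])
    if not ins and not outs:
        # Plain Table, or Example Table with nothing selected: use every column
        # for both antecedents and consequents.
        return list(all_columns), list(all_columns)
    if ins and outs:
        # Inputs form antecedent items, outputs form consequent items.
        return ins, outs
    # Only one of the two lists was given: it serves both roles.
    chosen = ins or outs
    return list(chosen), list(chosen)


cols = ["outlook", "humidity", "play"]
both_roles = choose_rule_attributes(cols)
split_roles = choose_rule_attributes(cols, inputs=["outlook", "humidity"], outputs=["play"])
inputs_only = choose_rule_attributes(cols, inputs=["outlook"])
```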
The computational complexity of the Apriori algorithm depends on the number of possible antecedents and consequents, so narrowing the search before this step is highly recommended. Use the Choose Attributes module to specify the subset of table attributes that are of interest. If the table has continuous attributes among the possible rule antecedents or consequents, use a Binning module before this module to reduce the number of possible values for those attributes.
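To see why binning matters, note that every distinct value of a continuous attribute would otherwise become its own item. The sketch below uses simple equal-width binning as an assumption; the Binning module may use a different scheme, and the function name and bin labels are invented for illustration.

```python
from typing import List, Sequence


def equal_width_bin(values: Sequence[float], n_bins: int) -> List[str]:
    """Hypothetical helper: replace each value by the label of its
    equal-width bin over the range [min(values), max(values)]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: a single bin is enough.
        return ["bin0"] * len(values)
    width = (hi - lo) / n_bins
    labels = []
    for v in values:
        # Clamp so that the maximum value falls into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        labels.append(f"bin{idx}")
    return labels


temperatures = [1.0, 2.5, 4.0, 10.0]
labels = equal_width_bin(temperatures, n_bins=3)
# Four distinct values collapse to three labels: bin0, bin0, bin1, bin2.
```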
In a typical itinerary, the Item Sets output port of this module is connected to a Generate Multiple Outputs module and then to an Apriori module, which forms frequent itemsets based on a minimum support value, and to a Compute Confidence module, which forms association rules that satisfy a minimum confidence value.
Limitations: The Apriori and Compute Confidence modules currently build rules with a single item in the consequent.
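The downstream steps (minimum support, minimum confidence, single-item consequents) can be sketched as follows. This is a simplified level-wise search in the spirit of Apriori that omits the usual subset-pruning step; it is not the code of the Apriori or Compute Confidence modules, and all names and the toy data are assumptions.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Sequence, Tuple


def frequent_itemsets(
    row_sets: Sequence[FrozenSet[int]], min_support: float
) -> Dict[FrozenSet[int], float]:
    """Return every itemset whose support (fraction of rows containing it)
    is at least min_support."""
    n = len(row_sets)

    def support(itemset: FrozenSet[int]) -> float:
        return sum(1 for r in row_sets if itemset <= r) / n

    items = sorted({i for r in row_sets for i in r})
    current = [frozenset([i]) for i in items]
    frequent: Dict[FrozenSet[int], float] = {}
    while current:
        kept = {}
        for candidate in current:
            s = support(candidate)
            if s >= min_support:
                kept[candidate] = s
        frequent.update(kept)
        # Join step: unite pairs of frequent k-itemsets that differ by one item.
        next_level = set()
        for a, b in combinations(list(kept), 2):
            union = a | b
            if len(union) == len(a) + 1:
                next_level.add(union)
        current = list(next_level)
    return frequent


def single_consequent_rules(
    frequent: Dict[FrozenSet[int], float], min_confidence: float
) -> List[Tuple[FrozenSet[int], int, float]]:
    """Build rules (antecedent, consequent item, confidence) with exactly one
    item in the consequent, mirroring the limitation stated above."""
    rules = []
    for itemset, sup in frequent.items():
        if len(itemset) < 2:
            continue
        for consequent in itemset:
            antecedent = itemset - {consequent}
            # Every subset of a frequent itemset is frequent, so the lookup succeeds.
            confidence = sup / frequent[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, confidence))
    return rules


# Rows as sets of item ids (made-up data).
demo_rows = [frozenset({0, 1}), frozenset({0, 1}), frozenset({0, 2}), frozenset({1, 2})]
demo_frequent = frequent_itemsets(demo_rows, min_support=0.5)
demo_rules = single_consequent_rules(demo_frequent, min_confidence=0.6)
```

In this toy data the itemset {0, 1} has support 0.5 and yields two rules, each with confidence 0.5 / 0.75.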
Data Handling: This module does not modify the input Table in any way.
Scalability: A representation of each row of the table is stored in memory. The representation is usually smaller than the original data.
INPUTS
| Name | Description | Example |
|---|---|---|
| table | The table that items and sets will be extracted from | |
OUTPUTS
| Name | Description | Example |
|---|---|---|
| error | This port is used to output any unhandled errors encountered during the execution of this component | |
| item_sets | The items of interest that were found in the table and a representation of the items that occur together in the table | |
PROPERTIES
| Name | Description | Default value |
|---|---|---|
| _debug_level | Controls the verbosity of debug messages printed by the component during execution. Possible values are: off, severe, warning, info, config, fine, finer, finest, all. Append ',mirror' to any of the values above to mirror that output to the server logs. | info |
| _ignore_errors | Set to 'true' to ignore all unhandled exceptions and prevent the flow from being terminated. Setting this property to 'false' will result in the flow being terminated in the event an unhandled exception is thrown during the execution of this component. | false |