ALCOMFT-TR-03-199

ALCOM-FT
 

Gemma Casas-Garriga
Statistical Strategies for Pruning All the Uninteresting Association Rules
Barcelona. Work packages 1 and 4. December 2003.
Abstract: We propose a general framework for formally describing the problem of capturing the intensity of implication of association rules through statistical metrics. Within this framework we present properties that influence the interestingness of a rule, analyze the conditions under which a measure can perform a perfect prune in a single step, and define a final proper order for sorting the surviving rules. We discuss why none of the currently employed measures can capture objective interestingness on its own, and why only a combination of some of them, applied in a multi-step fashion, can be reliable. In contrast, we propose a new, simple modification of the Pearson coefficient that meets all the necessary requirements. We statistically infer an appropriate cut-off threshold for this new metric by empirically describing its distribution function through simulation. Final experiments demonstrate the effectiveness of our proposal.
Postscript file: ALCOMFT-TR-03-199.ps.gz (134 kb).
