TY - GEN
T1 - REEF: Resolving Length Bias in Frequent Sequence Mining
AU - Richardson, Ariella
AU - Kaminka, Gal A.
AU - Kraus, S.
N1 - Place of conference: Portugal
PY - 2013
Y1 - 2013
N2 - Classic support-based approaches efficiently address
frequent sequence mining. However, support-based mining
has been shown to suffer from a bias towards short sequences.
In this paper, we propose a method to resolve this bias when
mining the most frequent sequences. To resolve the
length bias, we define norm-frequency, based on the statistical z-score
of support, and use it to replace support-based frequency.
Our approach mines the subsequences that are frequent relative
to other subsequences of the same length. Unfortunately, naive
use of norm-frequency hinders mining scalability: using norm-frequency
breaks the anti-monotonic property of support, which is
important for pruning large sets of candidate
sequences. We describe a bound that enables pruning to provide
scalability. Experimental results on textual and computer user
input data establish that we overcome the short
sequence bias successfully, and illustrate the production of
meaningful sequences with our mining algorithm.
UR - https://scholar.google.co.il/scholar?q=%09%09The+REEF%3A+Resolving+Length+Bias+in+Frequent+Sequence+Mining&btnG=&hl=en&as_sdt=0%2C5
M3 - Conference contribution
SP - 91
EP - 96
BT - IMMM 2013, The Third International Conference on Advances in Information Mining and Management
ER -