In many data mining tasks, the goal is to classify entities into a set of predefined groups (classes). A second, equally important goal is interpretation, i.e., understanding the nature of the population aggregated in each class. These tasks become even more complex when there is no a priori information about the correct classification. The current paper builds on two concepts: (1) bounded-rationality theory, which uses an S-shaped function representing human logic as a saliency measure to determine the substantial features that characterize each potential group, and (2) classification by clustering (CBC), which applies decision-tree-like classification to unsupervised clustering problems, where neither an a priori classification nor target attributes are known in advance. In the context of these two concepts, the research makes two contributions: (1) it extends the saliency measure to all possible types of variables (nominal as well as numerical), and (2) it evaluates, on five datasets, a composite model that combines the CBC method with the saliency concept. The findings show that using clustering algorithms for classification tasks (the CBC method) yields results as accurate as those obtained by conventional decision trees, but with a better saliency factor.
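The comparison between clustering output and known classes can be illustrated with a minimal sketch. It does not reproduce the paper's CBC algorithm or its saliency measure: plain k-means is swapped in as a generic clustering step, and the helper names (`kmeans`, `cluster_accuracy`), the deterministic initialization, and the toy data are all assumptions made for illustration. Class labels are used only afterwards, to score the clusters.

```python
from collections import Counter


def kmeans(points, k, iters=20):
    """Plain k-means on tuples of floats (hypothetical helper, not the CBC method).

    Centroids start at evenly spaced sample points so the run is deterministic.
    """
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Recompute each centroid as the mean of its current members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def cluster_accuracy(cluster_labels, true_classes):
    """Map each cluster to its majority class; return the fraction of points that match."""
    correct = 0
    for c in set(cluster_labels):
        classes = [t for l, t in zip(cluster_labels, true_classes) if l == c]
        correct += Counter(classes).most_common(1)[0][1]
    return correct / len(true_classes)


# Toy data (assumed): two well-separated groups; classes are hidden from the clustering.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
classes = ["a", "a", "a", "b", "b", "b"]

labels = kmeans(points, k=2)
accuracy = cluster_accuracy(labels, classes)
print(labels, accuracy)
```

Scoring clusters by their majority class is only one simple way to compare an unsupervised grouping with a supervised baseline such as a decision tree; the evaluation protocol used in the paper may differ.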
Bibliographical note
Publisher Copyright:
© 2015 Wiley Publishing Ltd.
- classification by clustering (CBC)
- cluster analysis
- data mining
- decision trees