TY - JOUR
T1 - Decision-making method using a visual approach for cluster analysis problems; Indicative classification algorithms and grouping scope
AU - Bittmann, Ran M.
AU - Gelbard, Roy M.
PY - 2007/7
Y1 - 2007/7
N2 - Currently, classifying samples into a fixed number of clusters (i.e. supervised cluster analysis) as well as unsupervised cluster analysis are limited in their ability to support 'cross-algorithms' analysis. It is well known that each cluster analysis algorithm yields different results (i.e. a different classification); even running the same algorithm with two different similarity measures commonly yields different results. Researchers usually choose the preferred algorithm and similarity measure according to analysis objectives and data set features, but they have neither a formal method nor tool that supports comparisons and evaluations of the different classifications that result from the diverse algorithms. The current research develops and prototypes a decision support methodology based upon formal quantitative measures and a visual approach, enabling presentation, comparison and evaluation of multiple classification suggestions resulting from diverse algorithms. This methodology and tool were used in two basic scenarios: (I) a classification problem in which a 'true result' is known, using the Fisher iris data set; (II) a classification problem in which there is no 'true result' to compare with. In this case, we used a small data set from a user profile study (a study that tries to relate users to a set of stereotypes based on sociological aspects and interests). In each scenario, ten diverse algorithms were executed. The suggested methodology and decision support system produced a cross-algorithms presentation; all ten resultant classifications are presented together in a 'Tetris-like' format. Each column represents a specific classification algorithm, each line represents a specific sample, and formal quantitative measures analyse the 'Tetris blocks', arranging them according to their best structures, i.e. best classification.
AB - Currently, classifying samples into a fixed number of clusters (i.e. supervised cluster analysis) as well as unsupervised cluster analysis are limited in their ability to support 'cross-algorithms' analysis. It is well known that each cluster analysis algorithm yields different results (i.e. a different classification); even running the same algorithm with two different similarity measures commonly yields different results. Researchers usually choose the preferred algorithm and similarity measure according to analysis objectives and data set features, but they have neither a formal method nor tool that supports comparisons and evaluations of the different classifications that result from the diverse algorithms. The current research develops and prototypes a decision support methodology based upon formal quantitative measures and a visual approach, enabling presentation, comparison and evaluation of multiple classification suggestions resulting from diverse algorithms. This methodology and tool were used in two basic scenarios: (I) a classification problem in which a 'true result' is known, using the Fisher iris data set; (II) a classification problem in which there is no 'true result' to compare with. In this case, we used a small data set from a user profile study (a study that tries to relate users to a set of stereotypes based on sociological aspects and interests). In each scenario, ten diverse algorithms were executed. The suggested methodology and decision support system produced a cross-algorithms presentation; all ten resultant classifications are presented together in a 'Tetris-like' format. Each column represents a specific classification algorithm, each line represents a specific sample, and formal quantitative measures analyse the 'Tetris blocks', arranging them according to their best structures, i.e. best classification.
KW - Cluster analysis
KW - Decision support system
KW - Visualization techniques
UR - http://www.scopus.com/inward/record.url?scp=34250209181&partnerID=8YFLogxK
U2 - 10.1111/j.1468-0394.2007.00428.x
DO - 10.1111/j.1468-0394.2007.00428.x
M3 - Article
SN - 0266-4720
VL - 24
SP - 171
EP - 187
JO - Expert Systems
JF - Expert Systems
IS - 3
ER -