TY - GEN
T1 - Crowd mining
AU - Amsterdamer, Yael
AU - Grossman, Yael
AU - Milo, Tova
AU - Senellart, Pierre
PY - 2013
Y1 - 2013
N2 - Harnessing a crowd of Web users for data collection has recently become a wide-spread phenomenon. A key challenge is that the human knowledge forms an open world and it is thus difficult to know what kind of information we should be looking for. Classic databases have addressed this problem by data mining techniques that identify interesting data patterns. These techniques, however, are not suitable for the crowd. This is mainly due to properties of the human memory, such as the tendency to remember simple trends and summaries rather than exact details. Following these observations, we develop here for the first time the foundations of crowd mining. We first define the formal settings. Based on these, we design a framework of generic components, used for choosing the best questions to ask the crowd and mining significant patterns from the answers. We suggest general implementations for these components, and test the resulting algorithm's performance on benchmarks that we designed for this purpose. Our algorithm consistently outperforms alternative baseline algorithms.
AB - Harnessing a crowd of Web users for data collection has recently become a wide-spread phenomenon. A key challenge is that the human knowledge forms an open world and it is thus difficult to know what kind of information we should be looking for. Classic databases have addressed this problem by data mining techniques that identify interesting data patterns. These techniques, however, are not suitable for the crowd. This is mainly due to properties of the human memory, such as the tendency to remember simple trends and summaries rather than exact details. Following these observations, we develop here for the first time the foundations of crowd mining. We first define the formal settings. Based on these, we design a framework of generic components, used for choosing the best questions to ask the crowd and mining significant patterns from the answers. We suggest general implementations for these components, and test the resulting algorithm's performance on benchmarks that we designed for this purpose. Our algorithm consistently outperforms alternative baseline algorithms.
KW - Association rule learning
KW - Crowd mining
KW - Crowdsourcing
UR - http://www.scopus.com/inward/record.url?scp=84880528401&partnerID=8YFLogxK
U2 - 10.1145/2463676.2465318
DO - 10.1145/2463676.2465318
M3 - ???researchoutput.researchoutputtypes.contributiontobookanthology.conference???
AN - SCOPUS:84880528401
SN - 9781450320375
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 241
EP - 252
BT - SIGMOD 2013 - International Conference on Management of Data
T2 - 2013 ACM SIGMOD Conference on Management of Data, SIGMOD 2013
Y2 - 22 June 2013 through 27 June 2013
ER -