Categorical relevance judgment

Maayan Zhitomirsky-Geffet, Judit Bar-Ilan, Mark Levene

Research output: Contribution to journal › Article › peer-review

5 Scopus citations


In this study we aim to explore users' behavior when assessing the relevance of search results, based on the hypothesis of categorical thinking. To investigate how users categorize search engine results, we performed several experiments in which users were asked to group a list of 20 search results into several categories and to attach a relevance judgment to each category they formed. Moreover, to determine how users change their minds over time, each experiment was repeated three times under the same conditions, with a gap of one month between rounds. The results show that on average users form 4–5 categories. Within each round, the size of a category decreases with the relevance of the category. To measure the agreement between the search engine's ranking and the users' relevance judgments, we defined two novel similarity measures, the average concordance and the MinMax swap ratio. Similarity is shown to be highest in the third round, as the users' opinions stabilize. Qualitative analysis showed that users tended to categorize results by the type and reliability of their source; in particular, they found commercial sites less trustworthy and attached high relevance to Wikipedia when their prior domain knowledge was limited.

Original language: English
Pages (from-to): 1084-1094
Number of pages: 11
Journal: Journal of the Association for Information Science and Technology
Issue number: 9
State: Published - Sep 2018

Bibliographical note

Publisher Copyright:
© 2018 ASIS&T

