A generalized framework for revealing analogous themes across related topics

Zvika Marx, I. Dagan, Eli Shamir

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review


This work addresses the task of identifying thematic correspondences across sub-corpora focused on different topics. We introduce an unsupervised algorithmic framework based on distributional data clustering, which generalizes earlier preliminary work on this task. The empirical results reveal interesting commonalities across different religions. We evaluate the results by measuring the overlap between our clusters and clusters compiled manually by experts. The tested variants of our framework are shown to outperform alternative methods applicable to the task.
Original language: American English
Title of host publication: The conference on Human Language Technology and Empirical Methods in Natural Language Processing
Publisher: Association for Computational Linguistics
State: Published - 2005

Bibliographical note

Place of conference: Vancouver, British Columbia, Canada
