Many academic journals and conferences require each article to include a list of keyphrases. These keyphrases should convey the article's contents and topics, and they can save considerable time in tasks such as filtering, summarization, and categorization. In this paper, we investigate the automatic extraction and learning of keyphrases from scientific articles written in English. First, we introduce several baseline extraction methods; some of them, which we formalize ourselves, prove very successful on academic papers. We then combine these methods using different machine learning methods. The best results are achieved by J48, Weka's implementation of the C4.5 decision-tree algorithm. These results are significantly better than those of previous extraction systems regarded as the state of the art.
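The abstract does not list the baseline methods, so the following is only a minimal sketch of what a baseline extractor of this kind might compute. The two features (phrase frequency and relative position of first occurrence), the stopword list, and all function names are assumptions made for illustration, not the paper's method. Feature dictionaries like these are the kind of input a decision-tree learner could then combine.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"a", "an", "and", "are", "as", "by", "for", "from",
             "in", "is", "of", "on", "the", "to", "we", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def candidates(text, max_len=3):
    """Yield (phrase, start_index) for each n-gram of up to max_len tokens
    that neither starts nor ends with a stopword.
    Simplification: n-grams may span sentence boundaries."""
    tokens = tokenize(text)
    for i in range(len(tokens)):
        for n in range(1, max_len + 1):
            gram = tokens[i:i + n]
            if len(gram) < n:  # ran past the end of the text
                break
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            yield " ".join(gram), i


def baseline_features(text, max_len=3):
    """Map each candidate phrase to two hypothetical baseline features:
    its frequency and the relative position of its first occurrence."""
    total = max(len(tokenize(text)), 1)
    freq = Counter()
    first = {}
    for phrase, idx in candidates(text, max_len):
        freq[phrase] += 1
        first.setdefault(phrase, idx / total)
    return {p: {"frequency": freq[p], "first_position": first[p]} for p in freq}


def top_by_frequency(text, k=5):
    """Rank candidates by frequency, breaking ties by earlier first
    occurrence and then alphabetically."""
    feats = baseline_features(text)
    ranked = sorted(
        feats,
        key=lambda p: (-feats[p]["frequency"], feats[p]["first_position"], p),
    )
    return ranked[:k]
```

Calling `baseline_features(article_text)` returns one feature dictionary per candidate phrase. In a learning setup, each dictionary would be paired with a label saying whether the phrase appears in the author-assigned keyphrase list.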
|Number of pages|13|
|Journal|Lecture Notes in Computer Science|
|State|Published - 2005|
|Event|6th International Conference, CICLing 2005 - Mexico City, Mexico|
|Duration|13 Feb 2005 → 19 Feb 2005|