The Hebrew Universal Dependency Treebank: Past, Present and Future

Shoval Sadde, Amit Seker, Reut Tsarfaty

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The Hebrew treebank (HTB), consisting of 6221 morpho-syntactically annotated newspaper sentences, has been the only resource for training and validating statistical parsers and taggers for Hebrew for almost two decades now. During these decades, the HTB has gone through a trajectory of automatic and semi-automatic conversions, until arriving at its UDv2 form. In this work we manually validate the UDv2 version of the HTB and, according to our findings, apply scheme changes that bring the UD HTB to the same theoretical grounds as the rest of UD. Our experimental parsing results with UDv2New confirm that improving the coherence and internal consistency of the UD HTB indeed leads to improved parsing performance. At the same time, our analysis demonstrates that there is more to be done at the point of intersection of UD with other linguistic processing layers, in particular at the points where UD interfaces with external morphological and lexical resources.

Original language: English
Title of host publication: EMNLP 2018 - 2nd Workshop on Universal Dependencies, UDW 2018 - Proceedings of the Workshop
Publisher: Association for Computational Linguistics (ACL)
Pages: 133-143
Number of pages: 11
ISBN (Electronic): 9781948087780
State: Published - 2018
Externally published: Yes
Event: 2nd Workshop on Universal Dependencies, UDW 2018, held in conjunction with EMNLP 2018 - Brussels, Belgium
Duration: 1 Nov 2018 → …

Publication series

Name: EMNLP 2018 - 2nd Workshop on Universal Dependencies, UDW 2018 - Proceedings of the Workshop

Conference

Conference: 2nd Workshop on Universal Dependencies, UDW 2018, held in conjunction with EMNLP 2018
Country/Territory: Belgium
City: Brussels
Period: 1/11/18 → …

Bibliographical note

Publisher Copyright:
© 2018 Association for Computational Linguistics

Funding

We thank the ONLP team at the Open University of Israel for fruitful discussions throughout the process. We further thank two anonymous reviewers for their detailed and insightful comments. This research is supported by the European Research Council, ERC-StG-2015 scheme, Grant number 677352, and by the Israel Science Foundation (ISF), Grant number 1739/26, for which we are grateful.

Funders: Funder number
Open University of Israel
European Research Council: ERC-StG-2015, 677352
Israel Science Foundation: 1739/26
