From raw text to universal dependencies – Look, no tags!

Miryam de Lhoneux, Yan Shao, Ali Basirat, Eliyahu Kiperwasser, Sara Stymne, Yoav Goldberg, Joakim Nivre

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

31 Scopus citations

Abstract

We present the Uppsala submission to the CoNLL 2017 shared task on parsing from raw text to universal dependencies. Our system is a simple pipeline consisting of two components. The first performs joint word and sentence segmentation on raw text; the second predicts dependency trees from raw words. The parser bypasses the need for part-of-speech tagging, but uses word embeddings based on universal tag distributions. We achieved a macro-averaged LAS F1 of 65.11 in the official test run and obtained the 2nd best result for sentence segmentation with a score of 89.03. After fixing two bugs, we obtained an unofficial LAS F1 of 70.49.
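
The macro-averaged LAS F1 reported above can be illustrated with a minimal sketch. It assumes a simplified setting that the abstract does not spell out: each word is identified by its character span in the raw text, a system word counts as correct when a gold word has the same span, head and label, and the macro average is the unweighted mean over treebanks. The names `las_f1`, `macro_average`, and the span-based encoding are hypothetical and chosen for illustration; the official shared task evaluation is likely more involved (for example, in how it aligns words when segmentation differs).

```python
from typing import Dict, Optional, Tuple

# Hypothetical encoding: a word is keyed by its (start, end) character span,
# so system and gold words can be matched even when segmentation differs.
Span = Tuple[int, int]
# An arc is (head span, dependency label); the root word has head None.
Arc = Tuple[Optional[Span], str]


def las_f1(gold: Dict[Span, Arc], system: Dict[Span, Arc]) -> float:
    """Labeled attachment F1 for one treebank under the assumptions above."""
    if not gold or not system:
        return 0.0
    # A system word is correct if gold has the same span, head and label.
    correct = sum(1 for span, arc in system.items() if gold.get(span) == arc)
    precision = correct / len(system)  # correct / number of system words
    recall = correct / len(gold)       # correct / number of gold words
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_average(scores: Dict[str, float]) -> float:
    """Unweighted mean of per-treebank scores."""
    return sum(scores.values()) / len(scores)


# Toy example: the system wrongly splits the last gold word into two words.
gold = {
    (0, 2): ((3, 8), "nsubj"),
    (3, 8): (None, "root"),
    (9, 14): ((3, 8), "obj"),
}
system = {
    (0, 2): ((3, 8), "nsubj"),
    (3, 8): (None, "root"),
    (9, 11): ((3, 8), "obj"),
    (11, 14): ((9, 11), "dep"),
}
score = las_f1(gold, system)
```

Because precision and recall use different denominators, a segmentation error lowers the score even when the attachments of the correctly segmented words are right, which is why segmentation quality matters for a parser that starts from raw text.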

Original language: English
Title of host publication: CoNLL 2017 - SIGNLL Conference on Computational Natural Language Learning, Proceedings of the CoNLL 2017 Shared Task
Subtitle of host publication: Multilingual Parsing from Raw Text to Universal Dependencies
Publisher: Association for Computational Linguistics (ACL)
Pages: 207-217
Number of pages: 11
ISBN (Electronic): 9781945626708
State: Published - 2017
Event: 2017 SIGNLL Conference on Computational Natural Language Learning - CoNLL Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, CoNLL 2017 - Vancouver, Canada
Duration: 3 Aug 2017 to 4 Aug 2017

Publication series

Name: CoNLL 2017 - SIGNLL Conference on Computational Natural Language Learning, Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies

Conference

Conference: 2017 SIGNLL Conference on Computational Natural Language Learning - CoNLL Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, CoNLL 2017
Country/Territory: Canada
City: Vancouver
Period: 3/08/17 to 4/08/17

Bibliographical note

Publisher Copyright:
© 2017 Association for Computational Linguistics.

Funding

We are grateful to the shared task organizers and to Dan Zeman in particular, and we acknowledge the computational resources provided by CSC in Helsinki and Sigma2 in Oslo through NeIC-NLPL (www.nlpl.eu). Our parser will be made available in the NLPL dependency parsing laboratory.

Funders: CSC in Helsinki and Sigma2 in Oslo
