A Unified Morpho-Syntactic Scheme of Stanford Dependencies

Reut Tsarfaty

The 51st Annual Meeting of the Association for Computational Linguistics - Short Papers (ACL Short Papers 2013)
Sofia, Bulgaria, August 4-9, 2013


Stanford Dependencies (SD) provide a functional characterization of the grammatical relations in syntactic parse trees. SD deliver a useful representation for downstream applications and are a popular choice for parser evaluation. The current design of SD focuses on structurally marked relations and neglects morphosyntactic realization as observed in Morphologically Rich Languages (MRLs). Here we define a novel extension of SD, called Unified-SD (U-SD), which unifies the annotation of structurally and morphologically marked relations via an inheritance hierarchy and provides a principled treatment of morphological elements in syntactic trees. We apply the scheme to create a new U-SD resource for the MRL Modern Hebrew, composed of aligned constituency and dependency treebanks that reflect equivalent U-SD structures in different formal representation types. We present two systems that can automatically predict U-SD annotations, for gold morphologically segmented input as well as for raw (unsegmented) text, delivering high baseline accuracy on the task.

