Using ZPar-MVT Off-the-Shelf
Overview
ZPar for MVT (Peking University Multi-view Chinese Treebank) is compiled into a program, zpar.mvt. This program must be run with a corresponding set of statistical models. Some example sets of models are released together with the ZPar source so that the public release can be used off-the-shelf.
The current version of ZPar is 0.7. Its release contains a set of models for zpar.mvt, which support joint Chinese word segmentation and POS tagging, and labeled dependency parsing.
Download and installation
The source code and models can be downloaded from SourceForge. Unzip the source zip file into the source directory, and unzip each corresponding model file into its own model directory.
Download the models for ZPar-MVT:
To compile ZPar-MVT, type make zpar.mvt in the ZPar source directory. The binary file zpar.mvt will be placed in the dist folder.
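For scripted setups, the build step can be wrapped in a few lines of Python. This is only a sketch: the function names build_zpar_mvt and expected_binary are made up for illustration, and it assumes that make is on the PATH and that the binary ends up at dist/zpar.mvt under the source directory, as described above.

```python
import subprocess
from pathlib import Path


def expected_binary(source_dir):
    """Path where the build is expected to place the binary (assumed: dist/zpar.mvt)."""
    return Path(source_dir) / "dist" / "zpar.mvt"


def build_zpar_mvt(source_dir):
    """Run `make zpar.mvt` inside the ZPar source directory and return the binary path."""
    source_dir = Path(source_dir)
    # Same command as typing `make zpar.mvt` by hand in the source directory.
    subprocess.run(["make", "zpar.mvt"], cwd=source_dir, check=True)
    binary = expected_binary(source_dir)
    if not binary.exists():
        # The build reported success, but nothing is where we assumed it would be.
        raise FileNotFoundError(f"build finished but {binary} was not found")
    return binary
```

Calling build_zpar_mvt("zpar") would then correspond to running make zpar.mvt in a source directory named zpar.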
Usage of ZPar for Chinese-MVT
Suppose that the executable is zpar/dist/zpar.mvt and the models are saved in the directory chinese.mvt. To run ZPar, type
zpar/dist/zpar.mvt chinese.mvt input output
to read Chinese sentences from the file input and write the corresponding parses to the file output. In the file input, each line should contain exactly one sentence.
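The one-sentence-per-line requirement is easy to violate when sentences come from another program, so the following Python sketch writes the input file and assembles the same command line as above. The helper names (write_input, zpar_command, run_zpar) are hypothetical, and writing the file as UTF-8 is an assumption about the encoding ZPar expects.

```python
import subprocess
from pathlib import Path


def write_input(sentences, input_path):
    """Write sentences to input_path, one sentence per line; return how many were written."""
    lines = []
    for sentence in sentences:
        # Remove line breaks inside a sentence so it occupies exactly one line.
        flat = "".join(sentence.splitlines()).strip()
        if flat:  # skip empty entries
            lines.append(flat)
    # Assumption: the input file is UTF-8 encoded.
    Path(input_path).write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return len(lines)


def zpar_command(binary, model_dir, input_path, output_path):
    """Argument list in the documented order: binary, model directory, input, output."""
    return [str(binary), str(model_dir), str(input_path), str(output_path)]


def run_zpar(binary, model_dir, input_path, output_path):
    """Invoke zpar.mvt; raises CalledProcessError if it exits with a non-zero status."""
    subprocess.run(zpar_command(binary, model_dir, input_path, output_path), check=True)
```

With the layout described above, run_zpar("zpar/dist/zpar.mvt", "chinese.mvt", "input", "output") issues the same command as typing it in a shell.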
Annotation Schema
The annotation schema for word segmentation, POS tagging and dependency parsing is described in our COLING 2014 paper (see Reference). It is based on the annotated People's Daily corpus of Peking University and is quite different from CTB. For details, please refer to our paper.
Reference
- Likun Qiu, Yue Zhang, Peng Jin, and Houfeng Wang. 2014. Multi-view Chinese Treebanking. In Proc. of COLING, pages 257–268.
- Yue Zhang and Stephen Clark. 2011. Syntactic Processing Using the Generalized Perceptron and Beam Search. Computational Linguistics, 37(1), March.