Overview

ZPar is a statistical natural language parser that performs syntactic analysis tasks including word segmentation, part-of-speech tagging, and parsing. ZPar supports multiple languages and multiple grammar formalisms. It has been most heavily developed for Chinese (on the Penn Chinese Treebank and the Peking University Multiview Treebank) and English (on the Penn Treebank), and it provides generic support for other languages and treebanks; for example, a Romanian model has been trained for ZPar 0.2. ZPar currently supports context-free grammars (CFG), dependency grammars, and combinatory categorial grammars (CCG).

System Requirements

The ZPar software requires the following basic system configuration

Download and Installation

Binaries and sources of the latest release can be downloaded from GitHub. ZPar provides functionality for different languages and treebanks, such as zpar, zpar.en, zpar.zh, and zpar.mvt for generic language, the English Penn Treebank, the Chinese Penn Treebank, and the Chinese multiview treebank, respectively. Source code and binaries are provided for Windows, Linux, and Mac. Standalone sub-modules can be built for individual tasks, such as segmentor, postagger, conparser, and depparser for word segmentation, POS-tagging, phrase-structure parsing, and dependency parsing, respectively.
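As a quick reference, the names listed above can be collected into a small lookup table. The following is only an illustrative sketch in Python: the dictionaries restate the names and descriptions from the paragraph above, while the `describe` helper and the variable names are hypothetical and not part of ZPar.

```python
# Names and descriptions are taken from the paragraph above.
# The helper below is hypothetical and is not shipped with ZPar.
ZPAR_BINARIES = {
    "zpar": "generic language",
    "zpar.en": "English Penn Treebank",
    "zpar.zh": "Chinese Penn Treebank",
    "zpar.mvt": "Chinese multiview treebank",
}

ZPAR_SUBMODULES = {
    "segmentor": "word segmentation",
    "postagger": "POS-tagging",
    "conparser": "phrase-structure parsing",
    "depparser": "dependency parsing",
}


def describe(name: str) -> str:
    """Return a one-line description of a ZPar binary or sub-module name."""
    if name in ZPAR_BINARIES:
        return f"{name}: binary for {ZPAR_BINARIES[name]}"
    if name in ZPAR_SUBMODULES:
        return f"{name}: sub-module for {ZPAR_SUBMODULES[name]}"
    raise KeyError(f"unknown ZPar binary or sub-module: {name}")


if __name__ == "__main__":
    # Print every known name with its description.
    for name in list(ZPAR_BINARIES) + list(ZPAR_SUBMODULES):
        print(describe(name))
```

A table like this can help when deciding which download or build target matches a given language and task; it says nothing about command-line options, which are covered in the manuals.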

Quick Start

ZPar can be used off the shelf by referring to the quick start. Sub-modules such as the word segmentor, POS-tagger, and parsers can also be used individually by following the detailed instructions for the compilation, training, and usage of each module.

List of Manuals

License

The software source is released under the GPL (v.3); a separate commercial license, issued by Oxford University, is available for non-open-source use. The various models available for download were trained on different text resources, which may require further licenses.

Contributors to the Documentation

Reference