Software | Yue Zhang

ZPar

A statistical multi-language parser, with language-specific support for Chinese and English. ZPar includes a specialized Chinese parser, and gives state-of-the-art speed and accuracy for both Chinese and English on the standard Penn Chinese Treebank and Penn Treebank data. It provides integrated systems that perform word segmentation, part-of-speech tagging, dependency parsing and phrase structure parsing. ZPar has also been used for the syntactic analysis of Romanian, French and other languages. ZPar is fast, processing more than 50 sentences per second on the standard Penn Treebank (Wall Street Journal) data.
[Documentation] | [Download]

Extensions:

ViZPar (by Miguel Ballesteros and others): ViZPar is a tool that enhances the usability of ZPar by supporting parameter selection and output visualization. ViZPar allows manual feature selection, which makes the tool useful for people interested in obtaining the best parser through feature engineering, given that the feature templates included in ZPar are optimized for English and Chinese. ViZPar is designed for both constituent and dependency analysis.
[Project Homepage]
python-zpar (by Nitin Madnani and others): python-zpar is a Python wrapper around the ZPar parser. In addition to a simple Python wrapper, python-zpar provides an XML-RPC ZPar server to make batch processing of large files easier.
[Project Homepage]

If you use this system in your paper, please cite the following paper.

@article{Zhang:2011:SPU:1970420.1970425,
  author = {Zhang, Yue and Clark, Stephen},
  title = {Syntactic Processing Using the Generalized Perceptron and Beam Search},
  journal = {Comput. Linguist.},
  issue_date = {March 2011},
  volume = {37},
  number = {1},
  month = mar,
  year = {2011},
  issn = {0891-2017},
  pages = {105--151},
  numpages = {47},
  publisher = {MIT Press},
  address = {Cambridge, MA, USA},
}

ZGen

ZGen is a linearization system that constructs natural language sentences from bags of words, given optional input syntactic constraints. Depending on how many input constraints are supplied, ZGen can perform free ordering, partial tree linearization and full tree linearization.
[Download]
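
To make the idea of free ordering concrete, here is a minimal toy sketch in Python. It is not ZGen's algorithm, model or interface: the `linearize` function, the `BIGRAM_SCORES` table and its values are all hypothetical and invented for illustration, and syntactic constraints are not modeled at all. It only shows how a bag of words can be turned into a word order by keeping the highest-scoring partial orderings at each step.

```python
from typing import Dict, List, Tuple

# Hypothetical bigram scores, invented for this illustration only.
BIGRAM_SCORES: Dict[Tuple[str, str], float] = {
    ("<s>", "the"): 2.0,
    ("the", "dog"): 2.0,
    ("dog", "barks"): 2.0,
    ("barks", "</s>"): 2.0,
}


def score_pair(prev: str, word: str) -> float:
    """Score of placing `word` right after `prev`; unseen pairs are penalized."""
    return BIGRAM_SCORES.get((prev, word), -1.0)


def linearize(bag: List[str], beam_size: int = 4) -> List[str]:
    """Order the words in `bag` using a small beam search (toy free ordering)."""
    if not bag:
        return []
    # A hypothesis is (score, words placed so far, indices of unused words).
    beam = [(0.0, [], tuple(range(len(bag))))]
    for _ in range(len(bag)):
        candidates = []
        for score, seq, remaining in beam:
            prev = seq[-1] if seq else "<s>"
            for i in remaining:
                rest = tuple(j for j in remaining if j != i)
                new_score = score + score_pair(prev, bag[i])
                candidates.append((new_score, seq + [bag[i]], rest))
        # Keep only the best `beam_size` partial orderings.
        candidates.sort(key=lambda h: h[0], reverse=True)
        beam = candidates[:beam_size]
    # Add the end-of-sentence score before picking the final ordering.
    best = max(beam, key=lambda h: h[0] + score_pair(h[1][-1], "</s>"))
    return best[1]


if __name__ == "__main__":
    print(" ".join(linearize(["barks", "the", "dog"])))
```

In this sketch every word is free to move. Partial or full tree linearization would additionally restrict which orderings are allowed, which the toy code does not attempt.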

ZORE

ZORE is an open information extraction system for Chinese.
[Download]
If you use this system in your paper, please cite our EMNLP 2014 paper.

Likun Qiu and Yue Zhang, ZORE: A Syntax-based System for Chinese Open Relation Extraction, In Proceedings of EMNLP 2014.