The goal of Sinica Treebank is to provide a syntactically
structure-tagged corpus for Chinese natural language processing.
By extracting grammatical information from the Treebank, we can
improve parser performance and learn more about Chinese syntax.
Sinica Treebank was built by CKIP in 1997 with texts taken from
Sinica Corpus. The texts are parsed automatically on the basis of
Information-based Case Grammar (ICG) and then checked manually.
The present version, Sinica Treebank v3.0, includes 61,087 trees
(361,834 words). Of these, 1,000 tree structures are open to the
public for researchers to download. A search interface on the
website is also available for users who are interested in Chinese
syntax and semantic relations.
The structural frame of Sinica Treebank is based on the
Head-Driven Principle; that is, a sentence or phrase is composed
of a core Head and its arguments or adjuncts. The Head defines
the phrasal category and its relations with the other
constituents. For example, the Head of a sentence (S) or verb
phrase (VP) is a verb (V). See Chen et al. (1999), "The
Construction of Sinica Treebank", for details of supplementary
principles, symbol illustrations, semantic roles, and phrasal
structures.
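The Head-Driven Principle can be illustrated with a small Python sketch in which a phrase takes its category from its Head. This is only an illustration, not the Sinica Treebank file format or the CKIP parser: the names Node, PROJECTION, find_head and make_phrase, the role label "goal", and the N-to-NP mapping are all assumptions made for the example. Only the V-to-VP relation comes from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed projection table: the phrasal category that a Head's part of
# speech yields. V -> VP follows the text; N -> NP is an illustrative
# assumption, not taken from the Treebank guidelines.
PROJECTION = {"V": "VP", "N": "NP"}


@dataclass
class Node:
    role: str                      # relation to the parent, e.g. "Head"
    category: str                  # part of speech or phrasal category
    word: Optional[str] = None     # set on leaves only
    children: List["Node"] = field(default_factory=list)


def find_head(children: List[Node]) -> Node:
    """Return the single child whose role is "Head"."""
    heads = [c for c in children if c.role == "Head"]
    if len(heads) != 1:
        raise ValueError("a phrase needs exactly one Head")
    return heads[0]


def make_phrase(role: str, children: List[Node]) -> Node:
    """Build a phrase whose category is determined by its Head."""
    head = find_head(children)
    return Node(role=role, category=PROJECTION[head.category], children=children)


# A verb Head with one argument: the phrase is a VP because its Head is a V.
obj = make_phrase("goal", [Node("Head", "N", "apples")])
vp = make_phrase("root", [Node("Head", "V", "eats"), obj])
print(vp.category, [c.role for c in vp.children])
```

Because the category is computed from the Head rather than stored independently, a phrase without exactly one Head is rejected when it is built, which mirrors the idea that the Head defines the phrase.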
Shih-Min Li, Su-Chu Lin, Keh-Jiann Chen. 2005. "A
Probe into Ambiguities of Determinative-Measure Compounds",
The 17th ROCLING Conference on Computational Linguistics and
Speech Processing, September 15-16, 2005, National Cheng Kung
University, Tainan, Taiwan, ROC.
Li Shih-Min, Su-Chu Lin and Keh-Jiann Chen, "Feature
Representations and Logical Compatibility between Temporal
Adverbs and Aspects", International Journal of Computational
Linguistics & Chinese Language Processing, Vol. 10, No. 4.
Li Shih-Min, Su-Chu Lin, Keh-Jiann Chen. 2004. "Feature
Representations and Logical Compatibility between Temporal Adverbs
and Aspects", 5th Chinese Lexical Semantics Workshop (CLSW-5).
Singapore (June 14-16, 2004) & Genting Highland, Malaysia (June
17-19, 2004).
Lin Su-Chu, Shu-Ling Huang, Keh-Jiann Chen. 2004.
"Taxonomy of Fine-grain Semantic Roles for Nominal Modifiers",
5th Chinese Lexical Semantics Workshop (CLSW-5). Singapore (June
14-16, 2004) & Genting Highland, Malaysia (June 17-19, 2004).
You Jia-Ming, Keh-Jiann Chen,
Semantic Role Assignment for a Tree Structure", Proceedings
of SIGHAN workshop.
Chen Keh-Jiann, Yu-Ming Hsieh,
Treebanks and Grammar Extraction", Proceedings of IJCNLP-04,
Chen Keh-Jiann, Chu-Ren Huang,
Feng-Yi Chen, Chi-Ching Luo, Ming-Chung Chang, Chao-Jan Chen, and
Zhao-Ming Gao. 2003. "Sinica Treebank: Design Criteria,
Representational Issues and Implementation". In Anne Abeillé
(Ed.), Treebanks: Building and Using Parsed Corpora. Language and
Speech series. Dordrecht: Kluwer, pp. 231-248.
Huang Chu-Ren, Feng-Yi Chen, Keh-Jiann Chen, Zhao-Ming Gao and
Kuang-Yu Chen. 2000. "Sinica Treebank: Design Criteria,
Annotation Guidelines, and On-line Interface", Proceedings of 2nd
Chinese Language Processing Workshop (held in conjunction with
the 38th Annual Meeting of the Association for Computational
Linguistics, ACL-2000), pp. 29-37. October 7, 2000, Hong Kong.
Chen Keh-Jiann, et al. 1999. "The
CKIP Chinese Treebank: Guidelines for Annotation", ATALA
Workshop - Treebanks, Paris, June 18-19, 1999, pp. 85-96.
Feng-Yi Chen, Pi-Fang Tsai,
Keh-Jiann Chen, Chu-Ren Huang. 1999.
"The Construction of Sinica Treebank", Computational
Linguistics and Chinese Language Processing, vol. 4, No. 2.
Chen Keh-Jiann, Chu-Ren Huang,
Li-Ping Chang, Hui-Li Hsu. 1996. "Sinica Corpus: Design
Methodology for Balanced Corpora", Proceedings of the 11th
Pacific Asia Conference on Language, Information, and
Computation (PACLIC 11), Seoul, Korea, pp. 167-176.
Chen Keh-Jiann. 1996. "A
Model for Robust Chinese Parser", Computational Linguistics
and Chinese Language Processing, vol. 1, No. 1, pp. 183-204.
Chen Keh-Jiann, Chu-Ren Huang.
1994. "Features Constraints in Chinese Language Parsing",
Proceedings of ICCPOL '94, pp. 223-228.
Chen Keh-Jiann. 1992. "Design
Concepts for Chinese Parsers", 3rd International Conference on
Chinese Information Processing, pp. 1-22.