
The Basic Framework of the UPenn Treebank

1. Part-of-speech tags

 

AD //adverbs
AS //aspect marker
BA //把 in ba-construction
CC //coordinating conjunction
CD //cardinal numbers
CS //subordinating conjunction
DEC //的 in relative clauses etc.
DEG //associative 的
DER //得 in V-de construction and V-de-R
DEV //地 as the head of DVP
DT //determiner
ETC //tag for 等 and 等等 in coordination phrases
FW //foreign words
IJ //interjection
JJ //noun modifier other than nouns
LB //被 in long bei-construction
LC //localizer
M //measure word (including classifiers)
MSP //some particles
NN //common nouns
NR //proper nouns
NT //temporal nouns
OD //ordinal numbers
ON //onomatopoeia
P //prepositions (excluding 把 and 被)
PN //pronouns
PU //punctuation
SB //被 in short bei-construction
SP //sentence-final particle
VA //predicative adjective
VC //copula 是
VE //有 as the main verb
VV //other verbs

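Notable features:
(1) Long 被 (LB) is distinguished from short 被 (SB).
(2) There is a dedicated FW tag for foreign words.
(3) Verbs are divided fairly finely: VC and VE have their own tags.
(4) There is an ON tag for onomatopoeia.
(5) DEG and DEC are kept separate.
(6) BA has no formal connection to P. The tags do not reflect any hierarchy.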
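Because the tag names themselves carry no hierarchy, any coarser grouping has to be supplied by the user. The following is a minimal sketch of one way to do that. The group names (`VERB`, `NOUN`, `DE`, `BEI`, `NUM`), the `word_TAG` token format, and the example sentence are illustrative assumptions, not part of the treebank.

```python
# Hypothetical coarse groups layered on top of the flat tagset.
# The group names are made up for illustration.
COARSE_GROUPS = {
    "VERB": {"VA", "VC", "VE", "VV"},
    "NOUN": {"NN", "NR", "NT"},
    "DE": {"DEC", "DEG", "DER", "DEV"},
    "BEI": {"LB", "SB"},
    "NUM": {"CD", "OD"},
}


def coarse_group(tag: str) -> str:
    """Return the coarse group of a tag, or the tag itself if ungrouped."""
    for group, members in COARSE_GROUPS.items():
        if tag in members:
            return group
    return tag


def parse_tagged(text: str) -> list[tuple[str, str]]:
    """Split whitespace-separated 'word_TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in text.split():
        word, tag = token.rsplit("_", 1)  # split on the last underscore only
        pairs.append((word, tag))
    return pairs


# Hand-tagged example sentence (assumed, not taken from the treebank).
SENTENCE = "他_PN 被_LB 老师_NN 批评_VV 过_AS 。_PU"
for word, tag in parse_tagged(SENTENCE):
    print(word, tag, coarse_group(tag))
```

Here BA and P stay in separate groups, mirroring feature (6): nothing in the tag names links them, so a grouping like this is the only place such a relation could be recorded.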

 

2. Phrasal syntactic tags

ADJP //adjective phrase
ADVP //adverbial phrase headed by AD (adverb)
CLP //classifier phrase
CP //clause headed by C (complementizer)
DNP //phrase formed by "XP + DEG"
DP //determiner phrase
DVP //phrase formed by "XP + DEV"
FRAG //fragment
IP //simple clause headed by I (INFL)
LCP //phrase formed by "XP + LC"
LST //list marker
NP //noun phrase
PP //prepositional phrase
PRN //parenthetical
QP //quantifier phrase
UCP //unidentical coordination phrase
VP //verb phrase
VCD //coordinated verb compound
VCP //verb compounds formed by VV + VC
VNV //verb compounds formed by A-not-A or A-one-A
VPT //potential form V-de-R or V-bu-R
VRD //verb resultative compound
VSB //verb compounds formed by a modifier + a head

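Notable features:
(1) Verb compounds have their own labels (VCD, VCP, VNV, VPT, VRD, VSB).
(2) FRAG marks fragments.
(3) LST marks list markers (for example, sentences that begin with a dash); it can signal discourse coherence, parallel sentences, or multi-item coordination.
(4) CP and IP correspond to ordinary clauses and sentences.
(5) PRN and UCP are included.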
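To show how these phrase labels sit above the part-of-speech tags in a bracketed tree, here is a minimal sketch of a bracket reader written from scratch. The sample bracketing is hand-built and simplified for illustration (it is not taken from the treebank), and the function names are hypothetical.

```python
import re


def tokenize(s: str) -> list[str]:
    """Split a bracketed string into '(', ')' and label/word tokens."""
    return re.findall(r"\(|\)|[^\s()]+", s)


def parse(tokens: list[str], pos: int = 0):
    """Parse one '(LABEL child ...)' node starting at tokens[pos].

    Returns ((label, children), next_pos). A child is either a nested
    node (a tuple) or a plain string (a word).
    """
    assert tokens[pos] == "("
    label = tokens[pos + 1]
    pos += 2
    children = []
    while tokens[pos] != ")":
        if tokens[pos] == "(":
            child, pos = parse(tokens, pos)
            children.append(child)
        else:
            children.append(tokens[pos])
            pos += 1
    return (label, children), pos + 1


def phrase_labels(node) -> list[str]:
    """Collect, in preorder, the labels of nodes that dominate other nodes."""
    label, children = node
    found = []
    if any(isinstance(c, tuple) for c in children):
        found.append(label)
    for c in children:
        if isinstance(c, tuple):
            found.extend(phrase_labels(c))
    return found


# Hand-built, simplified example (assumed).
SAMPLE = "(IP (NP (PN 他)) (VP (ADVP (AD 已经)) (VP (VV 去) (AS 过) (NP (NR 北京)))) (PU 。))"
tree, _ = parse(tokenize(SAMPLE))
print(phrase_labels(tree))
```

Nodes whose only child is a word (PN, AD, VV, and so on) are treated as part-of-speech nodes and skipped, so only phrase-level labels such as IP, NP, VP and ADVP are collected.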

3. Phrasal function tags

ADV //adverbial
APP //appositive
BNF //beneficiary
CND //condition
DIR //direction
EXT //extent
FOC //focus
HLN //headline
IJ //interjective
IMP //imperative
IO //indirect object
LGS //logical subject
LOC //locative
MNR //manner
OBJ //direct object
PN //proper names
PRD //predicate
PRP //purpose or reason
Q //question
SBJ //subject
SHORT //short form
TMP //temporal
TPC //topic
TTL //title
VOC //vocative
WH //wh-phrase

*OP* //operator
*pro* //dropped argument
*PRO* //used in control structures
*RNR* //right node raising
*T* //trace of A'-movement
* //trace of A-movement
*?* //other unknown empty categories

Notable features:
(1) Seven empty-category markers are defined.
(2) The semantic-category and function tags are detailed.
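The sketch below shows one way to separate a node label into its phrase category and function tags, and to recognize the seven empty-category markers. It assumes that function tags are attached to the category with hyphens (as in `NP-SBJ`), that a trailing number is a coindexation index (as in `*T*-1`), and that empty elements sit under a `-NONE-` node; these conventions and the function names are assumptions for illustration and should be checked against the treebank files actually in use.

```python
import re

# Function tags and empty-category markers as listed in this section.
FUNCTION_TAGS = {
    "ADV", "APP", "BNF", "CND", "DIR", "EXT", "FOC", "HLN", "IJ", "IMP",
    "IO", "LGS", "LOC", "MNR", "OBJ", "PN", "PRD", "PRP", "Q", "SBJ",
    "SHORT", "TMP", "TPC", "TTL", "VOC", "WH",
}
EMPTY_CATEGORIES = {"*OP*", "*pro*", "*PRO*", "*RNR*", "*T*", "*", "*?*"}


def split_label(label: str):
    """Split e.g. 'NP-PN-SBJ-1' into ('NP', ['PN', 'SBJ'], '1').

    Assumes hyphen-separated function tags and an optional numeric index.
    """
    if label == "-NONE-":  # assumed label of empty-element nodes
        return label, [], None
    parts = label.split("-")
    category = parts[0]
    functions = [p for p in parts[1:] if p in FUNCTION_TAGS]
    indices = [p for p in parts[1:] if p.isdigit()]
    return category, functions, (indices[0] if indices else None)


def empty_category(token: str):
    """Return the empty-category marker of a leaf token, or None for a word."""
    base = re.sub(r"-\d+$", "", token)  # drop an assumed coindexation suffix
    return base if base in EMPTY_CATEGORIES else None


print(split_label("NP-PN-SBJ-1"))
print(empty_category("*T*-1"))
```

Keeping the two tag inventories as plain sets makes it easy to verify the count stated above: `len(EMPTY_CATEGORIES)` is 7.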

 
