CCG supertagging
Combinatory Categorial Grammar (CCG; Steedman, 2000) is a highly lexicalized formalism. The standard parsing model of Clark and Curran (2007) uses over 400 lexical categories (or supertags), compared to about 50 part-of-speech tags for typical parsers.
Example:
Vinken | , | 61 | years | old |
---|---|---|---|---|
N | , | N/N | N | (S[adj]\ NP)\ NP |
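
To make the category notation in the example concrete, here is a minimal sketch of how a slash category consumes an argument. The `Functor` class and the `forward_apply` / `backward_apply` helpers are hypothetical names introduced for illustration only; they are not taken from any CCG toolkit. The sketch covers function application only, and it hands `old` an `NP` argument directly, skipping any step that would turn the `N` of "61 years" into an `NP`.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Functor:
    """A complex category: result, slash direction, argument (hypothetical helper)."""
    result: "Category"
    slash: str  # "/" takes its argument on the right, "\\" takes it on the left
    arg: "Category"


# Atomic categories are plain strings such as "N", "NP" or "S[adj]".
Category = Union[str, Functor]


def forward_apply(left: Category, right: Category) -> Optional[Category]:
    # Forward application: X/Y followed by Y yields X.
    if isinstance(left, Functor) and left.slash == "/" and left.arg == right:
        return left.result
    return None


def backward_apply(left: Category, right: Category) -> Optional[Category]:
    # Backward application: Y followed by X\Y yields X.
    if isinstance(right, Functor) and right.slash == "\\" and right.arg == left:
        return right.result
    return None


# "61" has category N/N and "years" has category N, so "61 years" is an N.
n_over_n = Functor("N", "/", "N")
print(forward_apply(n_over_n, "N"))  # N

# "old" has category (S[adj]\NP)\NP: given an NP on its left it yields S[adj]\NP.
old = Functor(Functor("S[adj]", "\\", "NP"), "\\", "NP")
print(backward_apply("NP", old) == Functor("S[adj]", "\\", "NP"))  # True
```

A supertagger only has to predict these categories per token; combining them is left to the parser.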
CCGBank
The CCGBank is a corpus of CCG derivations and dependency structures extracted from the Penn Treebank by Hockenmaier and Steedman (2007). Sections 02-21 are used for training, section 00 for development, and section 23 as the in-domain test set. Performance is calculated only on the 425 most frequent labels. Models are evaluated by per-token accuracy.
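
The evaluation described above can be sketched as follows. This is a minimal illustration, not an official scoring script: the function names `frequent_labels` and `supertag_accuracy` are hypothetical, and the sketch assumes one particular reading of the protocol, namely that tokens whose gold label falls outside the frequent-label set are skipped when computing accuracy.

```python
from collections import Counter


def frequent_labels(train_tags, k=425):
    """Return the k most frequent supertags in the training data.

    train_tags is a list of sentences, each a list of supertag strings.
    """
    counts = Counter(tag for sentence in train_tags for tag in sentence)
    return {tag for tag, _ in counts.most_common(k)}


def supertag_accuracy(gold, pred, labels):
    """Per-token accuracy (in percent) over tokens whose gold tag is in labels.

    Assumption: tokens with a gold tag outside the label set are ignored.
    """
    correct = total = 0
    for gold_sentence, pred_sentence in zip(gold, pred):
        for gold_tag, pred_tag in zip(gold_sentence, pred_sentence):
            if gold_tag not in labels:
                continue
            total += 1
            correct += int(gold_tag == pred_tag)
    return 100.0 * correct / total if total else 0.0


# Toy data built from the example sentence; one of five tags is wrong.
gold = [["N", ",", "N/N", "N", "(S[adj]\\NP)\\NP"]]
pred = [["N", ",", "N", "N", "(S[adj]\\NP)\\NP"]]
labels = frequent_labels(gold)
print(supertag_accuracy(gold, pred, labels))  # 80.0
```

In practice the label set would be computed once from the training sections and reused for the development and test sections.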
Model | Accuracy | Paper / Source | Code |
---|---|---|---|
Lewis et al. (2016) | 94.7 | LSTM CCG Parsing | |
Vaswani et al. (2016) | 94.24 | Supertagging with LSTMs | |
Low supervision by Søgaard and Goldberg (2016) | 93.26 | Deep multi-task learning with low level tasks supervised at lower layers | |
Xu et al. (2015) | 93.0 | CCG Supertagging with a Recurrent Neural Network | |