The following tables specify the POS tagging accuracy of the TCEECE sample. Accuracy rates are given for the C7 and C5 tagsets, by individual tag and by tag pair.
Tables 1 and 3 specify the accuracy by individual tags for C7 and C5, respectively. They include accuracy measures for all POS tags that appear in the sample. For each tag, column (a) contains the name of the tag, (b) the total number of tokens assigned tag (a) by CLAWS (‘selected assignments’), (c) the total number of tokens assigned tag (a) by us (‘relevant assignments’), (d) the number of tokens on which we agree with CLAWS that the tag is (a) (‘true assignments’), (e) ‘precision’ (d / b) and (f) ‘recall’ (d / c).
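As an illustration of how columns (e) and (f) follow from columns (b) to (d), here is a minimal Python sketch. The function names `precision_recall` and `as_percent` are hypothetical and not part of the original tooling. The sketch assumes that a measure is undefined, and shown as an empty cell, when its denominator is zero.

```python
from typing import Optional, Tuple


def precision_recall(selected: int, relevant: int, true: int) -> Tuple[Optional[float], Optional[float]]:
    """Return (precision, recall) as fractions.

    selected: column (b), tokens given the tag by CLAWS
    relevant: column (c), tokens given the tag by the annotators
    true:     column (d), tokens on which both agree
    A measure is None when its denominator is zero (assumption of this sketch).
    """
    precision = true / selected if selected else None
    recall = true / relevant if relevant else None
    return precision, recall


def as_percent(value: Optional[float]) -> str:
    """Format a fraction in the style of the tables; empty string if undefined."""
    return "" if value is None else f"{100 * value:.1f} %"


# The C7 row for CS: 88 selected, 91 relevant, 71 true assignments.
p, r = precision_recall(88, 91, 71)
print(as_percent(p), as_percent(r))  # 80.7 % 78.0 %
```

A tag that CLAWS never assigns (b = 0) therefore has no precision value, and a tag that the annotators never assign (c = 0) has no recall value.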
Tables 2 and 4 specify the accuracy by pairs of incorrect and correct tags, included in columns (a) and (b), respectively. For each pair, column (c) contains the total number of the occurrences of (a). Column (d) contains the number of the occurrences of (a) that should have been (b), and column (e) contains the percentage (d / c). The value ‘excl’ in column (b) means occurrences that could not be given a correct tag because of incorrect tokenisation. Pairs for which (d) is less than 3 have been omitted.
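The pair counts can be derived from a token-by-token comparison of the automatic and corrected tags. The following Python sketch shows one way to do this; the function `confusion_pairs`, its parameter names and the input format (a list of `(claws_tag, correct_tag)` tuples) are assumptions made for illustration, not a description of the procedure actually used.

```python
from collections import Counter
from typing import List, Tuple


def confusion_pairs(assignments: List[Tuple[str, str]], min_count: int = 3):
    """Build rows (a, b, c, d, e) for pairs of incorrect and correct tags.

    assignments: one (claws_tag, correct_tag) tuple per token; correct_tag
                 may be 'excl' for tokens affected by incorrect tokenisation.
    min_count:   pairs with fewer than this many occurrences are omitted.
    """
    # Column (c): how often CLAWS assigned each tag in total.
    totals = Counter(claws for claws, _ in assignments)
    # Column (d): how often each (incorrect, correct) pair occurred.
    errors = Counter((claws, gold) for claws, gold in assignments if claws != gold)

    rows = []
    for (claws, gold), count in errors.items():
        if count >= min_count:
            rows.append((claws, gold, totals[claws], count, 100 * count / totals[claws]))
    # Sort by incorrect tag, then by descending pair count.
    rows.sort(key=lambda row: (row[0], -row[3]))
    return rows


# Toy input reproducing the ':' row of the C7 pair table: 7 tokens tagged ':',
# of which 4 could not be given a correct tag ('excl').
sample = [(":", "excl")] * 4 + [(":", ":")] * 3
for a, b, c, d, e in confusion_pairs(sample):
    print(f"{a} | {b} | {c} | {d} | {e:.1f} % |")  # : | excl | 7 | 4 | 57.1 % |
```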
Table 1. Accuracy by individual tags (C7).

(a) Tag | (b) Selected assignments | (c) Relevant assignments | (d) True assignments | (e) Precision (d / b) | (f) Recall (d / c) |
---|---|---|---|---|---|
- | 95 | 95 | 95 | 100.0 % | 100.0 % |
, | 273 | 273 | 273 | 100.0 % | 100.0 % |
; | 32 | 32 | 32 | 100.0 % | 100.0 % |
: | 7 | 3 | 3 | 42.9 % | 100.0 % |
! | 9 | 9 | 9 | 100.0 % | 100.0 % |
? | 5 | 5 | 5 | 100.0 % | 100.0 % |
. | 135 | 131 | 131 | 97.0 % | 100.0 % |
( | 7 | 7 | 7 | 100.0 % | 100.0 % |
) | 7 | 7 | 7 | 100.0 % | 100.0 % |
APPGE | 177 | 176 | 176 | 99.4 % | 100.0 % |
AT | 184 | 183 | 183 | 99.5 % | 100.0 % |
AT1 | 82 | 80 | 79 | 96.3 % | 98.8 % |
BCL | 2 | 2 | 2 | 100.0 % | 100.0 % |
CC | 177 | 178 | 177 | 100.0 % | 99.4 % |
CCB | 40 | 36 | 36 | 90.0 % | 100.0 % |
CS | 88 | 91 | 71 | 80.7 % | 78.0 % |
CSA | 19 | 22 | 18 | 94.7 % | 81.8 % |
CSN | 8 | 10 | 8 | 100.0 % | 80.0 % |
CST | 48 | 52 | 45 | 93.8 % | 86.5 % |
CSW | 4 | 4 | 4 | 100.0 % | 100.0 % |
DA | 18 | 18 | 18 | 100.0 % | 100.0 % |
DA1 | 8 | 8 | 5 | 62.5 % | 62.5 % |
DA2 | 12 | 12 | 12 | 100.0 % | 100.0 % |
DAR | 11 | 6 | 5 | 45.5 % | 83.3 % |
DAT | 3 | 2 | 2 | 66.7 % | 100.0 % |
DB | 21 | 20 | 19 | 90.5 % | 95.0 % |
DB2 | 2 | 2 | 2 | 100.0 % | 100.0 % |
DD | 25 | 24 | 24 | 96.0 % | 100.0 % |
DD1 | 47 | 45 | 40 | 85.1 % | 88.9 % |
DD2 | 5 | 4 | 4 | 80.0 % | 100.0 % |
DDQ | 27 | 27 | 27 | 100.0 % | 100.0 % |
EX | 11 | 11 | 11 | 100.0 % | 100.0 % |
FO | 6 | 6 | 6 | 100.0 % | 100.0 % |
FW | 0 | 2 | 0 | | 0.0 % |
GE | 23 | 23 | 22 | 95.7 % | 95.7 % |
IF | 60 | 48 | 47 | 78.3 % | 97.9 % |
II | 336 | 340 | 325 | 96.7 % | 95.6 % |
IO | 96 | 96 | 96 | 100.0 % | 100.0 % |
IW | 43 | 43 | 43 | 100.0 % | 100.0 % |
JJ | 230 | 234 | 208 | 90.4 % | 88.9 % |
JJR | 6 | 7 | 6 | 100.0 % | 85.7 % |
JJT | 19 | 20 | 19 | 100.0 % | 95.0 % |
JK | 4 | 4 | 4 | 100.0 % | 100.0 % |
MC | 62 | 60 | 60 | 96.8 % | 100.0 % |
MC1 | 12 | 7 | 7 | 58.3 % | 100.0 % |
MCMC | 1 | 1 | 1 | 100.0 % | 100.0 % |
MD | 28 | 28 | 27 | 96.4 % | 96.4 % |
ND1 | 1 | 1 | 1 | 100.0 % | 100.0 % |
NN | 5 | 4 | 4 | 80.0 % | 100.0 % |
NN1 | 517 | 499 | 474 | 91.7 % | 95.0 % |
NN2 | 129 | 122 | 119 | 92.2 % | 97.5 % |
NNA | 0 | 2 | 0 | | 0.0 % |
NNB | 68 | 75 | 67 | 98.5 % | 89.3 % |
NNL1 | 5 | 7 | 5 | 100.0 % | 71.4 % |
NNT1 | 38 | 38 | 37 | 97.4 % | 97.4 % |
NNT2 | 13 | 13 | 13 | 100.0 % | 100.0 % |
NNU | 9 | 8 | 3 | 33.3 % | 37.5 % |
NP1 | 224 | 222 | 201 | 89.7 % | 90.5 % |
NP2 | 4 | 5 | 2 | 50.0 % | 40.0 % |
NPD1 | 6 | 5 | 5 | 83.3 % | 100.0 % |
NPM1 | 13 | 16 | 13 | 100.0 % | 81.3 % |
PN | 4 | 4 | 4 | 100.0 % | 100.0 % |
PN1 | 27 | 31 | 27 | 100.0 % | 87.1 % |
PNQS | 14 | 15 | 14 | 100.0 % | 93.3 % |
PNX1 | 1 | 1 | 1 | 100.0 % | 100.0 % |
PPGE | 11 | 12 | 11 | 100.0 % | 91.7 % |
PPH1 | 86 | 84 | 84 | 97.7 % | 100.0 % |
PPHO1 | 21 | 21 | 21 | 100.0 % | 100.0 % |
PPHO2 | 14 | 14 | 14 | 100.0 % | 100.0 % |
PPHS1 | 44 | 44 | 44 | 100.0 % | 100.0 % |
PPHS2 | 17 | 17 | 17 | 100.0 % | 100.0 % |
PPIO1 | 53 | 52 | 52 | 98.1 % | 100.0 % |
PPIO2 | 4 | 4 | 4 | 100.0 % | 100.0 % |
PPIS1 | 211 | 213 | 211 | 100.0 % | 99.1 % |
PPIS2 | 24 | 24 | 24 | 100.0 % | 100.0 % |
PPX1 | 9 | 9 | 9 | 100.0 % | 100.0 % |
PPX2 | 1 | 1 | 1 | 100.0 % | 100.0 % |
PPY | 98 | 98 | 98 | 100.0 % | 100.0 % |
RA | 13 | 14 | 13 | 100.0 % | 92.9 % |
RG | 51 | 50 | 47 | 92.2 % | 94.0 % |
RGQ | 2 | 3 | 2 | 100.0 % | 66.7 % |
RGR | 1 | 2 | 1 | 100.0 % | 50.0 % |
RGT | 19 | 18 | 18 | 94.7 % | 100.0 % |
RL | 17 | 14 | 13 | 76.5 % | 92.9 % |
RP | 27 | 26 | 25 | 92.6 % | 96.2 % |
RR | 188 | 213 | 182 | 96.8 % | 85.4 % |
RRQ | 14 | 19 | 13 | 92.9 % | 68.4 % |
RRQV | 3 | 0 | 0 | 0.0 % | |
RRR | 12 | 15 | 10 | 83.3 % | 66.7 % |
RRT | 2 | 3 | 1 | 50.0 % | 33.3 % |
RT | 36 | 35 | 33 | 91.7 % | 94.3 % |
TO | 106 | 108 | 106 | 100.0 % | 98.1 % |
UH | 2 | 1 | 1 | 50.0 % | 100.0 % |
VB0 | 1 | 5 | 1 | 100.0 % | 20.0 % |
VBDR | 2 | 2 | 2 | 100.0 % | 100.0 % |
VBDZ | 24 | 24 | 24 | 100.0 % | 100.0 % |
VBG | 7 | 7 | 7 | 100.0 % | 100.0 % |
VBI | 61 | 58 | 57 | 93.4 % | 98.3 % |
VBM | 29 | 29 | 29 | 100.0 % | 100.0 % |
VBN | 15 | 15 | 15 | 100.0 % | 100.0 % |
VBR | 16 | 16 | 16 | 100.0 % | 100.0 % |
VBZ | 57 | 53 | 53 | 93.0 % | 100.0 % |
VD0 | 13 | 13 | 13 | 100.0 % | 100.0 % |
VDD | 6 | 6 | 6 | 100.0 % | 100.0 % |
VDG | 1 | 1 | 1 | 100.0 % | 100.0 % |
VDI | 9 | 9 | 9 | 100.0 % | 100.0 % |
VDN | 2 | 2 | 2 | 100.0 % | 100.0 % |
VDZ | 9 | 9 | 9 | 100.0 % | 100.0 % |
VH0 | 43 | 43 | 43 | 100.0 % | 100.0 % |
VHD | 15 | 14 | 14 | 93.3 % | 100.0 % |
VHG | 4 | 4 | 4 | 100.0 % | 100.0 % |
VHI | 20 | 20 | 20 | 100.0 % | 100.0 % |
VHN | 6 | 7 | 6 | 100.0 % | 85.7 % |
VHZ | 23 | 23 | 23 | 100.0 % | 100.0 % |
VM | 179 | 180 | 178 | 99.4 % | 98.9 % |
VMK | 1 | 1 | 1 | 100.0 % | 100.0 % |
VV0 | 156 | 113 | 111 | 71.2 % | 98.2 % |
VVD | 52 | 48 | 44 | 84.6 % | 91.7 % |
VVG | 43 | 45 | 41 | 95.3 % | 91.1 % |
VVI | 208 | 242 | 208 | 100.0 % | 86.0 % |
VVN | 104 | 99 | 92 | 88.5 % | 92.9 % |
VVZ | 34 | 35 | 34 | 100.0 % | 97.1 % |
XX | 57 | 58 | 57 | 100.0 % | 98.3 % |
ZZ1 | 13 | 0 | 0 | 0.0 % | |

Table 2. Accuracy by pairs of incorrect and correct tags (C7).

(a) Incorrect tag | (b) Correct tag | (c) Total count of (a) | (d) Count of (a) that should have been (b) | (e) Percentage (d / c) |
---|---|---|---|---|
: | excl | 7 | 4 | 57.1 % |
. | excl | 135 | 4 | 3.0 % |
CCB | II | 40 | 3 | 7.5 % |
CS | RRQ | 88 | 6 | 6.8 % |
CS | II | 88 | 3 | 3.4 % |
CS | RR | 88 | 3 | 3.4 % |
CST | DD1 | 48 | 3 | 6.3 % |
DA1 | RR | 8 | 3 | 37.5 % |
DAR | RRR | 11 | 5 | 45.5 % |
DD1 | CST | 47 | 7 | 14.9 % |
IF | CS | 60 | 13 | 21.7 % |
II | CSA | 336 | 3 | 0.9 % |
JJ | RR | 230 | 8 | 3.5 % |
JJ | NP1 | 230 | 3 | 1.3 % |
MC1 | PN1 | 12 | 4 | 33.3 % |
NN1 | NP1 | 517 | 8 | 1.5 % |
NN1 | JJ | 517 | 6 | 1.2 % |
NN1 | NNB | 517 | 6 | 1.2 % |
NN1 | VVI | 517 | 4 | 0.8 % |
NN1 | RR | 517 | 3 | 0.6 % |
NN1 | VVG | 517 | 3 | 0.6 % |
NN2 | excl | 129 | 6 | 4.7 % |
NNU | excl | 9 | 5 | 55.6 % |
NP1 | NN1 | 224 | 15 | 6.7 % |
RRQV | RR | 3 | 3 | 100.0 % |
VBI | VB0 | 61 | 4 | 6.6 % |
VBZ | excl | 57 | 3 | 5.3 % |
VV0 | VVI | 156 | 27 | 17.3 % |
VV0 | NN1 | 156 | 6 | 3.8 % |
VV0 | RR | 156 | 4 | 2.6 % |
VVD | VVN | 52 | 5 | 9.6 % |
VVD | JJ | 52 | 3 | 5.8 % |
VVN | JJ | 104 | 9 | 8.7 % |
VVN | VVD | 104 | 3 | 2.9 % |
ZZ1 | NP1 | 13 | 6 | 46.2 % |
ZZ1 | NNU | 13 | 5 | 38.5 % |

Table 3. Accuracy by individual tags (C5).

(a) Tag | (b) Selected assignments | (c) Relevant assignments | (d) True assignments | (e) Precision (d / b) | (f) Recall (d / c) |
---|---|---|---|---|---|
AJ0 | 234 | 238 | 212 | 90.6 % | 89.1 % |
AJC | 6 | 7 | 6 | 100.0 % | 85.7 % |
AJS | 19 | 20 | 19 | 100.0 % | 95.0 % |
AT0 | 266 | 263 | 262 | 98.5 % | 99.6 % |
AV0 | 341 | 366 | 323 | 94.7 % | 88.3 % |
AVP | 27 | 26 | 25 | 92.6 % | 96.2 % |
AVQ | 19 | 22 | 16 | 84.2 % | 72.7 % |
CJC | 217 | 214 | 213 | 98.2 % | 99.5 % |
CJS | 119 | 127 | 102 | 85.7 % | 80.3 % |
CJT | 48 | 52 | 45 | 93.8 % | 86.5 % |
CRD | 75 | 68 | 68 | 90.7 % | 100.0 % |
DPS | 177 | 176 | 176 | 99.4 % | 100.0 % |
DT0 | 152 | 141 | 131 | 86.2 % | 92.9 % |
DTQ | 27 | 27 | 27 | 100.0 % | 100.0 % |
EX0 | 11 | 11 | 11 | 100.0 % | 100.0 % |
ITJ | 2 | 1 | 1 | 50.0 % | 100.0 % |
NN0 | 14 | 12 | 7 | 50.0 % | 58.3 % |
NN1 | 556 | 538 | 512 | 92.1 % | 95.2 % |
NN2 | 142 | 135 | 132 | 93.0 % | 97.8 % |
NP0 | 320 | 332 | 298 | 93.1 % | 89.8 % |
ORD | 28 | 28 | 27 | 96.4 % | 96.4 % |
PNI | 31 | 35 | 31 | 100.0 % | 88.6 % |
PNP | 583 | 583 | 580 | 99.5 % | 99.5 % |
PNQ | 14 | 15 | 14 | 100.0 % | 93.3 % |
PNX | 11 | 11 | 11 | 100.0 % | 100.0 % |
POS | 23 | 23 | 22 | 95.7 % | 95.7 % |
PRF | 96 | 96 | 96 | 100.0 % | 100.0 % |
PRP | 439 | 431 | 416 | 94.8 % | 96.5 % |
PUL | 7 | 7 | 7 | 100.0 % | 100.0 % |
PUN | 556 | 548 | 548 | 98.6 % | 100.0 % |
PUR | 7 | 7 | 7 | 100.0 % | 100.0 % |
TO0 | 106 | 108 | 106 | 100.0 % | 98.1 % |
UNC | 6 | 8 | 6 | 100.0 % | 75.0 % |
VBB | 46 | 50 | 46 | 100.0 % | 92.0 % |
VBD | 26 | 26 | 26 | 100.0 % | 100.0 % |
VBG | 7 | 7 | 7 | 100.0 % | 100.0 % |
VBI | 61 | 58 | 57 | 93.4 % | 98.3 % |
VBN | 15 | 15 | 15 | 100.0 % | 100.0 % |
VBZ | 57 | 53 | 53 | 93.0 % | 100.0 % |
VDB | 13 | 13 | 13 | 100.0 % | 100.0 % |
VDD | 6 | 6 | 6 | 100.0 % | 100.0 % |
VDG | 1 | 1 | 1 | 100.0 % | 100.0 % |
VDI | 9 | 9 | 9 | 100.0 % | 100.0 % |
VDN | 2 | 2 | 2 | 100.0 % | 100.0 % |
VDZ | 9 | 9 | 9 | 100.0 % | 100.0 % |
VHB | 43 | 43 | 43 | 100.0 % | 100.0 % |
VHD | 15 | 14 | 14 | 93.3 % | 100.0 % |
VHG | 4 | 4 | 4 | 100.0 % | 100.0 % |
VHI | 20 | 20 | 20 | 100.0 % | 100.0 % |
VHN | 6 | 7 | 6 | 100.0 % | 85.7 % |
VHZ | 23 | 23 | 23 | 100.0 % | 100.0 % |
VM0 | 180 | 181 | 179 | 99.4 % | 98.9 % |
VVB | 156 | 113 | 111 | 71.2 % | 98.2 % |
VVD | 52 | 48 | 44 | 84.6 % | 91.7 % |
VVG | 43 | 45 | 41 | 95.3 % | 91.1 % |
VVI | 208 | 242 | 208 | 100.0 % | 86.0 % |
VVN | 104 | 99 | 92 | 88.5 % | 92.9 % |
VVZ | 34 | 35 | 34 | 100.0 % | 97.1 % |
XX0 | 57 | 58 | 57 | 100.0 % | 98.3 % |
ZZ0 | 13 | 0 | 0 | 0.0 % | |

Table 4. Accuracy by pairs of incorrect and correct tags (C5).

(a) Incorrect tag | (b) Correct tag | (c) Total count of (a) | (d) Count of (a) that should have been (b) | (e) Percentage (d / c) |
---|---|---|---|---|
AJ0 | AV0 | 234 | 9 | 3.8 % |
AJ0 | NP0 | 234 | 3 | 1.3 % |
AV0 | AJ0 | 341 | 4 | 1.2 % |
AV0 | DT0 | 341 | 4 | 1.2 % |
AV0 | CJS | 341 | 3 | 0.9 % |
AVQ | AV0 | 19 | 3 | 15.8 % |
CJC | PRP | 217 | 3 | 1.4 % |
CJS | AV0 | 119 | 7 | 5.9 % |
CJS | AVQ | 119 | 6 | 5.0 % |
CJS | PRP | 119 | 3 | 2.5 % |
CJT | DT0 | 48 | 3 | 6.3 % |
CRD | PNI | 75 | 4 | 5.3 % |
DT0 | AV0 | 152 | 12 | 7.9 % |
DT0 | CJT | 152 | 7 | 4.6 % |
NN0 | excl | 14 | 5 | 35.7 % |
NN1 | NP0 | 556 | 18 | 3.2 % |
NN1 | AJ0 | 556 | 6 | 1.1 % |
NN1 | VVI | 556 | 5 | 0.9 % |
NN1 | AV0 | 556 | 3 | 0.5 % |
NN1 | VVG | 556 | 3 | 0.5 % |
NN2 | excl | 142 | 6 | 4.2 % |
NP0 | NN1 | 320 | 17 | 5.3 % |
NP0 | NN2 | 320 | 3 | 0.9 % |
PRP | CJS | 439 | 18 | 4.1 % |
PUN | excl | 556 | 8 | 1.4 % |
VBI | VBB | 61 | 4 | 6.6 % |
VBZ | excl | 57 | 3 | 5.3 % |
VVB | VVI | 156 | 27 | 17.3 % |
VVB | NN1 | 156 | 6 | 3.8 % |
VVB | AV0 | 156 | 5 | 3.2 % |
VVD | VVN | 52 | 5 | 9.6 % |
VVD | AJ0 | 52 | 3 | 5.8 % |
VVN | AJ0 | 104 | 9 | 8.7 % |
VVN | VVD | 104 | 3 | 2.9 % |
ZZ0 | NP0 | 13 | 6 | 46.2 % |
ZZ0 | NN0 | 13 | 5 | 38.5 % |