Error in Inter-Annotator Reliability Calculation

Hello,
I have a pre-segmented .eaf file that two annotators annotated separately using the same set of 5 codes. When I check agreement manually, 71% of the segments have the same code, so I would expect a fairly strong Cohen’s kappa.
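For reference, my expectation is based on the standard formula kappa = (p_o - p_e) / (1 - p_e). As a rough estimate, assuming the 5 codes are used about equally often, chance agreement p_e would be around 1/5 = 0.20, so p_o = 0.71 would give kappa ≈ (0.71 - 0.20) / (1 - 0.20) ≈ 0.64.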

However, when I run the calculation in ELAN, each of my colleague’s codes is for some reason counted twice. For example, we have a code named “3”, and in the annotation statistics I see two separate “3” codes, even though “3” appears to be typed correctly and consistently.
In short, I end up with a total of 5 codes while my colleague has 10, which leads to a very low inter-annotator agreement.
Do you have any idea why this happens?

Thank you,
Marta

Hello,

That is odd indeed, and I have no clue why it happens. I don’t think this is a common problem; I don’t recall similar reports, and I haven’t seen anything like it myself either. From your description I take it that you already checked that there isn’t occasionally a whitespace character before or after the code?
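If it helps to double-check outside ELAN, here is a minimal Python sketch that lists every distinct annotation value per tier, with surrounding whitespace and invisible characters made visible via repr(). It assumes the standard .eaf XML layout (ANNOTATION_VALUE elements inside TIER elements); the file name is just a placeholder.

import xml.etree.ElementTree as ET
from collections import Counter

# Placeholder file name; point this at one annotator's .eaf file.
tree = ET.parse("annotator2.eaf")

# Count each (tier, value) pair; repr() exposes hidden spaces, tabs,
# non-breaking spaces, and other invisible characters in the code.
counts = Counter()
for tier in tree.getroot().iter("TIER"):
    tier_id = tier.get("TIER_ID")
    for value in tier.iter("ANNOTATION_VALUE"):
        counts[(tier_id, repr(value.text or ""))] += 1

for (tier_id, code), n in sorted(counts.items()):
    print(f"{tier_id}: {code} x{n}")

If two entries such as '3' and '3 ' (or '3\xa0', a non-breaking space) show up for the same code, that would explain why they are counted as separate values.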

If it is possible to share two files (only the .eaf files) that illustrate the problem, I could have a look (han.sloetjes AT mpi.nl).

-Han