What exactly is the difference between Kappa ipf and Kappa excl? And which of these values should be reported in a thesis or a paper?
An answer to both questions can best be obtained by consulting the Holle & Rein publication that formed the basis of this functionality in ELAN (Holle & Rein, "EasyDIAg: A tool for easy determination of interrater agreement").
The issue mainly concerns how to deal with so-called unmatched annotations: annotations of rater X that have no (or insufficient) overlap with any annotation of rater Y. The raters' task may consist of segmentation and/or categorization, and the question is whether, and how, segmentation disagreements should be reflected in the reported agreement values.
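To make "insufficient overlap" concrete, here is a minimal sketch of one common way to decide whether two time-aligned annotations match: the ratio of their overlap to their combined extent must reach some threshold. The function names and the 0.6 threshold are illustrative assumptions, not necessarily what ELAN or EasyDIAg uses internally.

```python
# Hypothetical sketch: deciding whether two raters' annotations "match"
# based on temporal overlap. Intervals are (start, end) tuples in ms.

def overlap_ratio(a, b):
    """Return overlap length divided by the combined (union) extent."""
    overlap = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = max(a[1], b[1]) - min(a[0], b[0])
    return overlap / union if union > 0 else 0.0

def is_match(a, b, threshold=0.6):
    """Treat annotations as matched if their overlap ratio meets the
    threshold; otherwise they count as unmatched. The threshold value
    is an assumption for illustration."""
    return overlap_ratio(a, b) >= threshold

# Rater X annotates 0-100 ms, rater Y annotates 10-110 ms: large overlap.
print(is_match((0, 100), (10, 110)))   # → True
# Rater Y's annotation only half-overlaps: insufficient at this threshold.
print(is_match((0, 100), (50, 150)))   # → False
```

Annotations that fail such a check for every annotation of the other rater are the "unmatched" ones whose treatment distinguishes the two kappa variants.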
An ipf (iterative proportional fitting) algorithm is used when the unmatched annotations should be part of the agreement calculation; otherwise the "standard" kappa calculation is performed.
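For reference, the "standard" kappa mentioned above is Cohen's kappa computed over the matched annotation pairs only. The following is a minimal sketch of that calculation from a confusion matrix; the matrix values are made up for illustration, and this is not ELAN's implementation.

```python
# Sketch of standard Cohen's kappa over matched annotation pairs.
# matrix[i][j] = number of annotations that rater X put in category i
# and rater Y put in category j. Counts below are invented examples.

def cohens_kappa(matrix):
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: proportion of pairs on the diagonal.
    p_o = sum(matrix[i][i] for i in range(n)) / total
    # Expected (chance) agreement from the marginal proportions.
    row_marg = [sum(row) / total for row in matrix]
    col_marg = [sum(matrix[i][j] for i in range(n)) / total for j in range(n)]
    p_e = sum(r * c for r, c in zip(row_marg, col_marg))
    return (p_o - p_e) / (1 - p_e)

# Two raters, two categories, 50 matched annotation pairs.
matrix = [[20, 5],
          [10, 15]]
print(round(cohens_kappa(matrix), 3))  # → 0.4
```

The ipf variant differs in that the contingency table is first adjusted (via iterative proportional fitting) so that unmatched annotations contribute to the result, as described in the Holle & Rein paper.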
This is all discussed in much more detail (including when or why to report one or the other) in the publication mentioned.
Have you resolved this issue of which Kappa value should be reported in a paper? Thanks.
The question of whether and how interrater reliability should be reported in a paper is not an issue that can be solved by a technician. It is up to researchers and/or research communities to decide which methodology is accepted and promoted (or required).
ELAN provides a few alternatives for calculating agreement, and a few others are still on the wish list. It may well be that the method required by a particular community is not available in ELAN, in which case the data have to be exported to a different tool for that calculation.
Apologies if my previous reply was not so clear concerning this part of the initial question. The Holle & Rein paper discusses the question in more detail, but it is always up to the researchers to decide what applies to their situation.