Problem: annotations duplicated after export

I have a strange problem with a CSV file. The file contains the text transcripts of a dataset and was originally created in Elan by someone else.
It seems to be corrupted somehow: some of the annotations are duplicated, meaning that a correct annotation with the correct timestamps [e.g. 60647,00:30:25.9,00:30:44.5,yeah it’s great,8,5,T2] is often followed by a copy of itself with shifted timestamps [e.g. 60648,00:30:29.9,00:30:48.5,yeah it’s great,8,5,T2].

I would like to correct this manually by deleting the copies, but here is the problem: every time I export the file, some annotations are duplicated again!

To be clear, I follow the same steps I have used with other files without problems: import the transcripts as a CSV file in Elan, [optionally work on them, but the next step gives the same result regardless of any modification], and export them with default settings as tab-delimited text.
I also tried importing the transcripts, copying the annotations to new tiers and deleting the old tiers, but I got the same result.

Potentially useful info about these copied annotations:

  • They can be adjacent, or there can be a pause or even another annotation in between;
  • Sometimes they have the same length as the originals, other times they don’t;
  • Duplication after export is not limited to the lines already duplicated in the original file; it also affects other lines.
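
In case it helps anyone reproduce this, below is a minimal Python sketch of how suspected copies like the ones above could be flagged in the CSV. It is only an illustration under assumptions: the column order (id, start, end, text, two unknown columns, tier) is taken from the single example line above, the function names and the 10-second window (`max_gap`) are made up for this sketch, and a copy is assumed to share tier and text with an earlier annotation whose start time is close by. It does not compare durations, since the copies do not always have the same length as the originals.

```python
import csv
import io


def parse_time(stamp):
    """Convert an 'HH:MM:SS.f' timestamp to seconds."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def find_suspected_duplicates(rows, max_gap=10.0):
    """Return (original_id, copy_id) pairs.

    Two rows are paired when they share tier and text and their start
    times differ by at most max_gap seconds. Other rows may sit between
    them. The column positions are an assumption based on one example line.
    """
    last_seen = {}  # (tier, text) -> (annotation id, start time in seconds)
    pairs = []
    for row in rows:
        ann_id, start, text, tier = row[0], row[1], row[3], row[-1]
        key = (tier, text.strip())
        start_s = parse_time(start)
        if key in last_seen:
            prev_id, prev_start = last_seen[key]
            if abs(start_s - prev_start) <= max_gap:
                pairs.append((prev_id, ann_id))
        # Remember the most recent occurrence of this tier/text combination.
        last_seen[key] = (ann_id, start_s)
    return pairs


# Small inline sample; the first two lines mirror the example above,
# the third is invented for contrast.
sample = """60647,00:30:25.9,00:30:44.5,yeah it's great,8,5,T2
60648,00:30:29.9,00:30:48.5,yeah it's great,8,5,T2
60649,00:30:50.0,00:30:52.0,okay,8,5,T1
"""
rows = list(csv.reader(io.StringIO(sample)))
print(find_suspected_duplicates(rows))  # [('60647', '60648')]
```

Genuine repetitions by a speaker would also be flagged, so the output is a list of candidates to check by hand, not a list of lines to delete.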

Hello,

It is difficult to understand what is going on here without the data at hand.
First of all, it might be good to remember that duplication of annotations in tab-delimited export can be intentional, with the “Separate column for each tier” option and its sub-options. Whether or not an annotation is duplicated depends in that case on the existence of overlapping annotations on other tiers in the export. So, duplicated annotations don’t necessarily indicate corruption of a file (but maybe you know it is corrupted).
To be able to say more about what is going on, it would be helpful to have, for example, the original csv file and the imported, corrected eaf file available. If you want, you can send the samples to me (han.sloetjes AT mpi.nl).

-Han