Eaf files 'expire' after an extended time without opening

This is a problem I’ve had for years, but I never got around to posting here because I had a workaround that usually fixed it.

I use SayMore to manage and organize my ELAN files. Every now and then, I’ll attempt to open an ELAN file that I have not opened for several months and one of two things usually happens:

  1. The eaf file is intact, with all tiers, annotations, and their values still there, but it has ‘forgotten’ the mp4 media file, so I need to re-link it manually. It never forgets the master media file (which is always a WAV file). A mildly annoying problem with an easy fix.

  2. The more serious problem: ELAN opens, but there are no media files, no tiers, and no annotations. The menu bar works, but there’s nothing else.
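For the first case, it can help to look at which media links the eaf file actually stores. The following is a minimal sketch, assuming the usual EAF layout in which `MEDIA_DESCRIPTOR` elements sit inside `HEADER` and carry `MEDIA_URL`, `RELATIVE_MEDIA_URL`, and `MIME_TYPE` attributes. The function name `list_media_links` and the sample paths are made up for illustration.

```python
import xml.etree.ElementTree as ET


def list_media_links(eaf_root):
    """Return (MEDIA_URL, RELATIVE_MEDIA_URL, MIME_TYPE) for each linked media file."""
    header = eaf_root.find("HEADER")
    if header is None:
        return []
    return [
        (md.get("MEDIA_URL"), md.get("RELATIVE_MEDIA_URL"), md.get("MIME_TYPE"))
        for md in header.findall("MEDIA_DESCRIPTOR")
    ]


# Made-up header for illustration only. A real file would be loaded with
# ET.parse("path/to/file.eaf").getroot() instead.
SAMPLE = """<ANNOTATION_DOCUMENT>
  <HEADER MEDIA_FILE="" TIME_UNITS="milliseconds">
    <MEDIA_DESCRIPTOR MEDIA_URL="file:///D:/example/session.wav"
                      MIME_TYPE="audio/x-wav"
                      RELATIVE_MEDIA_URL="./session.wav"/>
    <PROPERTY NAME="lastUsedAnnotationId">12</PROPERTY>
  </HEADER>
</ANNOTATION_DOCUMENT>"""

links = list_media_links(ET.fromstring(SAMPLE))
for url, relative_url, mime_type in links:
    print(f"{mime_type}: {url} (relative: {relative_url})")
```

If only the WAV entry is listed, the mp4 link is missing from the file itself rather than merely failing to resolve on disk.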

At least some of the data is still in there, because I can see the transcription and translation tiers in SayMore. I can also open the eaf file with an editor like Notepad++ and see the annotations.

My usual workaround is to use the ‘merge transcription’ function and simply ‘peel’ the tiers and annotations out of the broken eaf file and place them into a new eaf file. That has usually worked, which is why I’ve refrained from posting until now. And if it didn’t, I could always use the ‘export transcription/translation’ function in SayMore to export the transcription and translation tiers as subtitles, create a new EAF file, and import the subtitle files.

This time, however, the EAF file I want to open has more than transcription and translation tiers.

So I tried the ‘merge transcription’ function, but it did not work. I got the following error, which I have not seen before:

I also viewed the log (pasted at the bottom).

I’m interested in A) recovering the additional tiers from this file, and B) figuring out why this keeps happening and what I can change in my corpus workflow to prevent it from happening again. I wonder if the problem is related to SayMore.

=============================================================================
LOG FILE:

Aug 15, 2022 11:25:32 AM mpi.eudico.client.annotator.ELAN main
INFO:

@ELAN Launched

Aug 15, 2022 11:25:32 AM mpi.eudico.client.annotator.ELAN main
INFO: ELAN 6.3
Java home: C:\Program Files\ELAN_6.3\runtime
Java version: 17.0.2
Runtime: 17.0.2+8-86
OS name: Windows 11
OS version: 10.0
OS arch.: amd64
User language: en
User home: C:\Users\axhri
User dir: C:\Program Files (x86)\SayMore
File encoding: Cp1252
Classpath: C:\Program Files\ELAN_6.3\app\elan-6.3.jar;C:\Program Files\ELAN_6.3\app\activation-1.1.1.jar;C:\Program Files\ELAN_6.3\app\annot-search-lib-1.7.jar;C:\Program Files\ELAN_6.3\app\annot-tools-1.3.jar;C:\Program Files\ELAN_6.3\app\annotation-schema-1.0.jar;C:\Program Files\ELAN_6.3\app\bridj-0.7.0.jar;C:\Program Files\ELAN_6.3\app\commons-codec-1.11.jar;C:\Program Files\ELAN_6.3\app\commons-logging-1.2.jar;C:\Program Files\ELAN_6.3\app\guk-0.7.jar;C:\Program Files\ELAN_6.3\app\help.zip;C:\Program Files\ELAN_6.3\app\hsqldb-2.3.4.jar;C:\Program Files\ELAN_6.3\app\httpclient-4.5.13.jar;C:\Program Files\ELAN_6.3\app\httpcore-4.4.14.jar;C:\Program Files\ELAN_6.3\app\hunspell-bridj-1.0.4.jar;C:\Program Files\ELAN_6.3\app\im\nl.mpi.gim__V04.jar;C:\Program Files\ELAN_6.3\app\im\nl.mpi.lookup.CJKV__V03.jar;C:\Program Files\ELAN_6.3\app\im\nl.mpi.lookup.IPA__V04.jar;C:\Program Files\ELAN_6.3\app\javafx-base-17.0.1-win.jar;C:\Program Files\ELAN_6.3\app\javafx-base-17.0.1.jar;C:\Program Files\ELAN_6.3\app\javafx-controls-17.0.1-win.jar;C:\Program Files\ELAN_6.3\app\javafx-controls-17.0.1.jar;C:\Program Files\ELAN_6.3\app\javafx-graphics-17.0.1-win.jar;C:\Program Files\ELAN_6.3\app\javafx-graphics-17.0.1.jar;C:\Program Files\ELAN_6.3\app\javafx-media-17.0.1-win.jar;C:\Program Files\ELAN_6.3\app\javafx-media-17.0.1.jar;C:\Program Files\ELAN_6.3\app\javafx-swing-17.0.1-win.jar;C:\Program Files\ELAN_6.3\app\javafx-swing-17.0.1.jar;C:\Program Files\ELAN_6.3\app\javax.activation-api-1.2.0.jar;C:\Program Files\ELAN_6.3\app\jaxb-api-2.3.1.jar;C:\Program Files\ELAN_6.3\app\jaxb-core.jar;C:\Program Files\ELAN_6.3\app\jaxb-impl.jar;C:\Program Files\ELAN_6.3\app\jhall-2.0.0.5.jar;C:\Program Files\ELAN_6.3\app\jlfgr-1.0.jar;C:\Program Files\ELAN_6.3\app\jna-5.4.0.jar;C:\Program Files\ELAN_6.3\app\jna-platform-5.4.0.jar;C:\Program Files\ELAN_6.3\app\json-20160212.jar;C:\Program Files\ELAN_6.3\app\lexan-api-1.1.jar;C:\Program Files\ELAN_6.3\app\lexiconcomponent-1.9.jar;C:\Program 
Files\ELAN_6.3\app\metadata-api-1.5.0.jar;C:\Program Files\ELAN_6.3\app\mfsearch-1.7.jar;C:\Program Files\ELAN_6.3\app\slf4j-api-1.7.5.jar;C:\Program Files\ELAN_6.3\app\staccato-1.0.0.jar;C:\Program Files\ELAN_6.3\app\vlcj-4.2.0.jar;C:\Program Files\ELAN_6.3\app\vlcj-natives-4.1.0.jar;C:\Program Files\ELAN_6.3\app\xalan-2.4.1.jar;C:\Program Files\ELAN_6.3\app\xercesImpl-2.11.0.jar;C:\Program Files\ELAN_6.3\app\xml-resolver-1.2.jar;C:\Program Files\ELAN_6.3\app\xmlbeans-2.6.0.jar
Library path: C:\Program Files\ELAN_6.3\app\nativelib
Display info:
Screen 1 - isDefault:true, w:1920, h:1080, bitDepth:32
Main screen resolution:120 (w:1536, h:864)

Aug 15, 2022 11:25:32 AM mpi.eudico.client.annotator.prefs.PreferencesReader parse
INFO: Reading preferences: C:\Users\axhri\.elan_data\elan.pfsx
External updater thread started
Aug 15, 2022 11:25:33 AM mpi.eudico.client.annotator.DesktopAppHandler setHandlers
INFO: The APP_ABOUT action is not supported on the current platform!
Aug 15, 2022 11:25:33 AM mpi.eudico.client.annotator.DesktopAppHandler setHandlers
INFO: The APP_QUIT_HANDLER action is not supported on the current platform!
Aug 15, 2022 11:25:33 AM mpi.eudico.client.annotator.DesktopAppHandler setHandlers
INFO: The APP_OPEN_FILE action is not supported on the current platform!
Aug 15, 2022 11:25:33 AM mpi.eudico.client.annotator.DesktopAppHandler setHandlers
INFO: The APP_PREFERENCES action is not supported on the current platform!
Aug 15, 2022 11:25:33 AM mpi.eudico.client.annotator.prefs.PreferencesReader parse
INFO: Preferences file does not exist: C:\Users\axhri\.elan_data\shortcuts.pfsx
Aug 15, 2022 11:25:33 AM mpi.eudico.client.annotator.commands.ShortcutsUtil readCurrentShortcuts
INFO: Could not load the keyboard shortcut preferences file. The file does not exist or is not valid.
Aug 15, 2022 11:25:35 AM java.util.prefs.WindowsPreferences <init>
WARNING: Could not open/create prefs root node Software\JavaSoft\Prefs at root 0xffffffff80000002. Windows RegCreateKeyEx(…) returned error code 5.
Aug 15, 2022 11:25:35 AM java.util.prefs.WindowsPreferences WindowsRegOpenKey1
WARNING: Trying to recreate Windows registry node Software\JavaSoft\Prefs at root 0xffffffff80000002.
Aug 15, 2022 11:25:35 AM java.util.prefs.WindowsPreferences openKey
WARNING: Could not open windows registry node Software\JavaSoft\Prefs at root 0xffffffff80000002. Windows RegOpenKey(…) returned error code 2.
Error: cvc-complex-type.2.4.a: Invalid content was found starting with element ‘MEDIA_DESCRIPTOR’. One of ‘{PROPERTY}’ is expected.
System id: file:///D:/SayMore/NPK/Sessions/qvz008/qvz008.WAV.annotations.eaf
Public id: null
Line: 6
Column: 72
Error: cvc-id.1: There is no ID/IDREF binding for IDREF ‘a10’.
System id: file:///D:/SayMore/NPK/Sessions/qvz008/qvz008.WAV.annotations.eaf
Public id: null
Line: 3879
Column: 23
java.lang.NullPointerException: Cannot invoke “mpi.eudico.server.corpora.clom.Annotation.getTier()” because “referedAnnotation” is null
at mpi.eudico.server.corpora.clomimpl.dobes.ACM28TranscriptionStore.loadTranscription(ACM28TranscriptionStore.java:701)
at mpi.eudico.server.corpora.clomimpl.abstr.TranscriptionImpl.initialize(TranscriptionImpl.java:222)
at mpi.eudico.server.corpora.clomimpl.abstr.TranscriptionImpl.<init>(TranscriptionImpl.java:154)
at mpi.eudico.client.annotator.ElanFrame2.openEAF(ElanFrame2.java:468)
at mpi.eudico.client.annotator.ElanFrame2$2.run(ElanFrame2.java:339)
at java.desktop/java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:318)
at java.desktop/java.awt.EventQueue.dispatchEventImpl(EventQueue.java:771)
at java.desktop/java.awt.EventQueue$4.run(EventQueue.java:722)
at java.desktop/java.awt.EventQueue$4.run(EventQueue.java:716)
at java.base/java.security.AccessController.doPrivileged(AccessController.java:399)
at java.base/java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:86)
at java.desktop/java.awt.EventQueue.dispatchEvent(EventQueue.java:741)
at java.desktop/java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:203)
at java.desktop/java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:124)
at java.desktop/java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:113)
at java.desktop/java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:109)
at java.desktop/java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:101)
at java.desktop/java.awt.EventDispatchThread.run(EventDispatchThread.java:90)

This is the result of the attempt to validate the file (pasted below).

I’ve started to delete the offending annotations with the IDREF binding issue, but I have no idea how many there are. I’ve deleted two so far and the problem persists. The ones I’ve seen so far are just translation annotations, so it’s not a big deal if I lose them, but I’m wondering if there’s a faster way to fix this than deleting one annotation and re-running the validation each time.

==============================================
Checking file: D:\SayMore\NPK\Sessions\qvz008\qvz008.WAV.annotations.eaf

++ Start of XML validation by SAXParser

Resolved schema to local resource: /mpi/eudico/resources/EAFv3.0.xsd

ERROR: cvc-complex-type.2.4.a: Invalid content was found starting with element ‘MEDIA_DESCRIPTOR’. One of ‘{PROPERTY}’ is expected.
Line: 6
Column: 72
ERROR: cvc-id.1: There is no ID/IDREF binding for IDREF ‘a10’.
Line: 3879
Column: 23
Received 0 warnings and 2 errors
-- End of XML validation by SAXParser

Checking contents of the EAF file
++ Start of Tier Types (a.k.a. Linguistic Types)
-- End of Tier Types
++ Start of Tiers
Checking tier: Transcription
Checking tier: Phrase Free Translation - EN
Checking tier: duration
Checking tier: hands
Checking tier: head
Checking tier: eyes
Checking tier: Subtitle-Tier
Checking tier: Subtitle-Tier-1
Checking tier: Phrase Free Translation
-- End of Tiers
++ Start of Controlled Vocabularies
-- End of Controlled Vocabularies
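The validator output above reports only the first unresolved reference (‘a10’), so fixing them one at a time means one validation run per annotation. As a possible shortcut, the sketch below lists every dangling reference in a single pass. It assumes the standard EAF element and attribute names (`TIER`, `ALIGNABLE_ANNOTATION`, `REF_ANNOTATION`, `ANNOTATION_ID`, `ANNOTATION_REF`). The function name `find_dangling_refs` and the sample document are made up for illustration, and the script is not part of ELAN or SayMore.

```python
import xml.etree.ElementTree as ET


def find_dangling_refs(eaf_root):
    """Return (tier_id, annotation_id, missing_ref) for every REF_ANNOTATION
    whose ANNOTATION_REF matches no ANNOTATION_ID in the document."""
    known_ids = {
        ann.get("ANNOTATION_ID")
        for tag in ("ALIGNABLE_ANNOTATION", "REF_ANNOTATION")
        for ann in eaf_root.iter(tag)
    }
    dangling = []
    for tier in eaf_root.iter("TIER"):
        for ref in tier.iter("REF_ANNOTATION"):
            target = ref.get("ANNOTATION_REF")
            if target not in known_ids:
                dangling.append((tier.get("TIER_ID"), ref.get("ANNOTATION_ID"), target))
    return dangling


# Tiny made-up document for illustration. A real file would be loaded with
# ET.parse("path/to/file.eaf").getroot() instead.
SAMPLE = """<ANNOTATION_DOCUMENT>
  <TIER TIER_ID="Transcription">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>hello</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <TIER TIER_ID="Translation" PARENT_REF="Transcription">
    <ANNOTATION>
      <REF_ANNOTATION ANNOTATION_ID="a2" ANNOTATION_REF="a1">
        <ANNOTATION_VALUE>ok</ANNOTATION_VALUE>
      </REF_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <REF_ANNOTATION ANNOTATION_ID="a3" ANNOTATION_REF="a10">
        <ANNOTATION_VALUE>orphaned</ANNOTATION_VALUE>
      </REF_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""

dangling = find_dangling_refs(ET.fromstring(SAMPLE))
for tier_id, ann_id, target in dangling:
    print(f"{tier_id}: {ann_id} refers to missing annotation {target}")
```

Running it against a copy of the damaged file would give the full list of affected annotations and tiers up front, which also shows whether the damage is confined to one tier.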

Update:

I fixed it myself. I ended up having to delete the entire translation tier; before I did so, I exported the translation tier from SayMore to have a backup. So everything is fine for now, but I have no idea if this will happen again.

Upon reflection, one thing that often happens in my workflow is that, after the transcriptions and translations are complete, I export the eaf file to FLEX for glossing. While in FLEX I often notice errors in the transcription, so I fix them in FLEX and also go back to fix them in the EAF file. I usually do this in the transcription/translation interface of SayMore, which supposedly ‘projects’ the related tiers from the ELAN file. So I often make edits to the transcription without opening the ELAN file directly. However, these edits only change the values of the annotations; if I have to change the duration or number of annotations, I do that in ELAN.

So perhaps that is related, but I never edit the translation tiers, and in this case the errors were with the annotations in one of the translation tiers.

Hello,

So it is clear that the file is damaged/invalid, but it is not clear how this happened. The most serious error is probably that there is an annotation on a dependent tier referring to an annotation that doesn’t exist anymore or doesn’t have the same id (a10) anymore.
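For illustration, a repair along these lines could remove only the annotations with unresolved references instead of a whole tier. This is a rough sketch, not an ELAN feature: it assumes the standard EAF element names (`TIER`, `ANNOTATION`, `REF_ANNOTATION`, `ANNOTATION_REF`), the function name `drop_dangling_refs` is hypothetical, and it ignores other cross-references such as `PREVIOUS_ANNOTATION`. It should only ever be run on a copy of the file.

```python
import xml.etree.ElementTree as ET


def drop_dangling_refs(eaf_root):
    """Remove ANNOTATION wrappers whose REF_ANNOTATION targets a missing ID.

    Repeats until nothing changes, because deleting one reference annotation
    can orphan annotations on tiers that depend on it. Returns the removed IDs.
    """
    removed = []
    while True:
        known_ids = {
            el.get("ANNOTATION_ID")
            for el in eaf_root.iter()
            if el.tag in ("ALIGNABLE_ANNOTATION", "REF_ANNOTATION")
        }
        changed = False
        # findall returns lists, so removing children while looping is safe.
        for tier in eaf_root.findall("TIER"):
            for wrapper in tier.findall("ANNOTATION"):
                ref = wrapper.find("REF_ANNOTATION")
                if ref is not None and ref.get("ANNOTATION_REF") not in known_ids:
                    tier.remove(wrapper)
                    removed.append(ref.get("ANNOTATION_ID"))
                    changed = True
        if not changed:
            return removed


# Made-up document: a2 points at a missing a10, and a3 depends on a2.
SAMPLE = """<ANNOTATION_DOCUMENT>
  <TIER TIER_ID="Transcription">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>hello</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <TIER TIER_ID="Translation" PARENT_REF="Transcription">
    <ANNOTATION>
      <REF_ANNOTATION ANNOTATION_ID="a2" ANNOTATION_REF="a10">
        <ANNOTATION_VALUE>orphaned</ANNOTATION_VALUE>
      </REF_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <TIER TIER_ID="Notes" PARENT_REF="Translation">
    <ANNOTATION>
      <REF_ANNOTATION ANNOTATION_ID="a3" ANNOTATION_REF="a2">
        <ANNOTATION_VALUE>depends on a2</ANNOTATION_VALUE>
      </REF_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""

root = ET.fromstring(SAMPLE)
removed = drop_dangling_refs(root)
print("Removed annotations:", removed)

# For a real file (work on a copy), the repaired tree could be saved with:
# ET.ElementTree(root).write("repaired_copy.eaf", encoding="UTF-8", xml_declaration=True)
```

The repaired copy would still need to pass ELAN's own validation, since the header error reported at line 6 is a separate issue that this sketch does not touch.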

While it is not possible to determine the exact cause of the damage, it might be good to emphasize that there is no such thing as eaf files expiring after a time without opening or an expiration date for eaf files.

If you still have the original damaged file, I would be interested in receiving it for inspection (han.sloetjes AT mpi.nl).

-Han