I am using ELAN with a small team of transcribers who work on different computers with different operating systems and need to edit each other’s transcriptions.
I would also like to publish their .eaf files in a public repository as a corpus. However, I noticed that the header of each .eaf file contains a computer-specific file path for the associated audio/video file:
<MEDIA_DESCRIPTOR MEDIA_URL="file:///transcriber_name/path/to/video/file.mp4" MIME_TYPE="video/mp4" RELATIVE_MEDIA_URL="./file.mp4"/>
I want to remove the file path information from the .eaf files, since it could reveal details about the transcribers’ computer setup and is of no use to public users anyway. However, if I simply delete this line, or replace MEDIA_URL with a gibberish path, the next transcriber is prompted to locate the media file, and the line gets overwritten with a new path anyway.
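To make this concrete, here is a minimal sketch in Python of the kind of scrubbing I have in mind for the published copies. The function name scrub_media_urls is made up for illustration and is not part of ELAN. It assumes the MEDIA_URL attribute is written with double quotes, as in the header line above, and it only reduces the URL to the bare file name. It does not stop ELAN from prompting for the media file or from writing a new path the next time someone saves.

```python
import re

# Matches the MEDIA_URL attribute only; the word boundary keeps it from
# matching inside RELATIVE_MEDIA_URL.
MEDIA_URL_ATTR = re.compile(r'(\bMEDIA_URL=")([^"]*)(")')


def scrub_media_urls(eaf_text: str) -> str:
    """Return the .eaf text with each MEDIA_URL reduced to its file name.

    Hypothetical helper for preparing public copies; everything else in
    the file is left untouched.
    """
    def _replace(match: re.Match) -> str:
        original_url = match.group(2)
        # Keep only what follows the last slash, i.e. the file name.
        file_name = original_url.rsplit("/", 1)[-1]
        return f"{match.group(1)}file:///{file_name}{match.group(3)}"

    return MEDIA_URL_ATTR.sub(_replace, eaf_text)


if __name__ == "__main__":
    header = (
        '<MEDIA_DESCRIPTOR MEDIA_URL="file:///transcriber_name/path/to/video/file.mp4" '
        'MIME_TYPE="video/mp4" RELATIVE_MEDIA_URL="./file.mp4"/>'
    )
    print(scrub_media_urls(header))
```

Running something like this over copies of the files just before publishing would keep the transcribers’ working files untouched, but it would have to be repeated for every release, which is why I am hoping for a setting inside ELAN instead.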
Is there a way I could effectively remove or anonymize this information? Or get ELAN to stop storing it? I’m grateful for any ideas! Thank you.
