Diarize an unknown number of speakers

Hi. I’ve been experimenting with the Diarizers and I cannot get the “diarize an unknown number of speakers” recognizer to output anything. I’ve tried a bunch of different settings, and every time there are 0 annotations in the new tier. I can’t find much documentation; any idea what the problem could be?
ELAN 5.3, macOS 10.12.6

I just tried and managed to get some output. The main problem might be generating the Speech vs Non-Speech segmentation, which is required as input. Do you have a segmentation as input that includes Speech segments?
Also, after running the recognizer and receiving an empty result tier, are there meaningful messages in the report (accessible via the Report… button)?

Apart from the information available under the “?” or Help button, I’m afraid there is no further documentation of the settings.


Does “non-speech” in this instance mean a part of the tier with no annotation? I generated my whole transcription with speech recognition, so the whole tier is annotated with either the transcription or an annotation that says “recognizer didn’t recognize any speech” – The tier is well segmented in terms of time slot boundaries between speech/non-speech.

This is probably why I get no output. Is there any way to “find 'n delete”, or maybe bulk-delete blank annotations?

Yes, I think the diarization recognizer expects a tier whose annotations have the value Speech or Non-Speech (or just a tier with Speech annotations; the Non-Speech segments are probably ignored).
In ELAN there are functions in the Tier menu to remove empty annotations (Remove Annotations or Values) or to change the value of all annotations on selected tiers to a specific value (Label and Number Annotations). Find and Replace could also be used to replace all annotation values on selected tiers with, e.g., the value “Speech”.
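
If you would rather do the relabeling outside ELAN, the same idea can be sketched as a small Python script that edits the .eaf file (which is XML) directly. This is only a rough sketch under assumptions: the tier name "transcript", the second tier "notes", and the exact text of the "no speech" marker are made up for illustration, and whether the recognizer accepts exactly the values "Speech" and "Non-Speech" is the guess from the reply above, not something confirmed by documentation. It relies only on the standard .eaf element names (TIER with a TIER_ID attribute, ANNOTATION_VALUE).

```python
import xml.etree.ElementTree as ET

# Hypothetical marker text; use whatever your speech recognizer actually wrote.
NO_SPEECH = "recognizer didn't recognize any speech"

# Tiny stand-in for a real .eaf file (time slots and header omitted for brevity).
SAMPLE = """<ANNOTATION_DOCUMENT>
  <TIER TIER_ID="transcript" LINGUISTIC_TYPE_REF="default-lt">
    <ANNOTATION><ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
      <ANNOTATION_VALUE>hello there</ANNOTATION_VALUE></ALIGNABLE_ANNOTATION></ANNOTATION>
    <ANNOTATION><ALIGNABLE_ANNOTATION ANNOTATION_ID="a2" TIME_SLOT_REF1="ts2" TIME_SLOT_REF2="ts3">
      <ANNOTATION_VALUE>recognizer didn't recognize any speech</ANNOTATION_VALUE></ALIGNABLE_ANNOTATION></ANNOTATION>
    <ANNOTATION><ALIGNABLE_ANNOTATION ANNOTATION_ID="a3" TIME_SLOT_REF1="ts3" TIME_SLOT_REF2="ts4">
      <ANNOTATION_VALUE></ANNOTATION_VALUE></ALIGNABLE_ANNOTATION></ANNOTATION>
  </TIER>
  <TIER TIER_ID="notes" LINGUISTIC_TYPE_REF="default-lt">
    <ANNOTATION><ALIGNABLE_ANNOTATION ANNOTATION_ID="a4" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
      <ANNOTATION_VALUE>untouched</ANNOTATION_VALUE></ALIGNABLE_ANNOTATION></ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def relabel_speech(root, tier_id, no_speech_text):
    """Relabel every annotation on one tier as Speech or Non-Speech.

    Empty annotations and annotations equal to no_speech_text become
    "Non-Speech"; everything else becomes "Speech". Returns the counts.
    """
    counts = {"Speech": 0, "Non-Speech": 0}
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") != tier_id:
            continue  # leave all other tiers as they are
        for value in tier.iter("ANNOTATION_VALUE"):
            text = (value.text or "").strip()
            label = "Non-Speech" if text in ("", no_speech_text) else "Speech"
            value.text = label
            counts[label] += 1
    return counts


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(relabel_speech(root, "transcript", NO_SPEECH))
```

For a real file you would load it with ET.parse("copy.eaf"), call relabel_speech on tree.getroot(), and save with tree.write("copy_speech.eaf", encoding="UTF-8", xml_declaration=True). Work on a copy of the file: this overwrites the transcription text on that tier, so it makes sense to duplicate the tier in ELAN first and relabel the duplicate.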