Bug report: analyzers do not fully parse sequences of stems/roots

weijian · 28 May 2026 09:41

Dear Divya,

I have encountered a problem in the interlinearization mode of ELAN 7.1 (Apple Silicon; similar behavior on Windows). The lexicon analyzer does not seem to parse sequences of stems or roots, even though all morphemes exist in the lexicon. For example:

Input: stemstemsuffix
Output: stem-*-suffix (expected: stem-stem-suffix)
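To make the expected behavior concrete, here is a minimal sketch of a segmenter that accepts one or more stems followed by optional suffixes. It is purely illustrative: the function name `segment`, the toy lexicon, and the recursive approach are assumptions for this example and do not reflect how ELAN's Lexicon Analyzer is actually implemented.

```python
# Hypothetical illustration only: this is not ELAN's code or algorithm.
def segment(word, stems, suffixes):
    """Return every split of `word` into one or more stems plus optional suffixes."""
    results = []

    def walk(pos, parts, seen_suffix):
        # The whole word is consumed: record the segmentation found so far.
        if pos == len(word):
            if parts:
                results.append("-".join(parts))
            return
        for end in range(pos + 1, len(word) + 1):
            piece = word[pos:end]
            # A stem may follow another stem, but not a suffix.
            if not seen_suffix and piece in stems:
                walk(end, parts + [piece], False)
            # A suffix needs at least one morpheme before it.
            if parts and piece in suffixes:
                walk(end, parts + [piece], True)

    walk(0, [], False)
    return results


# Toy lexicon mirroring the placeholder example above.
toy_stems = {"stem"}
toy_suffixes = {"suffix"}
print(segment("stemstemsuffix", toy_stems, toy_suffixes))
```

With this toy lexicon, the second stem is resolved rather than replaced by the missing marker, which is the result I would expect from the analyzer.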

As a workaround, I tried manually configuring a set of parse and gloss analyzers. While this parses the sequence and retrieves the gloss, the associated sense-level fields (such as grammatical category and sense id) are not retrieved during ambiguity selection when the lexical entry concerned has multiple senses. I have not tested this systematically, so I have not tried every possible combination or order of analyzers.

A set of minimal working examples can be downloaded here. They include a test lexicon file, an .eaf file configured with the lexicon analyzer, and an .eaf file with manually configured gloss analyzers.

I would greatly appreciate your help with this. My understanding is that a fix to the lexicon analyzer would be the most convenient solution for end users (if it is indeed a bug). However, if the next release is not going to be ready soon, I would also be grateful for any advice on a temporary workaround.

Thank you so much as always!

Best wishes,
Weijian

divya.kanekal · 1 June 2026 09:02

Dear Weijian,

Thank you for the details and example files.

I can confirm this is indeed a bug in the Lexicon Analyzer. In the cases you describe, the middle morpheme is not resolved even though it exists in the lexicon, and the missing marker (*) is produced instead. This particularly affects languages with productive reduplication or compounding.

A fix has been identified that resolves all of these combinations in one go, and it will be included in an upcoming release.

Regarding a workaround: until the fix is released, the Parse and Gloss Analyzer combination remains the most practical option for the parsing itself. If sense-level field retrieval is critical for your work, however, there is unfortunately no complete workaround available at this time.