Expert-based context penalization matrix
After careful analysis and consideration, we came to the conclusion that a more detailed context-based penalization system (i.e., one going beyond nasalization and labialization contexts) is not realistic. We therefore propose only minor changes to the current system.
1. It is not clear why T (voiceless palatal plosive [c]) is not included in the nasalization matrix. We recommend including it.
2. Also for the nasalization matrix, we recommend that the following restriction be retained only for obstruents, i.e., that it be dropped for the non-nasal sonorants [j, l, r]: "when examining diphone [ * ][ o,O,u,U,y,i,I,e,E,a,A ], disable the use of its l-context according to the following table".
I am attaching the list of penalties (penalties.txt) to be used during the experiment. It is quite large, since it is auto-generated from the test script.
To avoid drawing wrong conclusions, could you please look through it (briefly, just picking out the most problematic items) and confirm that the penalties are OK?
The values mean:
- 10000 - disable. The cost is so high that the unit should never be selected (as long as other, more appropriate variants exist).
- 10 - penalty. Another unit should be preferred if possible, but the unit can still be used, e.g. when the concatenation cost to other units is too high.
- 0 - no penalty.
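To illustrate how these three values are meant to interact with the concatenation cost, here is a minimal sketch. It assumes that the context penalty is simply added to the concatenation cost and that the cheapest candidate wins; the actual cost function of the synthesizer may differ. The function name pick_unit and the candidate labels are hypothetical.

```python
# Penalty levels as listed above.
DISABLE = 10000.0
PENALTY = 10.0
NO_PENALTY = 0.0


def pick_unit(candidates):
    """Return the name of the cheapest candidate.

    candidates: list of (name, context_penalty, concatenation_cost).
    Assumption: total cost = context penalty + concatenation cost.
    """
    return min(candidates, key=lambda c: c[1] + c[2])[0]


# A penalised unit can still win when the concatenation cost of the
# unpenalised unit is too high (10 + 5 < 0 + 25).
candidates = [
    ("matching_context", NO_PENALTY, 25.0),
    ("penalised_context", PENALTY, 5.0),
    ("disabled_context", DISABLE, 0.0),
]
print(pick_unit(candidates))

# A disabled unit is selected only if nothing else is available.
print(pick_unit([("disabled_context", DISABLE, 0.0)]))
```

Under this assumption, a value of 10 only shifts the preference, while 10000 outweighs any realistic concatenation cost.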
Lines of the form have/want-[unit], e.g. u/a-[d*] -> 10.0, mean: penalise the use of a unit with left context [u] when [a] is required, for any diphone with left phone [d].
Lines of the form +want/have, e.g. [*A]+C/J -> 1000000.0, mean: disable the use of a unit with right context [ň] when [č] is required, for any diphone with right phone [á].
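For spot-checking the file, the two line forms can be read with a small parser. This is only a sketch based on the two example lines above; the regular expressions, the ContextPenalty type and the function parse_penalty_line are hypothetical and may need adjusting to the real penalties.txt.

```python
import re
from typing import NamedTuple, Optional


class ContextPenalty(NamedTuple):
    side: str    # "left" or "right" context
    unit: str    # diphone pattern, e.g. "d*" or "*A"
    have: str    # context the candidate unit actually has
    want: str    # context that is required
    cost: float  # penalty value


# Left-context form:  have/want-[unit] -> cost
LEFT_RE = re.compile(
    r"^(?P<have>[^/\s]+)/(?P<want>[^/\s-]+)-\[(?P<unit>[^\]]+)\]"
    r"\s*->\s*(?P<cost>[\d.]+)$"
)
# Right-context form: [unit]+want/have -> cost
RIGHT_RE = re.compile(
    r"^\[(?P<unit>[^\]]+)\]\+(?P<want>[^/\s]+)/(?P<have>\S+?)"
    r"\s*->\s*(?P<cost>[\d.]+)$"
)


def parse_penalty_line(line: str) -> Optional[ContextPenalty]:
    """Parse one penalty line; return None if it matches neither form."""
    text = line.strip()
    for side, pattern in (("left", LEFT_RE), ("right", RIGHT_RE)):
        match = pattern.match(text)
        if match:
            return ContextPenalty(
                side=side,
                unit=match["unit"],
                have=match["have"],
                want=match["want"],
                cost=float(match["cost"]),
            )
    return None


left = parse_penalty_line("u/a-[d*] -> 10.0")
right = parse_penalty_line("[*A]+C/J -> 1000000.0")
print(left)
print(right)
```

Note that the order of have and want is swapped between the two forms, so the parser keeps them as named fields rather than relying on position.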
I am not sure whether there are typos in the context penalty matrices:
- diphone [ * ][ o,O,u,U,y,i,I,e,E,a,A ], disable the use of its l-context ... - should it be: disable the use of its r-context ...?
The penalties.txt file uses both left and right contexts. I will regenerate the penalties table if only the left context should be used in both cases.
