Task #4247
Task #3676 (closed): RA3a - Context definition and penalisation matrix
Expert-based context penalization matrix
100%
Updated by Skarnitzl Radek about 7 years ago
- Status changed from New to Assigned
Updated by Skarnitzl Radek over 6 years ago
- Status changed from Assigned to Resolved
- % Done changed from 0 to 100
After careful analysis and consideration, we came to the conclusion that a more detailed context-based penalization system (i.e., one going beyond nasalization and labialization contexts) is not realistic. We therefore propose only minor changes to the current system.
1. It is not clear why T (the voiceless palatal plosive [c]) is not included in the nasalization matrix. We recommend including it.
2. Also for the nasalization matrix, we recommend that the restriction "when examining diphone [ * ][ o,O,u,U,y,i,I,e,E,a,A ], disable the use of its l-context according to the following table" be retained only for obstruents (i.e., that it be dropped for the non-nasal sonorants [j, l, r]).
Updated by Tihelka Dan over 6 years ago
- File penalties.txt added
- Status changed from Resolved to Assigned
I am attaching the list of penalties (penalties.txt) to be used in the experiment. It is quite large because it is auto-generated by the test script.
To avoid drawing wrong conclusions, could you please look through it briefly (just pick out the most problematic items) and confirm that the penalties are OK?
The values mean:
- 10000 - disable. The cost is so high that the unit should never be selected (as long as other, more appropriate variants exist)
- 10 - penalty. Another unit should be preferred if possible, but the unit can still be used, e.g. when the concatenation cost to other units is too high
- 0 - no penalty
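The effect of the three levels can be illustrated with a minimal sketch. It assumes that the selection cost of a candidate is simply its context penalty plus a concatenation cost; the thread does not state how the costs are actually combined, and the function name, candidate labels and concatenation costs below are all hypothetical.

```python
# Penalty levels as listed above.
DISABLE = 10000.0
PENALTY = 10.0
NO_PENALTY = 0.0


def pick_unit(candidates):
    """Return the name of the candidate with the lowest total cost.

    candidates: list of (name, context_penalty, concatenation_cost).
    ASSUMPTION: total cost = context penalty + concatenation cost.
    """
    return min(candidates, key=lambda c: c[1] + c[2])[0]


# Hypothetical candidates with made-up concatenation costs.
candidates = [
    ("matching context, poor join", NO_PENALTY, 25.0),
    ("penalised context, good join", PENALTY, 3.0),
    ("disabled context, perfect join", DISABLE, 0.0),
]

best = pick_unit(candidates)
```

Under these assumptions the penalised unit is chosen because its total cost (13) is below that of the unpenalised unit with a poor join (25), while the disabled unit would only be chosen if it were the sole candidate.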
Lines of the form have/want-[unit]: u/a-[d*] -> 10.0 mean: penalise the use of a unit with left context [u] when [a] is required, for any diphone with left phone [d]
Lines of the form [unit]+want/have: [*A]+C/J -> 1000000.0 mean: disable the use of a unit with right context [ň] when [č] is required, for any diphone with right phone [á]
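For anyone who wants to spot-check the file programmatically, the two line forms can be read with a small parser. This is only a sketch based on the two example lines quoted above; the real penalties.txt may contain other forms, and the names `ContextPenalty` and `parse_penalty_line` are hypothetical.

```python
import re
from typing import NamedTuple


class ContextPenalty(NamedTuple):
    side: str      # "left" or "right": which context of the unit is checked
    have: str      # context the candidate unit actually has
    want: str      # context that is required
    diphone: str   # diphone pattern, e.g. "d*" or "*A"
    cost: float    # penalty value


# Left-context form:  have/want-[unit] -> cost   e.g. "u/a-[d*] -> 10.0"
LEFT_RE = re.compile(
    r"^(?P<have>[^/\s]+)/(?P<want>[^/\s]+)-\[(?P<diphone>[^\]]+)\]"
    r"\s*->\s*(?P<cost>\S+)$"
)
# Right-context form: [unit]+want/have -> cost   e.g. "[*A]+C/J -> 1000000.0"
RIGHT_RE = re.compile(
    r"^\[(?P<diphone>[^\]]+)\]\+(?P<want>[^/\s]+)/(?P<have>[^/\s]+)"
    r"\s*->\s*(?P<cost>\S+)$"
)


def parse_penalty_line(line: str) -> ContextPenalty:
    """Parse one penalty line; raise ValueError if it matches neither form."""
    text = line.strip()
    for side, pattern in (("left", LEFT_RE), ("right", RIGHT_RE)):
        match = pattern.match(text)
        if match:
            return ContextPenalty(
                side=side,
                have=match.group("have"),
                want=match.group("want"),
                diphone=match.group("diphone"),
                cost=float(match.group("cost")),
            )
    raise ValueError(f"unrecognised penalty line: {line!r}")


left = parse_penalty_line("u/a-[d*] -> 10.0")
right = parse_penalty_line("[*A]+C/J -> 1000000.0")
```

Note that the order of have and want is swapped between the two forms, so the parser keeps them as named fields rather than relying on position.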
I am not sure whether there are typos in the context penalty matrices:
- diphone [ * ][ o,O,u,U,y,i,I,e,E,a,A ], disable the use of its l-context ... - should it be: disable the use of its r-context ...?
The penalties.txt file uses both left and right contexts. I will regenerate the penalties table if only the left context should be used in both cases.
Updated by Tihelka Dan over 6 years ago
- Due date changed from 31.12.2017 to 10.07.2018
Updated by Matoušek Jindřich almost 6 years ago
- Status changed from Assigned to Feedback
- Assignee changed from Skarnitzl Radek to Tihelka Dan