Task #4131: Select utterances containing units with detected annotation errors - HQSYN16 - Projects of Department of Cybernetics & NTIS P1 - Cybernetic Systems, University of West Bohemia

Task #4131

closed

Task #3672: RA1d - Automatic cleaning of speech corpora

Task #3690: Annotation error detection

Task #3899: Submit a paper on anomaly-based annotation error detection (Jimp)

Task #4128: Final listening test based evaluation of annotation error detection

Select utterances containing units with detected annotation errors

Added by Matoušek Jindřich about 9 years ago. Updated about 9 years ago.

Status:

Closed

Priority:

Normal

Assignee:

Matoušek Jindřich

Target version:

RA1: Analysis of artifacts in synthetic speech

Start date:

03.01.2017

Due date:

11.01.2017

% Done:

Estimated time:

Description

Analyze the logs of synthesized utterances and grep the utterances that contain one or more units with a detected annotation error.
Sort the filtered utterances by the number of units with a detected error.
Select the following utterances:
1. 20 utterances with the most units that contain an error
2. 20 utterances with just one unit containing an error
3. 20 utterances with something in between (depending on the result of logging)
Store the texts and waveforms of the selected utterances.
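
The selection steps above could be sketched roughly as follows. This is only an illustration: it assumes the synthesis logs have already been parsed into a mapping from utterance ID to the number of units with a detected annotation error. The function name `select_utterances`, the input format, and the choice to take the "in between" group from the middle of the remaining ranking are all assumptions, not something the task specifies.

```python
def select_utterances(error_counts, n=20):
    """Split flagged utterances into 'most', 'single' and 'between' groups.

    error_counts: dict mapping utterance ID -> number of units with a
    detected annotation error (hypothetical input format).
    """
    # Keep only utterances that contain at least one flagged unit.
    flagged = {utt: cnt for utt, cnt in error_counts.items() if cnt > 0}

    # Sort by error count, highest first; ties are broken by ID so the
    # result is reproducible.
    ranked = sorted(flagged, key=lambda utt: (-flagged[utt], utt))

    # 1. The n utterances with the most flagged units.
    most = ranked[:n]

    # 2. Up to n utterances with exactly one flagged unit.
    single = [utt for utt in ranked
              if flagged[utt] == 1 and utt not in most][:n]

    # 3. Up to n utterances taken from the middle of what is left
    #    (assumption: "in between" means the centre of the remaining ranking).
    taken = set(most) | set(single)
    rest = [utt for utt in ranked if utt not in taken and flagged[utt] > 1]
    start = max(0, (len(rest) - n) // 2)
    between = rest[start:start + n]

    return most, single, between
```

With real data the three groups would then be used to look up the corresponding texts and waveforms; how those are stored is not covered by this sketch.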

Files

annot-errors.detected.with_stats.txt (13.9 KB), Tihelka Dan, 09.01.2017 13:37
annot-errors.detected.with_rels.txt (17.3 KB), Tihelka Dan, 10.01.2017 14:06
least_frequent.txt (1000 Bytes), Matoušek Jindřich, 10.01.2017 15:26
mean_frequent.txt (954 Bytes), Matoušek Jindřich, 10.01.2017 15:26
most_frequent.txt (1015 Bytes), Matoušek Jindřich, 10.01.2017 15:26

Related issues

Updated by Matoušek Jindřich about 9 years ago

Blocked by Task #4130: Prepare words with detected annotation errors added

Updated by Matoušek Jindřich about 9 years ago

Blocked by Task #4129: Synthesize & log a large portion of text by TTS system with annotation errors added

Updated by Matoušek Jindřich about 9 years ago

Blocks Task #4132: Synthesize the selected utterances by TTS system with/without the annotation errors added

Updated by Matoušek Jindřich about 9 years ago

Words detected as containing annotation errors (and confirmed to be really misannotated) are attached to #4130.

Updated by Tihelka Dan about 9 years ago

File annot-errors.detected.with_stats.txt added
Status changed from New to Resolved
Assignee changed from Tihelka Dan to Matoušek Jindřich

In annot-errors.detected.with_stats.txt, each word detected as containing annotation errors (in the file ...) is extended with a number giving how many units from that word were used during the synthesis of the large tests (see #4129). The list of words is sorted by the number of selected units.

Updated by Matoušek Jindřich about 9 years ago

Status changed from Resolved to Feedback
Assignee changed from Matoušek Jindřich to Tihelka Dan

The absolute numbers of units are fine, but it might be better to also specify the average number of units (containing annotation errors) per synthetic phrase.

Updated by Tihelka Dan about 9 years ago

File annot-errors.detected.with_rels.txt added

The attachment annot-errors.detected.with_rels.txt contains statistics similar to those in annot-errors.detected.with_stats.txt, except that the first number represents W/(P + 1), where:

W is the number of selections from the given word and
P is the number of phrases it was used in.

The second column in annot-errors.detected.with_rels.txt corresponds to the first column in annot-errors.detected.with_stats.txt.
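
As a small worked illustration of the W/(P + 1) value described above, a minimal sketch could look like this. The function name `relative_selection_score` is hypothetical and the sketch does not reproduce how the attached file was actually generated. One effect of the + 1 is that the value stays defined when P is 0; whether that was the intent is not stated in this task.

```python
def relative_selection_score(w, p):
    """Return W / (P + 1) for one word.

    w: number of unit selections taken from the word (W)
    p: number of phrases the word was used in (P)
    """
    if w < 0 or p < 0:
        raise ValueError("W and P must be non-negative")
    # The +1 in the denominator keeps the ratio defined even for P == 0.
    return w / (p + 1)


# Example: 12 selections spread over 3 phrases.
score = relative_selection_score(12, 3)
```

For W = 12 and P = 3 this gives 12 / 4 = 3.0.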

Updated by Tihelka Dan about 9 years ago

Status changed from Feedback to Resolved
Assignee changed from Tihelka Dan to Matoušek Jindřich

Updated by Matoušek Jindřich about 9 years ago

File least_frequent.txt added
File mean_frequent.txt added
File most_frequent.txt added
Status changed from Resolved to Closed

The following items were selected:

20 "most frequent" items (a combination of the most frequently selected units in absolute and relative numbers) -- most_frequent.txt
20 "least frequent" items (a combination of the least frequently selected units in absolute and relative numbers) -- least_frequent.txt
20 "mean frequent" items (a combination of the moderately frequently selected units in absolute and relative numbers) -- mean_frequent.txt
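
The task does not say how the absolute and relative numbers were combined. One possible way to do it, shown purely as an assumption, is to rank the words by each number separately and order them by the sum of the two ranks. The function name `combined_ranking` and the input format (word mapped to a pair of absolute count and relative score) are hypothetical.

```python
def combined_ranking(stats):
    """Order words from most to least frequently selected.

    stats: dict mapping word -> (absolute_count, relative_score)
    (hypothetical input format). Words are ordered by the sum of their
    ranks in the two individual orderings; this rank-sum rule is an
    assumption, not the method used in the task.
    """
    words = list(stats)
    # Rank 0 is the most frequent word in each ordering; ties broken by word.
    by_abs = sorted(words, key=lambda w: (-stats[w][0], w))
    by_rel = sorted(words, key=lambda w: (-stats[w][1], w))
    abs_rank = {w: i for i, w in enumerate(by_abs)}
    rel_rank = {w: i for i, w in enumerate(by_rel)}
    # A lower rank sum means the word is frequent by both measures.
    return sorted(words, key=lambda w: (abs_rank[w] + rel_rank[w], w))
```

Under this assumption, the first 20 entries of the returned list would play the role of the "most frequent" items, the last 20 the "least frequent" ones, and a slice from the middle the "mean frequent" ones.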
