How to make a new voice for Ekho
(updated on Oct 21, 2012)
- Get the syllable list: Jyutping for Cantonese, Pinyin for Mandarin, Hangul for Korean.
- Record all syllables in the list in 16-bit format. Audacity is a recommended tool. Three methods can be used to record:
a) Keep about 1/3 second of silence between every two syllables, so that the cutting script described below can find the boundaries.
b) Read a word after each syllable. This makes the syllable sound more like it does in context, but it needs a lot of manual cutting.
c) Use Audacity's sound detection function (Audacity->Sound Finder). It opens a dialog for setting how sound is detected: silence level, silence length, and clip start and end. Set the dB level to 36, the silence length to 0.1 s, and the clip start and end to 0.01 s each. This produces all the clips, each with a numerical label (there is a bug when there are over 500 clips). Next, choose Audacity->File->Export Labels and open the exported file in a text editor; it contains the start point, end point and label of each clip. Delete the whole label column, insert a column with the appropriate recording names, and save the file in the original format. Then import the edited file with Audacity->File->Import->Labels to replace the numerical labels, and check that each recording is labelled with the correct name. Finally, choose File->Export Multiple to split the recording, saving each clip under its new label as the file name.
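The label-editing step in method c) can also be scripted instead of done by hand in a text editor. The following is a minimal sketch, assuming the exported label file has one clip per line with three tab-separated columns (start, end, label) and that the recording names are given in the same order as the clips. The function name relabel and the example names are hypothetical.

```python
def relabel(label_text, names):
    """Replace the label column of an exported label file with names, in order.

    Assumes one clip per line: start <TAB> end <TAB> label.
    """
    rows = [line.split("\t") for line in label_text.splitlines() if line.strip()]
    if len(rows) != len(names):
        raise ValueError(f"{len(rows)} clips but {len(names)} names")
    # Keep the start and end columns untouched so the format stays importable.
    return "".join(f"{row[0]}\t{row[1]}\t{name}\n" for row, name in zip(rows, names))


if __name__ == "__main__":
    exported = "0.500000\t0.900000\t1\n1.300000\t1.700000\t2\n"
    print(relabel(exported, ["aa1", "aa2"]), end="")
```

The count check is there because a missed or extra clip would otherwise shift every following name by one.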
- Cut out each syllable's waveform and save it in WAV format.
If you recorded with method a) above, you can use split_wav.pl to cut it.
If you recorded with method b) above, you need to cut it manually. Audacity can export the selected region to WAV. Shortcut keys are useful for speeding up this process. It's recommended to apply Audacity's fade-in and fade-out effects to avoid a "click" sound at the boundaries.
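To illustrate the idea behind cutting a method a) recording, here is a rough sketch of splitting on silence and fading the clip edges. It is not the logic of split_wav.pl, which is not shown here. The threshold of 500, the 0.3 s gap, the 10 ms fade and the output file naming are all assumptions, and the input is assumed to be a 16-bit mono WAV file.

```python
import array
import sys
import wave


def find_clips(samples, rate, threshold=500, min_silence=0.3):
    """Return (start, end) sample indexes of sounds separated by long silences."""
    gap = int(min_silence * rate)
    clips, start, last_loud = [], None, None
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i          # a new clip begins here
            last_loud = i
        elif start is not None and i - last_loud >= gap:
            clips.append((start, last_loud + 1))
            start = None           # silence was long enough: close the clip
    if start is not None:
        clips.append((start, last_loud + 1))
    return clips


def fade(samples, n):
    """Linearly fade the first and last n samples to avoid clicks."""
    out = list(samples)
    n = min(n, len(out) // 2)
    for i in range(n):
        gain = i / n
        out[i] = int(out[i] * gain)
        out[-1 - i] = int(out[-1 - i] * gain)
    return out


def split_on_silence(path, prefix):
    """Write each detected clip to prefix0001.wav, prefix0002.wav, ..."""
    with wave.open(path, "rb") as w:
        if w.getsampwidth() != 2 or w.getnchannels() != 1:
            raise ValueError("expected a 16-bit mono WAV file")
        rate = w.getframerate()
        samples = array.array("h", w.readframes(w.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()         # WAV data is little-endian
    for n, (a, b) in enumerate(find_clips(samples, rate), 1):
        clip = array.array("h", fade(samples[a:b], rate // 100))
        if sys.byteorder == "big":
            clip.byteswap()
        with wave.open(f"{prefix}{n:04d}.wav", "wb") as out:
            out.setnchannels(1)
            out.setsampwidth(2)
            out.setframerate(rate)
            out.writeframes(clip.tobytes())
```

The zero-padded numbers in the output names keep the clips in recording order when the files are sorted by name.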
- Rename each file according to the phonetic symbol of its syllable.
If the files were cut by split_wav.pl, you can use gen_record_list_sound_db.pl to rename them.
If you recorded with method b) and the voice is Mandarin, you can use Silas' Python script cutter-helper.pl.
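As a rough illustration of the renaming step, the sketch below pairs numbered clips with the syllable list in order. It is not the logic of gen_record_list_sound_db.pl or cutter-helper.pl. It assumes the clips have zero-padded numeric names so that sorting by name matches recording order, and that each target name is the syllable plus ".wav". The function names are hypothetical.

```python
import os


def plan_renames(filenames, syllables, ext=".wav"):
    """Pair the clips, sorted by name, with the syllables in list order."""
    clips = sorted(f for f in filenames if f.lower().endswith(ext))
    if len(clips) != len(syllables):
        raise ValueError(f"{len(clips)} clips but {len(syllables)} syllables")
    return [(old, syllable + ext) for old, syllable in zip(clips, syllables)]


def rename_clips(directory, syllables):
    """Apply the planned renames inside directory."""
    for old, new in plan_renames(os.listdir(directory), syllables):
        os.rename(os.path.join(directory, old), os.path.join(directory, new))
```

Printing the result of plan_renames before calling rename_clips is a cheap way to catch an off-by-one between the clips and the list.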
- Add the new voice data directory under Ekho. The directory name should begin with "jyutping-" for Cantonese, "pinyin-" for Mandarin and "hangul-" for Korean.
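The naming rule can be written down as a small lookup. This is only a sketch: the function name and the voice name "myvoice" are made up for illustration, and only the three prefixes come from the rule above.

```python
# Prefix required for each language's voice data directory.
PREFIXES = {"cantonese": "jyutping-", "mandarin": "pinyin-", "korean": "hangul-"}


def voice_dir_name(language, voice):
    """Build a voice data directory name such as "jyutping-myvoice"."""
    return PREFIXES[language.lower()] + voice


if __name__ == "__main__":
    print(voice_dir_name("Cantonese", "myvoice"))
```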
You can also record the voice and send the files to me (Cameron), and I will finish the remaining work, as long as you are going to release it under the GPL license ;-)