CMU ARCTIC To FreeTTS

This page provides details on importing CMU ARCTIC voices to FreeTTS. It has been tested on the awb, bdl, jmk, and slt ARCTIC 0.95 voices.

Note that this is very preliminary documentation meant to help those who are chomping at the bit to get CMU ARCTIC voices into FreeTTS. There is still some work to do (e.g., the resulting voices do not sound exactly like they do in Festival), but the work is ready for people to give it a shot.

Obtaining the CMU ARCTIC Voices

Many many thanks to the folks behind the CMU ARCTIC voices. They have given the world a great wealth of knowledge and data.

You can obtain a CMU ARCTIC voice from the CMU ARCTIC web site. We've successfully imported the bdl, slt, jmk, and awb voices (note: the awb voice requires a little extra work; see the note at the end of this page).

As of the writing of this document (January 2005), we have not attempted to create a CMU ARCTIC voice ourselves, but expect the tools provided by FreeTTS will support new CMU ARCTIC voices that you create.

Preparing to Import a CMU ARCTIC Voice

You need to have downloaded, built, and installed the Festival Speech Synthesis System.

The script to import a CMU ARCTIC voice is tools/ArcticToFreeTTS/ArcticToFreeTTS.sh. It is designed to be run from the voice directory that contains the CMU ARCTIC data. Before running the script, you need to edit it so that the following variables match your environment:

JAVA_HOME - points to your installation of the Java platform.
ESTDIR - points to the speech_tools directory of the Festival installation.
FESTIVALDIR - points to the festival directory of the Festival installation.
FESTVOXDIR - points to the festvox directory of the FestVox installation.
FREETTSDIR - points to the FreeTTS directory of the FreeTTS installation.
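
Before starting a long import, it can help to confirm that each of these settings points at a real directory. The following is a minimal sketch, not part of FreeTTS: the class name ImportSettingsCheck is hypothetical, and the main method assumes you have also exported the same five names as environment variables (the script itself only needs them set inside the script).

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; not part of FreeTTS or the ArcticToFreeTTS tools. */
final class ImportSettingsCheck {
    /** The variable names the import script expects to be set. */
    static final String[] VARIABLES = {
        "JAVA_HOME", "ESTDIR", "FESTIVALDIR", "FESTVOXDIR", "FREETTSDIR"
    };

    /** Returns the names whose value is missing or is not an existing directory. */
    static List<String> findProblems(Map<String, String> settings) {
        List<String> problems = new ArrayList<>();
        for (String name : VARIABLES) {
            String value = settings.get(name);
            if (value == null || value.isEmpty()
                    || !Files.isDirectory(Paths.get(value))) {
                problems.add(name);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Assumption: the values were exported as environment variables too.
        List<String> problems = findProblems(System.getenv());
        if (problems.isEmpty()) {
            System.out.println("All five settings point at existing directories.");
        } else {
            System.out.println("Check these settings: " + problems);
        }
    }
}
```

This only checks that the directories exist; it does not verify that they contain working Festival, FestVox, or FreeTTS installations.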

At some point, we will probably make life a little easier for you (or perhaps you'd like to make life easier for us and submit your changes back to this open source effort), but right now, please just edit the script.

Importing a CMU ARCTIC Voice into FreeTTS

The procedure for importing a CMU ARCTIC voice into FreeTTS is rather simple:

Make sure you've edited the tools/ArcticToFreeTTS/ArcticToFreeTTS.sh script as described above.
Download the voice you'd like to import, and unpack it in a directory.

cd to the directory where you unpacked the voice. The contents of this directory should look something like the following:

AREADME         emu             festvox         pm
COPYING         etc             lab             pm_lab
README          f0              lpc             sts
bin             festival        mcep            wav

Run the ArcticToFreeTTS.sh script.

The script prints a fair amount of information and takes a while to complete (about 1 hour on an 867 MHz PowerBook G4 running Mac OS X). The script creates a new voice directory under your FreeTTS installation. For example, if you imported the slt voice, the script will create the following directory and contents:

com/sun/speech/freetts/en/us/cmu_us_slt_arctic/

ArcticVoice.java                nums_cart.txt
ArcticVoiceDirectory.java       part_of_speech.txt
cmu_us_slt_arctic.txt           phoneset.txt
dur_stat.txt                    phrasing_cart.txt
durz_cart.txt                   prefix_fsm.txt
f0_lr_terms.txt                 suffix_fsm.txt
int_accent_cart.txt             voice.Manifest
int_tone_cart.txt

The script also compiles the voice for you and places the resulting jar in the FreeTTS lib directory (for example, lib/cmu_us_slt_arctic.jar). At this point, the voice is ready to be used. You can test the voice as follows.

Make sure the jar file is OK:

java -jar lib/cmu_us_slt_arctic.jar 

VoiceDirectory 'com.sun.speech.freetts.en.us.cmu_us_slt_arctic.ArcticVoiceDirectory'

Name: slt_arctic
        Description: CMU ARCTIC Cluster Unit Voice
        Organization: cmu
        Domain: general
        Locale: en_US
        Style: standard
        Gender: MALE
        Age: YOUNGER_ADULT
        Pitch: 100.0
        Pitch Range: 12.0
        Pitch Shift: 1.0
        Rate: 150.0
        Volume: 1.0

Test the voice using the freetts demo application. Note that the CMU ARCTIC voices are very large. As such, it may take several minutes before FreeTTS has loaded the voice and made it ready for use.

Note on the AWB CMU ARCTIC Voice

The AWB CMU ARCTIC voice uses a different unit selection algorithm than the other CMU ARCTIC voices. That is, whereas the other CMU ARCTIC voices select units based upon phone, stress, onset, and coda features, the AWB CMU ARCTIC voice selects units based solely on phone and stress features. As such, the AWB CMU ARCTIC voice needs a little extra work to import it to FreeTTS.

For one thing, the ArcticToFreeTTS tools expect the Scheme for the ARCTIC voices to follow a particular pattern. In particular, the etc/voice.defs file defines a FV_FULLVOICENAME variable that is built up from the various attributes of the voice. The ArcticToFreeTTS tools expect there to be a file in the festvox subdirectory whose name is the value of FV_FULLVOICENAME with a ".scm" extension. In this file, it is expected that there will be a definition of the form "voice_$FV_FULLVOICENAME" where $FV_FULLVOICENAME expands to the value of FV_FULLVOICENAME.
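
The naming pattern described above can be sketched in a few lines of Java. This is an illustration only: the class VoiceNames and its method names are hypothetical and are not part of the ArcticToFreeTTS tools. The underscore-joined layout follows the FV_VOICENAME and FV_FULLVOICENAME definitions in etc/voice.defs.

```java
/** Hypothetical helper that mirrors the naming pattern the tools expect. */
final class VoiceNames {
    /** FV_VOICENAME: institution, language, and name joined by underscores. */
    static String voiceName(String inst, String lang, String name) {
        return inst + "_" + lang + "_" + name;
    }

    /** FV_FULLVOICENAME: the voice name followed by the voice type. */
    static String fullVoiceName(String inst, String lang, String name, String type) {
        return voiceName(inst, lang, name) + "_" + type;
    }

    /** The Scheme file the tools look for in the festvox subdirectory. */
    static String schemeFile(String fullVoiceName) {
        return "festvox/" + fullVoiceName + ".scm";
    }

    /** The definition the tools expect to find inside that Scheme file. */
    static String voiceDefinition(String fullVoiceName) {
        return "voice_" + fullVoiceName;
    }

    public static void main(String[] args) {
        String full = fullVoiceName("cmu", "us", "awb_arctic", "clunits");
        System.out.println(schemeFile(full));
        System.out.println(voiceDefinition(full));
    }
}
```

With a type of "f0clunits" instead of "clunits", voiceDefinition yields voice_cmu_us_awb_arctic_f0clunits, which is the name the tools fail to find for the unmodified AWB voice.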

All the ARCTIC voices except the AWB voice follow this pattern. The AWB voice, however, uses "f0clunits" as an attribute for some of the types in etc/voice.defs. Before importing, you'll need to change all the occurrences of "f0clunits" in the cmu_us_awb_arctic etc/voice.defs file to "clunits" in order to match the pattern expected by the ArcticToFreeTTS tools:

FV_INST=cmu
FV_LANG=us
FV_NAME=awb_arctic
FV_TYPE=clunits
FV_VOICENAME=$FV_INST"_"$FV_LANG"_"$FV_NAME
FV_FULLVOICENAME=$FV_VOICENAME"_"$FV_TYPE

If you do not make this change, the ArcticToFreeTTS tools will report the error shown below. The error occurs very early in the ArcticToFreeTTS output, so it is easy to miss. You can still tell that the change was not made, though, because the dur_stat.txt and durz_cart.txt files in the com/sun/speech/freetts/en/us/cmu_us_awb_arctic/ directory created by the ArcticToFreeTTS tools will be empty:

SIOD ERROR: unbound variable : voice_cmu_us_awb_arctic_f0clunits
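
Because the error scrolls by early, checking the two output files afterwards may be the easier test. The following is a minimal sketch under stated assumptions: the class EmptyOutputCheck is hypothetical (not part of FreeTTS), and it treats a missing file the same as an empty one.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of FreeTTS or the ArcticToFreeTTS tools. */
final class EmptyOutputCheck {
    /** The two files that end up empty when the import hits the error above. */
    static final String[] FILES = {"dur_stat.txt", "durz_cart.txt"};

    /** Returns the names of the files that are missing or have zero length. */
    static List<String> emptyOrMissing(Path voiceDir) throws IOException {
        List<String> bad = new ArrayList<>();
        for (String name : FILES) {
            Path file = voiceDir.resolve(name);
            if (!Files.isRegularFile(file) || Files.size(file) == 0) {
                bad.add(name);
            }
        }
        return bad;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: EmptyOutputCheck <generated voice directory>");
            return;
        }
        List<String> bad = emptyOrMissing(Paths.get(args[0]));
        System.out.println(bad.isEmpty()
                ? "Both files have content."
                : "Empty or missing: " + bad);
    }
}
```

Pass it the generated voice directory, for example the com/sun/speech/freetts/en/us/cmu_us_awb_arctic/ directory under your FreeTTS installation.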

Once you've made this change, you can run the ArcticToFreeTTS tools to get the AWB data in FreeTTS form.
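
If you would rather script the etc/voice.defs edit than make it by hand, the substitution is a plain text replacement. The sketch below is one way to do it; the class VoiceDefsFix is hypothetical, it assumes the file is plain ASCII/UTF-8 text, and it overwrites the file in place, so keep a copy of the original first.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper; not part of the ArcticToFreeTTS tools. */
final class VoiceDefsFix {
    /** Replaces every occurrence of "f0clunits" with "clunits". */
    static String fix(String contents) {
        return contents.replace("f0clunits", "clunits");
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: VoiceDefsFix <path to etc/voice.defs>");
            return;
        }
        Path defs = Paths.get(args[0]);
        // Assumption: voice.defs is small, plain-text, and UTF-8 compatible.
        String original = new String(Files.readAllBytes(defs), StandardCharsets.UTF_8);
        Files.write(defs, fix(original).getBytes(StandardCharsets.UTF_8));
    }
}
```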

After importing the AWB data, you need to modify the ArcticVoice.java file that was created in com/sun/speech/freetts/en/us/cmu_us_awb_arctic/ during the import. The modification makes the voice use a different unit selector by overriding the "getUnitSelector" method. First, you'll need to add some extra imports to the file:

import java.io.IOException;
import java.net.URL;  // used by the AwbUnitSelector constructor; skip if already imported
import com.sun.speech.freetts.Item;
import com.sun.speech.freetts.UtteranceProcessor;
import com.sun.speech.freetts.clunits.ClusterUnitSelector;

Next, you'll need to override the getUnitSelector method:

    public UtteranceProcessor getUnitSelector() throws IOException {
        return new AwbUnitSelector(getDatabase());
    }

Finally, you'll need to define the new unit selector. This can be added as a class at the end of the ArcticVoice.java file:

class AwbUnitSelector extends ClusterUnitSelector {
    private static final String VOWELS = "aeiou";

    public AwbUnitSelector(URL url) throws IOException {
        super(url);
    }

    /**
     * Sets the cluster unit name given the segment.
     *
     * @param seg the segment item that gets the name
     */
    protected void setUnitName(Item seg) {
        String cname = null;

        String segName = seg.getFeatures().getString("name");

        /* If we have a vowel, then the unit name is the segment name
         * plus a 0 or 1, depending upon the stress of the parent.
         * Otherwise, the unit name is just the segment name; unlike
         * the other ARCTIC voices, AWB does not add an "onset" or
         * "coda" suffix.
         */
        if (VOWELS.indexOf(segName.charAt(0)) >= 0) {
            cname = segName + seg.findFeature("R:SylStructure.parent.stress");
        } else {
            cname = segName;
        }
        seg.getFeatures().setString("clunit_name", cname);
    }
}
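
To see the AWB naming rule in isolation, here is a standalone sketch that mirrors the logic of setUnitName without any FreeTTS classes. It is for illustration only and is not meant to be added to ArcticVoice.java. The class AwbUnitNames is hypothetical, and the phone names in main ("ae", "t") are assumed examples rather than values taken from the voice data.

```java
/** Hypothetical stand-alone illustration of the AWB unit naming rule. */
final class AwbUnitNames {
    private static final String VOWELS = "aeiou";

    /**
     * Returns the cluster unit name for a segment.
     *
     * @param segName      the segment (phone) name
     * @param parentStress the stress of the parent syllable, "0" or "1"
     */
    static String unitName(String segName, String parentStress) {
        // Segments that start with a vowel letter get the stress appended.
        if (VOWELS.indexOf(segName.charAt(0)) >= 0) {
            return segName + parentStress;
        }
        // Everything else keeps its plain name: no onset/coda suffix.
        return segName;
    }

    public static void main(String[] args) {
        System.out.println(unitName("ae", "1")); // vowel: stress appended
        System.out.println(unitName("t", "1"));  // consonant: name unchanged
    }
}
```

Note that the vowel test looks only at the first letter of the segment name, exactly as the AwbUnitSelector class does.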

Once you've made these changes to the ArcticVoice.java file, you can rebuild the voice as follows:

ant -Darctic_voice=cmu_us_awb_arctic -find build.xml

Good luck, and have fun.