customizing pronunciation with the new voices (speech synthesis / say command)

Question

Level 1

0 points

Hi,

I’m trying to use embedded commands with the “say” command to customize the pronunciation of some words (I’m using made-up words, so the speech synthesizer does not place the lexical stress where I want it). I followed the documentation found here: https://developer.apple.com/library/mac/documentation/userexperience/conceptual/SpeechSynthesisProgrammingGuide/FineTuning/FineTuning.html . For instance:

say -v Tom "[[inpt PHON]] gr1AESOWIHN [[inpt TEXT]]"

should produce “grashoing” with stress on the first syllable.

My problem is that it only works with Alex and the original MacinTalk voices, but not with the new voices introduced in Lion (such as Tom and Samantha): they seem to support embedded speech commands (the parts between [[ and ]]), but they pronounce the phonemic representation as a list of letters and numbers (so “jee ar one a ee …”).
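To reproduce the difference between voices, a small loop like the following may help. It is only a sketch: the helper name phon_wrap is made up for illustration, and it assumes that the voices Alex and Tom are installed.

```shell
#!/bin/sh
# phon_wrap is a hypothetical helper (not part of say): it wraps a phoneme
# string in the embedded commands that switch input to phonemes and back.
phon_wrap() {
    printf '[[inpt PHON]] %s [[inpt TEXT]]\n' "$1"
}

text=$(phon_wrap 'gr1AESOWIHN')
echo "$text"

# Speak the same input with each voice; skipped where say is not available.
if command -v say >/dev/null 2>&1; then
    for voice in Alex Tom; do
        echo "Voice: $voice"
        say -v "$voice" "$text"
    done
fi
```

Running say -v '?' lists the installed voices, which is a quick way to check that both names exist on the machine.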

Do you know how I can get around this? How do I provide phoneme input with the new voices?

MacBook Air (13-inch Mid 2013), OS X Mavericks (10.9)

Posted on Nov 17, 2013 2:05 PM

Answer 1

red_menace

Level 6

17,030 points

Nov 17, 2013 4:29 PM in response to kerep

The premium voices work a bit differently from the standard ones. As far as I know, the new multilingual voices don't support phonemes (probably for IP reasons).

Answer 2

kreme

Level 1

35 points

Mar 8, 2014 8:20 PM in response to kerep

I know I'm a little late to this party, but after conducting my own tests and then finding this post, I would like to say that it appears Scansoft voices don't respond to phonemes. I learned this long ago while messing with TextAloud on the PC. They seem to respond only to plain text and alternate spellings unless the 'PRON SYM' is used, but the latter has unpredictable results and I don't know how to do this on the Mac side. For example, the workaround for that particular word goes something like this:

say -v Tom "[[inpt PHON]] gra showing [[inpt TEXT]]"

It works when tested in Terminal and sounds like the pronunciation Alex produces from the phonemes you chose. The breaks in the word are where you'll get emphasis for particular sounds. My assumption is that you will have to do all edits for these voices in this manner.

Hope this helps for others that stumble upon this post.

I have to admit that I have a gripe about Apple not offering a way to edit pronunciations for Text-to-Speech within the system. The VoiceOver Utility is a separate entity and has no effect on TTS. I have sent feedback to Apple requesting an editor of sorts, and I would like to urge others to do the same.

Answer 3

kreme

Level 1

35 points

Mar 8, 2014 8:29 PM in response to kreme

I also wanted to add that you will have to play with the plain text spellings to get the words to sound exactly the way you want. Another alternative spelling could go something like this, which puts more emphasis on the GRAH part of the word:

say -v Tom "[[inpt PHON]] grash owing [[inpt TEXT]]"
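Since finding the right spelling takes some trial and error, a loop along these lines could make it easier to compare candidates back to back. This is a rough sketch: try_spellings is a made-up helper name, and it passes the spellings as plain text without the embedded commands, on the assumption that these voices ignore phoneme mode anyway.

```shell
#!/bin/sh
# try_spellings is a hypothetical helper: it prints each candidate spelling
# and, where say exists, speaks it with the given voice.
try_spellings() {
    voice=$1
    shift
    for spelling in "$@"; do
        echo "$voice: $spelling"
        if command -v say >/dev/null 2>&1; then
            say -v "$voice" "$spelling"
        fi
    done
}

# The two candidate spellings suggested in this thread.
try_spellings Tom 'gra showing' 'grash owing'
```

Printing each spelling before it is spoken makes it easy to note which one sounded right.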
