[FAQ] Using SSML (Speech Synthesis Markup Language) to edit speech in @Voice

Author
Admin
Administrator
  • Total Posts : 275
  • Reward points: 0
  • Joined: 2010/11/22 00:00:00
  • Location: USA
  • Status: offline
2016/02/16 14:27:46 (permalink)

Using SSML (Speech Synthesis Markup Language) to edit speech in @Voice

You can read about SSML markup here:
 
                https://www.w3.org/TR/speech-synthesis11/
 
You can insert SSML tags into sentences, and most voices should understand and interpret them correctly. How a voice actually interprets the tags is up to the voice software implementation; I have no influence on this. @Voice automatically wraps sentences sent to the TTS engine in <speak>…</speak> tags, so don’t add these manually.
 
To add speech replacements in the @Voice app, open the Settings menu, press "Edit speech", then use the + button to add a new speech replacement.
 
Example: The following speech replacement will add a 500 ms pause after each comma:
 
Replacement Kind:    Case sensitive
Pattern:    ,
Replace:    ,<break time="500ms"/>

 
Another example: the following speech replacement will add a 1-second pause after each word:
 
Replacement Kind:    RegEx
Pattern:    (?<!&)\b\w+\b(?!;)
Replace:    $0<break time="1s"/>
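
One way to preview what these two replacements do to a sentence is to replay them outside the app. The sketch below uses Python's `re` module, which is an assumption: the regex engine inside @Voice is not described here, and Python spells the whole-match reference as `\g<0>` where the replacement above uses `$0`. The function names are hypothetical.

```python
import re

def pause_after_commas(sentence: str) -> str:
    # Plain (non-regex) replacement: every comma gets a 500 ms break after it.
    return sentence.replace(",", ',<break time="500ms"/>')

# Same pattern as above: a whole word that is not preceded by "&" and not
# followed by ";", so entity-like text such as "&amp;" is left alone.
WORD = re.compile(r"(?<!&)\b\w+\b(?!;)")

def pause_after_words(sentence: str) -> str:
    # \g<0> is Python's spelling of "the whole match" ($0 in the replacement above).
    return WORD.sub(r'\g<0><break time="1s"/>', sentence)

if __name__ == "__main__":
    print(pause_after_commas("Well, hello"))
    print(pause_after_words("Tom &amp; Jerry"))
```

The lookbehind and lookahead in the word pattern are what keep the break tag from being inserted into the middle of an entity such as `&amp;`.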
 
To edit the pronunciation of words, you can use the SSML <phoneme …> element with IPA or another phonetic alphabet (depending on what the voice in use actually supports), e.g.:
 
Replacement Kind:    Case Insensitive
Pattern:    psychology
Replace:    <phoneme alphabet="ipa" ph="saɪˈkɒlədʒɪ"/>
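
As a rough illustration of a case-insensitive replacement like this one, here is a minimal Python sketch. It assumes the pattern is treated as literal text and matched anywhere in the sentence; whether @Voice matches whole words only is not stated above. The function name is hypothetical.

```python
import re

PHONEME = '<phoneme alphabet="ipa" ph="saɪˈkɒlədʒɪ"/>'

def replace_word(sentence: str) -> str:
    # re.escape treats the pattern as literal text; IGNORECASE mimics the
    # "Case Insensitive" replacement kind. The lambda returns the tag as-is,
    # so nothing in it is interpreted as a group reference.
    return re.sub(re.escape("psychology"), lambda m: PHONEME,
                  sentence, flags=re.IGNORECASE)

if __name__ == "__main__":
    print(replace_word("Psychology is fun"))
```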
 
Note that some TTS engines and voices may not support SSML tags at all, or may support some of them but not the phoneme tag. In my testing I found that Google TTS voices do support them. In the past I also successfully tested Ivona TTS and Pico TTS for phoneme support.
 
Greg
post edited by Admin - 2024/03/02 13:04:38


    DukeArchibald
    User
    • Total Posts : 0
    • Reward points: 0
    • Joined: 2020/01/27 19:00:59
    • Status: offline
    Re: Using SSML (Speech Synthesis Markup Language) to edit speech in @Voice 2020/04/16 09:23:57 (permalink)
    I would like to change the voice while reading without having to change it manually (I'm doing dialogue),
    for example from en-gb-x-rjs#male_1-local to en-gb-x-rjs#female_1-local.
    If that's possible, is there a way to do it inside the HTML file?
    Admin
    Re: Using SSML (Speech Synthesis Markup Language) to edit speech in @Voice 2020/04/16 11:44:55 (permalink)
    Currently I don't have this ability. I could consider it, if more users asked for such a feature. However, it would be very difficult (most probably impossible) to determine automatically which voice should be used, and probably very few people would be interested in manually annotating text to switch voices in a way that would make sense.
    DukeArchibald
    Re: Using SSML (Speech Synthesis Markup Language) to edit speech in @Voice 2020/04/16 21:09:14 (permalink)
    thanks for the quick response
    2 more questions:
     
    I need numbers formatted like #-##, for example 0-08 or 2-28, to be read as zero-zero-eight or two-twenty eight. How should I do that? I tried <say-as number:cardinal>0-08</say-as> and <say-as number:ordinal>0-08</say-as>, but both gave me zero-oh-eight.
    And I don't really know where to put this one: for the text-to-speech file saving, can we have an option to save the filename as the table of contents chapter name?
    Admin
    Re: Using SSML (Speech Synthesis Markup Language) to edit speech in @Voice 2020/04/17 05:14:07 (permalink)
    I really don't know which TTS voices support the say-as SSML tag. I tried Google TTS voices here, and they would not make any sound when I embedded the say-as tag you specified. Whether a TTS engine says "oh" or "zero" depends only on that TTS engine's code, not on the @Voice app. In @Voice speech replacements you would have to substitute the digit 0 with the word "zero" to make sure any engine says "zero" there. If you want to hear the word "zero" for a 0 that appears at the beginning of a number, you could use a RegEx replacement like:
     
    Type: RegEx
    Pattern: \b0(\d*)
    Replace: zero $1
     
    If you wanted a group of two zeroes at the start of a number read as "zero zero", you would have to add this replacement _before_ the one listed above:
     
    Type: RegEx
    Pattern: \b00(\d*)
    Replace: zero zero $1
     
    And so on. For example, if you wanted three zeroes replaced that way, the replacement with pattern "\b000(\d*)" and replace text "zero zero zero $1" would again have to come before the other two on the replacement list.
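
    To see why the order matters, the two replacements can be replayed one after the other. This is a minimal sketch in Python, assuming each replacement is applied once, in list order, to the output of the previous one; Python writes the group reference as `\1` where the replacements above use `$1`, and `apply_rules` is a hypothetical helper name.

```python
import re

# (pattern, replacement) pairs in list order; the longer run of zeroes comes first.
RULES = [
    (r"\b00(\d*)", r"zero zero \1"),
    (r"\b0(\d*)", r"zero \1"),
]

def apply_rules(text, rules):
    # Apply every rule once, in order, each on the output of the previous one.
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text

if __name__ == "__main__":
    print(apply_rules("008", RULES))                   # two-zero rule first
    print(apply_rules("008", list(reversed(RULES))))   # single-zero rule first
```

    With the single-zero rule first, it consumes the whole number and leaves "08" behind as part of its output, so the two-zero rule never gets a chance to match.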
    DukeArchibald
    Re: Using SSML (Speech Synthesis Markup Language) to edit speech in @Voice 2020/04/17 07:18:56 (permalink)
    ok, thank you for the regex support :)
    lily
    User
    • Total Posts : 0
    • Reward points: 0
    • Joined: 2021/01/23 06:01:41
    • Status: offline
    Re: Using SSML (Speech Synthesis Markup Language) to edit speech in @Voice 2021/01/23 07:43:14 (permalink)
    I tried the comma pause with Google's and Samsung's TTS, but they both end up reading out "less than speak" (I didn't add this) and "less than break time".
     
    Would you happen to know what I'm doing wrong?
    Admin
    Re: Using SSML (Speech Synthesis Markup Language) to edit speech in @Voice 2021/01/23 08:03:25 (permalink)
    UPDATE a few minutes after the original post: I did some testing just now and found that Google Text-to-speech voices no longer support SSML, or at least don't support the <break> tag (I did not test others), or at least don't support it in Spanish, as that was the language of the ebook I was listening to. Instead, the voice reads the tags literally, as you said. The Samsung TTS voice did not read the tags aloud, but it ignored the pause. The only voice that interpreted this tag correctly was the Acapela TTS voice that I had installed on my phone.
     
    Google cloud voices (Standard and WaveNet) do interpret <break> SSML tag correctly.
     
    --- Previous answer ---
     
    To answer, I would have to know _exactly_ what you entered into this speech replacement and what other speech replacements you have, as they may modify the text in unintended ways if used incorrectly. Try disabling ALL your speech replacements except the one for the comma pause, then edit the comma pause replacement again and enter EVERYTHING exactly as advised in the first post here, making sure there are no extra spaces etc.
     
    If you need more help afterwards, export all your speech replacements from the menu of "Edit speech" screen, then send me the exported files as email attachments, and describe the problem again in the body of the email message, as I get lots of emails and won't remember why this was sent to me otherwise.
    post edited by Admin - 2021/01/23 08:22:25
    apukwa
    User
    • Total Posts : 9
    • Reward points: 0
    • Joined: 2013/10/25 00:45:26
    • Status: offline
    Re: Using SSML (Speech Synthesis Markup Language) to edit speech in @Voice 2021/04/08 13:09:56 (permalink)
    On the same topic... I'd like to try to use SSML or the new Change Voices feature (neato, by the way, thanks for that!) to do the following:
     
    1. During "quoted text", SSML emphasis moderate (or pitch +.25). 
      • Challenge:  I can detect quotes fine... but how do I know if it's a starting quote or an ending quote (or, put another way, how do I know if I'm already at SSML emphasis moderate and should switch back)?
      • My plan if I can't determine this is to switch back to emphasis=none at the end of every sentence.
    2. During italicized text, SSML emphasis reduced (or pitch -.25).
      • Challenge:  How do I detect start or end of italics?
    3. During bolded text, SSML emphasis strong
      • Challenge:  How do I detect start or end of bold?
     
    I should mention that 99% of the time we're using ePubs, occasionally mobi.

    Let me know your thoughts and, again, thanks so much for all you do with this app.  It's definitely the most used app for both myself and my wife, by far.  I calculated that I read more than a million words a month with it!
     
     
    Admin
    Re: Using SSML (Speech Synthesis Markup Language) to edit speech in @Voice 2021/04/08 13:44:00 (permalink)
    The biggest challenge is that @Voice speech replacements work only within the context of one sentence (what @Voice highlights in yellow while it speaks). If the opening quote is in one sentence and the quote spans multiple sentences, no RegEx expression will work. The same applies to italicized or bolded text, with additional challenges, because text styles may be defined in different ways in HTML code, for example <b>Bolded text</b>, but also <strong>Bolded text</strong>, or with CSS styles, like <p class="arbitrary-name">...</p>, and so on. The same goes for italics.
     
    For "quoted text", different styles of quote characters also exist and would have to be considered. Some of them have distinct opening and closing quote characters; others use the same character for both.
     
    For this purpose, a dedicated text processing program would have to be created, or perhaps a similar existing program could be adapted.
     
    For the simplest case, let's assume that the regular double quote character that you can type from the keyboard is used, and that BOTH the opening and closing double quotes are present within the same @Voice "sentence" (yellow highlight). In this case the following RegEx replacement would work:
     
    Type: Regular Expression (RegEx)
    Pattern: (.*)(\".*?")(.*)
    Replace: $1 {{VoiceName#p=1.25}} $2 {{VoiceName}} $3
     
    The parts in parentheses are RegEx "capture groups", and you can refer to them with $number: above, $1 is what was matched by the first capture group (.*), $2 is the quotes and what was inside them, and $3 is the rest of the sentence. This would replace this sample sentence:
     
    This is "a test of" double quotes.
     
    with:
     
    This is {{VoiceName#p=1.25}} "a test of" {{VoiceName}} double quotes.
     
    Note that it would work correctly for only one group of quoted text within a sentence: because the leading (.*) is greedy, that is the last quoted group. You could maybe extend this to two or at most three groups within a sentence.
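
    The behaviour of this pattern can be checked outside the app. The sketch below is in Python and rests on assumptions: @Voice's regex engine is not specified here, Python writes group references as `\1` where the replacement above uses `$1`, and `mark_quote` is a hypothetical name. Spacing in the result may differ slightly from the sample above, since $1 and $3 keep their own spaces.

```python
import re

# Same pattern as above; the backslash before the first quote is not needed.
PATTERN = re.compile(r'(.*)(".*?")(.*)')
TEMPLATE = r"\1 {{VoiceName#p=1.25}} \2 {{VoiceName}} \3"

def mark_quote(sentence: str) -> str:
    # Wrap one quoted group in voice markers; the rest of the sentence is kept.
    return PATTERN.sub(TEMPLATE, sentence, count=1)

if __name__ == "__main__":
    print(mark_quote('This is "a test of" double quotes.'))
    # With two quoted groups only one is wrapped. In Python's engine the
    # greedy leading (.*) makes that the last one.
    print(PATTERN.match('He said "yes" and "no".').group(2))
```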
     
    Greg
    apukwa
    Re: Using SSML (Speech Synthesis Markup Language) to edit speech in @Voice 2021/04/08 15:05:21 (permalink)
    Thanks so kindly Greg!  You're always so helpful.
     
    For ePub / mobi, etc., is there a standard for italics and bolding?  Or do you translate everything into HTML?
     
    What do you suggest is the best way to view the uninterpreted source so I can check what to replace?
    Admin
    Re: Using SSML (Speech Synthesis Markup Language) to edit speech in @Voice 2021/04/08 16:30:31 (permalink)
    EPUB and MOBI formats _are_ HTML, internally. If you want to examine the internals of an EPUB, just open the EPUB file with any ZIP/UNZIP app or program and you will see its internal files and folders, including the .html or .xhtml files that hold the ebook chapters. An EPUB file is simply a ZIP file with a different extension and some standardized contents inside.
     
    For MOBI it's not that simple, but @Voice converts it to EPUB anyway to read it. You could copy the converted EPUB file from the @Voice eBooks folder, or use e.g. the Calibre program on a computer to convert the MOBI file to EPUB.
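
    If you prefer a script to an unzip app, here is a minimal Python sketch that lists the chapter files inside an EPUB using the standard `zipfile` module. The function name and the set of file extensions are assumptions for illustration; an actual book may name its files differently.

```python
import zipfile

def list_chapter_files(epub_path):
    # An EPUB is a ZIP archive, so zipfile can open it directly.
    with zipfile.ZipFile(epub_path) as archive:
        # Keep only the entries that look like (X)HTML chapter files.
        return [name for name in archive.namelist()
                if name.lower().endswith((".html", ".xhtml", ".htm"))]

if __name__ == "__main__":
    import sys
    # Usage: python list_chapters.py book.epub
    for name in list_chapter_files(sys.argv[1]):
        print(name)
```

    Any of the listed files can then be opened in a text editor to see how the book marks up italics and bold.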
     
    Greg