Working with quoted text and RegEx

Author
DJugs
User
  • Total Posts : 0
  • Reward points: 0
  • Joined: 2020/04/20 11:43:22
  • Status: offline
2020/04/21 09:50:40 (permalink)

Working with quoted text and RegEx

Greetings.
 
I have an Ivona voice installed. It is the only TTS system I've encountered that lets me use SSML to change the tonal characteristics of the voice (pitch, rate, etc.). I've taken advantage of this by using "Edit speech" to replace opening and closing quotation marks with <prosody> tags, which alter the pitch of the voice when the quote is rendered by the TTS engine. I do this to differentiate the quoted speaker's voice from the author's voice and to add some dynamism to the otherwise monotonous nature of TTS readings. It was exciting to hear it work.

I've run into a recurring problem, however. The hack works only if the quoted text is a single phrase or sentence, because data is sent to the TTS engine (and hence the RegEx engine) one sentence at a time. If a quote spans multiple sentences, the <prosody> tag that replaces the opening quotation mark applies only to the first sentence, which is spoken at a higher pitch; all subsequent sentences are spoken at the default pitch because they are sent without <prosody> tags. I have no way of capturing these sentences to wrap them in <prosody> tags. I have devised a workaround for quotes of exactly two sentences by capturing the opening and closing quotation marks separately with two "Edit speech" entries, but if there are more than two sentences, I lose the sentences in the middle.
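The problem described above can be reproduced outside the app. The following is only a sketch in Python: the sentence splitter is a naive stand-in (how @Voice actually splits text is not documented in this thread), and `pitch="high"` is an assumed attribute value, not necessarily the one used in the original "Edit speech" entries.

```python
import re

# Assumed tag values; the thread does not say which pitch setting was used.
OPEN_TAG = '<prosody pitch="high">'
CLOSE_TAG = '</prosody>'


def split_sentences(text):
    """Naive stand-in for the app's splitter: break after . ! or ? plus whitespace."""
    return re.split(r'(?<=[.!?])\s+', text.strip())


def edit_sentence(sentence):
    """Mimic two per-sentence replacements: opening quote and closing quote."""
    return sentence.replace('“', OPEN_TAG).replace('”', CLOSE_TAG)


text = 'She said, “Wait here. I will be back. Do not move.” Then she left.'
chunks = [edit_sentence(s) for s in split_sentences(text)]

for chunk in chunks:
    print(chunk)
```

The first chunk receives only the opening tag, the last only the closing tag, and the middle sentence (`I will be back.`) reaches the engine with no tag at all, which matches the behaviour described in the post.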
 
So, my question is, can you develop a way to provide the option to send an extended quote to the engine as a single block, instead of splitting it into individual sentences? I'm hardly a programmer, but it's apparent that the current delimiter for what chunks are sent to the TTS engine is the sentence-ending period (.). Setting the delimiter to quotation mark pairs seems like it would be simple enough to code. Or, maybe even better, allow users the ability to specify the delimiters for the chunks that are sent to the TTS engine ourselves via RegEx.
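To make the request concrete, here is a hypothetical sketch in Python of a chunker driven by a user-supplied pattern. Nothing here reflects how @Voice is implemented; `chunk_text`, `SENTENCES`, and `QUOTE_AWARE` are illustrative names, and the patterns only handle curly double quotes.

```python
import re

# Plain sentence chunks: text up to and including . ! or ?
SENTENCES = r'[^.!?]+[.!?]*'

# Quote-aware chunks: a whole “...” span is one chunk; everything else
# is split into sentence-like pieces.
QUOTE_AWARE = r'“[^”]*”|[^“.!?]+[.!?]*'


def chunk_text(text, pattern):
    """Return the non-empty, whitespace-trimmed matches of pattern in text."""
    pieces = (m.group(0).strip() for m in re.finditer(pattern, text))
    return [p for p in pieces if p]


SAMPLE = 'She said, “Wait here. I will be back.” Then she left.'

by_sentence = chunk_text(SAMPLE, SENTENCES)
by_quote = chunk_text(SAMPLE, QUOTE_AWARE)
print(by_sentence)
print(by_quote)
```

With `SENTENCES` the quote is cut in two; with `QUOTE_AWARE` it stays in one chunk. The trade-off is that a long quote produces a long chunk, which may be more than a given TTS engine accepts.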
 
I've seen articles in which quoted text uses the same quotation marks twice, e.g. “The quick “brown” fox jumped...” instead of what I believe is the correct form, “The quick ‘brown’ fox jumped...”. The former would likely cause a problem if all characters between (“) and (”) were sent to the engine, since that string would then be “The quick “brown”. I should also mention that some articles use straight quotes (" ') rather than curly quotes (“ ‘). Letting users set their own delimiters with RegEx is the only solution I can see, unless you can come up with a regular expression that covers all of these cases and give us the option to enable it as a delimiter.
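Both concerns can be checked with a short Python sketch. The pattern names are illustrative, and `ANY_DOUBLE` is just one possible way to accept straight and curly double quotes with a single expression; it is not a complete solution to mixed or nested quoting.

```python
import re

# Lazy match between a curly opening quote and the next curly closing quote.
LAZY_CURLY = re.compile(r'“(.*?)”')

# Accept either a straight or a curly double quote on each side.
ANY_DOUBLE = re.compile(r'[“"](.*?)[”"]')

misquoted = '“The quick “brown” fox jumped...”'
truncated = LAZY_CURLY.search(misquoted).group(0)
print(truncated)

mixed = 'He said "hello" and “goodbye”.'
found = ANY_DOUBLE.findall(mixed)
print(found)
```

The first match stops at the first closing quote and yields `“The quick “brown”`, the truncated string predicted above. The second pattern finds `hello` and `goodbye` regardless of quote style.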
 
Please let me know what you think of this.
 
Best regards!
post edited by DJugs - 2020/04/21 09:57:37


    Admin
    Administrator
    • Total Posts : 275
    • Reward points: 0
    • Joined: 2010/11/22 00:00:00
    • Location: USA
    • Status: offline
    Re: Working with quoted text and RegEx 2020/04/21 11:51:37 (permalink)
    The chunks sent to the TTS engine must be of limited length. Some TTS engines will not accept anything longer than about 500 characters; a few may allow up to 2000 characters, but not much beyond that. The app must use the smallest reasonable limit. It could still process longer chunks of text with RegEx (e.g. have the RegEx replacements work on paragraphs, not sentences), but it would take even longer to process each fragment of text before sending it to the TTS engine (first get the paragraph, do the replacements in it, then split it back into sentences, etc.). What you are trying to do would probably be best done in a separate text-processing program that reads one file, adds the speech commands to it, and writes the result to a new file, which is then used for reading aloud...
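A separate preprocessing program of that kind might look like the Python sketch below. It rests on several assumptions: each line of the input file is one paragraph, quotes are curly double quotes, `pitch="high"` is a placeholder value, and the function names are made up for illustration. Each sentence inside a quote gets its own pair of tags, so no tag has to survive across a sentence boundary.

```python
import re

# Assumed tag values and quote style; adjust to the voice and the book.
OPEN_TAG = '<prosody pitch="high">'
CLOSE_TAG = '</prosody>'
QUOTE = re.compile(r'“([^”]*)”')
SENTENCE_BREAK = re.compile(r'(?<=[.!?])\s+')


def tag_quote(match):
    """Wrap every sentence inside one quoted span in its own tag pair."""
    sentences = SENTENCE_BREAK.split(match.group(1).strip())
    return ' '.join(f'{OPEN_TAG}{s}{CLOSE_TAG}' for s in sentences if s)


def tag_paragraph(paragraph):
    """Replace each “...” span with per-sentence tagged text."""
    return QUOTE.sub(tag_quote, paragraph)


def tag_file(src_path, dst_path):
    """Read src_path line by line (one paragraph per line) and write the tagged copy."""
    with open(src_path, encoding='utf-8') as src, \
            open(dst_path, 'w', encoding='utf-8') as dst:
        for line in src:
            dst.write(tag_paragraph(line))


example = 'She said, “Wait here. I will be back. Do not move.” Then she left.'
tagged = tag_paragraph(example)
print(tagged)
```

Whether the reading app then splits the tagged text so that each tag pair stays in one chunk depends on its own sentence splitter, which this sketch does not model.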
     
    I may one day attempt to add paragraph speech replacements, but don't know if the results will be good. And frankly, I'm running out of time, patience and energy for all this work, particularly in the light of all the bad reviews and hostility I get in Google Play.
     
    Greg
    reyanoo
    User
    • Total Posts : 0
    • Reward points: 0
    • Joined: 2018/10/18 09:58:47
    • Status: offline
    Re: Working with quoted text and RegEx 2024/01/15 17:36:36 (permalink)
    Hi Guys,

    I know this is way too late, but can you share the exact expressions that I can copy directly into @Voice "Edit speech"?

    It would be great if you share:
    The between "" expression.
    And the one that starts with "
    And the one that ends with "

    Very grateful to your big brains for figuring this out.
    Admin
    Administrator
    • Total Posts : 275
    • Reward points: 0
    • Joined: 2010/11/22 00:00:00
    • Location: USA
    • Status: offline
    Re: Working with quoted text and RegEx 2024/01/18 08:27:53 (permalink)
    Hi,
    it would be best these days to contact me by email, as I have stopped checking this forum for new posts... These expressions would be tricky, as there are different double-quotation characters in Unicode, and the book you're reading may not use the " character you can type from the keyboard.
     
    To capture text enclosed in double quotes:
    "(.*?)"
    The $1 capture group gives you what's inside the "", without the quote marks; $0 gives the whole match, including the quote marks.
     
    This would capture the last " and everything until the end of sentence:
    "([^"]+)$
    $0 and $1 as before
     
    This would capture text from the beginning of line until the " character, but only if there is no other " in that sentence:
     
    ^([^"]*)"[^"]*$
    $0 is the entire sentence, $1 the text before ".
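The three expressions can be tried out in Python before pasting them into the app. This is only a sketch: Python exposes the groups as `m.group(0)` and `m.group(1)` (and `\1` in replacement strings) where the thread writes `$0` and `$1`, the sample sentences are made up, and `pitch="high"` is an assumed value. The regex flavour used by the app may differ in small ways.

```python
import re

BETWEEN = re.compile(r'"(.*?)"')                   # text enclosed in double quotes
FROM_LAST_QUOTE = re.compile(r'"([^"]+)$')         # last " through end of sentence
UP_TO_ONLY_QUOTE = re.compile(r'^([^"]*)"[^"]*$')  # start of line up to the only "

m1 = BETWEEN.search('He said "hello there" and left.')
print(m1.group(0), '|', m1.group(1))

m2 = FROM_LAST_QUOTE.search('He said "I will wait here.')
print(m2.group(0), '|', m2.group(1))

m3 = UP_TO_ONLY_QUOTE.search('Do not move." Then she left.')
print(m3.group(0), '|', m3.group(1))

# The third expression refuses sentences that contain more than one ".
m4 = UP_TO_ONLY_QUOTE.search('He said "hi" twice.')
print(m4)

# Using the first expression as a replacement, with \1 in place of $1.
replaced = BETWEEN.sub(r'<prosody pitch="high">\1</prosody>',
                       'He said "hello there" and left.')
print(replaced)
```

As noted above, these patterns match only the straight `"` character; text that uses curly quotes needs the corresponding Unicode characters in the expressions.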
    Admin
    Administrator
    • Total Posts : 275
    • Reward points: 0
    • Joined: 2010/11/22 00:00:00
    • Location: USA
    • Status: offline
    Re: Working with quoted text and RegEx 2024/01/26 16:48:31 (permalink)
    Note to anyone reading this thread: the original poster's intent was to change the pitch or the sound of the voice for dialog portions in a novel. The @Voice "Edit speech" feature is not good for this purpose, because these edits work only within one sentence, while dialog passages spoken by a character may contain multiple sentences.
     
    We later collaborated on creating voice changes with a different mechanism, by editing the text of entire paragraphs, and this worked reasonably well and yielded pretty good results. Paragraph text editing or filtering is not part of @Voice app user interface, but can be done with special "filter" files. If anyone is interested in this, contact me by email, listed on @Voice's About screen.
     
    Greg 