Working with quoted text and RegEx
Greetings.
I have an Ivona voice installed. It is the only TTS system I've encountered that allows me to use SSML to change the tonal characteristics of the voice (pitch, rate, etc.). I've taken advantage of this by using "Edit speech" to substitute opening and closing quotation marks with <prosody> tags, which alter the pitch of the voice when the quote is rendered by the TTS engine. I do this to differentiate the quoted speaker's voice from the author's voice and to add some dynamism to the otherwise monotonous nature of TTS voice readings. It was exciting to have that experience. I've encountered a recurring problem, however. I can do this hack effectively only if the quoted text is a single phrase or sentence, because data is sent to the TTS engine (and hence the RegEx engine) one sentence at a time. If the quoted text includes multiple sentences, the <prosody> tag that replaces the opening quotation mark is lost at the end of the first sentence: that sentence is spoken at a higher pitch, but all subsequent sentences are spoken at the default pitch because they are sent without <prosody> tags. I have no way of capturing these sentences to place them in <prosody> tags. I have conjured a hack for quotes that include only two sentences by capturing the opening and closing quotation marks separately with two "Edit speech" entries, but if there are more than two sentences, I lose the sentences in the middle.
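To make the problem concrete, here is a rough Python sketch of the behaviour described above. It is not the application's actual code: the sentence splitter, the three substitution rules, and the pitch value are all assumptions that stand in for the real chunking and for the "Edit speech" entries.

```python
import re

# Hypothetical tag values; the pitch amount is arbitrary.
OPEN_TAG = '<prosody pitch="+15%">'
CLOSE_TAG = '</prosody>'

def split_sentences(text):
    # Assumed chunking: break after . ! or ? (optionally followed by a closing quote).
    return re.split(r'(?:(?<=[.!?])|(?<=[.!?]”))\s+', text.strip())

def edit_speech(chunk):
    # Rule 1: a quote that opens and closes inside the same chunk.
    chunk = re.sub(r'“([^“”]*)”', rf'{OPEN_TAG}\1{CLOSE_TAG}', chunk)
    # Rule 2: a quote that opens here but closes in a later chunk.
    chunk = re.sub(r'“([^“”]*)$', rf'{OPEN_TAG}\1{CLOSE_TAG}', chunk)
    # Rule 3: a quote that closes here but opened in an earlier chunk.
    chunk = re.sub(r'^([^“”]*)”', rf'{OPEN_TAG}\1{CLOSE_TAG}', chunk)
    return chunk

text = 'She said, “I am late. The train broke down. Please wait.” Then she left.'
chunks = [edit_speech(s) for s in split_sentences(text)]

for c in chunks:
    print(c)
```

The first and third chunks come out wrapped in <prosody> tags, but the middle sentence contains no quotation mark at all, so no per-sentence rule can tell that it belongs to the quote.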
So, my question is: can you develop a way to provide the option to send an extended quote to the engine as a single block, instead of splitting it into individual sentences? I'm hardly a programmer, but it's apparent that the current delimiter for the chunks sent to the TTS engine is the sentence-ending period (.). Setting the delimiter to quotation mark pairs seems like it would be simple enough to code. Or, maybe even better, let users specify the delimiters for the chunks that are sent to the TTS engine ourselves via RegEx.
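As an illustration of the kind of option being requested, here is a minimal Python sketch that keeps a quote together as one chunk. Everything in it is an assumption: the function names, the sentence splitter, and the pitch value are hypothetical, and the real chunking code may work quite differently.

```python
import re

def split_sentences(text):
    # Assumed chunking: break after . ! or ? (optionally followed by a closing quote).
    return re.split(r'(?:(?<=[.!?])|(?<=[.!?]”))\s+', text.strip())

def chunk_keeping_quotes(text, open_q='“', close_q='”'):
    """Merge sentences until every opened quote has been closed."""
    chunks, pending = [], []
    for sentence in split_sentences(text):
        pending.append(sentence)
        joined = ' '.join(pending)  # note: original whitespace is normalised to one space
        if joined.count(open_q) <= joined.count(close_q):
            chunks.append(joined)
            pending = []
    if pending:
        # Unterminated quote: flush the remainder rather than drop it.
        chunks.append(' '.join(pending))
    return chunks

text = 'She said, “I am late. The train broke down. Please wait.” Then she left.'
chunks = chunk_keeping_quotes(text)

# With the whole quote in one chunk, a single substitution rule is enough.
tagged = [re.sub(r'“([^“”]*)”', r'<prosody pitch="+15%">\1</prosody>', c)
          for c in chunks]
for t in tagged:
    print(t)
```

Because the three quoted sentences now arrive together, one opening tag and one closing tag cover all of them, including the sentence in the middle.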
I've seen articles where quoted text included the same quotation marks twice, e.g. “The quick “brown” fox jumped...” instead of what I believe is the appropriate syntax, “The quick ‘brown’ fox jumped...”. The former would likely cause a problem if all characters between (“) and (”) were sent to the engine, since that string would then be “The quick “brown”. I should also mention that some articles use straight quotes (" ') and not curled quotes (“ ‘). Allowing users to set our own delimiters with RegEx is the only solution I can see, unless you can conjure an all-condition RegEx and give us the option to activate it as a delimiter.
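The nested-quote case can be shown in a few lines of Python. This is only a sketch under assumptions: `is_balanced` is a hypothetical helper, not part of any existing product, and it deliberately ignores single quotes because the same characters also serve as apostrophes.

```python
import re

def is_balanced(text):
    """Return True when every double quote opened in `text` is also closed."""
    depth = 0              # nesting depth of curled double quotes
    straight_open = False  # straight quotes look identical, so a toggle is used
    for ch in text:
        if ch == '“':
            depth += 1
        elif ch == '”':
            depth = max(depth - 1, 0)  # ignore a stray closing quote
        elif ch == '"':
            straight_open = not straight_open
    return depth == 0 and not straight_open

nested = '“The quick “brown” fox jumped...”'

# A plain "first opening mark to first closing mark" pattern stops too early.
naive = re.search(r'“.*?”', nested).group(0)
print(naive)                # “The quick “brown”
print(is_balanced(naive))   # False: one quote is still open
print(is_balanced(nested))  # True: the full passage closes both quotes
print(is_balanced('"Wait. Stop.'), is_balanced('"Wait. Stop."'))  # False True
```

Counting depth instead of matching the first closing mark handles the doubled curled quotes, and the toggle covers straight double quotes, though a text that mixes styles carelessly could still fool it.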
Please let me know what you think of this.
Best regards!
post edited by DJugs - 2020/04/21 09:57:37