Date interpretation help

Author
laLibrarian
User
  • Total Posts : 0
  • Reward points: 0
  • Joined: 2021/03/20 09:05:09
  • Status: offline
2021/03/20 12:34:13 (permalink)

I really appreciate @Voice for reading epubs (pro license purchased) -- thanks! The speech edit feature is usually great.
 
But I'm stuck: I'm 'reading' a book that starts every chapter with an American-format date (month/day/year). As long as the day of the month is >12, the dates are read properly: 1/13/1959 is said as "13th of January 1959". But if the day of the month is ≤12, the dates are read in European order: 1/3/59 is read as "first of March" instead of "third of January", and 2/8/1959 as "second of August" instead of "8th of February".
 
My RegEx skill is minimal and I'm unfamiliar with SSML (I know HTML). I could use some help with the formulas for replacing these dates.
 
I'm trying to match on \b(?<day>\d{1,2})/(?<month>\d{1,2})/(?<year>\d{2,4})\b, which seems to work (found on a Microsoft site).
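
A note on the pattern above: it matches fine, but for American dates the first number is the month, so the group names "day" and "month" are swapped relative to what they capture. A minimal Python sketch of the same pattern with the names in month/day order follows. Python writes named groups as (?P<name>...) rather than (?<name>...), and the names US_DATE and parts are made up for this illustration.

```python
import re

# Same shape as the pattern in the post, but the group names follow
# American order: month first, then day. (Names are illustrative only.)
US_DATE = re.compile(
    r"\b(?P<month>\d{1,2})/(?P<day>\d{1,2})/(?P<year>\d{2,4})\b"
)


def parts(text):
    """Return the month/day/year strings of the first date found, or None."""
    match = US_DATE.search(text)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parts("Chapter 1: 1/13/1959"))
```

The group names do not change what is matched; they only matter if a replacement refers to the groups by name.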
 
And for replacements I've tried mm/dd/yyyy, <mdy>, $M $D $Y, and '[Month][Date Field Separator][Day][Date Field Separator][Year]'. The SSML documentation shows <say-as interpret-as="date" format="dm">4/6</say-as> as a formula, but that's too specific for this use case. Using <say-as interpret-as="date" format="mdy"> as a replacement formula did not work.
 
I've spent almost 3 hours on this already, please help ...
 
p.s. I don't see a way to search the forum for keywords, am I missing something?


    Admin
    Administrator
    • Total Posts : 275
    • Reward points: 0
    • Joined: 2010/11/22 00:00:00
    • Location: USA
    • Status: offline
    Re: Date interpretation help 2021/03/20 14:56:17 (permalink)
    I would not put much faith in the SSML documentation - it may say one thing, but what the TTS voice actually in use understands and implements is another. Also, some voices will read a date formatted like this correctly, at least if your TTS voice is set to American English. For example, Google TTS American English voices say them correctly in my tests.
     
    Otherwise, what would you like the TTS voice to actually read aloud? When you encounter e.g.:
     
    10/13/1959
     
    Would it be OK if it reads "ten thirteen, nineteen fifty nine"? Most people in America would understand it perfectly. Or do you want the app to say precisely: "thirteenth of October, nineteen fifty nine" or "October thirteen, nineteen fifty nine"?
     
    If the first option above is OK, then a simple substitution like:
     
    Type: RegEx
    Pattern: \b(\d{1,2})/(\d{1,2})/(\d{2,4})\b
    Replace: $1 $2, $3
     
    is all you need. If you want the name of the month to be said aloud, again we don't need SSML (which may not be fully implemented, or may be implemented incorrectly); instead we would need to create 12 substitutions, one per month, like:
     
    Type: RegEx
    Pattern: \b(1|01)/(\d{1,2})/(\d{2,4})\b
    Replace: January $2, $3
     
    Type: RegEx
    Pattern: \b(2|02)/(\d{1,2})/(\d{2,4})\b
    Replace: February $2, $3
     
    ...
     
    Type: RegEx
    Pattern: \b(12)/(\d{1,2})/(\d{2,4})\b
    Replace: December $2, $3
     
    If the TTS voice you are using says years like "one thousand nine hundred fifty nine" and you want "nineteen fifty nine", then we need to modify the year part - here I'm giving only the simple numeric replacement:
     
    Type: RegEx
    Pattern: \b(\d{1,2})/(\d{1,2})/(\d{2})(\d{2}){0,1}\b
    Replace: $1 $2, $3 $4
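
The effect of splitting the year can be tried in Python as a quick sanity check. This is only a sketch: the function name split_year is made up, Python uses \1 where the app uses $1, and the app's regex engine may treat the empty fourth group differently from Python (which substitutes an empty string for it in Python 3.5 and later).

```python
import re

# Same pattern as above: the year is captured as two digits plus an
# optional second pair, so 1959 becomes "19" and "59".
DATE_WITH_SPLIT_YEAR = re.compile(
    r"\b(\d{1,2})/(\d{1,2})/(\d{2})(\d{2}){0,1}\b"
)


def split_year(text):
    """Rewrite M/D/YYYY as 'M D, YY YY' so the year is read as two pairs."""
    return DATE_WITH_SPLIT_YEAR.sub(r"\1 \2, \3 \4", text)


if __name__ == "__main__":
    print(repr(split_year("10/13/1959")))
    print(repr(split_year("1/3/59")))
```

With a two-digit year the fourth group is empty, so the result ends with a stray space, which should make no difference to what the voice says.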
     
     
    laLibrarian
    User
    • Total Posts : 0
    • Reward points: 0
    • Joined: 2021/03/20 09:05:09
    • Status: offline
    Re: Date interpretation help 2021/03/20 16:02:50 (permalink)
    Thank you. The Ivona voice I'm using is already saying the name of the month. But I'd be OK with your first suggestion.
    Admin
    Administrator
    • Total Posts : 275
    • Reward points: 0
    • Joined: 2010/11/22 00:00:00
    • Location: USA
    • Status: offline
    Re: Date interpretation help 2021/03/20 17:26:03 (permalink)
    In this case it's an Ivona TTS bug - it interprets dates as either American or European order, depending on the numeric value of the day part in MM/DD/YYYY format. However, if you apply any of my suggestions from the previous post, Ivona's date processing won't be used and it will just say the numbers (or month names) as instructed by the replacement definitions.
     
    Greg
    laLibrarian
    User
    • Total Posts : 0
    • Reward points: 0
    • Joined: 2021/03/20 09:05:09
    • Status: offline
    Re: Date interpretation help 2021/03/21 09:31:47 (permalink)
    It is definitely an Ivona thing, Greg. Thanks for helping me work it out.
     
    The Ivona engine does not like the simple version. But the month-by-month setup does work, although only after I exited and then restarted the @Voice app. (I actually didn't think it was working, closed the app to do other things, and then after I opened it again it was working properly.)
     
    I'll try to add screenshots of what I did for anyone else who encounters this issue, but here's the example:
    Type: RegEx
    Pattern: \b(2|02)/(\d{1,2})/(\d{2,4})\b
    Replace: $2 February $3
    Ivona actually reads 2/4/59 as "the fourth of February, fifty-nine" using these parameters – so it's still converting the day of the month to an ordinal and adding "of". Whoever wrote the Ivona code had an opinion about dates.
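
To save typing, the twelve day-first rules can be generated rather than written by hand. A small Python sketch that prints them in the Type / Pattern / Replace layout used above; the function name day_first_rules is made up, and the output still has to be entered into the app manually.

```python
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def day_first_rules():
    """Return (pattern, replace) strings for all twelve months."""
    rules = []
    for number, name in enumerate(MONTHS, start=1):
        # Months 1-9 may appear with or without a leading zero: (2|02).
        month = f"{number}|0{number}" if number < 10 else str(number)
        pattern = r"\b(" + month + r")/(\d{1,2})/(\d{2,4})\b"
        # $2 is the day and $3 the year, in the app's replacement syntax.
        rules.append((pattern, f"$2 {name} $3"))
    return rules


if __name__ == "__main__":
    for pattern, replace in day_first_rules():
        print("Type: RegEx")
        print("Pattern:", pattern)
        print("Replace:", replace)
        print()
```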
     
    Anyway, it is working well for me now -- thanks again!

    post edited by laLibrarian - 2021/03/21 09:51:05