Reading Hidden Codes in Epubs and unvoiced words between <>

Author
reyanoo
User
  • Total Posts : 0
  • Reward points: 0
  • Joined: 2018/10/18 09:58:47
  • Status: offline
2018/10/18 20:16:43 (permalink)

Reading Hidden Codes in Epubs and unvoiced words between <>

Hi,
 
This is my first time posting here, and I hope I am asking in the correct forum. Please bear with me, as these may be some very stupid questions.
 
Sometimes when I play epub books (only epubs) in @Voice, it reads the hidden codes behind the text styling.
Example: 
Actual Text    With that in mind, I readily agreed to go without hesitation 
How it reads   With that in mind, I readily agreed to font style equals 2 binds slash slash go without hesitation 
 
As a temporary workaround I copy and paste the chapters into a .txt file and save it as plain text, so that only the visible text is read. That takes only a small effort, but it is annoying when the problem occurs while I am listening while driving.
So I would like the TTS to read only the visible text from the epub, as shown on the screen.
 
Another small thing: most of the time the TTS doesn't read words between <>. I say most of the time because sometimes it does read them if there are two or more words, as in <two words>.
Example:
Actual Text    (and the man was going to <Paris> after he finishes the job)
How it reads  (and the man was going to  after he finishes the job)
 
There's probably a very easy solution to these issues that eludes me given my ignorance, so please be kind :)
 
Thank you in advance
Admin
Administrator
  • Total Posts : 275
  • Reward points: 0
  • Joined: 2010/11/22 00:00:00
  • Location: USA
  • Status: offline
Re: Reading Hidden Codes in Epubs and unvoiced words between <> 2018/10/18 20:45:44 (permalink)
Hi,
thanks for writing! This usually happens when there is an HTML encoding error in the epub file. However, to tell more, I would need to have that epub file and examine the code myself. If during that examination I find an error in @Voice code that I can fix, or even if it's not an error in my code but I can somehow automatically correct the error in the HTML encoding, I'll certainly do so.
If you legally can, please email me this ebook file as an attachment, or send me a link to download or purchase this book. Also indicate exactly where the error happens: which chapter, and the place within that chapter. Thank you!
 
Greg
reyanoo
User
  • Total Posts : 0
  • Reward points: 0
  • Joined: 2018/10/18 09:58:47
  • Status: offline
Re: Reading Hidden Codes in Epubs and unvoiced words between <> 2018/10/18 21:17:58 (permalink)
Hi,
 
Thanks for the quick response,
Here is a link to a novel in which you will find the mentioned error on the first page (Title Page) and also in some places in the middle of the chapters.
Link to GDrive
reyanoo
User
  • Total Posts : 0
  • Reward points: 0
  • Joined: 2018/10/18 09:58:47
  • Status: offline
Re: Reading Hidden Codes in Epubs and unvoiced words between <> 2018/10/20 05:49:45 (permalink)
Hi Greg,
 
Did you find a solution to my issues?
 
I would also like to add that I never had the problem of HTML code being read in epubs before the latest update. Yesterday I tried some of the old epubs that never had the problem, and they have the same issue now. Hope this helps.
post edited by reyanoo - 2018/10/20 06:04:05
Admin
Administrator
  • Total Posts : 275
  • Reward points: 0
  • Joined: 2010/11/22 00:00:00
  • Location: USA
  • Status: offline
Re: Reading Hidden Codes in Epubs and unvoiced words between <> 2018/10/20 09:15:45 (permalink)
I only just got the file. I listened to the title page and all of it was read correctly. The text on that page is:
Evil God
Arc 1
by Amane Noir
Translation Group: Uncommitted Translations
 
and this is exactly what I hear. No extra "hidden text" between <> characters is heard. Maybe you have used the @Voice "Edit speech" feature and entered some speech replacements incorrectly? Please disable them all and try listening to the same page again; then, by enabling them one by one, you can identify the offending replacements and fix or delete them.
 
Greg
reyanoo
User
  • Total Posts : 0
  • Reward points: 0
  • Joined: 2018/10/18 09:58:47
  • Status: offline
Re: Reading Hidden Codes in Epubs and unvoiced words between <> 2018/10/20 20:00:32 (permalink)
Yep, you are absolutely right.
The fault was mine: I had replaced both <> symbols with a blank, and that started the whole issue.
[<font]After deleting both entries it worked like a charm.
 
[<font]Thanks Greg.
Admin
Administrator
  • Total Posts : 275
  • Reward points: 0
  • Joined: 2010/11/22 00:00:00
  • Location: USA
  • Status: offline
Re: Reading Hidden Codes in Epubs and unvoiced words between <> 2018/10/23 18:17:26 (permalink)
Yes, epub files are HTML or XHTML files embedded inside a ZIP file (with extension renamed to .epub). Where you actually see a < character on the screen, in HTML source code it has to be encoded as:
 
&lt;
 
and > as
 
&gt;
 
If you need to replace them, you would need to replace these codes instead of the literal < or > characters. It's best to double-tap a sentence where they occur in your ebook, then go to the "Edit speech" screen and add a new replacement or edit an existing one. You will then see the sentence that you double-tapped at the bottom of the screen as "sample text", with all the codes visible. After you type your replacement, you may press the "TEST: ORIGINAL" button there to see how your replacement worked and what will be sent to the speech engine. You may also press the round loudspeaker button there and listen to that sample.
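
To make the difference concrete, here is a minimal Python sketch. It assumes, purely for illustration, that speech replacements are applied to the HTML source of a sentence before the tags are stripped; the order in which @Voice really performs these steps is not stated in this thread. The helper names `apply_replacements` and `strip_tags` and the sample sentence are hypothetical.

```python
import html
import re

def apply_replacements(source, replacements):
    """Apply plain-text speech replacements to the raw HTML source (assumed order)."""
    for old, new in replacements.items():
        source = source.replace(old, new)
    return source

def strip_tags(source):
    """Crude tag stripper for illustration only: drop <...> runs, then decode entities."""
    text = re.sub(r"<[^>]*>", "", source)
    return html.unescape(text)

# Hypothetical sentence source: one real tag, plus a visible <Paris> written with entity codes.
source = 'going to <font size="2">&lt;Paris&gt;</font> after the job'

# Replacing literal < and > destroys the tag delimiters, so the markup turns into speakable text.
broken = strip_tags(apply_replacements(source, {"<": " ", ">": " "}))

# Replacing the entity codes instead leaves the real tags intact, so they can still be removed.
fixed = strip_tags(apply_replacements(source, {"&lt;": "", "&gt;": ""}))

print(broken)
print(fixed)
```

Under these assumptions the first variant leaves `font size="2"` in the text that would be spoken, while the second yields only the visible words.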
reyanoo
User
  • Total Posts : 0
  • Reward points: 0
  • Joined: 2018/10/18 09:58:47
  • Status: offline
Re: Reading Hidden Codes in Epubs and unvoiced words between <> 2018/11/12 14:27:19 (permalink)
Very helpful.
Thanks a lot Greg,
Rofa
User
  • Total Posts : 0
  • Reward points: 0
  • Joined: 2019/10/15 15:33:56
  • Status: offline
Re: Reading Hidden Codes in Epubs and unvoiced words between <> 2019/10/16 01:58:42 (permalink)
Dear Greg,
I have a similar problem, but it is more serious.
The program always reads all the HTML code that is hidden in the text.
It reads correctly only if I delete all the formation of the text first.
How can I solve this problem?
Thanks in advance.
Rofa
Admin
Administrator
  • Total Posts : 275
  • Reward points: 0
  • Joined: 2010/11/22 00:00:00
  • Location: USA
  • Status: offline
Re: Reading Hidden Codes in Epubs and unvoiced words between <> 2019/10/16 04:23:20 (permalink)
By "delete all the formation of the text", do you mean the speech replacements under the @Voice Settings menu - Edit speech function? If so, one or more of your speech replacements are incorrect and cause this problem. Disable them all and test reading. If it's OK, you may try re-enabling them one by one or in smaller groups, testing each time, until you identify the offending entries. Delete the problematic entries completely.
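
The disable-and-re-enable procedure can be thought of as a search over the replacement list. The Python sketch below is only a model of that manual procedure, not @Voice code: `reads_correctly` stands in for "listen to a test passage and judge", the replacement entries are made up, and it assumes a group reads correctly exactly when it contains no offending entry.

```python
def find_offenders(entries, reads_correctly):
    """Return the entries that break reading, testing groups before single entries."""
    if not entries or reads_correctly(entries):
        return []                  # the whole group is fine, nothing to split
    if len(entries) == 1:
        return list(entries)       # a single failing entry is an offender
    mid = len(entries) // 2
    return (find_offenders(entries[:mid], reads_correctly)
            + find_offenders(entries[mid:], reads_correctly))

# Hypothetical replacement list: (pattern, replacement) pairs.
entries = [("Dr.", "Doctor"), ("<", " "), ("e.g.", "for example"), (">", " ")]

def reads_correctly(enabled):
    # Stand-in for listening with only the `enabled` entries switched on.
    return not any(pattern in ("<", ">") for pattern, _ in enabled)

offenders = find_offenders(entries, reads_correctly)
print(offenders)
```

Testing in halves saves listening passes when the list is long and only one or two entries are at fault; with a short list, going one by one is just as practical.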
Rofa
User
  • Total Posts : 0
  • Reward points: 0
  • Joined: 2019/10/15 15:33:56
  • Status: offline
Re: Reading Hidden Codes in Epubs and unvoiced words between <> 2019/10/17 13:03:31 (permalink)
No, I do not mean speech replacements. The problem occurs when the text is bold or italic, or is a headline, and so on.
I have no problems with pure text.
But now the problem is less serious after I discovered your advice to reyanoo.
I wrote the code [<font] into the speech replacements and it is much better now.
Here is an example:
 
This is the text I would like to be read:
70,218 women and girls in Germany have been circumcised, according to the statistics on unreported cases that Terre des Femmes – Menschenrechte für die Frau published for the International Day of the Girl on 11 October.

And this is the text the program actually reads:
70,218 women and girls in Germany have been circumcised, according to the <a href="https://www.frauenrechte.de/images/downloads/presse/dunkelzifferstatistik/FGM-C-Dunkelzifferstatistik-2019.pdf"; class="avar_hl_snt" style="color: rgb(0, 0, 0);">statistics on unreported cases that Terre des Femmes – Menschenrechte für die Frau published for the International Day of the Girl on 11 October.
 
Greetings from Rofa
Admin
Administrator
  • Total Posts : 275
  • Reward points: 0
  • Joined: 2010/11/22 00:00:00
  • Location: USA
  • Status: offline
Re: Reading Hidden Codes in Epubs and unvoiced words between <> 2019/10/18 04:14:01 (permalink)
The first test to make on your device with this file: open the Settings menu in @Voice and press "Edit speech". Look at the list of speech replacements. If any replacements are defined there at all, press menu - "Disable all". Then go back to that text and try reading aloud again.
 
If the above does not help, I can only advise further if you send me the ORIGINAL file, or a link to it, on which this problem happens. It may be that the original text (whatever it is: a web page, an ebook chapter or something else) has an error in its HTML formatting.
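
A rough well-formedness check can hint at whether the markup itself is broken. The sketch below assumes the book's chapters are XHTML inside the epub's ZIP container and uses only Python's standard library; `find_malformed` and the in-memory demo archive are hypothetical and do not form a complete epub.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

def find_malformed(epub_file):
    """Return {member name: parser message} for XHTML/HTML members that fail to parse as XML."""
    problems = {}
    with zipfile.ZipFile(epub_file) as archive:
        for name in archive.namelist():
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                try:
                    ET.fromstring(archive.read(name))
                except ET.ParseError as err:
                    problems[name] = str(err)
    return problems

# Demo on a tiny in-memory archive with one well-formed and one broken chapter.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("good.xhtml", "<html><body><p>to &lt;Paris&gt; today</p></body></html>")
    archive.writestr("bad.xhtml", "<html><body><p>to <a href='x'>Paris</p></body></html>")
buffer.seek(0)

problems = find_malformed(buffer)
print(problems)
```

A strict XML parser is harsher than a typical HTML reader: it will also flag things such as `&nbsp;` used without a DTD, so a reported problem is a hint about where to look, not proof that this is what the speech engine trips over.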
 
Rofa
User
  • Total Posts : 0
  • Reward points: 0
  • Joined: 2019/10/15 15:33:56
  • Status: offline
Re: Reading Hidden Codes in Epubs and unvoiced words between <> 2019/10/18 13:26:53 (permalink)
I disabled all my 21 speech replacements and it looks as if this was the solution.
Thank you very much!
Rofa
post edited by Rofa - 2019/10/18 13:30:01
Admin
Administrator
  • Total Posts : 275
  • Reward points: 0
  • Joined: 2010/11/22 00:00:00
  • Location: USA
  • Status: offline
Re: Reading Hidden Codes in Epubs and unvoiced words between <> 2019/10/18 15:51:36 (permalink)
Thank you. If you want to recover the "good" speech replacements from your list, you could try re-enabling them one by one or in smaller groups, testing reading aloud each time, until you identify the offending replacement(s) and delete or fix them...
CorbettMD
User
  • Total Posts : 0
  • Reward points: 0
  • Joined: 2021/08/28 10:59:32
  • Location: Toronto, Canada
  • Status: offline
Re: Reading Hidden Codes in Epubs and unvoiced words between <> 2022/09/14 13:47:04 (permalink)
Hello! Useful post. I have a more subtle puzzle to solve. I use a lot of regex to clean up text and remove citations... But for .epub articles from some journals, the HTML often found inside inline citations messes up regex matching. Here is an example article and my settings files: https://drive.google.com/...qusbueG8KG1mkZIKw6pD_P
Admin
Administrator
  • Total Posts : 275
  • Reward points: 0
  • Joined: 2010/11/22 00:00:00
  • Location: USA
  • Status: offline
Re: Reading Hidden Codes in Epubs and unvoiced words between <> 2022/10/02 20:31:12 (permalink)
This RegEx will match each <sup> ... </sup> pair together with anything written between the tags:
 
Type: RegEx
Pattern: <sup\b[^>]*>.*?</sup>
Replace:                     (whatever you want or empty to suppress)
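
A pattern of this kind can be tried outside the app before it is typed into "Edit speech". The Python sketch below compares two spellings of the opening-tag part, `.*` and `[^>]*`, on a made-up sentence with two citations. The regex engine @Voice uses is not stated in this thread, so Python's `re` module is only a stand-in and details may differ.

```python
import re

# Hypothetical sentence source with two superscript citations.
sample = 'First claim<sup class="ref">1</sup> and second claim<sup>2</sup> end.'

# Opening tag matched with a greedy .* (can run on past the tag's own closing >).
greedy_tag = re.compile(r"<sup\b.*>.*?</sup>")

# Opening tag limited to characters other than >, so each citation is matched separately.
bounded_tag = re.compile(r"<sup\b[^>]*>.*?</sup>")

greedy_result = greedy_tag.sub("", sample)
bounded_result = bounded_tag.sub("", sample)

print(greedy_result)
print(bounded_result)
```

With the greedy form, one match stretches from the first `<sup` to the last `</sup>` in the sentence and removes the words in between; the bounded form removes only the two citations.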
 
Greg