[FAQ] Silencing citations in scientific papers

Author
Admin
Administrator
  • Total Posts : 275
  • Reward points: 0
  • Joined: 2010/11/22 00:00:00
  • Location: USA
  • Status: offline
2017/09/20 19:47:13 (permalink)

Silencing citations in scientific papers

The regular expression (RegEx) given below makes the @Voice app skip, when reading aloud, the parenthetical citations often found in scientific papers. They are formatted like these:
(Derwing, Rossiter, & Munro, 2002; Thomas, 2004)

(American Psychological Association, 1993; Arredondo et al., 1996; Council of National Psychological Associations for the Advancement of Ethnic Minorities, 2000; Cross, Bazron, Dennis, & Isaacs, 1989; Dulles Conference Task Force, 1978; C. Hall, 1997; Korman, 1974; Marsella, 1998; President’s Commission on Mental Health, 1978; Ridley, Mendoza, Kanitz, Angermeier, & Zenk, 1994; D W Sue, Arredondo, & McDavis, 1992; D W Sue, Bingham, Porche-Burke, & Vasquez, 1999; D W Sue et al., 1982; D W Sue, Carter, et al., 1998)
Here are instructions on how to create a speech replacement to skip these citations when reading aloud:
 
In the @Voice app, press the menu button (three vertical dots at the top right of the screen), then press Settings - Edit speech. Press the [+] button at the top to create a new speech replacement, then enter the following:
 
Replace type:   RegEx
Pattern:   
(\([^)]*,\s*\d\d\d\d(\)|(;[^)])*)\))|(\([^)]*,\s*\d\d\d\d(\)|(;[^)]*))$)|(^[^()]*,\s*\d\d\d\d(\)|(;[^)])*)\))|(^\s*\d\d\d\d\))
Replace with:  - leave empty for silence
 
Note - the "Pattern" field content must all be on one line. It is very complex and every character matters, so it is best to open this page in a web browser on your Android device and copy the "Pattern" content, from the very first "(" character up to and including the last ")". Then switch to the @Voice app, open Edit speech and paste the expression into the "Pattern" field.
 
If you want, instead of leaving the "Replace with:" field empty, you can enter some other text there, for example "(citations)". In that case you will hear the single word "citations" instead of the long list of names and publication years.
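The pattern can also be tried out away from the device before pasting it into @Voice. What follows is a minimal Python sketch, not something taken from the app: it assumes that Python's `re` module handles this pattern the same way as the RegEx engine inside @Voice, which has not been verified here. `CITATION` is the Pattern field above copied verbatim, and the sample sentence around the citation is made up.

```python
import re

# The "Pattern" field from this post, copied verbatim and joined into one line.
CITATION = re.compile(
    r"(\([^)]*,\s*\d\d\d\d(\)|(;[^)])*)\))"
    r"|(\([^)]*,\s*\d\d\d\d(\)|(;[^)]*))$)"
    r"|(^[^()]*,\s*\d\d\d\d(\)|(;[^)])*)\))"
    r"|(^\s*\d\d\d\d\))"
)

# Made-up sentence wrapped around one of the sample citations.
sentence = ("Some text here "
            "(Derwing, Rossiter, & Munro, 2002; Thomas, 2004)"
            " and more text.")

# Empty "Replace with" field: the citation disappears completely.
silenced = CITATION.sub("", sentence)

# "Replace with" set to "(citations)": a short placeholder is left behind.
# A callable is used so the replacement text is inserted literally.
announced = CITATION.sub(lambda match: "(citations)", sentence)

print(silenced)
print(announced)
```

Note that the space on either side of the removed citation stays in the text, which should make no audible difference.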
 
More background info:
You might ask why this expression is so long and complex. The “Edit speech” feature in @Voice works within one “sentence” only, or rather within one fragment of text that @Voice sends to the TTS engine as a single piece. These pieces are normally sentences, but sentence splitting sometimes cannot distinguish a dot after an abbreviation that @Voice does not know from a dot ending a sentence. Further, some TTS engines limit the maximum length of text to about 500 characters, so if a sentence is longer, I have to split it.
 
Some citations are longer than 500 characters, so Edit speech cannot handle them well. Still, I managed to build the above Regular Expression (RegEx) so that it matches a complete citation within one phrase (one yellow highlight in @Voice), a partial one that is broken off at the end of the phrase, or one that continues from the previous phrase and ends in the current one.
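The three cases can be illustrated with a short Python sketch. It assumes Python's `re` module behaves like the engine in @Voice for this pattern, and the split point below is invented for the example: where @Voice actually cuts a long sentence is not known here.

```python
import re

# Same pattern as in the "Pattern" field above, copied verbatim.
CITATION = re.compile(
    r"(\([^)]*,\s*\d\d\d\d(\)|(;[^)])*)\))"
    r"|(\([^)]*,\s*\d\d\d\d(\)|(;[^)]*))$)"
    r"|(^[^()]*,\s*\d\d\d\d(\)|(;[^)])*)\))"
    r"|(^\s*\d\d\d\d\))"
)

# Hypothetical split of one long citation into two phrases.
first = "Guidelines exist (Korman, 1974; Marsella, 1998; Ridley, Mendoza,"
second = " Kanitz, Angermeier, & Zenk, 1994; C. Hall, 1997) for this purpose."

# The citation is broken off at the end of the phrase: the second alternative
# (the one ending in "$") removes everything from "(" to the end.
first_spoken = CITATION.sub("", first)

# The citation continues from the previous phrase: the third alternative
# (anchored with "^") removes everything up to and including ")".
second_spoken = CITATION.sub("", second)

# A phrase without any citation is left untouched.
plain_spoken = CITATION.sub("", "Nothing to skip here.")

print(repr(first_spoken))
print(repr(second_spoken))
print(repr(plain_spoken))
```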
 
Of course, if you manage to improve the above RegEx replacement or have better ideas, please share them in this thread by posting below. Happy and productive listening!
 
Greg
post edited by Admin - 2017/09/20 19:53:43
leopoldocosta
User
  • Total Posts : 0
  • Reward points: 0
  • Joined: 2018/10/26 05:56:15
  • Status: offline
Re: Silencing citations in scientific papers 2018/10/26 10:02:51 (permalink)
Dear Greg, can you help me understand why it didn't work for the following text?


A desobediência da lei de Deus é pecado (I João 3:4; 5:17) e é o que provoca a separação eterna da presença de Deus (Gên. 2:17; Rom. 6:23).




It keeps reading the text inside the parentheses.
Admin
Administrator
Re: Silencing citations in scientific papers 2018/10/28 10:43:43 (permalink)
Well, these citations have a different format, which does not match the RegEx created for scientific papers... The following substitution should work for you:
 
Replace type:   RegEx
Pattern:   \([^)]*?\d+:\d+\)
Replace with:  - leave empty for silence
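As a quick check outside the app, here is a small Python sketch of this substitution applied to the whole sentence from the question. It assumes Python's `re` module treats the pattern like the engine in @Voice, and that the sentence reaches the pattern in one piece.

```python
import re

# The pattern suggested in this reply: an opening parenthesis, as few
# non-")" characters as needed, then "digits:digits" right before ")".
VERSE_REF = re.compile(r"\([^)]*?\d+:\d+\)")

text = ("A desobediência da lei de Deus é pecado (I João 3:4; 5:17) e é o que "
        "provoca a separação eterna da presença de Deus (Gên. 2:17; Rom. 6:23).")

# Empty replacement, as with an empty "Replace with" field.
cleaned = VERSE_REF.sub("", text)
print(cleaned)
```

Both parenthesized references are removed here, including the parts before the semicolons, because `[^)]*?` keeps extending until a chapter:verse pair is directly followed by the closing parenthesis.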
 
 
Greg
leopoldocosta
User
Re: Silencing citations in scientific papers 2018/10/28 11:48:09 (permalink)
Dear Greg,

Thanks for your reply. I tested it, but what happened is that it removed only the text after the semicolon. The citations without a semicolon were removed correctly, thank you.

Can you help with the ones that contain a semicolon?

Original:
A desobediência da lei de Deus é pecado (I João 3:4; 5:17) e é o que provoca a separação eterna da presença de Deus (Gên. 2:17; Rom. 6:23).

Modified:
A desobediência da lei de Deus é pecado (I João 3:4; e é o que provoca a separação eterna da presença de Deus (Gên. 2:17;.



Also, the text inside parentheses may or may not contain dashes, dots and/or semicolons.

In my readings, all text inside parentheses can be removed without any problem.
post edited by leopoldocosta - 2018/10/28 15:47:55
Admin
Administrator
Re: Silencing citations in scientific papers 2018/11/02 08:39:18 (permalink)
Sorry about the late answer. For me the RegEx works fine for the "(I João 3:4; 5:17)" part, so maybe you mistyped some characters in the "Pattern" field, or added unnecessary spaces at the start, at the end or in the middle of the expression "\([^)]*?\d+:\d+\)". There are other problems though - @Voice does not recognize "Gên." and "Rom." as valid abbreviations for Portuguese and ends the sentence at each of those dots. I must think of some other solutions to the problem of abbreviations...
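The effect of the unrecognized abbreviations can be sketched in Python. The split points below are an assumption based on the description above (a sentence ending after each unknown abbreviation); the exact fragments @Voice produces may differ.

```python
import re

VERSE_REF = re.compile(r"\([^)]*?\d+:\d+\)")

# Assumed fragments: the sentence is cut after "Gên." and after "Rom.".
fragments = ["da presença de Deus (Gên.", " 2:17; Rom.", " 6:23)."]

# The replacement is applied to each fragment separately, so no fragment
# contains both the opening "(" and a "chapter:verse)" ending.
spoken = [VERSE_REF.sub("", fragment) for fragment in fragments]

# The same text in one piece matches without trouble.
joined = VERSE_REF.sub("", "".join(fragments))

print(spoken)
print(joined)
```

Every fragment comes back unchanged, while the joined text loses the whole reference, which is why the abbreviations matter more than the pattern in this case.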
leopoldocosta
User
Re: Silencing citations in scientific papers 2019/01/28 09:58:42 (permalink)
Hi Admin, thanks for the reply.

Did you think of other solutions for the "Gên." and "Rom." abbreviations?
Admin
Administrator
Re: Silencing citations in scientific papers 2019/01/28 10:36:38 (permalink)
They could be added to a list of known abbreviations, say, for the Portuguese language. This can be done as follows:
 
1. Create a text file named abbrev-por.txt. The contents of this file should be:
 
Gên.
Rom.
 
It could contain more abbreviations if needed, one per line, similar to the above.
 
2. Save that file into the directory named .config (exactly like this, starting with a dot), under the main data folder of the @Voice app. You can find out what this folder is on your device by opening the Settings menu in @Voice, then pressing "@Voice folder location". Look at the "Location:" line near the top and note the folder path, but do not change anything there, just press the Back button.
 
Note: any file or folder with a name starting with a dot, like .config, is considered a "hidden" file or folder on Android and similar systems. This won't matter if you look at the folders after connecting your phone to, say, a PC: it will show you all files and folders and let you access them, including the ones with names starting with a dot.
 
If you want to do this on the Android device only, you will need a good file explorer app that is set to show hidden files and folders as well, such as "ES File Explorer" from Google Play, preferably the paid version (the free one is not so good).
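If the file is prepared on a computer, the two steps above could be scripted roughly as follows. This is only a sketch: the helper name `write_abbreviations` is made up, the folder used in the demo is a temporary stand-in for the real @Voice data folder (use the path shown under "Location:" instead), and UTF-8 is an assumption about the encoding @Voice expects.

```python
import tempfile
from pathlib import Path
from typing import Iterable


def write_abbreviations(voice_folder: Path, lang_code: str,
                        abbreviations: Iterable[str]) -> Path:
    """Write abbrev-<lang_code>.txt, one abbreviation per line, into .config."""
    config_dir = voice_folder / ".config"   # hidden folder, note the leading dot
    config_dir.mkdir(parents=True, exist_ok=True)
    target = config_dir / f"abbrev-{lang_code}.txt"
    # UTF-8 is assumed here; the post does not say which encoding is expected.
    target.write_text("\n".join(abbreviations) + "\n", encoding="utf-8")
    return target


# Stand-in for the @Voice data folder, so the sketch runs anywhere.
demo_folder = Path(tempfile.mkdtemp())
abbrev_file = write_abbreviations(demo_folder, "por", ["Gên.", "Rom."])

print(abbrev_file)
print(abbrev_file.read_text(encoding="utf-8"))
```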
 
Greg
frozzenLemon
User
  • Total Posts : 0
  • Reward points: 0
  • Joined: 2022/02/17 14:07:08
  • Status: offline
Re: Silencing citations in scientific papers 2022/02/17 15:50:15 (permalink)
Hello Admin,
First off, I'd like to thank you for helping people like me understand and use the @Voice app for listening to reading material. My question concerns the "Pattern" expression given above. In my PDFs, I have some citations with page numbers included, which the speech edit is not affecting. This could be happening because of the period after 'p', but I do not know. I have included a sample from the reading that shows what I'm talking about.
 
Example:
"Thus, urban space as a whole, and the places that make up urban life,2 are in constant communication and defined by a complex realm of social practices (Chase et al., 2008, p. 6; see also Purcell, 2008).
The everyday experiences of urban space and urban life are essential for considering changing senses of place as these create personal and collective demands on the socio- spatial order. ‘The practices of everyday urbanism ...inevitably lead to change not through abstract political ideologies ...but through specific concerns that arise from the lived experience’ of urban dwellers (Chase et al., 2008, p. 10). However, when it comes to shaping urban space today, economic growth has become the imperative, arguably more so than the lived experience of place."
 
Again, thank you for your time.
Admin
Administrator
Re: Silencing citations in scientific papers 2022/02/17 17:41:20 (permalink)
The problem is that I constructed the previous RegEx to expect a semicolon (;) right after the year whenever the parentheses contain more than one citation. Yours, however, also has "p. 10" etc. (page numbers). A RegEx that covers all of this gets horribly complicated... I came up with the following, and it seems to work, but you would also need to teach @Voice to recognize "p." as an abbreviation.
 
First the RegEx - note that the below would have to be typed or pasted into the Pattern field as one line, without any spaces in it:
 
(\([^)]*,\s*\d\d\d\d\,*\s*(p\.){0,1}\s*\d*\s*(\)|(;[^)])*)\))|(\([^)]*,\s*\d\d\d\d,*\s*(p\.){0,1}\s*\d*\s*(\)|(;[^)]*))$)|(^[^()]*,\s*\d\d\d\d\,*\s*(p\.){0,1}\s*\d*\s*(\)|(;[^)])*)\))|(^\s*\d\d\d\d\,*\s*(p\.){0,1}\s*\d*\s*\))
 
To teach @Voice a new English abbreviation (so that it does not end the sentence when it encounters "p."), you would need to create a text file named abbrev-eng.txt, and type into it the abbreviations, one per line. In this case the file could contain only one line:
 
p.
 
Then copy this file to the .config folder under the @Voice home directory. You can do the copying on a computer, for example, after connecting the phone with a USB cable, or on the phone itself using the "File manager" function in the @Voice Settings menu.
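Both halves of this advice can be checked with a short Python sketch. It assumes Python's `re` module behaves like the engine in @Voice for this pattern, and the fragment split after "p." is an assumption about what happens while the abbreviation is still unknown. The sample phrases are shortened from the quoted text.

```python
import re

# The extended pattern from this reply, copied verbatim as one line.
PAGED = re.compile(
    r"(\([^)]*,\s*\d\d\d\d\,*\s*(p\.){0,1}\s*\d*\s*(\)|(;[^)])*)\))"
    r"|(\([^)]*,\s*\d\d\d\d,*\s*(p\.){0,1}\s*\d*\s*(\)|(;[^)]*))$)"
    r"|(^[^()]*,\s*\d\d\d\d\,*\s*(p\.){0,1}\s*\d*\s*(\)|(;[^)])*)\))"
    r"|(^\s*\d\d\d\d\,*\s*(p\.){0,1}\s*\d*\s*\))"
)

# Whole phrases: citations with page numbers are removed.
single = PAGED.sub("", "urban dwellers (Chase et al., 2008, p. 10).")
double = PAGED.sub(
    "", "social practices (Chase et al., 2008, p. 6; see also Purcell, 2008).")

# Assumed split after the unknown abbreviation "p.": neither piece matches,
# so the citation would still be read aloud.
pieces = ["urban dwellers (Chase et al., 2008, p.", " 10)."]
pieces_spoken = [PAGED.sub("", piece) for piece in pieces]

print(repr(single))
print(repr(double))
print(pieces_spoken)
```

The last result shows why the abbrev-eng.txt file is needed in addition to the new pattern.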
 
Greg
Juliano Borotto
User
  • Total Posts : 0
  • Reward points: 0
  • Joined: 2021/08/28 03:45:19
  • Status: offline
Re: Silencing citations in scientific papers 2022/12/17 10:45:07 (permalink)
I got it working using the following expression: \([^)]+\)
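For comparison, a brief Python sketch of what this shorter expression does, assuming Python's `re` module behaves like the engine in @Voice. The sample sentence is made up. The pattern silences every parenthesized passage, not only citations, and it still needs both parentheses within the same phrase.

```python
import re

# Anything between "(" and the next ")", at least one character long.
ANY_PARENS = re.compile(r"\([^)]+\)")

# Made-up sentence with one citation and one ordinary remark in parentheses.
text = "Urban space (Chase et al., 2008, p. 6) is shaped by policy (see Chapter 3)."
everything_removed = ANY_PARENS.sub("", text)

# A phrase that was cut off before the closing ")" is not matched.
cut_off = ANY_PARENS.sub("", "da presença de Deus (Gên.")

print(everything_removed)
print(cut_off)
```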
post edited by Juliano Borotto - 2022/12/17 16:03:57