Acoustic voice analysis in the COVID-19 era
Objective. Among the procedures used by ENT specialists, acoustic analysis of voice has become widely used for the correct diagnosis of dysphonia. Instrumental measurement of acoustic parameters was limited during the COVID-19 pandemic by the common belief that a face mask affects the results of the analysis. The purpose of our study was to investigate the impact of surgical masks on F0, jitter, shimmer and harmonics-to-noise ratio (HNR) in adults. Methods. The study was carried out on a selected group of 50 healthy subjects. Voice samples were recorded directly in Praat. All subjects were trained to produce a sustained /a/ at a conversational voice intensity, with no intensity or frequency variation, for the Maximum Phonation Time (MPT), first wearing a surgical mask and then without it. Results. None of the differences in acoustic voice parameters between recordings with and without a surgical mask were statistically significant. Conclusions. Our study demonstrates that acoustic voice analysis can continue to be performed with the patient wearing a surgical mask, even during the COVID-19 pandemic.
During the ongoing COVID-19 pandemic caused by SARS-CoV-2, the World Health Organization and other public health organisations agree that face masks can limit the spread of respiratory viral diseases 1,2. Whether masks are useful depends on the transmission mechanisms of SARS-CoV-2, which are likely a combination of contact, droplet and aerosol modes. Surgical face masks have been in use since the early 1900s to help prevent infection of surgical wounds by staff-generated oral and nasal bacteria 3. Today, their applications have evolved from prevention of patient infection to prevention of employee exposure. However, there is ongoing debate about the use of surgical masks as respiratory protection devices 4. For ENT specialists, examination of dysphonia by laryngoscopy requires unavoidable contact with the upper airway, and any reflex coughing or sneezing during the procedure exposes medical staff and office workers to direct contamination 5,6. Among the procedures used by ENT specialists, acoustic analysis of voice has become widely used for the correct diagnosis of dysphonia, but instrumental measurement of acoustic perturbation was limited during the COVID-19 pandemic by the common belief that a face mask affects the results of the analysis. The purpose of our study was to investigate the impact of surgical masks on F0, jitter, shimmer and harmonics-to-noise ratio (HNR) in adults.
Materials and methods
The study was carried out on a selected group of 50 healthy subjects (20 men and 30 women, mean age 47 years, range 26-69) recruited among hospital staff of the ENT Department of the Polyclinic Hospital in Bari (South Italy).
Participants were approached and informed about the study objectives and significance. All participants who agreed to participate in the study signed an informed consent form, previously approved by the local hospital Ethics Committee.
The inclusion criterion was the ability to phonate and sustain a vowel for at least 10 seconds. Participants were excluded if they met any of the following criteria: recent voice problems or a history of voice disorder, a condition that might affect normal voice function, any previous formal voice training or voice therapy, any laryngeal, mouth or throat abnormality, or any respiratory infection in the 2 weeks before recording. The subjects who met the selection criteria were recruited. The participants were asked to stand in front of a microphone (Samson Meteor Mic - USB Studio Condenser Microphone) at a distance of 20 cm from the lips, in a quiet room (< 30 dB background noise). Voice samples were recorded directly in Praat. All subjects were trained to produce a sustained /a/ at a conversational voice intensity, kept as constant as possible with no intensity or frequency variation, for the Maximum Phonation Time (MPT), first wearing a surgical mask and then without it. The average intensity had to fall between 55 dB and 65 dB; recordings whose average intensity was outside this range were not included. The vocal parameters analysed with Praat were median pitch, mean pitch, minimum pitch, maximum pitch, number of pulses, number of periods, jitter (local), jitter (rap), jitter (ppq5), jitter (ddp), shimmer (local), shimmer (apq3), shimmer (apq5), shimmer (apq11), shimmer (dda) and mean harmonics-to-noise ratio (HNR).
The results are reported as mean and standard deviation (SD). They were then submitted to statistical analysis by comparing the mean values of each parameter. All parameters were analysed in the same subjects during phonation with a surgical mask (SM) and without a surgical mask (NSM). We used Student’s t-test with a significance level of 0.05, after computing the t value for each parameter.
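The comparison described above can be sketched from summary statistics alone. The paper does not state which variant of Student's t-test was applied, so the sketch below assumes an unpaired, equal-variance test on two groups of 50 recordings; the function names are illustrative and not part of the study. The p-value is obtained by numerically integrating the t density, so only the standard library is needed.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the t density over [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 < T < |t|)
    return 1.0 - 2.0 * central_area

def unpaired_t_test(mean_a, sd_a, mean_b, sd_b, n):
    """Equal-variance two-sample t-test for two groups of equal size n.

    Assumption: the unpaired variant is used here; the source only says
    "Student's test".
    """
    pooled_var = (sd_a ** 2 + sd_b ** 2) / 2
    se = math.sqrt(pooled_var * 2 / n)
    t = (mean_a - mean_b) / se
    df = 2 * n - 2
    return t, two_sided_p(t, df)

# Median pitch, SM vs NSM, using the means and SDs reported in the Results.
t, p = unpaired_t_test(187.36, 52.36, 189.38, 55.52, n=50)
print(f"median pitch: t = {t:.3f}, p = {p:.4f}")
```

Under this assumption, the median pitch comparison gives a p-value of about 0.85, in line with the reported p = 0.8523. A paired test would be the natural alternative for repeated measurements on the same subjects, but it needs the per-subject differences, which are not given in the text.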
As illustrated in Table I, the acoustic analysis showed no significant difference (at the 0.05 level) between the two conditions (wearing and not wearing a surgical mask) in median pitch (Mean SM = 187.36 Hz; SD SM = 52.36; Mean NSM = 189.38 Hz; SD NSM = 55.52; p = 0.8523) or in mean pitch (Mean SM = 183.52 Hz; SD SM = 51.13; Mean NSM = 185.52 Hz; SD NSM = 55.12; p = 0.8513).
As can be seen in Table II, differences in HNR values were not significant (Mean SM = 20.91 dB; SD SM = 3.44; Mean NSM = 20.92 dB; SD NSM = 3.47; p = 0.9885). Likewise, no significant differences were observed in jitter or shimmer values (jitter local: Mean SM = 0.327%; SD SM = 0.134; Mean NSM = 0.298%; SD NSM = 0.124; p = 0.2641; shimmer local: Mean SM = 3.34%; SD SM = 1.420; Mean NSM = 3.165%; SD NSM = 1.572; p = 0.5605) (Tabs. III, IV). In conclusion, none of the differences in acoustic voice parameters measured in the same subjects with and without a surgical mask were statistically significant.
Acoustic voice analysis is considered a very useful technique for detecting voice disorders through the analysis of several acoustic parameters 7. Subjective assessment methods, such as auditory perceptual analysis, depend largely on the experience of the professional and may lead to different results. This limitation encourages the use of objective voice measurement. A speech signal is processed to yield a set of voice parameters, which allow detection of vocal fold pathologies, or other related pathologies, by comparing the patient's data with those of individuals with normal healthy voices 7. Voice disorders often require voice therapy and other treatments, which are based on an initial assessment to quantify deviation from normal measures and on ongoing evaluation to record progress. Measuring treatment outcomes is a basic component of evidence-based practice. The objective assessment of voice, especially acoustic analysis, has received our attention because of its comparatively low cost, ease of application and quantitative output. Previous studies 8,9 have found that fundamental frequency (F0) can be affected by different factors, i.e., age, vocal fold length and language or ethnological background. Until now, no study has investigated the effects of the use of a surgical mask on acoustic parameters. According to previous studies, one of the most investigated acoustic voice parameters has been voice perturbation 10,11. Accordingly, we investigated parameters such as F0, jitter, shimmer and harmonics-to-noise ratio (HNR) during phonation with and then without a surgical mask. The fundamental frequency or mean pitch (F0) of a speech signal refers to the approximate frequency of the (quasi-)periodic structure of voiced speech signals. Jitter (%) is defined as the cycle-to-cycle, short-term perturbation in the fundamental frequency of the voice.
Shimmer (%) is the cycle-to-cycle, short-term perturbation in the amplitude of the voice. Another acoustic parameter, HNR, is influenced by both shimmer and jitter and is defined as the mean ratio of harmonic to non-harmonic energy 12.
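The definitions above can be made concrete with a short sketch. It follows the usual definition of the "local" measures (mean absolute difference between consecutive cycles, divided by the mean value) and the usual decibel form of HNR. The cycle data are invented for illustration, the function names are hypothetical, and Praat's additional safeguards, such as its period floor and ceiling, are deliberately left out.

```python
import math

def local_perturbation(values):
    """Mean absolute difference between consecutive cycle values,
    divided by the mean value, expressed as a percentage."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value

def hnr_db(periodic_fraction):
    """HNR in dB, given the fraction of signal energy that is periodic."""
    if not 0.0 < periodic_fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return 10.0 * math.log10(periodic_fraction / (1.0 - periodic_fraction))

# Hypothetical glottal cycle data, invented for illustration only.
periods_s = [0.00500, 0.00502, 0.00499, 0.00501, 0.00500]  # cycle durations
amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]                # cycle peak amplitudes

jitter_local = local_perturbation(periods_s)    # perturbation of period (F0)
shimmer_local = local_perturbation(amplitudes)  # perturbation of amplitude

print(f"jitter (local):  {jitter_local:.3f} %")
print(f"shimmer (local): {shimmer_local:.3f} %")
print(f"HNR at 99% periodic energy: {hnr_db(0.99):.2f} dB")
```

Jitter and shimmer share one formula and differ only in what is measured per cycle: period duration for jitter, peak amplitude for shimmer. On the decibel scale, a signal with 99% of its energy in the periodic part has an HNR of about 20 dB, which is the order of magnitude of the mean HNR values reported in this study.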
Given the high risk of infection, only emergency consultations and procedures should be performed by ENT specialists during the COVID-19 pandemic in areas with confirmed SARS-CoV-2 cases 13. In China, Cheng et al. noted that the rate of work-related SARS-CoV-2 infection was higher among ENT specialists than in other medical specialties 14. During the lockdown in Italy, ENT activities were reduced to emergency treatments and those that could not be deferred without constituting a real loss of chance for the patient’s recovery or survival. ENT specialists are exposed to SARS-CoV-2 infection because they need to examine the upper respiratory tract. At the same time, they perform procedures that generate aerosolised secretions and often bleeding 15. According to Krajewska et al. 16, ENT units have an important role in preoperative testing for SARS-CoV-2, which should be performed in all individuals undergoing high-risk procedures. The authors also assert that chest CT should be performed in patients before ENT interventions, because it could be of great value in individuals with negative RT-PCR.
According to Tysome et al., high-risk procedures must be performed using enhanced personal protective equipment 17. As highlighted by Lescanne et al. 18, during ENT examinations or procedures that do not involve exposure to projection/aerosolisation of organic material of human origin, the ENT medical team should wear clean outfits as well as single-use gloves in case of contact with a mucosa. A face mask is a disposable device that, if worn properly, helps block large-particle droplets, sprays, splashes or splatters that may contain viruses and bacteria. It creates a physical barrier between potential contaminants in the immediate environment and the mouth and nose of the wearer, and it also helps block saliva and respiratory secretions passing from the wearer to others 19. In our study, the surgical masks used were three-ply. This three-ply material is made up of a melt-blown polymer, most commonly polypropylene, placed between non-woven fabric. For examinations and procedures with exposure to projection/aerosolisation of organic material of human origin, protection must be supplemented by wearing a surgical mask, protective goggles, a single-use plastic apron and single-use gloves. Insofar as an asymptomatic patient may be infectious, the same precautions must be employed whether the patient is ill with, suspected of having, or without any clinical evidence of COVID-19 infection 20. After the examination, the professional must carefully disrobe in compliance with hygiene rules, with the immediate disposal of gloves, hair cap, mask and gown. The room where the examination is carried out must undergo air renewal as per legislation 20. Most of these best practice recommendations are not based on scientific data established for COVID-19 infection, but come from what is known about other viral respiratory infections.
For ENT specialists, acoustic voice analysis is a very valuable technique for the diagnosis of voice disorders and for therapy monitoring 21. Speech signal processing allows the extraction of a set of voice parameters that may be used to diagnose many pathologies of the vocal cords by comparison with healthy voices. The parameters obtained by acoustic analysis have the advantage of describing the voice objectively, unlike subjective perceptual analysis, and they provide a useful way to objectify dysphonia, even during the pandemic. The use of a surgical mask gives the patient and the operator the protection needed to perform this procedure, and at the same time it does not produce relevant alterations of the vocal parameters to be analysed. Several software packages have been developed for acoustic analysis, including Praat 22, LingWAVES 23 and the Multidimensional Voice Program 24. The current study used Praat (version 6.1.16), a software package for speech, phonetic and voice analysis. It was first designed in 1992 by Paul Boersma and David Weenink from the Institute of Phonetic Sciences, University of Amsterdam. Praat runs on various operating systems and implements well-established algorithms, including an accurate pitch analysis algorithm, articulatory synthesis and the gradual learning algorithm for free variation. We used the built-in voice report option in the Praat Pulses menu, which includes pitch and perturbation analyses. In particular, the voice samples collected for perturbation measures were analysed by selecting the middle 3 seconds of the sound wave. Each acoustic signal was perceptually examined for instability and visually displayed in Praat as an oscillogram with the “Show intensity” and “Show pulses” settings enabled.
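The selection of the middle 3 seconds can be sketched as a small helper. This is an illustration only: in the study the selection was made in Praat, and the function names and the plain list of samples used here are assumptions.

```python
def middle_window(duration_s, window_s=3.0):
    """Return (start, end) times in seconds of a window centred in the recording."""
    if duration_s < window_s:
        raise ValueError("recording is shorter than the requested window")
    start = (duration_s - window_s) / 2.0
    return start, start + window_s

def slice_middle(samples, sample_rate_hz, window_s=3.0):
    """Return the samples that fall inside the centred window."""
    start, _ = middle_window(len(samples) / sample_rate_hz, window_s)
    first = int(round(start * sample_rate_hz))
    last = first + int(round(window_s * sample_rate_hz))
    return samples[first:last]

# A hypothetical 12-second sustained vowel: the analysed part runs from 4.5 s to 7.5 s.
print(middle_window(12.0))
```

Taking the centre of the recording leaves out the voice onset and offset, where pitch and intensity are least stable. Because the inclusion criterion required at least 10 seconds of sustained phonation, a 3-second window fits in any recording of that length.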
We acoustically analysed the voice samples recorded by each participant with and without the surgical mask in order to obtain objective voice measurements, including F0, jitter, shimmer and HNR. The statistical comparison between the parameters extracted with and without a surgical mask did not reveal any significant difference that would argue against performing the procedure with a mask worn for health safety reasons.
Except for COVID-19-positive cases, for which more protective devices are necessary, our study demonstrates that acoustic voice analysis can continue to be performed with the patient wearing a surgical mask during the COVID-19 pandemic.
Figures and tables
Table I
|Parameter|T-test (SM vs NSM)|
|Median pitch (Hz)|p = 0.8523|
|Mean pitch (Hz)|p = 0.8513|
|Minimum pitch (Hz)|p = 0.4549|
|Maximum pitch (Hz)|p = 0.8986|

Table II
|Parameter|T-test (SM vs NSM)|
|Number of pulses|p = 0.9800|
|Number of periods|p = 0.9791|
|Mean HNR (dB)|p = 0.9885|

Table III
|Parameter|T-test (SM vs NSM)|
|Jitter local (%)|p = 0.2641|
|Jitter rap (%)|p = 0.1051|
|Jitter ppq5 (%)|p = 0.2052|
|Jitter ddp (%)|p = 0.9764|

Table IV
|Parameter|T-test (SM vs NSM)|
|Shimmer local (%)|p = 0.5605|
|Shimmer apq3 (%)|p = 0.4531|
|Shimmer apq5 (%)|p = 0.3835|
|Shimmer apq11 (%)|p = 0.9443|
|Shimmer dda (%)|p = 0.5794|
- Publisher Full Text
- European Centre for Disease Prevention and Control. Using face masks in the community. Reducing COVID-19 transmission from potentially asymptomatic or pre-symptomatic people through the use of face masks. ECDC: Stockholm; 2020.
- Belkin NL. A century after their introduction, are surgical masks necessary? AORN J. 1996; 64:602-7.
- Long Y, Hu T, Liu L. Effectiveness of N95 respirators versus surgical masks against influenza: a systematic review and meta-analysis. J Evid Based Med. 2020; 13:93-101.
- van Doremalen N, Bushmaker T, Morris DH. Aerosol and surface stability of SARS-CoV-2 as compared with SARS-CoV-1. N Engl J Med. 2020; 382:1564-7.
- Ong SWX, Tan YK, Chia PY. Air, surface environmental, and personal protective equipment contamination by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) from a symptomatic patient. JAMA. 2020; 323:1610-2.
- Gorris C, Ricci Maccarini A, Vanoni F. Acoustic analysis of normal voice patterns in Italian adults by using Praat. J Voice. 2019;S0892-1997(19)30034-7.
- Braun A. Studies in Forensic Phonetics: BEIPHOL 64. 1995.
- Mennen I, Schaeffler F, Docherty G. Cross-language differences in fundamental frequency range: a comparison of English and German. J Acoust Soc Am. 2012; 131:2249-60.
- Petrović-Lazić M, Babac S, Vuković M. Acoustic voice analysis of patients with vocal fold polyp. J Voice. 2011; 25:94-7.
- Godino-Llorente JI, Osma-Ruiz V, Sáenz-Lechón N. The effectiveness of the glottal to noise excitation ratio for the screening of voice disorders. J Voice. 2010; 24:47-56.
- Di Nicola V, Fiorella ML, Spinelli DA. Acoustic analysis of voice in patients treated by reconstructive subtotal laryngectomy. Evaluation and critical review. Acta Otorhinolaryngol Ital. 2006; 26:59-68.
- Cui C, Yao Q, Zhang D. Approaching otolaryngology patients during the COVID-19 pandemic. Otolaryngol Head Neck Surg. 2020; 163:121-31.
- Cheng X, Liu J, Li N. Otolaryngology providers must be alert for patients with mild and asymptomatic COVID-19. Otolaryngol Head Neck Surg. 2020; 162:809-10.
- Zou L, Ruan F, Huang M. SARS-CoV-2 viral load in upper respiratory specimens of infected patients. N Engl J Med. 2020; 382:1177-9.
- Krajewska J, Krajewski W, Zub K. COVID-19 in otolaryngologist practice: a review of current knowledge. Version 2. Eur Arch Otorhinolaryngol. 2020; 277:1885-97.
- Tysome JR, Bhutta MF. COVID-19: protecting our ENT workforce. Clin Otolaryngol. 2020; 45:311-2.
- Lescanne E, van der Mee-Marquet N, Juvanon JM. Best practice recommendations: ENT consultations during the COVID-19 pandemic. Eur Ann Otorhinolaryngol Head Neck Dis. 2020;S1879-7296(20)30126-5.
- Brewster DJ, Chrimes N, Do TB. Consensus statement: safe airway society principles of airway management and tracheal intubation specific to the COVID-19 adult patient group. Med J Aust. 2020; 212:472-81.
- Liang T, Yu L. Handbook of COVID-19 prevention and treatment. Zhejiang University School of Medicine: Zhejiang; 2020.
- Di Nicola V, Fiorella ML, Luperto P. La valutazione obiettiva della disfonia. Possibilità e limiti [Objective evaluation of dysphonia. Possibilities and limitations]. Acta Otorhinolaryngol Ital. 2001; 21:10-21.
- Boersma P, Weenink D. Praat: doing phonetics by computer [Computer program]. Version 6.0.37. 2018.
- Caffier PP, Möller A, Forbes E. The vocal extent measure: development of a novel parameter in voice diagnostics and initial clinical experience. Biomed Res Int. 2018; 2018:3836714.
- Lovato A, De Colle W, Giacomelli L. Multi-dimensional voice program (MDVP) vs Praat for assessing euphonic subjects: a preliminary study on the gender-discriminating power of acoustic analysis software. J Voice. 2016; 30:765.e1-765.e5.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
© Società Italiana di Otorinolaringoiatria e chirurgia cervico facciale, 2020