Improvement of speech perception in noise and quiet using a customised Frequency-Allocation Programming (FAP) method
The objective of this study was to compare speech recognition in experienced CI recipients using standard MAP settings with the default frequency-allocation tables against an optimised frequency-allocation MAP. This was an observational, cross-sectional, quantitative study of 50 adult cochlear implant recipients (20 bilateral and 30 unilateral), comprising 23 men (46%) and 27 women (54%). All subjects were ≥ 18 years old. Differences between the mean free-field tone audiometry thresholds were statistically significant in both unilaterally and bilaterally implanted patients; similar results were obtained for the mean disyllabic word scores in both groups (p < 0.005). The differences between mean open-set sentence scores were statistically significant for unilaterally and bilaterally implanted patients, as were the differences in mean HINT SRT50% between standard programming and frequency allocation fitting (p < 0.005). Patients using the frequency allocation method, which assigns frequencies based on fundamental frequencies, showed better perception of disyllabic words and open-set sentences in quiet and in noise than they had with standard fitting techniques. The method can be applied to different processors and used with different stimulation strategies. It allows a reduction of current intensity levels as well as an increase in the dynamic range, and improves the quality of the representation of the signal.
Cochlear implants were originally designed to allow speech perception in the absence of noise. While success has been achieved in restoring speech-in-quiet understanding, their performance with respect to music and speech in background noise is less than ideal 1-3. Processor fitting is based on subjective responses to stimuli presented in different channels of the electrode array. This subjectivity becomes a problem when dealing with patients who are not cooperative (e.g. toddlers), or who have difficulties with language development and communication skills 4.
In these cases, information obtained from objective measurements is useful for assessing electrode functionality and for estimating hearing thresholds (T-levels) and comfort levels (C-levels). Examples include the stapedius reflex evoked by electrical stimulation, neural response telemetry and electrically evoked potentials. However, the subjective information provided by the patient remains an important aspect to keep in mind 5-8.
Proper adjustment of the speech processor of the cochlear implant is essential to provide good quality of sound perception and speech intelligibility. For fitting a multichannel cochlear implant, the channels must be checked in order to verify the functionality of stimulation that is sent to each of the electrodes, estimate the perception threshold (T-level) of the electrical impulses and estimate the maximum level of comfort (C-level) or maximum level of stimulation that the patient accepts without feeling discomfort. The T and C-levels yield the dynamic ranges for each electrode 9.
Frequency-allocation fitting improves musical and melodic auditory recognition in cochlear implant patients. It can be applied to different processors and different stimulation strategies. The outcome of applying the frequency allocation method allows reduction of current intensity levels and an increase in the dynamic ranges, which decreases band-overlapping when mapping and improves audio quality of the representation of the signal 8.
Music and human voices, far from being made up of pure tones, are made up of fundamental notes and a series of harmonics added to the fundamental frequency, which distinguishes one instrument from another, for example. The harmonics are integer multiples of the fundamental note. As one gets further away from the fundamental, the volume decreases and becomes practically inaudible as the sixth or seventh harmonic is approached 9 10.
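The relationship between a fundamental and its harmonics can be illustrated with a minimal sketch. The function name `harmonic_series` and the choice of 440 Hz are illustrative assumptions and are not taken from the study; the default of seven components only mirrors the point at which the text says harmonics become practically inaudible.

```python
def harmonic_series(f0_hz, n_components=7):
    """Return the fundamental (k = 1) followed by its integer-multiple harmonics.

    Hypothetical helper for illustration only.
    """
    return [k * f0_hz for k in range(1, n_components + 1)]


# Example: a 440 Hz fundamental and its first few integer multiples.
for k, freq in enumerate(harmonic_series(440.0), start=1):
    print(f"component {k}: {freq:.0f} Hz")
```

Under this convention the first list entry is the fundamental itself, so the component numbered k lies at k times the fundamental frequency.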
Sound-processing strategies for cochlear implants represent a set of rules that defines how the sound processor analyses acoustic signals and codes them for delivery to the cochlear implant. The most commonly used coding strategy, ACE (Advanced Combination Encoder), takes advantage of the 22 closely spaced intracochlear electrodes and the place-pitch selectivity of the cochlea to deliver spectral resolution. The places of stimulation in the cochlea depend greatly upon ongoing spectral analysis, as each of the 22 electrodes is associated with a frequency band. A primary function of a MAP (program) is to translate the spectral information in the incoming acoustic signal into instructions for channel stimulation. The frequency-to-channel allocation is assigned by the programming software depending on the coding strategy, the type of speech processor and the number of channels available for stimulation. While in routine clinical practice frequency tables are assigned by default, the original impetus for our work was to understand whether, by customising the frequency-to-channel allocation, it would be possible to improve the differentiation of complex sounds and, as a result, improve speech understanding and music recognition.
A spectrum analyser computes the Discrete Fourier Transform (DFT) of the input signal, a mathematical process that transforms a waveform into the components of its frequency spectrum. It is therefore a useful tool to predict how the speech processor will extract the spectral components of a sound. Prior work was carried out with the spectrum analyser (Spectra LAB FFT Special Analysis System version 4.32.11 by Sound Tecnologic Inc) in order to create WAV files containing complex sounds to be used to customise the frequency-to-channel allocation (Frequency Table). Chords were played and recorded between the 3rd and 6th octave, containing harmonic and non-harmonic sounds, altered and unaltered scales, and ascending and descending tone scales. Piano, guitar, trumpet, trombone, soprano saxophone and violin were selected for their specific fundamental frequencies and played at the same intensity and tempo.
In our previous work, aiming to improve music recognition, we successfully used these WAV files to optimise a MAP frequency-allocation table for a CI user 8. Based on these earlier findings, in this comparative performance study we compared the speech recognition of experienced CI recipients using the standard MAP settings with the default frequency-allocation tables against their speech recognition with the optimised frequency-allocation MAP.
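The kind of spectral analysis described above can be sketched in a few lines. This is a minimal illustration using a naive DFT from the Python standard library, not the spectrum analyser used in the study; the synthetic two-component tone, the 8,000 Hz sample rate and the function names are all assumptions made for the example.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive O(N^2) DFT; returns magnitudes for bins 0..N/2, scaled by 1/N."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc) / n)
    return mags


def dominant_frequency(samples, sample_rate_hz):
    """Frequency (Hz) of the strongest non-DC bin."""
    mags = dft_magnitudes(samples)
    k_peak = max(range(1, len(mags)), key=lambda k: mags[k])
    return k_peak * sample_rate_hz / len(samples)


# Synthetic tone (assumption): 440 Hz fundamental plus a weaker component at 880 Hz.
SAMPLE_RATE_HZ = 8000
N_SAMPLES = 400  # bin spacing = 8000 / 400 = 20 Hz, so both components fall on exact bins
tone = [
    math.sin(2 * math.pi * 440 * t / SAMPLE_RATE_HZ)
    + 0.5 * math.sin(2 * math.pi * 880 * t / SAMPLE_RATE_HZ)
    for t in range(N_SAMPLES)
]

print(dominant_frequency(tone, SAMPLE_RATE_HZ))
```

In practice a fast Fourier transform and a windowing function would be used on recorded audio; the naive form is kept here only because it makes the definition of the transform visible.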
Materials and methods
An observational, cross-sectional and quantitative-approach study was performed in 50 consecutively implanted, experienced adult cochlear implant recipients being treated routinely in our implant clinic. With respect to the cross-sectional nature of this study, it must be specified that the post-treatment auditory tests were performed after a three-month period to allow for each patient's progress and response to the intervention. All clinical investigations were conducted according to the principles expressed in the Declaration of Helsinki. The study was approved by the Ethics Committee of our Hospital.
Subjects comprised 20 bilateral implant recipients: 11 men (55%) and 9 women (45%), mean age 45.6 years (SD = 11.54), ranging from 33 to 69 years. Their mean pre-implantation auditory thresholds were 86.11 dB (SD = 10.08) for the right ear and 88.33 dB (SD = 8.40) for the left ear, and their mean logoaudiometry score was 38.89% (SD = 13.32). The remaining 30 subjects were unilateral implant recipients: 16 men (53.33%) and 14 women (46.67%), mean age 35.82 years (SD = 7.68), ranging from 19 to 48 years. Their mean pre-implantation auditory threshold was 86.59 dB (SD = 8.78) in the implanted ear, and their mean logoaudiometry score was 34.55% (SD = 14.79). All the patients had a Nucleus® 24 Contour Advanced implant with full electrode insertion.
Surgeries were performed in the same implant centre by the same surgical team.
All patients were short-term deafened, i.e. under 5 years, preimplantation. At the time of their enrollment in the study, all patients were experienced users with a minimum of 1-year post-implantation CI use in, at least, one ear, and had a stable map with a minimum of 18 active channels (i.e. to allow modification of the full set of frequency-allocation bands). All the patients used the Freedom or CP810 sound processor programmed with the ACE signal processing strategy with a stimulation rate between 900 Hz and 1,200 Hz.
Patients with incomplete insertions and comorbidities preventing them from cooperating with the modified programming and evaluations were excluded from the study.
Cochlear Custom Sound Suite, version 5.0 was used as the software platform to MAP all sound processors using standard clinical procedures at first and subsequent fittings, deactivating extra-cochlear electrodes. For the comparison, the modified mapping technique involving modified frequency-allocation bands was created using the same software. Performance with both the standard and modified MAP was compared on speech recognition test measures acutely and after a short take-home trial with the modified MAP.
Principles of the customised Frequency MAP
The Frequency Allocation Table (FAT) defines the frequency range (bandwidth) that is assigned to each active channel in the MAP. Each channel covers a specific frequency range, and a given channel receives stimulation when its band contains sufficient energy to be selected as a maximum. Increasing the bandwidth of a channel may move a given frequency from its original electrode to the adjacent, lower-frequency one. For example, widening the channel 11 frequency band from 1,688-1,938 Hz to 1,688-2,019 Hz results in the frequency 2,000 Hz being represented by electrode 11 instead of electrode 10. Changing the electrodes used to transmit the electrical stimulation inside the cochlea changes the pitch perception of the user; therefore, optimising the frequency table has the potential to improve discrimination of complex sounds. Enlarging the bandwidth also increases the likelihood that the electrode will be stimulated.
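The channel 11 example above can be expressed as a small lookup. This is a sketch only: the table holds just two channels, the 2,188 Hz upper edge of channel 10 is a hypothetical value chosen for illustration, and `channel_for_frequency` is not a function of any fitting software.

```python
def channel_for_frequency(freq_hz, fat):
    """Return the channel whose band [low, high) contains freq_hz, or None."""
    for channel, (low_hz, high_hz) in fat.items():
        if low_hz <= freq_hz < high_hz:
            return channel
    return None


# Two-channel excerpt of a frequency allocation table (Hz).
# The 2188 Hz upper edge of channel 10 is an assumed value.
standard_fat = {11: (1688, 1938), 10: (1938, 2188)}
modified_fat = {11: (1688, 2019), 10: (2019, 2188)}

print(channel_for_frequency(2000, standard_fat))
print(channel_for_frequency(2000, modified_fat))
```

Widening channel 11 necessarily moves the lower edge of channel 10 upward, which is why the 2,000 Hz component changes electrode.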
As per the standard ACE strategy, to optimise speech recognition, the aim is to present the fundamental frequencies selecting the maximal energy bands from an acoustic sound signal. The principle of the modified frequency allocation adjustments is focused on individually tailoring and improving resolution for the identification of fundamental frequencies.
The specific steps for the modified Frequency MAP were as follows for each subject enrolled:
Standard MAP. As per the routine standard MAP procedure, only activated intracochlear electrode channels are used, with T and C-levels set according to the patient's subjective responses and then balanced. (Note that electrodes in areas of sparse neuronal population, as detected by NRT and impedance measures, are routinely deactivated.)
Modified MAP and global adjustment of T and C levels. Prior to adjusting the frequency bands, the electrode channels from the standard MAP are balanced and a global reduction of 25% for both T and C is implemented. With the full MAP created, and switching into live-voice mode with the clinician's voice as stimulus, the C-levels are modified globally until a dynamic range of at least 47 current levels is reached. As the patient becomes accustomed to the sound quality and intensity during the fitting session, the dynamic range stabilises between 49 and 51 current levels, enabling increased intensity resolution.
Modification of bands and frequency. Using the WAV sound files created in our prior work, which represent the fundamental frequency and first harmonic of a range of preselected musical instruments, the boundaries of the frequency-allocation channels are reallocated via the programming software for stimulation of the corresponding cochlear implant channels. These values are based on the patient's ability to audibly detect the difference between each set of instrument WAV files presented within each stimuli series (Table I).
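The global adjustment step can be sketched as follows. Several points are assumptions rather than details given in the text: the 25% reduction is interpreted as scaling each current level by 0.75, C-levels are raised in steps of one current level, 255 is taken as the upper limit of the current-level scale, and the example T and C values are invented. In the clinic the increase is driven by the patient's response to live voice, which this sketch does not model.

```python
def global_adjust(t_levels, c_levels, reduction=0.25, min_dynamic_range=47, max_level=255):
    """Scale T and C globally, then raise C globally until every channel's
    dynamic range (C - T) reaches min_dynamic_range current levels."""
    t_new = [round(t * (1 - reduction)) for t in t_levels]
    c_new = [round(c * (1 - reduction)) for c in c_levels]
    # Raise all C-levels together; stop at the assumed ceiling to avoid looping forever.
    while (min(c - t for t, c in zip(t_new, c_new)) < min_dynamic_range
           and max(c_new) < max_level):
        c_new = [c + 1 for c in c_new]
    return t_new, c_new


# Hypothetical three-channel MAP (current levels), for illustration only.
t_std = [100, 104, 108]
c_std = [152, 156, 160]
t_mod, c_mod = global_adjust(t_std, c_std)
print(t_mod, c_mod)
print([c - t for t, c in zip(t_mod, c_mod)])
```

With these invented values the scaled MAP starts with a dynamic range of 39 current levels on every channel, and eight global increments of C bring it to 47.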
The stimuli versus response
The electrode test sequence
Figure 1 shows the electrodograms obtained with the Nucleus Implant Communicator (NIC), illustrating the difference between Standard Fitting and Frequency Fitting. For this purpose, a WAV file of the 4th-octave F chord (between 400 Hz and 1,000 Hz) was used for the coding session, and only the frequency bands were modified; the dynamic range, the T-levels (threshold) and the C-levels (comfort) were not altered.
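The power of the signal in a given frequency band [f1, f2] is:
$$P_{[f_1, f_2]} = \int_{f_1}^{f_2} S(f)\,\mathrm{d}f$$
where S(f) denotes the power spectral density of the input signal.
So, if we expand the frequency band such that f'1 < f1 and f'2 > f2, the power of the signal in the new frequency band will, most likely, increase.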
This may help to describe why after the frequency mapping some channels seem to show increased current levels.
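The reasoning behind this statement can be written out explicitly. The notation is an assumption introduced here: S(f) ≥ 0 denotes the power spectral density of the input signal, [f1, f2] is the original band and [f'1, f'2] the widened band, with f'1 ≤ f1 and f'2 ≥ f2.

```latex
\begin{align}
P_{[f_1, f_2]} &= \int_{f_1}^{f_2} S(f)\,\mathrm{d}f \\
P_{[f'_1, f'_2]} &= \int_{f'_1}^{f_1} S(f)\,\mathrm{d}f
                  + P_{[f_1, f_2]}
                  + \int_{f_2}^{f'_2} S(f)\,\mathrm{d}f
                  \;\ge\; P_{[f_1, f_2]}
\end{align}
```

Because S(f) is non-negative, the two added integrals cannot reduce the band power. Equality holds only when the signal carries no energy in the added sub-bands, which is why the increase is likely rather than guaranteed.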
All subjects were assessed with recorded Spanish speech materials in the sound field, seated at a distance of 1 metre from the speaker, with both their standard MAP and their individually tailored modified FAP MAP. Subjects were assessed at visit 0 with their optimised standard MAP. Following fitting of the FAP MAP, with fine tuning as needed, and after a 3-month take-home trial, all subjects were reassessed using the same test materials and conditions.
Aided Tone audiometry in free field. Aided hearing thresholds for warble tones in the free field at 250, 500, 1,000, 2,000, 4,000 and 6,000 Hz were measured for each ear and bilaterally, where applicable, with the patient seated 1 metre away from the speaker. Results are shown for the mean thresholds at 500, 1,000 and 2,000 Hz. For bilateral CI users, ears were assessed simultaneously with two speakers located at ±45°.
Hearing in Noise Test (HINT). The HINT sentence test measures a person's ability to hear speech in quiet and in noise. The HINT test was used in its validated Spanish version. The sentence test consists of lists of 20 phonetically balanced sentences with a total of 100 words, presented with an adaptive signal-to-noise ratio (SNR) or in quiet. The HINT test battery consists of four test conditions. For each test, the speech stimuli are presented from a speaker located directly in front of the subject at 0° azimuth, one metre from the subject's head. For each of the four test conditions, the subject is required to listen to a sentence and repeat it. The four test conditions are: (1) sentences with no competing noise, (2) sentences with competing noise presented directly in front of the patient (S0N0), (3) noise presented at 90° to the right of the patient, and (4) noise presented at 90° to the left of the patient. In all noise conditions, the competing noise is presented at a fixed level of 65 dB SPL. The level of the sentences is varied throughout the test, depending on whether or not the patient repeats each sentence correctly. The software adapts the presentation level automatically, and hence the SNR in the noise conditions, until 50% correct speech recognition is reached. The resulting score is the SNR at which 50% correct recognition was achieved and is called the sentence reception threshold at 50% (SRT50%). For automation of the test and scoring, a test version of a software application for Windows was used: HINT for Windows. Only the S0N0 configuration was applied 11.
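The adaptive principle described above can be illustrated with a simplified one-up, one-down staircase. This is a sketch under stated assumptions and not the algorithm implemented in HINT for Windows: the fixed 2 dB step, the 10 dB starting SNR, the averaging over the second half of the track and the deterministic listener model are all hypothetical choices.

```python
def adaptive_srt(responds_correctly, start_snr_db=10.0, n_sentences=20, step_db=2.0):
    """Simplified 1-up/1-down staircase.

    The SNR is lowered after a correct repetition and raised after an error,
    so the track converges on the SNR giving about 50% correct.
    Returns the mean SNR over the second half of the track.
    """
    snr = start_snr_db
    track = []
    for _ in range(n_sentences):
        track.append(snr)
        if responds_correctly(snr):
            snr -= step_db  # make the next sentence harder
        else:
            snr += step_db  # make the next sentence easier
    tail = track[n_sentences // 2:]
    return sum(tail) / len(tail)


# Hypothetical listener who repeats a sentence correctly whenever SNR >= 3 dB.
def listener(snr_db):
    return snr_db >= 3.0


estimated_srt = adaptive_srt(listener)
print(f"Estimated SRT50%: {estimated_srt:.1f} dB SNR")
```

With this idealised listener the track descends from 10 dB and then alternates between 2 dB and 4 dB, so the averaged estimate settles on the listener's 3 dB threshold.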
Disyllabic Word Test. The speech recognition test in quiet was conducted in a calibrated sound field, with patients seated 1 metre away from the speaker at an azimuth angle of 0°, using CD-recorded, calibrated speech stimuli. The test was performed according to the “Protocol for the assessment of hearing and speech in Spanish Language in a program for cochlear implants”, with two lists of 25 words presented for each condition at 65 dB SPL and the percent-correct word score recorded for each 12.
Open-set Spanish Sentences Test. Recorded sentence materials were presented in quiet at a fixed level of 65 dB SPL in the free field, with the subject seated 1 metre away from the speaker at 0° azimuth. The materials consist of a Spanish adaptation of the “Everyday sentences test” (CID Sentence test) 12 and comprise 100 sentences making up 10 lists. Percent-correct word scores were recorded for the daily listening condition of each patient (i.e. unilateral or bilateral CI use).
For statistical data processing, SPSS (version 21.0) was used. Hypothesis testing was considered statistically significant when the corresponding p-value was less than 0.05. The statistical comparison for independent samples was performed by using the Student t-test.
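For readers unfamiliar with the test, the statistic behind an independent-samples Student t-test can be sketched with the Python standard library. This is not the SPSS procedure used in the study; the function name is hypothetical and the two small samples are invented purely to show the arithmetic. Converting t to a p-value requires the t distribution, which the standard library does not provide, so that step is omitted.

```python
import math
import statistics


def student_t_independent(sample_a, sample_b):
    """Pooled-variance (equal-variance) two-sample t statistic and degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    t_stat = (mean_a - mean_b) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t_stat, df


# Invented values, for illustration only (not study data).
t_stat, df = student_t_independent([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(f"t = {t_stat:.4f}, df = {df}")
```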
Tone audiometry in free fields
The mean warble-tone thresholds of patients with standard programming were 30.4 dB (unilateral) and 34.2 dB (bilateral), and with frequency programming were 21.9 dB (unilateral) and 24.7 dB (bilateral). The differences between the mean thresholds were statistically significant in both unilaterally and bilaterally implanted patients (p < 0.005). The box plots in Figure 2 show the distribution of mean warble-tone thresholds for unilateral and bilateral patients with the standard and FAP MAPs.
The average score achieved for the disyllabic test was 73% (unilateral) and 83.2% (bilateral) in patients with standard fitting, and 85.25% (unilateral) and 92.5% (bilateral) in patients with frequency allocation fitting. The differences between average disyllabic test scores were statistically significant for both unilaterally and bilaterally implanted patients (p < 0.005).
Open-set sentence test
The average open-set sentence score was 75.7% (unilateral) and 85% (bilateral) in patients with standard fitting, and 87.5% (unilateral) and 96% (bilateral) in patients with frequency allocation fitting. The differences between the mean scores were significant for both unilaterally and bilaterally implanted patients (p < 0.005). The box plots in Figure 3 show the distribution of the percent-correct scores for the disyllabic and sentence tests for the two groups, unilaterally and bilaterally implanted patients.
In the box plot in Figure 4, SRT50% distribution for the conditions Standard Programming vs FAP is shown. The mean SRT50% values for patients with conventional programming was 19.3 and 9.8 for unilateral and bilateral patients with frequency allocation fitting, respectively. Differences between mean values of standard SRT50% programming and frequency allocation fitting were significant (p < 0.005).
The methodology presented herein is supported by the fundamental physiological principles of the place theory proposed by Hermann von Helmholtz in the 19th century, which was later verified and modified by Georg von Békésy. Our method is also based on frequency principles; moreover, the frequency selectivity of the auditory nerve is taken into account 13-16. Following these principles, our method relies on the fundamental frequency. The allocation is independent of the electrode within the cochlea and of the neural response of the area.
To allocate the remaining frequencies, a study of postlingually deafened adults with auditory memory was used. Music files were used, which introduces a subjective element into the methodology, since all the patients who participated had memory of musical melody. Once the dynamic range is established in the different channels for each patient, it remains stable along the electrode array because of the physiology of the inner ear. It was assumed that hearing deprivation in this case is insufficient to produce alterations or degeneration in the neural hearing pathways 17.
The clinical impact of cochlear implants has been extremely successful. The search for ways to optimise the benefit of cochlear implants has been ongoing to improve not only speech understanding in quiet, but also in noise and for music perception; both are difficult situations for most CI users 1 8 18.
Cochlear implant processors should be properly fitted with at least some degree of customisation. The goal is to establish a set of parameters that define the electrical impulses generated by the device in response to sound that yield optimum speech intelligibility. Quality programming of the cochlear implant system is crucial 19 20.
It is important to remember that programming the speech processor is not the sole determinant in good performance of a patient. The age of implantation, family support, duration of deafness, the communicative context in which the patient lives, cognitive ability and use of the device are among the variables that can affect performance 3 20 21.
CI users who perform well on tests of speech and sentences in quiet often report difficulty understanding in everyday noisy environments 22-24.
Baudhuin and colleagues studied implanted children to evaluate the parameters that influence better speech perception. Evaluating the findings of their study, they confirmed the need to create an individualised configuration for the fitting of each child, as well as the importance of speech recognition testing as follow-up checks, both in quiet and in noise 25. In the Zhou and Pfingst studies, it can be concluded that the site-specific adjustments of the T-level settings improved modulation sensitivity at low levels and speech-perception thresholds 26.
In addition, as in Baudhuin’s study 25, our current study also included T-level estimates.
Performance optimisation of patients in their daily lives is the goal of using frequency allocation fitting. By harnessing the full potential of the cochlea, stimulating throughout the spiral ganglion in each sampling window, it is possible to provide further spectral and temporal interaction. It was previously observed that the use of the frequency allocation method allowed better musical and melodic perception and recognition compared to standard fitting 8.
The results of this investigation indicate that subjects showed significant improvements when using the frequency allocation method. The test data clearly demonstrate that patients show better speech recognition in quiet, as well as better speech recognition in noise.
The study by Matthias Meeuws should be noted, namely “Computer-assisted CI fitting: Is the learning capacity of the intelligent agent FOX beneficial for speech understanding?” 27, in which the processor was programmed in a predictive mode based on the patients' responses to verbal and tonal stimuli. In our study, responses to music frequency bands are considered.
In implanted patients whose native language is a tonal language, the described method is of special importance due to specific characteristics of those languages. It is also important to specify that all patients (10/10) chose to continue using the map optimised by using frequency allocation. As this is a new method developed at our centre, it has not been possible to compare the results with other studies on the same type of changes in the parameters.
Patients using the frequency allocation programming method, which assigns frequencies based on fundamental frequencies, showed better perception of disyllabic words and open-set sentences in quiet and in noise than patients previously fitted with the standard fitting techniques. The method can be applied to different processors and used with different strategies of stimulation. It allows reduction of current intensity levels as well as an increase in the dynamic range, which enables a less disturbing mapping of each audio band and improves the quality of the signal representation.
Figures and Tables
| Component | Frequency range | Stimulus | Channels |
|---|---|---|---|
| Fundamental note | 125 Hz - 250 Hz | 3rd octave chord, piano (C-F) | Channel 22 205 Hz 210 Hz |
| | 250 Hz - 500 Hz | 3rd octave chord, guitar-piano (C-F) | Channels 21-20-19-18 |
| | 400 Hz - 1 kHz | 4th octave chord, trumpet-trombone-soprano sax (C) | Channels 21-20-19-18-17-16 |
| | 800 Hz - 2 kHz | 4th octave chord, piano-string (C) | Channels 17-16-15-14-13-12 |
| Harmonics | 2 kHz - 4 kHz | 5th octave chord, piano (C-F) | Channels 11-10-9-8-7-6 |
| | 4 kHz - 8 kHz | Vowels and consonants; use of familiar voices (specific to each individual) | Channels 6-5-4-3-2-1 |