Speech perception in noise by young sequential bilingual children
The objective of this study was to ascertain the effects of competitive noise on second language perception skills of sequentially bilingual children
and to compare the results with those relating to matched monolingual peers. Fifteen bilingual immigrant children (aged 6-10 years) (BL) learning
through their second language (L2), which was Italian, were matched with 15 peers who only spoke Italian (IO). All immigrant children had arrived
in Italy and were exposed to L2 after their 4th year of life. The speech-to-noise ratio (SNR) needed to obtain 50% intelligibility – the speech reception
threshold (SRT) – for Italian words was measured against the Italian version of ICRA noise, using an adaptive method. Moreover, presentation
of phrases against a contralateral continuous discourse (informational masking) was carried out to exclude possible biases due to differences in
memory, attention, or other central auditory processing disorders between groups. The SNR was -2.7 dB (SD 1.7; range: -5.5 to +0.9) for the BL
group and -5.3 dB (SD 2.3; range: -8.8 to -0.9) for the IO group (p < 0.01). With contralateral continuous discourse presentation, the SNRs were
-32.8 dB (SD 2.4; range: -36.1 to -28.2) for the BL group and -27.8 dB (SD 2.1; range: -31.7 to -24.1) for the IO group (p < 0.01). Even sequential
bilingual individuals exposed to L2 at 4 years old had worse speech perception in noise than their matched IO peers. On the other hand, the BL group
demonstrated superior divided attention skills in tests with competitive contralateral discourse (p < 0.01).
Introduction
Understanding speech through noise is a skill that develops well into an individual’s adolescent years and becomes adult-like at around the age of 15 1-4. Younger children’s developmental listening disadvantage is of particular concern at school because early educational skills may be taught in noisy settings. Some children also appear to be at a double disadvantage when listening under adverse conditions (noise, reverberation, background babble) 5-7: this subgroup may account for 5-10% of the scholastic population in many European countries 8, and corresponds to sequential bilingual children learning in their second language (L2). In fact, within a short space of time, these children generally acquire the same lexical and morpho-syntactic skills as their monolingual peers, but do not reach the same level of phonological skills in their L2 5 6 especially when they are exposed to the L2 beyond the critical period for complete phonological code acquisition. Several studies have demonstrated that phonological competences are important in achieving good intelligibility in adverse listening conditions 6 7 9.
To date, most of the research on bilingualism has focused on simultaneous bilinguals who are exposed to both languages from birth. Immigrant children are generally sequential bilingual, however, either because they arrive in the host country during their childhood, having already learned their mother language (L1), or because they are exposed primarily to the family language, and it is not until they attend nursery school at around 3 years of age that they gradually become immersed in their L2 10. With increasing immigration, it is becoming common for children to grow up sequential bilingual in many developed countries, including Italy. In some northern Italian regions, 10% of schoolchildren are of foreign nationality and, among them, the incidence of those who start their education in Italian schools has increased (from 3.7% in 2013 to 4.9% in 2014) 11.
It is well established that by 6 months infants recognise native-language phonetic categories. Moreover, early and simultaneously bilingual infants are able to discriminate the sounds of both their languages by 12 months of age 12. On the other hand, by the end of the first year of life, monolinguals’ perception of speech has been dramatically altered by exposure to their single native language. In fact, at the phonetic level, exposure to a specific language reduces infants’ abilities to discriminate foreign-language speech sounds and this ability declines sharply between 6 and 12 months of age 13 14.
In other words, exposure to a specific language results in “neural commitment” to the acoustic properties of that language. Neural commitment to the native language interferes with foreign-language processing, causing difficulty in foreign-language speech perception in infancy and adulthood. Thus, with respect to early and simultaneously bilingual infants, sequential bilingual children are faced with the task of learning their host country’s language when they already have a still-developing phonology that reflects their L1. Most authors agree that the influence of L1 categories is strong enough to interfere with native-like processing of L2 categories, even in infants exposed to L2 very early. In other words, early and intensive exposure to a second language may not necessarily be sufficient to build native-like phonemic categories, or to perform as well as native speakers in discriminating between the two languages 10 15.
Unfortunately, the findings of current studies on perceptual skills in sequential bilingual individuals are still controversial, mostly because they differ in terms of how much children have been exposed to L2, and at what age they started to learn it 16 17.
Moreover, several factors may explain replication difficulties, including the specific language, sociodemographic variables, the location of research in conjunction with language status, and the fact that experimental tasks are not always sensitive and controlled well enough (particularly from an audiological standpoint) to detect subtle differences in speech-in-noise perception 18.
In contrast, bilinguals may have cognitive advantages over monolingual speakers in verbal tasks when it comes to solving conflicting information and inhibiting irrelevant information 2 19 20. For both these aspects, the underlying mechanisms and their interactions have yet to be fully understood, and the factors influencing bilingual immigrant children’s speech comprehension in noisy settings need to be investigated more systematically.
The aims of the present study were to ascertain the effects of noise on speech perception skills (due to a reduced audibility of several acoustic cues) in 15 typically-developing sequential bilingual (BL) children (aged 6-10) learning their L2, and to compare the results with those of 15 matched monolingual peers speaking only Italian (IO). Our first hypothesis was that BL children might have more difficulties in listening under adverse conditions, compared to their matched peers, due to their lower phonological competences. For this purpose, we presented lists of words in competition with the Italian version of the ICRA (International Collegium of Rehabilitative Audiology) noise (Test 1).
To exclude possible biases due to differences in memory, attention, or other central auditory processing disorders, we carried out a second test using sentences presented in competition with a contralateral continuous discourse.
This second task was more demanding with regard to memory (children had to repeat phrases, not single words), divided attention skills (they had to resolve conflicting information) and other central auditory functions (they had to use morpho-syntactic competences, i.e. top-down control). On the other hand, in the second test both the target and the masker were clearly audible.
Materials and methods
Fifteen sequentially bilingual immigrant children (BL group) and 15 native Italian-only speakers (IO group) with no self-reported hearing impairments were enrolled in a prospective study from three different primary schools in Padua (Italy). These schools had similar socio-economic conditions and no significant differences in mean scores of the INVALSI (Italian National Institute for the Evaluation of the School System) 21 tests.
Parents gave their informed consent to each child’s participation in the study.
The Institutional Review Board of the Azienda Universitaria-Ospedaliera di Padova, Italy, approved the study.
The BL group was represented by 8 females and 7 males, aged 6-10 years (mean = 8.66, SD = 1.71); the IO group included 7 females and 8 males, aged 6-12 years (mean = 8.60; SD = 1.72), matched for gender, age and school proficiency with the BL group. The details of all the children involved in the study are given in Table I.
All participants had normal otoscopic findings and a hearing threshold of 20 dB HL or better bilaterally for the frequencies 250, 500, 1000, 2000, 4000, 8000 Hz. Moreover, they responded correctly in over 90% of trials in quiet speech audiometry with words and phrases presented at 40 dB SL.
Parents reported that participants had no history of neurological, cognitive, or communication disorders. Their school teachers completed a simple form for each participant concerning biographical details, potential socio-economic disadvantages, and grades obtained in the previous 6 months in the following subjects: math, Italian language, history and geography. Grades were expressed on a scale from 1 to 10, where 10 was the highest, 1 the lowest, and 6 a pass. A school proficiency (SP) corresponding to an average grade of ≥ 7 and the absence of socio-economic disadvantages were established as inclusion criteria.
For the BL group, there were additional inclusion criteria: no exposure to the Italian language (L2) before the age of 4 years; having lived in Italy for at least 2 years, with regular, constant exposure to Italian at school and in the community, and to their first language (L1) at home.
To establish the children’s age at the time of their exposure to L2 and their need to use the language, all parents of the immigrant children completed a questionnaire reporting when they arrived in Italy, the language environment at home, and the percentage of output in L1 and L2, the language used for specific activities, and the language(s) used for interactions between family members 22.
Finally, a further inclusion criterion for BL participants was that the percentage of L1 vs L2 exposure ranged between 35% and 65%, so all children could be considered competent speakers in both languages 23.
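The inclusion criteria for the BL group listed above can be summarised as a simple screening rule. The following is only an illustrative sketch: the class and field names (`BilingualCandidate`, `l1_exposure_pct`, and so on) are hypothetical and do not come from the study, and the audiological criteria (otoscopy, hearing threshold, speech audiometry in quiet) are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class BilingualCandidate:
    # All field names are hypothetical; they mirror the criteria in the text.
    school_proficiency: float          # mean grade on the 1-10 scale
    socio_economic_disadvantage: bool  # as reported by the teacher
    age_first_l2_exposure: float       # years
    years_in_italy: float
    l1_exposure_pct: float             # share of exposure in L1 (L2 is the remainder)


def meets_bl_criteria(c: BilingualCandidate) -> bool:
    """Return True if the candidate satisfies the non-audiological BL criteria."""
    return (
        c.school_proficiency >= 7                # SP of at least 7
        and not c.socio_economic_disadvantage    # no socio-economic disadvantage
        and c.age_first_l2_exposure >= 4         # no exposure to Italian before age 4
        and c.years_in_italy >= 2                # at least 2 years in Italy
        and 35 <= c.l1_exposure_pct <= 65        # balanced L1 vs L2 exposure
    )
```

Because the exposure window is symmetric (35% to 65%), checking the L1 share alone also constrains the L2 share to the same range.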
For test # 1, the speech signal consisted of 20 lists of 10 Italian words familiar to children in competition with the ICRA noise generated by multiple superposition of all words (available at http://acustica.ing.unife.it/). These tests were validated with Italian normal-hearing subjects and hearing-impaired patients of different ages. ICRA noise is similar to “cocktail party noise”, but has a long-term spectrum and modulation characteristics like those of natural speech. Thus, it overlaps in time and frequency in such a way that portions of the primary speech signal are rendered inaudible. A monotic presentation was used, i.e. words and noise were delivered to a single ear, chosen at random for each subject.
For test # 2 the target speech signal consisted of 20 lists, each comprising 10 Italian phrases. The masking was represented by the Italian translation of a passage from a novel by Conrad with the silent pauses (between words and periods) omitted, but still perfectly and easily comprehensible.
This masking signal was presented contralaterally to the primary message. The mixture stimuli were constructed by having the interferers precede the target sentence (for about a second), and then following the target sentence for another second. Thus, this masking paradigm produced a listening situation where the target and masker signals were clearly audible but the listener had difficulties in segregating the elements of the target signal from the elements of the similar-sounding distracters. This masking is called “informational” and has different effects with respect to the energetic one used in test # 1.
All tests were conducted in a sound treated acoustic chamber. None of the participants had heard or read the test material before the experiments and, to avoid memory effects, each list was used only once with each child. The full assessment took 30 minutes to complete for each child and was divided into three 10-minute sessions, with two breaks.
Recorded stimuli were delivered with a portable compact disc player through a two-channel Madsen Astera 2 audiometer and a set of Sennheiser HDA 200 headphones. Before testing participants, a precision sound level meter (Bruel & Kjaer, type 2231) was used to calibrate each channel separately. Audiometer intensity (linearity) was also checked to ensure that noise levels were accurate and achievable on audiometer potentiometer manipulation. Words were presented at a constant level of 70 dB SPL. After a period of familiarisation with the test words, the speech-to-noise ratio (SNR) needed to obtain a 50% Italian word intelligibility, the speech reception threshold (SRT), was measured for the two tests using a one-down/one-up adaptive procedure in 2-dB steps 24.
Briefly, a word was presented, and the child responded by repeating it aloud to an experimenter. Children were encouraged to guess if they were unsure. The experimenter (who was blinded to the experimental hypothesis) compared the response with the target item. If the response was entirely correct, the noise level for the next item was increased by 2 dB; if the child made a mistake, the noise level was decreased by 2 dB. The SNR was calculated as the median of at least 6 track reversals, after two trial lists. We adopted this traditional adaptive assessment (i.e. measuring the SNR needed to reach the SRT), as generally recommended in studies with normal hearing subjects 25 26 and previously validated for Italian subjects 27.
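The adaptive rule described above can be sketched in a few lines of code. This is a minimal illustration, not the software used in the study: the starting SNR of 0 dB, the function names, and the convention of recording a reversal at the SNR of the trial on which the track changed direction are all assumptions made here. The 70 dB SPL speech level, the 2-dB step and the minimum of 6 reversals come from the text.

```python
from statistics import median

SPEECH_LEVEL_DB = 70.0  # from the text: speech presented at a constant 70 dB SPL
STEP_DB = 2.0           # from the text: 2-dB steps
MIN_REVERSALS = 6       # from the text: median of at least 6 reversals


def run_staircase(responses, start_snr_db=0.0):
    """One-down/one-up track over a sequence of responses.

    responses: iterable of booleans (True = item repeated correctly).
    start_snr_db: assumed starting SNR (not stated in the text).
    Returns (srt_estimate or None, list of SNRs at each reversal).
    """
    snr = start_snr_db
    last_direction = 0  # -1: SNR decreasing (noise up), +1: SNR increasing
    reversal_snrs = []
    for correct in responses:
        # Correct response -> noise raised by one step -> SNR lowered.
        direction = -1 if correct else +1
        if last_direction != 0 and direction != last_direction:
            reversal_snrs.append(snr)  # the track changed direction on this trial
        last_direction = direction
        snr += direction * STEP_DB
    if len(reversal_snrs) < MIN_REVERSALS:
        return None, reversal_snrs  # too few reversals for an estimate
    return median(reversal_snrs), reversal_snrs


def noise_level_db(snr_db):
    """Noise level implied by a given SNR when speech is fixed at 70 dB SPL."""
    return SPEECH_LEVEL_DB - snr_db
```

A one-down/one-up rule converges on the SNR at which correct and incorrect responses are equally likely, which is why it targets the 50% intelligibility point (the SRT).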
The following test variables were randomised: right vs. left ear presentation, word list sequences and time-ordered sequences of the two different speech tests.
The demands on executive function were low, with the primary demand being the need to keep arbitrary rules in mind to respond appropriately.
In test 2, the procedure was identical but with different stimuli and masking, as reported in the previous paragraph. The children had to correctly repeat each word of the phrase. The executive demand was higher because it included remembering the 6 to 8 words of the simple phrases.
STATISTICA 7.1 software (Stat Soft Italia srl, Milan, Italy) was used for basic statistical analysis, and a t-test was used to assess the differences between the BL and IO groups. The regression tendency curves were calculated on the SNRs as a function of age, or of each L2 background descriptor. All data were expressed as mean ± standard deviation (SD). Values of p < 0.05 were considered statistically significant and p ≤ 0.001 was judged highly significant.
Results
In the BL group, the children’s age at the time of their first exposure to L2 ranged from 4 to 7 years (mean 5.0 ± 1.1), and the number of years since they had been learning L2 ranged from 2 to 6 (mean 3.7 ± 1.2). Language output was 56% for their mother tongue (SD = 6.2; range = 45-65%) and 44% for L2 (SD = 15.3; range = 35-55%). As reported in the last paragraph of the Methods section, a range of L2 exposure between 35% and 65% was an inclusion criterion.
Average school proficiency was 8.3 ± 0.8 for the BL group, and 8.2 ± 0.8 for the IO group.
The SNR values needed to obtain a 50% intelligibility (the SRT) are given in Table II for each child. The mean SNR was higher in the BL group (worse performance) than in the IO group, with a significant difference (p < 0.01) (Fig. 1).
The mean age, SP, SNR with noise and with contralateral competitive continuous discourse for the two groups were compared with a t-test after checking for the adequacy of sample sizes, normal distribution of the data, and homogeneity of variances.
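As a rough plausibility check, the group comparison can be reproduced from the rounded summary statistics given in the abstract (means, SDs, and 15 children per group). The sketch below assumes an equal-variance (Student) two-sample t-test, which seems consistent with the homogeneity-of-variance check mentioned above but is not explicitly stated; the function name is hypothetical, and the critical value is taken from a standard t table.

```python
from math import sqrt


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Student's two-sample t statistic from summary statistics.

    Returns (t, degrees_of_freedom). Assumes equal variances.
    """
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df
    se = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se, df


# Two-tailed critical value for p = 0.01 with 28 degrees of freedom (t table).
T_CRIT_P01_DF28 = 2.763

# Rounded values from the abstract: BL vs IO.
t_noise, df = pooled_t(-2.7, 1.7, 15, -5.3, 2.3, 15)          # words in ICRA noise
t_discourse, _ = pooled_t(-32.8, 2.4, 15, -27.8, 2.1, 15)     # contralateral discourse

print(f"df = {df}, t (noise) = {t_noise:.2f}, t (discourse) = {t_discourse:.2f}")
```

With these inputs both statistics exceed the critical value in absolute terms, in line with the reported p < 0.01; the opposite signs reflect the fact that the BL group performed worse in noise but better with the contralateral discourse. Because the inputs are rounded, the result may differ slightly from the values computed on the raw data.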
There were no significant differences in age or school proficiency between the BL and IO groups. Moreover, SP did not correlate with age in either of the groups.
SNR correlated significantly with age in the IO group (perception in noise improved with age) (p < 0.05), but not in the BL group.
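To give a sense of what a significant correlation means with 15 children per group, the smallest correlation coefficient that reaches p < 0.05 can be derived from the t distribution. This sketch assumes a Pearson correlation tested two-tailed with n - 2 degrees of freedom; the text does not specify the exact correlation method, and the function names are hypothetical.

```python
from math import sqrt


def t_from_r(r, n):
    """t statistic for a Pearson correlation r observed on n pairs."""
    return r * sqrt((n - 2) / (1 - r**2))


def critical_r(t_crit, n):
    """Smallest |r| that reaches significance, given the critical t for n - 2 df."""
    df = n - 2
    return t_crit / sqrt(t_crit**2 + df)


# Two-tailed critical value for p = 0.05 with 13 degrees of freedom (t table).
T_CRIT_P05_DF13 = 2.160

r_min = critical_r(T_CRIT_P05_DF13, 15)
print(f"minimum |r| for p < 0.05 with n = 15: {r_min:.3f}")
```

Under these assumptions, only correlations with an absolute value above roughly 0.51 would be significant in a group of 15, so moderate associations could go undetected in a sample of this size.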
To understand which factors could influence SNR performance, the BL group was analysed in more detail, looking at: range of exposure to L1 vs L2, the effect of mother language (Slavic vs Romance), age at the time of exposure to L2 (4 years old vs 5 to 7 years old) and the number of years since starting to learn L2 (more or less than 3 years).
Within the range of exposure to L1 vs L2 reported above, there was no significant correlation with speech intelligibility (Table III). No differences emerged between Slavic and Romance L1 backgrounds (T²H = 9.697, F(7,7) = 0.746, p > 0.5), and the significant differences in the SNRs between these two subgroups and IO speakers were confirmed (Table IV).
Age at time of first exposure to L2 did not correlate with SNR, whereas years of exposure to L2 did significantly correlate (Table V).
Discussion
Our group of sequential bilingual children needed a significantly higher SNR than their mother-tongue peers when the primary message was masked with noise. These children might therefore have speech perception difficulties in adverse listening conditions, where many phonetic cues (particularly low-energy ones) are masked by noise and reverberation. In other words, many consonant contrasts become less audible, so listeners must rely on secondary cues 5 and that is why they should have an intrinsic redundancy of acoustic indexes. While learning in their second language, these children should acquire acoustic indexes as complete as possible in order to process the foreign language, but beyond a critical period their acquisition of these indexes will never reach the level of mother-language individuals.
A limitation of this study is the small number of participants. However, our data suggest that even sequential bilingual individuals exposed to L2 at 4 years old had worse speech perception in noise than their matched IO peers. Most probably, acquiring L2 phonological skills at the age of 4 is already too late for these children to catch up with their mother-language peers with regard to these competences. These data are in agreement with those observed in a group of 9-year-old Turkish-German bilingual children, who demonstrated difficulties with certain German vowel contrasts, despite having started learning German at 2 to 4 years of age 16. Although in the past some authors considered that the critical period for acquiring phonological skills might be restricted to 5-6 years of age 28, more recent studies demonstrated that these abilities decline sharply between 6 and 12 months of age 13. The above-mentioned study by Darcy & Krüger, together with the present one, indicates that early and intensive exposure to a second language may not necessarily be sufficient to build native-like phonemic categories, or to perform as well as native speakers in difficult listening conditions 10 15.
In the IO group, speech perception in noise was correlated with age. As previously observed, this is a process that improves gradually and reaches adult-like performances at the age of 15-16, because of the improvement in auditory processing efficiency and attentional control with age 1 2 29.
However, in the BL group speech perception in noise did not correlate with age per se, but did correlate significantly with years of exposure to L2. In this regard, one should consider that phonemic categories continue to be refined with age, provided children have a significant exposure to a given language. It may be that, between the two mechanisms involved in L2 perception in noise, further refinement of phonemic categories with exposure to the language becomes more relevant than the age-related improvement in auditory central processing and divided attention. This may explain why in our BL group perception in noise correlated significantly with years of exposure to L2 and not with age per se.
Regarding test # 2, bilingual children needed a significantly lower SNR than their IO-speaking counterparts. In other words, they demonstrated better performance when distracted by contralateral continuous conversation. In fact, this masking interferes with selective attention to the primary signal, not with the acoustic cues. These results reinforced the hypothesis behind test # 1, i.e. that bilingual children’s worse speech perception in noise was due to their weaker phonological competences, not to any lexical, grammatical or other cognitive factors, such as attention or memory. Thus, the selection criteria used to recruit BL children were strong enough to avoid inclusion of any subject with reduced skills in these last aspects, due to socio-economic disadvantage, previous pathologies, or unknown factors.
The strength of our study lies in that we carefully selected a homogeneous group of bilingual children who were competent speakers in both languages, and had been exposed to L2 between 4 and 7 years of age. Some studies have reported greater individual differences in non-native than in native speakers, a finding that probably reflects heterogeneity of the population of non-native speakers considered 19. Our data demonstrated very similar SDs between the IO and BL groups, probably due to our strict participant selection.
On the other hand, a possible bias might stem from the fact that L1 was represented by two different language groups, i.e. Slavic languages in 7 of our bilinguals, and Romance languages in 8. No significant differences emerged in the test scores between these two subgroups, however. It is likely that the different degrees of phonetic similarity between these languages and Italian are not strong enough to modify the children’s responses.
Moreover, the remote possibility of differences in the basic auditory function between children of various ethnic groups has already been excluded.
Conclusions
Sequential bilingual children might have speech perception difficulties in adverse listening conditions. We feel it could be important to test the effect of different ages of initial exposure to L2, and compare laboratory results with tests conducted in the classroom (or in virtually-reproduced classroom listening conditions) regarding comprehension of less redundant speech material.
Finally, as the number of immigrant children is increasing in many developed countries and effective listening is a linchpin of school learning 30, we hope these findings are kept in mind in the future and applied directly to more engineering-oriented disciplines associated with verbal communication (e.g. the design of classroom acoustics and classroom communication systems).
Figures and tables
Table I
|Bilingual participants = BL group||Monolingual participants = IO group|
|Subject||Age||Gender||L1||L2 - Age||%L1||%L2||L2 - Years||SP||Subject||Age||Gender||L1||SP|
Table II
|Bilingual group (BL)||Monolingual group (IO)|
|Subject||SNR noise||SNR phrases||Subject||SNR noise||SNR phrases|
Table III
|% exposure to L2||44.0||6.86|
Table IV
|BL subgroups F(7,7) = 0.74591; p < 0.646|
|Italian vs Romance||Italian vs Slavic|
|F(7,14) = 137.29; p < 0.001||F(7,15) = 126.26; p < 0.001|
|t value||df||p value||t value||df||p value|
Table V
|Correlation between age at time of first exposure to L2 and SNR for words in noise|
|Age at time of first exposure to L2||5.00||1.13|
|Correlation between years of exposure to L2 and SNR for words in noise|
|Years of exposure to L2||3.06||1.175|