Published: 2019-01-31

Probe-based confocal laser endomicroscopy in detecting malignant lesions of vocal folds

Department of Otorhinolaryngology, Head and Neck Surgery, Friedrich-Alexander-Universität Erlangen-Nürnberg, University Hospital, Erlangen, Germany
Pattern Recognition Lab, Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
Department of Otorhinolaryngology, Head and Neck Surgery, Universität Regensburg, University Hospital, Regensburg, Germany
Keywords: confocal laser endomicroscopy; laryngeal cancer; non-invasive histological imaging; vocal folds


Probe-based confocal laser endomicroscopy (CLE) is an innovative technique for real-time, non-invasive analysis of the surface epithelium.
Although experts have used it successfully for diagnosis, this method has not yet been established in clinical routine, partly because standards and criteria for classifying the various lesions are lacking. Our aim was to determine the diagnostic value and inter-rater reliability of CLE in detecting malignant lesions of the vocal folds. Fifty-eight video sequences were extracted from the probe-based CLE (GastroFlex probe with a Cellvizio® laser system) examinations of 3 patients with squamous cell carcinomas and 4 patients with benign alterations of the vocal folds. Two ENT surgeons, who were blinded to the histological result, were asked to identify the sequences representing a carcinoma. Accuracy, sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were 91.38-96.55%, 100%, 87.80-95.12%, 77.27-89.47% and 100%, respectively, with an inter-rater reliability of κ = 0.89 (“almost perfect agreement”). Probe-based CLE is a promising method for diagnosis and assessment of vocal fold lesions in vivo. Our results suggest that, with adequate training, the diagnostic value of this technique can be improved and potentially provide important information during oncological surgery.


The vocal folds are the most frequent location of laryngeal cancer, accounting for more than two-thirds of all cases 1 2. More than 90% of these cancers are classified as squamous cell carcinomas (SCC) 1 2. The diagnosis is established by biopsy and histopathological assessment. Leukoplakia, erythroplakia and papillomatosis are the macroscopically visible changes from which SCC usually originates 3 4. Up to 50% of leukoplakias show neither dysplasia nor invasive carcinoma in subsequent histological examination 5 6. The unnecessary excision or biopsy of these lesions could have a negative impact on the voice due to scarring of the vocal cords 7.

A number of optical imaging methods, such as confocal laser endomicroscopy (CLE), narrow-band imaging, fluorescence endoscopy and optical coherence tomography (OCT), have been suggested as having the potential to improve the white-light laryngoscopic analysis of mucosal lesions 8-13.

Probe-based CLE is a novel technique that enables the imaging of cell outlines at the surface of a lesion with a magnification of up to 1000×. The method requires administration of fluorescein as a contrast agent, which accumulates in the intercellular spaces but not in the nuclei 14. Because the acquired images differ from those of other imaging techniques, their interpretation requires special training of the clinician or pathologist 15. In the last few years, probe-based CLE has been intensively studied in gastroenterology and its application has expanded to other areas such as the head and neck, pulmonology and urology 16-21. The aim of this study was to assess the diagnostic value of probe-based CLE in identifying malignant lesions of the vocal cords in comparison to the accepted gold standard of histopathological examination.

Materials and methods

This research was carried out in accordance with the Code of Ethics of the World Medical Association (Declaration of Helsinki). The local ethics committee approved the study and all patients gave written informed consent.

The study was carried out at the ENT Department of a tertiary level university hospital. Between July and October 2015, seven patients (three women and four men; mean age 56.7 ± 5.8 years) underwent microlaryngoscopic examination under general anaesthesia. All patients had a suspected unilateral lesion of unknown nature in the vocal cords and therefore an indication for this procedure.

After documentation of findings under white light microscopy, 5 ml fluorescein (Fluorescein Alcon 10%, Alcon PHARMA GmbH, Freiburg, Germany) was administered intravenously and the vocal cords were scanned by probe-based CLE (GastroFlex probe with Cellvizio laser system, Mauna Kea Technologies, Paris, France).

The images were taken within five minutes of intravenous fluorescein injection, as the image quality was expected to deteriorate after eight minutes 22. The probe was placed on the vocal cords under direct vision. A biopsy was subsequently performed in the area of interest.

The video recordings were analysed and compared with the histological results. 58 representative CLE video sequences (3,224 images) of healthy vocal cords, benign lesions (hyperplasia, hyperkeratosis, polyps, cysts) and malignant lesions were extracted and presented independently to two medical professionals (blinded examiners) for assessment.

The examiners were two ENT specialists who had undergone training and certification as provided online by Cellvizio 23. At the present time, there is no certification programme available for head and neck lesions, but there is one for lesions in the oesophagus. Since the epithelium of these regions is for the most part similar and the prevalence of squamous cell carcinomas is also comparable (over 90% for both), we used the training programme designed for the oesophagus to help us learn how to classify the lesions in the vocal cords 24. The blinded examiners had to identify video sequences showing malignancy. The histological findings were regarded as the reference standard for subsequent statistical analysis.

Statistical analysis

Accuracy, sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were calculated with 95% confidence intervals for each examiner. Inter-rater agreement was assessed with Cohen’s kappa coefficient (κ). The κ values were interpreted according to the widely accepted Landis and Koch classification 25: agreement was regarded as slight for κ between 0 and 0.20, fair between 0.21 and 0.40, moderate between 0.41 and 0.60, substantial between 0.61 and 0.80, and almost perfect between 0.81 and 1.00. Statistical analysis was carried out using SPSS version 25.0 (IBM Corp. Released 2013. IBM SPSS Statistics for Windows, Version 25.0. Armonk, New York, United States of America).
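The method used for the 95% confidence intervals is not stated. As an illustration only, the following minimal Python sketch computes an exact (Clopper-Pearson) binomial interval by bisection; the function names are hypothetical and the choice of the exact method is an assumption. For 17 correct classifications out of 17, this interval has the closed-form lower limit 0.025^(1/17) ≈ 80.49%, which agrees with the lower limit reported for sensitivity in Table I.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root(g, iterations=100):
    """Bisection root of a function g that increases from negative to positive on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a proportion x / n (assumed method)."""
    # Lower limit: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else _root(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2
    )
    # Upper limit: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else _root(
        lambda p: alpha / 2 - binom_cdf(x, n, p)
    )
    return lower, upper


# Sensitivity: all 17 malignant sequences were identified by both examiners.
sens_lower, sens_upper = clopper_pearson(17, 17)
print(f"sensitivity 95% CI: {sens_lower:.4f} to {sens_upper:.4f}")
```

Intervals for predictive values are often computed with other methods, so this sketch should not be expected to reproduce every interval in Table I.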


All seven patients underwent CLE examination without complications. Intravenous administration of fluorescein did not cause any adverse effects. The examination with the CLE probe and the recording of the findings prolonged the procedure under general anaesthesia by about 10 minutes. Squamous cell carcinoma of the vocal folds was found in 3 patients (42.9%) and benign changes without dysplasia in the remaining 4 patients (57.1%). The benign changes were pseudoepitheliomatous hyperplasia, hyperkeratosis without dysplasia, a retention cyst and hyperplasia. Mucosa of the contralateral vocal cord that appeared normal and unremarkable on white-light microlaryngoscopy was classified as healthy mucosa, even in patients with unilateral SCC.

Of the 58 representative video sequences, 17 showed malignancies (29.3%) and 41 benign lesions or healthy normal mucosa (70.7%).

For the two examiners, the accuracy, sensitivity, specificity, PPV (positive predictive value) and NPV (negative predictive value) in identifying the video sequences with malignant alterations were 91.38-96.55%, 100%, 87.80-95.12%, 77.27-89.47% and 100%, respectively (Table I).
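The point estimates in Table I can be retraced from simple confusion counts. The sketch below is illustrative: the counts are not tabulated in the text but follow from the reported figures (17 malignant and 41 non-malignant sequences, 100% sensitivity, and specificities of 87.80% = 36/41 for E1 and 95.12% = 39/41 for E2), and the function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from a 2x2 confusion table."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positives among all malignant
        "specificity": tn / (tn + fp),  # true negatives among all non-malignant
        "ppv": tp / (tp + fp),          # malignant among all positive calls
        "npv": tn / (tn + fn),          # non-malignant among all negative calls
    }


# Counts inferred from the reported results (assumption, see lead-in).
e1 = diagnostic_metrics(tp=17, fp=5, fn=0, tn=36)
e2 = diagnostic_metrics(tp=17, fp=2, fn=0, tn=39)

for name, metrics in (("E1", e1), ("E2", e2)):
    summary = ", ".join(f"{key} {value * 100:.2f}%" for key, value in metrics.items())
    print(f"{name}: {summary}")
```

With no false negatives, sensitivity and NPV are both 100% for each examiner, so the difference between the examiners is driven entirely by the number of false-positive calls.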

Inter-rater reliability was tested with Cohen’s kappa (κ) and evaluated according to the Landis and Koch classification as well as Fleiss’s criteria. The two examiners obtained a κ value of 0.89, which is interpreted as almost perfect (Landis and Koch) or excellent (Fleiss) agreement. Figure 1 shows typical images of healthy mucosa and of malignant changes.

Of the 58 video sequences, 2 sequences representing healthy mucosa were incorrectly classified by both examiners independently as representing malignant changes (false positives, Fig. 2). Additionally, examiners E1 and E2 disagreed on three other sequences, representing false-positive findings for examiner E1 (Fig. 3).
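The reported κ can be retraced from the agreement pattern described above. The following sketch is illustrative and rests on a reconstruction: both examiners called 19 sequences malignant (17 carcinomas plus the 2 shared false positives), examiner E1 alone called 3 further sequences malignant, and both called the remaining 36 non-malignant. It assumes that E2 had no positive calls that E1 did not share. The function names are hypothetical.

```python
def cohens_kappa(both_pos, only_a, only_b, both_neg):
    """Cohen's kappa for two raters making a binary call on the same items."""
    n = both_pos + only_a + only_b + both_neg
    observed = (both_pos + both_neg) / n
    a_pos = (both_pos + only_a) / n  # share of positive calls by rater A
    b_pos = (both_pos + only_b) / n  # share of positive calls by rater B
    # Agreement expected by chance if the raters called independently.
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Verbal label for a kappa value after Landis and Koch (1977)."""
    if kappa < 0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"


# Agreement counts reconstructed from the text (assumption, see lead-in).
kappa = cohens_kappa(both_pos=19, only_a=3, only_b=0, both_neg=36)
print(f"kappa = {kappa:.2f} ({landis_koch(kappa)})")
```

Under these assumptions the observed agreement is 55/58 and the chance-expected agreement is about 0.54, which yields a κ of about 0.89, in line with the reported value.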


In this study, probe-based CLE showed very good results in identifying malignant lesions. We obtained an accuracy, sensitivity, specificity, PPV and NPV of 91.38-96.55%, 100%, 87.80-95.12%, 77.27-89.47% and 100%, respectively. We also showed excellent inter-observer agreement, as indicated by Cohen’s kappa statistic. This represents an improvement over a previous study by our group 19, which suggested that examinations of the vocal cords would show only fair agreement between observers. We attribute the improvement to two methodological changes in our work. First, the examiners underwent certified training provided by the Cellvizio Academy 23 in the interpretation of the CLE images and, most importantly, of the video sequences, as also suggested by Oetter et al. 17. As the images obtained by this technique differ from those of the other methods usually applied in daily routine, specific training is required.

The quality of the images was variable, as seen in Figures 2 and 3 depicting healthy mucosa. This should not, however, be regarded as a limitation of the study, as it represents the expected set-up in the operating theatre. Some of these lesions have a friable surface, resulting in slight bleeding of the mucosa, which could impair the quality of the images. Additionally, the rugged surface of a tumour makes it more difficult to ensure proper contact between the cell surface and the probe.

Even though our results suggest reliability and a high degree of certainty in identifying malignant lesions in the vocal folds, we did not demonstrate that CLE can add diagnostic value to microlaryngoscopic examination with white light, as there was already a high suspicion of malignancy. Further studies with a larger number of patients will have to be carried out to address this question. The three patients with SCC only underwent biopsy due to the suspicion of advanced glottic carcinoma and therefore there was no indication for cordectomy Type I using transoral laser microsurgery (TLM).

The penetration depth of probe-based CLE (pCLE) is limited to about 60 μm, thus providing a two-dimensional visualisation of the most superficial layer of the lesion. For this reason, it is usually not possible to differentiate between in situ carcinoma and invasive carcinoma, since stromal invasion cannot be demonstrated 26. This low penetration depth is possibly also the reason why benign lesions of the vocal folds, such as polyps, cannot be adequately differentiated from healthy mucosa 19. The histopathological changes of these benign lesions are usually found in the lamina propria under a healthy superficial layer of epithelium 26. We therefore opted for a two-category question (malignant versus non-malignant changes) in this study. Additional information about deeper layers could possibly be provided by combining pCLE with other endoscopic techniques such as optical coherence tomography (OCT) 11 or narrow band imaging (NBI) 12 13. NBI provides information about the examined areas by evaluating changes in the surrounding perpendicular and longitudinal vascular patterns 12 13. Recent reports on the diagnostic value of NBI in predicting malignancy show an accuracy, sensitivity, specificity, PPV and NPV of 96%, 100%, 95%, 88% and 100%, respectively, which appear very similar to our results 12. The assumption that the contralateral vocal fold represents healthy mucosa when it is inconspicuous on white-light microlaryngoscopy must be seen as a limitation of this study. A biopsy of the contralateral vocal fold to fully exclude epithelial changes would, however, not be ethically acceptable.

Some groups provide their own “training programme” to the test examiners before showing them the pCLE images to be analysed 11. This bears the risk of bias with respect to interpretation and makes comparison between studies more difficult. Because there is no specific training programme for head and neck cancer, we had to opt for the training programme available for the oesophagus, despite its limitations when extrapolated to the mucosa of the larynx. Moore et al. reported an accuracy, sensitivity, specificity, PPV and NPV of 100% using a similar methodological approach: 29 offline images and 6 video sequences 27. Oetter et al. 17 investigated the value of pCLE in the classification of lesions of the oral mucosa and suggested a classification and scoring system to facilitate the interpretation of these images by examiners without prior experience in this technique. The suggested scoring system enabled a sensitivity of over 95% and a specificity of 89% and showed excellent agreement between examiners.

The contrast agent (fluorescein), administered by i.v. injection prior to the examination, accumulates in the intercellular spaces, thereby enabling the imaging of the cell outlines as well as the small capillaries, but does not accumulate in nuclei. Visualisation of nuclei is, however, essential for diagnosis and grading of malignant lesions in head and neck cancer 28. The most commonly used contrast agent for the nuclei in CLE is acriflavine 29. When administered topically, acriflavine passes the cell membrane and binds strongly to the acidic constituents of the nucleus, thus enabling the staining of the superficial layers of the epithelium. It allows the differentiation of epithelial cells, goblet cells and other pathological patterns 30.

The potential of nuclear staining agents in the head and neck region was examined by Linxweiler et al. in formalin-fixed samples in 2016 18. Acriflavine stained the nuclei while suppressing the autofluorescence of collagen fibres, but only marginally improved the detection of tumour margins in the head and neck. Because of this only modestly positive net effect, the authors do not recommend the use of acriflavine 18. As formaldehyde changes the nuclear proteins and tissue autofluorescence, these results cannot be directly transferred to the examination with CLE in vivo. Recently, an improved version of acriflavine, acrinol, was suggested as an alternative topical contrast agent for the nucleus 31. This contrast agent has shown minimal mucosal irritation, being mostly excreted in stools, while still showing increased nuclear density and prominent abnormalities in carcinoma cells 31. Further studies on the toxicity of acrinol will, however, be required before it can be routinely applied in vivo.

Motivated by the difficulty of reliable and reproducible image interpretation, algorithms for the automatic classification of CLE images have recently been emerging 15 32 33. The deep learning approach described by Aubreville et al., based on convolutional neural networks, correctly recognised CLE images of oral SCC with an accuracy of 88.3%, a sensitivity of 86.6% and a specificity of 90% 15. To confirm the robustness of this model, the group applied the algorithm to our dataset of the vocal cords and obtained an accuracy of 90.7% 33. The quality of the video sequences can also be diminished by motion artefacts, which are usually caused by slight movements of the probe during examination. Although these motion artefacts are relatively easy for a human examiner to overlook, they represent a relevant interfering factor in automatic analysis. Detecting motion artefacts was shown to improve the performance of pattern recognition algorithms that recognise malignant changes 34. For comparison, the results obtained in this study by two trained and certified examiners show an accuracy of 91.38-96.55%. We find both sets of results very encouraging and worth pursuing in future studies. If the detection of SCC through pCLE proves to be reliable in further studies, this novel technique could help to better define surgical margins in real time (e.g. during TLM) and allow a more selective use of biopsy in the follow-up of patients who underwent TLM, as the incidence of recurrence was shown to be around 15% 35.


In this study, we showed that malignant lesions of the vocal cords can be reliably and accurately differentiated from healthy epithelium with an accuracy, sensitivity, specificity, PPV and NPV of 91.38-96.55%, 100%, 87.80-95.12%, 77.27-89.47% and 100%, respectively. This suggests the potential of the non-invasive diagnosis of SCC in vivo. This is particularly important in the vocal folds, as a biopsy can cause scarring with irreversible damage to the vocal folds. Further development of nuclear staining and, especially, the optimisation of training programmes for the human examiner as well as of the deep learning algorithms that constitute the core of the automatic classification of CLE images are very promising and should be the focus of further investigation.

Figures and tables

Fig. 1. The image on the left (A) shows a typical example of healthy mucosa and the image on the right (B) a typical example of squamous cell carcinoma.

Fig. 2. Example of a false-positive finding. Both examiners independently assessed this sequence as representing malignant changes. Note the relatively poor quality of this CLE image with three capillaries filled with erythrocytes (frame 085, A). The second image represents a later part of the sequence (frame 119, B) showing many small, similar cells without clear aberrant pleomorphism.

Fig. 3. Example of an image on which the examiners disagreed, representing a false positive of examiner E1. The image is also of poor quality, but the polygonal aspect suggests a benign lesion.

Metric                       E1 (95% CI)                 E2 (95% CI)
Accuracy                     91.38% (81.02-97.14%)       96.55% (88.09-99.58%)
Sensitivity                  100.00% (80.49-100.00%)     100.00% (80.49-100.00%)
Specificity                  87.80% (73.80-95.92%)       95.12% (83.47-99.40%)
Positive predictive value    77.27% (59.93-88.55%)       89.47% (68.75-97.05%)
Negative predictive value    100.00%                     100.00%
Table I. Diagnostic metrics. Accuracy, sensitivity, specificity, positive predictive value and negative predictive value with corresponding 95% confidence intervals (95% CI).


  1. Pantel M, Guntinas-Lichius O. Laryngeal carcinoma: epidemiology, risk factors and survival. HNO. 2012; 60:32-40.
  2. Reiter R, Brosch S, Smith E. Management of T1a vocal fold carcinoma. Laryngo-Rhino-Otologie. 2013; 92:797-807.
  3. Robert Koch Institut. Zentrum für Krebsregisterdaten, Krebs in Deutschland 2007/2008. Ausgabe. 2012; 8:12-3.
  4. Schultz P. Vocal fold cancer. Eur Ann Otorhinolaryngol Head Neck Dis. 2011; 128:301-8.
  5. Chen M, Cheng L, Li C. Nonsurgical treatment for vocal fold leukoplakia: an analysis of 178 cases. Biomed Res Int. 2017;6958250.
  6. Isenberg JS, Crozier DL, Dailey SH. Institutional and comprehensive review of laryngeal leukoplakia. Ann Otol Rhinol Laryngol. 2008; 117:74-9.
  7. Benninger MS, Alessi D, Archer S. Vocal fold scarring: current concepts and management. Otolaryngol Head Neck Surg. 1996; 115:474-82.
  8. Betz CS, Kraft M, Arens C. Optische diagnoseverfahren zur Tumorfrühdiagnostik im oberen Luft-Speise-Weg. HNO. 2016; 64:41-8.
  9. Piazza C, Dessouky O, Peretti G. Narrow-band imaging: a new tool for evaluation of head and neck squamous cell carcinomas. Review of the literature. Acta Otorhinolaryngol Ital. 2008; 28:49-54.
  10. Lingen MW, Kalmar JR, Karrison T. Critical evaluation of diagnostic aids for the detection of oral cancer. Oral Oncol. 2008; 44:10-22.
  11. Volgger V, Girschick S, Ihrler S. Evaluation of confocal laser endomicroscopy as an aid to differentiate primary flat lesions of the larynx: a prospective study. Head Neck. 2016; 38:1695-704.
  12. Šifrer R, Rijken JA, Leemans CR. Evaluation of vascular features of vocal cords proposed by the European Laryngological Society. Eur Arch Otorhinolaryngol. 2018; 275:147-51.
  13. Simo R, Bradley P, Chevalier D. European Laryngological Society: ELS recommendations for the follow-up of patients treated for laryngeal cancer. Eur Arch Otorhinolaryngol. 2014; 271:2469-79.
  14. Abbaci M, Breuskin I, Casiraghi O. Confocal laser endomicroscopy for non-invasive head and neck cancer imaging: a comprehensive review. Oral Oncol. 2014; 50:711-6.
  15. Aubreville M, Knipfer C, Oetter N. Automatic classification of cancerous tissue in laserendomicroscopy images of the oral cavity using deep learning. Sci Rep. 2017; 7:11979.
  16. Neumann H, Kiesslich R, Wallace MB. Confocal laser endomicroscopy: technical advances and clinical applications. Gastroenterology. 2010; 139:388-92.
  17. Oetter N, Knipfer C, Rohde M. Development and validation of a classification and scoring system for the diagnosis of oral squamous cell carcinomas through confocal laser endomicroscopy. J Transl Med. 2016; 14:159.
  18. Linxweiler M, Al Kadah B, Bozzato A. Noninvasive histological imaging of head and neck squamous cell carcinomas using confocal laser endomicroscopy. Eur Arch Otorhinolaryngol. 2016; 273:4473-83.
  19. Goncalves M, Iro H, Dittberner A. Value of confocal laser endomicroscopy in the diagnosis of vocal cord lesions. Eur Rev Med Pharmacol Sci. 2017; 21:3990-7.
  20. Vasilev I, Mamenko I, Tabanakova I. Probe-based confocal laser endomicroscopy in metastatic pulmonary calcification. J Bronchology Interv Pulmonol. 2018; 25:60-2.
  21. Kriegmair MC, Ritter M, Michel MS. Modern endoscopic imaging tools for urothelial carcinoma of the urinary bladder. Aktuelle Urol. 2017; 4:296-305.
  22. Becker V, von Delius S, Bajbouj M. Intravenous application of fluorescein for confocal laser scanning microscopy: evaluation of contrast dynamics and image quality with increasing injection-to-imaging time. Gastrointest Endosc. 2008; 68:319-23.
  23. Mauna Kea Technologies. Cellvizio Academy.
  24. Cook MB, Chow WH, Devesa SS. Oesophageal cancer incidence in the United States by race, sex, and histologic type, 1977-2005. Br J Cancer. 2009; 101:855-9.
  25. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977; 33:159-74.
  26. Nunes RB, Behlau M, Nunes MB. Clinical diagnosis and histological analysis of vocal nodules and polyps. Braz J Otorhinolaryngol. 2013; 79:434-40.
  27. Moore C, Mehta V, Ma X. Interobserver agreement of confocal laser endomicroscopy for detection of head and neck neoplasia. Laryngoscope. 2016; 126:632-7.
  28. Agaimy A, Weichert W. Grading von Tumoren der Kopf-Hals-Region. Pathologe. 2016; 37:285-92.
  29. Gheonea DI, Saftoiu A, Ciurea T. Confocal laser endomicroscopy of the colon. J Gastrointestin Liver Dis. 2010; 19:207-11.
  30. Hoffman A, Goetz M, Vieth M. Confocal laser endomicroscopy: technical status and current indications. Endoscopy. 2006; 38:1275-83.
  31. Kumagai Y, Takubo K, Ishida H. Acrinol: dye with potential for nuclear staining in confocal laser endomicroscopy. Dig Endosc. 2017; 29:811-2.
  32. Dittberner A, Rodner E, Ortmann W. Automated analysis of confocal laser endomicroscopy images to detect head and neck cancer. Head Neck. 2016; 38:E1419-26.
  33. Aubreville M, Goncalves M, Knipfer C. Patch-based carcinoma detection on confocal laser endomicroscopy images - a cross-site robustness assessment. In: Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2018) - Volume 2: BIOIMAGING. 27-34.
  34. Stoeve MP, Aubreville M, Oetter N. Motion artifact detection in confocal laser endomicroscopy images. Bildverarbeitung für die Medizin. 2018.
  35. Galli A, Giordano L, Sarandria D. Oncological and complication assessment of CO2-laser assisted endoscopic surgery for T1- and T2 glottic tumors: clinical experience. Acta Otorhinolaryngol Ital. 2016; 36:167-73.





© Società Italiana di Otorinolaringoiatria e chirurgia cervico facciale , 2019
