Bibliography

Peer-reviewed articles (impact factor: 104.494; Altmetric score: 1534)

highlighted
A40. Christian T. Herbst, Angela S. Stoeger, Roland Frey, Jörg Lohscheller, Ingo R. Titze, Michaela Gumpenberger, W. Tecumseh Fitch (2012). How Low Can You Go? Physical Production Mechanism of Elephant Infrasonic Vocalizations. Science, 337 (6094), 595-599
Elephants can communicate using sounds below the range of human hearing ("infrasounds" below 20 hertz). It is commonly speculated that these vocalizations are produced in the larynx, either by neurally controlled muscle twitching (as in cat purring) or by flow-induced self-sustained vibrations of the vocal folds (as in human speech and song). We used direct high-speed video observations of an excised elephant larynx to demonstrate flow-induced self-sustained vocal fold vibration in the absence of any neural signals, thus excluding the need for any "purring" mechanism. The observed physical principles of voice production apply to a wide variety of mammals, extending across a remarkably large range of fundamental frequencies and body sizes, spanning more than five orders of magnitude.
A39. Christian T. Herbst, Stellan Hertegard, Daniel Zangger-Borch, Per-Ake Lindestad (2016). Freddie Mercury -- Acoustic Analysis of Speaking Fundamental Frequency, Vibrato and Subharmonics. Logopedics Phoniatrics Vocology, 42 (1), 29-38
Freddie Mercury was one of the 20th Century's best known singers of commercial contemporary music. This study presents an acoustical analysis of his voice production and singing style, based on perceptual and quantitative analysis of publicly available sound recordings. Analysis of six interviews revealed a median speaking fundamental frequency of 117.3 Hz, which is typically found for a baritone voice. Analysis of voice tracks isolated from full band recordings suggested that the singing voice range was 37 semitones, within the pitch range of F#2 (about 92.2 Hz) to G5 (about 784 Hz). Evidence for higher phonations up to a fundamental frequency of 1347 Hz was not deemed reliable. Analysis of 240 sustained notes from 21 a-cappella recordings revealed a surprisingly high mean fundamental frequency modulation rate (vibrato) of 7.0 Hz, reaching the range of vocal tremor. Quantitative analysis utilizing a newly introduced parameter to assess the regularity of vocal vibrato corroborated its perceptually irregular nature, suggesting that vibrato (ir)regularity is a distinctive feature of the singing voice. Imitation of subharmonic phonation samples by a professional rock singer, documented by endoscopic high-speed video at 4132 frames per second, revealed a 3:1 frequency locked vibratory pattern of vocal folds and ventricular folds.
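The reported singing range can be checked with a short calculation. The sketch below assumes twelve-tone equal temperament, where an interval in semitones is 12 times the base-2 logarithm of the frequency ratio; the function name `semitones_between` is hypothetical and not taken from the article.

```python
import math

def semitones_between(f_low_hz: float, f_high_hz: float) -> float:
    """Interval between two frequencies in equal-tempered semitones."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)

# Frequencies quoted in the abstract: F#2 (about 92.2 Hz) and G5 (about 784 Hz).
span = semitones_between(92.2, 784.0)
print(f"range: {span:.2f} semitones")
```

Rounded to the nearest whole number, the result agrees with the 37 semitones stated in the abstract.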
A38. Christian T. Herbst, Hanspeter Herzel, Jan G. Svec, Megan Wyman, W. T. S. Fitch (2013). Visualization of system dynamics using phasegrams. R Soc Interface, 10 (85), 1-14
A new tool for visualization and analysis of system dynamics is introduced: the phasegram. Its application is illustrated with both classical nonlinear systems (logistic map and Lorenz system) and with biological voice signals. Phasegrams combine the advantages of sliding-window analysis (such as the spectrogram) with well-established visualization techniques from the domain of nonlinear dynamics. In a phasegram, time is mapped onto the x-axis, and various vibratory regimes, such as periodic oscillation, subharmonics or chaos, are identified within the generated graph by the number and stability of horizontal lines. A phasegram can be interpreted as a bifurcation diagram in time. In contrast to other analysis techniques, it can be automatically constructed from time-series data alone: no additional system parameter needs to be known. Phasegrams show great potential for signal classification and can act as the quantitative basis for further analysis of oscillating systems in many scientific fields, such as physics (particularly acoustics), biology or medicine.
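The idea that a vibratory regime shows up as a number of distinct lines can be illustrated with the logistic map mentioned in the abstract. The following is only a minimal sketch of that idea, not the phasegram algorithm from the article: it iterates x -> r*x*(1-x), discards a transient, and counts the distinct values the orbit keeps visiting. The function name `attractor_values` and all parameter choices (starting value, transient length, rounding precision) are assumptions made for illustration.

```python
def attractor_values(r, x0=0.2, transient=2000, keep=64, decimals=6):
    """Distinct values visited by the logistic map after a transient.

    Illustrative only: few distinct values indicate a periodic regime,
    many distinct values indicate irregular (chaotic) behavior.
    """
    x = x0
    for _ in range(transient):      # let the orbit settle onto its attractor
        x = r * x * (1.0 - x)
    seen = set()
    for _ in range(keep):           # collect the values the orbit keeps visiting
        x = r * x * (1.0 - x)
        seen.add(round(x, decimals))
    return sorted(seen)

for r in (2.8, 3.2, 3.5, 3.9):
    print(r, len(attractor_values(r)))
```

Plotting these values against a slowly changing r would give a conventional bifurcation diagram; the abstract describes the phasegram as the analogous picture with time on the x-axis, built from time-series data alone.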

others (in chronological order)
A37. Matthias Echternach, Fabian Burk, Michael Burdumy, Christian T. Herbst, Marie Köberlein, Michael Döllinger, Bernhard Richter (2017). The influence of vocal mass lesions on the passaggio region of professional singers. Laryngoscope, 127 (6), 1392-1401
OBJECTIVES/HYPOTHESIS: In professional classical singing, an even voice quality throughout the entire singing voice range is essential. Transitions between vocal registers (passaggio) are the technically most challenging aspects in classical singing. It is hypothesized that they are most affected by vocal fold mass lesions (VFML).

STUDY DESIGN: Cohort study.

METHODS: In this study, the effect of VFML on vocal fold vibration in the passaggio regions was analyzed in four female and three male singers suffering from organic dysphonia. The singers were asked to sing an ascending glissando through the passaggio regions, before and after treatment. The vocal fold vibration was documented with transnasal endoscopic high-speed imaging recordings at 20,000 frames per second, supplemented by synchronized acoustic and electroglottographic recordings.


RESULTS: Major irregularities were found in the passaggio region of four singers before treatment, whereas the respective phonations below the passaggio were almost regular. In two female singers only the upper, but not the lower passaggio was affected. In all four of these participants, the passaggio region was more regular after treatment. In the remaining three participants, the VFML showed no effect on the passaggio region. However, the singers' ability to reach higher pitches was impaired, but was resolved after treatment.


CONCLUSIONS: The data in this case study strongly suggest that the passaggio region could be affected by VFML, even if phonation outside the passaggio regions is unimpaired. When planning surgical procedures for professional singers, clinical examination protocols should therefore include phonatory tests across the passaggio regions.
A36. Matthias Echternach, Fabian Burk, Florian Rose, Christian T. Herbst, Michael Burdumy, Michael Döllinger, Bernhard Richter (2017). Auswirkungen von Phonationsverdickungen auf die Biomechanik der Stimmlippenschwingungen in den Registerübergangsregionen bei professionellen Sängerinnen. HNO, accepted

Introduction: The influence of functional phonatory thickenings on the vibratory behavior of the vocal folds during vocally demanding tasks is not understood in detail.

Material and Methods: In this study, glissandi from 220 Hz to 440 Hz and from 440 Hz to 880 Hz on the vowel [a] were examined in 4 professional female singers each (a) without organic findings and without dysphonia (group A), (b) with functional phonatory thickenings (group B), and (c) with organic dysphonia (group C), using high-speed endoscopy (HSDI, 20,000 frames per second) together with acoustic and electroglottographic (EGG) signals. Time windows for the analysis of register transition phenomena were defined on the basis of the EGG sample entropy. In addition, all voice signals (glottal area waveform (GAW), acoustic and EGG signal) were rated perceptually for the occurrence of registration events.

Results: The absolute sample entropy showed maxima in the fundamental frequency ranges where register transitions are typically found. The absolute sample entropy values for group C exceeded those of the other two groups only for the lower glissando. Group B did not differ clearly from group A in either the ratings or the sample entropy values.

Conclusion: Functional phonatory thickenings do not adversely affect the biomechanics in technically challenging regions of the voice such as register transitions. The use of sample entropy as a criterion for detecting register transitions is promising but requires further validation.
A35. Maxime Garcia, Christian T. Herbst, Daniel L. Bowling, Jacob Dunn, W. Tecumseh Fitch (2017). Acoustic allometry revisited: morphological determinants of fundamental frequency in primate vocal production. Scientific Reports, 7 (10450), 1-11
A fundamental issue in the evolution of communication is the degree to which signals convey accurate ("honest") information about the signaler. In bioacoustics, the assumption that fundamental frequency (fo) should correlate with the body size of the caller is widespread, but this belief has been challenged by various studies, possibly because larynx size and body size can vary independently. In the present comparative study, we conducted excised larynx experiments to investigate this hypothesis rigorously and explore the determinants of fo. Using specimens from eleven primate species, we carried out an inter-specific investigation, examining correlations between the minimum fo produced by the sound source, body size and vocal fold length (VFL). We found that, across species, VFL predicted minimum fo much better than body size, clearly demonstrating the potential for decoupling between larynx size and body size in primates. These findings shed new light on the diversity of primate vocalizations and vocal morphology, highlighting the importance of vocal physiology in understanding the evolution of mammal vocal communication.
A34. Matthias Echternach, Fabian Burk, Marie Köberlein, Andreas Selamtzis, Michael Döllinger, Michael Burdumy, Bernhard Richter, Christian T. Herbst (2017). Laryngeal evidence for the first and second passaggio in professionally trained sopranos. PLoS ONE, 12 (5), e0175865
Introduction
Due to a lack of empirical data, the current understanding of the laryngeal mechanics in the passaggio regions (i.e., the fundamental frequency ranges where vocal registration events usually occur) of the female singing voice is still limited.

Material and Methods
In this study the first and second passaggio regions of 10 professionally trained female classical soprano singers were analyzed. The sopranos performed pitch glides from A3 (fo = 220 Hz) to A4 (fo = 440 Hz) and from A4 (fo = 440 Hz) to A5 (fo = 880 Hz) on the vowel [i:]. Vocal fold vibration was assessed with trans-nasal high speed videoendoscopy at 20,000 fps, complemented by simultaneous electroglottographic (EGG) and acoustic recordings. Register breaks were perceptually rated by 12 voice experts. Voice stability was documented with the EGG-based sample entropy. Glottal opening and closing patterns during the passaggi were analyzed, supplemented with open quotient data extracted from the glottal area waveform.

Results
In both the first and the second passaggio, variations of vocal fold vibration patterns were found. Four distinct patterns emerged: smooth transitions with either increasing or decreasing durations of glottal closure, abrupt register transitions, and intermediate loss of vocal fold contact. Audible register transitions (in both the first and second passaggi) generally coincided with higher sample entropy values and higher open quotient variance through the respective passaggi.

Conclusions
Noteworthy vocal fold oscillatory registration events occur in both the first and the second passaggio even in professional sopranos. The respective transitions are hypothesized to be caused by either (a) a change of laryngeal biomechanical properties; or by (b) vocal tract resonance effects, constituting level 2 source-filter interactions.
A33. Christian T. Herbst (2017). A review of singing voice sub-system interactions - towards an extended physiological model of "support". Journal of Voice, 31 (2), 249.e13-249.e19
During phonation, the respiratory, the phonatory and the resonatory parts of the voice organ can interact, where physiological action in one sub-system elicits a direct effect in another. Here, three major synergies of this kind are reviewed, creating a model of voice sub-system interactions: (a) Vocal tract adjustments can influence the behavior of the voice source via non-linear source-tract interactions; (b) the type and degree of vocal fold adduction controls the expiratory airflow rate; and (c) the tracheal pull caused by the respiratory system affects the vertical larynx position and thus the vocal tract resonances.
The relevance of the presented model is discussed, suggesting, among other things, that functional voice building work concerned with a particular voice sub-system may evoke side effects or benefits on other sub-systems, even when having a clearly defined and isolated physiological target.
Finally, four seemingly incongruous historic definitions of the concept of singing voice "support" are evaluated, showing how each of these pertain to different voice sub-systems at various levels of detail. It is argued that presumed discrepancies between these definitions can be resolved by putting them into the wider context of the sub-system interaction model presented here, thus offering a framework for reviewing and potentially refining some current and historical pedagogical approaches.
A32. Matthias Echternach, Fabian Burk, Marie Köberlein, Christian T. Herbst, Michael Döllinger, Bernhard Richter (2016). Oscillatory characteristics of the vocal folds across the tenor passaggio. Journal of Voice, 31 (3), 381.e5-381.e14
Introduction: Recent research has revealed that classically trained tenors tend to constrict epilaryngeal structures when singing in and above the passaggio (ie, the frequency region where register events typically occur). These constrictions complicate visibility of vocal fold oscillatory patterns with transoral rigid high-speed video endoscopy, thus limiting the current understanding of laryngeal dynamics in the passaggio region of tenors.

Materials and Methods: This investigation analyzed seven professionally trained western classical tenors using high-speed digital imaging (HSDI) at 20,000 frames per second via transnasal flexible endoscopy. The participants produced transitions (a) from modal to falsetto register and (b) from modal to stage voice above the passaggio (SVaP) during ascending pitch glides from A3 (220 Hz) to A4 (440 Hz) on vowel /i/. HSDI data were complemented by simultaneous acoustic and electroglottographic recordings.

Results: For many subjects both transition types were associated with constrictions of the epilaryngeal structures during the pitch glide. These constrictions appeared to be more distinct for the SVaP than for falsetto. No major irregularities of vocal fold oscillations in the sense of fundamental frequency jumps were observed for either transition type. However, during the transitions, the open quotient derived from the glottal area waveform (OQGAW) increased; in falsetto, the OQGAW was greater and the electroglottographic cepstral peak prominence was lower than in SVaP.

Conclusions: Epilaryngeal constrictions should be considered typical for tenors singing at high fundamental frequencies. Vocal fold oscillatory patterns are changing not only for the register shift from modal to falsetto but also for the transition from modal to SVaP, indicating a need for laryngeal adjustments during these transitions.
A31. Christian T. Herbst, Vit Hampala, Maxime Garcia, Riccardo Hofer, Jan G. Svec (2016). Hemi-laryngeal setup for studying vocal fold vibration in three dimensions. Journal of Visualized Experiments, accepted
The voice of humans and most non-human mammals is generated in the larynx through self-sustaining oscillation of the vocal folds. Direct visual documentation of vocal fold vibration is challenging, particularly in non-human mammals. As an alternative, excised larynx experiments provide the opportunity to investigate vocal fold vibration under controlled physiological and physical conditions. However, the use of a full larynx merely provides a top view of the vocal folds, thus excluding crucial portions of the oscillating structures from observation during their interaction with aerodynamic forces. This limitation can be overcome by utilizing a hemi-larynx setup where one half of the larynx is mid-sagittally removed, thus providing both a superior and a lateral view of the remaining vocal fold during self-sustained oscillation.

Here, a step-by-step guide for the anatomical preparation of hemi-laryngeal structures and their mounting on the laboratory bench is given. Exemplary phonation of the hemi-larynx preparation is documented with high-speed video data captured by two synchronized cameras (superior and lateral views), showing three-dimensional vocal fold motion and corresponding time-varying contact area. The documentation of the hemi-larynx setup in this publication will facilitate application and reliable repeatability in experimental research, thus providing voice scientists with the potential to better understand the biomechanics of voice production.
A30. Christian T. Herbst, Harm K. Schutte, Daniel L. Bowling, Jan G. Svec (2016). Comparing chalk with cheese -- The EGG contact quotient is only a limited surrogate of the closed quotient. Journal of Voice, 31 (4), 401-409
The electroglottographic (EGG) contact quotient (CQegg), an estimate of the relative duration of vocal fold contact per vibratory cycle, is the most commonly used quantitative analysis parameter. The purpose of this study is to quantify the CQegg's relation to the closed quotient, a measure more directly related to glottal width changes during vocal fold vibration and the respective sound generation events.

Thirteen singers (six females) phonated in four extreme phonation types, while independently varying the degree of breathiness and vocal register. EGG recordings were complemented by simultaneous videokymographic (VKG) endoscopy, which allows for calculation of the videokymographic closed quotient (CQvkg). The CQegg was computed using five different algorithms, all used in previous research.

All CQegg algorithms produced CQegg values that clearly differed from the respective CQvkg, with standard deviations around 20 % of cycle duration. The difference between CQvkg and CQegg was generally greater for phonations with lower CQvkg. The largest differences were found for low-quality EGG signals with a signal-to-noise ratio (SNR) below 10 dB, typically stemming from phonations with incomplete glottal closure. Disregarding those low-quality signals, the best match between CQegg and CQvkg was found for a CQegg algorithm operating on the first derivative of the EGG signal.

These results show that the terms "closed quotient" and "contact quotient" should not be used interchangeably. They relate to different physiological phenomena. Phonations with incomplete glottal closure having an EGG SNR below 10 dB are not suited for CQegg analysis.
A29. Maxime Garcia, Bruno Gingras, Daniel L. Bowling, Christian T. Herbst, Markus Böckle, Yann Locatelli, W. Tecumseh Fitch (2016). Structural classification of Wild Boar (Sus scrofa) vocalizations. Ethology, 122 (4), 329-342
Determining whether a species' vocal communication system is graded or discrete requires definition of its vocal repertoire. In this context, research on domestic pig (Sus scrofa domesticus) vocalizations, for example, has led to significant advances in our understanding of communicative functions. Despite their close relation to domestic pigs, little is known about wild boar (Sus scrofa) vocalizations. The few existing studies, conducted in the 1970s, relied on visual inspections of spectrograms to quantify acoustic parameters and lacked statistical analysis. Here, we use objective signal processing techniques and advanced statistical approaches to classify 616 calls recorded from semi-free ranging animals. Based on four spectral and temporal acoustic parameters - quartile Q25, duration, spectral flux and spectral flatness - extracted from a multivariate analysis, we refine and extend the conclusions drawn from previous work and present a statistically validated classification of the wild boar vocal repertoire into four call types: grunts, grunt-squeals, squeals and trumpets. While the majority of calls could be sorted into these categories using objective criteria, we also found evidence supporting a graded interpretation of some wild boar vocalizations as acoustically continuous, with the extremes representing discrete call types. Using objective criteria based on modern techniques and statistics with respect to acoustic continuity, examining both production and perception levels, advances the understanding of vocal variation. Integrating our findings with recent studies on domestic pig vocal behavior and emotions, we emphasize the importance of grunt-squeals for acoustic approaches to animal welfare and underline the need for further research investigating the role of domestication in animal vocal communication.
A28. Laura Enflo, Christian T. Herbst, Johan Sundberg, Anita McAllister (2016). Comparing vocal fold contact criteria derived from electroglottographic and acoustic signals. Journal of Voice, 30 (4), 381-388

Objectives: Collision threshold pressure (CTP), i.e., the lowest subglottal pressure facilitating vocal fold contact during phonation, is likely to reflect relevant vocal fold properties. The amplitude of an electroglottographic (EGG) signal or the amplitude of its first derivative (dEGG) has been used as criterion of such contact. Manual measurement of CTP is time-consuming, making the development of a simpler, alternative method desirable.

Method: In this investigation we compare CTP values measured manually to values automatically derived from dEGG, and to values derived from a set of alternative parameters, some obtained from audio and some from EGG signals. One of the parameters was the novel EGG wavegram, which visualizes sequences of EGG or dEGG cycles, normalized with respect to period and amplitude. Raters with and without previous acquaintance with EGG analysis marked the disappearance of vocal fold contact in dEGG and in wavegram displays of /pa:/-sequences produced with continuously decreasing vocal loudness by seven singer subjects.

Results: Vocal fold contact was equally accurately identified in displays of both dEGG amplitude and wavegram. Automatically derived CTP values showed high correlation with those measured manually, and with those derived from the ratings of the visual displays. Seven other parameters were tested as criteria of such contact. Mainly due to noise in the EGG signal, most of them yielded CTP values differing considerably from those derived from the manual and the automatic methods, while the EGG spectrum slope showed a high correlation.

Conclusion: The possibility of measuring CTP automatically seems promising for future investigations.
A27. Vit Hampala, Maxime Garcia, Jan G. Svec, Ronald C. Scherer, Christian T. Herbst (2016). Relationship between the Electroglottographic Signal and Vocal Fold Contact Area. Journal of Voice, 30 (2), 161-171
Objective. Electroglottography (EGG) is a widely used non-invasive method that purports to measure changes in relative vocal fold contact area (VFCA) during phonation. Despite its broad application, the putative direct relation between the EGG waveform and VFCA has to date only been formally tested in a single study, suggesting an approximately linear relationship. However, in that study flow-induced vocal fold vibration was not investigated. A rigorous empirical evaluation of EGG as a measure of VFCA under proper physiological conditions is therefore still needed.

Methods/Design. Three red deer larynges were phonated in an excised hemilarynx preparation utilizing a conducting glass plate. The time varying contact between the vocal fold and the glass plate was assessed by high-speed video recordings at 6000 fps, synchronized to the EGG signal.

Results. The average differences between the normalized [0,1] VFCA and EGG waveforms for the three larynges were 0.180 (±0.156), 0.075 (±0.115) and 0.168 (±0.184) in the contacting phase, and 0.159 (±0.112), -0.003 (±0.029) and 0.004 (±0.032) in the de-contacting phase.

Discussion and Conclusion: Overall there was a better agreement between VFCA and the EGG waveform in the de-contacting phase than in the contacting phase. Disagreements may be caused by non-uniform tissue conductance properties, electrode placement, and electroglottograph hardware circuitry. Pending further research, the EGG waveform may be a reasonable first approximation to change in medial contact area between the vocal folds during phonation. However, any quantitative and statistical data derived from EGG should be interpreted cautiously, allowing for potential deviations from true VFCA.
A26. Christian T. Herbst, Jakob Unger, Hanspeter Herzel, Jan G. Svec, Jörg Lohscheller (2015). Phasegram analysis of vocal fold vibration documented with laryngeal high-speed video endoscopy. Journal of Voice, Feb 12. pii: S0892-1997(15)00257-X. doi: 10.1016/j.jvoice.2015.11.006. [Epub ahead of print]
Introduction. In a recent publication, the phasegram, a bifurcation diagram over time, has been introduced as an intuitive visualization tool for assessing the vibratory states of oscillating systems. Here, this non-linear dynamics approach is augmented with quantitative analysis parameters, and it is applied to clinical laryngeal high-speed video (HSV) endoscopic recordings of healthy and pathologic phonations.
Methods. HSV data from a total of 73 females diagnosed as healthy (n=42), or with functional dysphonia (n=15) or unilateral vocal fold paralysis (n=16), were quantitatively analyzed. Glottal area waveforms (GAW) as well as left and right hemi-GAWs (hGAW) were extracted from the HSV recordings. Based on Poincaré sections through phase space embedded signals, two novel quantitative parameters were computed: The phasegram entropy (PE), and the phasegram complexity estimate (PCE), inspired by signal entropy and correlation dimension computation, respectively.
Results. Both PE and PCE assumed higher average values (suggesting more irregular vibrations) for the pathological as compared to the healthy participants, significantly discriminating the healthy from the paralysis group (p=0.02 for both PE and PCE). Comparisons of individual PE or PCE data for the left and right hGAW within each subject resulted in asymmetry measures for the regularity of vocal fold vibration. The PCE-based asymmetry measure revealed significant differences between the healthy and the paralysis group (p=0.03).
Conclusions. Quantitative phasegram analysis of GAW and hGAW data is a promising tool for the automated processing of HSV data in research and in clinical practice.
A25. Coen Elemans, Jeppe Have Rasmussen, Christian T. Herbst, Daniel Düring, Sue Anne Zollinger, Henrik Brumm, Kyle Srivastava, Niels Svane, Ming Ding, Ole Larsen, Samuel Sober, Jan G. Svec (2015). Universal mechanisms of sound production and control in birds and mammals. Nature Communications, 6 (8978), 1-13
As animals vocalize, their vocal organ transforms motor commands into vocalizations for social communication. In birds, the physical mechanisms by which vocalizations are produced and controlled remain unresolved because of the extreme difficulty in obtaining in vivo measurements. Here, we introduce an ex vivo preparation of the avian vocal organ that allows simultaneous high-speed imaging, muscle stimulation and kinematic and acoustic analyses to reveal the mechanisms of vocal production in birds across a wide range of taxa. Remarkably, we show that all species tested employ the myoelastic-aerodynamic (MEAD) mechanism, the same mechanism used to produce human speech. Furthermore, we show substantial redundancy in the control of key vocal parameters ex vivo, suggesting that in vivo vocalizations may also not be specified by unique motor commands. We propose that such motor redundancy can aid vocal learning and is common to MEAD sound production across birds and mammals, including humans.
A24. Brian P. Gill, Christian T. Herbst (2015). Voice Pedagogy - What Do We Need?. Logopedics Phoniatrics Vocology, 41 (4), 168-173
The final keynote panel of the 10th Pan-European Voice Conference (PEVOC) was concerned with the topic "Voice Pedagogy - What do we need?" In this communication the panel discussion is summarized and the authors provide a more in-depth discussion of one of the key questions, addressing the roles and tasks of people working with voice students. In particular, a distinction is made between (a) voice building (derived from the German term "Stimmbildung"), primarily comprising the functional and physiological aspects of singing; (b) coaching, mostly concerned with performance skills; and (c) singing voice rehabilitation. Both public and private educators are encouraged to apply this distinction to their curricula, in order to arrive at more efficient singing teaching and to reduce the risk of vocal injury to the singers concerned.
A23. Christian T. Herbst (2015). Glottale Adduktion im Gesang. Vox Humana, 11 (1), 12-17
A22. Ingo R. Titze, Ronald Baken, Kenneth Bozeman, Svante Granqvist, Nathalie Henrich-Bernardoni, Christian T. Herbst, David Howard, Eric Hunter, Dean Kaelin, Raymond Kent, Jody Kreiman, Malte Kob, Anders Lofqvist, Scott McCoy, Donald Miller, Hubert Noe, Ronald C. Scherer, John Smith, Brad Story, Jan G. Svec, Sten Ternström, Joe Wolfe (2015). Toward a consensus on symbolic notation of harmonics, resonances, and formants in vocalization. J. Acoust. Soc. Am., 137 (5), 3005-3007
A21. Hana Sramkova, Svante Granqvist, Christian T. Herbst, Jan G. Svec (2015). The softest sound levels of the human voice in normal subjects. J. Acoust. Soc. Am., 137 (1), 407-418
A20. Christian T. Herbst, Jinook Oh, Jitka Vydrova, Jan G. Svec (2015). DigitalVHI -- a freeware open source software application to capture the Voice Handicap Index and other questionnaire data in various languages. Logopedics Phoniatrics Vocology, 40 (2), 70-74
In this short report we introduce DigitalVHI, a free open-source software application for obtaining Voice Handicap Index (VHI), and other questionnaire data, which can be put on a computer in clinics and used in clinical practice. The software can be downloaded from http://www.christian-herbst.org/DigitalVHI/
A19. Christian T. Herbst, Markus Hess, Frank Müller, Jan G. Svec, Johan Sundberg (2015). Glottal adduction and subglottal pressure in singing. Journal of Voice, 29 (4), 391-402
Previous research suggests that independent variation of vocal loudness and glottal configuration (type and degree of vocal fold adduction) does not occur in untrained speech production. This study investigated whether these factors can be varied independently in trained singing, and how subglottal pressure is related to average glottal airflow, voice source properties and sound level under these conditions.

A classically trained baritone produced sustained phonations on the endoscopic vowel [i:] at pitch D4 (approx. 294 Hz), exclusively varying either (a) vocal register; (b) phonation type (from ``breathy'' to ``pressed'' via cartilaginous adduction); or (c) vocal loudness, while keeping the others constant. Phonation was documented by simultaneous recording of videokymographic, electroglottographic, airflow and voice source data, and by percutaneous measurement of relative subglottal pressure.

Register shifts were clearly marked in the EGG wavegram display. As compared with chest register, falsetto was produced with greater pulse amplitude of the glottal flow, H1-H2, mean airflow, and with lower MFDR, subglottal pressure, and sound pressure. Shifts of phonation type (breathy/flow/neutral/pressed) induced comparable systematic changes. Increase of vocal loudness resulted in increased subglottal pressure, average flow, sound pressure, MFDR, glottal flow pulse amplitude and H1-H2.

When changing either vocal register or phonation type, subglottal pressure and mean airflow showed an inverse relationship, i.e., variation of glottal flow resistance. The direct relation between subglottal pressure and flow when varying only vocal loudness demonstrated independent control of vocal loudness and glottal configuration. Achieving such independent control of phonatory control parameters would be an important target in vocal pedagogy and in voice therapy.
A18. Shaheen N. Awan, Andrew R Krauss, Christian T. Herbst (2014). An Examination of the Relationship Between Electroglottographic Contact Quotient (CQEGG), EGG Decontacting Phase Profile, and Acoustical Spectral Moments. Journal of Voice, 29 (5), 519-529
A17. Christian T. Herbst (2014). Glottal efficiency of periodic and irregular in vitro red deer voice production. Acta Acustica united with Acustica, 100 (4), 724-733
Two female red deer larynges were artificially phonated in an excised larynx setup by varying subglottal pressure as the independent parameter. The acquired data were annotated as periodic, subharmonic and irregular by means of the recently developed phasegram technique. Glottal efficiency was non-linearly dependent on subglottal pressure. Above 1 kPa subglottal pressure the glottal efficiency increased linearly by about 3.1 and 3.7 dB per kPa, respectively, in the two larynges. At subglottal pressures above 1.5 kPa the glottal efficiency of the irregular segments was on average about 2.5 to 3 dB greater than that of the periodic and subharmonic segments. The results of this pilot study suggest that an irregular sound production mechanism at higher subglottal pressures could be a means to gain an energetic advantage in animal vocal communication when converting metabolic to acoustic energy.
A16. Christian T. Herbst, Jörg Lohscheller, Jan G. Svec, Nathalie Henrich Bernadoni, Gerald Weissengruber, W. Tecumseh Fitch (2014). Glottal opening and closing events investigated by electroglottography and super-high-speed video recordings. J Exp Biol, 217 (6), 955-963
Previous research has suggested that the peaks in the first derivative (dEGG) of the electroglottographic (EGG) signal are good approximate indicators of the events of glottal opening and closing. These findings were based on high-speed video (HSV) recordings with frame rates 10 times lower than the sampling frequencies of the corresponding EGG data. The present study attempts to corroborate these previous findings, utilizing super-HSV recordings. The HSV and EGG recordings (sampled at 27 and 44 kHz, respectively) of an excised canine larynx phonation were synchronized by an external TTL signal to within 0.037 ms. Data were analyzed by means of glottovibrograms, digital kymograms, the glottal area waveform and the vocal fold contact length (VFCL), a new parameter representing the time-varying degree of `zippering' closure along the anterior--posterior (A--P) glottal axis. The temporal offsets between glottal events (depicted in the HSV recordings) and dEGG peaks in the opening and closing phase of glottal vibration ranged from 0.02 to 0.61 ms, amounting to 0.24--10.88% of the respective glottal cycle durations. All dEGG double peaks coincided with vibratory A--P phase differences. In two out of the three analyzed video sequences, peaks in the first derivative of the VFCL coincided with dEGG peaks, again co-occurring with A--P phase differences. The findings suggest that dEGG peaks do not always coincide with the events of glottal closure and initial opening. Vocal fold contacting and de-contacting do not occur at infinitesimally small instants of time, but extend over a certain interval, particularly under the influence of A--P phase differences.
A15. David Howard, Jenevora Williams, Christian T. Herbst (2013). "Ring" in the solo child singing voice. JVoice, 28 (2), 161-169
Objectives/Hypothesis. Listeners often describe the voices of solo child singers as being 'pure' or 'clear'; these terms would suggest that the voice is not only pleasant but also clearly audible. The audibility or clarity could be attributed to the presence of high-frequency partials in the sound: a 'brightness' or 'ring'. This paper aims to investigate spectrally the acoustic nature of this 'ring' phenomenon in children's solo voices, and in particular to relate it to their 'non-ring' production. Additionally, this is set in the context of establishing to what extent, if any, the spectral characteristics of 'ring' are shared with those of the singer's formant cluster associated with professional adult opera singers in the 2.5 to 3.5 kHz region.

Methods. A group of child solo singers, acknowledged as outstanding by a singing teacher who specializes in teaching professional child singers, were recorded in a major UK concert hall performing Come unto him, all ye that labour, from the aria He shall feed his flock from The Messiah by GF Handel. Their singing was accompanied by a recording of a piano played through in-ear headphones. Sound pressure recordings were made from well within the critical distance in the hall. The singers were observed to produce notes with and without `ring', and these recordings were analyzed in the frequency domain to investigate their spectra.

Results. The results indicate that there is evidence to suggest that `ring' in child solo singers is carried in two areas of the output spectrum: firstly in the singer's formant cluster region, centered around 4 kHz, which is more than 1000 Hz higher than what is observed in adults; and secondly in the region around 7.5-11 kHz where a significant strengthening of harmonic presence is observed. A perceptual test has been carried out demonstrating that 94% of 62 listeners label a synthesized version of the calculated overall average `ring' spectrum for all subjects as having `ring' when compared to a synthesized version of the calculated overall average `non-ring' spectrum.

Conclusions. The notion of `ring' in the child solo voice manifests itself not only with spectral features in common with the projection peak found in adult singers but also in a higher frequency region. It is suggested that the formant cluster at around 4 kHz is the children's equivalent of the singers' formant cluster; the frequency is higher than in the adult, most likely due to the smaller dimensions of the epilaryngeal tube. The frequency cluster observed as a strong peak at about 7.5-11 kHz, when added to the children's singers' formant cluster, may be the key to cueing the notion of 'ring' in the child solo voice.
A14. Daniel L. Bowling, Christian T. Herbst, W. Tecumseh Fitch (2013). Social Origins of Rhythm? Synchrony and Temporal Regularity in Human Vocalization. PLOS ONE, 8 (11), e80402
Humans have a capacity to perceive and synchronize with rhythms. This is unusual in that only a minority of other species exhibit similar behavior. Study of synchronizing species (particularly anurans and insects) suggests that simultaneous signal production by different individuals may play a critical role in the development of regular temporal signaling. Accordingly, we investigated the link between simultaneous signal production and temporal regularity in our own species. Specifically, we asked whether inter-individual synchronization of a behavior that is typically irregular in time, speech, could lead to evenly-paced or ``isochronous'' temporal patterns. Participants read nonsense phrases aloud with and without partners, and we found that synchronous reading resulted in greater regularity of durational intervals between words. Comparison of same-gender pairings showed that males and females were able to synchronize their temporal speech patterns with equal skill. These results demonstrate that the shared goal of synchronization can lead to the development of temporal regularity in vocalizations, suggesting that the origins of musical rhythm may lie in cooperative social interaction rather than in sexual selection.
A13. Christian T. Herbst, Jan G. Svec, Jörg Lohscheller, Roland Frey, Michaela Gumpenberger, Angela S. Stoeger, W. Tecumseh Fitch (2013). Complex vibratory patterns in an elephant larynx. J Exp Biol, 216, 4054-4064
Elephant low-frequency vocalizations are produced by flow-induced self-sustaining oscillations of laryngeal tissue. To date, little is known in detail about the vibratory phenomena in the elephant larynx. Here we provide a first descriptive report of the complex oscillatory features found in the excised larynx of a 25 year old female African elephant (Loxodonta africana), the largest animal sound generator ever studied experimentally.

Sound production was documented with high-speed video, acoustic measurements, airflow and sound pressure level recordings. The anatomy of the larynx was studied with computed tomography (CT) and dissections. Elephant CT vocal anatomy data were further compared to the anatomy of an adult human male.

We observed numerous unusual phenomena, not typically reported in human vocal fold vibrations. Phase delays along both the inferior-superior and anterior-posterior (A-P) dimension were commonly observed, as well as transverse travelling wave patterns along the A-P dimension, as yet not documented in the literature. Acoustic energy was mainly created during the instant of glottal opening. The vestibular folds, when adducted, participated in the tissue vibration, effectively increasing the generated sound pressure level by 12 dB.

The complexity of the observed phenomena is partly attributed to the distinct laryngeal anatomy of the elephant larynx, which is not simply a large-scale version of its human counterpart. Travelling waves may be facilitated by low fundamental frequencies and increased vocal fold tension. A travelling wave model is proposed, to account for three types of phenomena: A-P travelling waves, ``conventional'' standing wave patterns, and irregular vocal fold vibration.
A12. Jakob Unger, Tobias Meyer, Christian T. Herbst, W. Tecumseh Fitch, Michael Döllinger, Jörg Lohscheller (2013). Phonovibrographic wavegrams: Visualizing vocal fold kinematics. J. Acoust. Soc. Am., 133 (2), 1055-1064
Recently, endoscopic high-speed laryngoscopy has been established for commercial use as a state-of-the-art technique to examine vocal fold kinematics. Since modern cameras provide sampling rates of several thousand frames per second, a high volume of data has to be considered for visual and objective analysis. A method for visualizing endoscopic high speed videos in three-dimensional cycle-based graphs combining and extending the approaches of phonovibrograms and electroglottographic wavegrams is presented. To build a phonovibrographic wavegram, individual cycles of a phonovibrogram are segmented, normalized in cycle duration, and concatenated over time. For analyzing purposes, the emerging three-dimensional scalar field is visualized with different rendering techniques providing information of different aspects of vocal fold kinematics. The phonovibrographic wavegram incorporates information about the glottal closure type, size, and location of the amplitudes, symmetry, periodicity, and phase information. The potential of the approach to visualize the characteristics of vocal fold vibration in a compact and intuitive way is demonstrated within two healthy and three pathologic subjects. The phonovibrographic wavegram allows a comprehensive analysis of vocal fold kinematics and reveals information that remains hidden with other visualization techniques.
A11. Angela S. Stoeger, Daniel Mietchen, Sukhun Oh, Shermin de Silva, Christian T. Herbst, Soonwhan Kwon, W. Tecumseh Fitch (2012). An Asian Elephant Imitates Human Speech. Current Biology, 22, 1-5
Vocal imitation has convergently evolved in many species, allowing learning and cultural transmission of complex, conspecific sounds, as in birdsong. Scattered instances also exist of vocal imitation across species, including mockingbirds imitating other species or parrots and mynahs producing human speech. Here, we document a male Asian elephant (Elephas maximus) that imitates human speech, matching Korean formants and fundamental frequency in such detail that Korean native speakers can readily understand and transcribe the imitations. To create these very accurate imitations of speech formant frequencies, this elephant (named Koshik) places his trunk inside his mouth, modulating the shape of the vocal tract during controlled phonation. This represents a wholly novel method of vocal production and formant control in this or any other species. One hypothesized role for vocal imitation is to facilitate vocal recognition by heightening the similarity between related or socially affiliated individuals. The social circumstances under which Koshik's speech imitations developed suggest that one function of vocal learning might be to cement social bonds and, in unusual cases, social bonds across species.
A10. Christian T. Herbst, Elke Duus, Harald Jers, Jan G. Svec (2012). Quantitative Voice Class Assessment of Amateur Choir Singers: A Pilot Investigation. International Journal of Research in Choral Singing (IJRCS), 4 (1), 47-59
The required pitch range (RPR), i.e. the pitch range that is determined by the music to be sung, is dependent on voice class (most commonly: soprano, alto, tenor or bass). Ideally, it should lie well within the boundaries of the physiologic voice range. In amateur choir singing however, the individual singer's choice of voice class does not necessarily result in optimal use of vocal potential. This study tries to establish an objective, quantitative method to determine voice class, and to highlight unused potential as regards voice range.

Twenty-one members of an amateur choir (15 female, 6 male) were examined by means of standard voice range profile (VRP) measurement. The RPR (as defined by the singers' chosen voice class) was compared to maximum phonational frequency range (MPFR) as determined by the physiological VRP measurement. The difference between the upper limit of the RPR and the highest pitch in the VRP, expressed in semitones, was defined as ``upper reserve'' (UR); the difference between the lower limit of the RPR and the lowest pitch measured with the VRP was defined as the ``lower reserve'' (LR). The ``tessitura shift'' (TS) was defined as half the difference between upper and lower reserve [ TS = (LR - UR) / 2 ]. It is a measure of the offset of the RPR in relation to the MPFR, expressed in semitones.

The average physiologic voice range was 37.7 semitones (min 31, max 45). With the exception of the sopranos, all voice classes had more upper reserve than lower reserve, which was reflected by the average TS per voice class: soprano 2.33; other voice classes: -2.83 to -6.3. Results imply that individual singers might profit from changing their voice class (from soprano to alto, or vice versa), in order to better exploit their physiological voice range.

We concluded that upper and lower reserve measurements are well suited to indicate the degree of voice usage in extreme frequency ranges, whereas the TS can be used as an indicator of the ``alignment'' of RPR within the physiological voice range. Amateur choir singers' choice of voice class is a strategic decision that might crucially influence the singers' phonatory behavior, and thus their long-term vocal health. The indicators presented in this study may be useful for making such a decision.
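The reserve and tessitura shift definitions in the abstract above can be illustrated with a minimal sketch. It assumes pitches are given as MIDI note numbers (so differences are in semitones) and that both reserves are counted as positive when the physiological range extends beyond the required range; the function name and the example values are hypothetical and are not taken from the study.

```python
def reserves_and_tessitura_shift(rpr_low, rpr_high, vrp_low, vrp_high):
    """Return (upper reserve, lower reserve, tessitura shift) in semitones.

    All arguments are MIDI note numbers (assumption, not from the paper):
    rpr_* are the limits of the required pitch range,
    vrp_* are the limits measured with the voice range profile.
    """
    # Assumed sign convention: reserves are positive when the
    # physiological range reaches beyond the required range.
    upper_reserve = vrp_high - rpr_high
    lower_reserve = rpr_low - vrp_low
    # TS = (LR - UR) / 2, as stated in the abstract.
    tessitura_shift = (lower_reserve - upper_reserve) / 2
    return upper_reserve, lower_reserve, tessitura_shift


# Hypothetical singer: required range G3..D5 (55..74),
# measured physiological range E3..C6 (52..84).
ur, lr, ts = reserves_and_tessitura_shift(55, 74, 52, 84)
print(ur, lr, ts)  # 10 3 -3.5
```

A negative TS, as in this made-up example, means the upper reserve exceeds the lower reserve, which matches the pattern the abstract reports for all voice classes except the sopranos.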
A9. Christian T. Herbst, Jan G. Švec (2012). Adjustment of glottal configurations in singing. Journal of Singing, 70 (3), 301-308
A8. Bruno Gingras, Markus Boeckle, Christian T. Herbst, W. Tecumseh Fitch (2012). Call acoustics reflect body size across four clades of anurans. Journal of Zoology, 289 (2), 143-150
An inverse relationship between body size and advertisement call frequency has been found in several frog species. However, the generalizability of this relationship across different clades and across a large distribution of species remains underexplored. We investigated this relationship in a large sample of 136 species belonging to four clades of anurans (Bufo, Hylinae, Leptodactylus and Rana) using semi-automatic, high-throughput analysis software. We employed two measures of call frequency: fundamental frequency (F0) and dominant frequency (DF). The slope of the relationship between male snout-vent length (SVL) and frequency did not differ significantly among the four clades. However, Rana call at a significantly lower frequency relative to size than the other clades, and Bufo call at a significantly higher frequency relative to size than Leptodactylus. Because the relationship between F0 and body size may be more straightforwardly explained by biomechanical constraints, we confirmed that a similar inverse relationship was observed between F0 and SVL. Finally, spectral flatness, an indicator of the tonality of the vocalizations, was found to be inversely correlated with SVL, contradicting an oft-cited prediction that larger animals should have rougher voices. Our results confirm a tight and widespread link between body size and call frequency in anurans, and suggest that laryngeal allometry and vocal fold dimensions in particular are responsible.
A7. Christian T. Herbst (2012). Freddie Mercury - Akustische Stimm-Analyse. L.O.G.O.S. Interdisziplinär, 20 (3), 174-183
In dieser Studie wurde das öffentlich zugängliche Tonmaterial des Sängers Freddie Mercury akustisch analysiert. Es wurde eine mittlere Sprechstimmlage von ungefähr 109 bis 128 Hertz und ein Singstimmumfang von drei Oktaven (G bis g'', ca. 98 -- 784 Hz) festgestellt. Freddie Mercury war von der Sprechstimmlage her Bariton, sang jedoch meistens in Tenorlage. Das Stimmtimbre zeigte sich sehr variabel. Freddie Mercury sang sowohl im Brust- als auch im Falsett-Register, der Grad der glottischen Adduktion wurde abhängig vom ästhetischen Kontext entlang der Dimension "behaucht"/"gepresst" variiert. Die Stimme hatte ein unregelmäßiges und schnelles Vibrato (ca. 7 Hz) mit relativ weiter Auslenkung (ca. 1.5 Halbtöne). Das stellenweise "raue" Stimmtimbre ist auf subharmonische Oszillations-Phänomene (Periodenverdopplung) im Larynx zurückzuführen. Der Gesamteindruck einer Stimme, welche bis ans Limit ausgereizt wurde, ist durchaus kompatibel mit der exzentrischen Künstlerpersönlichkeit Freddie Mercurys.

This study provides an acoustical analysis of Freddie Mercury's voice, mostly based on the commercially available a-cappella sound material. The average speaking fundamental frequency was in the range of 109 to 128 Hz and the singing voice range stretched across three octaves (G2 to G5, ca. 98 to 784 Hz). These results suggest that Freddie Mercury was a baritone who sang as a tenor. Being able to flexibly adjust his voice timbre, he sang in both chest and head (falsetto) voice. He was capable of manipulating glottal adduction along the dimension of breathy vs. pressed, varying with aesthetic context. Freddie Mercury's voice was characterized by an irregular and fast vibrato (ca. 7 Hz) with a relatively large amplitude of about 1.5 semitones. The perceptually rougher sounds were likely caused by subharmonic oscillatory phenomena (period doubling, tripling and quadrupling) in the larynx. In conclusion, the collected data suggest that Freddie Mercury drove his voice to its limits, which is in good agreement with his eccentric stage persona.
A6. Christian T. Herbst, Qingjun Qiu, Harm K. Schutte, Jan G. Švec (2011). Membranous and cartilaginous vocal fold adduction in singing. J. Acoust. Soc. Am., 129 (4), 2253-2262
While vocal fold adduction is an important parameter in speech, relatively little is known about the adjustment of vocal fold adduction in singing. This study investigates the possibility of separate adjustments of cartilaginous and membranous vocal fold adduction in singing. Six female and seven male subjects, singers and non-singers, were asked to imitate an instructor in producing four phonation types: "aBducted falsetto" (FaB), "aDducted falsetto" (FaD), "aBducted Chest" (CaB), and "aDducted Chest" (CaD). The phonations were evaluated using videostroboscopy, videokymography (VKG), electroglottography (EGG), and audio recordings. All the subjects showed less posterior (cartilaginous) vocal fold adduction in phonation types FaB and CaB than in FaD and CaD, and less membranous vocal fold adduction (smaller closed quotient) in FaB and FaD than in CaB and CaD. The findings indicate that the exercises enabled the singers to separately manipulate (a) cartilaginous adduction and (b) membranous medialization of the glottis through vocal fold bulging. Membranous adduction (monitored via videokymographic closed quotient) was influenced by both membranous medialization and cartilaginous adduction. Individual control over these types of vocal fold adjustments allows singers to create different vocal timbres.
A5. Christian T. Herbst, W. T. S. Fitch, Jan G. Švec (2010). Electroglottographic wavegrams: a technique for visualizing vocal fold dynamics noninvasively. J. Acoust. Soc. Am., 128 (5), 3070-3078
A method for analyzing and displaying electroglottographic (EGG) signals (and their first derivative, DEGG) is introduced: the electroglottographic wavegram ("wavegram" hereafter). To construct a wavegram, the time-varying fundamental frequency is measured and consecutive individual glottal cycles are identified. Each cycle is locally normalized in duration and amplitude, the signal values are encoded by color intensity, and the cycles are concatenated to display the entire voice sample in a single image, similar to sound spectrography. The wavegram provides an intuitive means for quickly assessing vocal fold contact phenomena and their variation over time. Variations in vocal fold contact appear here as a sequence of events rather than single phenomena, taking place over a certain period of time, and changing with pitch, loudness and register. Multiple DEGG peaks are revealed in wavegrams to behave systematically, indicating subtle changes of vocal fold oscillatory regime. As such, EGG wavegrams promise to reveal more information on vocal fold contacting and de-contacting events than previous methods.
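The construction steps named in the abstract (identify cycles, normalize each in duration and amplitude, concatenate) can be sketched roughly as follows. This is not the authors' implementation: it assumes the cycle boundaries are already known as sample indices (the paper derives them from a fundamental frequency measurement), uses plain linear interpolation, and returns grayscale intensities in [0, 1] instead of rendering an image. All function names are hypothetical.

```python
import math


def resample_cycle(cycle, n_points):
    """Duration normalization: linearly interpolate onto n_points samples."""
    if len(cycle) < 2 or n_points < 2:
        raise ValueError("need at least two samples and two output points")
    out = []
    for k in range(n_points):
        pos = k * (len(cycle) - 1) / (n_points - 1)
        i = min(int(pos), len(cycle) - 2)
        frac = pos - i
        out.append(cycle[i] * (1 - frac) + cycle[i + 1] * frac)
    return out


def normalize_amplitude(cycle):
    """Amplitude normalization: map the cycle's own min..max onto 0..1."""
    lo, hi = min(cycle), max(cycle)
    span = hi - lo
    if span == 0:
        return [0.0] * len(cycle)
    return [(v - lo) / span for v in cycle]


def build_wavegram(signal, cycle_starts, n_points=50):
    """Return one column of intensities per glottal cycle.

    cycle_starts holds the first sample index of each cycle plus the
    end index of the last one (assumed to be given).
    """
    columns = []
    for start, end in zip(cycle_starts, cycle_starts[1:]):
        resampled = resample_cycle(signal[start:end], n_points)
        columns.append(normalize_amplitude(resampled))
    return columns


# Synthetic stand-in for an EGG signal: two cycles of different length.
signal = [math.sin(2 * math.pi * n / 20) for n in range(20)]
signal += [math.sin(2 * math.pi * n / 30) for n in range(30)]
wavegram = build_wavegram(signal, [0, 20, 50], n_points=50)
print(len(wavegram), len(wavegram[0]))  # 2 50
```

Plotting the columns side by side, with cycle index on the horizontal axis and normalized cycle time on the vertical axis, would give the image-like display the abstract describes.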
A4. Christian T. Herbst, David Howard, Josef Schlömicher-Thier (2010). Using electroglottographic real-time feedback to control posterior glottal adduction during phonation. JVoice, 24 (1), 72-85
The goal of this pilot study was to determine whether the ability to change the degree of posterior glottal adduction (PGA) during phonation can be acquired more easily with the aid of electroglottographic (EGG) real-time feedback. The subject was a 37-year-old untrained female with a habitually breathy voice. Before the experiment, she participated in one voice coaching session where exercises for increasing PGA were explained and executed. During the experiment, phonation was monitored simultaneously with videostroboscopy, electroglottography, and audio recording. While phonating, the subject saw an amplitude- and period-normalized EGG waveform representing one glottal cycle, changing consecutively over time. The assignment was to increase the width of the EGG waveform during phonation. Laryngeal imaging revealed a posterior glottal chink during habitual phonation. The subject could only introduce intentional changes into the EGG waveform after its relevance had been explained, and after recapitulation of the exercises of the voice coaching session: an increase of the EGG waveform width coincided with an increase of high-frequency partials and an increase of PGA. For pitches B3 and B4, full glottal closure could be achieved. At G5, a reduction of the posterior glottal chink occurred. The findings of this study suggest that the skill to control the degree of PGA can be acquired, and that EGG real-time feedback can be a crucial element in optimizing the process of skill acquisition, but only if (1) the context and nature of the feedback are explained and (2) proper instructions are provided. The EGG contact quotient might not be sensitive to changes of PGA in falsetto phonation.
A3. Christian T. Herbst, Jan G. Švec, Sten Ternström (2009). Investigation of four distinct glottal configurations in classical singing - a pilot study. JASA-EL, 125 (3), EL104-EL109
This study investigates four qualities of singing voice in a classically trained baritone: "naïve falsetto", "countertenor falsetto", "lyrical chest" and "full chest". Laryngeal configuration and vocal fold behavior in these qualities were studied using laryngeal videostroboscopy, videokymography, electroglottography, and sound spectrography. The data suggest that the four voice qualities were produced by independently manipulating mainly two laryngeal parameters: (1) the adduction of the arytenoid cartilages and (2) the thickening of the vocal folds. An independent control of the posterior adductory muscles versus the vocalis muscle is considered to be the physiological basis for achieving these singing voice qualities.
A2. Christian T. Herbst (2007). Der Knabensolist in der Oper - Ein akustisches Portrait. L.O.G.O.S. Interdisziplinär, 15 (3), 166-174
A mezzo-soprano of the Tölzer Knabenchor sang the role of Yniold in Claude Debussy's "Pelléas et Mélisande" at the 2006 Salzburg Easter Festival. During the main rehearsal of the production, the audio signal was captured with a microphone mounted at a fixed distance from the singer's mouth. Additional evidence was obtained by means of electroglottography and video endoscopy. The aim of the study was to clarify whether statements about the singing technique applied can be made by collecting and interpreting objective data.

Using an automated, computer-aided process, segments containing vowels and voiced consonants were extracted from the acoustic signal. The resulting tessiturogram shows a normal distribution of pitch centered on B4 (German notation: h'). In the sound pressure level histogram, segments with a sound pressure level of about 90 dB occur most frequently; the peak value measured was 112 dB at a distance of 30 cm between mouth and microphone. The sound pressure level rises quasi-linearly by 10 dB per octave. The long-term spectrum exhibits formant clusters between 3000 and 4000 Hz and between 7300 and 8200 Hz.

The calculated crest factor decreases by about 3 dB per octave, which can be taken as an indicator of a decline in high-frequency partials with increasing pitch. A striking feature is an abrupt drop of the crest factor from F5 (German notation: f'', about 700 Hz) upward. Corresponding electroglottographic data show a clear change of the EGG waveform from F5 upward; a shortening of the glottal closed phase is evident. An additional examination by video endoscopy shows that, without register equalization intended by the singer, no further increase in pitch is possible once the vocalis muscle has reached its point of maximal contraction.

It can be surmised that merely "singing louder" is not a sufficient strategy for the big stage. The required increase in sound pressure level must go hand in hand with a corresponding physiological predisposition and an excellent singing technique. The present study shows that adequate equalization of registers and pitch ranges is an essential feature of a voice fit for the stage.
A1. Christian T. Herbst, Sten Ternström (2006). A comparison of different methods to measure the EGG contact quotient. Logopedics Phoniatrics Vocology, 31 (3), 126-138
The results from six published electroglottographic (EGG-based) methods for calculating the EGG contact quotient (CQEGG) were compared to closed quotients derived from simultaneous videokymographic imaging (CQKYM). Two trained male singers phonated in falsetto and in chest register, with two degrees of adduction in both registers. The maximum difference between methods in the CQEGG was 0.3 (out of 1.0). The CQEGG was generally lower than the CQKYM. Within subjects, the CQEGG co-varied with the CQKYM, but with changing offsets depending on method. The CQEGG cannot be calculated for falsetto phonation with little adduction, since there is no complete glottal closure. Basic criterion-level methods with thresholds of 0.2 or 0.25 gave the best match to the CQKYM data. The results suggest that contacting and de-contacting in the EGG might not refer to the same physical events as do the beginning and cessation of airflow.
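As a rough illustration of what a basic criterion-level method computes, the sketch below estimates the contact quotient of a single, already segmented EGG cycle as the fraction of samples lying above a threshold placed at a given fraction of the cycle's peak-to-peak amplitude. It assumes that larger EGG values mean more vocal fold contact and that cycle segmentation has been done elsewhere; the function name and the toy cycle are hypothetical, and the paper's exact procedures may differ.

```python
def contact_quotient(cycle, threshold=0.25):
    """Criterion-level contact quotient for one EGG cycle.

    cycle:     samples of one glottal cycle (higher value = more contact,
               which is an assumed polarity convention)
    threshold: criterion level as a fraction of peak-to-peak amplitude
    """
    lo, hi = min(cycle), max(cycle)
    if hi == lo:
        raise ValueError("flat cycle: no contact information")
    level = lo + threshold * (hi - lo)
    # Samples above the criterion level count as the contact phase.
    contacted = sum(1 for v in cycle if v > level)
    return contacted / len(cycle)


# Toy triangular cycle, not real EGG data.
toy_cycle = [0, 1, 2, 3, 4, 3, 2, 1]
print(contact_quotient(toy_cycle, 0.25))  # 0.625
print(contact_quotient(toy_cycle, 0.5))   # 0.375
```

The two printed values show how strongly the result depends on the chosen threshold, which is one reason the study compares methods and thresholds against videokymographic data.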
Grants, scholarships and awards
G14. Annual Sataloff Award for Young Investigators. The Voice Foundation, April 2016. [awarded for the Journal of Voice publication "Phasegram analysis of vocal fold vibration documented with laryngeal high-speed video endoscopy" by Christian T. Herbst, Jakob Unger, Hanspeter Herzel, Jan G. Svec, and Jörg Lohscheller.]
G13. APART Grant [Austrian Programme for Advanced Research and Technology]. Austrian Academy of Sciences, November 2014
G12. Award. Croatian Choral Directors Association, April 2014. [for Scientific Research in the Field of Chorusology]
G11. Society of Experimental Biology, Young Scientist Award, Animal Section: Runner Up. SEB Annual Main Meeting 2013, July 2013. [for the contribution Christian T. Herbst, Angela S. Stoeger, Roland Frey, Jörg Lohscheller, Ingo R. Titze, Michaela Gumpenberger, W. Tecumseh Fitch. Sound production mechanism in elephant infrasound vocalizations.]
G10. AQL Best Paper Award. 10th International Conference on Advances in Quantitative Laryngology, Voice and Speech Research, June 2013. [for the contribution Christian T. Herbst, W. Tecumseh Fitch, Jörg Lohscheller, Jan G. Svec. Estimation of the vertical glottal shape based on empirical high-speed video and electroglottographic data.]
G9. 2nd Annual Hamdan International Presentation Award. The Voice Foundation's 42nd Annual Symposium, June 2013. [for the contribution Christian T. Herbst, Angela S. Stoeger, Roland Frey, Jörg Lohscheller, Ingo R. Titze, Michaela Gumpenberger, W. Tecumseh Fitch. Sound production mechanism in elephant infrasound vocalizations.]
G8. Van Lawrence Prize. British Voice Association, May 2012. [in recognition of the contribution to the field of voice: Christian T. Herbst, Jan G. Švec, J. Schlömicher-Thier, W. Tecumseh Fitch: Analyzing the female 'middle register' with EGG wavegrams]
The choice of singing register and the degree of vocal fold adduction are two concepts that are not easily discriminated by inexperienced singers. This is particularly true for the mid range (pitch C4 -- C5) of untrained female classical singers, where adducted falsetto, the desired sound quality in this range, is rarely observed. As an underlying physiological principle, vocal fold adduction can be separately controlled by (a) cartilaginous adduction, i.e. the adduction of the posterior glottis via the arytenoids (controlled by the singer with the degree of "breathiness" / "pressedness"); and by (b) membranous medialization through vocal fold bulging (controlled by the choice of vocal register, i.e. chest vs. falsetto) [1].

In this study, singing exercises and instructions for adjusting adductory settings (cartilaginous adduction vs. membranous medialization) in the female mid-range were performed by both trained and untrained female classical singers. Phonation was monitored by acoustic recording, electroglottography (EGG) and laryngeal imaging. EGG wavegrams [2], a novel method for displaying EGG signals, were used for data analysis.

EGG wavegram data revealed distinct differences between the targeted phonation types for each individual. The observed differences established themselves as (a) presence/absence of vocal fold contact; (b) duration of vocal fold contact per glottal cycle; (c) changes in the overall EGG signal amplitude; (d) distinctness of opening/closing events; (e) perturbations seen in the wavegrams. Inter-subject data variation suggests that the individual's anatomy influences vocal fold contact in singing. EGG wavegrams proved to be useful in documenting changes of both singing register and glottal adduction.
G7. Research Grant: VOICE -- Vision On Innovation for Choral music in Europe. Research programme on vocal health for amateur singers. European Commission, Education, Audiovisual and Culture Executive Agency (EACEA), March 2012
G6. Dean's Prize. Palacký University Olomouc, Faculty of Science, December 2011. [for the publication Herbst CT, Qiu Q., Schutte HK, Švec JG: Membranous and cartilaginous vocal fold adduction in singing. Journal of the Acoustical Society of America 129(4): 2253-2262 (2011)]
G5. Dean's Prize. Palacký University Olomouc, Faculty of Science, May 2011. [for the best PhD project in Physics]
G3. SEMPRE Conference Award. Society for Education, Music and Psychology Research, May 2006
G2. Promotion Grant ("Förderstipendium"). University Mozarteum, December 2004
G1. ERASMUS Mobility Grant. University Mozarteum, December 2003
Books and book chapters
B4. Christian T. Herbst, Jan G. Svec (submitted/accepted). Biophysics of Vocal Production in Mammals. in: Vertebrate Sound Production and Acoustic Communication. Springer, W. Tecumseh Fitch, Arthur Popper, Rod Suthers Eds.
Most mammals, including humans, produce sound in agreement with the myoelastic-aerodynamic theory (MEAD): by converting aerodynamic energy into acoustic energy via flow-induced self-sustaining oscillation of the vocal folds or other laryngeal tissue. The generated laryngeal sound is filtered by the vocal tract and radiated from the mouth and/or the nose.

In this chapter, some basic biophysical principles of the MEAD theory are explained, mostly based on research done in humans. Empirical evidence and concepts for nonhuman mammals are provided when available and applicable.

In particular, biomechanical properties of vibrating laryngeal tissue and respective vibratory modes are described, and the oscillatory components and forces necessary for flow-induced self-sustaining vibration are discussed. The notions of fundamental frequency and its control, periodicity, and irregularity are explored, followed by a basic description of non-linear phenomena (NLP) such as bifurcations, subharmonics, or chaos. Subglottal pressure and glottal airflow are essential parameters of voice production, and their influence on the generated voice source spectrum is considered. Finally, linear and non-linear effects of the vocal tract are reviewed, and the efficiency of sound production is discussed.
B3. Christian T. Herbst, David Howard, Jan G. Svec (submitted/accepted). The sound source in singing -- basic principles and muscular adjustments for fine-tuning vocal timbre. in: The Oxford Handbook of Singing. Oxford University Press, D. Howard, J. Nix, G. Welch Eds.
B2. Christian T. Herbst, Jan G. Svec (2014). Basics of voice acoustics -- a tutorial. in: Sataloff's Textbook of Otolaryngology. JP medical publishers, Sataloff, R. T.
B1. Christian T. Herbst (2012). Investigation of glottal configurations in singing. Palacký University in Olomouc, the Czech Republic (Doctoral Dissertation)
Other publications
O1. Ben Larson, Christian T. Herbst, Eric Hunter (2013). EGG Wavegram Python Source Code Tutorial. The National Center for Voice and Speech.
Posters (presented at conferences)
P6. Jan G. Svec, Hana Sramkova, Svante Granqvist, Christian T. Herbst (2016). Update on the Recommended Maximum Background Noise Levels for Voice Measurements. 10th International Conference on Voice Physiology and Biomechanics (ICVPB), Universidad Tecnica Federico Santa Maria, Vina del Mar, Chile. March 17, 2016. presented by Christian T. Herbst.
P5. Christian T. Herbst, Hiroki Koda, Takumi Kunieda, Juri Suzuki, Takeshi Nishimura (2016). Electroglottographic assessment of in vivo Japanese Macaque sound production. 10th International Conference on Voice Physiology and Biomechanics (ICVPB), Universidad Tecnica Federico Santa Maria, Vina del Mar, Chile. March 16, 2016.
P4. Christian T. Herbst, W. T. S. Fitch, Jan G. Švec (2010). Wavegrams: A new technique for visualizing vocal fold dynamics noninvasively using electroglottographic signals. COST Action 2103 Summer School - Modeling and Assessment of the Human Voice, Erlangen, Germany. September 2010.
Electroglottography (EGG) is a non-invasive low-cost method to monitor relative vocal fold contact area (VFCA) during phonation. Increase and decrease of VFCA is related to glottal closing and opening, respectively. In this study, a new method for analyzing and displaying EGG signals (and their first derivative, DEGG) is introduced: the electroglottographic wavegram (short: wavegram). It (a) allows monitoring the EGG (or DEGG) signal over time; and (b) provides an intuitive means for quickly assessing the duration of glottal closure and its variation over time.

Based on the EGG or DEGG signal, the time-varying fundamental frequency is calculated and consecutive individual glottal cycles are identified. Each cycle is locally normalized in duration and amplitude and the cycles are then plotted consecutively. The plotting process resembles that of a spectrogram, but instead of spectral amplitudes, the signal deflections are encoded by color intensity. The wavegram presents the time on the x-axis, normalized cycle duration on the y-axis and the signal deflection on the color-intensity-coded z-axis.

The wavegram reveals changes of vocal fold contact duration in time. It also shows phenomena that remain overlooked in traditional EGG-display techniques, such as multiple DEGG peaks. While these phenomena have usually been considered artifacts, the wavegram displays revealed consistent behavior of these peaks in a large number of subjects. They indicate subtle changes of vocal fold oscillatory regime.

Wavegram analysis suggests that the phenomenon of vocal fold closing and opening is more complex than commonly assumed. Rather than a single event, vocal fold opening and closing should be considered a sequence of events, taking place over a certain period of time. Data show that the sequence of these events can change with pitch, loudness and register. The EGG signal thus promises to reveal more (physiological) information on vocal fold closure and opening events than previously thought.
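The construction described in this abstract (identify consecutive cycles, normalize each one in duration and amplitude, then place the cycles side by side) can be sketched as follows. This is an illustrative simplification, not the authors' published implementation. It assumes that the cycle boundaries (`cycle_starts`, sample indices) have already been found by some fundamental-frequency tracker, and it uses plain linear interpolation for the duration normalization. The names `normalize_cycle` and `wavegram` are hypothetical.

```python
def normalize_cycle(cycle, n_points=100):
    """Resample one cycle to n_points samples and scale it to [0, 1]."""
    if len(cycle) < 2 or n_points < 2:
        raise ValueError("need at least two samples and two output points")
    lo, hi = min(cycle), max(cycle)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat cycle
    out = []
    for k in range(n_points):
        # Fractional position of output point k within the original cycle.
        pos = k * (len(cycle) - 1) / (n_points - 1)
        i = int(pos)
        j = min(i + 1, len(cycle) - 1)
        frac = pos - i
        value = cycle[i] + frac * (cycle[j] - cycle[i])  # linear interpolation
        out.append((value - lo) / span)                  # amplitude normalization
    return out


def wavegram(signal, cycle_starts, n_points=100):
    """Return one normalized column per glottal cycle.

    Column c corresponds to time (x-axis); the position within a column is
    the normalized cycle duration (y-axis); the stored value is the
    normalized signal deflection that would be mapped to color intensity.
    """
    columns = []
    for start, end in zip(cycle_starts, cycle_starts[1:]):
        columns.append(normalize_cycle(signal[start:end], n_points))
    return columns


# Two cycles of different length collapse onto the same normalized shape.
signal = [0, 2, 0, 0, 1, 2, 1, 0]
print(wavegram(signal, [0, 3, 8], n_points=3))
```

The resulting list of columns can be handed to any image-plotting routine to obtain the color-coded display; changes in contact duration then appear as vertical shifts of the bright region from one column to the next.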
P3. Josef Schlömicher-Thier, Donald G. Miller, Hubert Noe, Christian T. Herbst (2009). Yodeling - Acoustic and Physiologic Properties. The Voice Foundation's 38th Annual Symposium, June 2009.
Yodelling is sustained phonation with nonsensical combinations of vowels and consonants. It is characterized by drastic timbral changes, caused by (a) abrupt changes of laryngeal mechanism (chest vs. falsetto registers); and (b) typical choice of vowels. The register transitions coincide with relatively large intervallic leaps.

The goal of this study was to better understand physiologic and acoustic properties of yodelling. In particular, the relationship between voice source characteristics and the vocal tract was investigated. Two yodellers (one female, one male), originating from the Austrian regions of Salzburg and Styria, were examined by means of flexible video-endoscopy, electroglottography and recording of acoustic data.

Preliminary results suggest that formant tuning plays an important role in yodelling. It is hypothesized that yodellers intuitively choose certain combinations of fundamental frequency and vowel, in order to facilitate the abrupt changes of laryngeal mechanism that are typical for yodelling.
P2. Christian T. Herbst, Elke Duus, Harald Jers (2009). Voice category assessment of amateur choir singers. 4th International Conference on the Physiology and Acoustics of Singing, January 2009.
The tessitura, i.e. the pitch range that is determined by the music to be sung, is dependent on voice category. Ideally, it lies well within the boundaries of the physiologic voice range. In amateur choir singing however, the individual singer's choice of voice category does not necessarily result in optimal use of vocal potential. This study tries to establish an objective, quantitative method to determine voice category and to highlight unused potential as regards voice range.

21 members of an amateur choir (15 female, 6 male) were examined by means of standard voice range profile (VRP) measurement. In order to collect data on 'habitual' singing, the subjects were also asked to sing a short piece of music of their own choice in a convenient key, tempo and loudness. The tessitura (as determined by the singer's chosen voice category) was compared with (a) the pitch range as determined by the VRP measurement; and (b) the tessitura of the 'habitual' singing. The difference between the upper limit of the tessitura and the highest pitch in the VRP, expressed in semitones, was defined as the 'upper reserve' (UR); the difference between the lower limit of the tessitura and the lowest pitch measured with the VRP was defined as the 'lower reserve' (LR). The 'reserve index' (RI) was defined as the relation between upper and lower reserve [ RI = (UR - LR) / (UR + LR) ].

On average, the sopranos were 10 years older than the altos (overall average age: 49.7 years). The average physiologic voice range was 37.9 semitones (min 31, max 45). Older females had a smaller physiologic voice range, but their habitual singing was generally higher. With the exception of the sopranos, all voice categories had more upper reserve than lower reserve, which is reflected by the average reserve index per voice category: soprano -0.36; other voice categories: 0.11 to 0.54. The reserve index was inversely related to age ( RI = 0.855 - age * 0.013).

In the examined choir, older females sang soprano in a relatively high tessitura (in some cases reaching the upper limit of the physiologic voice range), whereas younger females sang alto in a relatively low tessitura. This sub-optimal situation could be caused by insufficient vocal technique, or it might be explained in a sociologic/social context.
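The reserve measures defined in this abstract can be illustrated with a short calculation. The sketch below reads the reserve index as (upper reserve minus lower reserve) divided by their sum, which matches the reported sign convention (positive values mean more upper than lower reserve). Representing pitches as MIDI note numbers, so that differences come out in semitones, is an assumption made for illustration, and all function names are hypothetical. The regression coefficients are those quoted in the abstract.

```python
def reserves(tessitura, vrp):
    """Return (upper_reserve, lower_reserve) in semitones.

    tessitura -- (lowest, highest) pitch demanded by the voice category
    vrp       -- (lowest, highest) pitch measured in the voice range profile
    Pitches are given as MIDI note numbers (an illustrative assumption).
    """
    upper = vrp[1] - tessitura[1]
    lower = tessitura[0] - vrp[0]
    return upper, lower


def reserve_index(upper, lower):
    """Reserve index: +1 means only upper reserve, -1 only lower reserve."""
    total = upper + lower
    if total == 0:
        raise ValueError("reserve index is undefined when both reserves are zero")
    return (upper - lower) / total


def predicted_reserve_index(age_years):
    """Linear age relation reported in the abstract."""
    return 0.855 - 0.013 * age_years


# Hypothetical singer: tessitura C4..G5 (60..79), VRP A3..E6 (57..88).
upper, lower = reserves((60, 79), (57, 88))
print(upper, lower, reserve_index(upper, lower))
```

For this hypothetical singer the upper reserve is 9 semitones and the lower reserve 3 semitones, so most of the unused range lies above the tessitura.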
P1. Christian T. Herbst, Jan G. Švec, Qingjun Qiu, Harm Schutte (2007). Overall and posterior glottal adduction in singing. 7th Pan European Voice Conference (PEVOC), Groningen, The Netherlands. August 2007.
It is known that glottal adduction can be adjusted both posteriorly by the PCA/LCA/IA muscles and overall by the TA muscles. A previously conducted pilot study on a baritone suggested that independent control of posterior and TA adduction allows greater flexibility in controlling singing voice quality. The goal of this study was to design phonatory exercises to isolate these two types of glottal adduction. Four extreme phonation types were targeted, using the chest and falsetto registers with and without breathiness: a) 'naïve' falsetto (breathy), b) 'resonant falsetto', c) 'light chest' (breathy) and d) 'dramatic/operatic chest'.

6 female and 6 male singers and non-singers were asked to imitate the instructor (i.e. the baritone who participated in the previously conducted pilot study), producing those 4 phonation types at a pitch located within the range of the chest/falsetto register transition (C#4 to F4). In order to maintain the desired registration (chest or falsetto), the target notes were reached by singing a descending (for falsetto) or ascending (for chest) scale of five notes. (The subjects were asked not to 'blend or mix the registers'). The phonation was monitored by videostroboscopy, videokymography (VKG), electroglottography (EGG) and audio recording.

The results showed distinct laryngeal configurations and vocal fold vibration characteristics for the four phonation types. All subjects showed a less adducted posterior glottis in the two breathy phonation types than in the non-breathy phonations. In some cases, the arytenoid processes were clearly vibrating during the breathy phonations. All subjects had mucosal waves and sharp lateral peaks in VKG when phonating in 'dramatic/operatic chest' voice. In 9 subjects, mucosal waves of some degree were found in all phonation types, i.e., even in both falsetto phonations.

The findings of this study suggest that the designed phonatory exercises can be used to produce 4 extreme types of singing voice and to train singers to gain an independent control of the voice register and glottal adduction, making the voice more flexible. The data also showed that the closed quotient can in some subjects achieve larger values in 'resonant falsetto' than in 'light chest' phonations, implying that the closed quotient is not a sole indicator of the voice register in singing.
Conference talks and lectures
C120. Christian T. Herbst (2017). Speech and singing voice assessment with electroglottography. VIII Annual COST related Symposium Copenhagen & VII World Voice Consortium Congress (invited lecture), World Voice Consortium, Copenhagen, Denmark. December 8, 2017.
C119. Mona Kirstin Fehling, Bernhard Schick, Jan G. Svec, Christian T. Herbst, Jörg Lohscheller (2017). Zusammenhang zwischen der Morphologie von Stimmlippentrajektorien und vertikaler Schwingungsdynamik. 34. Wissenschaftliche Jahrestagung der Deutschen Gesellschaft für Phoniatrie und Pädaudiologie (DGPP) - Dreiländertagung D-A-CH, Deutsche Gesellschaft für Phoniatrie und Pädaudiologie e. V., September 15, 2017. presented by Mona Kirstin Fehling.
Background: The objective analysis of endoscopic high-speed video recordings of the vocal folds (VF) is based on an initial segmentation of the VF edges. This allows the motion of both VFs to be described individually, at arbitrary positions along the glottal axis, by trajectories whose time courses differ depending on the vibration pattern. A characteristic feature of trajectory morphology is the instant of maximum excursion, which separates the opening and closing phases and exhibits varying degrees of curvature.

Material and methods: For a cohort of 100 vocally healthy subjects, the degree of trajectory curvature at the instant of maximum excursion is determined along the entire glottal axis by means of a curvature parameter. Regression analysis is used to determine two angle parameters, which quantify the steepness of the opening and closing phases at the instant of maximum excursion. The method is applied to both stationary and non-stationary phonation paradigms.

Results: The analysis of sequences from vocally healthy subjects shows that curvature morphology depends on both sex and frequency. Men exhibit lower curvature on average than women, and curvature decreases with increasing frequency regardless of sex. In addition, the two angle parameters quantifying the steepness of the opening and closing phases differ, with the opening phase showing a steeper course. Where lateral asymmetries occurred, phase shifts between the VF trajectories were also identified, with the phase of the VF with lower curvature leading.

Discussion: Differences in curvature can be attributed to altered lateral dynamics and to a pronounced vertical phase shift in the vibratory dynamics. During segmentation, this can cause the VF edges to be extracted at different lateral positions and thus to jump abruptly. This vertical offset occurs at the instant of maximum VF excursion, which is interpreted as a "trajectory kink" or strong curvature. If the VFs are viewed at a slightly oblique angle during endoscopy, this can additionally lead to a marked phase shift between the left and right vocal fold. These effects must be taken into account in both clinical and computer-assisted assessment, since asymmetries appearing in the trajectories may be artifacts and can be misinterpreted as actual vibratory asymmetries.
C118. Christian T. Herbst, Brian P. Gill (2017). Delineation of Three Main Areas of Voice Pedagogy: Voice Building, Coaching, and Voice Rehabilitation. The Voice Foundation's 46th Annual Symposium: Care of the Professional Voice, The Voice Foundation, Philadelphia, PA. June 4, 2017. presented by Brian P. Gill.
The final keynote panel of the 10th Pan-European Voice Conference (PEVOC) was concerned with the topic "Voice Pedagogy -- What do we need?" In this presentation the panel discussion is summarized and the authors provide a more in-depth discussion of one of the key questions, addressing the roles and tasks of people working with voice students. In particular, a distinction is made between (a) voice building (derived from the German term "Stimmbildung"), primarily comprising the functional and physiological aspects of singing; (b) coaching, mostly concerned with performance skills; and (c) singing voice rehabilitation. Both public and private educators are encouraged to apply this distinction to their curricula, in order to arrive at more efficient singing teaching and to reduce the risk of vocal injury to the concerned singers.

Reference
Gill B.P., Herbst C.T. Voice Pedagogy - What do we need? Logop Phoniatr Vocol. 2016, 41 (4), 168-173
C117. Christian T. Herbst, Harm K. Schutte, Daniel L. Bowling, Jan G. Svec (2017). Comparing chalk with cheese -- The EGG contact quotient is only a limited surrogate of the closed quotient. The Voice Foundation's 46th Annual Symposium: Care of the Professional Voice, The Voice Foundation, Philadelphia, PA. June 1, 2017. presented by Jan G. Svec.
The electroglottographic (EGG) contact quotient (CQegg), an estimate of the relative duration of vocal fold contact per vibratory cycle, is the most commonly used quantitative EGG analysis parameter. The purpose of this study is to quantify the CQegg's relation to the closed quotient, a measure more directly related to glottal width changes during vocal fold vibration and the respective sound generation events.

Thirteen singers (six females) phonated in four extreme phonation types, while independently varying the degree of breathiness and vocal register. EGG recordings were complemented by simultaneous videokymographic (VKG) endoscopy, which allows for calculation of the videokymographic closed quotient (CQvkg). The CQegg was computed using five different algorithms, all used in previous research.

All CQegg algorithms produced CQegg values that clearly differed from the respective CQvkg, with standard deviations around 20 % of cycle duration. The difference between CQvkg and CQegg was generally greater for phonations with lower CQvkg. The largest differences were found for low-quality EGG signals with a signal-to-noise ratio (SNR) below 10 dB, typically stemming from phonations with incomplete glottal closure. Disregarding those low-quality signals, the best match between CQegg and CQvkg was found for a CQegg algorithm operating on the first derivative of the EGG signal.

These results show that the terms "closed quotient" and "contact quotient" should not be used interchangeably. They relate to different physiological phenomena. Phonations with incomplete glottal closure having an EGG SNR below 10 dB are not suited for CQegg analysis.
C116. Christian T. Herbst (2017). Gesangstechnik an der Schnittstelle zwischen Pädagogik und Stimmforschung. Stimmwelten (invited lecture), Universitätsklinik für Hals-, Nasen- und Ohrenkrankheiten, Kopf- und Halschirurgie, Inselspital Bern, Bern, Switzerland. April 29, 2017.
C115. Christian T. Herbst (2017). A review of singing voice sub-system interactions - towards an extended physiological model of "support". Chorusology Symposium (invited lecture), Malta Diocese Catholic Institute, Valletta, Malta. April 21, 2017.
C114. Christian T. Herbst (2017). Voice building, coaching, or therapy? - A delineation of areas of responsibility in voice pedagogy. Chorusology Symposium (invited lecture), Malta Diocese Catholic Institute, Valletta, Malta. April 20, 2017.
C113. Christian T. Herbst (2017). Electroglottographic investigation of primate vocalization. SPIRITS program workshop "Biology and Evolution of Speech" (invited lecture), Kyoto, Japan. February 23, 2017.
C112. Christian T. Herbst (2016). The myoelastic-aerodynamic theory of voice production in humans, mammals and birds. XXIV Pacific Voice Conference (invited lecture), Pacific Voice & Speech Foundation, Warsaw, Poland. October 5, 2016.
C111. Christian T. Herbst (2016). The sound source in singing: What electroglottography (EGG) can tell us about glottal configurations. The Singing Voice Science Workshop (invited lecture), John J. Cali School of Music at Montclair State University, Montclair, NJ. June 9, 2016.
C110. Christian T. Herbst (2016). "Et in tres unum sunt" - Interactions between sound source, vocal tract, and pulmonary system in singing. The Singing Voice Science Workshop (invited lecture), John J. Cali School of Music at Montclair State University, Montclair, NJ. June 8, 2016.
C109. Christian T. Herbst, Jakob Unger, Hanspeter Herzel, Jan G. Svec, Jörg Lohscheller (2016). Phasegram Analysis of Vocal Fold Vibration Documented with Laryngeal High-Speed Video Endoscopy. 45th Annual Symposium: Care of the Professional Voice, The Voice Foundation, June 5, 2016. presented by Jan G. Svec.
Objective. In a recent publication, the phasegram, a bifurcation diagram over time, has been introduced as an intuitive visualization tool for assessing the vibratory states of oscillating systems. Here, this non-linear dynamics approach is augmented with quantitative analysis parameters, and it is applied to clinical laryngeal high-speed video (HSV) endoscopic recordings of healthy and pathologic phonations.

Methods/Design. HSV data from a total of 73 females diagnosed as healthy (n=42), or with functional dysphonia (n=15) or unilateral vocal fold paralysis (n=16), were quantitatively analyzed. Glottal area waveforms (GAW) as well as left and right hemi-GAWs (hGAW) were extracted from the HSV recordings. Based on Poincaré sections through phase space embedded signals, two novel quantitative parameters were computed: The phasegram entropy (PE), and the phasegram complexity estimate (PCE), inspired by signal entropy and correlation dimension computation, respectively.

Results. Both PE and PCE assumed higher average values (suggesting more irregular vibrations) for the pathological as compared to the healthy participants, significantly discriminating the healthy from the paralysis group (p=0.02 for both PE and PCE). Comparisons of individual PE or PCE data for the left and right hGAW within each subject resulted in asymmetry measures for the regularity of vocal fold vibration. The PCE-based asymmetry measure revealed significant differences between the healthy and the paralysis group (p=0.03).

Conclusions. Quantitative phasegram analysis of GAW and hGAW data is a promising tool for the automated processing of HSV data in research and in clinical practice.
C108. Christian T. Herbst, Brian P. Gill (2016). Voice building, coaching, or therapy? - A delineation of areas of responsibility in voice pedagogy. Ars Choralis 2016. 4th International Symposium on Chorusology - Choral Art - Singing - Voice (invited lecture), Croatian Choral Directors Association, Zagreb, Croatia. March 31, 2016.
C107. Christian T. Herbst (2016). A review of singing voice sub-system interactions -- towards an extended physiological model of "support". Ars Choralis 2016. 4th International Symposium on Chorusology - Choral Art - Singing - Voice (invited lecture), Croatian Choral Directors Association, Zagreb, Croatia. March 31, 2016.
C106. Christian T. Herbst, Hiroki Koda, Takumi Kunieda, Juri Suzuki, Maxime Garcia, W. Tecumseh Fitch, Takeshi Nishimura (2016). In vivo assessment of Japanese Macaque sound production using electroglottography. 11th International Conference on the Evolution of Language (Evolang XI), University of Southern Mississippi, New Orleans, Louisiana, USA. March 21, 2016. presented by Takeshi Nishimura.
While the call repertoire of Japanese Macaques (Macaca fuscata) has been described based on acoustic evidence (Green, 1975), little is known about the underlying laryngeal function, mostly due to experimental difficulties in vivo. As an alternative to direct laryngeal observation, vocal fold vibration can be assessed non-invasively with electroglottography (EGG). A low intensity, high-frequency current is passed between two electrodes placed on each side of the larynx. The admittance variations resulting from vocal fold (de)contacting during laryngeal sound production are largely proportional to the time-varying relative vocal fold contact area (Hampala et al., 2015).

Here, we present the results of a pilot study performed with a female Japanese Macaque who was trained to vocalize upon a visual stimulus. A total of 369 "coo" calls, 17 "grunts", and 5 "chirps" were documented with SPL-calibrated microphone signals and simultaneous EGG recordings. In the coos and the grunts, an EGG signal with cyclic content corresponding to the microphone signal was found. The absence of an EGG trace for the high-frequency chirps might have been caused by a low-pass filter in the EGG device hardware.

26 recorded calls contained transitions between coos and grunts, and the EGG evidence suggests that the transitions between the individual call types regularly occurred during as little as one to five vibratory cycles. This suggests that the coos and the grunts constitute distinct laryngeal mechanisms (comparable to "registers" in human singing), potentially generated by the same vibrating structures. Excised larynx experiments are warranted to test this hypothesis, also investigating the potential influence of the species' vocal membranes.

References:
Green, S. (1975). "Variation of Vocal Pattern with Social Situation in the Japanese Monkey (Macaca fuscata): A Field Study," in Primate Behaviour. Developments in Field and Laboratory Research, edited by L. A. Rosenblum (Academic Press, New York), pp. 1-102.
Hampala, V., Garcia, M., Svec, J. G., Scherer, R. C., and Herbst, C. T. (2015). "Relationship between the Electroglottographic Signal and Vocal Fold Contact Area," Journal of Voice in press.


C105. Coen Elemans, Jeppe Have Rasmussen, Christian T. Herbst, Daniel Düring, Sue Anne Zollinger, Henrik Brumm, Kyle Srivastava, Niels Svane, Ming Ding, Ole Larsen, Samuel Sober, Jan G. Svec (2016). Universal mechanisms of sound production and control in birds and mammals. 10th International Conference on Voice Physiology and Biomechanics (ICVPB), Universidad Tecnica Federico Santa Maria, Vina del Mar, Chile. March 15, 2016. presented by Christian T. Herbst.
C104. Christian T. Herbst (2015). Monitoring the mammalian and avian sound source with electroglottography. IBAC 2015, XXV International Bioacoustics Congress (invited lecture), International Bioacoustics Council, Murnau, Germany. September 7, 2015.
C103. Christian T. Herbst (2015). Elephant on the bench - ex-vivo investigation of mammalian sound production. Séminaire du département Parole et Cognition (invited lecture), GIPSA-lab, Grenoble, France. June 11, 2015.
C102. Christian T. Herbst, Vit Hampala, Maxime Garcia, Ronald C. Scherer, Jan G. Svec (2015). Electroglottography and Direct Measurement of Vocal Fold Contact Area -- a High-Speed Video Update. 44th Annual Symposium: Care of the Professional Voice, The Voice Foundation, May 27, 2015.
Objective. Electroglottography (EGG) is a popular non-invasive method that purports to measure changes in relative vocal fold contact area (VFCA) during phonation. Despite its broad application, the putative direct relation between the EGG waveform and the VFCA has to date only been formally tested in a single study (Scherer et al., 1988), suggesting an approximately linear relationship between VFCA and the EGG signal magnitude. However, in that study flow-induced vocal fold vibration was not investigated. A rigorous empirical evaluation of EGG as a measure of relative vocal fold contact area under proper physiological conditions is therefore still needed.

Methods/Design. In order to address this issue, three red deer larynges were phonated in an excised hemi-larynx preparation utilizing a conducting glass plate. The time-varying contact between the vocal fold and the glass plate was assessed by high-speed video recordings made in the sagittal plane at 6000 fps, synchronized to the EGG signal (+/- 0.167 ms).
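
The quoted synchronization tolerance appears consistent with the duration of a single video frame at 6000 fps. A minimal sketch of that arithmetic follows; the function name is illustrative and not taken from the study.

```python
def frame_period_ms(frames_per_second: float) -> float:
    """Return the duration of one video frame in milliseconds."""
    return 1000.0 / frames_per_second


# One frame at 6000 fps lasts about 0.167 ms.
print(round(frame_period_ms(6000), 3))
```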

Results and Conclusions. In the contacting phase, the EGG waveform systematically preceded the measured VFCA. The average difference between the normalized [0..1] VFCA and EGG data in the three larynges was 0.180 (+/- 0.156), 0.075 (+/- 0.115) and 0.168 (+/- 0.184) in the contacting phase, and 0.159 (+/- 0.112), -0.003 (+/- 0.029) and 0.004 (+/- 0.032) in the de-contacting phase. In the de-contacting phase, there was thus good agreement between VFCA and the EGG waveform in two out of three larynges. Disagreements between the VFCA and EGG waveforms could have been caused by errors in data normalization, electrode placement, anisotropic conductance properties of the vocal folds, and possible effects of electroglottograph hardware circuitry. Pending further research to clarify the issue, quantitative EGG data should be interpreted cautiously, allowing for potential errors.
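
The comparison of normalized waveforms described above can be sketched as follows. This is an illustration only: min-max normalization and the sign convention (EGG minus VFCA) are assumptions, and the function names are hypothetical rather than the study's own analysis code.

```python
import numpy as np


def normalize01(x):
    """Min-max normalize a waveform to the range [0, 1]."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())


def mean_difference(egg, vfca):
    """Return mean and standard deviation of the normalized EGG - VFCA difference.

    Assumption: both waveforms cover the same phase and are sampled at the
    same instants; the sign convention is chosen for illustration.
    """
    diff = normalize01(egg) - normalize01(vfca)
    return float(diff.mean()), float(diff.std())
```

A positive mean would then indicate that the normalized EGG signal lies above the normalized contact area within the analyzed phase.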
C101. Christian T. Herbst, Markus Hess, Frank Müller, Jan G. Svec, Johan Sundberg (2015). Glottal Adduction and Subglottal Pressure in Singing. 44th Annual Symposium: Care of the Professional Voice, The Voice Foundation, May 27, 2015.
Previous research suggests that independent variation of vocal loudness and glottal configuration (type and degree of vocal fold adduction) does not occur in untrained speech production. This study investigated whether these factors can be varied independently in trained singing, and how changes of subglottal pressure are related to changes of average glottal airflow, voice source properties and sound level under these conditions.

A classically trained baritone produced sustained phonations on the endoscopic vowel [i:] at pitch D4 (approx. 294 Hz), exclusively varying either (a) vocal register; (b) phonation type (from "breathy" to "pressed" via cartilaginous adduction); or (c) vocal loudness, while keeping the others constant. Phonation was documented by simultaneous recording of videokymographic, electroglottographic, airflow and voice source data, and by percutaneous measurement of relative subglottal pressure.
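
The pitch-to-frequency correspondence quoted above (D4, approx. 294 Hz) can be checked with a short sketch. It assumes twelve-tone equal temperament with A4 tuned to 440 Hz; the abstract does not state the tuning reference, and the helper names are illustrative.

```python
# Semitone offsets of the note names within one octave (C = 0).
NOTE_OFFSETS = {
    "C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
    "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11,
}


def note_to_hz(name: str, octave: int, a4_hz: float = 440.0) -> float:
    """Convert a note name and octave to frequency (equal temperament)."""
    midi = 12 * (octave + 1) + NOTE_OFFSETS[name]  # A4 is MIDI note 69
    return a4_hz * 2.0 ** ((midi - 69) / 12)


print(round(note_to_hz("D", 4), 1))
```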

Register shifts were clearly marked in the EGG wavegram display. As compared with chest register, falsetto was produced with greater pulse amplitude of the glottal flow, H1-H2, mean airflow, and with lower MFDR, subglottal pressure, and sound pressure. Shifts of phonation type (breathy/flow/neutral/pressed) induced comparable systematic changes. Increase of vocal loudness resulted in increased subglottal pressure, average flow, sound pressure, MFDR, glottal flow pulse amplitude and H1-H2.

When changing either vocal register or phonation type, subglottal pressure and mean airflow showed an inverse relationship, indicating a variation of glottal flow resistance. The direct relation between subglottal pressure and flow when varying only vocal loudness demonstrated independent control of vocal loudness and glottal configuration. Achieving such independent control of phonatory parameters would be an important target in vocal pedagogy and in voice therapy.
C100. Christian T. Herbst, Jan G. Svec (2015). Electroglottography -- a high-speed video update. 11th International Conference on Advances in Quantitative Laryngology, Voice and Speech Research (AQL), The Royal National Throat, Nose and Ear Hospital, London, UK. April 8, 2015.
C99. Christian T. Herbst (2015). Effizienz der Tonproduktion in Sprache und Gesang. 4. Jahrestagung - Symposium 'FIT ON STAGE' (invited lecture), Österr. Gesellschaft für Musikermedizin (ÖGfMM), Vienna, Austria. March 21, 2015.
C98. Maxime Garcia, Markus Boeckle, Christian T. Herbst, Bruno Gingras, Yann Locatelli, W. Tecumseh Fitch (2014). Call classification design of the Wild Boar (Sus scrofa) complex vocalization system. VII European Conference on Behavioural Biology (ECBB), Czech and Slovak Ethological Society, Prague, Czech Republic. July 18, 2014. presented by Maxime Garcia.
C97. Christian T. Herbst, Jinook Oh, Jitka Vydrova, Jan G. Svec (2014). DigitalVHI -- a Multi-Lingual Freeware Software Application to Capture Voice Handicap Index Data. 43rd Annual Symposium: Care of the Professional Voice, The Voice Foundation, May 31, 2014.
The voice handicap index (VHI) is a questionnaire to quantify the functional, physical and emotional impacts of a voice disorder on a patient's quality of life [1]. The VHI has been used in numerous studies as an indicator for finding evidence of voice disorders, and as a retrospective test of the outcome of clinical interventions.

Despite the widespread use of the tool, to the best of our knowledge no software is available for the computer-aided capture of VHI data. Such software is needed to store the questionnaire results electronically, to calculate the final scores automatically, and to facilitate handling the data for clinical studies.

Here, we introduce DigitalVHI, a freeware open source software application to capture Voice Handicap Index data [2]. Both a Mac OS X and a Microsoft Windows version, as well as the original Python source code are available for download at http://www.christian-herbst.org/DigitalVHI/

DigitalVHI consists of a simple user interface, which has successfully been tested over a period of two years in the voice clinic led by author JV. The final result of each questionnaire data acquisition is saved as a PDF file, and the collected data is appended to a file in CSV format (data can then be imported to OpenOffice, Microsoft Excel, R, SPSS, etc., for further processing). To maximize data security of sensitive patient data, no internet connection is required to run the software. The DigitalVHI user interface (including all questionnaire data) can be easily translated to any language by creating additional language packs.
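
The score calculation and CSV export described above might look roughly like the following sketch. It is not DigitalVHI's actual code: the function names, the column layout and the input format are hypothetical. It assumes the standard VHI layout of three subscales (functional, physical, emotional) with ten items each, every item rated from 0 to 4.

```python
import csv
from pathlib import Path

SUBSCALES = ("functional", "physical", "emotional")


def score_vhi(answers):
    """Sum the item ratings per subscale and overall.

    answers: dict mapping each subscale name to a list of ten ratings (0-4).
    """
    scores = {}
    for name in SUBSCALES:
        items = answers[name]
        if len(items) != 10 or any(not 0 <= a <= 4 for a in items):
            raise ValueError(f"{name}: expected ten ratings between 0 and 4")
        scores[name] = sum(items)
    scores["total"] = sum(scores[name] for name in SUBSCALES)
    return scores


def append_to_csv(path, patient_id, scores):
    """Append one result row, writing a header first if the file is new."""
    path = Path(path)
    is_new = not path.exists()
    with path.open("a", newline="") as handle:
        writer = csv.writer(handle)
        if is_new:
            writer.writerow(["patient_id", *SUBSCALES, "total"])
        writer.writerow(
            [patient_id, *(scores[name] for name in SUBSCALES), scores["total"]]
        )
```

Appending rows to a single CSV file keeps the data importable into spreadsheet or statistics software without requiring any network access.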

Acknowledgement: This project has been co-financed by the European Social Fund and the state budget of the Czech Republic within the project no. CZ.1.07/2.3.00/30.0004 "POST-UP" (CH, JGS) and the projects no. CZ.1.07/2.4.00/17.0009 and CZ 1.07/2.3.00/20.0057 (JV, JGS).

References:

[1] B. Jacobson, et al., "The voice handicap index (VHI): development and validation," J.Speech-Lang.Path., vol. 6, pp. 66-70, 1997.
[2] C. T. Herbst, et al., "DigitalVHI -- a freeware open-source software application to capture the Voice Handicap Index and other questionnaire data in various languages," Logoped Phoniatr Vocol, Sep 19 2013 (early online, doi: 10.3109/14015439.2013.830769).

C96. Christian T. Herbst, Hanspeter Herzel, Jan G. Svec, Megan Wyman, W. Tecumseh Fitch (2014). Visualizing voice dynamics with phasegrams. 43rd Annual Symposium: Care of the Professional Voice, The Voice Foundation, May 31, 2014.
"Normal" voice production is characterized by (nearly) periodic vocal fold vibration. Deviations from periodicity, in the form of subharmonic or irregular oscillations, occur either inside the voice's normal range of operation (e.g., in some singing styles or in certain mammalian vocalizations) or outside it (in most voice pathologies). In order to assess these different oscillatory states, a novel tool for visualization and analysis of voice dynamics "on the way to chaos" is introduced: the phasegram [1, 2].

Phasegrams combine the advantages of sliding-window analysis (such as the spectrogram) with well-established visualization techniques from the domain of non-linear dynamics. In a phasegram, time is mapped onto the x-axis, and various vibratory regimes, such as periodic oscillation, subharmonics or chaos, are identified within the generated graph by the number and stability of horizontal lines.

A phasegram can be interpreted as a bifurcation diagram in time. In contrast to other analysis techniques, it can be automatically constructed from time-series data alone: no additional system parameter needs to be known. Phasegrams show great potential for signal classification and can act as the quantitative basis for further analysis of oscillating systems in many scientific fields, such as physics (particularly acoustics), biology or medicine. The phasegram's usefulness for voice analysis will be demonstrated by analyzing electroglottographic (EGG) signals of excised larynx experiments, singing and pathologic voice production.
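
To illustrate the general idea of combining sliding-window analysis with a phase-space view, here is a strongly simplified sketch. It is not the published phasegram algorithm: the time-delay embedding, the choice of Poincare section (upward zero crossings), the histogram binning and all function names are assumptions made for illustration.

```python
import numpy as np


def poincare_points(window, delay):
    """Return the delayed-coordinate values at upward zero crossings.

    The window is embedded in a 2-D phase space (x(t), x(t + delay)); the
    section used here is the line x = 0, crossed from below.
    """
    centered = np.asarray(window, dtype=float) - np.mean(window)
    x, y = centered[:-delay], centered[delay:]
    idx = np.where((x[:-1] < 0) & (x[1:] >= 0))[0]
    # Linear interpolation to the crossing instant between two samples.
    frac = -x[idx] / (x[idx + 1] - x[idx])
    return y[idx] + frac * (y[idx + 1] - y[idx])


def phasegram(signal, window_len, hop, delay, bins=64):
    """Stack one histogram of section points per window (rows: bins, columns: time)."""
    signal = np.asarray(signal, dtype=float)
    span = np.max(np.abs(signal - signal.mean()))
    columns = []
    for start in range(0, len(signal) - window_len + 1, hop):
        points = poincare_points(signal[start:start + window_len], delay)
        hist, _ = np.histogram(points, bins=bins, range=(-span, span))
        columns.append(hist)
    return np.array(columns).T


# Synthetic test signals: a periodic tone and one with an added subharmonic.
fs = 44100
t = np.arange(8820) / fs  # 0.2 s
periodic = np.sin(2 * np.pi * 100 * t)
subharmonic = periodic + 0.5 * np.sin(2 * np.pi * 50 * t)
delay = fs // 400  # roughly a quarter period of 100 Hz
```

In this sketch, the periodic signal yields section points that collapse onto a single value (one horizontal line in the stacked image), whereas the subharmonic signal alternates between two values (two lines).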
C95. Shaheen N. Awan, Andrew R Krauss, Christian T. Herbst (2014). An Examination of the Relationship Between Electroglottographic (EGG) Contact Quotient, EGG Decontacting Phase Profile, and Acoustical Spectral Moments. 43rd Annual Symposium: Care of the Professional Voice, The Voice Foundation, May 29, 2014. presented by Shaheen Awan.
C94. Christian T. Herbst, Jörg Lohscheller, Jan G. Svec, Nathalie Henrich Bernardoni, Gerald Weissengruber, W. Tecumseh Fitch (2014). Electroglottographic and super high-speed video investigation of glottal opening and closing events. 43rd Annual Symposium: Care of the Professional Voice, The Voice Foundation, May 29, 2014.
Previous research has suggested that the peaks in the first derivative (dEGG) of the electroglottographic signal (EGG) are good approximate indicators of the events of glottal opening and closing. These findings were based on high-speed video (HSV) recordings with frame rates ten times lower than the sampling frequencies of the corresponding EGG data. The current study attempts to corroborate these previous findings, utilizing super-HSV recordings.
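
The conventional approximation that the study scrutinizes, locating glottal closing and opening at the dEGG peaks, can be sketched as follows. The synthetic cycle shape and the function name are hypothetical, and the sketch assumes an EGG signal oriented so that larger values mean more vocal fold contact.

```python
import numpy as np


def degg_peak_times(egg, sample_rate):
    """Return (closing_time, opening_time) in seconds for one glottal cycle."""
    degg = np.gradient(np.asarray(egg, dtype=float)) * sample_rate
    closing_time = np.argmax(degg) / sample_rate  # steepest rise in contact
    opening_time = np.argmin(degg) / sample_rate  # steepest fall in contact
    return closing_time, opening_time


# Synthetic single cycle: fast contacting at 1 ms, slower de-contacting at 4 ms,
# sampled at 44 kHz (352 samples = 8 ms).
fs = 44000
t = np.arange(352) / fs
egg = 0.5 * (np.tanh((t - 0.001) / 1e-4) - np.tanh((t - 0.004) / 3e-4))
closing, opening = degg_peak_times(egg, fs)
```

Because contacting and de-contacting extend over an interval rather than a single instant, such peak times are best read as approximate markers only.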

The HSV and EGG recordings (sampled at 27 kHz and 44 kHz, respectively) of excised canine larynx vocalization were synchronized by an external TTL signal to within 0.037 ms. Data were analyzed by means of EGG, dEGG, the glottal area waveform, digital kymograms, glottovibrograms, and the vocal fold contact length (VFCL), a new parameter representing the time-varying degree of "zippering" closure along the anterior-posterior (A-P) glottal axis.

The temporal offsets between glottal events (depicted in the HSV recordings) and dEGG peaks in the opening and closing phase of glottal vibration ranged from 0.02 to 0.61 ms, amounting to 0.24 -- 10.88 % of the respective glottal cycle durations. All dEGG double peaks coincided with vibratory A-P phase differences. In two out of the three analyzed video sequences, peaks in the first derivative of the VFCL coincided with dEGG peaks, again co-occurring with A-P phase differences.

The findings suggest that dEGG peaks do not always coincide with the events of glottal closure and initial opening. Vocal fold contacting and de-contacting do not occur at infinitesimally small instants of time, but extend over a certain interval, particularly under the influence of A-P phase differences [1].


Acknowledgements: This research was supported by an ERC Advanced grant no. 230604 'SOMACCA' (C.T.H.), a start-up grant from the University of Vienna (W.T.F.), the European Social Fund Project OP VK CZ.1.07/2.3.00/20.0057 (J.G.S.) and by the DFG grant LO1413/2-2 (J.L.).


Reference:

[1] C. T. Herbst, et al., "Glottal opening and closing events investigated by electroglottography and super-high-speed video recordings," J Exp Biol (accepted).
C93. Christian T. Herbst (2014). Same, but different - physical aspects of mammalian sound production. COSB Seminar (invited lecture), Center for Organismal Systems Biology, Faculty of Life Sciences, University of Vienna, Vienna, Austria. May 19, 2014.
C92. Christian T. Herbst (2014). Phasegrams - a novel method for visualizing oscillations in non-linear systems. Seminar, Dept. of Biophysics, Palacky University, Olomouc, Czech Republic. May 15, 2014.
C91. Christian T. Herbst (2014). Freddie Mercury - Acoustical Voice Analysis. 5th Czech-Slovak Symposium on ART VOICE (invited lecture), Clinic of Otolaryngology, Faculty of Medicine, Comenius University, Bratislava, Slovakia. May 10, 2014.
C90. Christian T. Herbst (2014). Electroglottography -- a low-cost method to non-invasively assess vocal fold vibration. 5th Czech-Slovak Symposium on ART VOICE (invited lecture), Clinic of Otolaryngology, Faculty of Medicine, Comenius University, Bratislava, Slovakia. May 9, 2014.
C89. Christian T. Herbst (2014). Electroglottography -- a low-cost method to non-invasively assess vocal fold vibration. 3. International Symposium on Chorusology - Choral Art - Singing - Voice (invited lecture), Croatian Choral Directors Association, Zagreb, Croatia. April 25, 2014.
C88. Christian T. Herbst (2014). Mutation und Klang - Physiologische Hintergründe und Rahmenbedingungen. Tag der Kinder- & Jugendsingstimme Salzburg (invited lecture), Universität Mozarteum, Salzburg, Austria. April 5, 2014.
C87. Christian T. Herbst (2014). Assessment of vocal fold vibration with videokymography and related techniques. 1st conference of POST-UP, Palacky University, Olomouc, Czech Republic. January 22, 2014.
C86. Christian T. Herbst (2014). The Phasegram - a new method for visualizing system dynamics. 2nd NYU International Symposium VOICE SOURCE CHARACTERISTICS: Methods and Discoveries (invited lecture), NYU Steinhardt School of Culture, Education and Human Development, Department of Music and Performing Arts Professions, New York. January 11, 2014.
C85. Christian T. Herbst (2013). Vom Knabensopran zur Männerstimme - eine Momentaufnahme im Stimmbruch. Grazer Stimmtage (invited lecture), Hals-, Nasen-, Ohren-Universitätsklinik, Klinische Abteilung für Phoniatrie, Graz, Austria. November 15, 2013.
C84. Christian T. Herbst (2013). Brunftzeit! - Der Sexualdimorphismus bei Vokalisationen von Tier und Mensch. Grazer Stimmtage (invited lecture), Hals-, Nasen-, Ohren-Universitätsklinik, Klinische Abteilung für Phoniatrie, Graz, Austria. November 15, 2013.
C83. Christian T. Herbst (2013). Mythos "Primärklang"? - Physiologische Möglichkeiten zur Beeinflussung der Klangfarbe im Gesang. Fachtagung "Musikpädagogik konkret" (invited lecture), Hochschule für Musik und Darstellende Kunst, Frankfurt, Germany. October 18, 2013.
C82. Christian T. Herbst (2013). Angewandte Stimmphysiologie und Akustik. CAS Singstimme -- Fehlfunktionen erkennen, abbauen, vermeiden (invited lecture), Hochschule der Künste Bern, Bern, Switzerland. October 12, 2013.
The theoretical part presents the basic acoustic and physiological principles of voice production (speech and voice). In the practical part, work is done with test subjects (diagnosis and didactic work) who are considered "borderline cases" between pedagogy and pathology. An approach is conveyed that focuses on the physiological parametrization of the impaired speaking and singing voice. This module is intended as a bridge between phoniatrics, speech therapy and (singing) didactics.
C81. Christian T. Herbst (2013). Physiologische Wechselwirkungen zwischen Glottiskonfiguration und Sängeratmung. Symposium "Vom Atem zum Gesang" (invited lecture), Austrian Voice Institute, EVTA Austria, Salzburg, Austria. September 28, 2013.
From both a physiological and a voice-pedagogical perspective, the respiratory system cannot be considered and treated in isolation from the vocal apparatus. The influence of the laryngeal configuration on the respiratory airflow and the resulting voice timbre was already described in the mid-19th century by Manuel Garcia in his "Traité complet de l'art du chant". In addition, the magnitude of the subglottal pressure directly affects the quality of vocal fold vibration and thus also influences the voice timbre. In this lecture, these physiological interactions are explained on the basis of current findings from voice science, and pedagogical recommendations are derived from them where possible.
C80. Roland Frey, Elena Volodina, Ilya Volodin, David Reby, Megan Wyman, Christian T. Herbst, Angela S. Stoeger, W. Tecumseh Fitch (2013). The anatomy of low frequency vocalization in mammals. XXIV International Bioacoustics Congress, International Bioacoustics Council (IBAC), Pirenopolis, Brazil. September 10, 2013. presented by Roland Frey.
C79. Christian T. Herbst (2013). Voice pedagogy: What do we need?. 10th Pan European Voice Conference (PEVOC) (invited lecture), Palacky University, Olomouc, Czech Republic. August 23, 2013.
C78. David Howard, Jenevora Williams, Christian T. Herbst (2013). 'Ring' in the solo child's singing voice. 10th Pan European Voice Conference (PEVOC), Palacky University, Olomouc, Czech Republic. August 22, 2013. presented by David Howard.
C77. Maxime Garcia, Christian T. Herbst, Bruno Gingras, Markus Boeckle, Yann Locatelli, W. Tecumseh Fitch (2013). Call classification design of the Wild Boar (Sus scrofa) complex vocalization system. 10th Pan European Voice Conference (PEVOC) (invited lecture), Palacky University, Olomouc, Czech Republic. August 22, 2013. presented by Maxime Garcia.
Wild boars live in complex social systems in which individuals interact intensively using multicomponent communication signals such as olfactory and acoustic cues.

Possibly related to their complex vocal tract anatomy, characterized by two pairs of vocal folds, wild boar vocalizations are very diverse, and their heterogeneity was reported in an empirical study by Klingholz et al. (1979). That analysis, however, had no statistical support and relied mainly on visual inspection and manual measurement of the parameters generally used in bioacoustic studies at the time. Owing to technical advances and a deeper knowledge of the physical properties of sound, this classification could now potentially be validated, or improved, based on a more objective, "hands-off" signal analysis and statistical approach.

Here, following an initial visual inspection and computer-aided extraction of acoustical parameters, we applied several multivariate analysis approaches, which have proven useful in identifying vocal repertoires in various species, to the resulting dataset. We attempted to establish, by comparison, which classification method is the most appropriate, based on objectivity and repeatability of the measurements.

Quantification and structural characterization of the wild boar vocal repertoire is crucial to a better understanding of this species' acoustic communication. This study can provide a solid foundation for further investigation of the production mechanisms (Excised Larynx Experiments), functionality (Playback Experiments), geographical variation, as well as social relevance and transmission of these acoustic signals. Eventually this will help identify the context and selection pressures that drove the emergence of such vocal displays.
C76. Adam Novozamsky, Jiri Sedlar, Christian T. Herbst, Jan G. Svec, Barbara Zitova, Jan Flusser (2013). VKFD: Computerized analysis of videokymographic data. 10th Pan European Voice Conference (PEVOC), Palacky University, Olomouc, Czech Republic. August 22, 2013. presented by Adam Novozamsky.
C75. Christian T. Herbst, Jan G. Svec, Jörg Lohscheller, Roland Frey, Michaela Gumpenberger, Angela S. Stoeger, W. Tecumseh Fitch (2013). Super Size Me! -- Vibratory Characteristics of an Elephant Larynx. 10th Pan European Voice Conference (PEVOC) (invited lecture), Palacky University, Olomouc, Czech Republic. August 22, 2013.
Elephants are the largest land-based mammals. Their low-frequency vocalizations in the infrasonic range (fundamentals below 20 Hz) have been hypothesized to be produced by either of two fundamentally different sound production mechanisms: (a) by a regular pattern of successive EMG bursts (e.g. 20-30 Hz for cat purrs) resulting in consecutive active muscle contractions (AMC); or (b) by flow-induced self-sustaining oscillations in accordance with the myoelastic-aerodynamic (MEAD) theory of sound production.
In a recent publication the author and collaborators have documented self-sustaining, flow-induced vocal fold oscillations in an excised elephant larynx (Loxodonta africana), thus rejecting the AMC mechanism as a plausible cause for elephant infrasound vocal production. Rather, sounds were produced in a manner directly paralleling human speech or song.

Here, a more detailed analysis of the vibratory phenomena seen in the excised elephant larynx is presented. Vocal fold oscillation occurred with a wide variety of vibratory modes, including periodic and complex subharmonic regimes, as well as irregular patterns typically seen in deterministic chaos. Phase delays along the inferior-superior and anterior-posterior (A-P) dimension were commonly observed, as well as travelling wave patterns along the A-P dimension, as yet not documented in the literature. These phenomena might have been facilitated by the large dimensions of the elephant vocal folds (length: 104 mm, thickness: 32 mm). The vestibular folds, when adducted, participated in the tissue vibration, effectively increasing the generated sound pressure level by 12 dB.
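
For orientation, the reported 12 dB gain in sound pressure level can be converted into a ratio of sound pressure amplitudes using the standard relation of 20 times the base-10 logarithm of the pressure ratio. The function name below is illustrative.

```python
def db_to_pressure_ratio(level_change_db: float) -> float:
    """Convert a change in sound pressure level (dB) to a pressure amplitude ratio."""
    return 10.0 ** (level_change_db / 20.0)


# A 12 dB increase corresponds to roughly a fourfold pressure amplitude.
print(round(db_to_pressure_ratio(12.0), 2))
```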

In conclusion, the same basic physical principles of voice production apply to mammals of various sizes (e.g., bats, humans, elephants), suggesting that the myoelastic-aerodynamic theory extends across a remarkably wide range of body sizes and vocal frequencies (more than four orders of magnitude). The elephant larynx is, however, not simply a linearly scaled version of the human model, thus giving rise to a range of vibratory phenomena not regularly seen in non-pathologic human phonation.
C74. Laura Enflo, Christian T. Herbst, Johan Sundberg, Anita McAllister (2013). Comparing vocal fold contact criteria derived from electroglottographic and acoustic signals. 10th Pan European Voice Conference (PEVOC), Palacky University, Olomouc, Czech Republic. August 22, 2013. presented by Laura Enflo.
C73. Christian T. Herbst, Svante Granqvist (2013). Voice acoustics, microphones, recording and computers. One-Day Crash Course on 'Voice' (invited lecture), European Academy of Voice, August 21, 2013.
C72. Christian T. Herbst, Angela S. Stoeger, Roland Frey, Jörg Lohscheller, Ingo R. Titze, Michaela Gumpenberger, W. Tecumseh Fitch (2013). Sound production mechanism in elephant infrasound vocalizations. Annual Meeting of the Society of Experimental Biology (invited lecture), Valencia, Spain. July 3, 2013.

The sound production of most mammals can be explained by one of two fundamentally different sound production mechanisms: According to the myoelastic-aerodynamic (MEAD) theory of sound production, the primary sound source is generated by flow-induced self-sustaining oscillations of the vocal folds. In an alternative mechanism, sound is created by active muscle contractions (AMC). Here, a regular pattern of successive EMG bursts (e.g. 20--30 Hz for cat purrs) causes the intrinsic laryngeal muscles to modulate the respiratory airflow. See Fig. 1A for body mass and fundamental frequency data of selected mammals producing either MEAD or AMC driven vocalizations.
Elephants are the largest land mammals. They produce low-frequency vocalizations in the infrasonic range (fundamentals below 20 Hz). Both AMC and MEAD have been suggested in the literature as sound production mechanisms, but to date no physiologic evidence for either case has been produced.
Using high-speed video, acoustic and electroglottographic recordings, we documented flow-induced, self-sustaining oscillations of the vocal folds of an excised elephant larynx (Loxodonta africana) at fundamental frequencies below 20 Hz (Fig. 1B and C). We also observed a range of nonlinear phenomena, which are directly comparable to those documented in humans and other mammals. Due to the absence of any neural signals in the excised larynx setup, the AMC mechanism can be rejected for elephant infrasound vocal production. Rather, sounds are produced in a manner directly paralleling human speech or song.
We conclude that the same physical principles of voice production apply to mammals of various sizes (e.g., bats, humans, elephants), and that the MEAD theory extends across a remarkably wide range of body sizes and vocal frequencies (more than four orders of magnitude).
C71. Christian T. Herbst, W. Tecumseh Fitch, Jörg Lohscheller, Jan G. Svec (2013). Estimation of the vertical glottal shape based on empirical high-speed video and electroglottographic data. 10th International Conference on Advances in Quantitative Laryngology, Voice and Speech Research, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio. June 3, 2013.
C70. Christian T. Herbst, Angela S. Stoeger, Roland Frey, Jörg Lohscheller, Ingo R. Titze, Michaela Gumpenberger, W. Tecumseh Fitch (2013). Sound production mechanism in elephant infrasound vocalizations. The Voice Foundation's 42nd Annual Symposium: Care of the Professional Voice, Philadelphia, PA. May 30, 2013.
The sound production of most mammals can be explained by one of two fundamentally different sound production mechanisms: According to the myoelastic-aerodynamic (MEAD) theory of sound production, the primary sound source is generated by flow-induced self-sustaining oscillations of the vocal folds. In an alternative mechanism, sound is created by active muscle contractions (AMC). Here, a regular pattern of successive EMG bursts (e.g. 20--30 Hz for cat purrs) causes the intrinsic laryngeal muscles to modulate the respiratory airflow.

Elephants are the largest land mammals. They produce low-frequency vocalizations in the infrasonic range (fundamentals below 20 Hz). Both AMC and MEAD have been suggested in the literature as sound production mechanisms, but to date no physiologic evidence for either case has been produced.

Using high-speed video, acoustic and electroglottographic recordings, we documented flow-induced, self-sustaining oscillations of the vocal folds of an excised elephant larynx (Loxodonta africana) at fundamental frequencies below 20 Hz. We also observed a range of nonlinear phenomena, which are directly comparable to those documented in humans and other mammals. Due to the absence of any neural signals in the excised larynx setup, the AMC mechanism can be rejected for elephant infrasound vocal production. Rather, sounds are produced in a manner directly paralleling human speech or song.

We conclude that the same physical principles of voice production apply to mammals of various sizes (e.g., bats, humans, elephants), and that the myoelastic-aerodynamic theory extends across a remarkably wide range of body sizes and vocal frequencies (more than four orders of magnitude).
____

Reference: C. T. Herbst, A. Stoeger, et al., "How Low Can You Go? Physical Production Mechanism of Elephant Infrasonic Vocalizations," Science, vol. 337, pp. 595-599, 2012.
C69. Christian T. Herbst (2013). Physiologische Grundlagen der Stimmproduktion. 11th World Voice Day (invited lecture), Austrian Voice Institute, Salzburg, Austria. April 15, 2013.
C68. Christian T. Herbst (2013). Vortrag und Praxisdemonstration: Physiologische Grundlagen der Stimmbildung und stimmphysiologische Diagnostik im Gesangsunterricht. Symposium Kinderchorleitung (invited lecture), Universität der Künste, Berlin, Germany. April 13, 2013.
C67. Christian T. Herbst (2013). Assessment of vocal fold vibration with videokymography and related techniques. Seminar, Dept. of Biophysics, Palacky University, Olomouc, Czech Republic. April 3, 2013.
C66. Christian T. Herbst (2013). Of elephants and men - common denominators in mammalian voice production. Seminar (invited lecture), Institute of Biology, University of Southern Denmark, Odense, Denmark. February 1, 2013.
C65. Christian T. Herbst (2012). Python-powered voice analysis. Py4Science lecture series (invited lecture), Research Institute of Molecular Pathology, University of Vienna, Vienna, Austria. December 7, 2012.
This talk will focus on voice analysis tools programmed in Python. In particular, I will (a) show how digital kymograms can be created from high-speed videos with a Python plugin written for the FIJI/ImageJ image analysis framework; (b) present a set of modules for integrating Praat (a powerful scriptable voice analysis application) with Python-powered signal processing algorithms; (c) use Python and ffmpeg to create video animations for teaching and presentations; and (d) demonstrate how a set of signals can be visualized and organized in an HTML browser.
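
As a rough illustration of point (a), a digital kymogram can be built by taking the same image line from every video frame and stacking those lines over time. The sketch below uses plain NumPy rather than the FIJI/ImageJ plugin mentioned in the talk; the function name, the array layout and the choice to put time on the horizontal axis are assumptions.

```python
import numpy as np


def digital_kymogram(frames: np.ndarray, row: int) -> np.ndarray:
    """Build a kymogram from a grayscale video.

    frames: array of shape (n_frames, height, width), assumed already loaded.
    row:    index of the image line to track, ideally perpendicular to the
            glottal axis.
    Returns an array of shape (width, n_frames): one column per video frame.
    """
    return frames[:, row, :].T


# Tiny synthetic example: 2 frames of 3 x 4 pixels.
video = np.arange(2 * 3 * 4).reshape(2, 3, 4)
kymogram = digital_kymogram(video, row=1)
```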
C64. Christian T. Herbst (2012). Investigation of glottal configurations in singing. 6th International Conference on the Physiology and Acoustics of Singing (invited lecture), Department of Music, College of Fine Arts, University of Nevada, Las Vegas, October 18, 2012.
It is well known that the voice timbre can be controlled in the vocal tract in various ways. The adjustment of the voice character at the laryngeal level, however, receives less attention, particularly in the pedagogic literature. Hence, this presentation focuses on the sound source: How can singers control and fine-tune the voice timbre by adjustments of the vocal folds? And what are the possibilities of monitoring these maneuvers in a pedagogical or therapeutic setting?

The timbral voice characteristics can be controlled at the laryngeal level by (a) cartilaginous adduction, i.e. the adduction of the posterior glottis via the arytenoids (controlled by the singer with the degree of "breathiness" / "pressedness"); and by (b) membranous medialization through vocal fold bulging (controlled by the choice of vocal register, i.e. chest vs. falsetto). These two maneuvers can be controlled separately by both trained and untrained singers.

A pedagogical model that incorporates the two described physiological parameters consists of four quadrants: aBducted falsetto, aDducted falsetto, aBducted chest, and aDducted chest. Accomplished singers can "navigate" this map at will, thus facilitating subtle timbral changes at the laryngeal level. This concept is very promising for voice pedagogy and therapy, and for better understanding various singing styles.

To conclude the presentation, a novel method for monitoring vocal fold contact in voice production is put forward: the electroglottographic (EGG) wavegram. It is shown how features seen in this non-invasive technique are related to cartilaginous adduction and membranous medialization. The applicability of the method in the singing studio and in a speech therapy setting is discussed.
C63. Christian T. Herbst (2012). Von der Stimmbandschwingung zum Klang -- Qualität und Beeinflussbarkeit des Stimmtimbres auf glottaler Ebene. 51. Berliner Gesangswissenschaftliche Tagung (invited lecture), Universität Potsdam, October 13, 2012.
Für die Stimmproduktion in Sprache und Gesang sind drei physikalische Systeme nötig: eine Energiequelle (die Lungen), eine Klangquelle (die Stimmlippen), und ein Modifikator des Klanges (der Vokaltrakt). In dieser Präsentation wird auf die Funktion der Klangquelle auf physischer und physiologischer Ebene näher eingegangen. Ganz salopp gesagt werden folgende zwei Fragen erörtert: "Wie wird Klang eigentlich produziert?'' und "Wie kann der Sprecher/Sänger das steuern?''

Der Stimmklang wird durch die Modifikation des von den Lungen kommenden Atemluftstroms erzeugt. So entstehen periodische Änderungen des Luftdruckes, welche die Basis für den resultierenden Stimmklang bilden. Die relevanten Größen, nämlich der zeitvariable Luftdruck bzw. Luftfluss können direkt auf laryngealer Ebene nur mit extrem invasiven Methoden bzw. gar nicht gemessen werden. Aus diesem Grund greift man in der Stimmdiagnose bzw. der Stimmforschung auf indirekte Messmethoden zurück. Zwei dieser Verfahren, die videoendoskopische Untersuchung (Stroboskopie oder Hochgeschwindigkeits-Videoaufnahmen) und die Elektroglottographie werden hier kurz vorgestellt. Die auf jene Weise gewonnene Daten werden in Verhältnis zu dem resultierenden Stimmklang gesetzt.

When asking how vocal sound can be influenced at the laryngeal level, factors acting on different time scales must be distinguished: anatomical conditions usually change gradually over many years (aging, hormonal and environmental influences, organic changes caused by voice use). At the motor level, the vocal sound can be influenced in the medium term (muscle tone) to the short term. The configuration of the larynx before and during voice production by the extrinsic and intrinsic laryngeal muscles usually takes place on a time scale of tenths of a second. The vocal fold vibration itself, which as a passive physical phenomenon does not depend on muscular activity, can only be measured in (fractions of) milliseconds (vibration frequencies of approx. 50 - 3500 Hz).
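
The millisecond time scale of vocal fold vibration follows directly from the quoted frequency range of approx. 50 - 3500 Hz, since the duration of one vibratory cycle is the reciprocal of its frequency. A minimal arithmetic sketch (the helper name `period_ms` is purely illustrative and not part of the abstract):

```python
def period_ms(frequency_hz):
    """Duration of one vibratory cycle in milliseconds (T = 1 / f)."""
    return 1000.0 / frequency_hz

# The two limits of the frequency range quoted in the abstract.
for f in (50.0, 3500.0):
    print(f"{f:6.0f} Hz -> {period_ms(f):.3f} ms per cycle")
```

At the lower limit one cycle lasts 20 ms; at the upper limit it lasts less than a third of a millisecond.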

This lecture also briefly addresses how the intrinsic laryngeal muscles influence vocal fold vibration (and the resulting sound). Vocal fold closure can be achieved by two types of glottal adduction (i.e. the approximation of the vocal folds into phonation position): (a) adduction of the membranous part of the glottis (a thickening of the vocal fold through activity of the vocalis muscle, governed by the singing register used); and (b) adduction of the cartilaginous part of the glottis (by positioning the arytenoid cartilages, along the dimension "breathy" vs. "pressed"). Both forms of adduction can be controlled independently of each other, even by amateur singers. In this way, voice timbre can be influenced at the glottal level, in accordance with the aesthetic requirements of the respective singing style. This approach is also promising in a therapeutic context, namely in the treatment of functional voice disorders (e.g. psychogenic dysphonia).
C62. Christian T. Herbst (2012). How does the singer's instrument work? - Some physical and physiological insights for a conductor's daily work. European Academy for Choral Conductors (invited lecture), Chorverband Österreich, Graz, Austria. September 13, 2012.
C61. Christian T. Herbst (2012). Freddie Mercury - Acoustical Voice Analysis. 10th International Voice Symposium Salzburg (invited lecture), Austrian Voice Institute, Salzburg, Austria. August 26, 2012.
C60. Christian T. Herbst (2012). From vocal fold vibration to sound - why is glottal closure important? Voice Symposium Salzburg, pre-symposium workshop: Interventional Laryngology and Indirect Phonosurgery (invited lecture), Austrian Voice Institute, Salzburg, Austria. August 24, 2012.
C59. Christian T. Herbst, Jan G. Švec, Josef Schlömicher-Thier, W. T. S. Fitch (2012). Analyzing the female middle register with EGG wavegrams. The Voice Foundation's 41st Annual Symposium: Care of the Professional Voice, May 31, 2012.
The choice of singing register and the degree of vocal fold adduction are two concepts that are not easily discriminated by inexperienced singers. This is particularly true for the mid range (pitch C4 -- C5) of untrained female classical singers, where adducted falsetto, the desired sound quality in this range, is rarely observed. As an underlying physiological principle, vocal fold adduction can be separately controlled by (a) cartilaginous adduction, i.e. the adduction of the posterior glottis via the arytenoids (controlled by the singer with the degree of "breathiness" / "pressedness"); and by (b) membranous medialization through vocal fold bulging (controlled by the choice of vocal register, i.e. chest vs. falsetto) [1].

In this study, singing exercises and instructions for adjusting adductory settings (cartilaginous adduction vs. membranous medialization) in the female mid-range were performed by both trained and untrained female classical singers. Phonation was monitored by acoustic recording, electroglottography (EGG) and laryngeal imaging. EGG wavegrams [2], a novel method for displaying EGG signals, were used for data analysis.

EGG wavegram data revealed distinct differences between the targeted phonation types for each individual. The observed differences established themselves as (a) presence/absence of vocal fold contact; (b) duration of vocal fold contact per glottal cycle; (c) changes in the overall EGG signal amplitude; (d) distinctness of opening/closing events; (e) perturbations seen in the wavegrams. Inter-subject data variation suggests that the individual's anatomy influences vocal fold contact in singing. EGG wavegrams proved to be useful in documenting changes of both singing register and glottal adduction.

References
[1] C. T. Herbst, et al., "Membranous and cartilaginous vocal fold adduction in singing," J Acoust Soc Am, vol. 129, pp. 2253-2262, 2011.
[2] C. T. Herbst, et al., "Electroglottographic wavegrams: a technique for visualizing vocal fold dynamics noninvasively," J Acoust Soc Am, vol. 128, pp. 3070-8, Nov 2010.


C58. Christian T. Herbst (2012). Introduction to voice acoustics: formants, articulation and formant tuning. BVA Acoustics Study Day (invited lecture), British Voice Association, London, UK. May 20, 2012.
When analyzing the human voice as an acoustical system, it can be decomposed into three parts: the power source (i.e. the lungs); the sound source (i.e. the larynx); and the sound modifiers (i.e. the vocal tract). In this 60-minute tutorial, the basic physical and physiological mechanisms of sound modification through the vocal tract are discussed: after introducing periodic vibration, the harmonic series and the sound spectrum, the acoustic filter function of the vocal tract, facilitated by formants, is explained. The concept of formant tuning is established, and two example applications thereof (female and male singing voice) are portrayed.
C57. Christian T. Herbst (2012). Investigation of the mammal voice source in an excised larynx setup. CogBio Seminar, University of Vienna, Department of Cognitive Biology, Vienna, Austria. April 23, 2012.
The source of the human voice originates in the larynx. It is in most cases generated by the vibrating vocal folds. In this presentation, a new method for visualization and analysis of the electroglottographic (EGG) signal (i.e. a physiological correlate of vocal fold vibration) is presented. This method, termed "EGG wavegram", allows EGG signals (and their first derivative, DEGG) to be displayed across various phonations in one graph, whilst retaining the original appearance of the unaltered waveform.

The EGG signal is decomposed into consecutive individual cycles, each of which is normalized in both duration and amplitude, and is displayed on the y-axis, going from bottom to top. Overall time is shown on the x-axis. In a DEGG wavegram, the first derivative of the EGG signal is used as the input signal. In such a display, the contacting and de-contacting phases for each glottal cycle are approximated by (a) one or more dark horizontal line(s) at the lower end of the graph (contacting phase), and (b) one or more light horizontal line(s) in the upper section of the graph (de-contacting phase).
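
The role of the first derivative can be illustrated with a minimal sketch that locates the strongest positive and the strongest negative DEGG deflection within a single cycle. Everything in it is an assumption made for illustration: the synthetic cycle shape, the polarity convention (larger values mean more vocal fold contact), and the helper name `degg_peak_positions`. It is not the published wavegram implementation.

```python
import numpy as np

# Hypothetical synthetic EGG cycle: contact rises quickly and falls slowly.
# Assumed polarity: larger values mean more vocal fold contact.
phase = np.linspace(0.0, 1.0, num=200, endpoint=False)
width = np.where(phase < 0.3, 0.05, 0.15)
egg_cycle = np.exp(-(((phase - 0.3) / width) ** 2))

def degg_peak_positions(cycle):
    """Return the normalized positions (0..1) of the strongest positive and
    the strongest negative deflection of the first derivative (DEGG)."""
    degg = np.gradient(np.asarray(cycle, dtype=float))
    n = len(degg)
    return np.argmax(degg) / n, np.argmin(degg) / n

contacting, decontacting = degg_peak_positions(egg_cycle)
print(f"contacting peak at {contacting:.2f}, de-contacting peak at {decontacting:.2f}")
```

Under the assumed polarity, the positive peak approximates the contacting event and the negative peak the de-contacting event. Tracking these two positions from cycle to cycle would trace the horizontal lines described above.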

Much like in a sound spectrogram, information on vibratory behavior developing in time is compacted into one single graph, thus providing insight into changes of vocal fold dynamics. As such, the wavegram allows intuitive assessment of the time-varying contact phase of phonation over a longer period of time, indicating physiological changes of laryngeal configuration, such as vocal register. EGG wavegrams promise to be useful in research, clinical diagnostics, voice therapy and voice pedagogy.


References
Christian T. Herbst, W. T. S. Fitch, Jan G. Švec (2010). "Electroglottographic wavegrams: a technique for visualizing vocal fold dynamics noninvasively." J. Acoust. Soc. Am. 128 (5), 3070-3078
Christian T. Herbst (2012). Investigation of glottal configurations in singing. Palacký University in Olomouc, the Czech Republic (Doctoral Dissertation)
C56. Christian T. Herbst (2012). Control of sound source properties in singing. 2nd International Artistic and Scientific Symposium on Choral Art, Singing and Voice (invited lecture), Croatian Choral Directors Association, Zagreb, Croatia. April 13, 2012.
C55. Christian T. Herbst (2012). The sound source in singing: acoustical and physiological principles. 2nd International Artistic and Scientific Symposium on Choral Art, Singing and Voice (invited lecture), Croatian Choral Directors Association, Zagreb, Croatia. April 12, 2012.
C54. Malte Kob, Jan G. Svec, Christian T. Herbst (2012). Akustische Analyse von Stimmparametern. 9. Wiener Fortbildungskurs "Praxis der Stimmdiagnostik" (invited lecture), Medizinische Universität Wien, Univ.-HNO-Klinik, Klinische Abteilung Phoniatrie-Logopädie, Vienna, Austria. March 31, 2012.
C53. Christian T. Herbst (2012). Registerbeschreibung mit Hilfe der Elektroglottographie. 9. Wiener Fortbildungskurs "Praxis der Stimmdiagnostik" (invited lecture), Medizinische Universität Wien, Univ.-HNO-Klinik, Klinische Abteilung Phoniatrie-Logopädie, Vienna, Austria. March 30, 2012.
C52. Christian T. Herbst (2012). Akustische Analyse der Singstimme von Freddie Mercury. 5. Freiburger Stimmforum "populärer Gesang" (invited lecture), Freiburger Institut für Musikermedizin, Universitätsklinikum Freiburg, Freiburg, Germany. March 24, 2012.
C51. Christian T. Herbst (2012). Electroglottographic Wavegrams -- a new tool to assess sound source properties in speech and singing. Seminar, Audio Engineering Society (invited lecture), University of York, UK. February 1, 2012.
The source of the human voice originates in the larynx. It is in most cases generated by the vibrating vocal folds. In this presentation, a new method for visualization and analysis of the electroglottographic (EGG) signal (i.e. a physiological correlate of vocal fold vibration) is presented. This method, termed "EGG wavegram", allows EGG signals (and their first derivative, DEGG) to be displayed across various phonations in one graph, whilst retaining the original appearance of the unaltered waveform.

The EGG signal is decomposed into consecutive individual cycles, each of which is normalized in both duration and amplitude, and is displayed on the y-axis, going from bottom to top. Overall time is shown on the x-axis. In a DEGG wavegram, the first derivative of the EGG signal is used as the input signal. In such a display, the contacting and de-contacting phases for each glottal cycle are approximated by (a) one or more dark horizontal line(s) at the lower end of the graph (contacting phase), and (b) one or more light horizontal line(s) in the upper section of the graph (de-contacting phase).

Much like in a sound spectrogram, information on vibratory behavior developing in time is compacted into one single graph, thus providing insight into changes of vocal fold dynamics. As such, the wavegram allows intuitive assessment of the time-varying contact phase of phonation over a longer period of time, indicating physiological changes of laryngeal configuration, such as vocal register. EGG wavegrams promise to be useful in research, clinical diagnostics, voice therapy and voice pedagogy.


References
Christian T. Herbst, W. T. S. Fitch, Jan G. Švec (2010). "Electroglottographic wavegrams: a technique for visualizing vocal fold dynamics noninvasively." J. Acoust. Soc. Am. 128 (5), 3070-3078
Christian T. Herbst (2012). Investigation of glottal configurations in singing. Palacký University in Olomouc, the Czech Republic (Doctoral Dissertation)
C50. Jan G. Švec, Jaromir Horacek, Tomas Vampola, Christian T. Herbst, Donald G. Miller, Radovan Havlik, Petr Krupa, Mojmir Lejska (2012). Acoustic and articulatory adjustments in operatic singing: spectral analysis, magnetic resonance imaging and finite-element modeling. International Voice Symposium. Subglottal Pressure Measurement and Source-Filter Interaction: Two Current Issues in Voice Research (invited lecture), New York University, Steinhardt School of Culture, Education, and Human Development, New York. January 7, 2012. presented by Jan G. Švec.
C49. Jakob Unger, Tobias Meyer, Christian T. Herbst, Michael Döllinger, Jörg Lohscheller (2011). PVG-Wavegramm: Dreidimensionale Visualisierung von Stimmlippendynamik. 28. Wissenschaftliche Jahrestagung der Deutschen Gesellschaft für Phoniatrie und Pädaudiologie e. V., Zurich, Switzerland. September 10, 2011. presented by Jakob Unger.
C48. Christian T. Herbst, W. T. S. Fitch, Josef Schlömicher-Thier, Jan G. Švec (2011). Observing the female middle register using EGG wavegrams. 9th Pan-European Voice Conference (PEVOC) (invited lecture), September 1, 2011.
The choice of singing register and the degree of vocal fold adduction are two concepts that are not easily discriminated by inexperienced singers. This is particularly true for the mid range (pitch C4 -- C5) of untrained female singers, where sounds are often produced in either (a) fully adducted chest register or (b) breathy falsetto register. An adducted falsetto register, which is the desired sound source function of classical singing above a pitch of D4, is often not observed in untrained females.

As an underlying physiological principle, vocal fold adduction can be separately controlled by (a) cartilaginous adduction, i.e. the adduction of the posterior glottis via the arytenoids (controlled by the singer with the degree of "breathiness" / "pressedness"); and by (b) membranous medialization through vocal fold bulging (controlled by the choice of vocal register, i.e. chest vs. falsetto).[1] The electroglottographic (EGG) signal is well suited to detect changes in both membranous medialization (i.e. registers)[2] and cartilaginous adduction[3] in singing.

In this study, the EGG wavegram[4], a novel method for displaying and analyzing EGG signals, was used as a real-time feedback tool in the voice studio. In the context of singing exercises and instructions designed for this purpose, it was employed to help amateur female singers understand and better control the wide range of adductory settings (cartilaginous adduction vs. membranous medialization) in their middle range.

Wavegram data revealed distinct differences between abducted and adducted falsetto register for each individual. The observed differences established themselves as (a) presence/absence of vocal fold contact; (b) degree of irregularities reflected in the EGG signal-to-noise ratio; (c) absence/presence of DEGG double peaks. The results suggest that subjects can learn to increase cartilaginous adduction in their falsetto register using real-time EGG wavegram feedback.


[1] C. T. Herbst, et al., "Membranous and cartilaginous vocal fold adduction in singing," J Acoust Soc Am, accepted for publication, 2011.
[2] N. Henrich, et al., "Glottal open quotient in singing: Measurements and correlation with laryngeal mechanisms, vocal intensity, and fundamental frequency," J. Acoust. Soc. Am., vol. 117, pp. 1417-1430, 2005.
[3] C. T. Herbst, et al., "Using Electroglottographic Real-Time Feedback to Control Posterior Glottal Adduction during Phonation," J. Voice, vol. 24, pp. 72-85, 2010.
[4] C. T. Herbst, et al., "Electroglottographic wavegrams: a technique for visualizing vocal fold dynamics noninvasively," J Acoust Soc Am, vol. 128, pp. 3070-8, Nov 2010.


C47. Christian T. Herbst, Jan G. Švec (2011). Voice acoustics, microphones, recording and computers. One-Day Crash Course on 'Voice' (invited lecture), European Academy of Voice, August 30, 2011.
C46. Jakob Unger, Tobias Meyer, Christian T. Herbst, Michael Döllinger, Jörg Lohscheller (2011). PVG-Wavegrams: Three-dimensional visualization of vocal fold dynamics. 7th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA), August 25, 2011. presented by Jakob Unger.
C45. Christian T. Herbst, W. T. S. Fitch, Jan G. Švec (2011). Wavegrams: A new technique for visualizing vocal fold dynamics noninvasively using electroglottographic signals. 40th Annual Symposium: Care of the Professional Voice, The Voice Foundation, June 2, 2011.
A new method for analyzing and displaying EGG signals (and their first derivative, DEGG) is introduced: the electroglottographic wavegram (short: wavegram). It (a) allows monitoring the EGG (or DEGG) signal over time; and (b) provides an intuitive means for quickly assessing the duration of glottal closure and its variation over time.

Based on the EGG or DEGG signal, the time-varying fundamental frequency is calculated and consecutive individual glottal cycles are identified. Each cycle is locally normalized in duration and amplitude and the cycles are then plotted consecutively. The plotting process resembles that of a spectrogram, but instead of spectral amplitudes, the signal deflections are encoded by color intensity. The wavegram maps time on the x-axis, normalized cycle duration on the y-axis and the signal deflection on the color-intensity-coded z-axis.
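
The construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the published implementation: cycle boundaries are approximated by negative-to-positive zero crossings of the mean-removed signal (a stand-in for the fundamental-frequency-based cycle detection), the input is a synthetic sine glide rather than a real EGG recording, and the names `find_cycle_starts` and `wavegram` are hypothetical.

```python
import numpy as np

def find_cycle_starts(signal):
    """Indices where the mean-removed signal crosses from negative to
    non-negative. Assumption: a simple stand-in for cycle detection
    based on the time-varying fundamental frequency."""
    centered = np.asarray(signal, dtype=float) - np.mean(signal)
    negative = centered < 0
    return np.flatnonzero(negative[:-1] & ~negative[1:]) + 1

def wavegram(signal, cycle_starts, rows=100):
    """Return a (rows x n_cycles) matrix with one normalized cycle per column."""
    signal = np.asarray(signal, dtype=float)
    phase_out = np.linspace(0.0, 1.0, num=rows)
    columns = []
    for start, end in zip(cycle_starts[:-1], cycle_starts[1:]):
        cycle = signal[start:end]
        # Normalize duration: resample every cycle to the same number of points.
        phase_in = np.linspace(0.0, 1.0, num=len(cycle))
        resampled = np.interp(phase_out, phase_in, cycle)
        # Normalize amplitude locally, per cycle, to the range 0..1.
        span = resampled.max() - resampled.min()
        if span > 0:
            resampled = (resampled - resampled.min()) / span
        else:
            resampled = np.zeros(rows)
        columns.append(resampled)
    return np.column_stack(columns)

# Hypothetical test signal: a sine gliding from 100 Hz to 200 Hz in 0.1 s.
fs = 10_000
t = np.arange(1000) / fs
egg = np.sin(2 * np.pi * (100 * t + 500 * t**2))

starts = find_cycle_starts(egg)
wg = wavegram(egg, starts, rows=64)
print(wg.shape)  # (rows, number of complete cycles)
```

Each column of `wg` holds one cycle, so the matrix can be rendered like a spectrogram, for example with `matplotlib.pyplot.imshow(wg, origin="lower", aspect="auto", cmap="gray")`. In this simplified version the horizontal axis is the cycle index; mapping columns to the cycle start times would give a true time axis.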

Variations in vocal fold contact appear in the wavegram as a sequence of events, rather than single phenomena. These events take place over a certain period of time and change with pitch, loudness and register. Multiple DEGG peaks are revealed in wavegrams to behave systematically, indicating subtle changes of vocal fold oscillatory regime. As such, EGG wavegrams promise to reveal more information on vocal fold contacting and de-contacting events than previous methods.

In this presentation, wavegrams of human and mammal phonations are shown. Their physiologic relevance is discussed in relation to glottal configurations and vocal fold vibratory patterns, as seen in laryngeal imaging.

Reference: Christian T. Herbst, W. T. S. Fitch, Jan G. Švec (2010). "Electroglottographic wavegrams: a technique for visualizing vocal fold dynamics noninvasively." J. Acoust. Soc. Am. 128 (5), 3070-3078
C44. Christian T. Herbst (2011). Vocal folds and the voice timbre. 3rd Czech-Slovak Symposium on ART VOICE (invited lecture), Hlasové a sluchové centrum Praha, s.r.o., Prague, Czech Republic. May 21, 2011.
It is well known that the voice timbre can be controlled in the vocal tract in various ways. The adjustment of the voice character at the laryngeal level, however, receives less attention, particularly in the pedagogic literature. Hence, this presentation focuses on the sound source: How can singers control and fine-tune the voice timbre by adjustments of the vocal folds?

We recognize three basic vocal fold adjustments: (a) adduction (and abduction) of the posterior glottis; (b) thickening/bulging of the vocal folds; (c) elongation of the vocal folds. In this presentation, the first two adjustments are examined more closely:

(a) Cartilaginous adduction, i.e. adduction of the posterior glottis, is maintained through the (lateral) cricoarytenoid and the interarytenoid muscles. Phonation with a fully adducted glottis is characterized by strong high-frequency partials ("overtones"), thus giving the voice a "brassy", "ringing" or "resonant" quality. On the other hand, phonation with a posterior glottal gap (the posterior glottis is not fully adducted) creates high-frequency partials of lesser strength. The voice then has a "fluty" or "dull" quality, and most likely contains noise components ("breathy voice").

(b) Membranous medialization, i.e. thickening (bulging) of the vocal folds is controlled via the singing register. In chest (modal) register, the thyroarytenoid (vocalis) muscle is contracted, and the vocal folds are medially bulged. This introduces vertical phase differences into the vocal fold vibration, effectively shortening the open phase and increasing the amount of high-frequency partials. Phonation with a relaxed thyroarytenoid muscle, on the other hand, is usually identified as falsetto (or sometimes "head") register. It is characterized by a less "resonant" sound, containing weaker overtones.

It has been shown that these two maneuvers can be controlled separately by both trained and untrained singers [1, 2].
The ability to individually and gradually control cartilaginous adduction and membranous medialization allows experienced singers to produce a great variety of vocal timbres at the laryngeal level, thus increasing the quality of their artistic performance. This concept is also very useful in voice pedagogy, particularly in training the female mid-range in classical singing.

References:
[1] Herbst C., Svec J., Ternström S. (2009) Investigation of four distinct glottal configurations in classical singing - a pilot study. J.Acoust.Soc.Am. 125:EL104-EL109.
[2] Herbst C.T., Qiu Q., Schutte H.K., Svec J.G. (2011) Membranous and cartilaginous vocal fold adduction in singing. J Acoust Soc Am 129:2253-2262
C43. Christian T. Herbst (2011). Wie funktioniert Stimme? Regensburger Stimmtag: ein regionaler Beitrag zum World Voice Day (invited lecture), Regensburger Ärztenetz e.V., Regensburg, Germany. April 16, 2011.
C42. Christian T. Herbst (2011). Beschaffenheit und Ausbaumöglichkeit der Kinderstimme. Kinderchorleitungssymposium (invited lecture), Universität der Künste, Berlin, Germany. February 11, 2011.
C41. Christian T. Herbst (2011). Stimmbildungsunterricht mit Knaben und Mädchen. Kinderchorleitungssymposium (invited lecture), Universität der Künste, Berlin, Germany. February 11, 2011.
C40. Christian T. Herbst (2010). Understanding vocal timbre in singing - a tutorial. Europa Cantat General Assembly 2010 (invited lecture), Europa Cantat, Namur, Belgium. November 28, 2010.
Timbre, known in psychoacoustics as tone quality or tone color, distinguishes different types of sound production. In singing, timbre is mainly influenced by: amplitude; fundamental frequency and variations thereof; the amount of high-frequency energy components ("overtones", singers' formant); vowel quality; and the noise level (degree of breathiness). In a 45-minute tutorial, an overview of those sound qualities is given. It is shown that they are mainly controlled by two physiologic means: adjustments of the vocal tract and adjustments of the sound source, i.e. the laryngeal configuration. The practical application of this knowledge to choir singing practice is briefly discussed.
C39. Christian T. Herbst, Josef Schlömicher-Thier, Matthias Weikert (2010). Berufsstimmbetreuung in der HNO-Praxis - eine stimmdiagnostische, -therapeutische und gesangsdidaktische Synopsis. Stuttgarter Stimmtage (invited lecture), Staatliche Hochschule für Musik und Darstellende Kunst, Stuttgart, Germany. October 2, 2010.
C38. Josef Schlömicher-Thier, Hans E. Eckel, Christian T. Herbst (2010). Interventionelle Laryngologie in der HNO-Praxis. 54th annual Meeting of the Austrian Society of Oto-Rhino-Laryngology, Head and Neck Surgery, Salzburg, Austria. September 17, 2010. presented by Josef Schlömicher-Thier.
Problems of the speaking and singing voice are mainly caused by an impairment of the shape or mobility of the laryngeal structures. While deficits of shape can be recognized relatively easily in general ENT practice, deficits of mobility pose a far greater challenge.

Movement of laryngeal structures arises in two ways: (1) larger, relatively slow movements caused by muscle contraction, which are visible to the naked eye (laryngeal mirror examination) (< 15 Hz), e.g. adduction of the vocal folds and ventricular folds, longitudinal tension of the vocal folds, changes in vertical larynx position; and (2) smaller and relatively fast movements of the vocal folds, i.e. vocal fold oscillation caused by aerodynamic-mechanical processes (> 50 Hz). The reasons why the quality or periodicity of these vibrations is impaired often cannot be identified by laryngeal mirror examination alone, i.e. without a videostrobolaryngoscopic examination.

This workshop addresses special cases of disorders of vocal fold mobility. In addition to a basic explanation of the acoustic and physiological framework, approaches to diagnosis and therapy are presented. In particular, the following pathologies are covered:

(1) Neurological voice disorders: a) pareses (vocal fold augmentation technique, Vocastim therapy); b) spasmodic voice disorders (botulinum toxin therapy); (2) management of chronic laryngitis, in particular diagnosis and therapy of reflux disease; (3) indirect phonosurgical procedures for organic voice disorders; (4) psychogenic voice disorder: psychotherapeutic measures, functional voice building
C37. Christian T. Herbst, W. T. S. Fitch, Jan G. Švec (2010). Wavegrams: A new technique for visualizing vocal fold dynamics noninvasively using electroglottographic signals. 9th International Conference on Advances in Quantitative Laryngology, Voice and Speech Research (AQL), September 2010.
Electroglottography (EGG) is a non-invasive, low-cost method to monitor relative vocal fold contact area (VFCA) during phonation. Increase and decrease of VFCA are related to glottal closing and opening, respectively. In this study, a new method for analyzing and displaying EGG signals (and their first derivative, DEGG) is introduced: the electroglottographic wavegram (short: wavegram). It (a) allows monitoring the EGG (or DEGG) signal over time; and (b) provides an intuitive means for quickly assessing the duration of glottal closure and its variation over time.

Based on the EGG or DEGG signal, the time-varying fundamental frequency is calculated and consecutive individual glottal cycles are identified. Each cycle is locally normalized in duration and amplitude and the cycles are then plotted consecutively. The plotting process resembles that of a spectrogram, but instead of spectral amplitudes, the signal deflections are encoded by color intensity. The wavegram maps time on the x-axis, normalized cycle duration on the y-axis and the signal deflection on the color-intensity-coded z-axis.

The wavegram reveals changes of vocal fold contact duration in time. It also shows phenomena that remain overlooked in traditional EGG-display techniques, such as multiple DEGG peaks. While these phenomena have usually been considered artifacts, the wavegram displays revealed consistent behavior of these peaks in a large number of subjects. They indicate subtle changes of vocal fold oscillatory regime.

Wavegram analysis suggests that the phenomenon of vocal fold closing and opening is more complex than commonly assumed. Rather than a single event, vocal fold opening and closing should be considered a sequence of events, taking place over a certain period of time. Data show that the sequence of these events can change with pitch, loudness and register. The EGG signal thus promises to reveal more (physiological) information on vocal fold closure and opening events than previously thought.
C36. Christian T. Herbst (2010). Das Timbre im klassischen Gesang: akustisches und physiologisches Tutorial. 9th Voice Symposium Salzburg, Austrian Voice Institute, Salzburg, Austria. August 28, 2010.
Timbre, known in psychoacoustics as tone quality or tone color, distinguishes different types of sound production. In singing, timbre is mainly influenced by: amplitude; fundamental frequency and variations thereof; the amount of high-frequency energy components ("overtones", singers' formant); vowel quality; and the noise level (degree of breathiness). In a 30-minute tutorial, an overview of those sound qualities is given. It is shown that they are mainly controlled by two physiologic means: adjustments of the vocal tract and adjustments of the sound source, i.e. the laryngeal configuration.
C35. Christian T. Herbst, W. T. S. Fitch, Jan G. Švec (2010). Visualizing electroglottographic signals with wavegrams. 5th International Conference on the Physiology and Acoustics of Singing (PAS5), Kungliga Tekniska Högskolan, Stockholm, Sweden. August 11, 2010.
Electroglottography (EGG) is a non-invasive, low-cost method to monitor relative vocal fold contact area (VFCA) during phonation. Increase and decrease of VFCA are related to glottal closing and opening, respectively. In this study, a new method for analyzing and displaying EGG signals (and their first derivative, DEGG) is introduced: the electroglottographic wavegram (short: wavegram). It (a) allows monitoring the EGG (or DEGG) signal over time; and (b) provides an intuitive means for quickly assessing the duration of glottal closure and its variation over time.

Based on the EGG or DEGG signal, the time-varying fundamental frequency is calculated and consecutive individual glottal cycles are identified. Each cycle is locally normalized in duration and amplitude and the cycles are then plotted consecutively. The plotting process resembles that of a spectrogram, but instead of spectral amplitudes, the signal deflections are encoded by color intensity. The wavegram maps time on the x-axis, normalized cycle duration on the y-axis and the signal deflection on the color-intensity-coded z-axis.

The wavegram reveals changes of vocal fold contact duration in time. It also shows phenomena that remain overlooked in traditional EGG-display techniques, such as multiple DEGG peaks. While these phenomena have usually been considered artifacts, the wavegram displays revealed consistent behavior of these peaks in a large number of subjects. They indicate subtle changes of vocal fold oscillatory regime.

Wavegram analysis suggests that the phenomenon of vocal fold closing and opening is more complex than commonly assumed. Rather than a single event, vocal fold opening and closing should be considered a sequence of events, taking place over a certain period of time. Data show that the sequence of these events can change with pitch, loudness and register. The EGG signal thus promises to reveal more (physiological) information on vocal fold closure and opening events than previously thought.
C34. Christian T. Herbst, Jan G. Švec, Qingjun Qiu, Harm Schutte (2010). Membranous and cartilaginous glottal adduction in singing - experimental findings and pedagogic considerations. Choice for Voice Conference, British Voice Association, London, U.K. July 2010.
While it has been recognized that glottal adduction is an important parameter in speech, relatively little is known about how glottal adduction is adjusted when changing the voice quality in singing. Our previous pilot data on a single subject suggest that the cartilaginous and membranous parts of the glottis play different roles in singing: while the membranous part is expected to aid in switching between the chest and falsetto registers, the cartilaginous part is expected to play a primary role in adjusting the sound quality within the desired register. The goal of this study was to design singing exercises that enable both trained and untrained singers to independently manipulate cartilaginous and membranous glottal adduction and to verify these exercises laryngoscopically on a set of subjects.

A baritone, who was previously found capable of independently manipulating the cartilaginous and membranous glottal adduction, served as an instructor in this study. Six female and six male subjects, singers and non-singers, were asked to imitate the instructor in producing four phonation types, i.e. (FaB) 'aBducted falsetto'; (FaD) 'aDducted falsetto'; (CaB) 'aBducted chest'; and (CaD) 'aDducted chest', at a pitch located within the range of the chest/falsetto register transition (C#4 to F4). In order to maintain the desired register, the target notes were reached by singing a five-note scale starting in the desired register: ascending for chest and descending for falsetto. The subjects were asked not to 'blend or mix the registers'. The phonation was monitored by videostroboscopy, videokymography (VKG), electroglottography (EGG) and audio recording.

The results showed distinct laryngeal configurations and vocal fold vibration characteristics for the four phonation types. As expected, all the subjects showed a less adducted posterior, i.e. cartilaginous, glottis in phonation types FaB and CaB than in phonation types FaD and CaD. Changes in the membranous part of the vocal folds were reflected in videokymographic imaging, which revealed that the chest phonations, as compared to the falsetto phonations, had larger mucosal waves, sharper lateral peaks and a longer closed quotient.

The findings indicate that the singers succeeded in independently manipulating the membranous and cartilaginous adduction of the glottis. Individual control over these two types of glottal adduction is expected to be a key factor for the experienced singer to create different vocal timbres. The designed singing exercises were found useful in training the subjects for achieving this goal.

In the final part of this presentation, some practical considerations and possible pedagogical strategies for classical singing are discussed.
C33. Jan G. Švec, Jaromir Horacek, Tomas Vampola, Christian T. Herbst, Donald G. Miller, Radovan Havlik, Petr Krupa, Mojmir Lejska (2010). Acoustic and articulatory adjustments in operatic singing: Spectral analysis and magnetic resonance imaging. COST 2103: 4th Advanced Voice Function Workshop AVFA'10, York, U.K. May 20, 2010. presented by Jan G. Švec.
C32. Christian T. Herbst, Josef Schlömicher-Thier (2010). Die Sängerbetreuung in stimmpädagogischer und sängermedizinischer Kooperation. Berner Symposium Medizin, Logopädie, Gesangspädagogik (invited lecture), Hochschule der Künste Bern, Bern, Switzerland. April 17, 2010.
C31. Christian T. Herbst, Josef Schlömicher-Thier (2010). Pedagogical and Medical Cooperation in Voice Patient Care. Symposium Ars Choralis (invited lecture), Croatian Choral Directors Association, Zagreb, Croatia. April 10, 2010. presented by Christian T. Herbst.
In voice patient care, we are approached by two types of clients: those who want to "be able to sing again" and those who want to "be able to sing better". In both cases, the underlying predicament has several aspects: emotional, life-style related, medical, pedagogical. Since those aspects are often related to each other, a multi-dimensional approach for treating the patient is required.

In this presentation, several boundary conditions for a "good" voice are established. To illustrate the effects of medical and pedagogical measures, we show case studies from our regular work with voice patients.
C30. Christian T. Herbst (2010). Voice Timbre in Singing. Symposium Ars Choralis (invited lecture), Croatian Choral Directors Association, Zagreb, Croatia. April 10, 2010.
Timbre, known in psychoacoustics as tone quality or tone color, distinguishes different types of sound production. In singing, timbre is influenced by vowel quality, amount of high-frequency energy components ('overtones', singers' formant) and the noise level (degree of breathiness). Those sound qualities are mainly controlled by two physiologic means: adjustments of the vocal tract and adjustments of the sound source, i.e. the laryngeal configuration.
C29. Christian T. Herbst (2010). Vocal Fold Adduction and Registers in Classical Singing. Symposium Ars Choralis (invited lecture), Croatian Choral Directors Association, Zagreb, Croatia. April 8, 2010.
Register control in singing is physiologically achieved mostly by the vocalis muscle (membranous adduction). On the other hand, the degree of adduction of the posterior part of the glottis (cartilaginous adduction, regulated by the laryngeal adductory muscles LCA and IA) is known to have an influence on the 'richness' of the vocal sound source.

A recent study has shown that both trained and untrained singers can independently vary those two types of laryngeal adjustment. The independent control over cartilaginous and membranous adduction allows singers to create different vocal timbres at the laryngeal level. In singing pedagogy, this knowledge can be used to effectively address certain technical problems, such as running out of breath, or register violations.
C28. Ramona Steiner, Christian T. Herbst, David Howard (2009). Electroglottographic (EGG) real-time biofeedback to enhance glottal adduction in patients with unilateral vocal fold pareses. 8th Pan European Voice Conference (PEVOC), Dresden, Germany. August 28, 2009. presented by Ramona Steiner.
The central deficit in unilateral vocal fold pareses (UVFP) is insufficient glottal adduction. Several well-established therapy methods exist, but their efficiency is rarely evaluated objectively. Except for auditory feedback via bone conduction, patients have no means to assess the targeted change in voice quality. In a recent study, electroglottography (EGG) has successfully been used as a real-time biofeedback tool in order to increase the degree of posterior glottal closure in a healthy amateur singer. EGG has also been used recently in an attempt to document voice quality in patients with vocal fold pareses. In this study we investigate whether electroglottographic real-time biofeedback can be used to increase therapy efficiency by enhancing glottal adduction in patients with UVFP.

For this experiment, four patients with diagnosed infranuclear UVFP acted as subjects. Habitual phonation was documented simultaneously by means of videolaryngoscopy, electroglottography and audio recording when sustaining a vowel at a comfortable pitch. In a therapeutic session, using phonatory exercises (conservative approach), subjects were shown a real-time EGG waveform (normalized in both amplitude and time) representing one glottal cycle, which changes over time. As they followed the instructions of the therapist, they were asked to consciously introduce changes into the shape of the displayed EGG waveform. Therapy sessions were documented by means of simultaneous recording of acoustic and electroglottographic data. Immediately after the therapy session, the patients' attempt to apply the potentially improved phonatory behaviour was documented simultaneously by means of videolaryngoscopy, electroglottography and audio recording, again when sustaining a vowel at a comfortable pitch.

First tests showed that the EGG signal could be detected in a patient with a chronic UVFP, and that he was able to deliberately introduce changes into the displayed EGG waveform during therapy for a sustained vowel. The sessions further explored the effect of training on the ability of patients to change the shape of the EGG waveform at will, which provides support for the use of EGG in therapy.
C27. Christian T. Herbst, Jan G. Švec, Qingjun Qiu, Harm Schutte (2009). Membranous and cartilaginous glottal adduction in singing. 8th Pan European Voice Conference (PEVOC), Dresden, Germany. August 27, 2009.
While it has been recognized that glottal adduction is an important parameter in speech, relatively little is known about how glottal adduction is adjusted when changing the voice quality in singing. Our previous pilot data on a single subject suggest that the cartilaginous and membranous parts of the glottis play different roles in singing -- while the membranous part is expected to play an important role in switching between the chest and falsetto registers, the cartilaginous part is expected to play a primary role in adjusting the sound quality within the desired register. The goal of this study was to design singing exercises that enable both trained and untrained singers to independently manipulate cartilaginous and membranous glottal adduction, and to verify these exercises laryngoscopically on a set of subjects.

A baritone, who was previously found capable of independently manipulating the cartilaginous and membranous glottal adduction, served as an instructor in this study. Six female and six male subjects, singers and non-singers, were asked to imitate the instructor in producing four phonation types, i.e. (A) 'naïve falsetto'; (B) 'quality falsetto'; (C) 'light chest'; and (D) 'heavy chest', at a pitch located within the range of the chest/falsetto register transition (C#4 to F4). In order to maintain the desired register, the target notes were reached by singing a five-note scale starting in the desired register: ascending for chest and descending for falsetto. The subjects were asked not to 'blend or mix the registers'. The phonation was monitored by videostroboscopy, videokymography (VKG), electroglottography (EGG) and audio recording.

The results showed distinct laryngeal configurations and vocal fold vibration characteristics for the four phonation types. As expected, all the subjects showed a less adducted posterior, i.e. cartilaginous, glottis in phonation types A and C than in phonation types B and D. Changes in the membranous part of the vocal folds were reflected in videokymographic imaging, which revealed that the chest phonations, as compared to the falsetto phonations, had larger mucosal waves, sharper lateral peaks and a longer closed quotient.

The findings indicate that the singers succeeded in independently manipulating the membranous and cartilaginous adduction of the glottis. Individual control over these two types of glottal adduction is expected to be a key factor for the experienced singer to create different vocal timbres. The designed singing exercises were found useful in training the subjects for achieving this goal.
C26. Jan G. Švec, Jaromir Horacek, Tomas Vampola, Christian T. Herbst, Donald G. Miller, Radovan Havlik, Petr Krupa, Mojmir Lejska (2009). Acoustic and Articulatory Adjustments for Singers' Formant Production: Spectral Analysis, MRI and Finite Element Modeling. The Voice Foundation's 38th Annual Symposium, The Voice Foundation, Philadelphia. June 2009. presented by Jan G. Švec.
C25. Jan G. Švec, Christian T. Herbst, Sten Ternström (2009). Membranous versus cartilaginous glottal adduction in four singing voice qualities: Pilot laryngostroboscopic and videokymographic observations. Proceedings of AVFA '09, 3rd Advanced Voice Function Assessment International Workshop, May 18, 2009. presented by Jan G. Švec.
This study investigates four qualities of singing voice in a classically trained baritone: "naïve falsetto", "countertenor falsetto", "lyrical chest" and "full chest". Laryngeal configuration and vocal fold behavior in these qualities were studied using laryngeal videostroboscopy, videokymography, electroglottography, and sound spectrography. The data suggest that the four voice qualities were produced by independently manipulating mainly two laryngeal parameters: (1) the adduction of the arytenoid cartilages and (2) the thickening of the vocal folds. An independent control of the posterior adductory muscles versus the vocalis muscle is considered to be the physiological basis for achieving these singing voice qualities.
C24. Christian T. Herbst, Josef Schlömicher-Thier (2009). Stimme - Ausdrucksmittel und Werkzeug im Kunstbetrieb. Symposium: Internationales Theaterinstitut der UNESCO - Centrum Österreich, Vienna, Austria. March 28, 2009.
C23. Jan G. Švec, Christian T. Herbst, Radovan Havlik, Jaromir Horacek, P. Krupa, M. Lejska, Donald G. Miller (2008). Singer's formant: Preliminary results of MRI and acoustic evaluations of singers. Proceedings Interaction and Feedbacks 2008, Prague:Institute of Thermomechanics AS CR, November 2008. presented by Jan G. Švec.
C22. Christian T. Herbst, Elke Duus (2008). Stimmliche Leistungsbeurteilung von SängerInnen im Amateurchor. 8th Voice Symposium Salzburg, Austrian Voice Institute, Salzburg, Austria. July 27, 2008.
C21. Christian T. Herbst (2008). Pressen, behauchtes Singen und Registerdivergenzen - Einfluss der glottischen Konfiguration auf das Timbre im klassischen Gesang. 8th Voice Symposium Salzburg, Austrian Voice Institute, Salzburg, Austria. July 26, 2008.
C20. Christian T. Herbst (2008). Einfluss der Stimmlippentätigkeit auf das Gesangstimbre. Guest Lecture, Hochschule für Musik, Köln, June 11, 2008.
C19. Christian T. Herbst, Josef Schlömicher-Thier (2007). Visualization and Analysis of Electroglottographic Waveforms. XVI Annual PVSF/UCLA Voice Conference, Los Angeles, CA. October 25, 2007. presented by Josef Schlömicher-Thier.
C18. Christian T. Herbst (2007). Glottal Contact in Singing. 5th international logopedics and phoniatrics course 'THE ARTISTIC VOICE' (invited lecture), La Voce Artistica, Ravenna, October 18, 2007.
C17. Christian T. Herbst (2007). Kehlkopfkonfigurationen beim Singen. Symposium: Stimmbildung in Knaben-, Mädchen- und gemischten Kinderchören (invited lecture), Universität der Künste Berlin, Germany, October 4, 2007.
C16. Christian T. Herbst, Jan G. Švec (2007). Is the degree of posterior glottal adduction relevant for "voix mixte" phonation? 7th Pan European Voice Conference (PEVOC), Groningen, The Netherlands. September 1, 2007.
In the context of a voice coaching situation, a 52-year-old semi-professional baritone was found to have a limited upper range. Starting at a pitch of about C4, the phenomenon of divergent registers occurred: with increasing pitch, phonation was only possible in either loud chest voice or falsetto. Messa-di-voce exercises in the range between Bb3 and F#4 exhibited register breaks in both the crescendo and decrescendo. The full chest voice reached its pitch limit at about 365 Hertz. Even though a voice range profile revealed a dynamic range from 72 to 114 dB at pitches from C4 to F4, the region between 94 and 102 dB could hardly be used for artistic purposes at those pitches.

An a priori stroboscopic examination revealed a habitual increase of the degree of posterior glottal adduction at pitches at and above C4. It was hypothesized that a lesser degree of posterior glottal adduction could increase the artistically usable pitch and dynamic range.

In order to test this hypothesis, the baritone was asked to sing various exercises while having constant visual feedback through videostroboscopic imaging. The arytaenoids were targeted to be spread slightly apart during phonation in the upper range. The targeted acoustic quality was described as 'almost breathy'.

When phonating in the upper pitch range with a lesser degree of posterior glottal adduction, the cartilaginous portion of the vocal folds became visible. The arytaenoids changed their position, suggesting active participation of the posterior cricoarytenoid muscle (PCA). The ventricular folds were slightly more retracted, thus widening the epilaryngeal tube.
Immediately after the session with the visual feedback, the baritone was able to reach pitches as high as Bb4 without audible register breaks. The SPL was within the previously missing range of 90 to 100 dB. Electroglottographic evidence revealed a decreased waveform width, signifying a decreased duration of glottal closure as opposed to phonation with a high degree of posterior glottal adduction.

The findings suggest that posterior glottal adduction is an important physiological parameter in singing. It allows achieving a specific voice quality which is perceptually and dynamically between the chest and falsetto registers and thus could be considered to correspond to that of a 'voix mixte' tone production.
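The abstract above mixes note names (C4, Bb3, F#4, Bb4) with a frequency in Hertz (about 365 Hz). As a reading aid, the following minimal sketch converts between the two. It assumes twelve-tone equal temperament and a reference of A4 = 440 Hz, neither of which is stated in the abstract; the function names are hypothetical.

```python
import math

# Semitone offsets of the natural note names within an octave (C = 0).
_NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
_SHARP_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def note_to_hz(note, a4_hz=440.0):
    """Convert a note name such as 'C4', 'F#4' or 'Bb3' to a frequency in Hz.

    Assumes twelve-tone equal temperament tuned to a4_hz (an assumption,
    not taken from the abstract).
    """
    letter, rest = note[0].upper(), note[1:]
    accidental = 0
    if rest.startswith("#"):
        accidental, rest = 1, rest[1:]
    elif rest.startswith("b"):
        accidental, rest = -1, rest[1:]
    octave = int(rest)
    # MIDI-style note number: C4 = 60, A4 = 69.
    midi = 12 * (octave + 1) + _NOTE_OFFSETS[letter] + accidental
    return a4_hz * 2.0 ** ((midi - 69) / 12)


def hz_to_nearest_note(freq_hz, a4_hz=440.0):
    """Return the name of the equal-tempered note closest to freq_hz."""
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return f"{_SHARP_NAMES[midi % 12]}{midi // 12 - 1}"


if __name__ == "__main__":
    for name in ("Bb3", "C4", "F4", "F#4", "Bb4"):
        print(f"{name}: {note_to_hz(name):.1f} Hz")
    print("365 Hz is closest to", hz_to_nearest_note(365.0))
```

Under these assumptions, 365 Hz lies closest to F#4, the upper end of the messa-di-voce range mentioned in the abstract.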
C15. Christian T. Herbst, David Howard, Josef Schlömicher-Thier (2007). Using electroglottographic real-time feedback to control posterior glottal adduction during phonation. 7th Pan European Voice Conference (PEVOC), Groningen, The Netherlands. September 1, 2007.
The goal of this pilot study was to determine whether the ability to change the degree of posterior glottal adduction during phonation can be acquired more easily with the aid of electroglottographic real-time feedback.

A 37-year-old untrained female chorister was asked to participate in the experiment. The initial perceptual evaluation of her singing voice revealed extremely breathy phonation, regardless of pitch and loudness. During the experiment, phonation was monitored simultaneously with videostroboscopy, electroglottography and audio recording. While phonating, the chorister saw the normalized electroglottographic waveform representing one glottal cycle, continuously updated over time. After an initial 'placebo' phase, the actual relevance of the EGG waveform was explained to the chorister. The assignment was to increase the width of the EGG waveform during phonation. Data were collected for sustained notes at pitches of B3, B4 and G5.

Laryngeal imaging revealed a considerable posterior glottal chink during habitual phonation. No considerable changes of phonatory quality could be documented for the 'placebo' phase. Once the relevance of the EGG waveform had been made clear to the chorister, visual, acoustic and electroglottographic evidence suggested that the subject was able to make intentional changes to the laryngeal configuration during phonation: an increase of the EGG waveform width coincided with an increase of high-frequency partials and an increase of posterior glottal adduction. For pitches B3 and B4, a full glottal closure could be achieved. At G5, a reduction of the posterior glottal chink occurred.

The findings of this study suggest that (a) the skill to control the degree of posterior glottal adduction can be acquired, and that (b) electroglottographic real-time feedback can be a crucial element in optimizing the process of skill acquisition, but only if the context and nature of the feedback is explained.
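The feedback display described above shows one glottal cycle of the EGG signal, normalized, and the task refers to the "width" of that waveform. The abstract does not say how either was computed. The sketch below is one plausible reading, offered as an assumption only: the cycle is resampled to a fixed number of points by linear interpolation, scaled to the range 0..1 using its own minimum and maximum, and the width is taken as the fraction of the cycle at or above a threshold (50% here, chosen arbitrarily). All names are hypothetical.

```python
def normalize_cycle(samples, n_points=100):
    """Normalize one glottal cycle in time and amplitude.

    Time: resample to n_points using linear interpolation.
    Amplitude: scale by the cycle's own minimum and maximum to 0..1.
    This is an illustrative assumption, not the method used in the study.
    """
    if len(samples) < 2 or n_points < 2:
        raise ValueError("need at least two samples and two output points")
    lo, hi = min(samples), max(samples)
    if hi == lo:
        raise ValueError("a flat signal cannot be amplitude-normalized")
    last = len(samples) - 1
    normalized = []
    for i in range(n_points):
        # Position of output point i on the original sample axis.
        pos = i * last / (n_points - 1)
        left = int(pos)
        right = min(left + 1, last)
        frac = pos - left
        value = samples[left] * (1 - frac) + samples[right] * frac
        normalized.append((value - lo) / (hi - lo))
    return normalized


def waveform_width(normalized, threshold=0.5):
    """Fraction of the normalized cycle at or above the threshold (hypothetical measure)."""
    return sum(1 for v in normalized if v >= threshold) / len(normalized)


if __name__ == "__main__":
    # Synthetic triangular "cycle", for illustration only (not measured data).
    cycle = [0, 1, 2, 3, 4, 3, 2, 1, 0]
    norm = normalize_cycle(cycle, n_points=9)
    print(norm)
    print("width:", waveform_width(norm))
```

Because every cycle is mapped onto the same time and amplitude axes, cycles sung at different pitches and loudness levels can be compared by shape alone, which is presumably what makes such a display usable as feedback.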
C14. Christian T. Herbst (2007). Der Knabensolist in der Oper - Akustisches Portrait eines musikalischen Hochleistungssportlers. 75. Kongress der Deutschen Gesellschaft für Sprach- und Stimmheilkunde (DGSS), April 21, 2007.
C13. Gerhard Schmidt-Gaden, Christian T. Herbst, Ernst L. Schmid (2007). Talentschmiede Knabenchor? 19. Jahreskongress des Bundesverbandes Deutscher Gesangspädagogen (BDG), April 20, 2007.
C12. Christian T. Herbst, Josef Schlömicher-Thier (2007). Stimmbildung und Stimmstörungsprävention. Internationale Tagung "Die Stimme Heute", Zentralkrankenhaus Bozen, Abteilung HNO, January 26, 2007. presented by Christian T. Herbst.
C11. Christian T. Herbst (2006). Stimmbandschluss im klassischen Gesang. Grazer Stimmtage, Hals-, Nasen-, Ohren-Universitätsklinik Graz, Klinische Abteilung für Phoniatrie, November 25, 2006.
C10. Christian T. Herbst (2006). Physiologische Vorgänge beim Registerausgleich der Knabenstimme. 1. Internationales Symposium für Kinderstimmbildung, Freunde der Wiener Sängerknaben, November 4, 2006.
C9. Christian T. Herbst (2006). Acoustic Principles of Voice Production - a Tutorial. 7th Voice Symposium Salzburg, Austrian Voice Institute, August 4, 2006.
C8. Josef Schlömicher-Thier, Phillip Janssen, Christian T. Herbst (2006). Phonetogram: Architecture of Speaking and Singing Voice. 3rd World Voice Conference, Istanbul, Turkey. June 22, 2006.
C7. Christian T. Herbst, Josef Schlömicher-Thier, Matthias Weikert (2006). Voice Disorders in Childhood & Management of Mutational Problems in Choirboys. 3rd World Voice Conference, Istanbul, Turkey. June 21, 2006.
C6. Christian T. Herbst, Jan G. Švec (2006). Investigation of four distinct glottal configurations in a classically trained male singer. 3rd physiology and acoustics of singing conference (PAS3-06), University of York, York, U.K. May 11, 2006.
C5. Christian T. Herbst (2006). Untersuchung der Sängerstimme mittels Elektroglottographie (Workshop). 6. Wiener gesangswissenschaftliche Tagung, Institut Antonio Salieri, Universität für Musik und darstellende Kunst, Wien, Vienna, Austria. January 14, 2006.
C4. Christian T. Herbst (2005). The Singer's Voice: Glottal Configurations and Voice Source Properties. Seminar, School of Arts, Culture & Environment, University Edinburgh, Edinburgh, UK. October 27, 2005.
C3. Christian T. Herbst, Sten Ternström (2005). A comparison of different methods for measuring the electroglottographic contact quotient. 6th Pan European Voice Conference (PEVOC), London, UK. September 2, 2005.
C2. Christian T. Herbst (2005). Die Singstimme als physikalisch-akustisches System. Seminar, Musikum Salzburg, Salzburg, Austria. January 22, 2005.
C1. Christian T. Herbst (2004). The EGG Contact Quotient as a Means of Assessing Vocal Registration Quality in Classical Singing. Seminar, Dept. of Speech, Music and Hearing, Royal Institute of Technology, Stockholm, Sweden. August 24, 2004.
Television & Radio
M22. Matheo Duarte Sierra (2016). Christian T. Herbst, el cientifico que hizo un analisis de la voz de Freddy Mercury. La hora del regreso, Radio W Colombia, October 18, 2016
M21. Paul Lohberger (2016). Radiokolleg "Paul Simon - Der Grandseigneur der Popmusik". Österreichischer Rundfunk Ö1, October 11, 2016
M20. Michaela Graichen (2016). The science of Freddie Mercury's voice. BBC Newshour, April 26, 2016
M19. Gabe O'Connor (2016). Why Freddie Mercury's voice was so great - as explained by science. NPR, April 25, 2016
M18. Jim Drury (2014). Vienna study gives voice to elephant rumblings. Reuters TV, February 18, 2014
M17. Rainer Rosenberg (2013). Von Tag zu Tag: Vom Gesang zur Stimmforschung. Die Abenteuer des Biophysikers Christian Herbst (live broadcast, 30 min.). Österreichischer Rundfunk Ö1, July 25, 2013
M16. Uli Pförtner (2013). Operation Dolittle - mit Tierstimmenforschern unterwegs. ARTE, April 18, 2013
M15. Michael de Werd (2012). Olifantentaal - De Ochtend. VRT - Radio 1, August 26, 2012
M14. Christine Ricken (2012). Große Tiere, tiefe Töne. Neue Erkenntnisse über die Sprache der Elefanten. SWR2 Impuls, August 21, 2012
M13. Josef P. Glanz (2012). ZIB Flash. ORF, August 7, 2012
M12. Josef P. Glanz (2012). Salzburg Heute - Elefantenforscher. ORF, August 7, 2012
M11. Josef P. Glanz (2012). Heute in Österreich - Elefantensprache entschlüsselt. ORF, August 7, 2012
M10. Miriam Stumpfe (2012). Brummende Elefanten - Was die Dickhäuter über die Entstehung der Stimme verraten. BR5 - Aus Wissenschaft und Technik, August 5, 2012
M9. Kerry Klein (2012). Science Podcast: How Elephants Vocalize. AAAS, August 3, 2012
M8. Arndt Reuning (2012). Stimmbildung bei Elefanten - Interview mit Christian Herbst, Uni Wien. Deutschlandfunk - Forschung Aktuell, August 3, 2012
M7. Martina Preiner (2012). Elefanten sprechen wie Menschen - Laute entstehen durch Luftstrom durch Stimmlippen. WDR5 - Leonardo, August 3, 2012
M6. Paul Lohberger (2011). Radiokolleg - Die Stimme als Instrument. Österreichischer Rundfunk Ö1, September 5, 2011
M5. Paul Lohberger (2011). Radiodoktor - Das Ö1 Gesundheitsmagazin: Stimmbildung - Therapie und Körpererfahrung. Österreichischer Rundfunk Ö1, March 23, 2011
M4. Katrin Müller-Höcker (2010). Vibrierende Muskeln, klingender Atem - Wie funktioniert das Wunderwerk Stimme? Bayern 2, November 28, 2010
M3. Paul Lohberger (2010). Radiokolleg - Queen. Österreichischer Rundfunk Ö1, May 4, 2010
M2. Paul Lohberger (2009). Radiokolleg - Die Stimme. Österreichischer Rundfunk Ö1, April 29, 2009
M1. Bayerischer Rundfunk (2008). Engelsgleich - Über die Physik der Knabenstimme. Bayern 4 Klassik, April 30, 2008