716: Class 3.1 (Sep 16, 2019): Psychoacoustic model overview, how to review a paper

Presentation of exercise: computing spectral tilt measures in Praat

  1. Determine H1-H2, H1-A1, H2-H4, H4-2 kHz, and 2 kHz-5 kHz for the sound files in one of the following directories in the Box folder in 2/:
    1. gujarati_examples/ (breathy vs. modal)
    2. laver/ (modal, creaky, breathy)
    3. yi_examples/ (tense vs. lax, two tones)
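
For reference, here is a minimal Python sketch of what these five measures amount to once f0 is known. It is not the Praat procedure for the exercise. The function names, the 10% search band around each harmonic, the assumed F1 of 500 Hz, and the synthetic test signal are all illustrative assumptions. The values are uncorrected, i.e. no formant correction is applied (these are not the starred H1*-H2* versions).

```python
import numpy as np

def spectrum_db(signal, sample_rate):
    """Magnitude spectrum in dB of a Hann-windowed signal."""
    windowed = signal * np.hanning(len(signal))
    magnitudes = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return freqs, 20.0 * np.log10(magnitudes + 1e-12)

def harmonic_db(freqs, mags_db, f0, k, tolerance=0.1):
    """Peak level (dB) within +/- tolerance*f0 of the k-th harmonic (assumed band)."""
    lo, hi = (k - tolerance) * f0, (k + tolerance) * f0
    band = (freqs >= lo) & (freqs <= hi)
    return mags_db[band].max()

def nearest_harmonic(f0, target_hz):
    """Number of the harmonic closest in frequency to target_hz."""
    return max(1, int(round(target_hz / f0)))

def spectral_tilt_measures(signal, sample_rate, f0, f1):
    """Uncorrected harmonic amplitude differences, in dB."""
    freqs, mags_db = spectrum_db(signal, sample_rate)

    def h(k):
        return harmonic_db(freqs, mags_db, f0, k)

    k_f1 = nearest_harmonic(f0, f1)      # A1: harmonic nearest F1
    k_2k = nearest_harmonic(f0, 2000.0)  # harmonic nearest 2 kHz
    k_5k = nearest_harmonic(f0, 5000.0)  # harmonic nearest 5 kHz
    return {
        "H1-H2": h(1) - h(2),
        "H1-A1": h(1) - h(k_f1),
        "H2-H4": h(2) - h(4),
        "H4-2kHz": h(4) - h(k_2k),
        "2kHz-5kHz": h(k_2k) - h(k_5k),
    }

# Synthetic test signal (an assumption, not course data): 60 harmonics of
# 100 Hz, each 1 dB weaker than the one below it, with no formants.
sample_rate = 16000
f0 = 100.0
t = np.arange(sample_rate) / sample_rate
signal = sum(10 ** (-k / 20.0) * np.sin(2 * np.pi * k * f0 * t)
             for k in range(1, 61))

measures = spectral_tilt_measures(signal, sample_rate, f0, f1=500.0)
for name, value in measures.items():
    print(f"{name}: {value:.2f} dB")
```

In real recordings f0 and F1 have to be estimated first, and the harmonics are shaped by the vocal tract, so the numbers will not be as tidy as in this formant-free example.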

Discussion of psychoacoustic model overview

We'll go through the reading questions and your answers.

Defining voice quality

"Voices are processed as patterns, and not as bundles of features": Is this why we can still recognize a certain voice when the person is sick, congested, etc.? Is there any feature that contributes significantly to characterizing a voice?

1)    In Section 1 the author points out that the main parameter in voice therapy for voice disorders is perceived voice quality. Voice disorders are usually linked to a production issue; however, the most important outcome for therapy is the perception. The model she presents intends to link the changes in production to the resulting perceived changes in voice quality. Could this model be systematized for clinical purposes?

Voice quality rating example

Spectral envelopes and slopes

"Spectral envelopes were estimated by connecting the harmonic peaks, and seventy equally-spaced points were chosen along each envelope." I feel this is an important aspect of understanding how the researchers found the acoustically relevant parameters, but I can't picture it. It would be helpful if we could go over this.

What is the "overall spectral slope" and how is it calculated?

I also had a question about spectral envelopes: I can't picture what they are. Also, how is "the slope from 1.5kHz to 4kHz" two factors?
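
One way to picture the envelope construction quoted above is the short Python sketch below. It makes two assumptions the quotation does not settle: that "connecting the harmonic peaks" means straight-line (linear) interpolation between neighboring peaks, and that "equally-spaced" means equally spaced in frequency (Hz). The function name and the example harmonic levels (a source falling 12 dB per octave) are hypothetical.

```python
import numpy as np

def envelope_points(harmonic_freqs, harmonic_levels_db, n_points=70):
    """Sample a piecewise-linear envelope through the harmonic peaks.

    Assumes linear interpolation between adjacent peaks and points that
    are equally spaced in Hz between the first and last harmonic.
    """
    harmonic_freqs = np.asarray(harmonic_freqs, dtype=float)
    harmonic_levels_db = np.asarray(harmonic_levels_db, dtype=float)
    sample_freqs = np.linspace(harmonic_freqs[0], harmonic_freqs[-1], n_points)
    sample_db = np.interp(sample_freqs, harmonic_freqs, harmonic_levels_db)
    return sample_freqs, sample_db

# Hypothetical harmonic peaks: f0 = 100 Hz, 50 harmonics, -12 dB per octave.
harmonic_numbers = np.arange(1, 51)
peak_freqs = 100.0 * harmonic_numbers
peak_levels_db = -12.0 * np.log2(harmonic_numbers)

env_freqs, env_db = envelope_points(peak_freqs, peak_levels_db)
print(len(env_freqs), env_freqs[0], env_freqs[-1])
```

Each voice is then represented by the same number of envelope values (here 70), regardless of its f0 or how many harmonics it has, which makes envelopes from different voices directly comparable.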

3)    How exactly are the parameters in Table 1 calculated?

Perceptual evidence for psychoacoustic model/methods

3) In reference to: "As a result, we removed the parameter H4-5 kHz from the model and replaced it with two new parameters: the spectral slope from the fourth harmonic to the harmonic nearest 2 kHz in frequency (H4-2 kHz) and the spectral slope from that harmonic to the harmonic nearest 5 kHz in frequency (2 kHz-5 kHz)."  I don't understand why these parameters in particular were removed in order to create more detail in the source spectrum. 

2)    Section 3 explains how they determined the parameters to which listeners could be sensitive. They were trying to determine whether listeners can distinguish between the natural voice and the synthesized voice with modified parameters. Despite the explanation, I was not sure why they decided to modify the source spectrum above H4 and not other regions.

In Table 2, can each parameter be translated into a certain voice quality? Or can they not, since perception is integral? In the latter case, does it mean that listeners can tell it's "different" but can't tell "how it's different"?

5)    Why is it necessary to have the values of H1*-H2*?

Articulatory/physiological model

Could we go over what the two-layer mechanical vocal fold model is and how it works? 

What do they mean by "physiological precursors"? SY: It probably means the movement(?) of the articulators, but yes, I am also curious why they used the specific word "precursors".

"It is not surprising that speakers would have a variety of phonatory strategies available to them for manipulating H1–H2 in speech": Does this mean that a single parameter, H1-H2, is connected to multiple physiological actions (or movements, or gestures)?

5) What are "left-right stiffness mismatches" and what do they have to do with the model the paper is talking about?

4)    When modeling studies are explained in Section 5, examples of mechanical and computational models are presented, as well as an ex vivo model. She mentions that the use of this last model is in its early stages, but I wonder whether any experiments have already been published. If so, how could they link this knowledge about production with listener perception?

Papers to choose from to review/present next week

  1. Voice quality and tone identification in White Hmong. Garellek, Keating, Esposito and Kreiman (2013)
  2. Measures of the glottal source spectrum. Kreiman, Gerratt and Antonanzas-Barroso (2007)
  3. Modeling the voice source in terms of spectral slopes. Garellek, Samlan, Gerratt and Kreiman (2015); see also POMA proceedings and POMA poster
  4. Perceptual evaluation of voice source models. Kreiman, Garellek, Chen, Alwan, and Gerratt (2015)
  5. Perceptual sensitivity to first harmonic amplitude in the voice source. Kreiman and Gerratt (2010); see also follow-up article

How to review a paper

Recommendations

Specific questions to answer

Sometimes the journal will give you a set of questions to answer. Other times there are no such questions given. Here are some examples:

Journal of the International Phonetic Association

  1. Is the manuscript suitable for publication in JIPA?
  2. Is the study methodologically sound?
  3. Are the results adequately and clearly illustrated by means of figures and tables?
  4. Does the language of the manuscript need attention (e.g. in terms of grammar or vocabulary)?

Journal of the Acoustical Society of America

  1. Is the manuscript of good scientific quality, free from errors, misconceptions or ambiguities; does it present original work; and does it contain sufficient new results, new applications or new developments of reasonable enough significance to warrant its publication in JASA? Please indicate in your report (in detailed comments, below) any points which are objectionable or which need attention.
  2. Is JASA an appropriate journal in which to present this work? In this regard, please consider carefully the commitment of JASA to publish work that is within the scope of Acoustics. Does the content of the manuscript, including terminology and the references cited, meet this criterion?
  3. Is the manuscript a clear, concise, reasonably self-contained presentation of the material, giving adequate references to related work? Is the English satisfactory? Please indicate needed changes in your report.
  4. Are the tables and figures clear and relevant, and are the captions adequate? Are there either too many or too few? If any of the figures are in color, is the color essential for conveying the scientific point?
  5. If Supplementary material was submitted, is it relevant to the manuscript and should it be deposited in the Supplemental Depository for reference to the manuscript?
  6. Does the paper make effective use of journal space, or are parts unnecessary, unimportant, or subject to condensation? If so, which?
  7. Is the title appropriate and the abstract adequate for verbatim reproduction in abstract journals? IMPORTANT: The lead paragraph should advertise the main points of the article and must describe in terms accessible to the general reader the context and significance of the research problem studied and the importance of the results.

Journal of Phonetics

  1. Is the subject matter suitable for publication in the Journal of Phonetics?
  2. Is this a new and original contribution?
  3. Is the discussion of the literature appropriate and adequate?
  4. Is the research presented methodologically sound?
  5. Are the interpretations and conclusions sound and justified by the data?
  6. Are the presentation, organization, and length satisfactory?

Other comments

Then there will be a section for comments to the author/editor, and typically also a section for comments to the editor only (the author doesn't see this).

In my comments to the author, I generally separate out my review into a Major Points section and then a Minor Points section.

Major points: I start with a paragraph that summarizes what I think the paper is about. This is useful both for the editor and for the author, who can see whether what I think the paper is about matches what they think it is about. I make sure to say something positive about what the paper contributes. Then I go through the major revisions I think are needed.

Minor points: Here I point out smaller things like typos, minor suggestions about figures and tables, and sentences that were unclear to me.