Vision helps to understand the type of sentence that is being produced

2025.06.20.
Study by Luma da Silva Miranda (ELTE), João Antônio de Moraes and Albert Rilliard.

Speech comprehension studies, in general, consider only information coming from the auditory channel. However, recent analyses of speech prosody from a multimodal perspective indicate that the visual channel also contributes to identifying the utterance types produced in a verbal interaction.

This was the research topic of the article “Visual channel facilitates the comprehension of the intonation of Brazilian Portuguese wh-questions and wh-exclamations: evidence from congruent and incongruent stimuli”, which analyzed two sentence types in Brazilian Portuguese, the wh-question and the wh-exclamation, in a perceptual experiment. In the experiment, the auditory and visual channels either conveyed acoustic and visual cues of the same sentence type or conveyed cues of the two sentence types in a crossed manner: for example, the intonation of the wh-question paired with the facial expression of the wh-exclamation, and vice versa.

The perceptual experiment was conducted online with a total of thirty-six Brazilian participants. Statistical analysis of the data showed that the perceptual identification of sentence types is facilitated when participants are presented with congruent stimuli, that is, acoustic and visual cues of the same speech act. On the other hand, when the acoustic cues do not match the visual cues, comprehension is hindered.

Another interesting result of the study is that wh-questions were recognized better than wh-exclamations. This is likely related to the fact that wh-exclamations in Brazilian Portuguese can be produced with different intonational contours, depending on the type of wh-word used at the beginning of the sentence, while wh-questions have the same intonational contour regardless of the type of wh-word.

The results of this study indicate that human communication is multimodal: listeners combine the information from the auditory and visual channels that express a speaker's communicative intentions in order to understand the sentence types the speaker produces. In terms of language, this study corroborates what has already been verified in conversation analysis research. In a dialogical interaction, we need cues that let us anticipate the speaker's communicative intentions, especially in the case of questions, so that we can prepare our response quickly and efficiently: as soon as the speaker finishes their turn, an answer should follow to keep the interaction flowing without a breakdown in communication. Visual cues combined with auditory cues help this communicative process.


Miranda, Luma da Silva, João Antônio de Moraes, and Albert Rilliard. “Visual Channel Facilitates the Comprehension of the Intonation of Brazilian Portuguese Wh-Questions and Wh-Exclamations: Evidence from Congruent and Incongruent Stimuli.” Language and Cognition, 2024, 1–21.