Objective: To obtain dynamic images of articulators using a magnetic resonance imaging (MRI) movie and to clarify the relationships among the articulators.
Materials and Methods: The subjects consisted of 10 volunteers. Custom-made circuitry was connected to an MRI apparatus to enable an external trigger pulse to control the timing of the scanning sequence and to provide an auditory cue for synchronization of the subject's utterance. The subject repeated a bilabial plosive, and the run was measured using a gradient echo sequence with a repetition time of 30 ms. Several variables were defined to delineate the individual movements of articulators and to determine the temporal relationships among them.
Results: It was found that (1) the change in these variables showed distinctive waveforms; (2) mean values of the standard deviations for these variables were relatively small; and (3) the movement of the velum was significantly correlated with those of the lips and the anterior part of the tongue, but not with the posterior part of the tongue.
Conclusions: These results suggest that (1) articulatory movements were clearly recorded using an MRI movie, and (2) there seems to be a central mechanism for controlling articulators, and the level of coupling may be associated with the place of articulation.
INTRODUCTION
Oropharyngeal structures such as the lips, tongue, velum, and vocal cords move to interact with the teeth and contribute to the production of sound, ie, articulation. Articulation can be classified according to both manner (eg, stops, fricatives) and place (eg, bilabial, alveolar). A stop is a sound made with closure in the oral cavity, and there are three classes of stops: plosive, affricate, and nasal. To make a plosive stop, a complex coordination of articulators must be performed with spatial and temporal integration. First, the lips are sealed so that no air can get out of the mouth. Second, the velum is raised so that no air can get out of the nose. Third, air is pumped from the lungs, which builds up pressure in the oral cavity to supra-atmospheric pressure. Finally, the lip seal is opened, allowing the pressurized air to flow out in a turbulent burst.
It has been reported that there is a significant correlation between anterior malocclusion and errors in sound production.1 On the other hand, Johnson and Sandy2 reviewed the literature and concluded that while certain dental irregularities show a relationship with speech disorders, the disorders did not appear to correlate with the severity of malocclusion. Controversies regarding the relationship between malocclusion and speech may be partly explained by adaptation and compensation by the subject. For example, subjects with cleft lip and palate frequently exhibit velopharyngeal insufficiency, which leads to compensatory sound production. To clarify the interaction between normal/abnormal oropharyngeal structures and adaptive/nonadaptive oropharyngeal function, it is necessary to use a noninvasive technique with high spatial and temporal resolution.
Recent advances in magnetic resonance imaging (MRI) have extended its use from producing static images of oropharyngeal structures to producing dynamic images of oropharyngeal function, including temporomandibular joint,3–6 swallowing,7–9 and articulatory10–15 movements. Some MRI studies on the temporomandibular joint used titrated bite blocks to obtain several static images, which were displayed sequentially as a loop to simulate dynamic movement.3–5 Other studies obtained dynamic images during actual motion, but contamination by noise (ie, motion artifacts) is inevitable at coarse temporal resolutions of over 100 ms.6–9,11,13 Based on a modification of the method proposed by Masaki and colleagues,10 the usefulness of an MRI movie in the evaluation of speech was previously reported.14 However, the subjects were required to repeat a given sound 128 times, which may not be suitable for naïve subjects, especially children.14 Moreover, the signal-to-noise ratio may deteriorate due to fluctuations in timing caused by neuromuscular fatigue. In a subsequent study, the number of repetitions was reduced from 128 to 36 with a temporal resolution of 30 ms.16
The purpose of this study was twofold: (1) to obtain high-resolution dynamic images of oropharyngeal structures related to sound production using an MRI movie, and (2) to clarify the relationship among articulators during production of a bilabial plosive, with special attention to the place of articulation.
MATERIALS AND METHODS
The subjects consisted of 10 healthy women between the ages of 24 and 31 years. They had a skeletal Class I relationship and no malocclusion. All subjects reported negative neurologic and developmental histories and exhibited no obvious speech difficulties as judged by the experimenter. None had a cold, allergic rhinitis, or an ongoing respiratory tract infection at the time of the assessment. They showed no symptoms of temporomandibular disorder. All of the experimental procedures complied with the Code of Ethics of the World Medical Association (Declaration of Helsinki) and were approved by the institutional ethics committee. Written informed consent was obtained from each subject.
Custom-made circuitry was connected to a 1.5T MRI apparatus (Magnetom Vision, Siemens AG, Erlangen, Germany), which was equipped with a head and neck coil, to enable an external trigger pulse to control the timing of the scanning sequence and to provide an auditory cue for synchronization of the subject's utterance, which was recorded simultaneously. Each image had a 219- × 250-mm field of view with a pixel size of 2.03 × 1.95 mm (slice thickness: 5 mm); the matrix size was 108 × 128. The subject was in the supine position and repeatedly whispered a vowel-consonant-vowel syllable (ie, /apa/) of the bilabial plosive, in a synchronized manner in response to the auditory cue. Each run was measured using a gradient echo sequence for a cardiac cine (repetition time [TR], 30 ms; echo time, 4.8 ms; flip angle, 30°). During data acquisition, an external trigger pulse was fed to the MRI scanner 36 times. The time required for data acquisition was approximately 1 min. Articulatory movements in the midsagittal plane were observed as a movie using the same method as in cardiac MRI (Figure 1A). The protocol used to construct the MRI movie is described elsewhere.14,16 To delineate the individual movement of articulators and to determine the temporal relationships among those articulators, several linear and angular variables were analyzed in the static image of the MRI movie after the following landmarks were defined (Figure 1B):
Tt: tip of the tongue
Tv: tip of the velum
C2ip: most inferoposterior point of the second cervical vertebral body
Tp: intersection of the posterior contour of the tongue and the St-C2ip line
Linear and Angular Variables
IL: inter-lip distance on the line perpendicular to the St-C2ip line (mm)
TA: distance between Tt and C2ip (mm)
TP: distance between Tp and C2ip (mm)
Vp: angle between the Tsp-C2ip line and the St-C2ip line (degrees)
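As an illustration of how such variables could be computed from digitized landmarks, the following minimal sketch derives a distance (as for TA or TP) and an angle at C2ip (as for Vp) from pixel coordinates. The landmark coordinates are invented for illustration, the assignment of the 2.03-mm and 1.95-mm pixel dimensions to the x and y axes is an assumption, and the helper names are hypothetical. The actual measurements in this study were made in ImageJ.

```python
import math

# Pixel size in mm, taken from the imaging parameters in the text.
# Which image axis carries 2.03 mm and which 1.95 mm is an assumption.
PIXEL_MM_X = 2.03
PIXEL_MM_Y = 1.95


def to_mm(point_px):
    """Convert an (x, y) pixel coordinate to millimetres."""
    x, y = point_px
    return (x * PIXEL_MM_X, y * PIXEL_MM_Y)


def distance_mm(p_px, q_px):
    """Euclidean distance in mm between two landmarks given in pixels."""
    return math.dist(to_mm(p_px), to_mm(q_px))


def angle_deg(vertex_px, a_px, b_px):
    """Unsigned angle in degrees at `vertex` between the lines to a and b."""
    vx, vy = to_mm(vertex_px)
    ax, ay = to_mm(a_px)
    bx, by = to_mm(b_px)
    ux, uy = ax - vx, ay - vy
    wx, wy = bx - vx, by - vy
    cross = ux * wy - uy * wx
    dot = ux * wx + uy * wy
    return abs(math.degrees(math.atan2(cross, dot)))


# Hypothetical pixel coordinates for one frame (not measured data).
landmarks = {
    "Tt": (30, 60),
    "Tp": (62, 70),
    "Tsp": (55, 48),
    "St": (40, 52),
    "C2ip": (80, 75),
}

ta = distance_mm(landmarks["Tt"], landmarks["C2ip"])
tp = distance_mm(landmarks["Tp"], landmarks["C2ip"])
vp = angle_deg(landmarks["C2ip"], landmarks["Tsp"], landmarks["St"])
print(f"TA = {ta:.1f} mm, TP = {tp:.1f} mm, Vp = {vp:.1f} degrees")
```

Converting to millimetres before computing the angle matters because the pixels are not square; an angle computed directly in pixel units would be slightly distorted.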
Variables in static images of the MRI movie for each subject were measured five times on different days by the same investigator using software (ImageJ 1.34s, NIH, Bethesda, Md), and means and standard deviations were calculated. Two-tailed Spearman's rank correlation coefficient was used to determine the correlations between the timing of the movement of articulators and between the peak angulation and peak angular speed of Vp. Statistical significance was established at P < .05.
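For readers unfamiliar with the statistic, the sketch below shows one way Spearman's rank correlation coefficient can be computed: values are replaced by their ranks (ties receive the average rank) and Pearson's correlation is then applied to the ranks. The timing values are hypothetical and are not the study data, and the function names are illustrative. The sketch does not compute a P value, which would normally come from a statistics package.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical peak timings (ms) for 10 subjects; not the study data.
velum_peak_ms = [1020, 1050, 990, 1080, 1020, 1050, 1020, 1010, 1060, 1030]
lip_seal_ms = [990, 1020, 960, 1020, 990, 990, 1000, 980, 1020, 990]
print(f"rho = {spearman_rho(velum_peak_ms, lip_seal_ms):.3f}")
```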
RESULTS
The spatial and temporal changes in linear and angular measurements of articulators from a representative subject are shown in Figure 2A. The subject showed a change with two positive peaks in IL during the articulation of /apa/. The negative peak at approximately 1000 ms corresponded to the timing of lip-sealing immediately before /pa/-articulation. In TA, a three-tier (ie, a negative-positive-negative sequence) waveform was seen. The first negative peak in TA corresponded to the first positive peak in IL, the positive peak in TA corresponded to the negative peak in IL, and the second negative peak in TA corresponded to the second positive peak in IL. In contrast to TA, TP showed a relatively gross waveform with a three-tier sequence. Vp showed a one-tier waveform for the temporal change during /apa/-articulation. The positive peak corresponded to the timing of the negative peak in IL and also to the positive peak in TA. Similar spatial/temporal patterns for IL, TA, TP, and Vp were also observed in other subjects.
Figure 2B shows a schematic drawing of the temporal changes in IL, TA, TP, and Vp for all subjects. IL started to increase with a mean latency of 420 ± 130 ms (mean ± standard deviation [SD]; LIL) and reached the first positive peak (TIL1) at 789 ± 40 ms. IL then decreased and reached the negative peak (TIL2) at 990 ± 24 ms. IL started to increase again and reached the second positive peak (TIL3) at 1170 ± 28 ms. TA started to decrease with a mean latency of 411 ± 181 ms (LTA) and reached the first negative peak (TTA1) at 837 ± 30 ms. TA then increased and reached the positive peak (TTA2) at 990 ± 40 ms. TA started to decrease again and reached the second negative peak (TTA3) at 1182 ± 40 ms. Similarly, TP started to decrease with a mean latency of 324 ± 95 ms (LTP) and reached the first negative peak (TTP1) at 831 ± 35 ms. TP then increased and reached the positive peak (TTP2) at 996 ± 51 ms. TP started to decrease again and reached the second negative peak (TTP3) at 1176 ± 28 ms. Vp started to increase with a mean latency of 357 ± 118 ms (LVp) and reached the positive peak at 1032 ± 25 ms (TVp).
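Peak timings of this kind can be read off a frame-by-frame measurement series. The sketch below is an assumed procedure, not the authors' documented one: it marks local maxima and minima in a sampled waveform and converts frame indices to time using the 30-ms frame interval. It assumes that the first frame corresponds to 0 ms, and the sample values are hypothetical.

```python
FRAME_INTERVAL_MS = 30  # time resolution of the MRI movie, from the text


def local_extrema(samples):
    """Return (time_ms, kind) for each strict local maximum or minimum.

    Frame 0 is assumed to correspond to 0 ms. Endpoints and flat
    plateaus are not reported as peaks in this simple version.
    """
    peaks = []
    for i in range(1, len(samples) - 1):
        prev, cur, nxt = samples[i - 1], samples[i], samples[i + 1]
        if cur > prev and cur > nxt:
            peaks.append((i * FRAME_INTERVAL_MS, "positive"))
        elif cur < prev and cur < nxt:
            peaks.append((i * FRAME_INTERVAL_MS, "negative"))
    return peaks


# Hypothetical inter-lip distance values (mm), one per frame.
il_mm = [2.0, 2.1, 4.0, 9.5, 12.0, 8.0, 1.0, 0.5, 6.0, 11.0, 9.0, 5.0]
for time_ms, kind in local_extrema(il_mm):
    print(f"{kind} peak at {time_ms} ms")
```

With real data, measurement noise would likely call for smoothing or a minimum peak prominence before extrema are accepted.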
Mean values of standard deviations from five calculations for the four variables in 45 images in 10 subjects are shown in Table 1. Mean values for IL, TA, TP, and Vp were 1.3 mm, 2.3 mm, 1.1 mm, and 3.9°, respectively. These values were relatively small compared to the pixel size.
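The reproducibility summary described above can be sketched as follows: a standard deviation is computed across the five repeated measurements of each image, and those standard deviations are then averaged. The measurement values are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not state which form was used.

```python
import statistics


def mean_repeat_sd(repeats_per_image):
    """Average, over images, of the SD across repeated measurements.

    Uses the sample SD (n - 1 denominator); this choice is an assumption.
    """
    sds = [statistics.stdev(repeats) for repeats in repeats_per_image]
    return statistics.mean(sds)


# Hypothetical TP values (mm): five repeated measurements for three images.
tp_repeats = [
    [21.0, 22.1, 20.4, 21.6, 22.0],
    [18.2, 19.0, 17.5, 18.8, 18.1],
    [25.3, 24.1, 26.0, 25.2, 24.7],
]
print(f"mean SD = {mean_repeat_sd(tp_repeats):.2f} mm")
```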
Mean latencies and standard deviations for the key timings of the four variables are shown in Figure 3. The latencies for IL, TA, TP, and Vp (ie, LIL, LTA, LTP, and LVp, respectively) showed intersubject variation. On the other hand, TVp showed the least intersubject variation.
The relationships between the latency of Vp and those of IL, TA, and TP and among the key timings for the four variables are summarized in Table 2. There was no significant correlation between the latencies of Vp and IL (P = .704), TA (P = .489), or TP (P = .159). On the other hand, there were significant correlations between TVp and TIL2 (P = .006), TTA2 (P = .002), and TTA3 (P = .038), while there were no significant correlations between TVp and TTP1 (P = .881), TTP2 (P = .501), or TTP3 (P = .299). There was a significant correlation (P = .001) between the peak angulation and peak angular speed of Vp (Figure 4).
DISCUSSION
The MRI movie used in the present study offers a technical advantage over conventional cine-loop MRI.3–9,11,13 In conventional cine-loop MRI, the k-space frames are filled one by one, and each k-space frame is made by filling one or several rows at a time according to the number of phase encodes using a delay. A minimum of 100 ms is needed to construct an image even when an ultrafast MRI sequence is used. This duration allows for unexpected motion of the target and results in contamination by noise or blurring. On the other hand, a different method is used to construct static images in an MRI movie.10,14 In our study, all k-space frames are arranged in parallel and the first three rows of the first k-space frame are filled almost simultaneously by the first phase encode.16 The first three rows of the nth k-space frame are then filled 30 × (n − 1) ms later. The final three rows of the first k-space frame are filled almost simultaneously by the last phase encode, and the final three rows of the subsequent k-space frames are filled sequentially every 30 ms. Therefore, the interval required to fill the adjacent k-space frame is equivalent to the time resolution. All reconstructed images are completed concomitantly by repeating this procedure according to the number of phase encodes, and these images are viewed in a loop as actual movement. Consequently, contamination by motion artifacts is reduced in the MRI movie compared to conventional cine-loop MRI.
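The filling order described above can be illustrated with a short scheduling sketch. It assumes one phase-encode step per trigger (36 triggers × 3 rows = 108 rows, which matches the 108-row matrix) and 45 frames per movie (taken from the 45 images mentioned in the Results). Both are inferences from figures reported in this article rather than a documented scanner protocol, and the function name is hypothetical. Frames are indexed from 0, so frame n in the code is the (n + 1)th frame in the text.

```python
ROWS_PER_TRIGGER = 3   # rows filled per frame per phase encode (from text)
TR_MS = 30             # repetition time = interval between frames (from text)
N_TRIGGERS = 36        # external trigger pulses per run (from text)
N_FRAMES = 45          # assumption: one frame per image reported in Results


def acquisition_schedule(n_triggers, n_frames, rows_per_trigger, tr_ms):
    """List (trigger, frame, rows, delay_ms) in acquisition order.

    After each trigger (one repetition of the utterance), the same set of
    k-space rows is filled in every frame, one frame every `tr_ms`.
    """
    schedule = []
    for trigger in range(n_triggers):
        first_row = trigger * rows_per_trigger
        rows = tuple(range(first_row, first_row + rows_per_trigger))
        for frame in range(n_frames):
            schedule.append((trigger, frame, rows, frame * tr_ms))
    return schedule


schedule = acquisition_schedule(N_TRIGGERS, N_FRAMES, ROWS_PER_TRIGGER, TR_MS)

# Every frame ends up with all rows, each taken at the same delay after
# its trigger, so each frame represents one 30-ms phase of the utterance.
rows_in_last_frame = set()
for trigger, frame, rows, delay_ms in schedule:
    if frame == N_FRAMES - 1:
        rows_in_last_frame.update(rows)
print(len(rows_in_last_frame), "rows in the last frame")
```

The key design point is that no single frame is acquired in one pass; each frame is assembled across repetitions of the utterance, which is why consistent timing of the subject's speech relative to the trigger is important.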
The impact of the orthodontic and orthognathic correction of malocclusion and dentofacial deformity on speech function is still controversial, since perceptual physiological assessment has shown different patterns of change.2,17,18 Previous controversies highlight the additional need for an objective diagnostic tool that can be performed repeatedly. Moreover, speech impairment has been reported in patients with congenital anomalies, such as cleft lip and palate,19 and Treacher Collins,20 Kabuki,21 and Beckwith-Wiedemann22 syndromes. In these cases, genetic and metabolic factors may affect the central nervous system, which may lead to poor motor control and possibly to distorted morphogenesis of articulators. This complex situation requires reliable measurement for speech evaluation, which can inform differential diagnosis and treatment. Since the MRI movie is noninvasive, fatigue-free, and allows for the simultaneous visualization of multiple articulators in the whole vocal tract, it may be useful in this setting.
Like other motor behaviors such as reaching the hand to a target, speech requires the interaction of multiple effectors, ie, articulators. Rather than considering each articulator as being controlled independently, it has been suggested that all articulators are functionally coupled as a unit.23 However, no previous studies have focused on the level of coupling among articulators in relation to the place of articulation. In the present study, coupling with velar movement, which is critical for achieving velopharyngeal closure in the generation of a bilabial plosive sound, was found to be greater in the anterior area of the oral cavity (ie, the lips and the anterior part of the tongue) (Table 2). In contrast, coupling with velar movement was lower in the posterior area of the oral cavity (ie, the posterior part of the tongue) during /apa/-articulation.
The effect of body position on speech production has been studied. In healthy adults, Moon and Canady24 showed that the speech-related electromyographic activity of the velar muscle was significantly affected by the change in body position. They suggested that sensory receptors played a compensatory role for the change in the direction of gravity. Moreover, airflow dynamics are known to be different between upright and supine positions.25 Thus, the movement pattern of articulators revealed in this study might not be comparable to that in the upright position.
Interestingly, one part of the tongue functions differently from other parts of the same organ. This functional diversity may be achieved by the differential composition of the intrinsic and extrinsic muscles in different parts of the tongue. Moreover, it was reported that the coordination of articulatory movement varies at least according to the direction of movement; lip and jaw movements for oral closure are closely coupled, while these articulators do not show the same degree of temporal coupling for oral opening.26–28 The present findings are consistent with previous studies and further show the close kinematic interaction of the velum with the lips and anterior part of the tongue during a bilabial plosive.
CONCLUSIONS
Articulatory movement was recorded with high spatial and temporal resolution using an MRI movie.
Further, there seems to be a central mechanism that controls the spatial and temporal relationship among articulators, and the level of coupling may be associated with the place of articulation.
The technique can be used in further studies to investigate the relationship between malocclusion and articulatory problems.
This study was supported by Grants-in-Aid for Scientific Research Project (16390603 and 18390553) from the Japan Society for the Promotion of Science. The authors are indebted to Dr Shinobu Masaki, Center Head, Brain Activity Imaging Center, Advanced Telecommunications Research Institute International, Kyoto, Japan, for providing valuable technical advice.
Corresponding author: Dr Takashi Ono, Tokyo Medical and Dental University, Maxillofacial Orthognathics, Graduate School, 5-45, Yushima 1-chome, Bunkyo-ku, Tokyo 113-8549, Japan (firstname.lastname@example.org)