Why Articulatory Data
A brief introduction to what articulatory data is and why it matters
What is articulatory data?
We articulate to speak and communicate. By moving the tongue and lips, we shape the articulators, and this shape determines the audible sound we produce. The sound itself cannot be seen; listeners only hear it and understand it. However, listeners also see some of the articulators, for example the lips or the movement of the jaw, which are highly visible on the surface. These visual cues, combined with the sound, make human communication smoother and easier to understand, even in noisy environments.
But what about the tongue? We cannot see its full shape or how that shape changes while we speak. If we could somehow see or measure its movement, we would get a much fuller picture of how articulation works. Articulatory data gives a more complete view of speech production and reveals aspects of the underlying speech mechanism that acoustics alone cannot show.
Why is it important?
As stated above, articulatory data reveals what acoustics alone cannot show, e.g., movement of the tongue or lip rounding. It is especially beneficial in the domains below.
Clinical
When children learn a language (especially their first language), they pick up the speech in their environment, such as child-directed speech from parents and conversations among family members. As they imitate the speech they hear, their pronunciation improves and their articulation becomes more closely aligned with their native language. However, learning to articulate a target sound is not always easy. Some children might need explicit training with visual feedback to improve their articulation or refine the target sound. This is not limited to children. It also applies to adults who are learning a new language or who have clinical difficulties with articulation. Feedback or intervention based on articulatory information can better assist in these cases.
Education
Learning a new language takes years of continued and rigorous effort to reach a conversational level. Some people learn more quickly than others, but overall it is not easy to articulate the target language fluently, especially when it differs greatly from the learner's native language. Articulatory information is useful when learners can see the actual shape of their own tongue alongside the target shape or movement. Repeated visual feedback on articulation during practice can help learners learn where the articulators should be (constriction location) and how narrow the constriction should be (constriction degree) more easily than practice without articulatory information.
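To make constriction location and constriction degree concrete, here is a minimal sketch of how they could be estimated from contour data. It assumes the tongue and palate are available as 2D midsagittal contours (lists of points in the same coordinate system and units). The function name `constriction` and the sample coordinates are hypothetical and for illustration only, and taking the closest pair of points is just one simple way to define the two quantities.

```python
import math

def constriction(tongue, palate):
    """Return (location, degree) for the narrowest tongue-palate gap.

    Hypothetical helper: 'location' is the palate point closest to the
    tongue, and 'degree' is the distance at that point.
    """
    best_point, best_dist = None, math.inf
    for tx, ty in tongue:
        for px, py in palate:
            d = math.hypot(px - tx, py - ty)  # Euclidean distance
            if d < best_dist:
                best_point, best_dist = (px, py), d
    return best_point, best_dist

# Made-up contours (arbitrary units), not measured data.
tongue = [(0.0, 0.0), (1.0, 1.0), (2.0, 1.5), (3.0, 1.0)]
palate = [(0.0, 2.0), (1.0, 2.5), (2.0, 2.0), (3.0, 2.5)]

location, degree = constriction(tongue, palate)
print(f"constriction location: {location}, degree: {degree:.2f}")
```

In a feedback setting, a learner's values could be compared with the values from a target articulation to show where and by how much the tongue position differs.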
Other applications
When we imagine a robot speaking like a human, we may focus only on the TTS (Text-To-Speech) sound and its quality. But making a more human-like robot will also require that it look natural when it speaks, e.g., with lip and tongue movements coordinated with the speech. Building a robotic, mechanical speech articulator may require a huge engineering effort, but articulatory data can be the starting point. With a sufficient amount of good-quality articulatory data, we can imagine a real "talking head" that speaks naturally and in a human-like way.