Proceedings of Computer Animation 2002, pp. 252-257, IEEE Press, Los Alamitos, CA, 2002.
Virtual conversational agents are supposed to combine speech
with nonverbal modalities for intelligible and believable
utterances. However, the automatic synthesis of coverbal
gestures still struggles with several problems, such as naturalness
in procedurally generated animations, flexibility in pre-defined
movements, and synchronization with speech. In this paper, we
focus on generating complex multimodal utterances including
gesture and speech from XML-based descriptions of their overt
form. We describe a coordination model that reproduces
co-articulation and transition effects in both modalities.
In particular, an efficient kinematic approach to creating
gesture animations from shape specifications is presented,
which provides fine adaptation to temporal constraints that
are imposed by cross-modal synchrony.