
Text-to-Video: Text to Facial Animation Video Convertion – Hamdani Winoto, Hadi Suwastio, Iwan Iwut T. ISSN 1858-1633 2005 ICTS 63

2. BASIC THEORY, DESIGN, AND IMPLEMENTATION

2.1 Basic Theory

2.1.1. MPEG-4 Face and Body Animation

Video compression technologies before MPEG-4 dealt only with inter-frame and intra-frame compression. By introducing virtual and synthetic models, MPEG-4 achieves better compression than earlier video standards. MPEG-4 technology can be applied in every multimedia field, although its face and body animation features are still at a trial stage. FDP (Face Definition Parameters) is a set of values that identifies the particular shape of a face. These FDP values are translated, scaled, and rotated by the FAP (Face Animation Parameters). A FAP is a normalized displacement vector that describes the changes applied to the FDP [9].

2.1.2. Facial Animation

Morphing techniques were originally introduced to meet the need for special-effects software, for example to produce expressions that are impossible for a human being, such as opening the mouth so wide that it reaches the forehead. Facial animation uses this technique to create smooth transformations between pictures, so that a video results. In principle there are only two pictures, called the source picture and the target picture; morphing generates a series of in-between pictures that smoothly transform the source picture into the target picture, so that the sequence looks like a video.

2.1.3. Morphing and Deformation

Deformation is a technique for changing the shape of a 2D or 3D object. All that is needed is the single object to be deformed. Morphing is an effect that changes one object into another object. The difference between the two is that morphing needs two objects, while deformation needs only one [8].

2.1.4. Cross Dissolve

Cross dissolve is the simplest morphing method. All it requires is displaying the two pictures on top of each other with transparency. The opacity of the source picture decreases linearly, and the opacity of the target picture increases as that of the source decreases. With this method, the source picture slowly disappears while the target picture slowly appears.
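The linear blend described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this work: it assumes grayscale pictures stored as lists of rows, and the names `cross_dissolve`, `dissolve_sequence`, and the blend parameter `t` are hypothetical.

```python
def cross_dissolve(source, target, t):
    """Blend two equally sized grayscale pictures given as lists of rows.

    t = 0.0 returns the source picture, t = 1.0 returns the target picture.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if len(source) != len(target) or any(
        len(s_row) != len(t_row) for s_row, t_row in zip(source, target)
    ):
        raise ValueError("source and target must have the same dimensions")
    # The source weight (1 - t) falls linearly while the target weight t rises.
    return [
        [(1.0 - t) * s + t * g for s, g in zip(s_row, t_row)]
        for s_row, t_row in zip(source, target)
    ]


def dissolve_sequence(source, target, n_frames):
    """Return n_frames pictures going from source to target in equal steps."""
    if n_frames < 2:
        raise ValueError("need at least two frames")
    return [
        cross_dissolve(source, target, i / (n_frames - 1))
        for i in range(n_frames)
    ]
```

Calling `dissolve_sequence(source, target, 3)` would give the source picture, an even mix of both pictures, and the target picture, in that order.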

2.1.5. Feature Morphing

Morphing between two pictures takes two steps: first deform the source picture and the target picture, and then cross dissolve between them. To capture the topological relation between the two faces, feature lines are drawn on the source picture, and every feature line has a counterpart on the target picture. A feature line can change in three ways: translation, scaling, and rotation. After the deformation, a cross dissolve combines the texture or color of the source picture with that of the target picture.
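As an illustration of how one pair of feature lines can drive the deformation, the sketch below uses the single-line-pair mapping known from Beier and Neely's feature-based morphing. This is an assumption: the text above does not name a specific formula. A point is expressed by its position `u` along a line and its signed distance `v` from it, and the same `(u, v)` is then re-applied to the paired line, which covers translation, rotation, and scaling along the line. The names `lerp_line` and `map_point` are hypothetical.

```python
import math


def lerp_line(src_line, dst_line, t):
    """Interpolate both end points of a feature line pair (t in [0, 1])."""
    return tuple(
        ((1.0 - t) * sx + t * dx, (1.0 - t) * sy + t * dy)
        for (sx, sy), (dx, dy) in zip(src_line, dst_line)
    )


def map_point(x, line_from, line_to):
    """Re-express point x, given relative to line_from, relative to line_to."""
    (px, py), (qx, qy) = line_from
    (px2, py2), (qx2, qy2) = line_to
    dx, dy = qx - px, qy - py
    dx2, dy2 = qx2 - px2, qy2 - py2
    len_from = math.hypot(dx, dy)
    len_to = math.hypot(dx2, dy2)
    if len_from == 0 or len_to == 0:
        raise ValueError("feature lines must have non-zero length")
    # u: fraction of the way along the line; v: signed perpendicular distance.
    u = ((x[0] - px) * dx + (x[1] - py) * dy) / len_from ** 2
    v = ((x[0] - px) * (-dy) + (x[1] - py) * dx) / len_from
    return (
        px2 + u * dx2 + v * (-dy2) / len_to,
        py2 + u * dy2 + v * dx2 / len_to,
    )
```

With several feature lines, the full method would combine the positions suggested by each line using distance-based weights; that weighting is omitted here to keep the sketch short.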

2.1.6. Mesh Morphing

The mesh morphing technique uses closed curves to select a feature. With a closed curve, usually a triangle, the feature selection becomes more accurate. In mesh morphing, a triangle in the source picture is interpolated into a triangle in the target picture, on the assumption that a triangle consists of three feature lines that are interpolated. The nodes inside the triangle are then deformed. Each node is deformed only by the three lines of the triangle in which it lies, so that every node is positioned relative to the feature lines that enclose it.
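One common way to deform a node using only its enclosing triangle is through barycentric coordinates. The sketch below assumes that approach, which the text above does not prescribe: the node's three weights are computed in the source triangle and re-applied to a triangle interpolated between source and target. All function names are hypothetical.

```python
def barycentric(p, tri):
    """Return the barycentric weights of point p in triangle tri = (a, b, c)."""
    a, b, c = tri
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    if det == 0:
        raise ValueError("degenerate triangle")
    w_a = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w_b = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return w_a, w_b, 1.0 - w_a - w_b


def from_barycentric(weights, tri):
    """Rebuild a point from barycentric weights and triangle vertices."""
    return (
        sum(w * v[0] for w, v in zip(weights, tri)),
        sum(w * v[1] for w, v in zip(weights, tri)),
    )


def lerp_triangle(src_tri, dst_tri, t):
    """Interpolate each vertex from the source to the target triangle."""
    return tuple(
        ((1.0 - t) * s[0] + t * d[0], (1.0 - t) * s[1] + t * d[1])
        for s, d in zip(src_tri, dst_tri)
    )


def warp_node(p, src_tri, dst_tri, t):
    """Move a node with its enclosing triangle; t = 0 is source, t = 1 target."""
    weights = barycentric(p, src_tri)
    return from_barycentric(weights, lerp_triangle(src_tri, dst_tri, t))
```

Because the weights depend only on the three vertices of the enclosing triangle, a node is unaffected by features elsewhere in the picture.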

2.1.7. Text-to-Speech (TTS) Basic Block

Text-to-Speech is a system that converts text into voice. Synthesis methods in Text-to-Speech fall into three categories, each with its own strengths and weaknesses: articulatory synthesis, formant synthesis, and concatenative synthesis [7]. Articulatory synthesis simulates the human vocal system, such as the movement of the tongue, the airflow in the throat, and the vocal cords. This method is difficult to implement because of the long research time it requires [5]. Formant synthesis is based on a source-filter simulation with an acoustic-phonetic description. The voice is not produced from the physical equations of the vocal apparatus but by simulating the main acoustic characteristics of a voice signal. The basic acoustic model is represented as a filter built from sets of formants, which reflect the articulation of the voice [5]. The method normally used is concatenation, which joins voice units such as phones, diphones, and triphones.
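The concatenation approach can be illustrated with a toy sketch. It assumes a hypothetical unit inventory that maps diphone labels to pre-recorded sample lists, and it uses `_` as an assumed silence marker at the start and end of an utterance. A practical system would also smooth the joins between units, which is left out here.

```python
def to_diphones(phones):
    """Turn a phone sequence into diphone labels, padded with silence ('_')."""
    padded = ["_"] + list(phones) + ["_"]
    return [f"{left}-{right}" for left, right in zip(padded, padded[1:])]


def synthesize(phones, inventory):
    """Join the stored samples of each diphone into one output signal."""
    samples = []
    for unit in to_diphones(phones):
        if unit not in inventory:
            raise KeyError(f"no recording for unit {unit!r}")
        samples.extend(inventory[unit])
    return samples
```

For the phone sequence `["h", "a", "i"]`, `to_diphones` yields the four units `_-h`, `h-a`, `a-i`, and `i-_`, each covering the transition from one phone to the next.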

2.1.8. Speech Processing Basic Theory