Information and Communication Technology Seminar, Vol. 1 No. 1, August 2005
ISSN 1858-1633 2005 ICTS 66
3.6 Fitness Analysis Between Voice and Animation
From our experiments, the fitness between the animation and the voice depends on the difference between the number of animation frames and the true number of frames. In this Text-to-Video program, a video speed of 30 fps is used, because the number of frame errors is smaller at 30 fps than at 15, 25, 40, or 50 fps. The smaller the number of frame errors, the better the animation fits the voice produced. Even with a small number of errors, however, a mismatch still occurs occasionally, usually when the input sentence is too long. This can happen because the computation in Matlab is inherently heavy.
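The dependence of fitness on the frame-count difference can be sketched as follows. This is a hypothetical Python model (the paper's implementation is in Matlab, and the durations below are made up): per-phoneme durations are each rounded to whole frames, and the total is compared against the ideal frame count for the whole utterance.

```python
def frame_error(phoneme_durations, fps):
    """Difference between the animation frame count (per-phoneme durations
    each rounded to whole frames) and the true frame count for the whole
    utterance. A hypothetical model of the paper's fitness measure."""
    animation_frames = sum(round(d * fps) for d in phoneme_durations)
    true_frames = round(sum(phoneme_durations) * fps)
    return abs(animation_frames - true_frames)

# Made-up per-phoneme durations in seconds.
durations = [0.12, 0.08, 0.15, 0.10, 0.09]
for fps in (15, 25, 30, 40, 50):
    print(fps, frame_error(durations, fps))
```

Under this model, a frame rate whose per-phoneme rounding errors cancel out yields a smaller total error, which is the sense in which 30 fps was found optimal.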
3.7 MOS Evaluation
In the MOS evaluation, every block of the Text-to-Video system is rated: Text-to-Speech, Facial Animation, and Text-to-Video itself. The parameter measured for Text-to-Speech is the intelligibility of the sound heard. For Facial Animation, it is the naturalness of the facial animation while pronouncing sentences: whether the shape of the mouth matches the sentence being pronounced, and whether the duration of each word and each phoneme is appropriate. For Text-to-Video, the synchronization between the animated video and the synthetic voice is examined.
General MOS value for TTS: 3.3033
General MOS value for FA: 4.0733
General MOS value for TTV: 3.536
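As a rough illustration, a MOS value is simply the mean of listener ratings on a 1-to-5 scale; the ratings below are invented for the example and are not the paper's data.

```python
def mos(scores):
    """Mean Opinion Score: the average of listener ratings on a 1-5 scale."""
    return sum(scores) / len(scores)

# Hypothetical ratings from five listeners (made up for illustration).
tts_ratings = [3, 4, 3, 3, 4]
print(f"TTS MOS: {mos(tts_ratings):.4f}")
```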
4. CONCLUSION AND SUGGESTION
4.1 Conclusion
• The Cross Dissolve method can produce reasonably good facial animation in Text-to-Video. When morphing between two different faces, however, cross dissolve produces poor animation, because a shadow appears in the image transition.
• The MOS result for Facial Animation shows that the naturalness of the animation obtained by controlling animation duration is good enough; the FA MOS result is 4.0733.
• The synchronization testing results for Text-to-Video show that the difference between the number of animation frames and the true number of frames determines the synchronization between animation and sound.
• The Text-to-Speech testing results show that the quality of the diphones determines the intelligibility of the sound.
• The optimal frame rate in Text-to-Video is 30 fps, judging by the difference between the number of animation frames and the true number of frames.
• The optimal window type is the Kaiser window, with β in the range 1.7-2.2.
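The Cross Dissolve method referred to above can be sketched as a linear blend between two key frames. The following Python sketch (using NumPy; the function name and frame shapes are illustrative, not from the paper) generates the intermediate frames:

```python
import numpy as np

def cross_dissolve(frame_a, frame_b, n_frames):
    """Return n_frames images blending frame_a into frame_b.
    Frame i weights frame_a by (1 - t) and frame_b by t, with t in [0, 1]."""
    a = frame_a.astype(np.float64)
    b = frame_b.astype(np.float64)
    frames = []
    for i in range(n_frames):
        t = i / (n_frames - 1) if n_frames > 1 else 0.0
        frames.append(((1.0 - t) * a + t * b).astype(frame_a.dtype))
    return frames
```

When the two source images are not well aligned, both faces remain partly visible in the middle frames, which produces the shadow-like transition artifact noted in the conclusion.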
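The Kaiser window finding can be illustrated with NumPy's `np.kaiser`; the 64-sample window length below is an arbitrary illustrative choice, not a value from the paper.

```python
import numpy as np

beta = 2.0                    # inside the 1.7-2.2 range reported as optimal
window = np.kaiser(64, beta)  # Kaiser window of 64 samples
segment = np.ones(64)         # stand-in for a diphone waveform excerpt
smoothed = segment * window   # tapered segment; the endpoints are attenuated
```

Larger β values narrow the main lobe of the window and attenuate the segment endpoints more strongly, which smooths the joins between concatenated diphones.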
4.2 Suggestion
• The recording tools must be of good quality and free of noise.
• TTV should be implemented without Matlab.
• The DSP block of the TTS deserves further investigation.
• TTV should be extended with prosody.
Text-to-Video: Text to Facial Animation Video Convertion – Hamdani Winoto, Hadi Suwastio, Iwan Iwut T.
APPENDIX
[Figures: transition testing from vowel "a" to vowel "i" for the test sentence "DANI", frames 1-17; images omitted.]
SHARE-IT: A UPNP APPLICATION FOR CONTENT SHARING
Daniel Siahaan
Informatics Department, Faculty of Information Technology, Sepuluh Nopember Institute of Technology Kampus Sukolilo, Jl Raya ITS, Surabaya, Indonesia
email: daniel@if.its.ac.id
ABSTRACT
Our work is based on the idea of creating a smart environment where individuals are encouraged to interact with each other. One of the applications we developed is Share-It, an application that enables people present in an environment to share their on-hand content, e.g. their mp3 collection, with others. The target of Share-It is consumer electronics for small public-place proximity networks, such as homes, cafés, hotel lobbies, malls, plazas, and airports. To realize our application, we use the Universal Plug and Play framework as the enabling technology. Universal Plug and Play is a network architecture framework that allows automatic connectivity and interoperability between appliances, PCs, and services within a network. We have installed this application in a constrained environment in our laboratory, and it shows promising results in terms of performance. In the future, we plan to test it in real public spaces.
Keywords: Content Sharing, Proximity Network, Share-It, Universal Plug and Play (UPnP)
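As a sketch of how UPnP's automatic connectivity works: a control point discovers devices by multicasting an SSDP M-SEARCH request (HTTP over UDP) to 239.255.255.250 port 1900. The helper below builds such a request; it is a minimal illustration of the protocol, not code from Share-It.

```python
def build_msearch(search_target="ssdp:all", mx=2):
    """Build the SSDP M-SEARCH request a UPnP control point multicasts
    to 239.255.255.250:1900 to discover devices on the local network."""
    lines = [
        "M-SEARCH * HTTP/1.1",
        "HOST: 239.255.255.250:1900",
        'MAN: "ssdp:discover"',
        f"MX: {mx}",             # seconds a device may delay its response
        f"ST: {search_target}",  # search target; ssdp:all matches everything
        "",
        "",
    ]
    return "\r\n".join(lines)
```

Devices answer with unicast HTTP responses carrying the URL of their device description, from which an application can enumerate the services (e.g. content directories) they expose.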
1. INTRODUCTION