Multimedia plays an important role in daily life. It communicates messages and ideas through a combination of text, audio, images, animation, and video, and delivers material that is easier to understand and more engaging than pure text. As multimedia and Internet technologies advance, multimedia applications in product presentation and in education and training are becoming popular. Multimedia education makes effective interaction, active engagement, and convenient learning possible, and raises education to a higher level.
In this thesis, a multimedia search system for images and speech is developed. A mobile phone and a camera are used to capture images of paintings, and a microphone is used to record speech. Once the image or the speech has been correctly recognized, the system returns the complete information about the painting or the speaker.
For painting recognition, the image from the mobile phone or camera is first preprocessed, segmented, and scale-normalized. Features are then extracted using the YCbCr color model, projection features, and local binary patterns. The training image most similar to the test image is taken as the final answer. With the mobile phone and the camera as capture devices for the test images, recognition rates of 92.06% and 90.07%, respectively, are reached on a system of 20,160 paintings, running on an Intel Core i5 2.6 GHz laptop under Windows 7.
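The feature extraction and matching steps described above can be illustrated with a minimal, pure-Python sketch. It is not the thesis implementation: the full-range BT.601 conversion constants, the basic 3x3 local binary pattern, the way the features are scaled and concatenated, and the squared Euclidean distance used for "most similar" are all assumptions, and every function name is hypothetical. Preprocessing and segmentation are omitted, and the input is assumed to be already scale-normalized.

```python
# Sketch only: feature layout, scaling, and distance metric are assumptions.

def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JPEG/JFIF) RGB -> YCbCr for 8-bit values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def projections(gray):
    """Row and column sums of a 2-D intensity grid."""
    rows = [sum(row) for row in gray]
    cols = [sum(col) for col in zip(*gray)]
    return rows, cols

def lbp_histogram(gray):
    """Normalized 256-bin histogram of basic 3x3 LBP codes (interior pixels)."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    for i in range(1, len(gray) - 1):
        for j in range(1, len(gray[0]) - 1):
            code = 0
            for bit, (di, dj) in enumerate(offsets):
                # Set the bit when the neighbor is at least as bright as the center.
                if gray[i + di][j + dj] >= gray[i][j]:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist) or 1
    return [count / total for count in hist]

def extract_features(rgb):
    """Concatenate chroma means, luma projections, and the LBP histogram.

    `rgb` is a list of rows of (r, g, b) tuples, already scale-normalized
    so that all images share the same height and width.
    """
    h, w = len(rgb), len(rgb[0])
    ycc = [[rgb_to_ycbcr(*px) for px in row] for row in rgb]
    luma = [[p[0] for p in row] for row in ycc]
    mean_cb = sum(p[1] for row in ycc for p in row) / (h * w * 255.0)
    mean_cr = sum(p[2] for row in ycc for p in row) / (h * w * 255.0)
    rows, cols = projections(luma)
    rows = [v / (w * 255.0) for v in rows]  # scale each projection to [0, 1]
    cols = [v / (h * 255.0) for v in cols]
    return [mean_cb, mean_cr] + rows + cols + lbp_histogram(luma)

def nearest_painting(query, gallery):
    """Label of the gallery vector with the smallest squared Euclidean distance."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(gallery, key=lambda label: dist(query, gallery[label]))
```

In use, `extract_features` would be run once per training painting to build the `gallery` dictionary, and `nearest_painting` would then be called with the features of each captured test image.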
For speaker recognition, Mel-frequency cepstral coefficients (MFCCs) are used as the feature parameters, and a Gaussian mixture model is built for each speaker from 40 seconds of training material. With 10 seconds of test material recorded at different times and locations, a correct speaker recognition rate of 83.4% is obtained on a system of 2,000 speakers, running on an Intel Core i5-3337U 1.8 GHz personal computer under Ubuntu 14.04.
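The decision step of such a speaker recognizer can be sketched as follows. This is an illustration under stated assumptions, not the thesis code: MFCC extraction and model training are omitted, each speaker model is assumed to be a diagonal-covariance Gaussian mixture given as a list of `(weight, mean, variance)` tuples, and the speaker whose model yields the highest average per-frame log-likelihood is chosen. The function names and the data layout are hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance at feature vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def gmm_avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of `frames` under one speaker's GMM.

    `gmm` is a list of (weight, mean, variance) tuples; this layout is an
    assumption made for the sketch.
    """
    total = 0.0
    for x in frames:
        logs = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(logs)
        # Log-sum-exp keeps the mixture sum numerically stable.
        total += peak + math.log(sum(math.exp(l - peak) for l in logs))
    return total / len(frames)

def identify_speaker(frames, models):
    """Name of the speaker whose GMM best explains the test frames."""
    return max(models, key=lambda name: gmm_avg_log_likelihood(frames, models[name]))
```

Here `frames` stands for the sequence of MFCC vectors computed from the test recording, and `models` maps each enrolled speaker to a mixture trained on that speaker's enrollment material.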