Thesis/Dissertation etd-0619113-095451: Detailed Record



Name: 徐有盛 (You-sheng Shiu)   E-mail: not disclosed
Department: Graduate Institute of Information Management
Degree: Master   Graduation term: 2nd semester, academic year 101 (2012–2013)
Title (Chinese): 基於OpenEars的語音辨識用於失語症治療
Title (English): Speech Recognition for Aphasia Treatment Based on OpenEars
Files
  • etd-0619113-095451.pdf
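  • This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research purposes.
    Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast this work without authorization, to avoid breaking the law.
    Thesis access permissions

    Print thesis: public after 1 year (public from 2014-08-06)

    Electronic thesis: user-defined permissions: public on campus after 1 year, public off campus after 1 year

    Language/pages: English/67
    Statistics: this thesis has been viewed 5354 times and downloaded 554 times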
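    Abstract (Chinese): In Taiwan, at least several thousand aphasia patients each year need rehabilitation from professional therapists in order to regain the ability to communicate normally. The shortage of therapists in Taiwan causes two major problems. First, medical capacity is insufficient to serve every patient: many patients cannot get a clinic appointment and must take turns to receive rehabilitation. Second, because supply and demand are out of balance, rehabilitation time must be compressed to the minimum effective duration and cannot be extended to help patients recover their language ability sooner.
    Dephasia is a computer-aided tool for aphasia treatment developed by students of the Department of Information Management at National Sun Yat-sen University (中山大學). The system combines the Web and the iPad to help patients carry out rehabilitation at home. This thesis explores and evaluates the feasibility of applying a speech recognition tool so that Chinese speech input can receive real-time feedback directly on the client side. The tool is built on OpenEars, and it adapts speech recognition to the speech input of aphasia patients by adjusting and modifying the language model and the dictionary file.
    The experimental data were collected in clinical settings and comprise 156 samples from ten patients. The efficiency and accuracy of the tool were evaluated on iPad2, iPad4, and MBP. The results show that, for appropriately preprocessed files, the Pearson correlation coefficient between the scores produced by the tool and the therapists' scores reaches up to 0.59. This indicates that applying existing speech recognition software to aphasia treatment is a feasible and promising approach.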
    Abstract (English): In Taiwan, thousands of aphasic patients need professional rehabilitation services. The shortage of therapists causes two problems: 1) not all patients can make a clinic appointment at the desired time, and 2) the average length of a rehabilitation session is reduced. These patients need a way to meet this demand and to improve the quality of aphasia treatment.
    Dephasia is a computer system for aphasia rehabilitation based on the Web and the iPad. In this thesis, we propose a way to provide real-time evaluation of Chinese speech input in aphasia treatment on the client side for two types of questions: sentence repetition and naming practice. This extension is based on OpenEars, a free shared-source SDK for iPhone voice recognition and text-to-speech. We propose several design changes and modifications to adapt the system to aphasia treatment.
    We evaluate the proposed approach using 156 samples from 10 different patients. The evaluation covers accuracy and efficiency on iPad2, iPad4, and MBP. The results show that, for ordinary recordings, the Pearson correlation between the scores given by the proposed solution and those given by the therapists reaches up to 0.59, and recognition completes within 5 seconds. This indicates that speech recognition can be applied to speech therapy to provide real-time feedback on the client side.
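    The abstract reports agreement between the tool's scores and the therapists' scores as a Pearson correlation. As a rough illustration of that metric only, the sketch below computes it for two score lists. The function name `pearson` and the sample scores are hypothetical and invented for this example. They are not taken from the thesis, and the thesis's own scoring algorithm is not reproduced here.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores, invented for illustration only (not thesis data).
tool_scores = [1, 2, 3, 4, 5]
therapist_scores = [2, 4, 6, 8, 10]
r = pearson(tool_scores, therapist_scores)
```

    A value near 1 means the two raters rank and space the recordings almost identically, and a value near 0 means little linear agreement. The 0.59 reported in the abstract therefore sits between those extremes.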
    Keywords (Chinese, translated)
  • Rehabilitation
  • Mobile devices
  • OpenEars
  • Speech therapy
  • Aphasia
  • Speech recognition
    Keywords (English)
  • Speech Therapy
  • OpenEars
  • Aphasia
  • Speech Recognition
  • Mobile devices
    Table of Contents
    Thesis Approval Form
    Acknowledgements
    Abstract (Chinese)
    Abstract
    CHAPTER 1 Introduction
    1.1. Background
    1.2. Motivation
    1.2.1. The Goal
    1.3. Thesis Organization
    CHAPTER 2 State of the Art
    2.1. Aphasia
    2.1.1. What is Aphasia
    2.1.2. Classification of Aphasia
    2.1.3. Treatment of Aphasia
    2.2. Speech Recognition
    2.2.1. What is Speech Recognition
    2.2.2. Hidden Markov Model
    2.2.3. Evaluation Criteria of Speech Recognition Toolkit
    2.2.4. PocketSphinx and OpenEars
    CHAPTER 3 Architecture
    3.1. Dephasia
    3.2. Structure of the Solution
    CHAPTER 4 The Method
    4.1. Observation from Therapy Scene
    4.2. Dictionary and Acoustic model
    4.3. Language Model
    4.4. Algorithm to Decide Final Score
    CHAPTER 5 Evaluation
    5.1. Experimental Design
    5.2. Accuracy of the Proposed Solution for Show & Tell Questions
    5.3. Accuracy for Repeating Sentence Questions
    5.4. Efficiency of the Proposed Solution
    CHAPTER 6 Conclusion
    6.1. Future Work
    References
    Oral Defense Committee
  • 魏志平 - Convener
  • 張乃文 - Member
  • 陳嘉平 - Member
  • 黃三益 - Advisor
    Defense date: 2013-07-17   Submission date: 2013-08-06
