Master's/Doctoral Thesis etd-0803118-124937: Detailed Record



Author 林炳宏 (Bing-Hung Lin) E-mail Not public
Department 資訊管理學系研究所 (Information Management)
Degree Master Graduation Term Academic year 106 (2017-2018), 2nd semester
Title (Chinese) 一個基於文本整合實體主題情緒辨識的框架之研究
Title (English) An Integrated Framework for Identifying Entities Topics and Sentiment from Text
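Files
  • etd-0803118-124937.pdf
  • This electronic full text is licensed to users solely for academic research, for personal, non-profit searching, reading, and printing.
    Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.
    Thesis Access Permissions

    Print copy: public after 2 years (public on 2020-09-03)

    Electronic copy: user-defined permission: public on campus after 2 years, off campus after 2 years

    Language/Pages English/46
    Statistics This thesis has been viewed 5342 times and downloaded 0 times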
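    Abstract (translated from Chinese) As the volume of news articles on the internet grows, topic and sentiment analysis has been widely used in text mining. However, it is difficult to identify topics and sentiments simultaneously at the entity level, especially in Chinese articles. To address these problems, this study provides a framework that can identify entities, topics, and sentiments from text. We use an algorithm to split documents into entity-centered sentences and implement the Aspect and Sentiment Unification Model to identify topics and sentiments. Finally, we apply a word embedding model, a Gaussian similarity kernel, and a hierarchical clustering algorithm to generate the results. To evaluate the method, we collect data from the Apple Daily news website (蘋果新聞網) and select the politics section from 2013 to 2017. Compared with other systems at different levels, the experimental results show that our framework is effective in identifying entities, topics, and sentiments.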
    Abstract (English) Due to the growth of news articles on the internet, topic and sentiment analysis have been widely used for text mining. However, it is difficult to identify topics and sentiments simultaneously at the entity level, especially in Chinese articles. To solve this problem, our approach provides an integrated framework for identifying entities, topics, and sentiments from text. We use our algorithm to split documents into sentences with entities and implement the Aspect and Sentiment Unification Model (ASUM) to identify topics and sentiments. Finally, we apply a word2vec model, a Gaussian similarity kernel, and a complete-linkage agglomerative clustering algorithm to generate results. To evaluate our method, we collect data from the news website of “Apple Daily” and select the politics section from 2013 to 2017. Compared with other systems at different levels, the experimental results show that our framework is effective at entity-level topic and sentiment identification.
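The last step named in the abstract (Gaussian similarity kernel followed by complete-linkage agglomerative clustering) can be illustrated with a minimal sketch. This record does not say how the thesis implements it, so everything below is an assumption for illustration: the toy 2-D vectors stand in for word2vec-derived topic vectors, the conversion `distance = 1 - similarity` is one common choice rather than the thesis's stated formula, and `sigma` and `threshold` are hypothetical parameters.

```python
import math


def gaussian_similarity(u, v, sigma=1.0):
    """Gaussian (RBF) kernel: exp(-||u - v||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def complete_linkage(vectors, threshold, sigma=1.0):
    """Merge clusters bottom-up; stop when the closest pair exceeds threshold."""
    n = len(vectors)
    # Assumed conversion from similarity to distance: d = 1 - s.
    dist = [
        [1.0 - gaussian_similarity(vectors[i], vectors[j], sigma) for j in range(n)]
        for i in range(n)
    ]
    clusters = [[i] for i in range(n)]

    def linkage(a, b):
        # Complete linkage: cluster distance is the largest pairwise distance.
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        d, x, y = min(
            (linkage(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        if d > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Hypothetical topic vectors: two tight groups far apart from each other.
topic_vectors = [[0.0, 0.0], [0.1, 0.0], [3.0, 3.0], [3.0, 3.1]]
clusters = complete_linkage(topic_vectors, threshold=0.5)
print(clusters)
```

With these toy vectors the two nearby pairs merge first and the merged groups stay separate, because their complete-linkage distance is close to 1 and exceeds the threshold.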
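    Keywords (Chinese)
  • 實體萃取 (Entity Extraction)
  • 主題辨識 (Topic Identification)
  • 情感分析 (Sentiment Analysis)
  • 文字探勘 (Text Mining)
  • 中文自然語言處理 (Chinese Natural Language Processing)
  • 主題模型 (Topic Model)
  • 構面與情感整合模型 (Aspect and Sentiment Unification Model)
    Keywords (English)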
  • Entity Extraction
  • Text Mining
  • Sentiment Analysis
  • Topic Identification
  • Chinese Natural Language Processing
  • Topic modeling
  • Aspect and Sentiment Unification Model
    Table of Contents
    Thesis Approval Form
    Abstract (Chinese)
    Abstract
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
    1.1 Background
    1.2 Motivation
    Chapter 2 Literature Review
    2.1 Chinese Sentence Segmentation
    2.2 Aspect Extraction
    2.2.1 Rules-based
    2.2.2 Topic modeling
    2.2.3 Deep Convolutional Neural Network
    2.3 JST and ASUM
    Chapter 3 Approach
    3.1 Research Skeleton
    3.2 Data Collection and Data Preprocessing
    3.3 Entity Identification and Sentence Extraction
    3.3.1 Named Entity Recognition
    3.3.2 Word Segmentation and Part-of-Speech Tagger
    3.3.3 Stanford Dependency Parser
    3.3.4 Rules for Reconstruct Sentences
    3.4 Aspect and Sentiment Unification Model (ASUM)
    3.5 Topic-Sentiment Mapping
    3.5.1 Similarity of Topics
    3.5.2 Convert Similarity to Distance
    3.5.3 Generate Integrated Topics
    3.6 Apply Integrated Topics
    3.6.1 Document-Level Topic-Sentiment Identification
    3.6.2 Sentence-Level Topic-Sentiment Identification
    Chapter 4 Evaluation
    4.1 Data Resource and Data Preprocessing
    4.2 Evaluate Entity Segmentation
    4.3 Comparisons with Other Methods
    4.3.1 Attribute Selection from ASUM
    4.3.2 Evaluate Document-Level Topic and Sentiment
    4.3.3 Evaluate Entity-Level Topic and Sentiment
    Chapter 5 Conclusion
    Reference
    References
    Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan), 993-1022.
    Cambria, E., & Hussain, A. (2015). Sentic Computing: A Common-Sense-Based Framework for Concept-Level Sentiment Analysis. Springer, Cham, Switzerland.
    Griffiths, T. L., & Steyvers, M. (2004). Finding scientific topics. Proceedings of the National Academy of Sciences, 101(suppl 1), 5228-5235.
    He, Y., Lin, C., Gao, W., & Wong, K. F. (2013). Dynamic joint sentiment-topic model. ACM Transactions on Intelligent Systems and Technology, 5(1).
    Hu, M., & Liu, B. (2004, August). Mining and summarizing customer reviews. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 168-177). ACM.
    Jin, M., Kim, M. Y., Kim, D., & Lee, J. H. (2004). Segmentation of Chinese long sentences using commas. In Proceedings of the Third SIGHAN Workshop on Chinese Language Processing.
    Jo, Y., & Oh, A. (2011). Aspect and Sentiment Unification Model for Online Review Analysis. Proceedings of the fourth ACM international conference on Web search and data mining, 815-824.
    Kim, S., Zhang, J., Chen, Z., Oh, A., & Liu, S. (2013). A Hierarchical Aspect-Sentiment Model for Online Reviews. Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, 527-533.
    Klein, D., & Manning, C. D. (2003, July). Accurate unlexicalized parsing. In Proceedings of the 41st Annual Meeting on Association for Computational Linguistics-Volume 1 (pp. 423-430). Association for Computational Linguistics.
    Levy, O., Goldberg, Y., & Dagan, I. (2015). Improving distributional similarity with lessons learned from word embeddings. Transactions of the Association for Computational Linguistics, 3, 211-225.
    Lin, C. & He, Y. (2009). Joint Sentiment/Topic Model for Sentiment Analysis. Proceedings of the 18th ACM conference on Information and knowledge management, 375-384.
    Liu, Q., Liu, B., Zhang, Y., Kim, D. S., & Gao, Z. (2016). Improving Opinion Aspect Extraction Using Semantic Similarity and Aspect Associations. Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2986-2992.
    Mukherjee, A., & Liu, B. (2012). Aspect Extraction through Semi-Supervised Modeling. Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, 339-348.
    Pennebaker, J. W., Boyd, R. L., Jordan, K., & Blackburn, K. (2015). The development and psychometric properties of LIWC2015.
    Popescu, A. M., Nguyen, B., & Etzioni, O. (2005, October). OPINE: Extracting product features and opinions from reviews. In Proceedings of HLT/EMNLP on interactive demonstrations (pp. 32-33). Association for Computational Linguistics.
    Poria, S., Cambria, E., Ku, L. W., Gui, C., & Gelbukh, A. (2014). A Rule-Based Approach to Aspect Extraction from Product Reviews. Proceedings of the Second Workshop on Natural Language Processing for Social Media, 29-37.
    Poria, S., Cambria, E., & Gelbukh, A. (2016). Aspect extraction for opinion mining with a deep convolutional neural network. Knowledge-Based Systems, 42-49.
    Qiu, G., Liu, B., Bu, J., & Chen, C. (2011). Opinion Word Expansion and Target Extraction through Double Propagation. Computational Linguistics, 37(1), 9-27.
    Rana, T. A., & Cheah, Y. N. (2016). Aspect extraction in sentiment analysis: comparative analysis and survey. Artificial Intelligence Review, 46(4), 459-483.
    Scaffidi, C., Bierhoff, K., Chang, E., Felker, M., Ng, H., & Jin, C. (2007, June). Red Opal: product-feature scoring from reviews. In Proceedings of the 8th ACM conference on Electronic commerce (pp. 182-191). ACM.
    Späth, H. (1980). Cluster analysis algorithms for data reduction and classification of objects.
    Xu, S. Q., Kong, F., Li, P. F., & Zhu, Q. M. (2012). A Chinese Sentence Segmentation Approach Based on Comma. Chinese Lexical Semantics, 809-817.
    Oral Defense Committee
  • 魏志平 - Committee Chair
  • 倪文君 - Committee Member
  • 黃三益 - Advisor
    Defense Date 2018-07-23 Submission Date 2018-09-03
