National Sun Yat-sen University (國立中山大學), thesis/dissertation, Document Classification based on Deep Convolutional Recurrent Neural Networks

Title	Document Classification based on Deep Convolutional Recurrent Neural Networks (基於深度卷積遞歸神經網路的文本分類)
Department	Department of Information Management (資訊管理學系)
Year, semester	Spring semester of Academic Year 107	Language	Chinese
Degree	Master	Number of pages	69
Author	Kuo-Chung Huang (黃國忠)
Advisor	Yihuang Kang (康藝晃)
Convenor	San-Yih Hwang (黃三益)
Advisory Committee	Pei-Ju Lee (李珮如)
Date of Exam	2019-07-22	Date of Submission	2019-08-28
Keywords	Deep Learning, Document Classification, Recurrent Neural Network, Convolutional Neural Network, Long Short-Term Memory, CNN_BiLSTM, Deep Convolutional Recurrent Neural Networks, recurrence relation between feature maps and max pooling outputs
Statistics	This thesis has been browsed 5947 times and downloaded 340 times.

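Chinese Abstract (English translation)
Motivated by the rise of deep learning, this thesis uses the 20 Newsgroups dataset to run computer simulations with machine learning and deep learning techniques under the scikit-learn and Keras frameworks. Different machine learning and deep learning techniques yield different accuracies on document classification, and the deep learning methods reach an accuracy as high as 96%. For machine learning, we classify documents and measure accuracy with models including a Naive Bayes classifier, an SVM (Support Vector Machine) classifier, a Logistic Regression classifier, and a Random Forest classifier, among others. For deep learning, we build the following neural network models: a fully connected neural network, an LSTM (Long Short-Term Memory) network, a BiLSTM (Bidirectional Long Short-Term Memory) network, a CNN (Convolutional Neural Network), CNN_LSTM, LSTM_CNN, CNN_BiLSTM, and BiLSTM_CNN. Pre-trained word vectors are embedded into these networks; we use word vectors built by GloVe 6B, Word2vec, and fastText, then carry out document classification and measure accuracy. We find that neural network models achieve excellent document classification, especially the two mainstream architectures: the convolutional neural network (CNN) and the recurrent neural network with memory (LSTM). We combine the strengths of the two architectures and propose a deep convolutional recurrent neural network model for document classification. The CNN extracts higher-level phrase features, which are fed into the LSTM to obtain a representation of the whole sentence; alternatively, the LSTM first produces the whole-sentence representation, which is then fed into the CNN to extract higher-level phrase features. We evaluate the proposed architectures on document classification tasks. The experiments show that the deep convolutional bidirectional recurrent model (CNN_BiLSTM) outperforms the other models on document classification, reaching an accuracy close to 96%. We also summarize the recurrence relation between the feature maps and the max pooling outputs.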
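The classical baselines named in the abstract include a Naive Bayes classifier scored by accuracy. As a rough illustration of that kind of baseline, the sketch below implements a multinomial Naive Bayes classifier with Laplace smoothing using only the Python standard library. It is not the thesis code, which relies on scikit-learn; the function names, the whitespace tokenizer, the smoothing value, and the four toy training sentences are all assumptions made for the example. Only the two category names are taken from the 20 Newsgroups dataset.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, far simpler than a real pipeline.
    return text.lower().split()


def train_nb(docs, labels):
    """Count documents per class and word occurrences per class."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for text, label in zip(docs, labels):
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_docs, word_counts, vocab


def predict_nb(model, text, alpha=1.0):
    """Return the class with the highest log posterior (Laplace smoothing alpha)."""
    class_docs, word_counts, vocab = model
    n_docs = sum(class_docs.values())
    scores = {}
    for label, n_label in class_docs.items():
        total = sum(word_counts[label].values())
        score = math.log(n_label / n_docs)  # log prior
        for token in tokenize(text):
            if token in vocab:  # words never seen in training are skipped
                count = word_counts[label][token]
                score += math.log((count + alpha) / (total + alpha * len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label matches the true label."""
    hits = sum(predict_nb(model, d) == y for d, y in zip(docs, labels))
    return hits / len(docs)


# Toy data (hypothetical sentences, not drawn from the dataset).
train_docs = [
    "the rocket launch reached orbit",
    "nasa plans a new space mission",
    "the team won the hockey game",
    "the goalie saved the final shot",
]
train_labels = ["sci.space", "sci.space", "rec.sport.hockey", "rec.sport.hockey"]
test_docs = ["space mission to orbit", "hockey game shot"]
test_labels = ["sci.space", "rec.sport.hockey"]

model = train_nb(train_docs, train_labels)
print(accuracy(model, test_docs, test_labels))
```

On real data the same train, predict, and score steps would run over the full 20 Newsgroups training and test splits, typically with TF-IDF weighting in place of raw counts.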
Abstract
Motivated by the rise of deep learning, this study uses the 20 Newsgroups dataset to run computer simulations with machine learning and deep learning techniques under the scikit-learn and Keras frameworks. Different machine learning and deep learning techniques yield different text classification accuracies, and deep learning reaches 96% accuracy. For deep learning, we constructed neural networks that take pre-trained word vectors generated by glove6b, word2vec, and fastText as input, then carried out text classification and measured accuracy. Our study shows that neural networks achieve outstanding performance in document classification, in particular the two mainstream architectures, CNN and LSTM. We combine the strengths of both architectures and propose a model called the deep convolutional recurrent neural network for document classification. The CNN extracts high-level phrase features, which serve as the input of the LSTM to obtain a global sentence representation, and vice versa. We evaluate the proposed architecture on document classification tasks. The experimental results show that deep convolutional recurrent neural networks outperform the other models and achieve excellent performance on these tasks; in particular, CNN_BiLSTM achieved 96% accuracy in document classification. We also summarize the recurrence relation between the feature maps and the max pooling outputs.
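The abstract mentions a recurrence relation between feature-map and max pooling outputs, but this record does not state it. As a hedged illustration only, the sketch below applies the standard output-length formulas for an unpadded 1-D convolution and a 1-D max pooling layer, layer after layer, so that each length is computed from the previous one. Whether this matches the relation derived in the thesis is an assumption. The function names, the sequence length of 1000, and the kernel and pool sizes of 5 are hypothetical values chosen for the example.

```python
def conv1d_out_len(n, kernel, stride=1, padding=0):
    """Feature-map length of a 1-D convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


def max_pool1d_out_len(n, pool, stride=None):
    """Output length of 1-D max pooling; stride defaults to the pool size."""
    stride = pool if stride is None else stride
    return (n - pool) // stride + 1


def stack_lengths(seq_len, blocks):
    """Apply (kernel, pool) blocks in turn and record the length after every layer.

    Each entry depends only on the previous one, which is the recurrence.
    """
    lengths = [seq_len]
    for kernel, pool in blocks:
        lengths.append(conv1d_out_len(lengths[-1], kernel))
        lengths.append(max_pool1d_out_len(lengths[-1], pool))
    return lengths


# Hypothetical setting: 1000 input tokens, two Conv(5) + MaxPool(5) blocks.
print(stack_lengths(1000, [(5, 5), (5, 5)]))  # [1000, 996, 199, 195, 39]
```

The final length (39 in this example) is the number of time steps a recurrent layer such as a BiLSTM would receive if it were placed after the convolutional blocks.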

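Table of Contents
Thesis Approval Form
Acknowledgments
Chinese Abstract
English Abstract
Chapter 1 Introduction
Chapter 2 Background and Related Work
2.1 Machine Learning Models
2.2 Deep Learning Models
Chapter 3 Methods and Architecture
3.1 Machine Learning Methods and Architecture
3.1.1 Document-Term Matrix
3.1.2 TF-IDF Algorithm
3.2 Deep Learning Methods and Architecture
3.3 CNN_BiLSTM Model
Chapter 4 Experimental Results
4.1 Dataset
4.2 Word-Vector Datasets
4.3 Experimental Results of the Machine Learning Methods
4.4 Experimental Results of the Deep Learning Methods
Chapter 5 Conclusion
Chapter 6 References
Appendix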

References
Bengio, Y., Simard, P., & Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2), 157-166.
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32.
Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078.
Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., & Kuksa, P. (2011). Natural language processing (almost) from scratch. Journal of Machine Learning Research, 12(Aug), 2493-2537.
Devlin, J., Zbib, R., Huang, Z., Lamar, T., Schwartz, R., & Makhoul, J. (2014). Fast and robust neural network joint models for statistical machine translation. Paper presented at the Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1370-1380.
Haonan, L., Huang, S. H., Ye, T., & Xiuyan, G. (2019). Graph Star Net for Generalized Multi-Task Learning. arXiv preprint arXiv:1906.12330v1.
Hearst, M. A., Dumais, S. T., Osuna, E., Platt, J., & Scholkopf, B. (1998). Support vector machines. IEEE Intelligent Systems, 13(4), 18-28.
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.
Ikonomakis, M., Kotsiantis, S., & Tampakas, V. (2005). Text classification using machine learning techniques. WSEAS Transactions on Computers, 4(8), 966-974.
Kalchbrenner, N., Grefenstette, E., & Blunsom, P. (2014). A convolutional neural network for modelling sentences. arXiv preprint arXiv:1404.2188.
Kilimci, Z. H., & Akyokus, S. (2018). Deep Learning-and Word Embedding-Based Heterogeneous Classifier Ensembles for Text Classification. Complexity, 2018.
Kim, S.-B., Rim, H.-C., Yook, D., & Lim, H.-S. (2002). Effective methods for improving naive bayes text classifiers. Paper presented at the Pacific Rim International Conference on Artificial Intelligence, LNAI 2417, pp. 414-423.
Kim, Y. (2014). Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882.
Klema, J., & Almonayyes, A. (2006). Automatic categorization of fanatic texts using random forests. Kuwait Journal of Science and Engineering, 33(2), 1-18.
Kowsari, K., Heidarysafa, M., Brown, D. E., Meimandi, K. J., & Barnes, L. E. (2018). Rmdl: Random multimodel deep learning for classification. Paper presented at the Proceedings of the 2nd International Conference on Information System and Data Mining. arXiv preprint arXiv:1805.01890v2.
Le, Q., & Mikolov, T. (2014). Distributed representations of sentences and documents. Paper presented at the 31st International Conference on Machine Learning, pp. 1188-1196.
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436.
McCallum, A., & Nigam, K. (1998). A comparison of event models for naive bayes text classification. Paper presented at the AAAI-98 Workshop on Learning for Text Categorization, vol. 752, pp. 41-48.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. Paper presented at the Advances in Neural Information Processing Systems, pp. 3111-3119.
Sainath, T. N., Vinyals, O., Senior, A., & Sak, H. (2015). Convolutional, long short-term memory, fully connected deep neural networks. Paper presented at the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Schuster, M., & Paliwal, K. K. (1997). Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 45(11), 2673-2681.
Sebastiani, F. (2002). Machine learning in automated text categorization. ACM Computing Surveys, 34(1), 1-47.
Shanahan, J. G., & Roma, N. (2003). Improving SVM text classification performance through threshold adjustment. Paper presented at the 14th European Conference on Machine Learning, LNAI 2837, pp. 361-372.
Sundermeyer, M., Ney, H., & Schlüter, R. (2015). From feedforward to recurrent LSTM neural networks for language modeling. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 23(3), 517-529.
Tang, D., Qin, B., & Liu, T. (2015). Document modeling with gated recurrent neural network for sentiment classification. Paper presented at the Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing.
Xu, K., Ba, J., Kiros, R., Cho, K., Courville, A., Salakhudinov, R., . . . Bengio, Y. (2015). Show, attend and tell: Neural image caption generation with visual attention. Paper presented at the International Conference on Machine Learning.
Wu, F., Zhang, T., Souza Jr, A. H. d., Fifty, C., Yu, T., & Weinberger, K. Q. (2019). Simplifying graph convolutional networks. arXiv preprint arXiv:1902.07153v2.
Yao, L., Mao, C., & Luo, Y. (2018). Graph convolutional networks for text classification. arXiv preprint arXiv:1809.05679.

These data sets are all available on the Internet.
The 20 Newsgroups Dataset. http://qwone.com/~jason/20Newsgroups/
Jeffrey Pennington, Richard Socher, Christopher D. Manning. GloVe: Global Vectors for Word Representation. https://nlp.stanford.edu/projects/glove/
Word2vec: https://code.google.com/archive/p/word2vec/
Fasttext: https://fasttext.cc/docs/en/english-vectors.html
scikit-learn Machine Learning in Python. https://scikit-learn.org/stable/
Keras Documentation. https://keras.io/

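Fulltext
This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization. Thesis access permission: unrestricted (fully open on and off campus). Available: Campus: available; Off-campus: available. etd-0728119-135427.pdf
Printed copies
Public information on printed theses is relatively complete from Academic Year 102 onward. For public information on printed theses from Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience. Available: available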

Office of Library and Information Services, National Sun Yat-sen University │ Contact Us : 2452 Thesis Format Review Team , Mail │ Development and operations : Knowledge Innovation Division, LIS