博碩士論文 etd-0728119-135427 詳細資訊
Title page for etd-0728119-135427
論文名稱
Title
基於深度卷積遞歸神經網路的文本分類
Document Classification based on Deep Convolutional Recurrent Neural Networks
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
69
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2019-07-22
繳交日期
Date of Submission
2019-08-28
關鍵字
Keywords
遞歸神經網路、深度學習、文本分類、卷積神經網路、特徵圖和最大池化輸出的遞迴關係式、長短期記憶、深度卷積遞歸神經網路
Deep Learning, Document Classification, Long short-term memory, CNN_BiLSTM, Deep Convolutional Recurrent Neural Networks, Convolutional Neural Network
統計
Statistics
本論文已被瀏覽 6004 次,被下載 341 次
The thesis/dissertation has been browsed 6004 times and downloaded 341 times.
中文摘要
由於深度學習的崛起,所以本文嘗試藉由資料集20 newsgroups dataset,在Scikit-learn 和Keras的框架下,透過機器學習和深度學習的手法,進行電腦的模擬,我們看見了經由機器學習與深度學習不同手法,作用在文本分類上得到不同的準確度(accuracy),利用深度學習的方法準確度甚至高達96%。
在機器學習上我們利用Naive Bayes classifier,SVM(Support Vector Machines) classifier,Logistic Regression classifier,RandomForest classifier等模型進行文本分類,並執行量測準確度。
深度學習上我們搭建神經網路,Fully Connected Neural Network,LSTM(Long short-term memory) Neural Network,BiLSTM(Bidirectional Long short-term memory) Neural Network,CNN(Convolutional Neural Network),CNN_LSTM Neural Network,LSTM_CNN,CNN_BiLSTM Neural Network,BiLSTM_CNN等模型,並使用預先訓練的詞向量嵌入神經網路,這裡的詞向量我們使用Glove6b,Word2vec,fastText等模型建立的詞向量,並且進行文本分類及準確度的量測。
我們也發現神經網路模型能夠實現卓越的文本分類,特別是卷積神經網路(CNN),及具記憶的遞歸神經網路(LSTM)這兩種主流架構,我們也結合這兩種架構的優勢,提出了文本分類基於深度卷積遞歸神經網路模型。
利用卷積(CNN)來抽取更高級別的短語,及將短語送進具記憶的遞歸神經網路(LSTM)中獲得全局句子的表示。或者是利用LSTM來獲取全局句子的表示,在送進CNN中抽取更高級的短語。我們評估提出的架構在文本分類任務上。
實驗表明深度卷積具雙向記憶的遞歸神經網路(CNN_BiLSTM)這個模型優於文本分類上其它模型的實驗結果,我們得到接近96%的準確度。
同時我們也歸納了特徵圖Feature Maps和最大池化Max pooling輸出的遞迴關係式。
Abstract
Motivated by the rise of deep learning, this study runs text-classification experiments on the 20 Newsgroups dataset using machine learning and deep learning techniques within the scikit-learn and Keras frameworks. Different techniques yield different classification accuracies, and the deep learning methods reach accuracies as high as 96%.
For machine learning, we classify documents with models such as Naive Bayes, SVM (Support Vector Machine), Logistic Regression, and Random Forest classifiers and measure their accuracy.
For deep learning, we build neural network models, including a fully connected neural network, LSTM, BiLSTM, CNN, CNN_LSTM, LSTM_CNN, CNN_BiLSTM, and BiLSTM_CNN. We feed these networks pre-trained word vectors produced by GloVe 6B, word2vec, and fastText, then carry out text classification and measure accuracy.
Our study shows that neural networks achieve outstanding performance in document classification, in particular the two mainstream architectures, CNN and LSTM. We combine the strengths of both and propose a deep convolutional recurrent neural network model for document classification.
The CNN extracts higher-level phrase features, which are fed into the LSTM to obtain a global sentence representation; alternatively, the LSTM first produces the global sentence representation, which is then fed into the CNN to extract higher-level phrases. We evaluate the proposed architecture on document classification tasks.
The experimental results show that the deep convolutional recurrent neural network outperforms the other models and achieves excellent performance on these tasks.
In particular, CNN_BiLSTM achieves close to 96% accuracy in document classification.
We also derive recurrence relations for the outputs of the feature maps and the max pooling layers.
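This record does not reproduce the recurrence relations themselves. As a rough illustration only, the sketch below uses the standard output-length formulas for a 1D convolution and for 1D max pooling without padding, and applies them layer by layer. The function names, the input length of 1000 tokens, and the kernel and pool sizes are all hypothetical choices for the example, not values taken from the thesis.

```python
def conv_output_length(n_in, kernel_size, stride=1, padding=0):
    """Length of a 1D feature map: floor((n_in + 2*padding - kernel_size) / stride) + 1."""
    return (n_in + 2 * padding - kernel_size) // stride + 1


def pool_output_length(n_in, pool_size, stride=None):
    """Length after 1D max pooling without padding; stride defaults to pool_size."""
    stride = pool_size if stride is None else stride
    return (n_in - pool_size) // stride + 1


def trace_lengths(n_in, blocks):
    """Apply the recurrence block by block.

    Each block is a (kernel_size, pool_size) pair; the result lists the
    sequence length at the input and after every conv and pooling layer.
    """
    lengths = [n_in]
    for kernel_size, pool_size in blocks:
        lengths.append(conv_output_length(lengths[-1], kernel_size))
        lengths.append(pool_output_length(lengths[-1], pool_size))
    return lengths


# Hypothetical configuration: 1000 tokens, two conv(5) + max-pool(5) blocks.
lengths = trace_lengths(1000, [(5, 5), (5, 5)])
```

In an architecture where the CNN output feeds an LSTM, the last entry of `lengths` would be the number of time steps the recurrent layer sees, which is why tracking these sizes matters when stacking the two.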
目次 Table of Contents
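Thesis Certification i
Acknowledgements ii
Chinese Abstract iii
English Abstract iv
Chapter 1 Introduction 1
Chapter 2 Background and Related Work 3
2.1 Machine Learning Models 3
2.2 Deep Learning Models 4
Chapter 3 Methods and Architecture 7
3.1 Machine Learning Methods and Architecture 7
3.1.1 Document-Term Matrix 7
3.1.2 TF-IDF Algorithm 8
3.2 Deep Learning Methods and Architecture 11
3.3 The CNN_BiLSTM Model 15
Chapter 4 Experimental Results 18
4.1 Datasets 18
4.2 Word-Vector Datasets 19
4.3 Experimental Results of the Machine Learning Methods 20
4.4 Experimental Results of the Deep Learning Methods 21
Chapter 5 Conclusion 30
Chapter 6 References 33
Appendix 36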
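Sections 3.1.1 and 3.1.2 of the outline name the Document-Term Matrix and the TF-IDF algorithm. The sketch below shows one textbook form of both in plain Python, assuming a whitespace tokenizer, term frequency normalized by document length, and idf = log(N / df). The toy documents and all function names are made up for illustration, and the exact weighting used in the thesis is not stated in this record.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def document_term_matrix(docs):
    """Return (vocab, rows) where rows[i][j] counts term vocab[j] in document i."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    index = {t: j for j, t in enumerate(vocab)}
    rows = []
    for doc in tokenized:
        row = [0] * len(vocab)
        for term, count in Counter(doc).items():
            row[index[term]] = count
        rows.append(row)
    return vocab, rows


def tfidf(rows):
    """Weight counts by tf * idf with idf = log(N / df), one variant among several."""
    n_docs = len(rows)
    n_terms = len(rows[0])
    # Document frequency: in how many documents each term occurs.
    df = [sum(1 for row in rows if row[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / df[j]) for j in range(n_terms)]
    weighted = []
    for row in rows:
        total = sum(row)
        weighted.append([(row[j] / total) * idf[j] for j in range(n_terms)])
    return weighted


# Toy corpus, invented for this example.
docs = [
    "the rocket launch was delayed",
    "the hockey game was delayed",
    "the rocket reached orbit",
]
vocab, counts = document_term_matrix(docs)
weights = tfidf(counts)
```

A term that occurs in every document, such as "the" here, gets weight zero, while a term unique to one document gets the largest weight. scikit-learn's `TfidfVectorizer` applies a smoothed idf and L2 normalization by default, so its numbers differ from this sketch.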
參考文獻 References
These data sets are all available on the Internet.
The 20 Newsgroups Dataset. http://qwone.com/~jason/20Newsgroups/
Jeffrey Pennington, Richard Socher, Christopher D. Manning. GloVe: Global Vectors for Word Representation. https://nlp.stanford.edu/projects/glove/
Word2vec: https://code.google.com/archive/p/word2vec/
Fasttext: https://fasttext.cc/docs/en/english-vectors.html
scikit-learn Machine Learning in Python. https://scikit-learn.org/stable/
Keras Documentation. https://keras.io/
電子全文 Fulltext
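This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the relevant provisions of the Copyright Act of the Republic of China, and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.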
論文使用權限 Thesis access permission:校內校外完全公開 unrestricted
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
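Public information about printed theses is relatively complete from ROC academic year 102 onward. To look up public information on printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.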
開放時間 available 已公開 available
