Thesis/Dissertation etd-0719120-224032 details
Title page for etd-0719120-224032
Topic Evolution and Diffusion Discovery based on Online Deep Non-negative Autoencoder
Year, semester
Number of pages
San-Yih Hwang
Advisory Committee
Date of Exam
Date of Submission
Network Analysis, Autoencoder, Deep learning, Topic Diffusion, Topic Modeling, Topic Evolution
The thesis/dissertation has been browsed 5948 times and downloaded 88 times.
Books, newspapers, and magazines are increasingly stored as digital documents rather than on paper, which means a vast number of documents now reside on the Internet. It is infeasible to review all of them to find the information we need; instead, we rely on keywords or well-defined topics. Unfortunately, topics change over time in the real world, so correctly classifying these documents has become an increasingly important issue. Our approach aims to improve topic models that take time into account. Because inference of the posterior probability is too complicated, we instead use an autoencoder variant to build a topic model whose weights are shared across time slices, called the Deep Non-negative Autoencoder (DNAE). The model has a multi-layer structure, and the evolution of topics at each layer is also a focus of this thesis. In addition, we use the generalized Jensen-Shannon divergence to measure topic diffusion and use network diagrams to visualize the evolution of topics.
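The generalized Jensen-Shannon divergence used above to measure topic diffusion can be sketched as follows. This is a minimal illustration of the standard formula JSD_w(P_1, ..., P_n) = H(Σ w_i P_i) − Σ w_i H(P_i), not the thesis's implementation; the function names and the default uniform weights are assumptions for the example.

```python
import math

def entropy(p):
    """Shannon entropy in nats, skipping zero-probability terms."""
    return -sum(x * math.log(x) for x in p if x > 0)

def generalized_jsd(dists, weights=None):
    """Generalized Jensen-Shannon divergence of n discrete distributions.

    JSD_w(P_1..P_n) = H(sum_i w_i * P_i) - sum_i w_i * H(P_i),
    where H is Shannon entropy and w sums to 1 (uniform by default).
    Each distribution is a list of probabilities over the same vocabulary.
    """
    n = len(dists)
    if weights is None:
        weights = [1.0 / n] * n
    # Entropy of the weighted mixture distribution ...
    mixture = [sum(w * p[k] for w, p in zip(weights, dists))
               for k in range(len(dists[0]))]
    # ... minus the weighted average of the component entropies.
    return entropy(mixture) - sum(w * entropy(p) for w, p in zip(weights, dists))
```

Identical topic-word distributions yield a divergence of 0, while two completely disjoint distributions with equal weights yield log 2, the maximum for two components; values in between indicate how far a topic has diffused between time slices.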
目次 Table of Contents
Thesis Approval Form i
Abstract ii
1. Introduction 1
2. Background and related work 2
2.1 Topic model 3
2.2 Time series topic model 4
2.3 Multi-layer topic model 6
2.4 Deep Learning 7
2.5 Online Learning 8
3. Methodology 9
3.1 Topic model based on Autoencoder 11
3.2 Online Deep Non-negative Autoencoder 13
3.3 Evaluation of topic diffusion 15
3.4 Visualization of topic evolution 16
3.5 Topic Evolution and Diffusion Discovery based on online DNAE 18
4. Experiment 19
4.1 Online topic model with DNAE 21
4.2 Topic evolution and diffusion with DNAE 22
4.3 Term evolution with DNAE 24
5. Discussion 27
6. Conclusion 29
7. Reference 30
Appendix A 35
Appendix B 37
參考文獻 References
Baldi, P. (n.d.-a). Autoencoders, unsupervised learning, and deep architectures.
Blei, D. M. (n.d.-b). Introduction to probabilistic topic models.
Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022.
Blei, D. M., & Lafferty, J. D. (2006). Dynamic topic models. Proceedings of the 23rd International Conference on Machine Learning - ICML ’06, 113–120.
Bourlard, H., & Kamp, Y. (1988). Auto-association by multilayer perceptrons and singular value decomposition. Biological Cybernetics, 59(4), 291–294.
Greene, D., & Cross, J. P. (2016). Exploring the Political Agenda of the European Parliament Using a Dynamic Topic Modeling Approach. ArXiv:1607.03055 [Cs].
Greene, D., O’Callaghan, D., & Cunningham, P. (2014). How Many Topics? Stability Analysis for Topic Models. ArXiv:1404.4606 [Cs].
Griffiths, T. L., & Steyvers, M. (2004). Finding scientific topics. Proceedings of the National Academy of Sciences, 101(Supplement 1), 5228–5235.
Griffiths, T. L., Jordan, M. I., Tenenbaum, J. B., & Blei, D. M. (2004). Hierarchical topic models and the nested Chinese restaurant process. In S. Thrun, L. K. Saul, & B. Schölkopf (Eds.), Advances in Neural Information Processing Systems 16 (pp. 17–24). MIT Press.
Grosse, I., Bernaola-Galván, P., Carpena, P., Román-Roldán, R., Oliver, J., & Stanley, H. E. (2002). Analysis of symbolic sequences using the Jensen-Shannon divergence. Physical Review. E, Statistical, Nonlinear, and Soft Matter Physics, 65(4 Pt 1), 041905.
Handbook of Latent Semantic Analysis. (2007). Routledge Handbooks Online.
Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the Dimensionality of Data with Neural Networks. Science, 313(5786), 504–507.
Hinton, G. E., & Zemel, R. S. (1994). Autoencoders, minimum description length and Helmholtz free energy. In J. D. Cowan, G. Tesauro, & J. Alspector (Eds.), Advances in Neural Information Processing Systems 6 (pp. 3–10). Morgan-Kaufmann.
Kang, Y., Cheng, I.-L., Mao, W., Kuo, B., & Lee, P.-J. (2019). Towards Interpretable Deep Extreme Multi-label Learning. ArXiv:1907.01723 [Cs, Stat].
Kang, Y., Lin, K.-P., & Cheng, I.-L. (2018). Topic Diffusion Discovery Based on Sparseness-Constrained Non-Negative Matrix Factorization. 2018 IEEE International Conference on Information Reuse and Integration (IRI), 94–101.
Kang, Y., & Zadorozhny, V. (2016). Process Monitoring Using Maximum Sequence Divergence. Knowledge and Information Systems, 48(1), 81–109.
Lake, J. A. (n.d.). Reconstructing evolutionary trees from DNA and protein sequences: Paralinear distances.
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444.
Lee, D. D., & Seung, H. S. (1999). Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755), 788–791.
McCloskey, M., & Cohen, N. J. (1989). Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem. In G. H. Bower (Ed.), Psychology of Learning and Motivation (Vol. 24, pp. 109–165). Academic Press.
Ognyanova, K. (n.d.). Network visualization with R.
Paatero, P., & Tapper, U. (1994). Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values. Environmetrics, 5(2), 111–126.
Phylogenetic trees | Evolutionary tree (article) | Khan Academy. (n.d.). Retrieved July 2, 2020, from
Silge, J., & Robinson, D. (2017). Text Mining with R: A Tidy Approach. O’Reilly Media, Inc.
Song, H. A., & Lee, S.-Y. (2013). Hierarchical Representation Using NMF. In M. Lee, A. Hirose, Z.-G. Hou, & R. M. Kil (Eds.), Neural Information Processing (Vol. 8226, pp. 466–473). Springer Berlin Heidelberg.
Stevens, K., Kegelmeyer, P., Andrzejewski, D., & Buttler, D. (2012). Exploring Topic Coherence over Many Models and Many Topics. Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 952–961.
Tu, D., Chen, L., Lv, M., Shi, H., & Chen, G. (2018). Hierarchical online NMF for detecting and tracking topic hierarchies in a text stream. Pattern Recognition, 76, 203–214.
Wang, X., & McCallum, A. (2006). Topics over time: A non-Markov continuous-time model of topical trends. Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’06, 424.
Ye, F., Chen, C., & Zheng, Z. (2018). Deep Autoencoder-like Nonnegative Matrix Factorization for Community Detection. Proceedings of the 27th ACM International Conference on Information and Knowledge Management - CIKM ’18, 1393–1402.
Zhou, J., Cui, G., Zhang, Z., Yang, C., Liu, Z., Wang, L., Li, C., & Sun, M. (2019). Graph Neural Networks: A Review of Methods and Applications. ArXiv:1812.08434 [Cs, Stat].
電子全文 Fulltext
Thesis access permission: unrestricted (fully open on and off campus)
Available:
Campus: available 2020-08-19
Off-campus: available 2020-08-19

Printed copies
Available 2020-08-19
