Thesis/dissertation etd-0030119-184158: detailed record
Title page for etd-0030119-184158
Year, semester
Number of pages
Advisory Committee
Date of Exam
Date of Submission
Keywords: Deep Autoencoder-like Non-negative Matrix Factorization, Local Interpretable Model-Agnostic Explanations, Deep Learning, Non-negative Matrix Factorization, Multi-Label Learning, Convolutional Neural Network
This thesis/dissertation has been browsed 5923 times and downloaded 4 times.
In recent years, deep learning has flourished, and image recognition in particular has achieved excellent performance and widespread application. Services from Facebook and Google, unmanned stores, and similar technologies are now closely integrated with our daily lives. Posts about eating and drinking are frequently shared on social media, especially food pictures, which tend to make people feel happy and hungry, yet we often overlook the dangers hidden behind the appeal of these foods. Food safety incidents have been frequent over the past few years, so we should pay more attention to the composition of what we eat. We should eat more whole foods and avoid processed, high-oil, and high-salt foods. Many modern diseases are caused by excessive intake of unhealthy foods.
This study therefore designs an image-based recipe ingredient recognition system, implemented with deep multi-label learning and deep autoencoder-like non-negative matrix factorization (DANMF). The models are trained on the image data of the Recipe1M dataset provided by the Massachusetts Institute of Technology (MIT); by learning to distinguish the cuisine style of the ingredients, the model addresses the problem of a very large number of labels. Furthermore, Local Interpretable Model-Agnostic Explanations (LIME) is employed to show which parts of a picture correspond to the predicted ingredients, increasing the interpretability of the model.
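As a rough illustration of the factorization idea that DANMF builds on, the sketch below runs plain non-negative matrix factorization with the multiplicative updates of Lee and Seung (1999). It is not the deep autoencoder-like variant used in the thesis, and it does not reproduce the thesis pipeline. The function name `nmf`, the toy recipe-by-ingredient matrix `V`, and the choice of two latent "styles" are all assumptions made for this example.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Plain NMF (not DANMF): approximate V by W @ H with W, H >= 0.

    Uses the Lee-Seung multiplicative updates for the Frobenius-norm loss.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    errors = []
    for _ in range(n_iter):
        # eps in the denominators guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(V - W @ H))
    return W, H, errors

# Hypothetical toy data: 6 recipes (rows) x 5 ingredients (columns),
# where 1 means the recipe contains the ingredient.
V = np.array([
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
], dtype=float)

W, H, errors = nmf(V, rank=2)

# Each recipe is assigned to the latent factor with the largest weight;
# in this toy setting a factor plays the role of a "cuisine style".
styles = W.argmax(axis=1)
print("reconstruction error:", errors[-1])
print("style per recipe:", styles)
```

Grouping ingredients into a small number of latent factors is one way to reduce a very large label space before multi-label training; how the thesis combines the factors with the convolutional network is described in its Chapter 3 and is not shown here.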
Table of Contents
Acknowledgements
Abstract (Chinese)
Abstract (English)
Chapter 1 Introduction
(1-1) Research Background
(1-2) Research Motivation
(1-3) Research Objectives
Chapter 2 Literature Review
(2-1) Deep Learning
(2-2) Convolutional Neural Network
(2-3) Multi-Label Learning
(2-4) Non-negative Matrix Factorization
(2-5) Deep Autoencoder-like Non-negative Matrix Factorization (DANMF)
(2-6) Local Interpretable Model-Agnostic Explanations (LIME)
Chapter 3 Research Methods
(3-1) Source Data
(3-2) Data Preprocessing
(3-3) Analyzing and Building the Ingredient Dataset
(3-4) Building a Deep Convolutional Neural Network with Multi-Label Learning
(3-5) Predicting Cuisine Style with the Image Recognition Model
(3-6) Explaining the Cuisine Style
Chapter 4 Experimental Results
Chapter 5 Conclusion
Chapter 6 References
References
Agarwal, R., & Miller, K. (n.d.). Information Extraction from Recipes, 14.
Ahn, Y.-Y., Ahnert, S. E., Bagrow, J. P., & Barabási, A.-L. (2011). Flavor network and the principles of food pairing. Scientific Reports, 1(1).
Bi, W., & Kwok, J. T. (n.d.). Efficient Multi-label Classification with Many Labels, 9.
Bolaños, M., Ferrà, A., & Radeva, P. (2017). Food Ingredients Recognition through Multi-label Learning. ArXiv:1707.08816 [Cs].
Boutell, M. R., Luo, J., Shen, X., & Brown, C. M. (2004). Learning multi-label scene classification. Pattern Recognition, 37(9), 1757–1771.
Chen, J., & Ngo, C. (2016). Deep-based Ingredient Recognition for Cooking Recipe Retrieval. In Proceedings of the 2016 ACM on Multimedia Conference - MM ’16 (pp. 32–41). Amsterdam, The Netherlands: ACM Press.
CS231n Convolutional Neural Networks for Visual Recognition. (2018, October 2). Retrieved October 2, 2018.
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition (pp. 248–255). Miami, FL: IEEE.
Gibaja, E., & Ventura, S. (2015). A Tutorial on Multilabel Learning. ACM Computing Surveys, 47(3), 1–38.
He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep Residual Learning for Image Recognition. ArXiv:1512.03385 [Cs].
Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the Dimensionality of Data with Neural Networks. Science, 313(5786), 504–507.
How do Convolutional Neural Networks work? (2018, October 2). Retrieved October 2, 2018.
Hoyer, P. O. (n.d.). Non-negative Matrix Factorization with Sparseness Constraints, 13.
Koitka, S., & Friedrich, C. M. (2016). nmfgpu4R: GPU-Accelerated Computation of the Non-Negative Matrix Factorization (NMF) Using CUDA Capable Hardware, 8, 11.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. In F. Pereira, C. J. C. Burges, L. Bottou, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 25 (pp. 1097–1105). Curran Associates, Inc.
Lee, D. D., & Seung, H. S. (1999). Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755), 788–791.
Liu, C., Cao, Y., Luo, Y., Chen, G., Vokkarane, V., & Ma, Y. (2016). DeepFood: Deep Learning-Based Food Image Recognition for Computer-Aided Dietary Assessment. In C. K. Chang, L. Chiari, Y. Cao, H. Jin, M. Mokhtari, & H. Aloulou (Eds.), Inclusive Smart Cities and Digital Health (Vol. 9677, pp. 37–48). Cham: Springer International Publishing.
Madjarov, G., Kocev, D., Gjorgjevikj, D., & Džeroski, S. (2012). An extensive experimental comparison of methods for multi-label learning. Pattern Recognition, 45(9), 3084–3104.
Marin, J., Biswas, A., Ofli, F., Hynes, N., Salvador, A., Aytar, Y., … Torralba, A. (2018). Recipe1M: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images. ArXiv:1810.06553 [Cs].
Morency, L.-P., & Baltrusaitis, T. (n.d.). Tutorial on Multimodal Machine Learning, 158.
Myers, A., Johnston, N., Rathod, V., Korattikara, A., Gorban, A., Silberman, N., … Murphy, K. (2015). Im2Calories: Towards an Automated Mobile Vision Food Diary. In 2015 IEEE International Conference on Computer Vision (ICCV) (pp. 1233–1241). Santiago, Chile: IEEE.
Pascual-Montano, A., Carazo, J. M., Kochi, K., Lehmann, D., & Pascual-Marqui, R. D. (2006). Nonsmooth nonnegative matrix factorization (nsNMF). IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(3), 403–415.
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. ArXiv:1602.04938 [Cs, Stat].
Ruder, S. (2017). An Overview of Multi-Task Learning in Deep Neural Networks. ArXiv:1706.05098 [Cs, Stat].
Salvador, A., Hynes, N., Aytar, Y., Marin, J., Ofli, F., Weber, I., & Torralba, A. (2017). Learning Cross-Modal Embeddings for Cooking Recipes and Food Images. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 3068–3076). Honolulu, HI: IEEE.
Simonyan, K., & Zisserman, A. (2014). Very Deep Convolutional Networks for Large-Scale Image Recognition. ArXiv:1409.1556 [Cs].
Srivastava, T. (n.d.). Framework to build a niche dictionary for text mining, 12.
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., … Rabinovich, A. (2014). Going Deeper with Convolutions. ArXiv:1409.4842 [Cs].
Teng, C.-Y., Lin, Y.-R., & Adamic, L. A. (2011). Recipe recommendation using ingredient networks. ArXiv:1111.3919 [Physics].
Thoma, M. (2017). Analysis and Optimization of Convolutional Neural Network Architectures. ArXiv:1707.09725 [Cs].
Ye, F., Chen, C., & Zheng, Z. (2018). Deep Autoencoder-like Nonnegative Matrix Factorization for Community Detection. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management - CIKM ’18 (pp. 1393–1402). Torino, Italy: ACM Press.
Yeh, C.-K., Wu, W.-C., Ko, W.-J., & Wang, Y.-C. F. (n.d.). Learning Deep Latent Spaces for Multi-Label Classification, 7.
Yu, Q., Mao, D., & Wang, J. (n.d.). Deep Learning Based Food Recognition, 6.
Zhang, M.-L., & Zhou, Z.-H. (2014). A Review on Multi-Label Learning Algorithms. IEEE Transactions on Knowledge and Data Engineering, 26(8), 1819–1837.
Zhou, Z.-H. (n.d.). Multi-Instance Multi-Label Learning with Application to Scene Classification, 8.
Fulltext
Thesis access permission: user-defined release time
Available:
Campus: available
Off-campus: available

Printed copies
Available: available
