國立中山大學,National Sun Yat-sen University,學位論文,thesis/dissertation,基於深度多標籤學習的食譜食材識別,RECIPE-INGREDIENT RECOGNITION BASED ON DEEP MULTI-LABEL LEARNING 

論文名稱 Title	基於深度多標籤學習的食譜食材識別 RECIPE-INGREDIENT RECOGNITION BASED ON DEEP MULTI-LABEL LEARNING
系所名稱 Department	資訊管理學系 Department of Information Management
畢業學年期 Year, semester	107 學年度第 1 學期 The fall semester of Academic Year 107	語文別 Language	中文 Chinese
學位類別 Degree	碩士 Master	頁數 Number of pages	34
研究生 Author	邱明相 Ming-Siang Chiu
指導教授 Advisor	康藝晃 Yihuang Kang
召集委員 Convenor	林耕霈 Keng-Pei Lin
口試委員 Advisory Committee	李珮如 Pei-Ju Lee
口試日期 Date of Exam	2019-01-25	繳交日期 Date of Submission	2019-01-30
關鍵字 Keywords	深度學習、多標籤學習、模型局部可解釋器、非負矩陣分解、卷積神經網路、類似非負矩陣分解的深度自動編碼 Deep Autoencoder-like Non-negative Matrix Factorization, Local Interpretable Model-Agnostic Explanations, Deep Learning, Non-negative Matrix Factorization, Multi-Label Learning, Convolutional Neural Network

中文摘要
近年來深度學習技術蓬勃發展，其中圖像辨識的領域已有著卓越的表現及普遍的應用，舉凡目前的臉書，Google，無人商店等等，這些科技早已緊密與我們的生活結合，我們可以發現社群網路常常被分享的都是吃喝玩樂的訊息，尤其是美食圖片的分享，往往讓人心情愉悅忍不住食指大動，但往往我們都忽視這些美食背後隱藏的致命吸引力，由於近年來食安風暴頻傳，所以我們對吃的內容更應注意，我們要多吃食物而不是食品，避免加工過以及高油高鹽的食物，現代人有太多疾病都是攝取過量的不健康的食品所導致。緣此，本研究設計主要以圖像組成的食材辨識系統，使用深度多標籤學習與深度編碼器的非負矩陣分解（DANMF）法實作食譜食材辨識的技術。本研究引用MIT(麻省理工學院)提供之 Recipe1M 資料集，利用這些圖像資料訓練模型，透過模型可分辨食材料理風格解決大量多標籤的問題，並運用模型局部可解釋器(LIME)，來說明圖片中成分預測部位，增加模型的可解釋性。
Abstract
Deep learning has advanced rapidly in recent years, and image recognition in particular now performs well and is widely deployed, for example at Facebook, Google, and unmanned stores. These technologies are closely woven into daily life. Posts about eating, drinking, and leisure are among the most frequently shared content on social media, and food pictures in particular lift people's mood and whet the appetite. Yet we often overlook the risks hidden behind this appeal. Food safety incidents have been frequent in recent years, so we should pay closer attention to what our food contains: eat whole foods rather than processed products, and avoid food that is heavily processed or high in oil and salt. Many modern diseases are caused by excessive intake of unhealthy food. This study therefore designs an image-based ingredient recognition system and implements recipe-ingredient recognition using deep multi-label learning and deep autoencoder-like non-negative matrix factorization (DANMF). Models are trained on images from the Recipe1M dataset provided by the Massachusetts Institute of Technology (MIT). The model distinguishes the cuisine styles of ingredients, which addresses the problem of a very large number of labels. In addition, Local Interpretable Model-Agnostic Explanations (LIME) is used to show which parts of an image the ingredient predictions are based on, increasing the interpretability of the model.
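As a rough illustration of the factorization idea that DANMF builds on, the sketch below runs plain non-negative matrix factorization (Lee & Seung multiplicative updates), not DANMF itself, on a small hand-made recipe-by-ingredient matrix. The toy data, the `nmf` helper, the rank of 2, and the reading of each factor as a "cuisine style" are assumptions made for illustration and are not taken from the thesis.

```python
import numpy as np


def nmf(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V as W @ H with W, H >= 0.

    Uses the multiplicative update rules for the Frobenius-norm objective.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps  # recipes x styles
    H = rng.random((rank, m)) + eps  # styles x ingredients
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H stay >= 0.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Hypothetical toy data: rows are recipes, columns are ingredient labels.
ingredients = ["soy sauce", "ginger", "rice", "basil", "tomato", "olive oil"]
V = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 1],
], dtype=float)

W, H = nmf(V, rank=2)

# Assign each recipe to the factor ("style") with the largest weight.
styles = W.argmax(axis=1)

# List the ingredients that weigh most heavily in each factor.
for k in range(H.shape[0]):
    top = [ingredients[j] for j in np.argsort(H[k])[::-1][:3]]
    print(f"style {k}: {top}")
print("recipe -> style:", styles.tolist())
```

Grouping many ingredient labels into a few latent styles in this way is one possible reading of how a factorization step can shrink a very large label space before multi-label training; the actual pipeline in the thesis may differ.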

目次 Table of Contents
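Acknowledgements ii
Abstract (Chinese) iii
Abstract iv
Chapter 1 Introduction 1
    (1-1) Research Background 1
    (1-2) Research Motivation 3
    (1-3) Research Objectives 4
Chapter 2 Literature Review 5
    (2-1) Deep Learning 5
    (2-2) Convolutional Neural Network 6
    (2-3) Multi-Label Learning 8
    (2-4) Non-negative Matrix Factorization 10
    (2-5) Deep Autoencoder-like Non-negative Matrix Factorization (DANMF) 11
    (2-6) Local Interpretable Model-Agnostic Explanations (LIME) 12
Chapter 3 Research Method 13
    (3-1) Raw Data Sources 15
    (3-2) Data Preprocessing 15
    (3-3) Analyzing and Building the Ingredient Dataset 16
    (3-4) Building a Deep Convolutional Neural Network with Multi-Label Learning 19
    (3-5) Predicting Cuisine Style with the Image Recognition Model 20
    (3-6) Explaining Cuisine Style 21
Chapter 4 Experimental Results 22
Chapter 5 Conclusion 25
Chapter 6 References 26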

參考文獻 References
Agarwal, R., & Miller, K. (n.d.). Information Extraction from Recipes, 14.
Ahn, Y.-Y., Ahnert, S. E., Bagrow, J. P., & Barabási, A.-L. (2011). Flavor network and the principles of food pairing. Scientific Reports, 1(1). https://doi.org/10.1038/srep00196
Bi, W., & Kwok, J. T. (n.d.). Efficient Multi-label Classification with Many Labels, 9.
Bolaños, M., Ferrà, A., & Radeva, P. (2017). Food Ingredients Recognition through Multi-label Learning. ArXiv:1707.08816 [Cs]. Retrieved from http://arxiv.org/abs/1707.08816
Boutell, M. R., Luo, J., Shen, X., & Brown, C. M. (2004). Learning multi-label scene classification. Pattern Recognition, 37(9), 1757–1771. https://doi.org/10.1016/j.patcog.2004.03.009
Chen, J., & Ngo, C. (2016a). Deep-based Ingredient Recognition for Cooking Recipe Retrieval. In Proceedings of the 2016 ACM on Multimedia Conference - MM ’16 (pp. 32–41). Amsterdam, The Netherlands: ACM Press. https://doi.org/10.1145/2964284.2964315
Chen, J., & Ngo, C. (2016b). Deep-based Ingredient Recognition for Cooking Recipe Retrieval. In Proceedings of the 2016 ACM on Multimedia Conference - MM ’16 (pp. 32–41). Amsterdam, The Netherlands: ACM Press. https://doi.org/10.1145/2964284.2964315
CS231n Convolutional Neural Networks for Visual Recognition. (2018, October 2). Retrieved October 2, 2018, from http://cs231n.github.io/convolutional-networks/
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition (pp. 248–255). Miami, FL: IEEE. https://doi.org/10.1109/CVPR.2009.5206848
Gibaja, E., & Ventura, S. (2015). A Tutorial on Multilabel Learning. ACM Computing Surveys, 47(3), 1–38. https://doi.org/10.1145/2716262
He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep Residual Learning for Image Recognition. ArXiv:1512.03385 [Cs]. Retrieved from http://arxiv.org/abs/1512.03385
Hinton, G. E. (2006). Reducing the Dimensionality of Data with Neural Networks. Science, 313(5786), 504–507. https://doi.org/10.1126/science.1127647
How do Convolutional Neural Networks work? (2018, October 2). Retrieved October 2, 2018, from https://brohrer.github.io/how_convolutional_neural_networks_work.html
Hoyer, P. O. (n.d.). Non-negative Matrix Factorization with Sparseness Constraints, 13.
Koitka, S., & Friedrich, C. M. (2016). nmfgpu4R: GPU-Accelerated Computation of the Non-Negative Matrix Factorization (NMF) Using CUDA Capable Hardware, 8, 11.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. In F. Pereira, C. J. C. Burges, L. Bottou, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 25 (pp. 1097–1105). Curran Associates, Inc. Retrieved from http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
Lee, D. D., & Seung, H. S. (1999). Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755), 788–791. https://doi.org/10.1038/44565
Liu, C., Cao, Y., Luo, Y., Chen, G., Vokkarane, V., & Ma, Y. (2016). DeepFood: Deep Learning-Based Food Image Recognition for Computer-Aided Dietary Assessment. In C. K. Chang, L. Chiari, Y. Cao, H. Jin, M. Mokhtari, & H. Aloulou (Eds.), Inclusive Smart Cities and Digital Health (Vol. 9677, pp. 37–48). Cham: Springer International Publishing. https://doi.org/10.1007/978-3-319-39601-9_4
Madjarov, G., Kocev, D., Gjorgjevikj, D., & Džeroski, S. (2012). An extensive experimental comparison of methods for multi-label learning. Pattern Recognition, 45(9), 3084–3104. https://doi.org/10.1016/j.patcog.2012.03.004
Marin, J., Biswas, A., Ofli, F., Hynes, N., Salvador, A., Aytar, Y., … Torralba, A. (2018). Recipe1M: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images. ArXiv:1810.06553 [Cs]. Retrieved from http://arxiv.org/abs/1810.06553
Morency, L.-P., & Baltrusaitis, T. (n.d.-a). Tutorial on Multimodal Machine Learning, 158.
Morency, L.-P., & Baltrusaitis, T. (n.d.-b). Tutorial on Multimodal Machine Learning, 158.
Myers, A., Johnston, N., Rathod, V., Korattikara, A., Gorban, A., Silberman, N., … Murphy, K. (2015). Im2Calories: Towards an Automated Mobile Vision Food Diary. In 2015 IEEE International Conference on Computer Vision (ICCV) (pp. 1233–1241). Santiago, Chile: IEEE. https://doi.org/10.1109/ICCV.2015.146
Pascual-Montano, A., Carazo, J. M., Kochi, K., Lehmann, D., & Pascual-Marqui, R. D. (2006). Nonsmooth nonnegative matrix factorization (nsNMF). IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(3), 403–415. https://doi.org/10.1109/TPAMI.2006.60
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. ArXiv:1602.04938 [Cs, Stat]. Retrieved from http://arxiv.org/abs/1602.04938
Ruder, S. (2017). An Overview of Multi-Task Learning in Deep Neural Networks. ArXiv:1706.05098 [Cs, Stat]. Retrieved from http://arxiv.org/abs/1706.05098
Salvador, A., Hynes, N., Aytar, Y., Marin, J., Ofli, F., Weber, I., & Torralba, A. (2017). Learning Cross-Modal Embeddings for Cooking Recipes and Food Images. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 3068–3076). Honolulu, HI: IEEE. https://doi.org/10.1109/CVPR.2017.327
Simonyan, K., & Zisserman, A. (2014). Very Deep Convolutional Networks for Large-Scale Image Recognition. ArXiv:1409.1556 [Cs]. Retrieved from http://arxiv.org/abs/1409.1556
Srivastava, T. (n.d.). Framework to build a niche dictionary for text mining, 12.
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., … Rabinovich, A. (2014). Going Deeper with Convolutions. ArXiv:1409.4842 [Cs]. Retrieved from http://arxiv.org/abs/1409.4842
Teng, C.-Y., Lin, Y.-R., & Adamic, L. A. (2011). Recipe recommendation using ingredient networks. ArXiv:1111.3919 [Physics]. Retrieved from http://arxiv.org/abs/1111.3919
Thoma, M. (2017). Analysis and Optimization of Convolutional Neural Network Architectures. ArXiv:1707.09725 [Cs]. Retrieved from http://arxiv.org/abs/1707.09725
Ye, F., Chen, C., & Zheng, Z. (2018). Deep Autoencoder-like Nonnegative Matrix Factorization for Community Detection. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management - CIKM ’18 (pp. 1393–1402). Torino, Italy: ACM Press. https://doi.org/10.1145/3269206.3271697
Yeh, C.-K., Wu, W.-C., Ko, W.-J., & Wang, Y.-C. F. (n.d.). Learning Deep Latent Spaces for Multi-Label Classification, 7.
Yu, Q., Mao, D., & Wang, J. (n.d.). Deep Learning Based Food Recognition, 6.
Zhang, M.-L., & Zhou, Z.-H. (2014). A Review on Multi-Label Learning Algorithms. IEEE Transactions on Knowledge and Data Engineering, 26(8), 1819–1837. https://doi.org/10.1109/TKDE.2013.39
Zhou, Z.-H. (n.d.). Multi-Instance Multi-Label Learning with Application to Scene Classification, 8.

電子全文 Fulltext
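This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.
論文使用權限 Thesis access permission：自定論文開放時間 user define
開放時間 Available：校內 Campus：已公開 available；校外 Off-campus：已公開 available
etd-0030119-184158.pdf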
紙本論文 Printed copies
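Public information on printed theses is relatively complete from Academic Year 102 onward. To look up public information on printed theses from Academic Year 101 or earlier, please contact the printed-thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
開放時間 available：已公開 available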
