博碩士論文 etd-0030119-184158 詳細資訊
Title page for etd-0030119-184158
論文名稱
Title
基於深度多標籤學習的食譜食材識別
RECIPE-INGREDIENT RECOGNITION BASED ON DEEP MULTI-LABEL LEARNING
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
34
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2019-01-25
繳交日期
Date of Submission
2019-01-30
關鍵字
Keywords
深度學習、多標籤學習、模型局部可解釋器、非負矩陣分解、卷積神經網路、類似非負矩陣分解的深度自動編碼
Deep Autoencoder-like Non-negative Matrix Factorization, Local Interpretable Model-Agnostic Explanations, Deep Learning, Non-negative Matrix Factorization, Multi-Label Learning, Convolutional Neural Network
統計
Statistics
本論文已被瀏覽 5902 次,被下載 2
The thesis/dissertation has been browsed 5902 times and downloaded 2 times.
中文摘要
近年來深度學習技術蓬勃發展,其中圖像辨識的領域已有著卓越的表現及普遍的應用,舉凡目前的臉書,Google,無人商店等等,這些科技早已緊密與我們的生活結合,我們可以發現社群網路常常被分享的都是吃喝玩樂的訊息,尤其是美食圖片的分享,往往讓人心情愉悅忍不住食指大動,但往往我們都忽視這些美食背後隱藏的致命吸引力,由於近年來食安風暴頻傳,所以我們對吃的內容更應注意,我們要多吃食物而不是食品,避免加工過以及高油高鹽的食物,現代人有太多疾病都是攝取過量的不健康的食品所導致。
緣此,本研究設計主要以圖像組成的食材辨識系統,使用深度多標籤學習與深度編碼器的非負矩陣分解(DANMF)法實作食譜食材辨識的技術。本研究引用MIT(麻省理工學院)提供之 Recipe1M 資料集,利用這些圖像資料訓練模型,透過模型可分辨食材料理風格解決大量多標籤的問題,並運用模型局部可解釋器(LIME),來說明圖片中成分預測部位,增加模型的可解釋性。
Abstract
In recent years, deep learning has advanced rapidly, and image recognition in particular has achieved excellent performance and widespread application. Services such as Facebook, Google, and unmanned stores are now closely integrated into our daily lives. Posts about eating, drinking, and leisure are shared frequently on social media, and food pictures in particular tend to lift people's mood and whet their appetite, yet we often overlook the harm hidden behind this appeal. Food safety incidents have been frequent in recent years, so we should pay closer attention to what our food contains. We should eat more whole foods rather than processed products and avoid processed, high-oil, and high-salt foods, because many modern diseases are caused by excessive intake of unhealthy food.
This study therefore designs an image-based recipe-ingredient recognition system, implemented with deep multi-label learning and deep autoencoder-like non-negative matrix factorization (DANMF). The models are trained on images from the Recipe1M dataset provided by the Massachusetts Institute of Technology (MIT). By having the model distinguish the cuisine style of the ingredients, the study addresses the problem of a very large number of labels. In addition, Local Interpretable Model-Agnostic Explanations (LIME) is used to show which regions of an image drive each ingredient prediction, which increases the interpretability of the model.
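As a rough illustration of the factorization idea named in the abstract, the sketch below applies plain non-negative matrix factorization with the multiplicative updates of Lee and Seung (1999) to a tiny recipe-by-ingredient matrix. It is not the DANMF method used in the thesis, and the abstract does not say how the factorization is wired into the recognition pipeline. The toy matrix, the ingredient names, the function name `nmf`, and the choice of two latent "styles" are all assumptions made for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V as W @ H with W, H >= 0,
    using multiplicative updates for the squared-error objective."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Strictly positive random initialisation keeps the updates well defined.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H stay non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy data: rows are recipes, columns are ingredients (1 = used).
ingredients = ["flour", "sugar", "butter", "soy sauce", "ginger", "garlic"]
V = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 1],
], dtype=float)

W, H = nmf(V, rank=2)

# Row i of W weights recipe i over the latent styles; row k of H weights
# style k over the ingredients. The largest weight gives a hard assignment.
styles = W.argmax(axis=1)
for k in range(H.shape[0]):
    top = [ingredients[j] for j in np.argsort(H[k])[::-1][:3]]
    print(f"style {k}: {top}")
```

Grouping recipes into a small number of latent styles in this way is one possible route to turning a very large ingredient label space into a more manageable one, which is the motivation the abstract gives for the factorization step.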
目次 Table of Contents
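Acknowledgements ii
Abstract (Chinese) iii
Abstract iv
Chapter 1 Introduction 1
(1-1) Research Background 1
(1-2) Research Motivation 3
(1-3) Research Objectives 4
Chapter 2 Literature Review 5
(2-1) Deep Learning 5
(2-2) Convolutional Neural Network 6
(2-3) Multi-Label Learning 8
(2-4) Non-negative Matrix Factorization 10
(2-5) Deep Autoencoder-like Non-negative Matrix Factorization (DANMF) 11
(2-6) Local Interpretable Model-Agnostic Explanations (LIME) 12
Chapter 3 Research Method 13
(3-1) Source Data 15
(3-2) Data Preprocessing 15
(3-3) Analyzing and Building the Ingredient Dataset 16
(3-4) Building a Deep Convolutional Neural Network with Multi-Label Learning 19
(3-5) Predicting Cuisine Style with the Image Recognition Model 20
(3-6) Explaining Cuisine Style 21
Chapter 4 Experimental Results 22
Chapter 5 Conclusion 25
Chapter 6 References 26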
參考文獻 References
Agarwal, R., & Miller, K. (n.d.). Information Extraction from Recipes, 14.
Ahn, Y.-Y., Ahnert, S. E., Bagrow, J. P., & Barabási, A.-L. (2011). Flavor network and the principles of food pairing. Scientific Reports, 1(1). https://doi.org/10.1038/srep00196
Bi, W., & Kwok, J. T. (n.d.). Efficient Multi-label Classification with Many Labels, 9.
Bolaños, M., Ferrà, A., & Radeva, P. (2017). Food Ingredients Recognition through Multi-label Learning. ArXiv:1707.08816 [Cs]. Retrieved from http://arxiv.org/abs/1707.08816
Boutell, M. R., Luo, J., Shen, X., & Brown, C. M. (2004). Learning multi-label scene classification. Pattern Recognition, 37(9), 1757–1771. https://doi.org/10.1016/j.patcog.2004.03.009
Chen, J., & Ngo, C. (2016). Deep-based Ingredient Recognition for Cooking Recipe Retrieval. In Proceedings of the 2016 ACM on Multimedia Conference - MM ’16 (pp. 32–41). Amsterdam, The Netherlands: ACM Press. https://doi.org/10.1145/2964284.2964315
CS231n Convolutional Neural Networks for Visual Recognition. (2018, October 2). Retrieved October 2, 2018, from http://cs231n.github.io/convolutional-networks/
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition (pp. 248–255). Miami, FL: IEEE. https://doi.org/10.1109/CVPR.2009.5206848
Gibaja, E., & Ventura, S. (2015). A Tutorial on Multilabel Learning. ACM Computing Surveys, 47(3), 1–38. https://doi.org/10.1145/2716262
He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep Residual Learning for Image Recognition. ArXiv:1512.03385 [Cs]. Retrieved from http://arxiv.org/abs/1512.03385
Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the Dimensionality of Data with Neural Networks. Science, 313(5786), 504–507. https://doi.org/10.1126/science.1127647
How do Convolutional Neural Networks work? (2018, October 2). Retrieved October 2, 2018, from https://brohrer.github.io/how_convolutional_neural_networks_work.html
Hoyer, P. O. (n.d.). Non-negative Matrix Factorization with Sparseness Constraints, 13.
Koitka, S., & Friedrich, C. M. (2016). nmfgpu4R: GPU-Accelerated Computation of the Non-Negative Matrix Factorization (NMF) Using CUDA Capable Hardware, 8, 11.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. In F. Pereira, C. J. C. Burges, L. Bottou, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 25 (pp. 1097–1105). Curran Associates, Inc. Retrieved from http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
Lee, D. D., & Seung, H. S. (1999). Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755), 788–791. https://doi.org/10.1038/44565
Liu, C., Cao, Y., Luo, Y., Chen, G., Vokkarane, V., & Ma, Y. (2016). DeepFood: Deep Learning-Based Food Image Recognition for Computer-Aided Dietary Assessment. In C. K. Chang, L. Chiari, Y. Cao, H. Jin, M. Mokhtari, & H. Aloulou (Eds.), Inclusive Smart Cities and Digital Health (Vol. 9677, pp. 37–48). Cham: Springer International Publishing. https://doi.org/10.1007/978-3-319-39601-9_4
Madjarov, G., Kocev, D., Gjorgjevikj, D., & Džeroski, S. (2012). An extensive experimental comparison of methods for multi-label learning. Pattern Recognition, 45(9), 3084–3104. https://doi.org/10.1016/j.patcog.2012.03.004
Marin, J., Biswas, A., Ofli, F., Hynes, N., Salvador, A., Aytar, Y., … Torralba, A. (2018). Recipe1M: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images. ArXiv:1810.06553 [Cs]. Retrieved from http://arxiv.org/abs/1810.06553
Morency, L.-P., & Baltrusaitis, T. (n.d.). Tutorial on Multimodal Machine Learning, 158.
Myers, A., Johnston, N., Rathod, V., Korattikara, A., Gorban, A., Silberman, N., … Murphy, K. (2015). Im2Calories: Towards an Automated Mobile Vision Food Diary. In 2015 IEEE International Conference on Computer Vision (ICCV) (pp. 1233–1241). Santiago, Chile: IEEE. https://doi.org/10.1109/ICCV.2015.146
Pascual-Montano, A., Carazo, J. M., Kochi, K., Lehmann, D., & Pascual-Marqui, R. D. (2006). Nonsmooth nonnegative matrix factorization (nsNMF). IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(3), 403–415. https://doi.org/10.1109/TPAMI.2006.60
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. ArXiv:1602.04938 [Cs, Stat]. Retrieved from http://arxiv.org/abs/1602.04938
Ruder, S. (2017). An Overview of Multi-Task Learning in Deep Neural Networks. ArXiv:1706.05098 [Cs, Stat]. Retrieved from http://arxiv.org/abs/1706.05098
Salvador, A., Hynes, N., Aytar, Y., Marin, J., Ofli, F., Weber, I., & Torralba, A. (2017). Learning Cross-Modal Embeddings for Cooking Recipes and Food Images. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 3068–3076). Honolulu, HI: IEEE. https://doi.org/10.1109/CVPR.2017.327
Simonyan, K., & Zisserman, A. (2014). Very Deep Convolutional Networks for Large-Scale Image Recognition. ArXiv:1409.1556 [Cs]. Retrieved from http://arxiv.org/abs/1409.1556
Srivastava, T. (n.d.). Framework to build a niche dictionary for text mining, 12.
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., … Rabinovich, A. (2014). Going Deeper with Convolutions. ArXiv:1409.4842 [Cs]. Retrieved from http://arxiv.org/abs/1409.4842
Teng, C.-Y., Lin, Y.-R., & Adamic, L. A. (2011). Recipe recommendation using ingredient networks. ArXiv:1111.3919 [Physics]. Retrieved from http://arxiv.org/abs/1111.3919
Thoma, M. (2017). Analysis and Optimization of Convolutional Neural Network Architectures. ArXiv:1707.09725 [Cs]. Retrieved from http://arxiv.org/abs/1707.09725
Ye, F., Chen, C., & Zheng, Z. (2018). Deep Autoencoder-like Nonnegative Matrix Factorization for Community Detection. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management - CIKM ’18 (pp. 1393–1402). Torino, Italy: ACM Press. https://doi.org/10.1145/3269206.3271697
Yeh, C.-K., Wu, W.-C., Ko, W.-J., & Wang, Y.-C. F. (n.d.). Learning Deep Latent Spaces for Multi-Label Classification, 7.
Yu, Q., Mao, D., & Wang, J. (n.d.). Deep Learning Based Food Recognition, 6.
Zhang, M.-L., & Zhou, Z.-H. (2014). A Review on Multi-Label Learning Algorithms. IEEE Transactions on Knowledge and Data Engineering, 26(8), 1819–1837. https://doi.org/10.1109/TKDE.2013.39
Zhou, Z.-H. (n.d.). Multi-Instance Multi-Label Learning with Application to Scene Classification, 8.
電子全文 Fulltext
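This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research purposes. Please comply with the relevant provisions of the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid violating the law.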
論文使用權限 Thesis access permission:自定論文開放時間 user define
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
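Public information on printed copies is relatively complete from academic year 102 onward. To look up public information on printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services (圖資處). We apologize for any inconvenience.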
開放時間 available 已公開 available
