博碩士論文 etd-0723120-225123 詳細資訊
Title page for etd-0723120-225123
論文名稱
Title
基於輔助性多工與可解釋多標籤學習的食材辨識系統
Food Ingredients Recognition via Interpretable Multi-label Learning with Auxiliary tasks
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
41
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2020-07-30
繳交日期
Date of Submission
2020-08-23
關鍵字
Keywords
詞嵌入、多標籤學習、卷積神經網絡、多任務學習、機器學習可解釋性
Word-embedding, Convolutional neural network, Multi-task learning, Machine Learning Interpretability, Multi-label Learning
統計
Statistics
本論文已被瀏覽 5980 次,被下載 108
This thesis has been viewed 5980 times and downloaded 108 times.
中文摘要
隨著大數據的興起,深度學習已廣泛用於解決各種分類問題,在食品相關領域,食材識別是一種熱門且具有挑戰性的應用。挑戰之一是烹飪後食材難以識別,另一個挑戰是多標籤學習。在本研究中,我們將多標籤學習應用於 BBC食品網站上的食譜資料,嘗試在食物圖像中找到相應的食材。我們提出了一種多任務學習方法來解決多標籤問題。首先將烹飪步驟的文本內容做轉換,得到的向量用作多任務學習的輸出之一,而另一個輸出是食材。我們的方法通過多任務學習,兩個任務彼此共享學習到的資訊,可以學習單任務學習無法學習的資訊,從而提高食材預測的準確性,並對模型提供可理解的解釋。
Abstract
With the rise of big data in recent years, deep learning has been used extensively to solve a wide range of classification problems. In food-related fields, ingredient recognition is a popular and challenging application. One challenge is that ingredients are difficult to recognize after cooking; another is that the task requires multi-label learning.
In this thesis, we use recipe data from the BBC Food website to find the set of ingredients that corresponds to a food image, and we propose a deep multi-task learning algorithm to solve this multi-label problem. The method first converts the cooking instruction text into a vector and uses it as one output of the multi-task model; the other output is the ingredient set.
With multi-task learning, the two tasks share learned information with each other and can learn patterns that single-task learning may miss, thereby improving the accuracy of ingredient prediction and providing an understandable explanation for the model.
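The two-output setup described in the abstract can be illustrated with a minimal sketch. This record does not state how the instruction text is vectorized, which loss functions are used, or how the two tasks are weighted, so everything below is an assumption made for illustration: mean pooling of word vectors (`mean_pool`), binary cross-entropy for the ingredient labels (`bce`), mean squared error for the instruction vector (`mse`), and the hypothetical weight `aux_weight`.

```python
import math

def mean_pool(tokens, embeddings, dim):
    """Average the word vectors of known tokens (assumed text-to-vector step)."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return [0.0] * dim  # no known token: fall back to a zero vector
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def bce(logits, targets):
    """Mean binary cross-entropy; each ingredient is an independent label."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(logits)

def mse(pred, target):
    """Mean squared error between predicted and target instruction vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def multitask_loss(ing_logits, ing_targets, text_pred, text_target, aux_weight=0.5):
    """Main ingredient loss plus a weighted auxiliary text-vector loss.

    aux_weight is a hypothetical hyperparameter, not a value from the thesis.
    """
    return bce(ing_logits, ing_targets) + aux_weight * mse(text_pred, text_target)

# Toy usage: a two-word vocabulary and two candidate ingredients.
toy_embeddings = {"chop": [1.0, 0.0], "onion": [0.0, 1.0]}
text_target = mean_pool(["chop", "the", "onion"], toy_embeddings, dim=2)
loss = multitask_loss([2.0, -1.0], [1, 0], [0.4, 0.6], text_target)
```

In a real model both predictions would come from heads on a shared image encoder, so gradients from the auxiliary text loss also shape the features used for ingredient prediction; the sketch only shows how the two losses might be combined.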
目次 Table of Contents
Thesis Approval Form ... i
Acknowledgements ... ii
Abstract (Chinese) ... iii
Abstract (English) ... iv
Table of Contents ... v
List of Figures ... vi
List of Tables ... vii
1. Introduction ... 1
2. Background and Related Work ... 3
  2.1 Convolutional neural network ... 3
  2.2 Food Understanding ... 5
  2.3 Multi-label classification ... 8
  2.4 Multi-task Learning ... 13
  2.5 Word embedding ... 15
  2.6 Explainable AI ... 17
3. Proposed approach ... 18
4. Experiments ... 23
  4.1 Dataset ... 23
  4.2 Evaluation metrics ... 23
  4.3 Comparison Methods ... 24
  4.4 Experimental Results ... 25
5. Conclusion ... 27
6. References ... 28
電子全文 Fulltext
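This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the relevant provisions of the Copyright Act of the Republic of China (Taiwan), and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid breaking the law.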
論文使用權限 Thesis access permission:校內校外完全公開 unrestricted
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
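Public-access information for printed theses is relatively complete from academic year 102 onward. To look up public-access information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.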
開放時間 Available: 已公開 available
