博碩士論文 etd-0805118-133423 詳細資訊
Title page for etd-0805118-133423
論文名稱
Title
結合使用者評論之電子商務協同過濾推薦系統
Leveraging User Comments for Collaborative Filtering Recommendation in E-Commerce
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
59
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2018-07-26
繳交日期
Date of Submission
2018-09-05
關鍵字
Keywords
協同過濾、使用者物品評分矩陣、語意分析、自建構式分群、維度縮減、推薦系統
User-item rating matrix, dimensionality reduction, collaborative filtering, ranking algorithm, self-constructing clustering
統計
Statistics
本論文已被瀏覽 5697 次,被下載 458
This thesis/dissertation has been viewed 5697 times and downloaded 458 times.
中文摘要
隨著電子商務近年來的迅速發展,如何能有效的從使用者過往消費紀錄中擷取出有用的資訊,並分析出使用者對商品的喜好,以利於其找到投其所好的商品,是迫切需求的技術。傳統的協同過濾型的推薦系統僅單純運用使用者物品評分矩陣中的資訊,來進行對使用者或物品的分析進行推薦,雖然簡單且方便,然而長久以來,此種方式都有難以克服的缺陷,儘管眾多學者提出了大量方式試圖解決,但至今仍成效不彰。本文將就兩項最嚴峻的缺陷進行探討並試圖改善,分別為資料過於稀疏和推薦系統延展性這兩個難題。資料過於稀疏是指使用者物品評分矩陣中的分數過於稀少,而延展性則是現今推薦系統往往要面對巨量的使用者和物品,而傳統的這些推薦系統演算法難以處理如此龐大的資料量,致使結果產生偏差甚至難以採用或是需要過長的時間獲取推薦結果。我們將於本文中提出一個新穎的方法,用以解決前述的兩項難題。我們將利用Word2Vec和使用者對過往購買物品的文字評論,來建構每個物品的物品向量,其後根據使用者物品評分矩陣和剛剛所建構的物品向量來建立每個使用者的使用者向量,接下來我們會運用分群和縮維方法來對資料進行處理,以降低巨量資料所帶來的時間複雜度問題,之後利用這些縮減維度後的資料和分群結果來進行推薦系統演算,最後我們將演算的結果再重新轉換為每個使用者對每個物品的喜好排序序列,也就是最終的個人化推薦結果。
Abstract
The rapid growth of e-commerce has created an urgent need for recommender systems that help consumers find products of interest by extracting knowledge from users' past interactions. Traditional collaborative filtering recommender systems recommend products based solely on the user-item rating matrix, which makes them simple and convenient to use. However, they suffer from long-standing problems, and researchers have proposed many solutions to make collaborative filtering more practical and useful. In this thesis, we focus on two main issues: data sparsity and scalability. Data sparsity refers to the scarcity of ratings in the user-item rating matrix, which can lead to inaccurate recommendations. Scalability refers to the huge numbers of products and users involved in e-commerce, which may cause an unacceptably long delay before useful recommendations are obtained. We propose a novel approach to deal with these two issues. Word2Vec is employed to build one item vector for each product from the comments users have made on their previously purchased goods. User vectors for all users are then derived from the user-item rating matrix and the item vectors. Dimensionality reduction and clustering techniques are applied to reduce the time complexity associated with the large numbers of items and users. Recommendation is then performed on the resulting clusters. Finally, a reverse transformation is performed and a ranked list of recommended items is offered to each user. With the proposed approach, the inaccuracy caused by sparse ratings in the user-item rating matrix is overcome and the processing time for making recommendations from an enormous amount of data is greatly reduced. Experimental results on real data sets demonstrate the effectiveness of the proposed approach.
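The first two steps described in the abstract can be illustrated with a minimal sketch. It assumes that an item vector is the mean of the word vectors of the words in its comments, and that a user vector is the rating-weighted mean of the vectors of the items the user rated. The abstract does not state the exact formulas, so both choices are assumptions, and the tiny hand-written `word_vectors` dictionary stands in for a trained Word2Vec model. The function names `build_item_vector` and `build_user_vectors` are hypothetical.

```python
from typing import Dict, List, Optional

Vector = List[float]


def build_item_vector(comments: List[str],
                      word_vectors: Dict[str, Vector],
                      dim: int) -> Vector:
    """Mean of the word vectors of all known words in an item's comments.

    Assumption: simple averaging; the thesis may combine word vectors differently.
    """
    total = [0.0] * dim
    count = 0
    for comment in comments:
        for token in comment.lower().split():
            vec = word_vectors.get(token)
            if vec is None:  # word not in the (toy) vocabulary
                continue
            for k in range(dim):
                total[k] += vec[k]
            count += 1
    return [t / count for t in total] if count else [0.0] * dim


def build_user_vectors(ratings: List[List[Optional[float]]],
                       item_vectors: List[Vector]) -> List[Vector]:
    """Rating-weighted mean of item vectors for each user.

    Assumption: unrated entries (None) are skipped rather than treated as zero,
    so a sparse rating row still yields a dense user vector.
    """
    dim = len(item_vectors[0])
    user_vectors = []
    for row in ratings:
        total = [0.0] * dim
        weight = 0.0
        for rating, vec in zip(row, item_vectors):
            if rating is None:
                continue
            for k in range(dim):
                total[k] += rating * vec[k]
            weight += rating
        user_vectors.append(
            [t / weight for t in total] if weight > 0 else [0.0] * dim
        )
    return user_vectors


# Toy stand-in for a trained Word2Vec model (hypothetical values).
word_vectors = {"great": [1.0, 0.0], "cheap": [0.0, 1.0]}

# One list of user comments per item.
item_comments = [["great great"], ["cheap"], ["great cheap"]]
item_vectors = [build_item_vector(c, word_vectors, 2) for c in item_comments]

# Rows are users, columns are items; None marks a missing rating.
ratings = [
    [5.0, None, None],
    [None, 4.0, 4.0],
    [None, None, None],
]
user_vectors = build_user_vectors(ratings, item_vectors)
```

Because user and item vectors share one dense space, later steps such as dimensionality reduction and clustering can operate on them even when the rating matrix itself is very sparse.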
目次 Table of Contents
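Thesis Approval Form
Acknowledgements
Abstract (Chinese)
Abstract (English)
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Thesis Organization
Chapter 2 Literature Review
2.1 Collaborative Filtering Recommender Systems
2.2 Collaborative Filtering Recommender Systems Focused on Data Preprocessing
2.3 Clustering-Based Collaborative Filtering Recommender Systems
2.4 Collaborative Filtering Recommender Systems Using Additional Information
Chapter 3 Methodology
3.1 Motivation
3.2 Our Approach
3.3 Step 1: Building User and Item Vectors
3.3.1 Word2Vec
3.3.2 Item Vectors
3.3.3 User Vectors
3.4 Step 2: Clustering Users and Items
3.4.1 Principal Component Analysis
3.4.2 Self-Constructing Clustering
3.4.3 Reducing Dimensionality with Principal Component Analysis
3.4.4 Forming Item and User Clusters
3.5 Step 3: Reducing the Dimensionality of the User-Item Rating Matrix
3.6 Step 4: Predicting User-Cluster Ratings for Item Clusters
3.6.1 ItemRank
3.6.2 Building the Relationship Topology
3.6.3 Predicting User-Cluster Preferences
3.7 Predicting Personalized Recommendation Lists
3.8 Example
Chapter 4 Experimental Results and Analysis
4.1 Evaluation Metrics
4.2 Data Sets
4.3 Compared Methods and Experimental Environment
4.6 Effect of Filling In Missing Values
4.7 Effect of the PCA Energy Value
4.8 Effect of the Numbers of Item and User Clusters
4.9 Comparison with Other Methods
Chapter 5 Conclusion and Future Work
5.1 Conclusion
5.2 Future Research Directions
References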
電子全文 Fulltext
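This electronic full text is licensed to users solely for academic research, for personal, non-profit searching, reading, and printing. Please comply with the relevant provisions of the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the text without authorization, to avoid violating the law.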
論文使用權限 Thesis access permission:校內校外完全公開 unrestricted
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
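Public-availability information for printed theses is relatively complete from ROC academic year 102 onward. To look up public-availability information for printed theses from ROC academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.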
開放時間 Available: 已公開 available
