National Sun Yat-sen University, thesis/dissertation, Data Prefetching in Thin-Client/Server Computing over Wide Area Network

Title	Data Prefetching in Thin-Client/Server Computing over Wide Area Network
Department	Department of Information Management
Year, semester	Spring semester of Academic Year 91	Language	English
Degree	Master	Number of pages	48
Author	Feng-Wen An (安豐文)
Advisor	San-Yih Hwang (黃三益)
Convenor	Chih-Ping Wei (魏志平)
Advisory Committee	Gwan-Hwan Hwang (黃冠寰); Lee-Feng Chien (簡立峰)
Date of Exam	2003-07-10	Date of Submission	2003-07-28
Keywords	Suffix Tree, Prefetch, Thin Client
Statistics	The thesis/dissertation has been browsed 5942 times and downloaded 26 times.

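Abstract (translated from Chinese)
The thin-client computing model concentrates all application computation on a server, while thin-client devices connect to the server over a network to carry out work. A traditional thin-client deployment has only one server and runs only within a local area network. Over time, however, users may appear in any location. To keep response times reasonable for thin-client computing over a wide area network, a modified thin-client computing model, MAS TC/S, has been proposed. In MAS TC/S, several servers run across the wide area network, and users can freely connect to a nearby server. However, reducing the time needed to fetch files stored on other servers remains a challenging problem, and we propose a data prefetching mechanism to speed up file retrieval. We use a suffix-tree-like data structure to store users' file usage records, and we define two temporal relationships between files, "opened subsequently" and "opened concurrently", to decide which files should be prefetched together. Given a user's current file usage, after finding the matching branch sequence, we compute a weight for each predicted set according to the files it contains. Based on the predicted set, the appropriate files are prefetched to the connected server. We compare the proposed method with the All-Kth-Order Markov model and find that it achieves a higher hit ratio when tasks contain more files and when usage cases contain more tasks.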
Abstract
The thin-client/server computing model requires that applications run solely on a server, with client devices connecting to the server through the Internet to carry out work. The traditional thin-client/server computing model comprises only a single server and works only within a LAN environment, which severely restricts its applicability. To meet the demand for reasonable response times over a WAN, a modified thin-client/server computing model, MAS TC/S, was proposed. In MAS TC/S, multiple application servers are installed across the WAN, and each client device can freely connect to any application server that is close to it. However, reducing the delay associated with fetching absent files, which are stored on other servers, is a challenging issue in MAS TC/S. We propose to employ data prefetching mechanisms to speed up file fetching. We use a suffix-tree-like structure to store users' previous file access records and define two temporal relationships between two records, "followed by" and "concurrent with", to decide the set of files that should be prefetched together. Each file access subsequence is associated with a set of predicted file sets, each carrying a different weight. Given a current file access session, we first find a matching file access subsequence and then choose the predicted set that has the highest weight. Based on the chosen predicted set, suitable files are prefetched to the connected server. We compare our method with the All-Kth-Order Markov model and find that our method achieves a higher hit ratio under various operating regions.
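The matching step described in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the class names `Node` and `SuffixTriePredictor`, the parameters `max_depth` and `lookahead`, and the use of raw occurrence counts as weights are all assumptions made for illustration. The abstract does not give the weight formula, and the "followed by" and "concurrent with" relationships are not modelled here. The sketch stores every bounded-length subsequence of past sessions in a trie, attaches weighted predicted file sets to each node, and at prediction time takes the longest matching suffix of the current session.

```python
from collections import Counter


class Node:
    """One trie node; the path from the root spells a file access subsequence."""

    def __init__(self):
        self.children = {}          # file name -> child Node
        self.predicted = Counter()  # frozenset of files -> weight (here: a count)


class SuffixTriePredictor:
    """Toy predictor (hypothetical): longest-suffix match, heaviest predicted set."""

    def __init__(self, max_depth=3, lookahead=1):
        self.max_depth = max_depth  # longest subsequence stored (assumed parameter)
        self.lookahead = lookahead  # files grouped into one predicted set (assumed)
        self.root = Node()

    def train(self, session):
        """Insert every subsequence of length <= max_depth with the files after it."""
        for start in range(len(session)):
            node = self.root
            for depth in range(self.max_depth):
                pos = start + depth
                if pos >= len(session) - 1:
                    break  # nothing follows this position, so nothing to predict
                node = node.children.setdefault(session[pos], Node())
                following = frozenset(session[pos + 1:pos + 1 + self.lookahead])
                node.predicted[following] += 1

    def _walk(self, path):
        """Follow path from the root; return the node reached, or None."""
        node = self.root
        for name in path:
            node = node.children.get(name)
            if node is None:
                return None
        return node

    def predict(self, current):
        """Return the heaviest predicted set for the longest matching suffix."""
        for depth in range(min(self.max_depth, len(current)), 0, -1):
            node = self._walk(current[-depth:])
            if node is not None and node.predicted:
                best_set, _weight = node.predicted.most_common(1)[0]
                return set(best_set)
        return set()  # no stored subsequence matches


predictor = SuffixTriePredictor()
for past in (["a.doc", "b.xls", "c.ppt"],
             ["a.doc", "b.xls", "c.ppt"],
             ["x.txt", "b.xls", "d.pdf"]):
    predictor.train(past)

print(predictor.predict(["a.doc", "b.xls"]))  # files to prefetch for this session
```

With `lookahead=1` this sketch behaves much like a longest-match Markov-style predictor. Predicting a set of several files per subsequence, as the abstract describes, would correspond to a larger `lookahead` together with a weighting scheme that the abstract does not specify.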

Table of Contents
Chapter 1 Introduction
Chapter 2 Literature review
  2.1 Statistics-based prefetching
  2.2 Data mining-based prefetching
  2.3 Markov model
  2.4 Index structures
Chapter 3 Problem description
  3.1 File access log
  3.2 Problem definition
Chapter 4 Our approach
  4.1 Suffix tree for storing file access
    4.1.1 Instances
  4.2 Matching process
Chapter 5 Evaluations
  5.1 Generation of synthetic data
  5.2 Performance metrics
  5.3 Experimental results on synthetic data
  5.4 Experimental results on real MAS TC/S data
Chapter 6 Conclusion


Fulltext
Thesis access permission: restricted (open on campus, never open off campus). Availability: Campus: available; Off-campus: not available.
Printed copies
Availability: available
