Title page for etd-0724115-200644



URN etd-0724115-200644
Author Yu-Yang Liu
Author's Email Address Not public.
Statistics This thesis has been viewed 5572 times and downloaded 11 times.
Department Computer Science and Engineering
Year 2014
Semester 2
Degree Master
Type of Document
Language English
Title Parallel Genetic-Fuzzy Mining with MapReduce Architecture
Date of Defense 2015-07-24
Page Count 87
Keywords
  • MapReduce
  • genetic algorithm
  • FP-growth
  • fuzzy mining
  • data preprocessing
Abstract Fuzzy data mining can discover hidden linguistic association rules by transforming quantitative information into fuzzy membership values. In the derivation process, good membership functions play a key role in the quality of the final results. Previous studies proposed training membership functions with genetic algorithms (GAs), which did improve the quality of the rules found. Those methods, however, suffered from long execution times in the training phase. Moreover, once appropriate fuzzy membership functions have been found, mining the frequent itemsets is also very time-consuming, as in traditional data mining. In this thesis, we therefore propose a series of approaches based on the MapReduce architecture to speed up the GA-fuzzy mining process. The contributions fall into three parts: data preprocessing, membership-function training by GA, and fuzzy association-rule derivation, all performed with MapReduce. For data preprocessing, the proposed approach not only transforms the original data into the key-value format required by MapReduce, but also reduces redundant database scans by joining the quantities into lists. For membership-function training by GA, the fitness evaluation, which is the most time-consuming step, is distributed to shorten the execution time. Finally, a distributed fuzzy rule mining approach based on FP-growth is designed to improve the time efficiency of finding fuzzy association rules. Experiments compare the performance of a single processor with that of MapReduce, and the results show that our approaches reduce the execution time of the whole process.
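The preprocessing idea in the abstract (turning transactions into key-value pairs and joining the quantities of each item into a list) can be illustrated with a minimal single-process sketch. This is not the thesis's implementation: the function names, the triangular membership shape, and the support formula (sum of membership values divided by the number of transactions) are assumptions chosen for illustration, and the map and reduce steps are simulated in plain Python rather than run on a MapReduce cluster.

```python
from collections import defaultdict


def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with feet a, c and peak b.
    The triangular shape is an assumption; the abstract does not specify one."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def map_phase(transactions):
    """Emit one (item, quantity) key-value pair per purchased item."""
    for _tid, items in transactions:
        for item, qty in items.items():
            yield item, qty


def reduce_phase(pairs):
    """Join the quantities of each item into one list, so later steps
    can read the list instead of rescanning the transactions."""
    grouped = defaultdict(list)
    for item, qty in pairs:
        grouped[item].append(qty)
    return dict(grouped)


def fuzzy_support(quantities, mf, n_transactions):
    """Assumed definition: summed membership values over all transactions."""
    return sum(triangular(q, *mf) for q in quantities) / n_transactions


# Hypothetical toy data: (transaction id, {item: quantity}).
transactions = [(1, {"milk": 2, "bread": 5}), (2, {"milk": 4})]
lists = reduce_phase(map_phase(transactions))
support = fuzzy_support(lists["milk"], (0, 4, 8), len(transactions))
```

In a GA-based training loop, a fitness function could call something like `fuzzy_support` once per candidate set of membership functions, which is why reading per-item quantity lists instead of the raw database can pay off.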
    Advisory Committee
  • Chun-Hao Chen - chair
  • Chung-Nan Lee - co-chair
  • Chun-Wei Tsai - co-chair
  • Tzung-Pei Hong - advisor
    Files
  • etd-0724115-200644.pdf
  • Indicated access: in-campus at 5 years and off-campus at 5 years.
    Date of Submission 2015-08-24
