||Computer Science and Engineering|
||Data Mining of National Health Insurance Research Database: Design and Implementation of Visualized Automatic Query Language Generator|
||Taiwan is one of the few countries that have implemented National Health Insurance. Since the program began in 1995, Taiwan has retained all of the medical data it has produced. These data are very important for medical research because of their comprehensiveness and completeness, and they can serve as a keystone for the development of preventive medicine.|
We cooperated with Kaohsiung Medical University to build a healthcare database platform that supports data mining and analysis. To store and search the large volume of data in the healthcare database, we chose Hadoop for the platform because of its high scalability and fast distributed processing. We also use Impala, which runs SQL queries on Hadoop with in-memory distributed processing, to retrieve the data we want in a short time.
Our system includes two search methods:
The first is the “General Searching Method”. It breaks the logic of an SQL statement into parts and guides users through generating their SQL step by step. This method can reach almost all of the columns in the healthcare database.
The second is the “Disease and Drug Searching Method”, which focuses on diseases and drugs. It is simpler than the General Searching Method because a single one-page form collects all of the required conditions. It also has special designs for condition setting and data limitation specific to diseases and drugs. The SQL generated by this method gives users a form covering all people that contains their basic information and the target disease/drug information, which makes it easier for medical researchers to analyze the target diseases and drugs. After the SQL generation step, the Disease and Drug Searching Method provides an analysis function that gives medical researchers a visual way to understand the contents of the SQL they generated.
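The step-by-step idea behind the General Searching Method can be illustrated with a minimal sketch. This is not the thesis's implementation: the `QueryBuilder` class, the table name `outpatient_visits`, and the columns `patient_id`, `visit_date`, and `icd9_code` are all hypothetical names chosen for illustration, and the `?` placeholder style is an assumption about the database driver. The sketch only shows how an SQL statement can be assembled from separate, guided steps (choose columns, then add conditions one at a time).

```python
from dataclasses import dataclass, field

# Comparison operators the (hypothetical) guided form would offer.
ALLOWED_OPERATORS = {"=", "<>", "<", "<=", ">", ">=", "LIKE"}


@dataclass
class QueryBuilder:
    """Collects the parts of a SELECT statement one step at a time."""

    table: str
    columns: list = field(default_factory=list)
    conditions: list = field(default_factory=list)
    params: list = field(default_factory=list)

    def select(self, *columns):
        # Step 1: the user picks the columns to return.
        self.columns.extend(columns)
        return self

    def where(self, column, op, value):
        # Step 2 (repeatable): the user adds one condition at a time.
        if op not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {op}")
        # The value is kept separate from the SQL text as a parameter.
        self.conditions.append(f"{column} {op} ?")
        self.params.append(value)
        return self

    def build(self):
        # Step 3: assemble the final statement from the collected parts.
        cols = ", ".join(self.columns) if self.columns else "*"
        sql = f"SELECT {cols} FROM {self.table}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(self.conditions)
        return sql, list(self.params)


# Hypothetical usage: visits with a diagnosis code starting with 250.
sql, params = (
    QueryBuilder("outpatient_visits")
    .select("patient_id", "visit_date")
    .where("icd9_code", "LIKE", "250%")
    .where("visit_date", ">=", "2010-01-01")
    .build()
)
print(sql)
print(params)
```

In this sketch the column and table names are assumed to come from a fixed menu rather than free text, so only the condition values are passed as parameters. A one-page form such as the one used by the Disease and Drug Searching Method could, under the same assumptions, fill in all of these steps at once.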
Finally, we conducted an experiment to optimize the SQL module used in the Disease and Drug Searching Method. In this experiment, we explore the criteria and methods for optimizing Impala SQL.
||Shi-Huang Chen - chair|
Jain-shing Liu - co-chair
You-Chiun Wang - co-chair
Wei-Kuang Lai - co-chair
Chun-Hung Lin - advisor
In-campus access at 5 years and off-campus access at 5 years.|