This study analyzes and forecasts health insurance data, financial stock market data, and climate data. Using a Hadoop data platform and stored static (i.e., historical) data, we analyze and attempt to predict the correlations among diseases, related factors, and the stock market.
Earlier studies focused on historical information and simulation-based estimation but ignored the influence of time. If we instead concentrate on the characteristics of a time series (stationarity, trend, and seasonality) and on the factors that have a high correlation coefficient with it, we can build a model suitable for both analysis and prediction. Under different conditions, such as disease, climate, region, and other external factors, the accuracy of such a model can be checked and the model can then be used to predict future trends.
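As a rough illustration of screening factors by correlation coefficient, the sketch below computes Pearson's r between a target series and each candidate factor and keeps those above a cut-off. The function names, the 0.7 threshold, and the numbers are assumptions made for this example; they are not taken from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no linear information
    return sxy / math.sqrt(sxx * syy)


def select_factors(target, factors, threshold=0.7):
    """Keep factors whose |r| with the target reaches the threshold.

    The default of 0.7 is an arbitrary cut-off chosen for this sketch.
    """
    scores = {name: pearson(series, target) for name, series in factors.items()}
    return {name: r for name, r in scores.items() if abs(r) >= threshold}


# Made-up illustrative numbers, not data from the study.
cases = [12, 15, 21, 30, 28, 35, 41, 39]
factors = {
    "temperature": [30, 29, 26, 22, 23, 20, 17, 18],
    "noise": [5, 9, 4, 8, 5, 9, 4, 8],
}
selected = select_factors(cases, factors)
```

Here `selected` retains only the hypothetical `temperature` factor, which is strongly negatively correlated with the target; the sign is kept so that the direction of the relationship stays visible.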
First, we identify the relevant factors, explore their initial characteristics, and determine their correlations. Second, we test whether the target time series meets the model requirements and use the autocorrelation function (ACF) and the partial autocorrelation function (PACF) to find the best combination of model parameters. The result is then applied in an ARIMA (Auto-Regressive Integrated Moving Average) model to perform the corresponding regression corrections.
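The identification step can be sketched in plain Python as follows: difference the series to remove the trend, then compute the sample ACF and the PACF (here via the Durbin-Levinson recursion). In the usual Box-Jenkins reading, the lag where the PACF cuts off suggests the AR order and the lag where the ACF cuts off suggests the MA order. The helper names and the synthetic series are assumptions for illustration only; in practice a library such as statsmodels would normally be used for these steps and for fitting the ARIMA model.

```python
def difference(series, d=1):
    """Apply d rounds of first differencing (the 'I' in ARIMA)."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def acf(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(v * v for v in dev)
    if c0 == 0:
        raise ValueError("constant series has no defined autocorrelation")
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


def pacf(series, max_lag):
    """Partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(series, max_lag)
    result = [1.0]
    phi_prev = []  # AR coefficients of the order-(k-1) fit
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        result.append(phi_kk)
    return result


# Synthetic series: linear trend plus a period-4 seasonal pattern.
season = [2, -1, -2, 1]
series = [10 + 0.5 * t + season[t % 4] for t in range(24)]
stationary = difference(series)  # one difference removes the linear trend
r = acf(stationary, 4)
p = pacf(stationary, 4)
```

On this synthetic series the autocorrelation at lag 4 stays high after differencing, which is the kind of pattern that would point toward a seasonal term rather than a plain ARIMA fit.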
Finally, we test the reliability of the model again. The thesis also presents several correction-model cases and shows the revised predictions.