
Comparison of Missing Value Imputation Methods for Malaysian Hourly Rainfall Data

Noorhafizah Mazlan, Nurul Aishah Rahman, Sayang Mohd Deni


Precipitation, or rainfall, plays a significant role in climatological and hydrological research. Rainfall is an important climatic parameter, yet rainfall studies are commonly hampered by a lack of continuous data. Missing data, short hydrological data series and poor data quality are common problems in developing countries. The aim of this study is to compare several imputation methods for filling in missing values in rainfall time series. Missing values were artificially created in hourly rainfall series from stations obtained from the Malaysian Department of Drainage and Irrigation (DID). The Simple Arithmetic Average (SAA) and the Expectation Maximization (EM) algorithm are the two traditional single imputation techniques used in this study. The more recent Markov Chain Monte Carlo (MCMC) and EM-MCMC algorithm methods, which are both compassionate methods, are also considered. To evaluate the performance of these imputation methods, three missing-data scenarios are considered, each representing a percentage of missing values over the period. All four methods use concurrent observations from reference stations to impute the missing values. Target stations are selected based on their degree of correlation with the reference stations. The performance of the methods is then compared both quantitatively and graphically. The mean square error, root mean square error and coefficient of variation of the root mean square error are used to assess the performance of the imputation methods. The results indicate that EM-MCMC is more suitable for the Lalang Sg. Lui station, where it gives more consistent results across the respective percentages of missingness than the other methods. However, multiple imputation using MCMC gives more favorable results at the Batu Pahat and Ibu Bekalan Km.16, Gombak stations, where the imputed data are more consistent, accurate and robust.
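
As an illustration of the simplest method and the three error measures named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that SAA fills each gap with the plain mean of the concurrent reference-station observations, and that the coefficient of variation of the RMSE is the RMSE divided by the mean of the observed values. The function names (saa_impute, error_metrics) and all rainfall values are hypothetical, made up for the example. The EM, MCMC and EM-MCMC methods are not shown.

```python
import math


def saa_impute(target, references):
    """Fill None entries in `target` with the arithmetic mean of the
    concurrent (same-hour) observations from the reference stations.
    Assumed form of SAA; the paper may define it differently."""
    filled = []
    for t, value in enumerate(target):
        if value is not None:
            filled.append(value)
            continue
        concurrent = [ref[t] for ref in references if ref[t] is not None]
        # Leave the gap open if no reference station reported that hour.
        filled.append(sum(concurrent) / len(concurrent) if concurrent else None)
    return filled


def error_metrics(observed, imputed, masked_idx):
    """Return (MSE, RMSE, CVRMSE) over the artificially masked hours only."""
    pairs = [(observed[i], imputed[i]) for i in masked_idx]
    mse = sum((o - p) ** 2 for o, p in pairs) / len(pairs)
    rmse = math.sqrt(mse)
    mean_obs = sum(o for o, _ in pairs) / len(pairs)
    # CVRMSE is undefined when the observed mean is zero (e.g. all dry hours).
    cvrmse = rmse / mean_obs if mean_obs != 0 else math.nan
    return mse, rmse, cvrmse


# Hypothetical hourly rainfall (mm); hours 1 and 2 are masked artificially.
truth = [0.0, 2.0, 4.0, 1.0]
masked_idx = [1, 2]
target = [None if i in masked_idx else v for i, v in enumerate(truth)]
ref_a = [0.0, 1.0, 5.0, 1.5]
ref_b = [0.5, 2.0, 3.0, 0.5]

filled = saa_impute(target, [ref_a, ref_b])
mse, rmse, cvrmse = error_metrics(truth, filled, masked_idx)
print(filled, mse, rmse, cvrmse)
```

Scoring only the masked hours mirrors the setup described above, where gaps are created artificially so that the imputed values can be compared against known observations.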


Rainfall, missing, imputation, MCMC, EM-MCMC.


