Viral Hepatitis
Copyright © The Author(s) 2004. Published by Baishideng Publishing Group Inc. All rights reserved.
World J Gastroenterol. Dec 15, 2004; 10(24): 3579-3582
Published online Dec 15, 2004. doi: 10.3748/wjg.v10.i24.3579
Forecasting model for the incidence of hepatitis A based on artificial neural network
Peng Guan, De-Sheng Huang, Bao-Sen Zhou
Peng Guan, Bao-Sen Zhou, Department of Epidemiology, School of Public Health, China Medical University, Shenyang 110001, Liaoning Province, China
De-Sheng Huang, Department of Mathematics, College of Basic Medical Sciences, China Medical University, Shenyang 110001, Liaoning Province, China
Author contributions: All authors contributed equally to the work.
Supported by the National Natural Science Foundation of China, No. 30170833
Correspondence to: Dr. Bao-Sen Zhou, Department of Epidemiology, School of Public Health, China Medical University, Shenyang 110001, Liaoning Province, China. bszhou@mail.cmu.edu.cn
Telephone: +86-24-23256666 Ext. 5401
Received: April 6, 2004
Revised: May 2, 2004
Accepted: May 9, 2004
Published online: December 15, 2004
Abstract

AIM: To study the application of an artificial neural network (ANN) to forecasting the incidence of hepatitis A, a time series that shows autoregression.

METHODS: Data on the incidence of hepatitis A in Liaoning Province from 1981 to 2001 were obtained from the Liaoning Disease Control and Prevention Center. An autoregressive integrated moving average (ARIMA) time series model was used to determine whether the data showed autoregression. The incidence values were then rescaled to the interval [0,1] to serve as the theoretical (target) output of the network. The data from 1981 to 1997 were used as the training and verification sets, and the data from 1998 to 2001 formed the test set. STATISTICA neural network (ST NN) was used to construct, train and simulate the artificial neural network.
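
The data preparation described in METHODS can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure ST NN applies internally: the abstract does not give the rescaling formula, so plain min-max scaling over the whole series is assumed, and the two inputs (the previous year's scaled incidence and a time index) are modelled on the AR(1) and time inputs reported in RESULTS. The function names `min_max_scale`, `build_samples` and `split_by_year` are hypothetical.

```python
from typing import List, Sequence, Tuple

Row = Tuple[int, List[float], float]  # (year, [lag-1 value, time index], target)


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Map values linearly onto [0, 1] (assumed form of the rescaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot rescale a constant series")
    return [(v - lo) / (hi - lo) for v in values]


def build_samples(years: Sequence[int], incidence: Sequence[float]) -> List[Row]:
    """Pair each year's scaled incidence with its lag-1 value and a time index.

    The first year has no predecessor, so it yields no sample.
    """
    if len(years) != len(incidence):
        raise ValueError("years and incidence must have the same length")
    scaled = min_max_scale(incidence)
    return [
        (years[t], [scaled[t - 1], float(t)], scaled[t])
        for t in range(1, len(years))
    ]


def split_by_year(rows: Sequence[Row], last_training_year: int = 1997):
    """Split samples into training/verification and test sets by calendar year."""
    train = [r for r in rows if r[0] <= last_training_year]
    test = [r for r in rows if r[0] > last_training_year]
    return train, test
```

Under these assumptions, a 1981-2001 series yields 20 samples, because 1981 has no lagged value: 16 fall in the training and verification period and 4 (1998 to 2001) in the test set.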

RESULTS: Twenty-four networks were tested and seven were retained. The best network we found had excellent performance, with a regression ratio of 0.73 and a correlation of 0.69. It had two input variables, AR(1) and time, and three units in the hidden layer. In the ARIMA time series analysis, the best model was first-order autoregression without differencing or smoothing. The total sum of squared errors of the ANN model was 9090.21, and the sums of squared errors for the training set and the test set were 8377.52 and 712.69, respectively. All three were lower than the corresponding ARIMA values of 12291.79, 8944.95 and 3346.84. The correlation coefficient of nonlinear regression (RNL) was 0.71 for the ANN and 0.66 for the ARIMA linear autoregression model.
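
The error figures quoted in RESULTS can be checked with simple arithmetic: for each model, the training and test sums of squared errors should add up to the reported total. The sketch below defines the sum of squared errors in its usual form (assumed here, since the abstract does not spell out the formula) and runs that consistency check on the published numbers. The names `sum_squared_error`, `totals_consistent` and `REPORTED` are hypothetical.

```python
from typing import Sequence


def sum_squared_error(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Sum of squared differences between observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def totals_consistent(train_sse: float, test_sse: float, total_sse: float,
                      tol: float = 0.005) -> bool:
    """True if train + test matches the total to within rounding of 2 decimals."""
    return abs(train_sse + test_sse - total_sse) <= tol


# Figures quoted in the abstract: (training SSE, test SSE, reported total SSE).
REPORTED = {
    "ANN": (8377.52, 712.69, 9090.21),
    "ARIMA": (8944.95, 3346.84, 12291.79),
}

for name, (train_sse, test_sse, total_sse) in REPORTED.items():
    print(name, totals_consistent(train_sse, test_sse, total_sse))
```

Both checks hold. The breakdown also shows where the models differ most: the training errors are close (8377.52 against 8944.95), while the test error of the ANN (712.69) is far below that of ARIMA (3346.84).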

CONCLUSION: The ANN is superior to conventional methods for forecasting the incidence of hepatitis A, a series that shows autoregression.
