Multiple-Time-Series Clinical Data Processing for Classification With Merging Algorithm and Statistical Measures


A description of a patient's condition should capture both the changes in clinical measures and their combination. Traditional data-processing methods and classification algorithms can lose this clinical information and thereby reduce prediction performance. To improve the accuracy of clinical-outcome prediction using multiple measurements, a new multiple-time-series data-processing algorithm with period merging is proposed. Clinical data from 83 hepatocellular carcinoma (HCC) patients were used in this research. Their clinical reports from a defined period were merged using the proposed merging algorithm, and statistical measures were also calculated. After data processing, a Multiple Measurements Support Vector Machine (MMSVM) with radial basis function (RBF) kernels was used as the classification method to predict HCC recurrence. A Multiple Measurements Random Forest regression (MMRF) was also used as an additional evaluation/classification method. To evaluate the data-merging algorithm, the performance of prediction using processed multiple measurements was compared with that of prediction using single measurements. Recurrence prediction by MMSVM with RBF using multiple measurements and a period of 120 days gave the best results (accuracy 0.771, balanced accuracy 0.603) and was significantly better than prediction using single measurements (accuracy 0.626, balanced accuracy 0.459; P < 0.01). In the case of MMRF, the prediction results obtained after applying the proposed merging algorithm were also better than the single-measurement results (P < 0.05). The results show that the performance of HCC-recurrence prediction was significantly improved when the proposed data-processing algorithm was used, and that multiple measurements could be of greater value than single measurements in HCC-recurrence prediction.
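
The period-merging step described above can be illustrated with a minimal sketch. The abstract does not specify the merging rule or which statistical measures are computed, so everything here is an assumption: the function names `merge_period` and `summarize`, the choice of mean/min/max/standard deviation/count, the measure names `AFP` and `ALT`, and the made-up values serve only as an illustration and are not the authors' implementation or data.

```python
from datetime import date, timedelta
from statistics import mean, pstdev


def merge_period(reports, end_date, period_days=120):
    """Collect every non-missing value reported within `period_days`
    before `end_date` (inclusive), grouped by measure name.

    `reports` is a list of (report_date, {measure_name: value_or_None}).
    This windowing rule is an assumption, not the published algorithm.
    """
    start = end_date - timedelta(days=period_days)
    merged = {}
    for report_date, values in reports:
        if start <= report_date <= end_date:
            for name, value in values.items():
                if value is not None:  # skip measures missing from a report
                    merged.setdefault(name, []).append(value)
    return merged


def summarize(merged):
    """Turn each merged series into fixed-length statistical features."""
    features = {}
    for name, vals in sorted(merged.items()):
        features[f"{name}_mean"] = mean(vals)
        features[f"{name}_min"] = min(vals)
        features[f"{name}_max"] = max(vals)
        features[f"{name}_std"] = pstdev(vals)  # population standard deviation
        features[f"{name}_count"] = len(vals)
    return features


# Hypothetical reports for one patient (illustrative values only).
reports = [
    (date(2010, 1, 5), {"AFP": 12.0, "ALT": 40.0}),
    (date(2010, 3, 1), {"AFP": 18.0, "ALT": None}),
    (date(2010, 4, 20), {"AFP": 30.0, "ALT": 52.0}),
    (date(2009, 6, 1), {"AFP": 9.0, "ALT": 35.0}),  # falls outside the window
]

features = summarize(merge_period(reports, end_date=date(2010, 4, 30)))
for key, value in features.items():
    print(key, value)
```

Under these assumptions, each patient's irregularly timed reports collapse into one fixed-length feature vector per period, which is the kind of input a kernel classifier such as an RBF SVM expects.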

In IEEE Journal of Biomedical and Health Informatics.