Multiple-Time-Series Clinical Data Processing for Classification Using Merging Algorithm

Authors

  • Kokila Ikhar  Department of Computer Science & Engineering, V. M. Institute of Engineering & Technology, Nagpur, Maharashtra, India
  • Prof. Gurudev B. Sawarkar  Department of Computer Science & Engineering, V. M. Institute of Engineering & Technology, Nagpur, Maharashtra, India

Keywords:

Data Mining, Data Processing, Multiple Measurements, Support Vector Machine (SVM), Time-Series Analysis.

Abstract

A description of a patient's condition should capture both the changes in clinical measures and their combination. Traditional data-processing methods and classification algorithms can discard clinical information and reduce prediction performance. To improve the accuracy of clinical-outcome prediction from multiple measurements, a new multiple-time-series data-processing algorithm with period merging is proposed. Clinical data from 83 hepatocellular carcinoma (HCC) patients were used in this research. Their clinical reports from a defined period were merged using the proposed merging algorithm, and statistical measures were also calculated. After data processing, a multiple-measurements support vector machine (MMSVM) with radial basis function (RBF) kernels was used as a classification method to predict HCC recurrence. A multiple-measurements random forest regression (MMRF) was also used as an additional evaluation/classification method. To evaluate the data-merging algorithm, the performance of prediction using processed multiple measurements was compared with prediction using single measurements. Recurrence prediction by MMSVM with RBF using multiple measurements and a period of 120 days gave the best results (accuracy 0.771, balanced accuracy 0.603), and their superiority over the results obtained using single measurements (accuracy 0.626, balanced accuracy 0.459) was statistically significant (P < 0.01). For MMRF, the prediction results obtained after applying the proposed merging algorithm were also better than the single-measurement results (P < 0.05). The results show that the performance of HCC-recurrence prediction improved significantly when the proposed data-processing algorithm was used, and that multiple measurements can be of greater value than single measurements.
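
The abstract does not spell out the merging step, so the following is only a minimal sketch of what period merging with statistical measures might look like, not the authors' algorithm. It assumes each measurement is a (date, value) pair, that periods are counted in fixed-length windows from a reference date, and that the summary statistics are count, mean, minimum, maximum, and population standard deviation. The function name `merge_periods`, its parameters, and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev


def merge_periods(records, start, period_days=120):
    """Group (date, value) measurements into fixed-length periods counted
    from `start` and summarise each period with simple statistics.

    Assumption: this windowing scheme and this choice of statistics are
    illustrative; the paper's own definitions may differ.
    """
    buckets = defaultdict(list)
    for day, value in records:
        offset = (day - start).days
        if offset < 0:
            continue  # skip measurements taken before the reference date
        buckets[offset // period_days].append(value)

    merged = {}
    for index in sorted(buckets):
        values = buckets[index]
        merged[index] = {
            "n": len(values),
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
            # population std is defined even for a single measurement
            "std": pstdev(values),
        }
    return merged


# Hypothetical lab values for one patient (not data from the study).
start = date(2010, 1, 1)
records = [
    (date(2010, 1, 10), 40.0),
    (date(2010, 3, 1), 50.0),
    (date(2010, 6, 15), 30.0),
]
features = merge_periods(records, start)
```

Each period's summary could then be flattened into a feature vector for a classifier such as an SVM with an RBF kernel. Using the population standard deviation keeps periods that contain only one report usable, which is a design choice of this sketch rather than something stated in the abstract.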

Published

2017-08-31

Section

Research Articles

How to Cite

[1]
Kokila Ikhar, Prof. Gurudev B. Sawarkar, "Multiple-Time-Series Clinical Data Processing for Classification Using Merging Algorithm," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 2, Issue 4, pp. 52-59, July-August 2017.