Research Issues in Biological Data Mining: A Review

Authors

  • Kanica Sachdev, Computer Science and Engineering Department, SMVDU, J&K, India
  • Manoj Kumar Gupta, Computer Science and Engineering Department, SMVDU, J&K, India

Keywords:

Biological Data Mining, Visual Data Mining, Biclustering, Pathway Analysis

Abstract

Biological data has grown rapidly in recent years, and large biological datasets are now available for analysis and inference. Biological data mining techniques help biologists understand this data and study and visualize the relationships within it under different conditions. This paper presents the research areas of biological data mining and the tools that have been developed in each area. It surveys the various biological data mining techniques to convey the current state of research and outlines future directions for researchers working in these fields.

Published

2017-09-30

Section

Research Articles

How to Cite

[1]
Kanica Sachdev, Manoj Kumar Gupta, "Research Issues in Biological Data Mining: A Review," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 2, Issue 7, pp. 26-34, September-2017.