Importance of Name Disambiguation in Scientific Databases

Authors

  • Tasleem Arif, Department of Information Technology, BGSB University, Rajouri, J&K, India and Department of Computer Science, Shaqra University, Kingdom of Saudi Arabia
  • Majid Bashir Malik, Department of Computer Sciences, BGSB University, Rajouri, J&K, India

DOI:

https://doi.org/10.32628/CSEIT217358

Keywords:

Name Disambiguation, Digital Libraries, Scientific Databases, Ambiguous Author References

Abstract

Ambiguity in digital citation databases is a major bottleneck in attributing proper credit to authors, and it hampers accurate author profiling. Academics and researchers commonly share identical or similar names, and the recent surge in digital citation records has greatly amplified the problem. Recognizing the power of information and communication technologies and the ease with which information can be stored, managed, and shared online, traditional publishers and databases have embarked on digitizing their records. Without an effective disambiguation mechanism, it is extremely difficult for a computer to discriminate between similar entities, and personal names are a particularly hard case. This paper highlights some of the major advantages and drawbacks of the prominent categories of solutions, supporting its inferences with relevant evidence wherever required.

Published

2018-09-28

Section

Research Articles

How to Cite

[1]
Tasleem Arif, Majid Bashir Malik, "Importance of Name Disambiguation in Scientific Databases", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 3, Issue 7, pp. 487-502, September-October 2018. Available at doi: https://doi.org/10.32628/CSEIT217358