Predicting the Best Tweet Using Machine Learning

Authors

  • Mr. L. A. Saleem, Assistant Professor, Department of Computer Science & Engineering, BVRIT Hyderabad College of Engineering for Women, Hyderabad, Telangana, India
  • I. Yagna Likhitha, Department of Computer Science & Engineering, BVRIT Hyderabad College of Engineering for Women, Hyderabad, Telangana, India
  • E. Haritha, Department of Computer Science & Engineering, BVRIT Hyderabad College of Engineering for Women, Hyderabad, Telangana, India
  • P. Jyothsna, Department of Computer Science & Engineering, BVRIT Hyderabad College of Engineering for Women, Hyderabad, Telangana, India

Keywords:

AI, Machine learning model, Theil-Sen Estimator, best fit line, retweet, score, topology, information propagation

Abstract

Twitter is an online social networking service that lets users post and interact with short messages, known as tweets, which are limited to 140 characters. Most communication on the platform happens through following and retweeting. Retweeting is the main way information spreads on Twitter, and it is changing the geography of communication. Recent research has focused on analyzing the factors behind retweet behavior. Twitter's data feed includes all kinds of metadata. There are two ways to predict which tweet will be retweeted. One is to study the topology of information propagation [1] paths and build a prediction model from it, but constructing the topology of user networks [2] is a very difficult task. The other is to build a prediction model based on a machine learning algorithm [3]. Previously, a machine learning model was built using the Theil-Sen estimator [4]: a best-fit line was fitted and the words in a tweet were scored. We built a machine learning model that not only scores the words in a tweet but also predicts [5] which of two tweets supplied by the user is the better one. The underlying observation is that different people are interested in different kinds of tweets, and they retweet the tweets that interest them. First, we collect tweets of different categories from valid accounts of well-known news media as a learning corpus, which serves as the dataset for our machine learning algorithm.
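
The approach outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how words are scored or how the two tweets are compared, so the scoring rule (a word's score is the mean retweet count of the corpus tweets that contain it), the tweet feature (the mean score of its known words), the function names, and the toy corpus are all assumptions made for illustration. Only the Theil-Sen step follows the standard definition for a single predictor: the slope is the median of all pairwise slopes, and the intercept is the median of the residual offsets.

```python
from itertools import combinations
from statistics import mean, median


def theil_sen_fit(xs, ys):
    """Fit a line with the Theil-Sen estimator (one predictor).

    Returns (slope, intercept): the median of all pairwise slopes and
    the median of y - slope * x.
    """
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if x2 != x1  # skip pairs with equal x, whose slope is undefined
    ]
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept


def word_scores(corpus):
    """Assumed scoring rule: mean retweet count of tweets containing the word."""
    counts = {}
    for text, retweets in corpus:
        for word in set(text.lower().split()):
            counts.setdefault(word, []).append(retweets)
    return {word: mean(values) for word, values in counts.items()}


def tweet_feature(text, scores):
    """Assumed tweet feature: mean score of its known words, 0.0 if none."""
    known = [scores[w] for w in text.lower().split() if w in scores]
    return mean(known) if known else 0.0


def better_tweet(tweet_a, tweet_b, scores, line):
    """Return whichever tweet the fitted line predicts more retweets for."""
    slope, intercept = line

    def predicted(text):
        return slope * tweet_feature(text, scores) + intercept

    return tweet_a if predicted(tweet_a) >= predicted(tweet_b) else tweet_b


# Toy corpus of (tweet text, retweet count); the data is made up.
CORPUS = [
    ("breaking election results tonight", 120),
    ("election results delayed", 90),
    ("my lunch today", 3),
    ("lunch was fine", 2),
    ("breaking storm warning", 150),
]

SCORES = word_scores(CORPUS)
FEATURES = [tweet_feature(text, SCORES) for text, _ in CORPUS]
LINE = theil_sen_fit(FEATURES, [retweets for _, retweets in CORPUS])

print(better_tweet("breaking election news", "lunch today", SCORES, LINE))
```

Because the slope is a median rather than a least-squares estimate, a single tweet with an unusually high retweet count has little effect on the fitted line, which is one plausible reason to prefer this estimator for retweet data.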

References

  1. Sadikov, E. & Martinez, M. "Information Propagation on Twitter." CS322 Project Report, 2009.
  2. Uysal, I. and Croft, W. B. 2011. User oriented tweet ranking: a filtering approach to microblogs. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management (CIKM '11), Bettina Berendt, Arjen de Vries, Wenfei Fan, Craig Macdonald, Iadh Ounis, and Ian Ruthven (Eds.). ACM, New York, NY, USA, 2261-2264. DOI: 10.1145/2063576.2063941
  3. Scikit-learn: Machine Learning in Python, Pedregosa et al., JMLR 12, 2011, pp. 2825-2830.
  4. Dang Xin, H Peng, X Wang and H Zhang (2008), Theil-Sen Estimators in a Multiple Linear Regression Model. Submitted paper.
  5. L. Madlberger and A. Almansour, "Predictions based on Twitter-A critical view on the research process," 2014 International Conference on Data and Software Engineering (ICODSE), Bandung, 2014, pp. 1-6. doi: 10.1109/ICODSE.2014.7062667
  6. Yoon S, Elhadad N, Bakken S. A Practical Approach for Content Mining of Tweets. American Journal of Preventive Medicine. 2013;45(1):122-129. doi: 10.1016/j.amepre.2013.02.025
  7. Anjali Ganesh Jivani, A Comparative Study of Stemming Algorithms, International Journal of Computer, Technology and Application, Volume 2, ISSN: 2229-6093.
  8. S. Diaz-Santiago, L. M. Rodriguez-Henriquez, D. Chakraborty, "A cryptographic study of tokenization systems," International Journal of Information Security, vol. 15, no. 4, pp. 413-432, 2016.
  9. K. Lang, "20 newsgroup data set". Available at: qwone.com/~jason/20Newsgroups/. Accessed 30-Sep-2015.
  10. Srivastava, A.K. and Shalabh (1995): "Predictions in Linear Regression Models With Measurement Errors," Indian Journal of Applied Economics, Vol. 4, No. 2, pp. 1-14.
  11. Y. Mei, W. Zhao and J. Yang, "Maximizing the Effectiveness of Advertising Campaigns on Twitter," 2017 IEEE International Congress on Big Data (BigData Congress), Honolulu, HI, 2017, pp. 73-80. doi: 10.1109/BigDataCongress.2017.19

Published

2018-04-30

Section

Research Articles

How to Cite

[1]
Mr. L. A. Saleem, I. Yagna Likhitha, E. Haritha, P. Jyothsna, "Predicting the Best Tweet Using Machine Learning," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 3, Issue 3, pp. 1213-1216, March-April 2018.