Twitter Data Mining and Predictive Modeling in R
1  School of Science, Technology and Engineering Management, St. Thomas University, Miami Gardens, FL 33054, USA


R, an open source statistical programming language, can be used to gather information from the social media platform Twitter by collecting tweets from news sources, celebrities, political figures, and official college accounts. Other information, such as screen names, number of tweets, number of followers, lists of friends, and locations, can be collected using the twitteR package in combination with the Twitter application programming interface (Twitter API). After collecting this data, one can perform text mining by counting word frequencies in news sources' tweets, create data visualizations of those frequencies, and conduct a sentiment analysis to understand and measure the impact of certain topics and opinions expressed on this platform. Spatial visualizations are also created in the form of interactive maps using the location data collected from different Twitter accounts. This project explores the ways Twitter can be used to gather information on certain topics, and how this data could help predict behaviors and characteristics of how people communicate through this platform, as well as how different topics are perceived by society.

Keywords: data science, RStudio, Twitter analysis, prediction
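
The word-frequency step described in the abstract can be illustrated with a minimal base R sketch. This is not the authors' code: the sample tweets, the `stop_words` list, and the variable names are hypothetical, and in practice the text would come from the Twitter API rather than a hard-coded vector.

```r
# Hypothetical sample tweets; real input would be collected via the Twitter API.
tweets <- c(
  "Breaking news: markets rally as tech stocks surge",
  "Tech stocks lead the rally in early trading",
  "Local news: city council approves new budget"
)

# A small illustrative stop-word list (an assumption, not a standard list).
stop_words <- c("the", "as", "in", "a", "an", "of", "to", "and")

# Lowercase the text, replace anything that is not a letter or space,
# then split each tweet on runs of whitespace.
clean <- gsub("[^a-z ]", " ", tolower(tweets))
words <- unlist(strsplit(clean, "\\s+"))

# Drop empty tokens and stop words.
words <- words[nchar(words) > 0 & !(words %in% stop_words)]

# Tabulate word counts and sort them in decreasing order.
word_freq <- sort(table(words), decreasing = TRUE)
print(head(word_freq, 5))
```

A sorted table like `word_freq` could then feed a bar chart or word cloud; a dedicated text-mining package would typically replace the hand-written cleaning steps on larger collections.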