R, an open-source statistical programming language, can be used to gather information from the social media platform Twitter. In this project, tweets are collected from various news sources, celebrities, political figures, and some official college accounts. Other information, such as screen names, number of tweets, number of followers, lists of friends, and locations, can be collected using the twitteR package in combination with the Twitter application programming interface (Twitter API). After collecting this data, one can perform text mining by counting word frequencies in news sources' tweets, create data visualizations to represent those frequencies, and conduct a sentiment analysis to understand and measure the impact of certain topics and opinions expressed on this platform. Spatial visualizations are also created in the form of interactive maps, using the location data collected from different Twitter accounts. This project explores the various ways that Twitter can be used to gather information on certain topics, how this data could help predict some of the behaviors and characteristics of how people communicate through this social media source, and how different topics are perceived by society.
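The word-frequency and sentiment steps described above can be illustrated with a minimal base R sketch. It is not the project's own code: the sample tweets, the stop-word list, and the positive and negative word lists are all hypothetical stand-ins, and the sentiment score is a simple count of lexicon matches rather than any particular published method. In practice the text would come from the Twitter API rather than a hard-coded vector.

```r
# Hypothetical sample tweets standing in for text collected via the Twitter API.
tweets <- c(
  "Breaking news: markets rally as tech stocks surge",
  "Great news for students: campus library opens new study space",
  "Bad weather causes terrible delays across the city"
)

# Normalize: lower-case, then replace anything that is not a letter or space.
clean <- gsub("[^a-z ]", " ", tolower(tweets))

# Split each tweet into words, dropping empty strings.
words_per_tweet <- lapply(strsplit(clean, " +"), function(w) w[nzchar(w)])

# Illustrative stop-word list (an assumption, not a standard lexicon).
stop_words <- c("as", "for", "the", "across")

# Pool all words and remove stop words before counting.
all_words <- unlist(words_per_tweet)
all_words <- all_words[!all_words %in% stop_words]

# Word frequencies, most frequent first.
word_freq <- sort(table(all_words), decreasing = TRUE)

# Tiny illustrative sentiment lexicon (hypothetical).
positive <- c("great", "rally", "surge")
negative <- c("bad", "terrible", "delays")

# Per-tweet score: number of positive matches minus number of negative matches.
sentiment <- sapply(words_per_tweet, function(w) {
  sum(w %in% positive) - sum(w %in% negative)
})

print(head(word_freq, 5))
print(sentiment)
```

The `word_freq` table could feed a bar chart or word cloud, and `sentiment` gives one score per tweet, where a positive value suggests favorable wording and a negative value unfavorable wording under this toy lexicon.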
Twitter Data Mining and Predictive Modeling in R
Published: 20 December 2017 by MDPI in MOL2NET'17, Conference on Molecular, Biomed., Comput. & Network Science and Engineering, 3rd ed. congress USEDAT-03: USA-EU Data Analysis Training Prog. Work., Cambridge, UK-Bilbao, Spain-Duluth, USA, 2017
Keywords: data science, R-Studio, twitter analysis, prediction