Twitter Data Mining and Predictive Modeling in R

Reinaldo Sanchez-Arias; Eliana Espinosa

doi:10.3390/mol2net-03-05097

¹ School of Science, Technology and Engineering Management, St. Thomas University, Miami Gardens, FL 33054, USA

Published: 20 December 2017 by MDPI in MOL2NET'17, Conference on Molecular, Biomed., Comput. & Network Science and Engineering, 3rd ed. congress USEDAT-03: USA-EU Data Analysis Training Prog. Work., Cambridge, UK-Bilbao, Spain-Duluth, USA, 2017

Abstract:

R, an open-source statistical programming language, can be used to gather information from the social media platform Twitter. In this project, tweets are collected from various news sources, celebrities, political figures, and some official college accounts. Other information, such as screen names, number of tweets, number of followers, lists of friends, and locations, can be collected using the twitteR package in combination with the Twitter application programming interface (Twitter API). After collecting this data, one can perform text mining by counting word frequencies in news sources' tweets, creating data visualizations to represent those frequencies, and conducting a sentiment analysis to understand and measure the impact of certain topics and opinions expressed on this social media platform. Spatial visualizations are also created in the form of interactive maps using the location data collected from different Twitter accounts. This project explores the various ways that Twitter can be used to gather information on certain topics, how this data could help predict some of the behaviors and characteristics of how people communicate through this social media source, and how different topics are perceived by society.

Keywords: data science, RStudio, Twitter analysis, prediction
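
The word-frequency and sentiment steps described in the abstract can be illustrated with a minimal base R sketch. It is not the authors' code. The `tweets` vector is a hypothetical stand-in for text that would, in practice, be collected through the twitteR package. The stop-word list and the positive/negative lexicons are small made-up examples, and the helper names `tokenize` and `sentiment_score` are introduced here only for illustration.

```r
# Hypothetical sample; real input would be the text of collected tweets.
tweets <- c(
  "Breaking news: markets rally as tech stocks surge",
  "Great win for the home team tonight, fans are happy",
  "Markets fall on bad inflation news",
  "Terrible traffic after the storm, but great response from the city"
)

# Tokenize: lowercase, replace anything that is not a letter or space,
# split on runs of spaces, and drop empty strings.
tokenize <- function(text) {
  cleaned <- gsub("[^a-z ]", " ", tolower(text))
  words <- unlist(strsplit(cleaned, " +"))
  words[nchar(words) > 0]
}

# Small illustrative stop-word list (an assumption, not a standard list).
stop_words <- c("the", "as", "for", "are", "on", "but", "from", "after")

all_words <- unlist(lapply(tweets, tokenize))
all_words <- all_words[!(all_words %in% stop_words)]

# Word frequencies, most frequent first.
word_freq <- sort(table(all_words), decreasing = TRUE)
print(head(word_freq, 5))

# Toy sentiment lexicons (hypothetical).
positive <- c("great", "win", "happy", "rally", "surge")
negative <- c("bad", "terrible", "fall", "storm")

# Score one tweet as (number of positive words) - (number of negative words).
sentiment_score <- function(text) {
  words <- tokenize(text)
  sum(words %in% positive) - sum(words %in% negative)
}

scores <- vapply(tweets, sentiment_score, integer(1), USE.NAMES = FALSE)
print(data.frame(score = scores, tweet = tweets))
```

A lexicon count like this is only the simplest form of sentiment analysis. It ignores negation and context, so a real study would likely rely on an established lexicon or a trained model.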
